linux

mirror of https://github.com/torvalds/linux.git synced 2025-12-07 20:06:24 +00:00

Author	SHA1	Message	Date
Linus Torvalds	59b723cd2a	Linux 6.12-rc6	2024-11-03 14:05:52 -10:00
Linus Torvalds	a8cc743272	Merge tag 'mm-hotfixes-stable-2024-11-03-10-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. 9 are cc:stable. 13 are MM and 4 are non-MM. The usual collection of singletons - please see the changelogs" * tag 'mm-hotfixes-stable-2024-11-03-10-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: multi-gen LRU: use {ptep,pmdp}_clear_young_notify() mm: multi-gen LRU: remove MM_LEAF_OLD and MM_NONLEAF_TOTAL stats mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes mm: shrinker: avoid memleak in alloc_shrinker_info .mailmap: update e-mail address for Eugen Hristev vmscan,migrate: fix page count imbalance on node stats when demoting pages mailmap: update Jarkko's email addresses mm: allow set/clear page_type again nilfs2: fix potential deadlock with newly created symlinks Squashfs: fix variable overflow in squashfs_readpage_block kasan: remove vmalloc_percpu test tools/mm: -Werror fixes in page-types/slabinfo mm, swap: avoid over reclaim of full clusters mm: fix PSWPIN counter for large folios swap-in mm: avoid VM_BUG_ON when try to map an anon large folio to zero page. mm/codetag: fix null pointer check logic for ref and tag mm/gup: stop leaking pinned pages in low memory conditions	2024-11-03 10:25:05 -10:00
Linus Torvalds	d5aaa0bc6d	Merge tag 'phy-fixes-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: - Qualcomm QMP driver fixes for null deref on suspend, bogus supplies fix and reset entries fix - BCM usb driver init array fix - cadence array offset fix - starfive link configuration fix - config dependency fix for rockchip driver - freescale reset signal fix before pll lock - tegra driver fix for error pointer check * tag 'phy-fixes-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: tegra: xusb: Add error pointer check in xusb.c dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Fix X1E80100 resets entries phy: freescale: imx8m-pcie: Do CMN_RST just before PHY PLL lock check phy: phy-rockchip-samsung-hdptx: Depend on CONFIG_COMMON_CLK phy: ti: phy-j721e-wiz: fix usxgmii configuration phy: starfive: jh7110-usb: Fix link configuration to controller phy: qcom: qmp-pcie: drop bogus x1e80100 qref supplies phy: qcom: qmp-combo: move driver data initialisation earlier phy: qcom: qmp-usbc: fix NULL-deref on runtime suspend phy: qcom: qmp-usb-legacy: fix NULL-deref on runtime suspend phy: qcom: qmp-usb: fix NULL-deref on runtime suspend dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: add missing x1e80100 pipediv2 clocks phy: usb: disable COMMONONN for dual mode phy: cadence: Sierra: Fix offset of DEQ open eye algorithm control register phy: usb: Fix missing elements in BCM4908 USB init array	2024-11-03 10:19:34 -10:00
Linus Torvalds	e8529dcb12	Merge tag 'dmaengine-fix-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: - TI driver fix to set EOP for cyclic BCDMA transfers - sh rz-dmac driver fix for handling config with zero address * tag 'dmaengine-fix-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: ti: k3-udma: Set EOP for all TRs in cyclic BCDMA transfer dmaengine: sh: rz-dmac: handle configs where one address is zero	2024-11-03 10:15:50 -10:00
Linus Torvalds	886b7e80ab	Merge tag 'driver-core-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core revert from Greg KH: "Here is a single driver core revert for 6.12-rc6. It reverts a change that came in -rc1 that was supposed to resolve a reported problem, but caused another one, so revert it for now so that we can get this all worked out properly in 6.13. The revert has been in linux-next all week with no reported issues" * tag 'driver-core-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Revert "driver core: Fix uevent_show() vs driver detach race"	2024-11-03 08:51:53 -10:00
Linus Torvalds	be5bfa1378	Merge tag 'usb-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt fixes from Greg KH: "Here are some small USB and Thunderbolt driver fixes for 6.12-rc6 that have been sitting in my tree this week. Included in here are the following: - thunderbolt driver fixes for reported issues - USB typec driver fixes - xhci driver fixes for reported problems - dwc2 driver revert for a broken change - usb phy driver fix - usbip tool fix All of these have been in linux-next this week with no reported issues" * tag 'usb-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: tcpm: restrict SNK_WAIT_CAPABILITIES_TIMEOUT transitions to non self-powered devices usb: phy: Fix API devm_usb_put_phy() can not release the phy usb: typec: use cleanup facility for 'altmodes_node' usb: typec: fix unreleased fwnode_handle in typec_port_register_altmodes() usb: typec: qcom-pmic-typec: fix missing fwnode removal in error path usb: typec: qcom-pmic-typec: use fwnode_handle_put() to release fwnodes usb: acpi: fix boot hang due to early incorrect 'tunneled' USB3 device links Revert "usb: dwc2: Skip clock gating on Broadcom SoCs" xhci: Fix Link TRB DMA in command ring stopped completion event xhci: Use pm_runtime_get to prevent RPM on unsupported systems usbip: tools: Fix detach_port() invalid port error path thunderbolt: Honor TMU requirements in the domain when setting TMU mode thunderbolt: Fix KASAN reported stack out-of-bounds read in tb_retimer_scan()	2024-11-03 08:48:11 -10:00
Yu Zhao	1d4832becd	mm: multi-gen LRU: use {ptep,pmdp}_clear_young_notify() When the MM_WALK capability is enabled, memory that is mostly accessed by a VM appears younger than it really is, therefore this memory will be less likely to be evicted. Therefore, the presence of a running VM can significantly increase swap-outs for non-VM memory, regressing the performance for the rest of the system. Fix this regression by always calling {ptep,pmdp}_clear_young_notify() whenever we clear the young bits on PMDs/PTEs. [jthoughton@google.com: fix link-time error] Link: https://lkml.kernel.org/r/20241019012940.3656292-3-jthoughton@google.com Fixes: `bd74fdaea1` ("mm: multi-gen LRU: support page table walks") Signed-off-by: Yu Zhao <yuzhao@google.com> Signed-off-by: James Houghton <jthoughton@google.com> Reported-by: David Stevens <stevensd@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Matlack <dmatlack@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Wei Xu <weixugc@google.com> Cc: <stable@vger.kernel.org> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-11-03 10:47:03 -08:00
Yu Zhao	ddd6d8e975	mm: multi-gen LRU: remove MM_LEAF_OLD and MM_NONLEAF_TOTAL stats Patch series "mm: multi-gen LRU: Have secondary MMUs participate in MM_WALK". Today, the MM_WALK capability causes MGLRU to clear the young bit from PMDs and PTEs during the page table walk before eviction, but MGLRU does not call the clear_young() MMU notifier in this case. By not calling this notifier, the MM walk takes less time/CPU, but it causes pages that are accessed mostly through KVM / secondary MMUs to appear younger than they should be. We do call the clear_young() notifier today, but only when attempting to evict the page, so we end up clearing young/accessed information less frequently for secondary MMUs than for mm PTEs, and therefore they appear younger and are less likely to be evicted. Therefore, memory that is not being accessed mostly by KVM will be evicted more frequently, worsening performance. ChromeOS observed a tab-open latency regression when enabling MGLRU with a setup that involved running a VM: Tab-open latency histogram (ms) Version p50 mean p95 p99 max base 1315 1198 2347 3454 10319 mglru 2559 1311 7399 12060 43758 fix 1119 926 2470 4211 6947 This series replaces the final non-selftest patchs from this series[1], which introduced a similar change (and a new MMU notifier) with KVM optimizations. I'll send a separate series (to Sean and Paolo) for the KVM optimizations. This series also makes proactive reclaim with MGLRU possible for KVM memory. I have verified that this functions correctly with the selftest from [1], but given that that test is a KVM selftest, I'll send it with the rest of the KVM optimizations later. Andrew, let me know if you'd like to take the test now anyway. [1]: https://lore.kernel.org/linux-mm/20240926013506.860253-18-jthoughton@google.com/ This patch (of 2): The removed stats, MM_LEAF_OLD and MM_NONLEAF_TOTAL, are not very helpful and become more complicated to properly compute when adding test/clear_young() notifiers in MGLRU's mm walk. Link: https://lkml.kernel.org/r/20241019012940.3656292-1-jthoughton@google.com Link: https://lkml.kernel.org/r/20241019012940.3656292-2-jthoughton@google.com Fixes: `bd74fdaea1` ("mm: multi-gen LRU: support page table walks") Signed-off-by: Yu Zhao <yuzhao@google.com> Signed-off-by: James Houghton <jthoughton@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Matlack <dmatlack@google.com> Cc: David Rientjes <rientjes@google.com> Cc: David Stevens <stevensd@google.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Wei Xu <weixugc@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-11-03 10:47:02 -08:00
Linus Torvalds	32cfb3c48e	Merge tag 'char-misc-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull misc driver fixes from Greg KH: "Here are some small char/misc/iio fixes for 6.12-rc6 that resolve some reported issues. Included in here are the following: - small IIO driver fixes for many reported issues - mei driver fix for a suddenly much reported issue for an "old" issue. - MAINTAINERS update for a developer who has moved companies and forgot to update their old entry. All of these have been in linux-next this week with no reported issues" * tag 'char-misc-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: mei: use kvmalloc for read buffer MAINTAINERS: add netup_unidvb maintainer iio: dac: Kconfig: Fix build error for ltc2664 iio: adc: ad7124: fix division by zero in ad7124_set_channel_odr() staging: iio: frequency: ad9832: fix division by zero in ad9832_calc_freqreg() docs: iio: ad7380: fix supply for ad7380-4 iio: adc: ad7380: fix supplies for ad7380-4 iio: adc: ad7380: add missing supplies iio: adc: ad7380: use devm_regulator_get_enable_read_voltage() dt-bindings: iio: adc: ad7380: fix ad7380-4 reference supply iio: light: veml6030: fix microlux value calculation iio: gts-helper: Fix memory leaks for the error path of iio_gts_build_avail_scale_table() iio: gts-helper: Fix memory leaks in iio_gts_build_avail_scale_table()	2024-11-03 08:45:03 -10:00
Linus Torvalds	295ba6501d	Merge tag 'input-for-v6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a fix for regression in input core introduced in 6.11 preventing re-registering input handlers - a fix for adp5588-keys driver tyring to disable interrupt 0 at suspend when devices is used without interrupt - a fix for edt-ft5x06 to stop leaking regmap structure when probing fails and to make sure it is not released too early on removal. * tag 'input-for-v6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: fix regression when re-registering input handlers Input: adp5588-keys - do not try to disable interrupt 0 Input: edt-ft5x06 - fix regmap leak when probe fails	2024-11-03 08:35:29 -10:00
Linus Torvalds	a33ab3f94f	Merge tag 'kbuild-fixes-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix a memory leak in modpost - Resolve build issues when cross-compiling RPM and Debian packages - Fix another regression in Kconfig - Fix incorrect MODULE_ALIAS() output in modpost * tag 'kbuild-fixes-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: modpost: fix input MODULE_DEVICE_TABLE() built for 64-bit on 32-bit host modpost: fix acpi MODULE_DEVICE_TABLE built with mismatched endianness kconfig: show sub-menu entries even if the prompt is hidden kbuild: deb-pkg: add pkg.linux-upstream.nokerneldbg build profile kbuild: deb-pkg: add pkg.linux-upstream.nokernelheaders build profile kbuild: rpm-pkg: disable kernel-devel package when cross-compiling sumversion: Fix a memory leak in get_src_version()	2024-11-03 08:29:02 -10:00
Linus Torvalds	b9021de3ec	Merge tag 'x86-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A trivial compile test fix for x86: When CONFIG_AMD_NB is not set a COMPILE_TEST of an AMD specific driver fails due to a missing inline stub. Add the stub to cure it" * tag 'x86-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd_nb: Fix compile-testing without CONFIG_AMD_NB	2024-11-03 08:26:00 -10:00
Linus Torvalds	b019b4a670	Merge tag 'timers-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix for posix CPU timers. When a thread is cloned, the posix CPU timers are not inherited. If the parent has a CPU timer armed the corresponding tick dependency in the tasks tick_dep_mask is set and copied to the new thread, which means the new thread and all decendants will prevent the system to go into full NOHZ operation. Clear the tick dependency mask in copy_process() to fix this" * tag 'timers-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-cpu-timers: Clear TICK_DEP_BIT_POSIX_TIMER on clone	2024-11-03 08:22:21 -10:00
Linus Torvalds	33e83ffe4c	Merge tag 'sched-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: - Plug a race between pick_next_task_fair() and try_to_wake_up() where both try to write to the same task, even though both paths hold a runqueue lock, but obviously from different runqueues. The problem is that the store to task::on_rq in __block_task() is visible to try_to_wake_up() which assumes that the task is not queued. Both sides then operate on the same task. Cure it by rearranging __block_task() so the the store to task::on_rq is the last operation on the task. - Prevent a potential NULL pointer dereference in task_numa_work() task_numa_work() iterates the VMAs of a process. A concurrent unmap of the address space can result in a NULL pointer return from vma_next() which is unchecked. Add the missing NULL pointer check to prevent this. - Operate on the correct scheduler policy in task_should_scx() task_should_scx() returns true when a task should be handled by sched EXT. It checks the tasks scheduling policy. This fails when the check is done before a policy has been set. Cure it by handing the policy into task_should_scx() so it operates on the requested value. - Add the missing handling of sched EXT in the delayed dequeue mechanism. This was simply forgotten. * tag 'sched-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/ext: Fix scx vs sched_delayed sched: Pass correct scheduling policy to __setscheduler_class sched/numa: Fix the potential null pointer dereference in task_numa_work() sched: Fix pick_next_task_fair() vs try_to_wake_up() race	2024-11-03 08:18:28 -10:00
Linus Torvalds	68f05b251b	Merge tag 'perf-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Thomas Gleixner: "perf_event_clear_cpumask() uses list_for_each_entry_rcu() without being in a RCU read side critical section, which triggers a 'suspicious RCU usage' warning. It turns out that the list walk does not be RCU protected because the write side lock is held in this context. Change it to a regular list walk" * tag 'perf-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix missing RCU reader protection in perf_event_clear_cpumask()	2024-11-03 08:13:52 -10:00
Linus Torvalds	8f0b844adc	Merge tag 'irq-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: - Fix an off-by-one error in the failure path of msi_domain_alloc(), which causes the cleanup loop to terminate early and leaking the first allocated interrupt. - Handle a corner case in GIC-V4 versus a lazily mapped Virtual Processing Element (VPE). If the VPE has not been mapped because the guest has not yet emitted a mapping command, then the set_affinity() callback returns an error code, which causes the vCPU management to fail. Return success in this case without touching the hardware. This will be done later when the guest issues the mapping command. * tag 'irq-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v4: Correctly deal with set_affinity on lazily-mapped VPEs genirq/msi: Fix off-by-one error in msi_domain_alloc()	2024-11-03 08:09:25 -10:00
Masahiro Yamada	77dc55a978	modpost: fix input MODULE_DEVICE_TABLE() built for 64-bit on 32-bit host When building a 64-bit kernel on a 32-bit build host, incorrect input MODULE_ALIAS() entries may be generated. For example, when compiling a 64-bit kernel with CONFIG_INPUT_MOUSEDEV=m on a 64-bit build machine, you will get the correct output: $ grep MODULE_ALIAS drivers/input/mousedev.mod.c MODULE_ALIAS("input:bvpe-e1,2,k110,r0,1,amlsfw"); MODULE_ALIAS("input:bvpe-e1,2,kr8,amlsfw"); MODULE_ALIAS("input:bvpe-e1,3,k14A,ra0,1,mlsfw"); MODULE_ALIAS("input:bvpe-e1,3,k145,ra0,1,18,1C,mlsfw"); MODULE_ALIAS("input:bvpe-e1,3,k110,ra0,1,mlsfw"); However, building the same kernel on a 32-bit machine results in incorrect output: $ grep MODULE_ALIAS drivers/input/mousedev.mod.c MODULE_ALIAS("input:bvpe-e1,2,k110,130,r0,1,amlsfw"); MODULE_ALIAS("input:bvpe-e1,2,kr8,amlsfw"); MODULE_ALIAS("input:bvpe-e1,3,k14A,16A,ra0,1,20,21,mlsfw"); MODULE_ALIAS("input:bvpe-e1,3,k145,165,ra0,1,18,1C,20,21,38,3C,mlsfw"); MODULE_ALIAS("input:bvpe-e1,3,k110,130,ra0,1,20,21,mlsfw"); A similar issue occurs with CONFIG_INPUT_JOYDEV=m. On a 64-bit build machine, the output is: $ grep MODULE_ALIAS drivers/input/joydev.mod.c MODULE_ALIAS("input:bvpe-e3,kra0,mlsfw"); MODULE_ALIAS("input:bvpe-e3,kra2,mlsfw"); MODULE_ALIAS("input:bvpe-e3,kra8,mlsfw"); MODULE_ALIAS("input:bvpe-e3,kra6,mlsfw"); MODULE_ALIAS("input:bvpe-e1,k120,ramlsfw"); MODULE_ALIAS("input:bvpe-e1,k130,ramlsfw"); MODULE_ALIAS("input:bvpe-e1,k2C0,ramlsfw"); However, on a 32-bit machine, the output is incorrect: $ grep MODULE_ALIAS drivers/input/joydev.mod.c MODULE_ALIAS("input:bvpe-e3,kra0,20,mlsfw"); MODULE_ALIAS("input:bvpe-e3,kra2,22,mlsfw"); MODULE_ALIAS("input:bvpe-e3,kra8,28,mlsfw"); MODULE_ALIAS("input:bvpe-e3,kra6,26,mlsfw"); MODULE_ALIAS("input:bvpe-e1,k11F,13F,ramlsfw"); MODULE_ALIAS("input:bvpe-e1,k11F,13F,ramlsfw"); MODULE_ALIAS("input:bvpe-e1,k2C0,2E0,ramlsfw*"); When building a 64-bit kernel, BITS_PER_LONG is defined as 64. However, on a 32-bit build machine, the constant 1L is a signed 32-bit value. Left-shifting it beyond 32 bits causes wraparound, and shifting by 31 or 63 bits makes it a negative value. The fix in commit `e0e9263271` ("[PATCH] PATCH: 1 line 2.6.18 bugfix: modpost-64bit-fix.patch") is incorrect; it only addresses cases where a 64-bit kernel is built on a 64-bit build machine, overlooking cases on a 32-bit build machine. Using 1ULL ensures a 64-bit width on both 32-bit and 64-bit machines, avoiding the wraparound issue. Fixes: `e0e9263271` ("[PATCH] PATCH: 1 line 2.6.18 bugfix: modpost-64bit-fix.patch") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-11-03 23:58:56 +09:00
Masahiro Yamada	2e766a1f5f	modpost: fix acpi MODULE_DEVICE_TABLE built with mismatched endianness When CONFIG_SATA_AHCI_PLATFORM=m, modpost outputs incorect acpi MODULE_ALIAS() if the endianness of the target and the build machine do not match. When the endianness of the target kernel and the build machine match, the output is correct: $ grep 'MODULE_ALIAS("acpi' drivers/ata/ahci_platform.mod.c MODULE_ALIAS("acpi:APMC0D33:"); MODULE_ALIAS("acpi:010601:"); However, when building a little-endian kernel on a big-endian machine (or vice versa), the output is incorrect: $ grep 'MODULE_ALIAS("acpi' drivers/ata/ahci_platform.mod.c MODULE_ALIAS("acpi:APMC0D33:"); MODULE_ALIAS("acpi:0601??:"); The 'cls' and 'cls_msk' fields are 32-bit. DEF_FIELD() must be used instead of DEF_FIELD_ADDR() to correctly handle endianness of these 32-bit fields. The check 'if (cls)' was unnecessary; it never became NULL, as it was the pointer to 'symval' plus the offset to the 'cls' field. Fixes: `26095a01d3` ("ACPI / scan: Add support for ACPI _CLS device matching") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-11-03 22:52:12 +09:00
Dmitry Torokhov	071b24b54d	Input: fix regression when re-registering input handlers Commit `d469647baf` ("Input: simplify event handling logic") introduced code that would set handler->events() method to either input_handler_events_filter() or input_handler_events_default() or input_handler_events_null(), depending on the kind of input handler (a filter or a regular one) we are dealing with. Unfortunately this breaks cases when we try to re-register the same filter (as is the case with sysrq handler): after initial registration the handler will have 2 event handling methods defined, and will run afoul of the check in input_handler_check_methods(): input: input_handler_check_methods: only one event processing method can be defined (sysrq) sysrq: Failed to register input handler, error -22 Fix this by adding handle_events() method to input_handle structure and setting it up when registering a new input handle according to event handling methods defined in associated input_handler structure, thus avoiding modifying the input_handler structure. Reported-by: "Ned T. Crigler" <crigler@gmail.com> Reported-by: Christian Heusel <christian@heusel.eu> Tested-by: "Ned T. Crigler" <crigler@gmail.com> Tested-by: Peter Seiderer <ps.report@gmx.net> Fixes: `d469647baf` ("Input: simplify event handling logic") Link: https://lore.kernel.org/r/Zx2iQp6csn42PJA7@xavtug Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2024-11-02 22:28:58 -07:00
Linus Torvalds	3e5e6c9900	Merge tag 'nfsd-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix two async COPY bugs found during NFS bake-a-thon - Fix an svcrdma memory leak * tag 'nfsd-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: rpcrdma: Always release the rpcrdma_device's xa_array NFSD: Never decrement pending_async_copies on error NFSD: Initialize struct nfsd4_copy earlier	2024-11-02 09:27:11 -10:00
Linus Torvalds	f6a7b4ec74	Merge tag 'xfs-6.12-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Carlos Maiolino: - fix a sysbot reported crash on filestreams - Reduce cpu time spent searching for extents in a very fragmented FS - Check for delayed allocations before setting extsize * tag 'xfs-6.12-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: streamline xfs_filestream_pick_ag xfs: fix finding a last resort AG in xfs_filestream_pick_ag xfs: Reduce unnecessary searches when searching for the best extents xfs: Check for delayed allocations before setting extsize	2024-11-02 09:22:16 -10:00
Linus Torvalds	11066801dd	Merge tag 'linux_kselftest-fixes-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: - fix syntax error in frequency calculation arithmetic expression in intel_pstate run.sh - add missing cpupower dependency check intel_pstate run.sh - fix idmap_mount_tree_invalid test failure due to incorrect argument - fix watchdog-test run leaving the watchdog timer enabled causing system reboot. With this fix, the test disables the watchdog timer when it gets terminated with SIGTERM, SIGKILL, and SIGQUIT in addition to SIGINT * tag 'linux_kselftest-fixes-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/watchdog-test: Fix system accidentally reset after watchdog-test selftests/intel_pstate: check if cpupower is installed selftests/intel_pstate: fix operand expected error selftests/mount_setattr: fix idmap_mount_tree_invalid failed to run	2024-11-01 16:05:50 -10:00
Linus Torvalds	f7292c0934	Merge tag 'rust-fixes-6.12-3' of https://github.com/Rust-for-Linux/linux Pull rust fixes from Miguel Ojeda: "Toolchain and infrastructure: - Avoid build errors with old 'rustc's without LLVM patch version (important since it impacts people that do not even enable Rust) - Update LLVM version for 'HAVE_CFI_ICALL_NORMALIZE_INTEGERS' in 'depends on' condition (the fix was eventually backported rather than land in LLVM 19)" * tag 'rust-fixes-6.12-3' of https://github.com/Rust-for-Linux/linux: cfi: tweak llvm version for HAVE_CFI_ICALL_NORMALIZE_INTEGERS kbuild: rust: avoid errors with old `rustc`s without LLVM patch version	2024-11-01 15:59:46 -10:00
Linus Torvalds	05b92660cd	Merge tag 'pci-v6.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fix from Bjorn Helgaas: - Enable device-specific ACS-like functionality even if the device doesn't advertise an ACS capability, which got broken when adding fancy ACS kernel parameter (Jason Gunthorpe) * tag 'pci-v6.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: Fix pci_enable_acs() support for the ACS quirks	2024-11-01 15:44:23 -10:00
Linus Torvalds	269ce3bd62	Merge tag 'drm-fixes-2024-11-02' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Regular fixes pull, nothing too out of the ordinary, the mediatek fixes came in a batch that I might have preferred a bit earlier but all seem fine, otherwise regular xe/amdgpu and a few misc ones. xe: - Fix missing HPD interrupt enabling, bringing one PM refactor with it - Workaround LNL GGTT invalidation not being visible to GuC - Avoid getting jobs stuck without a protecting timeout ivpu: - Fix firewall IRQ handling panthor: - Fix firmware initialization wrt page sizes - Fix handling and reporting of dead job groups sched: - Guarantee forward progress via WC_MEM_RECLAIM tests: - Fix memory leak in drm_display_mode_from_cea_vic() amdgpu: - DCN 3.5 fix - Vangogh SMU KASAN fix - SMU 13 profile reporting fix mediatek: - Fix degradation problem of alpha blending - Fix color format MACROs in OVL - Fix get efuse issue for MT8188 DPTX - Fix potential NULL dereference in mtk_crtc_destroy() - Correct dpi power-domains property - Add split subschema property constraints" * tag 'drm-fixes-2024-11-02' of https://gitlab.freedesktop.org/drm/kernel: (27 commits) drm/xe: Don't short circuit TDR on jobs not started drm/xe: Add mmio read before GGTT invalidate drm/tests: hdmi: Fix memory leaks in drm_display_mode_from_cea_vic() drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic() drm/tests: helpers: Add helper for drm_display_mode_from_cea_vic() drm/panthor: Report group as timedout when we fail to properly suspend drm/panthor: Fail job creation when the group is dead drm/panthor: Fix firmware initialization on systems with a page size > 4k accel/ivpu: Fix NOC firewall interrupt handling drm/xe/display: Add missing HPD interrupt enabling during non-d3cold RPM resume drm/xe/display: Separate the d3cold and non-d3cold runtime PM handling drm/xe: Remove runtime argument from display s/r functions drm/amdgpu/smu13: fix profile reporting drm/amd/pm: Vangogh: Fix kernel memory out of bounds write Revert "drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35" drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM drm/tegra: Fix NULL vs IS_ERR() check in probe() dt-bindings: display: mediatek: split: add subschema property constraints dt-bindings: display: mediatek: dpi: correct power-domains property drm/mediatek: Fix potential NULL dereference in mtk_crtc_destroy() ...	2024-11-01 15:37:09 -10:00
Linus Torvalds	b1966a1fd2	Merge tag 'cxl-fixes-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Ira Weiny: "The bulk of these fixes center around an initialization order bug reported by Gregory Price and some additional fall out from the debugging effort. In summary, cxl_acpi and cxl_mem race and previously worked because of a bus_rescan_devices() while testing without modules built in. Unfortunately with modules built in the rescan would fail due to the cxl_port driver being registered late via the build order. Furthermore it was found bus_rescan_devices() did not guarantee a probe barrier which CXL was expecting. Additional fixes to cxl-test and decoder allocation came along as they were found in this debugging effort. The other fixes are pretty minor but one affects trace point data seen by user space. Summary: - Fix crashes when running with cxl-test code - Fix Trace DRAM Event Record field decodes - Fix module/built in initialization order errors - Fix use after free on decoder shutdowns - Fix out of order decoder allocations - Improve cxl-test to better reflect real world systems" * tag 'cxl-fixes-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/test: Improve init-order fidelity relative to real-world systems cxl/port: Prevent out-of-order decoder allocation cxl/port: Fix use-after-free, permit out-of-order decoder shutdown cxl/acpi: Ensure ports ready at cxl_acpi_probe() return cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices() cxl/port: Fix CXL port initialization order when the subsystem is built-in cxl/events: Fix Trace DRAM Event Record cxl/core: Return error when cxl_endpoint_gather_bandwidth() handles a non-PCI device	2024-11-01 15:22:57 -10:00
Linus Torvalds	f4a1e8e369	Merge tag 'block-6.12-20241101' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - Fixup for a recent blk_rq_map_user_bvec() patch - NVMe pull request via Keith: - Spec compliant identification fix (Keith) - Module parameter to enable backward compatibility on unusual namespace formats (Keith) - Target double free fix when using keys (Vitaliy) - Passthrough command error handling fix (Keith) * tag 'block-6.12-20241101' of git://git.kernel.dk/linux: nvme: re-fix error-handling for io_uring nvme-passthrough nvmet-auth: assign dh_key to NULL after kfree_sensitive nvme: module parameter to disable pi with offsets block: fix queue limits checks in blk_rq_map_user_bvec for real nvme: enhance cns version checking	2024-11-01 13:41:55 -10:00
Linus Torvalds	f0d3699aef	Merge tag 'io_uring-6.12-20241101' of git://git.kernel.dk/linux Pull io_uring fix from Jens Axboe: - Fix not honoring IOCB_NOWAIT for starting buffered writes in terms of calling sb_start_write(), leading to a deadlock if someone is attempting to freeze the file system with writes in progress, as each side will end up waiting for the other to make progress. * tag 'io_uring-6.12-20241101' of git://git.kernel.dk/linux: io_uring/rw: fix missing NOWAIT check for O_DIRECT start write	2024-11-01 13:38:01 -10:00
Linus Torvalds	c426456857	Merge tag 'acpi-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Make the ACPI CPPC library use a raw spinlock for operations carried out in scheduler context via the schedutil governor and the ACPI CPPC cpufreq driver (Pierre Gondois)" * tag 'acpi-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: CPPC: Make rmw_lock a raw_spin_lock	2024-11-01 09:04:23 -10:00
Linus Torvalds	edf0227abd	Merge tag 'gpio-fixes-for-v6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - fix an uninitialized variable in GPIO swnode code - add a missing return value check for devm_mutex_init() - fix an old issue with debugfs output * tag 'gpio-fixes-for-v6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpiolib: fix debugfs dangling chip separator gpiolib: fix debugfs newline separators gpio: sloppy-logic-analyzer: Check for error code from devm_mutex_init() call gpio: fix uninit-value in swnode_find_gpio	2024-11-01 09:03:02 -10:00
Dave Airlie	f99c7cca2f	Merge tag 'drm-xe-fixes-2024-10-31' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix missing HPD interrupt enabling, bringing one PM refactor with it (Imre / Maarten) - Workaround LNL GGTT invalidation not being visible to GuC (Matthew Brost) - Avoid getting jobs stuck without a protecting timeout (Matthew Brost) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/tsbftadm7owyizzdaqnqu7u4tqggxgeqeztlfvmj5fryxlfomi@5m5bfv2zvzmw	2024-11-02 04:44:27 +10:00
Linus Torvalds	a031e15404	Merge tag 'riscv-for-linus-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - Avoid accessing the early boot ACPI tables via unsafe memory attributes, which can result in incorrect ACPI table data appearing. This can cause all sorts of bad behavior. - Avoid compiler-inserted library calls in the VDSO. - GCC+Rust builds have been disabled, to avoid issues related to ISA string mismatched between the GCC and LLVM Rust implementations. - The NX flag is now set in the EFI PE/COFF headers, which is necessary for some distro GRUB versions to boot images. - A fix to avoid leaking DT node reference counts on ACPI systems during cache info parsing. - CPU numbers are now printed as unsigned values during hotplug. - A pair of build fixes for usused macros, which can trigger warnings on some configurations. * tag 'riscv-for-linus-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Remove duplicated GET_RM riscv: Remove unused GENERATING_ASM_OFFSETS riscv: Use '%u' to format the output of 'cpu' riscv: Prevent a bad reference count on CPU nodes riscv: efi: Set NX compat flag in PE/COFF header RISC-V: disallow gcc + rust builds riscv: Do not use fortify in early code RISC-V: ACPI: fix early_ioremap to early_memremap riscv: vdso: Prevent the compiler from inserting calls to memset()	2024-11-01 08:26:38 -10:00
Linus Torvalds	3dfffd506e	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The important one is a change to the way in which we handle protection keys around signal delivery so that we're more closely aligned with the x86 behaviour, however there is also a revert of the previous fix to disable software tag-based KASAN with GCC, since a workaround materialised shortly afterwards. I'd love to say we're done with 6.12, but we're aware of some longstanding fpsimd register corruption issues that we're almost at the bottom of resolving. Summary: - Fix handling of POR_EL0 during signal delivery so that pushing the signal context doesn't fail based on the pkey configuration of the interrupted context and align our user-visible behaviour with that of x86. - Fix a bogus pointer being passed to the CPU hotplug code from the Arm SDEI driver. - Re-enable software tag-based KASAN with GCC by using an alternative implementation of '__no_sanitize_address'" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: signal: Improve POR_EL0 handling to avoid uaccess failures firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state() Revert "kasan: Disable Software Tag-Based KASAN with GCC" kasan: Fix Software Tag-Based KASAN with GCC	2024-11-01 07:54:11 -10:00
Linus Torvalds	17fa6a5f93	Merge tag 'vfs-6.12-rc6.iomap' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull iomap fixes from Christian Brauner: "Fixes for iomap to prevent data corruption bugs in the fallocate unshare range implementation of fsdax and a small cleanup to turn iomap_want_unshare_iter() into an inline function" * tag 'vfs-6.12-rc6.iomap' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: iomap: turn iomap_want_unshare_iter into an inline function fsdax: dax_unshare_iter needs to copy entire blocks fsdax: remove zeroing code from dax_unshare_iter iomap: share iomap_unshare_iter predicate code with fsdax xfs: don't allocate COW extents when unsharing a hole	2024-11-01 07:45:00 -10:00
Linus Torvalds	d56239a82e	Merge tag 'vfs-6.12-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull filesystem fixes from Christian Brauner: "VFS: - Fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP=y is set - Add a get_tree_bdev_flags() helper that allows to modify e.g., whether errors are logged into the filesystem context during superblock creation. This is used by erofs to fix a userspace regression where an error is currently logged when its used on a regular file which is an new allowed mode in erofs. netfs: - Fix the sysfs debug path in the documentation. - Fix iov_iter_get_pages() for folio queues by skipping the page extracation if we're at the end of a folio. afs: - Fix moving subdirectories to different parent directory. autofs: - Fix handling of AUTOFS_DEV_IOCTL_TIMEOUT_CMD ioctl in validate_dev_ioctl(). The actual ioctl number, not the ioctl command needs to be checked for autofs" tag 'vfs-6.12-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP autofs: fix thinko in validate_dev_ioctl() iov_iter: Fix iov_iter_get_pages*() for folio_queue afs: Fix missing subdir edit when renamed between parent dirs doc: correcting the debug path for cachefiles erofs: use get_tree_bdev_flags() to avoid misleading messages fs/super.c: introduce get_tree_bdev_flags()	2024-11-01 07:37:10 -10:00
Linus Torvalds	6b4926494e	Merge tag 'for-6.12-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more stability fixes. There's one patch adding export of MIPS cmpxchg helper, used in the error propagation fix. - fix error propagation from split bios to the original btrfs bio - fix merging of adjacent extents (normal operation, defragmentation) - fix potential use after free after freeing btrfs device structures" * tag 'for-6.12-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix defrag not merging contiguous extents due to merged extent maps btrfs: fix extent map merging not happening for adjacent extents btrfs: fix use-after-free of block device file in __btrfs_free_extra_devids() btrfs: fix error propagation of split bios MIPS: export __cmpxchg_small()	2024-11-01 07:31:47 -10:00
Linus Torvalds	7b83601da4	Merge tag 'bcachefs-2024-10-31' of git://evilpiepirate.org/bcachefs Pull bcachefs fixes from Kent Overstreet: "Various syzbot fixes, and the more notable ones: - Fix for pointers in an extent overflowing the max (16) on a filesystem with many devices: we were creating too many cached copies when moving data around. Now, we only create at most one cached copy if there's a promote target set. Caching will be a bit broken for reflinked data until 6.13: I have larger series queued up which significantly improves the plumbing for data options down into the extent (bch_extent_rebalance) to fix this. - Fix for deadlock on -ENOSPC on tiny filesystems Allocation from the partial open_bucket list wasn't correctly accounting partial open_buckets as free: this fixes the main cause of tests timing out in the automated tests" * tag 'bcachefs-2024-10-31' of git://evilpiepirate.org/bcachefs: bcachefs: Fix NULL ptr dereference in btree_node_iter_and_journal_peek bcachefs: fix possible null-ptr-deref in __bch2_ec_stripe_head_get() bcachefs: Fix deadlock on -ENOSPC w.r.t. partial open buckets bcachefs: Don't filter partial list buckets in open_buckets_to_text() bcachefs: Don't keep tons of cached pointers around bcachefs: init freespace inited bits to 0 in bch2_fs_initialize bcachefs: Fix unhandled transaction restart in fallocate bcachefs: Fix UAF in bch2_reconstruct_alloc() bcachefs: fix null-ptr-deref in have_stripes() bcachefs: fix shift oob in alloc_lru_idx_fragmentation bcachefs: Fix invalid shift in validate_sb_layout()	2024-11-01 07:21:03 -10:00
Vlastimil Babka	d4148aeab4	mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes Since commit `efa7df3e3b` ("mm: align larger anonymous mappings on THP boundaries") a mmap() of anonymous memory without a specific address hint and of at least PMD_SIZE will be aligned to PMD so that it can benefit from a THP backing page. However this change has been shown to regress some workloads significantly. [1] reports regressions in various spec benchmarks, with up to 600% slowdown of the cactusBSSN benchmark on some platforms. The benchmark seems to create many mappings of 4632kB, which would have merged to a large THP-backed area before commit `efa7df3e3b` and now they are fragmented to multiple areas each aligned to PMD boundary with gaps between. The regression then seems to be caused mainly due to the benchmark's memory access pattern suffering from TLB or cache aliasing due to the aligned boundaries of the individual areas. Another known regression bisected to commit `efa7df3e3b` is darktable [2] [3] and early testing suggests this patch fixes the regression there as well. To fix the regression but still try to benefit from THP-friendly anonymous mapping alignment, add a condition that the size of the mapping must be a multiple of PMD size instead of at least PMD size. In case of many odd-sized mapping like the cactusBSSN creates, those will stop being aligned and with gaps between, and instead naturally merge again. Link: https://lkml.kernel.org/r/20241024151228.101841-2-vbabka@suse.cz Fixes: `efa7df3e3b` ("mm: align larger anonymous mappings on THP boundaries") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Michael Matz <matz@suse.de> Debugged-by: Gabriel Krisman Bertazi <gabriel@krisman.be> Closes: https://bugzilla.suse.com/show_bug.cgi?id=1229012 [1] Reported-by: Matthias Bodenbinder <matthias@bodenbinder.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219366 [2] Closes: https://lore.kernel.org/all/2050f0d4-57b0-481d-bab8-05e8d48fed0c@leemhuis.info/ [3] Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Cc: Rik van Riel <riel@surriel.com> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Petr Tesarik <ptesarik@suse.com> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-31 20:27:04 -07:00
Chen Ridong	15e8156713	mm: shrinker: avoid memleak in alloc_shrinker_info A memleak was found as below: unreferenced object 0xffff8881010d2a80 (size 32): comm "mkdir", pid 1559, jiffies 4294932666 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... backtrace (crc 2e7ef6fa): [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 [<ffffffff81198dd9>] online_css+0x29/0xa0 [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e alloc_shrinker_info(), when shrinker_unit_alloc() returns an errer, the info won't be freed. Just fix it. Link: https://lkml.kernel.org/r/20241025060942.1049263-1-chenridong@huaweicloud.com Fixes: `307bececcd` ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") Signed-off-by: Chen Ridong <chenridong@huawei.com> Acked-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Wang Weiyang <wangweiyang2@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-31 20:27:04 -07:00
Eugen Hristev	0173471d21	.mailmap: update e-mail address for Eugen Hristev Update e-mail address. Link: https://lkml.kernel.org/r/20241025085848.483149-1-eugen.hristev@linaro.org Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-31 20:27:04 -07:00
Gregory Price	35e41024c4	vmscan,migrate: fix page count imbalance on node stats when demoting pages When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. migrate_pages will decrement the the node's isolated page count, leading to an imbalanced count when invoked from (MG)LRU code. The result is dmesg output like such: $ cat /proc/sys/vm/stat_refresh [77383.088417] vmstat_refresh: nr_isolated_anon -103212 [77383.088417] vmstat_refresh: nr_isolated_file -899642 This negative value may impact compaction and reclaim throttling. The following path produces the decrement: shrink_folio_list demote_folio_list migrate_pages migrate_pages_batch migrate_folio_move migrate_folio_done mod_node_page_state(-ve) <- decrement This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code. When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION. Link: https://lkml.kernel.org/r/20241025141724.17927-1-gourry@gourry.net Fixes: `26aa2d199d` ("mm/migrate: demote pages during reclaim") Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Wei Xu <weixugc@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-31 20:27:04 -07:00
Jarkko Sakkinen	85d16bceaf	mailmap: update Jarkko's email addresses Remove my previous work email, and the new one. The previous was never used in the commit log, so there's no good reason to spare it. Link: https://lkml.kernel.org/r/20241025181530.6151-1-jarkko@kernel.org Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Cc: Alex Elder <elder@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Geliang Tang <geliang@kernel.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Kees Cook <kees@kernel.org> Cc: Matthieu Baerts (NGI0) <matttbe@kernel.org> Cc: Matt Ranostay <matt@ranostay.sg> Cc: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> Cc: Quentin Monnet <qmo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-31 20:27:03 -07:00
Linus Torvalds	6c52d4da1c	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma fixes from Jason Gunthorpe: - Put the QP netlink dump back in cxgb4, fixes a user visible regression - Don't change the rounding style in mlx5 for user provided rd_atomic values - Resolve a race in bnxt_re around the qp-handle table array * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/bnxt_re: synchronize the qp-handle table array RDMA/bnxt_re: Fix the usage of control path spin locks RDMA/mlx5: Round max_rd_atomic/max_dest_rd_atomic up instead of down RDMA/cxgb4: Dump vendor specific QP details	2024-10-31 16:49:23 -10:00
Linus Torvalds	5635f18942	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Daniel Borkmann: - Fix BPF verifier to force a checkpoint when the program's jump history becomes too long (Eduard Zingerman) - Add several fixes to the BPF bits iterator addressing issues like memory leaks and overflow problems (Hou Tao) - Fix an out-of-bounds write in trie_get_next_key (Byeonguk Jeong) - Fix BPF test infra's LIVE_FRAME frame update after a page has been recycled (Toke Høiland-Jørgensen) - Fix BPF verifier and undo the 40-bytes extra stack space for bpf_fastcall patterns due to various bugs (Eduard Zingerman) - Fix a BPF sockmap race condition which could trigger a NULL pointer dereference in sock_map_link_update_prog (Cong Wang) - Fix tcp_bpf_recvmsg_parser to retrieve seq_copied from tcp_sk under the socket lock (Jiayuan Chen) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, test_run: Fix LIVE_FRAME frame update after a page has been recycled selftests/bpf: Add three test cases for bits_iter bpf: Use __u64 to save the bits in bits iterator bpf: Check the validity of nr_words in bpf_iter_bits_new() bpf: Add bpf_mem_alloc_check_size() helper bpf: Free dynamically allocated bits in bpf_iter_bits_destroy() bpf: disallow 40-bytes extra stack for bpf_fastcall patterns selftests/bpf: Add test for trie_get_next_key() bpf: Fix out-of-bounds write in trie_get_next_key() selftests/bpf: Test with a very short loop bpf: Force checkpoint when jmp history is too long bpf: fix filed access without lock sock_map: fix a NULL pointer dereference in sock_map_link_update_prog()	2024-10-31 14:56:19 -10:00
Linus Torvalds	90602c251c	Merge tag 'net-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from WiFi, bluetooth and netfilter. No known new regressions outstanding. Current release - regressions: - wifi: mt76: do not increase mcu skb refcount if retry is not supported Current release - new code bugs: - wifi: - rtw88: fix the RX aggregation in USB 3 mode - mac80211: fix memory corruption bug in struct ieee80211_chanctx Previous releases - regressions: - sched: - stop qdisc_tree_reduce_backlog on TC_H_ROOT - sch_api: fix xa_insert() error path in tcf_block_get_ext() - wifi: - revert "wifi: iwlwifi: remove retry loops in start" - cfg80211: clear wdev->cqm_config pointer on free - netfilter: fix potential crash in nf_send_reset6() - ip_tunnel: fix suspicious RCU usage warning in ip_tunnel_find() - bluetooth: fix null-ptr-deref in hci_read_supported_codecs - eth: mlxsw: add missing verification before pushing Tx header - eth: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue Previous releases - always broken: - wifi: mac80211: do not pass a stopped vif to the driver in .get_txpower - netfilter: sanitize offset and length before calling skb_checksum() - core: - fix crash when config small gso_max_size/gso_ipv4_max_size - skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension - mptcp: protect sched with rcu_read_lock - eth: ice: fix crash on probe for DPLL enabled E810 LOM - eth: macsec: fix use-after-free while sending the offloading packet - eth: stmmac: fix unbalanced DMA map/unmap for non-paged SKB data - eth: hns3: fix kernel crash when 1588 is sent on HIP08 devices - eth: mtk_wed: fix path of MT7988 WO firmware" * tag 'net-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits) net: hns3: fix kernel crash when 1588 is sent on HIP08 devices net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue net: hns3: initialize reset_timer before hclgevf_misc_irq_init() net: hns3: don't auto enable misc vector net: hns3: Resolved the issue that the debugfs query result is inconsistent. net: hns3: fix missing features due to dev->features configuration too early net: hns3: fixed reset failure issues caused by the incorrect reset type net: hns3: add sync command to sync io-pgtable net: hns3: default enable tx bounce buffer when smmu enabled netfilter: nft_payload: sanitize offset and length before calling skb_checksum() net: ethernet: mtk_wed: fix path of MT7988 WO firmware selftests: forwarding: Add IPv6 GRE remote change tests mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 address mlxsw: pci: Sync Rx buffers for device mlxsw: pci: Sync Rx buffers for CPU mlxsw: spectrum_ptp: Add missing verification before pushing Tx header net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecs netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6() netfilter: Fix use-after-free in get_info() ...	2024-10-31 12:39:58 -10:00
Dave Airlie	427360718e	Merge tag 'mediatek-drm-fixes-20241028' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20241028 1. Fix degradation problem of alpha blending 2. Fix color format MACROs in OVL 3. Fix get efuse issue for MT8188 DPTX 4. Fix potential NULL dereference in mtk_crtc_destroy() 5. Correct dpi power-domains property 6. Add split subschema property constraints Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241028135846.3570-1-chunkuang.hu@kernel.org	2024-11-01 07:34:21 +10:00
Dave Airlie	8594a2d8d7	Merge tag 'amd-drm-fixes-6.12-2024-10-31' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.12-2024-10-31: amdgpu: - DCN 3.5 fix - Vangogh SMU KASAN fix - SMU 13 profile reporting fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241031151539.3523633-1-alexander.deucher@amd.com	2024-11-01 07:25:41 +10:00
Dave Airlie	989c5b9051	Merge tag 'drm-misc-fixes-2024-10-31' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: ivpu: - Fix firewall IRQ handling panthor: - Fix firmware initialization wrt page sizes - Fix handling and reporting of dead job groups sched: - Guarantee forward progress via WC_MEM_RECLAIM tests: - Fix memory leak in drm_display_mode_from_cea_vic() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241031144348.GA7826@linux-2.fritz.box	2024-11-01 06:08:24 +10:00
Linus Torvalds	15cb732c16	Merge tag 'sound-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Here we see slightly more commits than wished, but basically all are small and mostly trivial fixes. The only core change is the workaround for __counted_by() usage in ASoC DAPM code, while the rest are device-specific fixes for Intel Baytrail devices, Cirrus and wcd937x codecs, and HD-audio / USB-audio devices" * tag 'sound-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1 ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3 ALSA: usb-audio: Add quirks for Dell WD19 dock ASoC: codecs: wcd937x: relax the AUX PDM watchdog ASoC: codecs: wcd937x: add missing LO Switch control ASoC: dt-bindings: rockchip,rk3308-codec: add port property ALSA: hda/realtek: Add subwoofer quirk for Infinix ZERO BOOK 13 ASoC: dapm: fix bounds checker error in dapm_widget_list_create ASoC: Intel: sst: Fix used of uninitialized ctx to log an error ASoC: cs42l51: Fix some error handling paths in cs42l51_probe() ASoC: Intel: sst: Support LPE0F28 ACPI HID ALSA: hda/realtek: Limit internal Mic boost on Dell platform ASoC: Intel: bytcr_rt5640: Add DMI quirk for Vexia Edu Atla 10 tablet ASoC: Intel: bytcr_rt5640: Add support for non ACPI instantiated codec ASoC: codecs: rt5640: Always disable IRQs from rt5640_cancel_work()	2024-10-31 08:15:40 -10:00
Johan Hovold	604888f8c3	gpiolib: fix debugfs dangling chip separator Add the missing newline after entries for recently removed gpio chips so that the chip sections are separated by a newline as intended. Fixes: `e348544f79` ("gpio: protect the list of GPIO devices with SRCU") Cc: stable@vger.kernel.org # 6.9 Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241028125000.24051-3-johan+linaro@kernel.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-10-31 19:14:17 +01:00
Johan Hovold	3e8b7238b4	gpiolib: fix debugfs newline separators The gpiolib debugfs interface exports a list of all gpio chips in a system and the state of their pins. The gpio chip sections are supposed to be separated by a newline character, but a long-standing bug prevents the separator from being included when output is generated in multiple sessions, making the output inconsistent and hard to read. Make sure to only suppress the newline separator at the beginning of the file as intended. Fixes: `f9c4a31f61` ("gpiolib: Use seq_file's iterator interface") Cc: stable@vger.kernel.org # 3.7 Cc: Thierry Reding <treding@nvidia.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241028125000.24051-2-johan+linaro@kernel.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-10-31 19:14:17 +01:00
Filipe Manana	77b0d113ee	btrfs: fix defrag not merging contiguous extents due to merged extent maps When running defrag (manual defrag) against a file that has extents that are contiguous and we already have the respective extent maps loaded and merged, we end up not defragging the range covered by those contiguous extents. This happens when we have an extent map that was the result of merging multiple extent maps for contiguous extents and the length of the merged extent map is greater than or equals to the defrag threshold length. The script below reproduces this scenario: $ cat test.sh #!/bin/bash DEV=/dev/sdi MNT=/mnt/sdi mkfs.btrfs -f $DEV mount $DEV $MNT # Create a 256K file with 4 extents of 64K each. xfs_io -f -c "falloc 0 64K" \ -c "pwrite 0 64K" \ -c "falloc 64K 64K" \ -c "pwrite 64K 64K" \ -c "falloc 128K 64K" \ -c "pwrite 128K 64K" \ -c "falloc 192K 64K" \ -c "pwrite 192K 64K" \ $MNT/foo umount $MNT echo -n "Initial number of file extent items: " btrfs inspect-internal dump-tree -t 5 $DEV \| grep EXTENT_DATA \| wc -l mount $DEV $MNT # Read the whole file in order to load and merge extent maps. cat $MNT/foo > /dev/null btrfs filesystem defragment -t 128K $MNT/foo umount $MNT echo -n "Number of file extent items after defrag with 128K threshold: " btrfs inspect-internal dump-tree -t 5 $DEV \| grep EXTENT_DATA \| wc -l mount $DEV $MNT # Read the whole file in order to load and merge extent maps. cat $MNT/foo > /dev/null btrfs filesystem defragment -t 256K $MNT/foo umount $MNT echo -n "Number of file extent items after defrag with 256K threshold: " btrfs inspect-internal dump-tree -t 5 $DEV \| grep EXTENT_DATA \| wc -l Running it: $ ./test.sh Initial number of file extent items: 4 Number of file extent items after defrag with 128K threshold: 4 Number of file extent items after defrag with 256K threshold: 4 The 4 extents don't get merged because we have an extent map with a size of 256K that is the result of merging the individual extent maps for each of the four 64K extents and at defrag_lookup_extent() we have a value of zero for the generation threshold ('newer_than' argument) since this is a manual defrag. As a consequence we don't call defrag_get_extent() to get an extent map representing a single file extent item in the inode's subvolume tree, so we end up using the merged extent map at defrag_collect_targets() and decide not to defrag. Fix this by updating defrag_lookup_extent() to always discard extent maps that were merged and call defrag_get_extent() regardless of the minimum generation threshold ('newer_than' argument). A test case for fstests will be sent along soon. CC: stable@vger.kernel.org # 6.1+ Fixes: `199257a78b` ("btrfs: defrag: don't use merged extent map for their generation check") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-31 16:46:41 +01:00
Filipe Manana	a0f0625390	btrfs: fix extent map merging not happening for adjacent extents If we have 3 or more adjacent extents in a file, that is, consecutive file extent items pointing to adjacent extents, within a contiguous file range and compatible flags, we end up not merging all the extents into a single extent map. For example: $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt/sdc $ xfs_io -f -d -c "pwrite -b 64K 0 64K" \ -c "pwrite -b 64K 64K 64K" \ -c "pwrite -b 64K 128K 64K" \ -c "pwrite -b 64K 192K 64K" \ /mnt/sdc/foo After all the ordered extents complete we unpin the extent maps and try to merge them, but instead of getting a single extent map we get two because: 1) When the first ordered extent completes (file range [0, 64K)) we unpin its extent map and attempt to merge it with the extent map for the range [64K, 128K), but we can't because that extent map is still pinned; 2) When the second ordered extent completes (file range [64K, 128K)), we unpin its extent map and merge it with the previous extent map, for file range [0, 64K), but we can't merge with the next extent map, for the file range [128K, 192K), because this one is still pinned. The merged extent map for the file range [0, 128K) gets the flag EXTENT_MAP_MERGED set; 3) When the third ordered extent completes (file range [128K, 192K)), we unpin its extent map and attempt to merge it with the previous extent map, for file range [0, 128K), but we can't because that extent map has the flag EXTENT_MAP_MERGED set (mergeable_maps() returns false due to different flags) while the extent map for the range [128K, 192K) doesn't have that flag set. We also can't merge it with the next extent map, for file range [192K, 256K), because that one is still pinned. At this moment we have 3 extent maps: One for file range [0, 128K), with the flag EXTENT_MAP_MERGED set. One for file range [128K, 192K). One for file range [192K, 256K) which is still pinned; 4) When the fourth and final extent completes (file range [192K, 256K)), we unpin its extent map and attempt to merge it with the previous extent map, for file range [128K, 192K), which succeeds since none of these extent maps have the EXTENT_MAP_MERGED flag set. So we end up with 2 extent maps: One for file range [0, 128K), with the flag EXTENT_MAP_MERGED set. One for file range [128K, 256K), with the flag EXTENT_MAP_MERGED set. Since after merging extent maps we don't attempt to merge again, that is, merge the resulting extent map with the one that is now preceding it (and the one following it), we end up with those two extent maps, when we could have had a single extent map to represent the whole file. Fix this by making mergeable_maps() ignore the EXTENT_MAP_MERGED flag. While this doesn't present any functional issue, it prevents the merging of extent maps which allows to save memory, and can make defrag not merging extents too (that will be addressed in the next patch). Fixes: `199257a78b` ("btrfs: defrag: don't use merged extent map for their generation check") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-31 16:45:16 +01:00
Toke Høiland-Jørgensen	c40dd8c473	bpf, test_run: Fix LIVE_FRAME frame update after a page has been recycled The test_run code detects whether a page has been modified and re-initialises the xdp_frame structure if it has, using xdp_update_frame_from_buff(). However, xdp_update_frame_from_buff() doesn't touch frame->mem, so that wasn't correctly re-initialised, which led to the pages from page_pool not being returned correctly. Syzbot noticed this as a memory leak. Fix this by also copying the frame->mem structure when re-initialising the frame, like we do on initialisation of a new page from page_pool. Fixes: `e5995bc7e2` ("bpf, test_run: fix crashes due to XDP frame overwriting/corruption") Fixes: `b530e9e106` ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN") Reported-by: syzbot+d121e098da06af416d23@syzkaller.appspotmail.com Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: syzbot+d121e098da06af416d23@syzkaller.appspotmail.com Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/bpf/20241030-test-run-mem-fix-v1-1-41e88e8cae43@redhat.com	2024-10-31 16:15:21 +01:00
Jens Axboe	d0c6cc6c6a	Merge tag 'nvme-6.12-2024-10-31' of git://git.infradead.org/nvme into block-6.12 Pull NVMe fixes from Keith: "nvme fixes for Linux 6.12 - Spec compliant identification fix (Keith) - Module parameter to enable backward compatibility on unusual namespace formats (Keith) - Target double free fix when using keys (Vitaliy) - Passthrough command error handling fix (Keith)" * tag 'nvme-6.12-2024-10-31' of git://git.infradead.org/nvme: nvme: re-fix error-handling for io_uring nvme-passthrough nvmet-auth: assign dh_key to NULL after kfree_sensitive nvme: module parameter to disable pi with offsets nvme: enhance cns version checking	2024-10-31 09:10:07 -06:00
Jens Axboe	1d60d74e85	io_uring/rw: fix missing NOWAIT check for O_DIRECT start write When io_uring starts a write, it'll call kiocb_start_write() to bump the super block rwsem, preventing any freezes from happening while that write is in-flight. The freeze side will grab that rwsem for writing, excluding any new writers from happening and waiting for existing writes to finish. But io_uring unconditionally uses kiocb_start_write(), which will block if someone is currently attempting to freeze the mount point. This causes a deadlock where freeze is waiting for previous writes to complete, but the previous writes cannot complete, as the task that is supposed to complete them is blocked waiting on starting a new write. This results in the following stuck trace showing that dependency with the write blocked starting a new write: task:fio state:D stack:0 pid:886 tgid:886 ppid:876 Call trace: __switch_to+0x1d8/0x348 __schedule+0x8e8/0x2248 schedule+0x110/0x3f0 percpu_rwsem_wait+0x1e8/0x3f8 __percpu_down_read+0xe8/0x500 io_write+0xbb8/0xff8 io_issue_sqe+0x10c/0x1020 io_submit_sqes+0x614/0x2110 __arm64_sys_io_uring_enter+0x524/0x1038 invoke_syscall+0x74/0x268 el0_svc_common.constprop.0+0x160/0x238 do_el0_svc+0x44/0x60 el0_svc+0x44/0xb0 el0t_64_sync_handler+0x118/0x128 el0t_64_sync+0x168/0x170 INFO: task fsfreeze:7364 blocked for more than 15 seconds. Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963 with the attempting freezer stuck trying to grab the rwsem: task:fsfreeze state:D stack:0 pid:7364 tgid:7364 ppid:995 Call trace: __switch_to+0x1d8/0x348 __schedule+0x8e8/0x2248 schedule+0x110/0x3f0 percpu_down_write+0x2b0/0x680 freeze_super+0x248/0x8a8 do_vfs_ioctl+0x149c/0x1b18 __arm64_sys_ioctl+0xd0/0x1a0 invoke_syscall+0x74/0x268 el0_svc_common.constprop.0+0x160/0x238 do_el0_svc+0x44/0x60 el0_svc+0x44/0xb0 el0t_64_sync_handler+0x118/0x128 el0t_64_sync+0x168/0x170 Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a blocking grab of the super block rwsem if it isn't set. For normal issue where IOCB_NOWAIT would always be set, this returns -EAGAIN which will have io_uring core issue a blocking attempt of the write. That will in turn also get completions run, ensuring forward progress. Since freezing requires CAP_SYS_ADMIN in the first place, this isn't something that can be triggered by a regular user. Cc: stable@vger.kernel.org # 5.10+ Reported-by: Peter Mann <peter.mann@sh.cz> Link: https://lore.kernel.org/io-uring/38c94aec-81c9-4f62-b44e-1d87f5597644@sh.cz Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-31 08:21:02 -06:00
Matthew Brost	fe05cee4d9	drm/xe: Don't short circuit TDR on jobs not started Short circuiting TDR on jobs not started is an optimization which is not required. On LNL we are facing an issue where jobs do not get scheduled by the GuC if it misses a GGTT page update. When this occurs let the TDR fire, toggle the scheduling which may get the job unstuck, and print a warning message. If the TDR fires twice on job that hasn't started, timeout the job. v2: - Add warning message (Paulo) - Add fixes tag (Paulo) - Timeout job which hasn't started after TDR firing twice v3: - Include local change v4: - Short circuit check_timeout on job not started - use warn level rather than notice (Paulo) Fixes: `7ddb9403dd` ("drm/xe: Sample ctx timestamp to determine if jobs have timed out") Cc: stable@vger.kernel.org Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241025214330.2010521-2-matthew.brost@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `35d25a4a00`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-31 07:03:14 -07:00
Matthew Brost	993ca0ecce	drm/xe: Add mmio read before GGTT invalidate On LNL without a mmio read before a GGTT invalidate the GuC can incorrectly read the GGTT scratch page upon next access leading to jobs not getting scheduled. A mmio read before a GGTT invalidate seems to fix this. Since a GGTT invalidate is not a hot code path, blindly do a mmio read before each GGTT invalidate. Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: stable@vger.kernel.org Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3164 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023221200.1797832-1-matthew.brost@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `5a71019688`) [ Fix conflict with mmio vs gt argument ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-31 07:02:42 -07:00
Andy Shevchenko	90bad74985	gpio: sloppy-logic-analyzer: Check for error code from devm_mutex_init() call Even if it's not critical, the avoidance of checking the error code from devm_mutex_init() call today diminishes the point of using devm variant of it. Tomorrow it may even leak something. Add the missed check. Fixes: `7828b7bbbf` ("gpio: add sloppy logic analyzer using polling") Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20241030174132.2113286-3-andriy.shevchenko@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-10-31 13:48:25 +01:00
Masahiro Yamada	d01661e1f4	kconfig: show sub-menu entries even if the prompt is hidden Since commit `f79dc03fe6` ("kconfig: refactor choice value calculation"), when EXPERT is disabled, nothing within the "if INPUT" ... "endif" block in drivers/input/Kconfig is displayed. This issue affects all command-line interfaces and GUI frontends. The prompt for INPUT is hidden when EXPERT is disabled. Previously, menu_is_visible() returned true in this case; however, it now returns false, resulting in all sub-menu entries being skipped. Here is a simplified test case illustrating the issue: config A bool "A" if X default y config B bool "B" depends on A When X is disabled, A becomes unconfigurable and is forced to y. B should be displayed, as its dependency is met. This commit restores the necessary code, so menu_is_visible() functions as it did previously. Fixes: `f79dc03fe6` ("kconfig: refactor choice value calculation") Reported-by: Edmund Raile <edmund.raile@proton.me> Closes: https://lore.kernel.org/all/5fd0dfc7ff171aa74352e638c276069a5f2e888d.camel@proton.me/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-10-31 21:42:20 +09:00
Masahiro Yamada	2ad7126c51	kbuild: deb-pkg: add pkg.linux-upstream.nokerneldbg build profile The Debian kernel supports the pkg.linux.nokerneldbg build profile. The debug package tends to become huge, and you may not want to build it even when CONFIG_DEBUG_INFO is enabled. This commit introduces a similar profile for the upstream kernel. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>	2024-10-31 21:41:02 +09:00
Masahiro Yamada	e2c318225a	kbuild: deb-pkg: add pkg.linux-upstream.nokernelheaders build profile Since commit `f1d87664b8` ("kbuild: cross-compile linux-headers package when possible"), 'make bindeb-pkg' may attempt to cross-compile the linux-headers package, but it fails under certain circumstances. For example, when CONFIG_MODULE_SIG_FORMAT is enabled on Debian, the following command fails: $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- bindeb-pkg [ snip ] Rebuilding host programs with aarch64-linux-gnu-gcc... HOSTCC debian/linux-headers-6.12.0-rc4/usr/src/linux-headers-6.12.0-rc4/scripts/kallsyms HOSTCC debian/linux-headers-6.12.0-rc4/usr/src/linux-headers-6.12.0-rc4/scripts/sorttable HOSTCC debian/linux-headers-6.12.0-rc4/usr/src/linux-headers-6.12.0-rc4/scripts/asn1_compiler HOSTCC debian/linux-headers-6.12.0-rc4/usr/src/linux-headers-6.12.0-rc4/scripts/sign-file In file included from /usr/include/openssl/opensslv.h:109, from debian/linux-headers-6.12.0-rc4/usr/src/linux-headers-6.12.0-rc4/scripts/sign-file.c:25: /usr/include/openssl/macros.h:14:10: fatal error: openssl/opensslconf.h: No such file or directory 14 \| #include <openssl/opensslconf.h> \| ^~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. This commit adds a new profile, pkg.linux-upstream.nokernelheaders, to guard the linux-headers package. There are two options to fix the above issue. Option 1: Set the pkg.linux-upstream.nokernelheaders build profile $ DEB_BUILD_PROFILES=pkg.linux-upstream.nokernelheaders \ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- bindeb-pkg This skips the building of the linux-headers package. Option 2: Install the necessary build dependencies If you want to cross-compile the linux-headers package, you need to install additional packages. For example, on Debian, the packages necessary for cross-compiling it to arm64 can be installed with the following commands: # dpkg --add-architecture arm64 # apt update # apt install gcc-aarch64-linux-gnu libssl-dev:arm64 Fixes: `f1d87664b8` ("kbuild: cross-compile linux-headers package when possible") Reported-by: Ron Economos <re@w6rz.net> Closes: https://lore.kernel.org/all/b3d4f49e-7ddb-29ba-0967-689232329b53@w6rz.net/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Ron Economos <re@w6rz.net> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>	2024-10-31 21:41:02 +09:00
Masahiro Yamada	cb08a02659	kbuild: rpm-pkg: disable kernel-devel package when cross-compiling Since commit `f1d87664b8` ("kbuild: cross-compile linux-headers package when possible"), 'make binrpm-pkg' may attempt to cross-compile the kernel-devel package, but it fails under certain circumstances. For example, when CONFIG_MODULE_SIG_FORMAT is enabled on openSUSE Tumbleweed, the following command fails: $ make ARCH=arm64 CROSS_COMPILE=aarch64-suse-linux- binrpm-pkg [ snip ] Rebuilding host programs with aarch64-suse-linux-gcc... HOSTCC /home/masahiro/ref/linux/rpmbuild/BUILDROOT/kernel-6.12.0_rc4-1.aarch64/usr/src/kernels/6.12.0-rc4/scripts/kallsyms HOSTCC /home/masahiro/ref/linux/rpmbuild/BUILDROOT/kernel-6.12.0_rc4-1.aarch64/usr/src/kernels/6.12.0-rc4/scripts/sorttable HOSTCC /home/masahiro/ref/linux/rpmbuild/BUILDROOT/kernel-6.12.0_rc4-1.aarch64/usr/src/kernels/6.12.0-rc4/scripts/asn1_compiler HOSTCC /home/masahiro/ref/linux/rpmbuild/BUILDROOT/kernel-6.12.0_rc4-1.aarch64/usr/src/kernels/6.12.0-rc4/scripts/sign-file /home/masahiro/ref/linux/rpmbuild/BUILDROOT/kernel-6.12.0_rc4-1.aarch64/usr/src/kernels/6.12.0-rc4/scripts/sign-file.c:25:10: fatal error: openssl/opensslv.h: No such file or directory 25 \| #include <openssl/opensslv.h> \| ^~~~~~~~~~~~~~~~~~~~ compilation terminated. I believe this issue is less common on Fedora because the disto's cross- compilier cannot link user-space programs. Hence, CONFIG_CC_CAN_LINK is unset. On Fedora 40, the package information explains this limitation clearly: $ dnf info gcc-aarch64-linux-gnu [ snip ] Description : Cross-build GNU C compiler. : : Only building kernels is currently supported. Support for cross-building : user space programs is not currently provided as that would massively multiply : the number of packages. Anyway, cross-compiling RPM packages is somewhat challenging. This commit disables the kernel-devel package when cross-compiling because I did not come up with a better solution. Fixes: `f1d87664b8` ("kbuild: cross-compile linux-headers package when possible") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>	2024-10-31 21:40:46 +09:00
Suraj Sonawane	a14968aea6	gpio: fix uninit-value in swnode_find_gpio Fix an issue detected by the Smatch tool: drivers/gpio/gpiolib-swnode.c:78 swnode_find_gpio() error: uninitialized symbol 'ret'. The issue occurs because the 'ret' variable may be used without initialization if the for_each_gpio_property_name loop does not run. This could lead to returning an undefined value, causing unpredictable behavior. Initialize 'ret' to 0 before the loop to ensure the function returns an error code if no properties are parsed, maintaining proper error handling. Fixes: `9e4c6c1ad` ("Merge tag 'io_uring-6.12-20241011' of git://git.kernel.dk/linux") Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com> Link: https://lore.kernel.org/r/20241026090642.28633-1-surajsonawane0215@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-10-31 13:39:25 +01:00
Paolo Abeni	50ae879de1	Merge tag 'nf-24-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes for net: 1) Remove unused parameters in conntrack_dump_flush.c used by selftests, from Liu Jing. 2) Fix possible UaF when removing xtables module via getsockopt() interface, from Dong Chenchen. 3) Fix potential crash in nf_send_reset6() reported by syzkaller. From Eric Dumazet 4) Validate offset and length before calling skb_checksum() in nft_payload, otherwise hitting BUG() is possible. netfilter pull request 24-10-31 * tag 'nf-24-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_payload: sanitize offset and length before calling skb_checksum() netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6() netfilter: Fix use-after-free in get_info() selftests: netfilter: remove unused parameter ==================== Link: https://patch.msgid.link/ Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 12:13:08 +01:00
Paolo Abeni	ee802a4954	Merge tag 'for-net-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci: fix null-ptr-deref in hci_read_supported_codecs * tag 'for-net-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecs ==================== Link: https://patch.msgid.link/20241030192205.38298-1-luiz.dentz@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:32:57 +01:00
Paolo Abeni	d80a309130	Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver' Jijie Shao says: ==================== There are some bugfix for the HNS3 ethernet driver ChangeLog: v2 -> v3: - Rewrite the commit logs of net: hns3: add sync command to sync io-pgtable' to add more verbose explanation, suggested Paolo. - Add fixes tag for hardware issue, suggested Paolo and Simon Horman. v2: https://lore.kernel.org/all/20241018101059.1718375-1-shaojijie@huawei.com/ v1 -> v2: - Pass IRQF_NO_AUTOEN to request_irq(), suggested by Jakub. - Rewrite the commit logs of 'net: hns3: default enable tx bounce buffer when smmu enabled' and 'net: hns3: add sync command to sync io-pgtable'. v1: https://lore.kernel.org/all/20241011094521.3008298-1-shaojijie@huawei.com/ ==================== Link: https://patch.msgid.link/20241025092938.2912958-1-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:48 +01:00
Jie Wang	2cf2461435	net: hns3: fix kernel crash when 1588 is sent on HIP08 devices Currently, HIP08 devices does not register the ptp devices, so the hdev->ptp is NULL. But the tx process would still try to set hardware time stamp info with SKBTX_HW_TSTAMP flag and cause a kernel crash. [ 128.087798] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018 ... [ 128.280251] pc : hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.286600] lr : hclge_ptp_set_tx_info+0x20/0x140 [hclge] [ 128.292938] sp : ffff800059b93140 [ 128.297200] x29: ffff800059b93140 x28: 0000000000003280 [ 128.303455] x27: ffff800020d48280 x26: ffff0cb9dc814080 [ 128.309715] x25: ffff0cb9cde93fa0 x24: 0000000000000001 [ 128.315969] x23: 0000000000000000 x22: 0000000000000194 [ 128.322219] x21: ffff0cd94f986000 x20: 0000000000000000 [ 128.328462] x19: ffff0cb9d2a166c0 x18: 0000000000000000 [ 128.334698] x17: 0000000000000000 x16: ffffcf1fc523ed24 [ 128.340934] x15: 0000ffffd530a518 x14: 0000000000000000 [ 128.347162] x13: ffff0cd6bdb31310 x12: 0000000000000368 [ 128.353388] x11: ffff0cb9cfbc7070 x10: ffff2cf55dd11e02 [ 128.359606] x9 : ffffcf1f85a212b4 x8 : ffff0cd7cf27dab0 [ 128.365831] x7 : 0000000000000a20 x6 : ffff0cd7cf27d000 [ 128.372040] x5 : 0000000000000000 x4 : 000000000000ffff [ 128.378243] x3 : 0000000000000400 x2 : ffffcf1f85a21294 [ 128.384437] x1 : ffff0cb9db520080 x0 : ffff0cb9db500080 [ 128.390626] Call trace: [ 128.393964] hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.399893] hns3_nic_net_xmit+0x39c/0x4c4 [hns3] [ 128.405468] xmit_one.constprop.0+0xc4/0x200 [ 128.410600] dev_hard_start_xmit+0x54/0xf0 [ 128.415556] sch_direct_xmit+0xe8/0x634 [ 128.420246] __dev_queue_xmit+0x224/0xc70 [ 128.425101] dev_queue_xmit+0x1c/0x40 [ 128.429608] ovs_vport_send+0xac/0x1a0 [openvswitch] [ 128.435409] do_output+0x60/0x17c [openvswitch] [ 128.440770] do_execute_actions+0x898/0x8c4 [openvswitch] [ 128.446993] ovs_execute_actions+0x64/0xf0 [openvswitch] [ 128.453129] ovs_dp_process_packet+0xa0/0x224 [openvswitch] [ 128.459530] ovs_vport_receive+0x7c/0xfc [openvswitch] [ 128.465497] internal_dev_xmit+0x34/0xb0 [openvswitch] [ 128.471460] xmit_one.constprop.0+0xc4/0x200 [ 128.476561] dev_hard_start_xmit+0x54/0xf0 [ 128.481489] __dev_queue_xmit+0x968/0xc70 [ 128.486330] dev_queue_xmit+0x1c/0x40 [ 128.490856] ip_finish_output2+0x250/0x570 [ 128.495810] __ip_finish_output+0x170/0x1e0 [ 128.500832] ip_finish_output+0x3c/0xf0 [ 128.505504] ip_output+0xbc/0x160 [ 128.509654] ip_send_skb+0x58/0xd4 [ 128.513892] udp_send_skb+0x12c/0x354 [ 128.518387] udp_sendmsg+0x7a8/0x9c0 [ 128.522793] inet_sendmsg+0x4c/0x8c [ 128.527116] __sock_sendmsg+0x48/0x80 [ 128.531609] __sys_sendto+0x124/0x164 [ 128.536099] __arm64_sys_sendto+0x30/0x5c [ 128.540935] invoke_syscall+0x50/0x130 [ 128.545508] el0_svc_common.constprop.0+0x10c/0x124 [ 128.551205] do_el0_svc+0x34/0xdc [ 128.555347] el0_svc+0x20/0x30 [ 128.559227] el0_sync_handler+0xb8/0xc0 [ 128.563883] el0_sync+0x160/0x180 Fixes: `0bf5eb7885` ("net: hns3: add support for PTP") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:43 +01:00
Hao Lan	3e22b7de34	net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue The TQP BAR space is divided into two segments. TQPs 0-1023 and TQPs 1024-1279 are in different BAR space addresses. However, hclge_fetch_pf_reg does not distinguish the tqp space information when reading the tqp space information. When the number of TQPs is greater than 1024, access bar space overwriting occurs. The problem of different segments has been considered during the initialization of tqp.io_base. Therefore, tqp.io_base is directly used when the queue is read in hclge_fetch_pf_reg. The error message: Unable to handle kernel paging request at virtual address ffff800037200000 pc : hclge_fetch_pf_reg+0x138/0x250 [hclge] lr : hclge_get_regs+0x84/0x1d0 [hclge] Call trace: hclge_fetch_pf_reg+0x138/0x250 [hclge] hclge_get_regs+0x84/0x1d0 [hclge] hns3_get_regs+0x2c/0x50 [hns3] ethtool_get_regs+0xf4/0x270 dev_ethtool+0x674/0x8a0 dev_ioctl+0x270/0x36c sock_do_ioctl+0x110/0x2a0 sock_ioctl+0x2ac/0x530 __arm64_sys_ioctl+0xa8/0x100 invoke_syscall+0x4c/0x124 el0_svc_common.constprop.0+0x140/0x15c do_el0_svc+0x30/0xd0 el0_svc+0x1c/0x2c el0_sync_handler+0xb0/0xb4 el0_sync+0x168/0x180 Fixes: `939ccd107f` ("net: hns3: move dump regs function to a separate file") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:43 +01:00
Jian Shen	d1c2e2961a	net: hns3: initialize reset_timer before hclgevf_misc_irq_init() Currently the misc irq is initialized before reset_timer setup. But it will access the reset_timer in the irq handler. So initialize the reset_timer earlier. Fixes: `ff200099d2` ("net: hns3: remove unnecessary work in hclgevf_main") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:43 +01:00
Jian Shen	5f62009ff1	net: hns3: don't auto enable misc vector Currently, there is a time window between misc irq enabled and service task inited. If an interrupte is reported at this time, it will cause warning like below: [ 16.324639] Call trace: [ 16.324641] __queue_delayed_work+0xb8/0xe0 [ 16.324643] mod_delayed_work_on+0x78/0xd0 [ 16.324655] hclge_errhand_task_schedule+0x58/0x90 [hclge] [ 16.324662] hclge_misc_irq_handle+0x168/0x240 [hclge] [ 16.324666] __handle_irq_event_percpu+0x64/0x1e0 [ 16.324667] handle_irq_event+0x80/0x170 [ 16.324670] handle_fasteoi_edge_irq+0x110/0x2bc [ 16.324671] __handle_domain_irq+0x84/0xfc [ 16.324673] gic_handle_irq+0x88/0x2c0 [ 16.324674] el1_irq+0xb8/0x140 [ 16.324677] arch_cpu_idle+0x18/0x40 [ 16.324679] default_idle_call+0x5c/0x1bc [ 16.324682] cpuidle_idle_call+0x18c/0x1c4 [ 16.324684] do_idle+0x174/0x17c [ 16.324685] cpu_startup_entry+0x30/0x6c [ 16.324687] secondary_start_kernel+0x1a4/0x280 [ 16.324688] ---[ end trace 6aa0bff672a964aa ]--- So don't auto enable misc vector when request irq.. Fixes: `7be1b9f3e9` ("net: hns3: make hclge_service use delayed workqueue") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:42 +01:00
Hao Lan	2758f18a83	net: hns3: Resolved the issue that the debugfs query result is inconsistent. This patch modifies the implementation of debugfs: When the user process stops unexpectedly, not all data of the file system is read. In this case, the save_buf pointer is not released. When the user process is called next time, save_buf is used to copy the cached data to the user space. As a result, the queried data is inconsistent. To solve this problem, determine whether the function is invoked for the first time based on the value of ppos. If ppos is 0, obtain the actual data. Fixes: `5e69ea7ee2` ("net: hns3: refactor the debugfs process") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Guangwei Zhang <zhangwangwei6@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:42 +01:00
Hao Lan	662ecfc466	net: hns3: fix missing features due to dev->features configuration too early Currently, the netdev->features is configured in hns3_nic_set_features. As a result, __netdev_update_features considers that there is no feature difference, and the procedures of the real features are missing. Fixes: `2a7556bb2b` ("net: hns3: implement ndo_features_check ops for hns3 driver") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:42 +01:00
Hao Lan	3e0f7cc887	net: hns3: fixed reset failure issues caused by the incorrect reset type When a reset type that is not supported by the driver is input, a reset pending flag bit of the HNAE3_NONE_RESET type is generated in reset_pending. The driver does not have a mechanism to clear this type of error. As a result, the driver considers that the reset is not complete. This patch provides a mechanism to clear the HNAE3_NONE_RESET flag and the parameter of hnae3_ae_ops.set_default_reset_request is verified. The error message: hns3 0000:39:01.0: cmd failed -16 hns3 0000:39:01.0: hclge device re-init failed, VF is disabled! hns3 0000:39:01.0: failed to reset VF stack hns3 0000:39:01.0: failed to reset VF(4) hns3 0000:39:01.0: prepare reset(2) wait done hns3 0000:39:01.0 eth4: already uninitialized Use the crash tool to view struct hclgevf_dev: struct hclgevf_dev { ... default_reset_request = 0x20, reset_level = HNAE3_NONE_RESET, reset_pending = 0x100, reset_type = HNAE3_NONE_RESET, ... }; Fixes: `720bd5837e` ("net: hns3: add set_default_reset_request in the hnae3_ae_ops") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:42 +01:00
Jian Shen	f2c14899ca	net: hns3: add sync command to sync io-pgtable To avoid errors in pgtable prefectch, add a sync command to sync io-pagtable. This is a supplement for the previous patch. We want all the tx packet can be handled with tx bounce buffer path. But it depends on the remain space of the spare buffer, checked by the hns3_can_use_tx_bounce(). In most cases, maybe 99.99%, it returns true. But once it return false by no available space, the packet will be handled with the former path, which will map/unmap the skb buffer. Then the driver will face the smmu prefetch risk again. So add a sync command in this case to avoid smmu prefectch, just protects corner scenes. Fixes: `295ba232a8` ("net: hns3: add device version to replace pci revision") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:42 +01:00
Peiyang Wang	e6ab19443b	net: hns3: default enable tx bounce buffer when smmu enabled The SMMU engine on HIP09 chip has a hardware issue. SMMU pagetable prefetch features may prefetch and use a invalid PTE even the PTE is valid at that time. This will cause the device trigger fake pagefaults. The solution is to avoid prefetching by adding a SYNC command when smmu mapping a iova. But the performance of nic has a sharp drop. Then we do this workaround, always enable tx bounce buffer, avoid mapping/unmapping on TX path. This issue only affects HNS3, so we always enable tx bounce buffer when smmu enabled to improve performance. Fixes: `295ba232a8` ("net: hns3: add device version to replace pci revision") Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-31 11:15:42 +01:00
Pablo Neira Ayuso	d5953d680f	netfilter: nft_payload: sanitize offset and length before calling skb_checksum() If access to offset + length is larger than the skbuff length, then skb_checksum() triggers BUG_ON(). skb_checksum() internally subtracts the length parameter while iterating over skbuff, BUG_ON(len) at the end of it checks that the expected length to be included in the checksum calculation is fully consumed. Fixes: `7ec3f7b47b` ("netfilter: nft_payload: add packet mangling support") Reported-by: Slavin Liu <slavin-ayu@qq.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-10-31 10:54:49 +01:00
Jinjie Ruan	add4163aca	drm/tests: hdmi: Fix memory leaks in drm_display_mode_from_cea_vic() modprobe drm_hdmi_state_helper_test and then rmmod it, the following memory leak occurs. The `mode` allocated in drm_mode_duplicate() called by drm_display_mode_from_cea_vic() is not freed, which cause the memory leak: unreferenced object 0xffffff80ccd18100 (size 128): comm "kunit_try_catch", pid 1851, jiffies 4295059695 hex dump (first 32 bytes): 57 62 00 00 80 02 90 02 f0 02 20 03 00 00 e0 01 Wb........ ..... ea 01 ec 01 0d 02 00 00 0a 00 00 00 00 00 00 00 ................ backtrace (crc c2f1aa95): [<000000000f10b11b>] kmemleak_alloc+0x34/0x40 [<000000001cd4cf73>] __kmalloc_cache_noprof+0x26c/0x2f4 [<00000000f1f3cffa>] drm_mode_duplicate+0x44/0x19c [<000000008cbeef13>] drm_display_mode_from_cea_vic+0x88/0x98 [<0000000019daaacf>] 0xffffffedc11ae69c [<000000000aad0f85>] kunit_try_run_case+0x13c/0x3ac [<00000000a9210bac>] kunit_generic_run_threadfn_adapter+0x80/0xec [<000000000a0b2e9e>] kthread+0x2e8/0x374 [<00000000bd668858>] ret_from_fork+0x10/0x20 ...... Free `mode` by using drm_kunit_display_mode_from_cea_vic() to fix it. Cc: stable@vger.kernel.org Fixes: `4af70f19e5` ("drm/tests: Add RGB Quantization tests") Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030023504.530425-4-ruanjinjie@huawei.com Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-10-31 10:31:35 +01:00
Jinjie Ruan	926163342a	drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic() modprobe drm_connector_test and then rmmod drm_connector_test, the following memory leak occurs. The `mode` allocated in drm_mode_duplicate() called by drm_display_mode_from_cea_vic() is not freed, which cause the memory leak: unreferenced object 0xffffff80cb0ee400 (size 128): comm "kunit_try_catch", pid 1948, jiffies 4294950339 hex dump (first 32 bytes): 14 44 02 00 80 07 d8 07 04 08 98 08 00 00 38 04 .D............8. 3c 04 41 04 65 04 00 00 05 00 00 00 00 00 00 00 <.A.e........... backtrace (crc 90e9585c): [<00000000ec42e3d7>] kmemleak_alloc+0x34/0x40 [<00000000d0ef055a>] __kmalloc_cache_noprof+0x26c/0x2f4 [<00000000c2062161>] drm_mode_duplicate+0x44/0x19c [<00000000f96c74aa>] drm_display_mode_from_cea_vic+0x88/0x98 [<00000000d8f2c8b4>] 0xffffffdc982a4868 [<000000005d164dbc>] kunit_try_run_case+0x13c/0x3ac [<000000006fb23398>] kunit_generic_run_threadfn_adapter+0x80/0xec [<000000006ea56ca0>] kthread+0x2e8/0x374 [<000000000676063f>] ret_from_fork+0x10/0x20 ...... Free `mode` by using drm_kunit_display_mode_from_cea_vic() to fix it. Cc: stable@vger.kernel.org Fixes: `abb6f74973` ("drm/tests: Add HDMI TDMS character rate tests") Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030023504.530425-3-ruanjinjie@huawei.com Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-10-31 10:31:34 +01:00
Jinjie Ruan	caa714f866	drm/tests: helpers: Add helper for drm_display_mode_from_cea_vic() As Maxime suggested, add a new helper drm_kunit_display_mode_from_cea_vic(), it can replace the direct call of drm_display_mode_from_cea_vic(), and it will help solving the `mode` memory leaks. Acked-by: Maxime Ripard <mripard@kernel.org> Suggested-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030023504.530425-2-ruanjinjie@huawei.com Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-10-31 10:31:34 +01:00
Yu Zhao	9d08ec41a0	mm: allow set/clear page_type again Some page flags (page->flags) were converted to page types (page->page_types). A recent example is PG_hugetlb. From the exclusive writer's perspective, e.g., a thread doing __folio_set_hugetlb(), there is a difference between the page flag and type APIs: the former allows the same non-atomic operation to be repeated whereas the latter does not. For example, calling __folio_set_hugetlb() twice triggers VM_BUG_ON_FOLIO(), since the second call expects the type (PG_hugetlb) not to be set previously. Using add_hugetlb_folio() as an example, it calls __folio_set_hugetlb() in the following error-handling path. And when that happens, it triggers the aforementioned VM_BUG_ON_FOLIO(). if (folio_test_hugetlb(folio)) { rc = hugetlb_vmemmap_restore_folio(h, folio); if (rc) { spin_lock_irq(&hugetlb_lock); add_hugetlb_folio(h, folio, false); ... It is possible to make hugeTLB comply with the new requirements from the page type API. However, a straightforward fix would be to just allow the same page type to be set or cleared again inside the API, to avoid any changes to its callers. Link: https://lkml.kernel.org/r/20241020042212.296781-1-yuzhao@google.com Fixes: `d99e3140a4` ("mm: turn folio_test_hugetlb into a PageType") Signed-off-by: Yu Zhao <yuzhao@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:12 -07:00
Ryusuke Konishi	b3a033e3ec	nilfs2: fix potential deadlock with newly created symlinks Syzbot reported that page_symlink(), called by nilfs_symlink(), triggers memory reclamation involving the filesystem layer, which can result in circular lock dependencies among the reader/writer semaphore nilfs->ns_segctor_sem, s_writers percpu_rwsem (intwrite) and the fs_reclaim pseudo lock. This is because after commit `21fc61c73c` ("don't put symlink bodies in pagecache into highmem"), the gfp flags of the page cache for symbolic links are overwritten to GFP_KERNEL via inode_nohighmem(). This is not a problem for symlinks read from the backing device, because the __GFP_FS flag is dropped after inode_nohighmem() is called. However, when a new symlink is created with nilfs_symlink(), the gfp flags remain overwritten to GFP_KERNEL. Then, memory allocation called from page_symlink() etc. triggers memory reclamation including the FS layer, which may call nilfs_evict_inode() or nilfs_dirty_inode(). And these can cause a deadlock if they are called while nilfs->ns_segctor_sem is held: Fix this issue by dropping the __GFP_FS flag from the page cache GFP flags of newly created symlinks in the same way that nilfs_new_inode() and __nilfs_read_inode() do, as a workaround until we adopt nofs allocation scope consistently or improve the locking constraints. Link: https://lkml.kernel.org/r/20241020050003.4308-1-konishi.ryusuke@gmail.com Fixes: `21fc61c73c` ("don't put symlink bodies in pagecache into highmem") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+9ef37ac20608f4836256@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9ef37ac20608f4836256 Tested-by: syzbot+9ef37ac20608f4836256@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:12 -07:00
Phillip Lougher	d31638ff6c	Squashfs: fix variable overflow in squashfs_readpage_block Syzbot reports a slab out of bounds access in squashfs_readpage_block(). This is caused by an attempt to read page index 0x2000000000. This value (start_index) is stored in an integer loop variable which overflows producing a value of 0. This causes a loop which iterates over pages start_index -> end_index to iterate over 0 -> end_index, which ultimately causes an out of bounds page array access. Fix by changing variable to a loff_t, and rename to index to make it clearer it is a page index, and not a loop count. Link: https://lkml.kernel.org/r/20241020232200.837231-1-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Reported-by: "Lai, Yi" <yi1.lai@linux.intel.com> Closes: https://lore.kernel.org/all/ZwzcnCAosIPqQ9Ie@ly-workstation/ Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:12 -07:00
Andrey Konovalov	330d8df81f	kasan: remove vmalloc_percpu test Commit `1a2473f0cb` ("kasan: improve vmalloc tests") added the vmalloc_percpu KASAN test with the assumption that __alloc_percpu always uses vmalloc internally, which is tagged by KASAN. However, __alloc_percpu might allocate memory from the first per-CPU chunk, which is not allocated via vmalloc(). As a result, the test might fail. Remove the test until proper KASAN annotation for the per-CPU allocated are added; tracked in https://bugzilla.kernel.org/show_bug.cgi?id=215019. Link: https://lkml.kernel.org/r/20241022160706.38943-1-andrey.konovalov@linux.dev Fixes: `1a2473f0cb` ("kasan: improve vmalloc tests") Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com> Reported-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/all/4a245fff-cc46-44d1-a5f9-fd2f1c3764ae@sifive.com/ Reported-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Link: https://lore.kernel.org/all/CACzwLxiWzNqPBp4C1VkaXZ2wDwvY3yZeetCi1TLGFipKW77drA@mail.gmail.com/ Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:11 -07:00
Wladislav Wiebe	ece5897e5a	tools/mm: -Werror fixes in page-types/slabinfo Commit `e6d2c436ff` ("tools/mm: allow users to provide additional cflags/ldflags") passes now CFLAGS to Makefile. With this, build systems with default -Werror enabled found: slabinfo.c:1300:25: error: ignoring return value of 'chdir' declared with attribute 'warn_unused_result' [-Werror=unused-result] chdir(".."); ^~~~~~~~~~~ page-types.c:397:35: error: format '%lu' expects argument of type 'long unsigned int', but argument 2 has type 'uint64_t' {aka 'long long unsigned int'} [-Werror=format=] printf("%lu\t", mapcnt0); ~~^ ~~~~~~~ .. Fix page-types by using PRIu64 for uint64_t prints and check in slabinfo for return code on chdir(".."). Link: https://lkml.kernel.org/r/c1ceb507-94bc-461c-934d-c19b77edd825@gmail.com Fixes: `e6d2c436ff` ("tools/mm: allow users to provide additional cflags/ldflags") Signed-off-by: Wladislav Wiebe <wladislav.kw@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Herton R. Krzesinski <herton@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:11 -07:00
Kairui Song	5168a68eb7	mm, swap: avoid over reclaim of full clusters When running low on usable slots, cluster allocator will try to reclaim the full clusters aggressively to reclaim HAS_CACHE slots. This guarantees that as long as there are any usable slots, HAS_CACHE or not, the swap device will be usable and workload won't go OOM early. Before the cluster allocator, swap allocator fails easily if device is filled up with reclaimable HAS_CACHE slots. Which can be easily reproduced with following simple program: #include <stdio.h> #include <string.h> #include <linux/mman.h> #include <sys/mman.h> #define SIZE 8192UL * 1024UL * 1024UL int main(int argc, char *argv) { long tmp; char p = mmap(NULL, SIZE, PROT_READ \| PROT_WRITE, MAP_PRIVATE \| MAP_ANONYMOUS, -1, 0); memset(p, 0, SIZE); madvise(p, SIZE, MADV_PAGEOUT); for (unsigned long i = 0; i < SIZE; ++i) tmp += p[i]; getchar(); /* Pause */ return 0; } Setup an 8G non ramdisk swap, the first run of the program will swapout 8G ram successfully. But run same program again after the first run paused, the second run can't swapout all 8G memory as now half of the swap device is pinned by HAS_CACHE. There was a random scan in the old allocator that may reclaim part of the HAS_CACHE by luck, but it's unreliable. The new allocator's added reclaim of full clusters when device is low on usable slots. But when multiple CPUs are seeing the device is low on usable slots at the same time, they ran into a thundering herd problem. This is an observable problem on large machine with mass parallel workload, as full cluster reclaim is slower on large swap device and higher number of CPUs will also make things worse. Testing using a 128G ZRAM on a 48c96t system. When the swap device is very close to full (eg. 124G / 128G), running build linux kernel with make -j96 in a 1G memory cgroup will hung (not a softlockup though) spinning in full cluster reclaim for about ~5min before go OOM. To solve this, split the full reclaim into two parts: - Instead of do a synchronous aggressively reclaim when device is low, do only one aggressively reclaim when device is strictly full with a kworker. This still ensures in worst case the device won't be unusable because of HAS_CACHE slots. - To avoid allocation (especially higher order) suffer from HAS_CACHE filling up clusters and kworker not responsive enough, do one synchronous scan every time the free list is drained, and only scan one cluster. This is kind of similar to the random reclaim before, keeps the full clusters rotated and has a minimal latency. This should provide a fair reclaim strategy suitable for most workloads. Link: https://lkml.kernel.org/r/20241022175512.10398-1-ryncsn@gmail.com Fixes: `2cacbdfdee` ("mm: swap: add a adaptive full cluster cache reclaim") Signed-off-by: Kairui Song <kasong@tencent.com> Cc: Barry Song <v-songbaohua@oppo.com> Cc: Chris Li <chrisl@kernel.org> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:11 -07:00
Barry Song	b54e1bfecc	mm: fix PSWPIN counter for large folios swap-in Similar to PSWPOUT, we should count the number of base pages instead of large folios. Link: https://lkml.kernel.org/r/20241023210201.2798-1-21cnbao@gmail.com Fixes: `242d12c981` ("mm: support large folios swap-in for sync io devices") Signed-off-by: Barry Song <v-songbaohua@oppo.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Chris Li <chrisl@kernel.org> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Kairui Song <kasong@tencent.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> Cc: Usama Arif <usamaarif642@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:11 -07:00
Zi Yan	e0fc203748	mm: avoid VM_BUG_ON when try to map an anon large folio to zero page. An anonymous large folio can be split into non order-0 folios, try_to_map_unused_to_zeropage() should not VM_BUG_ON compound pages but just return false. This fixes the crash when splitting anonymous large folios to non order-0 folios. Link: https://lkml.kernel.org/r/20241023171236.1122535-1-ziy@nvidia.com Fixes: `b1f202060a` ("mm: remap unused subpages to shared zeropage when splitting isolated thp") Signed-off-by: Zi Yan <ziy@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Usama Arif <usamaarif642@gmail.com> Cc: Barry Song <baohua@kernel.org> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:10 -07:00
Hao Ge	f4657e16e7	mm/codetag: fix null pointer check logic for ref and tag When we compile and load lib/slub_kunit.c,it will cause a panic. The root cause is that __kmalloc_cache_noprof was directly called instead of kmem_cache_alloc,which resulted in no alloc_tag being allocated.This caused current->alloc_tag to be null,leading to a null pointer dereference in alloc_tag_ref_set. Despite the fact that my colleague Pei Xiao will later fix the code in slub_kunit.c,we still need fix null pointer check logic for ref and tag to avoid panic caused by a null pointer dereference. Here is the log for the panic: [ 74.779373][ T2158] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 [ 74.780130][ T2158] Mem abort info: [ 74.780406][ T2158] ESR = 0x0000000096000004 [ 74.780756][ T2158] EC = 0x25: DABT (current EL), IL = 32 bits [ 74.781225][ T2158] SET = 0, FnV = 0 [ 74.781529][ T2158] EA = 0, S1PTW = 0 [ 74.781836][ T2158] FSC = 0x04: level 0 translation fault [ 74.782288][ T2158] Data abort info: [ 74.782577][ T2158] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 74.783068][ T2158] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 74.783533][ T2158] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 74.784010][ T2158] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000105f34000 [ 74.784586][ T2158] [0000000000000020] pgd=0000000000000000, p4d=0000000000000000 [ 74.785293][ T2158] Internal error: Oops: 0000000096000004 [#1] SMP [ 74.785805][ T2158] Modules linked in: slub_kunit kunit ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_mangle 4 [ 74.790661][ T2158] CPU: 0 UID: 0 PID: 2158 Comm: kunit_try_catch Kdump: loaded Tainted: G W N 6.12.0-rc3+ #2 [ 74.791535][ T2158] Tainted: [W]=WARN, [N]=TEST [ 74.791889][ T2158] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [ 74.792479][ T2158] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 74.793101][ T2158] pc : alloc_tagging_slab_alloc_hook+0x120/0x270 [ 74.793607][ T2158] lr : alloc_tagging_slab_alloc_hook+0x120/0x270 [ 74.794095][ T2158] sp : ffff800084d33cd0 [ 74.794418][ T2158] x29: ffff800084d33cd0 x28: 0000000000000000 x27: 0000000000000000 [ 74.795095][ T2158] x26: 0000000000000000 x25: 0000000000000012 x24: ffff80007b30e314 [ 74.795822][ T2158] x23: ffff000390ff6f10 x22: 0000000000000000 x21: 0000000000000088 [ 74.796555][ T2158] x20: ffff000390285840 x19: fffffd7fc3ef7830 x18: ffffffffffffffff [ 74.797283][ T2158] x17: ffff8000800e63b4 x16: ffff80007b33afc4 x15: ffff800081654c00 [ 74.798011][ T2158] x14: 0000000000000000 x13: 205d383531325420 x12: 5b5d383734363537 [ 74.798744][ T2158] x11: ffff800084d337e0 x10: 000000000000005d x9 : 00000000ffffffd0 [ 74.799476][ T2158] x8 : 7f7f7f7f7f7f7f7f x7 : ffff80008219d188 x6 : c0000000ffff7fff [ 74.800206][ T2158] x5 : ffff0003fdbc9208 x4 : ffff800081edd188 x3 : 0000000000000001 [ 74.800932][ T2158] x2 : 0beaa6dee1ac5a00 x1 : 0beaa6dee1ac5a00 x0 : ffff80037c2cb000 [ 74.801656][ T2158] Call trace: [ 74.801954][ T2158] alloc_tagging_slab_alloc_hook+0x120/0x270 [ 74.802494][ T2158] __kmalloc_cache_noprof+0x148/0x33c [ 74.802976][ T2158] test_kmalloc_redzone_access+0x4c/0x104 [slub_kunit] [ 74.803607][ T2158] kunit_try_run_case+0x70/0x17c [kunit] [ 74.804124][ T2158] kunit_generic_run_threadfn_adapter+0x2c/0x4c [kunit] [ 74.804768][ T2158] kthread+0x10c/0x118 [ 74.805141][ T2158] ret_from_fork+0x10/0x20 [ 74.805540][ T2158] Code: b9400a80 11000400 b9000a80 97ffd858 (f94012d3) [ 74.806176][ T2158] SMP: stopping secondary CPUs [ 74.808130][ T2158] Starting crashdump kernel... Link: https://lkml.kernel.org/r/20241020070819.307944-1-hao.ge@linux.dev Fixes: `e0a955bf7f` ("mm/codetag: add pgalloc_tag_copy()") Signed-off-by: Hao Ge <gehao@kylinos.cn> Acked-by: Suren Baghdasaryan <surenb@google.com> Suggested-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:10 -07:00
John Hubbard	aa6f8b2593	mm/gup: stop leaking pinned pages in low memory conditions If a driver tries to call any of the pin_user_pages(FOLL_LONGTERM) family of functions, and requests "too many" pages, then the call will erroneously leave pages pinned. This is visible in user space as an actual memory leak. Repro is trivial: just make enough pin_user_pages(FOLL_LONGTERM) calls to exhaust memory. The root cause of the problem is this sequence, within __gup_longterm_locked(): __get_user_pages_locked() rc = check_and_migrate_movable_pages() ...which gets retried in a loop. The loop error handling is incomplete, clearly due to a somewhat unusual and complicated tri-state error API. But anyway, if -ENOMEM, or in fact, any unexpected error is returned from check_and_migrate_movable_pages(), then __gup_longterm_locked() happily returns the error, while leaving the pages pinned. In the failed case, which is an app that requests (via a device driver) 30720000000 bytes to be pinned, and then exits, I see this: $ grep foll /proc/vmstat nr_foll_pin_acquired 7502048 nr_foll_pin_released 2048 And after applying this patch, it returns to balanced pins: $ grep foll /proc/vmstat nr_foll_pin_acquired 7502048 nr_foll_pin_released 7502048 Note that the child routine, check_and_migrate_movable_folios(), avoids this problem, by unpinning any folios in the folios argument, before returning an error. Fix this by making check_and_migrate_movable_pages() behave in exactly the same way as check_and_migrate_movable_folios(): unpin all pages in *pages, before returning an error. Also, documentation was an aggravating factor, so: 1) Consolidate the documentation for these two routines, now that they have identical external behavior. 2) Rewrite the consolidated documentation: a) Clearly list the three return code cases, and what happens in each case. b) Mention that one of the cases unpins the pages or folios, before returning an error code. Link: https://lkml.kernel.org/r/20241018223411.310331-1-jhubbard@nvidia.com Fixes: `24a95998e9` ("mm/gup.c: simplify and fix check_and_migrate_movable_pages() return codes") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Suggested-by: David Hildenbrand <david@redhat.com> Cc: Shigeru Yoshida <syoshida@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-30 20:14:10 -07:00
Daniel Golle	637f414763	net: ethernet: mtk_wed: fix path of MT7988 WO firmware linux-firmware commit 808cba84 ("mtk_wed: add firmware for mt7988 Wireless Ethernet Dispatcher") added mt7988_wo_{0,1}.bin in the 'mediatek/mt7988' directory while driver current expects the files in the 'mediatek' directory. Change path in the driver header now that the firmware has been added. Fixes: `e2f64db13a` ("net: ethernet: mtk_wed: introduce WED support for MT7988") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/Zxz0GWTR5X5LdWPe@pidgin.makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-30 18:26:24 -07:00
Jakub Kicinski	b919f1e54e	Merge branch 'mlxsw-fixes' Petr Machata says: ==================== mlxsw: Fixes In this patchset: - Tx header should be pushed for each packet which is transmitted via Spectrum ASICs. Patch #1 adds a missing call to skb_cow_head() to make sure that there is both enough room to push the Tx header and that the SKB header is not cloned and can be modified. - Commit `b5b60bb491` ("mlxsw: pci: Use page pool for Rx buffers allocation") converted mlxsw to use page pool for Rx buffers allocation. Sync for CPU and for device should be done for Rx pages. In patches #2 and #3, add the missing calls to sync pages for, respectively, CPU and the device. - Patch #4 then fixes a bug to IPv6 GRE forwarding offload. Patch #5 adds a generic forwarding test that fails with mlxsw ports prior to the fix. ==================== Link: https://patch.msgid.link/cover.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-30 18:24:42 -07:00
Ido Schimmel	d7bd61fa02	selftests: forwarding: Add IPv6 GRE remote change tests Test that after changing the remote address of an ip6gre net device traffic is forwarded as expected. Test with both flat and hierarchical topologies and with and without an input / output keys. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/02b05246d2cdada0cf2fccffc0faa8a424d0f51b.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-30 18:24:40 -07:00
Ido Schimmel	12ae97c531	mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 address The device stores IPv6 addresses that are used for encapsulation in linear memory that is managed by the driver. Changing the remote address of an ip6gre net device never worked properly, but since cited commit the following reproducer [1] would result in a warning [2] and a memory leak [3]. The problem is that the new remote address is never added by the driver to its hash table (and therefore the device) and the old address is never removed from it. Fix by programming the new address when the configuration of the ip6gre net device changes and removing the old one. If the address did not change, then the above would result in increasing the reference count of the address and then decreasing it. [1] # ip link add name bla up type ip6gre local 2001:db8:1::1 remote 2001:db8:2::1 tos inherit ttl inherit # ip link set dev bla type ip6gre remote 2001:db8:3::1 # ip link del dev bla # devlink dev reload pci/0000:01:00.0 [2] WARNING: CPU: 0 PID: 1682 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3002 mlxsw_sp_ipv6_addr_put+0x140/0x1d0 Modules linked in: CPU: 0 UID: 0 PID: 1682 Comm: ip Not tainted 6.12.0-rc3-custom-g86b5b55bc835 #151 Hardware name: Nvidia SN5600/VMOD0013, BIOS 5.13 05/31/2023 RIP: 0010:mlxsw_sp_ipv6_addr_put+0x140/0x1d0 [...] Call Trace: <TASK> mlxsw_sp_router_netdevice_event+0x55f/0x1240 notifier_call_chain+0x5a/0xd0 call_netdevice_notifiers_info+0x39/0x90 unregister_netdevice_many_notify+0x63e/0x9d0 rtnl_dellink+0x16b/0x3a0 rtnetlink_rcv_msg+0x142/0x3f0 netlink_rcv_skb+0x50/0x100 netlink_unicast+0x242/0x390 netlink_sendmsg+0x1de/0x420 ____sys_sendmsg+0x2bd/0x320 ___sys_sendmsg+0x9a/0xe0 __sys_sendmsg+0x7a/0xd0 do_syscall_64+0x9e/0x1a0 entry_SYSCALL_64_after_hwframe+0x77/0x7f [3] unreferenced object 0xffff898081f597a0 (size 32): comm "ip", pid 1626, jiffies 4294719324 hex dump (first 32 bytes): 20 01 0d b8 00 02 00 00 00 00 00 00 00 00 00 01 ............... 21 49 61 83 80 89 ff ff 00 00 00 00 01 00 00 00 !Ia............. backtrace (crc fd9be911): [<00000000df89c55d>] __kmalloc_cache_noprof+0x1da/0x260 [<00000000ff2a1ddb>] mlxsw_sp_ipv6_addr_kvdl_index_get+0x281/0x340 [<000000009ddd445d>] mlxsw_sp_router_netdevice_event+0x47b/0x1240 [<00000000743e7757>] notifier_call_chain+0x5a/0xd0 [<000000007c7b9e13>] call_netdevice_notifiers_info+0x39/0x90 [<000000002509645d>] register_netdevice+0x5f7/0x7a0 [<00000000c2e7d2a9>] ip6gre_newlink_common.isra.0+0x65/0x130 [<0000000087cd6d8d>] ip6gre_newlink+0x72/0x120 [<000000004df7c7cc>] rtnl_newlink+0x471/0xa20 [<0000000057ed632a>] rtnetlink_rcv_msg+0x142/0x3f0 [<0000000032e0d5b5>] netlink_rcv_skb+0x50/0x100 [<00000000908bca63>] netlink_unicast+0x242/0x390 [<00000000cdbe1c87>] netlink_sendmsg+0x1de/0x420 [<0000000011db153e>] ____sys_sendmsg+0x2bd/0x320 [<000000003b6d53eb>] ___sys_sendmsg+0x9a/0xe0 [<00000000cae27c62>] __sys_sendmsg+0x7a/0xd0 Fixes: `cf42911523` ("mlxsw: spectrum_ipip: Use common hash table for IPv6 address mapping") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/e91012edc5a6cb9df37b78fd377f669381facfcb.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-30 18:24:39 -07:00
Amit Cohen	d0fbdc3ae9	mlxsw: pci: Sync Rx buffers for device Non-coherent architectures, like ARM, may require invalidating caches before the device can use the DMA mapped memory, which means that before posting pages to device, drivers should sync the memory for device. Sync for device can be configured as page pool responsibility. Set the relevant flag and define max_len for sync. Cc: Jiri Pirko <jiri@resnulli.us> Fixes: `b5b60bb491` ("mlxsw: pci: Use page pool for Rx buffers allocation") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/92e01f05c4f506a4f0a9b39c10175dcc01994910.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-30 18:24:39 -07:00
Amit Cohen	15f73e601a	mlxsw: pci: Sync Rx buffers for CPU When Rx packet is received, drivers should sync the pages for CPU, to ensure the CPU reads the data written by the device and not stale data from its cache. Add the missing sync call in Rx path, sync the actual length of data for each fragment. Cc: Jiri Pirko <jiri@resnulli.us> Fixes: `b5b60bb491` ("mlxsw: pci: Use page pool for Rx buffers allocation") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/461486fac91755ca4e04c2068c102250026dcd0b.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-30 18:24:39 -07:00
Amit Cohen	0a66e5582b	mlxsw: spectrum_ptp: Add missing verification before pushing Tx header Tx header should be pushed for each packet which is transmitted via Spectrum ASICs. The cited commit moved the call to skb_cow_head() from mlxsw_sp_port_xmit() to functions which handle Tx header. In case that mlxsw_sp->ptp_ops->txhdr_construct() is used to handle Tx header, and txhdr_construct() is mlxsw_sp_ptp_txhdr_construct(), there is no call for skb_cow_head() before pushing Tx header size to SKB. This flow is relevant for Spectrum-1 and Spectrum-4, for PTP packets. Add the missing call to skb_cow_head() to make sure that there is both enough room to push the Tx header and that the SKB header is not cloned and can be modified. An additional set will be sent to net-next to centralize the handling of the Tx header by pushing it to every packet just before transmission. Cc: Richard Cochran <richardcochran@gmail.com> Fixes: `24157bc69f` ("mlxsw: Send PTP packets as data packets to overcome a limitation") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/5145780b07ebbb5d3b3570f311254a3a2d554a44.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-30 18:24:39 -07:00
Benoît Monin	04c20a9356	net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension As documented in skbuff.h, devices with NETIF_F_IPV6_CSUM capability can only checksum TCP and UDP over IPv6 if the IP header does not contains extension. This is enforced for UDP packets emitted from user-space to an IPv6 address as they go through ip6_make_skb(), which calls __ip6_append_data() where a check is done on the header size before setting CHECKSUM_PARTIAL. But the introduction of UDP encapsulation with fou6 added a code-path where it is possible to get an skb with a partial UDP checksum and an IPv6 header with extension: * fou6 adds a UDP header with a partial checksum if the inner packet does not contains a valid checksum. * ip6_tunnel adds an IPv6 header with a destination option extension header if encap_limit is non-zero (the default value is 4). The thread linked below describes in more details how to reproduce the problem with GRE-in-UDP tunnel. Add a check on the network header size in skb_csum_hwoffload_help() to make sure no IPv6 packet with extension header is handed to a network device with NETIF_F_IPV6_CSUM capability. Link: https://lore.kernel.org/netdev/26548921.1r3eYUQgxm@benoit.monin/T/#u Fixes: `aa3463d65e` ("fou: Add encap ops for IPv6 tunnels") Signed-off-by: Benoît Monin <benoit.monin@gmx.fr> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/5fbeecfc311ea182aa1d1c771725ab8b4cac515e.1729778144.git.benoit.monin@gmx.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-30 18:21:30 -07:00
Alice Ryhl	2313ab74c3	cfi: tweak llvm version for HAVE_CFI_ICALL_NORMALIZE_INTEGERS The llvm fix [1] did not make it for 19.0.0, but ended up getting backported to llvm 19.1.3 [2]. Thus, fix the version requirement to correctly specify which versions have the bug. Link: https://github.com/llvm/llvm-project/pull/104826 [1] Link: https://github.com/llvm/llvm-project/pull/113938 [2] Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202410281414.c351044e-oliver.sang@intel.com Fixes: `8b8ca9c25f` ("cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERS") Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20241030-cfi-icall-1913-v1-1-ab8a26e13733@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2024-10-31 00:41:37 +01:00
Peter Zijlstra	69d5e722be	sched/ext: Fix scx vs sched_delayed Commit `98442f0ccd` ("sched: Fix delayed_dequeue vs switched_from_fair()") forgot about scx :/ Fixes: `98442f0ccd` ("sched: Fix delayed_dequeue vs switched_from_fair()") Reported-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lkml.kernel.org/r/20241030104934.GK14555@noisy.programming.kicks-ass.net	2024-10-30 22:42:12 +01:00
Linus Torvalds	0fc810ae3a	x86/uaccess: Avoid barrier_nospec() in 64-bit copy_from_user() The barrier_nospec() in 64-bit copy_from_user() is slow. Instead use pointer masking to force the user pointer to all 1's for an invalid address. The kernel test robot reports a 2.6% improvement in the per_thread_ops benchmark [1]. This is a variation on a patch originally by Josh Poimboeuf [2]. Link: https://lore.kernel.org/202410281344.d02c72a2-oliver.sang@intel.com [1] Link: https://lore.kernel.org/5b887fe4c580214900e21f6c61095adf9a142735.1730166635.git.jpoimboe@kernel.org [2] Tested-and-reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-10-30 11:38:10 -10:00
Linus Torvalds	14b7d43c5c	Merge tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Update more header copies with the kernel sources, including const.h, msr-index.h, arm64's cputype.h, kvm's, bits.h and unaligned.h - The return from 'write' isn't a pid, fix cut'n'paste error in 'perf trace' - Fix up the python binding build on architectures without HAVE_KVM_STAT_SUPPORT - Add some more bounds checks to augmented_raw_syscalls.bpf.c (used to collect syscall pointer arguments in 'perf trace') to make the resulting bytecode to pass the kernel BPF verifier, allowing us to go back accepting clang 12.0.1 as the minimum version required for compiling BPF sources - Add __NR_capget for x86 to fix a regression on running perf + intel PT (hw tracing) as non-root setting up the capabilities as described in https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html - Fix missing syscalltbl in non-explicitly listed architectures, noticed on ARM 32-bit, that still needs a .tbl generator for the syscall id<->name tables, should be added for v6.13 - Handle 'perf test' failure when handling broken DWARF for ASM files * tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf cap: Add __NR_capget to arch/x86 unistd tools headers: Update the linux/unaligned.h copy with the kernel sources tools headers arm64: Sync arm64's cputype.h with the kernel sources tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources perf python: Fix up the build on architectures without HAVE_KVM_STAT_SUPPORT perf test: Handle perftool-testsuite_probe failure due to broken DWARF tools headers UAPI: Sync kvm headers with the kernel sources perf trace: Fix non-listed archs in the syscalltbl routines perf build: Change the clang check back to 12.0.1 perf trace augmented_raw_syscalls: Add more checks to pass the verifier perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers perf trace: The return from 'write' isn't a pid tools headers UAPI: Sync linux/const.h with the kernel headers	2024-10-30 11:17:47 -10:00
Chuck Lever	63a81588cd	rpcrdma: Always release the rpcrdma_device's xa_array Dai pointed out that the xa_init_flags() in rpcrdma_add_one() needs to have a matching xa_destroy() in rpcrdma_remove_one() to release underlying memory that the xarray might have accrued during operation. Reported-by: Dai Ngo <dai.ngo@oracle.com> Fixes: `7e86845a03` ("rpcrdma: Implement generic device removal") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-10-30 16:14:00 -04:00
Alexei Starovoitov	053b212b3a	Merge branch 'fixes-for-bits-iterator' Hou Tao says: ==================== The patch set fixes several issues in bits iterator. Patch #1 fixes the kmemleak problem of bits iterator. Patch #2~#3 fix the overflow problem of nr_bits. Patch #4 fixes the potential stack corruption when bits iterator is used on 32-bit host. Patch #5 adds more test cases for bits iterator. Please see the individual patches for more details. And comments are always welcome. --- v4: * patch #1: add ack from Yafang * patch #3: revert code-churn like changes: (1) compute nr_bytes and nr_bits before the check of nr_words. (2) use nr_bits == 64 to check for single u64, preventing build warning on 32-bit hosts. * patch #4: use "BITS_PER_LONG == 32" instead of "!defined(CONFIG_64BIT)" v3: https://lore.kernel.org/bpf/20241025013233.804027-1-houtao@huaweicloud.com/T/#t * split the bits-iterator related patches from "Misc fixes for bpf" patch set * patch #1: use "!nr_bits \|\| bits >= nr_bits" to stop the iteration * patch #2: add a new helper for the overflow problem * patch #3: decrease the limitation from 512 to 511 and check whether nr_bytes is too large for bpf memory allocator explicitly * patch #5: add two more test cases for bit iterator v2: http://lore.kernel.org/bpf/d49fa2f4-f743-c763-7579-c3cab4dd88cb@huaweicloud.com ==================== Link: https://lore.kernel.org/r/20241030100516.3633640-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-30 12:13:55 -07:00
Hou Tao	ebafc1e535	selftests/bpf: Add three test cases for bits_iter Add more test cases for bits iterator: (1) huge word test Verify the multiplication overflow of nr_bits in bits_iter. Without the overflow check, when nr_words is 67108865, nr_bits becomes 64, causing bpf_probe_read_kernel_common() to corrupt the stack. (2) max word test Verify correct handling of maximum nr_words value (511). (3) bad word test Verify early termination of bits iteration when bits iterator initialization fails. Also rename bits_nomem to bits_too_big to better reflect its purpose. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-6-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-30 12:13:46 -07:00
Hou Tao	e133938367	bpf: Use __u64 to save the bits in bits iterator On 32-bit hosts (e.g., arm32), when a bpf program passes a u64 to bpf_iter_bits_new(), bpf_iter_bits_new() will use bits_copy to store the content of the u64. However, bits_copy is only 4 bytes, leading to stack corruption. The straightforward solution would be to replace u64 with unsigned long in bpf_iter_bits_new(). However, this introduces confusion and problems for 32-bit hosts because the size of ulong in bpf program is 8 bytes, but it is treated as 4-bytes after passed to bpf_iter_bits_new(). Fix it by changing the type of both bits and bit_count from unsigned long to u64. However, the change is not enough. The main reason is that bpf_iter_bits_next() uses find_next_bit() to find the next bit and the pointer passed to find_next_bit() is an unsigned long pointer instead of a u64 pointer. For 32-bit little-endian host, it is fine but it is not the case for 32-bit big-endian host. Because under 32-bit big-endian host, the first iterated unsigned long will be the bits 32-63 of the u64 instead of the expected bits 0-31. Therefore, in addition to changing the type, swap the two unsigned longs within the u64 for 32-bit big-endian host. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-30 12:13:46 -07:00
Hou Tao	393397fbdc	bpf: Check the validity of nr_words in bpf_iter_bits_new() Check the validity of nr_words in bpf_iter_bits_new(). Without this check, when multiplication overflow occurs for nr_bits (e.g., when nr_words = 0x0400-0001, nr_bits becomes 64), stack corruption may occur due to bpf_probe_read_kernel_common(..., nr_bytes = 0x2000-0008). Fix it by limiting the maximum value of nr_words to 511. The value is derived from the current implementation of BPF memory allocator. To ensure compatibility if the BPF memory allocator's size limitation changes in the future, use the helper bpf_mem_alloc_check_size() to check whether nr_bytes is too larger. And return -E2BIG instead of -ENOMEM for oversized nr_bytes. Fixes: `4665415975` ("bpf: Add bits iterator") Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-30 12:13:46 -07:00
Hou Tao	62a898b07b	bpf: Add bpf_mem_alloc_check_size() helper Introduce bpf_mem_alloc_check_size() to check whether the allocation size exceeds the limitation for the kmalloc-equivalent allocator. The upper limit for percpu allocation is LLIST_NODE_SZ bytes larger than non-percpu allocation, so a percpu argument is added to the helper. The helper will be used in the following patch to check whether the size parameter passed to bpf_mem_alloc() is too big. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-30 12:13:46 -07:00
Hou Tao	101ccfbabf	bpf: Free dynamically allocated bits in bpf_iter_bits_destroy() bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the bits are dynamically allocated. However, the check is incorrect and may cause a kmemleak as shown below: unreferenced object 0xffff88812628c8c0 (size 32): comm "swapper/0", pid 1, jiffies 4294727320 hex dump (first 32 bytes): b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U........... f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 .............. backtrace (crc 781e32cc): [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 [<00000000b119f72f>] kernel_init+0x20/0x160 [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30 That is because nr_bits will be set as zero in bpf_iter_bits_next() after all bits have been iterated. Fix the issue by setting kit->bit to kit->nr_bits instead of setting kit->nr_bits to zero when the iteration completes in bpf_iter_bits_next(). In addition, use "!nr_bits \|\| bits >= nr_bits" to check whether the iteration is complete and still use "nr_bits > 64" to indicate whether bits are dynamically allocated. The "!nr_bits" check is necessary because bpf_iter_bits_new() may fail before setting kit->nr_bits, and this condition will stop the iteration early instead of accessing the zeroed or freed kit->bits. Considering the initial value of kit->bits is -1 and the type of kit->nr_bits is unsigned int, change the type of kit->nr_bits to int. The potential overflow problem will be handled in the following patch. Fixes: `4665415975` ("bpf: Add bits iterator") Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-30 12:13:46 -07:00
Sungwoo Kim	1e67d86418	Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecs Fix __hci_cmd_sync_sk() to return not NULL for unknown opcodes. __hci_cmd_sync_sk() returns NULL if a command returns a status event. However, it also returns NULL where an opcode doesn't exist in the hci_cc table because hci_cmd_complete_evt() assumes status = skb->data[0] for unknown opcodes. This leads to null-ptr-deref in cmd_sync for HCI_OP_READ_LOCAL_CODECS as there is no hci_cc for HCI_OP_READ_LOCAL_CODECS, which always assumes status = skb->data[0]. KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077] CPU: 1 PID: 2000 Comm: kworker/u9:5 Not tainted 6.9.0-ga6bcb805883c-dirty #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci7 hci_power_on RIP: 0010:hci_read_supported_codecs+0xb9/0x870 net/bluetooth/hci_codec.c:138 Code: 08 48 89 ef e8 b8 c1 8f fd 48 8b 75 00 e9 96 00 00 00 49 89 c6 48 ba 00 00 00 00 00 fc ff df 4c 8d 60 70 4c 89 e3 48 c1 eb 03 <0f> b6 04 13 84 c0 0f 85 82 06 00 00 41 83 3c 24 02 77 0a e8 bf 78 RSP: 0018:ffff888120bafac8 EFLAGS: 00010212 RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8881173f0040 RDX: dffffc0000000000 RSI: ffffffffa58496c0 RDI: ffff88810b9ad1e4 RBP: ffff88810b9ac000 R08: ffffffffa77882a7 R09: 1ffffffff4ef1054 R10: dffffc0000000000 R11: fffffbfff4ef1055 R12: 0000000000000070 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810b9ac000 FS: 0000000000000000(0000) GS:ffff8881f6c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6ddaa3439e CR3: 0000000139764003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> hci_read_local_codecs_sync net/bluetooth/hci_sync.c:4546 [inline] hci_init_stage_sync net/bluetooth/hci_sync.c:3441 [inline] hci_init4_sync net/bluetooth/hci_sync.c:4706 [inline] hci_init_sync net/bluetooth/hci_sync.c:4742 [inline] hci_dev_init_sync net/bluetooth/hci_sync.c:4912 [inline] hci_dev_open_sync+0x19a9/0x2d30 net/bluetooth/hci_sync.c:4994 hci_dev_do_open net/bluetooth/hci_core.c:483 [inline] hci_power_on+0x11e/0x560 net/bluetooth/hci_core.c:1015 process_one_work kernel/workqueue.c:3267 [inline] process_scheduled_works+0x8ef/0x14f0 kernel/workqueue.c:3348 worker_thread+0x91f/0xe50 kernel/workqueue.c:3429 kthread+0x2cb/0x360 kernel/kthread.c:388 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fixes: `abfeea476c` ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY") Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-10-30 14:49:09 -04:00
Linus Torvalds	4236f91380	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two small fixes, both in drivers (ufs and scsi_debug)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Fix another deadlock during RTC update scsi: scsi_debug: Fix do_device_access() handling of unexpected SG copy length	2024-10-30 08:16:23 -10:00
Chuck Lever	8286f8b622	NFSD: Never decrement pending_async_copies on error The error flow in nfsd4_copy() calls cleanup_async_copy(), which already decrements nn->pending_async_copies. Reported-by: Olga Kornievskaia <okorniev@redhat.com> Fixes: `aadc3bbea1` ("NFSD: Limit the number of concurrent async COPY operations") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-10-30 14:12:16 -04:00
Boris Brezillon	4700fd3e05	drm/panthor: Report group as timedout when we fail to properly suspend If we don't do that, the group is considered usable by userspace, but all further GROUP_SUBMIT will fail with -EINVAL. Changes in v3: - Add R-bs Changes in v2: - New patch Fixes: `de85488138` ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-3-boris.brezillon@collabora.com	2024-10-30 16:37:18 +01:00
Boris Brezillon	412a2a8fdd	drm/panthor: Fail job creation when the group is dead Userspace can use GROUP_SUBMIT errors as a trigger to check the group state and recreate the group if it became unusable. Make sure we report an error when the group became unusable. Changes in v3: - None Changes in v2: - Add R-bs Fixes: `de85488138` ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-2-boris.brezillon@collabora.com	2024-10-30 16:37:09 +01:00
Boris Brezillon	5d01b56f05	drm/panthor: Fix firmware initialization on systems with a page size > 4k The system and GPU MMU page size might differ, which becomes a problem for FW sections that need to be mapped at explicit addresses since our PAGE_SIZE alignment might cover a VA range that's expected to be used for another section. Make sure we never map more than we need. Changes in v3: - Add R-bs Changes in v2: - Plan for per-VM page sizes so the MCU VM and user VM can have different pages sizes Fixes: `2718d91816` ("drm/panthor: Add the FW logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030150231.768949-1-boris.brezillon@collabora.com	2024-10-30 16:30:21 +01:00
Keith Busch	5eed4fb274	nvme: re-fix error-handling for io_uring nvme-passthrough This was previously fixed with commit `1147dd0503` ("nvme: fix error-handling for io_uring nvme-passthrough"), but the change was mistakenly undone in a later commit. Fixes: `d6aacee925` ("nvme: use bio_integrity_map_user") Cc: stable@vger.kernel.org Reported-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-30 07:19:18 -07:00
Vitaliy Shevtsov	d2f551b1f7	nvmet-auth: assign dh_key to NULL after kfree_sensitive ctrl->dh_key might be used across multiple calls to nvmet_setup_dhgroup() for the same controller. So it's better to nullify it after release on error path in order to avoid double free later in nvmet_destroy_auth(). Found by Linux Verification Center (linuxtesting.org) with Svace. Fixes: `7a277c37d3` ("nvmet-auth: Diffie-Hellman key exchange support") Cc: stable@vger.kernel.org Signed-off-by: Vitaliy Shevtsov <v.shevtsov@maxima.ru> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-30 07:19:18 -07:00
Keith Busch	42ab37eaad	nvme: module parameter to disable pi with offsets A recent commit enables integrity checks for formats the previous kernel versions registered with the "nop" integrity profile. This means namespaces using that format become unreadable when upgrading the kernel past that commit. Introduce a module parameter to restore the "nop" integrity profile so that storage can be readable once again. This could be a boot device, so the setting needs to happen at module load time. Fixes: `921e81db52` ("nvme: allow integrity when PI is not in first bytes") Reported-by: David Wei <dw@davidwei.uk> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-30 07:19:18 -07:00
Christoffer Sandberg	e49370d769	ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1 Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029151653.80726-2-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-30 14:46:59 +01:00
Christoffer Sandberg	0b04fbe886	ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3 Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029151653.80726-1-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-30 14:46:54 +01:00
Jan Schär	4413665dd6	ALSA: usb-audio: Add quirks for Dell WD19 dock The WD19 family of docks has the same audio chipset as the WD15. This change enables jack detection on the WD19. We don't need the dell_dock_mixer_init quirk for the WD19. It is only needed because of the dell_alc4020_map quirk for the WD15 in mixer_maps.c, which disables the volume controls. Even for the WD15, this quirk was apparently only needed when the dock firmware was not updated. Signed-off-by: Jan Schär <jan@jschaer.ch> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029221249.15661-1-jan@jschaer.ch Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-30 14:46:49 +01:00
Takashi Iwai	7027eee090	Merge tag 'asoc-fix-v6.12-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.12 The biggest set of changes here is Hans' fixes and quirks for various Baytrail based platforms with RT5640 CODECs, and there's one core fix for a missed length assignment for __counted_by() checking. Otherwise it's small device specific fixes, several of them in the DT bindings.	2024-10-30 14:46:35 +01:00
Eric Dumazet	4ed234fe79	netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6() I got a syzbot report without a repro [1] crashing in nf_send_reset6() I think the issue is that dev->hard_header_len is zero, and we attempt later to push an Ethernet header. Use LL_MAX_HEADER, as other functions in net/ipv6/netfilter/nf_reject_ipv6.c. [1] skbuff: skb_under_panic: text:ffffffff89b1d008 len:74 put:14 head:ffff88803123aa00 data:ffff88803123a9f2 tail:0x3c end:0x140 dev:syz_tun kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 7373 Comm: syz.1.568 Not tainted 6.12.0-rc2-syzkaller-00631-g6d858708d465 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0d 8d 48 c7 c6 60 a6 29 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 ba 30 38 02 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc900045269b0 EFLAGS: 00010282 RAX: 0000000000000088 RBX: dffffc0000000000 RCX: cd66dacdc5d8e800 RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000000 RBP: ffff88802d39a3d0 R08: ffffffff8174afec R09: 1ffff920008a4ccc R10: dffffc0000000000 R11: fffff520008a4ccd R12: 0000000000000140 R13: ffff88803123aa00 R14: ffff88803123a9f2 R15: 000000000000003c FS: 00007fdbee5ff6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000005d322000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 eth_header+0x38/0x1f0 net/ethernet/eth.c:83 dev_hard_header include/linux/netdevice.h:3208 [inline] nf_send_reset6+0xce6/0x1270 net/ipv6/netfilter/nf_reject_ipv6.c:358 nft_reject_inet_eval+0x3b9/0x690 net/netfilter/nft_reject_inet.c:48 expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline] nft_do_chain+0x4ad/0x1da0 net/netfilter/nf_tables_core.c:288 nft_do_chain_inet+0x418/0x6b0 net/netfilter/nft_chain_filter.c:161 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] NF_HOOK include/linux/netfilter.h:312 [inline] br_nf_pre_routing_ipv6+0x63e/0x770 net/bridge/br_netfilter_ipv6.c:184 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_bridge_pre net/bridge/br_input.c:277 [inline] br_handle_frame+0x9fd/0x1530 net/bridge/br_input.c:424 __netif_receive_skb_core+0x13e8/0x4570 net/core/dev.c:5562 __netif_receive_skb_one_core net/core/dev.c:5666 [inline] __netif_receive_skb+0x12f/0x650 net/core/dev.c:5781 netif_receive_skb_internal net/core/dev.c:5867 [inline] netif_receive_skb+0x1e8/0x890 net/core/dev.c:5926 tun_rx_batched+0x1b7/0x8f0 drivers/net/tun.c:1550 tun_get_user+0x3056/0x47e0 drivers/net/tun.c:2007 tun_chr_write_iter+0x10d/0x1f0 drivers/net/tun.c:2053 new_sync_write fs/read_write.c:590 [inline] vfs_write+0xa6d/0xc90 fs/read_write.c:683 ksys_write+0x183/0x2b0 fs/read_write.c:736 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fdbeeb7d1ff Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 c9 8d 02 00 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 1c 8e 02 00 48 RSP: 002b:00007fdbee5ff000 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fdbeed36058 RCX: 00007fdbeeb7d1ff RDX: 000000000000008e RSI: 0000000020000040 RDI: 00000000000000c8 RBP: 00007fdbeebf12be R08: 0000000000000000 R09: 0000000000000000 R10: 000000000000008e R11: 0000000000000293 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fdbeed36058 R15: 00007ffc38de06e8 </TASK> Fixes: `c8d7b98bec` ("netfilter: move nf_send_resetX() code to nf_reject_ipvX modules") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-10-30 13:17:36 +01:00
Dong Chenchen	f48d258f0a	netfilter: Fix use-after-free in get_info() ip6table_nat module unload has refcnt warning for UAF. call trace is: WARNING: CPU: 1 PID: 379 at kernel/module/main.c:853 module_put+0x6f/0x80 Modules linked in: ip6table_nat(-) CPU: 1 UID: 0 PID: 379 Comm: ip6tables Not tainted 6.12.0-rc4-00047-gc2ee9f594da8-dirty #205 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:module_put+0x6f/0x80 Call Trace: <TASK> get_info+0x128/0x180 do_ip6t_get_ctl+0x6a/0x430 nf_getsockopt+0x46/0x80 ipv6_getsockopt+0xb9/0x100 rawv6_getsockopt+0x42/0x190 do_sock_getsockopt+0xaa/0x180 __sys_getsockopt+0x70/0xc0 __x64_sys_getsockopt+0x20/0x30 do_syscall_64+0xa2/0x1a0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Concurrent execution of module unload and get_info() trigered the warning. The root cause is as follows: cpu0 cpu1 module_exit //mod->state = MODULE_STATE_GOING ip6table_nat_exit xt_unregister_template kfree(t) //removed from templ_list getinfo() t = xt_find_table_lock list_for_each_entry(tmpl, &xt_templates[af]...) if (strcmp(tmpl->name, name)) continue; //table not found try_module_get list_for_each_entry(t, &xt_net->tables[af]...) return t; //not get refcnt module_put(t->me) //uaf unregister_pernet_subsys //remove table from xt_net list While xt_table module was going away and has been removed from xt_templates list, we couldnt get refcnt of xt_table->me. Check module in xt_net->tables list re-traversal to fix it. Fixes: `fdacd57c79` ("netfilter: x_tables: never register tables by default") Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-10-30 13:17:36 +01:00
Liu Jing	76342e8425	selftests: netfilter: remove unused parameter err is never used, remove it. Signed-off-by: Liu Jing <liujing@cmss.chinamobile.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-10-30 13:17:36 +01:00
Christoph Hellwig	81a1e1c32e	xfs: streamline xfs_filestream_pick_ag Directly return the error from xfs_bmap_longest_free_extent instead of breaking from the loop and handling it there, and use a done label to directly jump to the exist when we found a suitable perag structure to reduce the indentation level and pag/max_pag check complexity in the tail of the function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-30 11:27:18 +01:00
Christoph Hellwig	dc60992ce7	xfs: fix finding a last resort AG in xfs_filestream_pick_ag When the main loop in xfs_filestream_pick_ag fails to find a suitable AG it tries to just pick the online AG. But the loop for that uses args->pag as loop iterator while the later code expects pag to be set. Fix this by reusing the max_pag case for this last resort, and also add a check for impossible case of no AG just to make sure that the uninitialized pag doesn't even escape in theory. Reported-by: syzbot+4125a3c514e3436a02e6@syzkaller.appspotmail.com Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: syzbot+4125a3c514e3436a02e6@syzkaller.appspotmail.com Fixes: `f8f1ed1ab3` ("xfs: return a referenced perag from filestreams allocator") Cc: <stable@vger.kernel.org> # v6.3 Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-30 11:27:18 +01:00
Chi Zhiling	3ef2268403	xfs: Reduce unnecessary searches when searching for the best extents Recently, we found that the CPU spent a lot of time in xfs_alloc_ag_vextent_size when the filesystem has millions of fragmented spaces. The reason is that we conducted much extra searching for extents that could not yield a better result, and these searches would cost a lot of time when there were millions of extents to search through. Even if we get the same result length, we don't switch our choice to the new one, so we can definitely terminate the search early. Since the result length cannot exceed the found length, when the found length equals the best result length we already have, we can conclude the search. We did a test in that filesystem: [root@localhost ~]# xfs_db -c freesp /dev/vdb from to extents blocks pct 1 1 215 215 0.01 2 3 994476 1988952 99.99 Before this patch: 0) \| xfs_alloc_ag_vextent_size [xfs]() { 0) * 15597.94 us \| } After this patch: 0) \| xfs_alloc_ag_vextent_size [xfs]() { 0) 19.176 us \| } Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-30 11:27:18 +01:00
Ojaswin Mujoo	2a492ff666	xfs: Check for delayed allocations before setting extsize Extsize should only be allowed to be set on files with no data in it. For this, we check if the files have extents but miss to check if delayed extents are present. This patch adds that check. While we are at it, also refactor this check into a helper since it's used in some other places as well like xfs_inactive() or xfs_ioctl_setattr_xflags() Without the patch (SUCCEEDS) $ xfs_io -c 'open -f testfile' -c 'pwrite 0 1024' -c 'extsize 65536' wrote 1024/1024 bytes at offset 0 1 KiB, 1 ops; 0.0002 sec (4.628 MiB/sec and 4739.3365 ops/sec) With the patch (FAILS as expected) $ xfs_io -c 'open -f testfile' -c 'pwrite 0 1024' -c 'extsize 65536' wrote 1024/1024 bytes at offset 0 1 KiB, 1 ops; 0.0002 sec (4.628 MiB/sec and 4739.3365 ops/sec) xfs_io: FS_IOC_FSSETXATTR testfile: Invalid argument Fixes: `e94af02a9c` ("[XFS] fix old xfs_setattr mis-merge from irix; mostly harmless esp if not using xfs rt") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-30 11:27:18 +01:00
Andrzej Kacprowski	72f7e16ecc	accel/ivpu: Fix NOC firewall interrupt handling The NOC firewall interrupt means that the HW prevented unauthorized access to a protected resource, so there is no need to trigger device reset in such case. To facilitate security testing add firewall_irq_counter debugfs file that tracks firewall interrupts. Fixes: `8a27ad81f7` ("accel/ivpu: Split IP and buttress code") Cc: stable@vger.kernel.org # v6.11+ Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241017144958.79327-1-jacek.lawrynowicz@linux.intel.com	2024-10-30 10:17:00 +01:00
Eduard Zingerman	d0b98f6a17	bpf: disallow 40-bytes extra stack for bpf_fastcall patterns Hou Tao reported an issue with bpf_fastcall patterns allowing extra stack space above MAX_BPF_STACK limit. This extra stack allowance is not integrated properly with the following verifier parts: - backtracking logic still assumes that stack can't exceed MAX_BPF_STACK; - bpf_verifier_env->scratched_stack_slots assumes only 64 slots are available. Here is an example of an issue with precision tracking (note stack slot -8 tracked as precise instead of -520): 0: (b7) r1 = 42 ; R1_w=42 1: (b7) r2 = 42 ; R2_w=42 2: (7b) (u64 )(r10 -512) = r1 ; R1_w=42 R10=fp0 fp-512_w=42 3: (7b) (u64 )(r10 -520) = r2 ; R2_w=42 R10=fp0 fp-520_w=42 4: (85) call bpf_get_smp_processor_id#8 ; R0_w=scalar(...) 5: (79) r2 = (u64 )(r10 -520) ; R2_w=42 R10=fp0 fp-520_w=42 6: (79) r1 = (u64 )(r10 -512) ; R1_w=42 R10=fp0 fp-512_w=42 7: (bf) r3 = r10 ; R3_w=fp0 R10=fp0 8: (0f) r3 += r2 mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx -1 mark_precise: frame0: regs=r2 stack= before 7: (bf) r3 = r10 mark_precise: frame0: regs=r2 stack= before 6: (79) r1 = (u64 )(r10 -512) mark_precise: frame0: regs=r2 stack= before 5: (79) r2 = (u64 )(r10 -520) mark_precise: frame0: regs= stack=-8 before 4: (85) call bpf_get_smp_processor_id#8 mark_precise: frame0: regs= stack=-8 before 3: (7b) (u64 )(r10 -520) = r2 mark_precise: frame0: regs=r2 stack= before 2: (7b) (u64 )(r10 -512) = r1 mark_precise: frame0: regs=r2 stack= before 1: (b7) r2 = 42 9: R2_w=42 R3_w=fp42 9: (95) exit This patch disables the additional allowance for the moment. Also, two test cases are removed: - bpf_fastcall_max_stack_ok: it fails w/o additional stack allowance; - bpf_fastcall_max_stack_fail: this test is no longer necessary, stack size follows regular rules, pattern invalidation is checked by other test cases. Reported-by: Hou Tao <houtao@huaweicloud.com> Closes: https://lore.kernel.org/bpf/20241023022752.172005-1-houtao@huaweicloud.com/ Fixes: `5b5f51bff1` ("bpf: no_caller_saved_registers attribute for helper calls") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241029193911.1575719-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-29 19:43:16 -07:00
Linus Torvalds	c1e939a21e	Merge tag 'cgroup-for-6.12-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - cgroup_bpf_release_fn() could saturate system_wq with cgrp->bpf.release_work which can then form a circular dependency leading to deadlocks. Fix by using a dedicated workqueue. The system_wq's max concurrency limit is being increased separately. - Fix theoretical off-by-one bug when enforcing max cgroup hierarchy depth * tag 'cgroup-for-6.12-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Fix potential overflow issue when checking max_depth cgroup/bpf: use a dedicated workqueue for cgroup bpf destruction	2024-10-29 16:41:30 -10:00
Linus Torvalds	daa9f66fe1	Merge tag 'sched_ext-for-6.12-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Instances of scx_ops_bypass() could race each other leading to misbehavior. Fix by protecting the operation with a spinlock. - selftest and userspace header fixes * tag 'sched_ext-for-6.12-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Fix enq_last_no_enq_fails selftest sched_ext: Make cast_mask() inline scx: Fix raciness in scx_ops_bypass() scx: Fix exit selftest to use custom DSQ sched_ext: Fix function pointer type mismatches in BPF selftests selftests/sched_ext: add order-only dependency of runner.o on BPFOBJ	2024-10-29 16:35:40 -10:00
Linus Torvalds	7fbaacafbc	Merge tag 'slab-for-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fixes from Vlastimil Babka: - Fix for a slub_kunit test warning with MEM_ALLOC_PROFILING_DEBUG (Pei Xiao) - Fix for a MTE-based KASAN BUG in krealloc() (Qun-Wei Lin) * tag 'slab-for-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm: krealloc: Fix MTE false alarm in __do_krealloc slub/kunit: fix a WARNING due to unwrapped __kmalloc_cache_noprof	2024-10-29 16:24:02 -10:00
Linus Torvalds	9251e3e93c	Merge tag 'mm-hotfixes-stable-2024-10-28-21-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "21 hotfixes. 13 are cc:stable. 13 are MM and 8 are non-MM. No particular theme here - mainly singletons, a couple of doubletons. Please see the changelogs" * tag 'mm-hotfixes-stable-2024-10-28-21-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits) mm: avoid unconditional one-tick sleep when swapcache_prepare fails mseal: update mseal.rst mm: split critical region in remap_file_pages() and invoke LSMs in between selftests/mm: fix deadlock for fork after pthread_create with atomic_bool Revert "selftests/mm: replace atomic_bool with pthread_barrier_t" Revert "selftests/mm: fix deadlock for fork after pthread_create on ARM" tools: testing: add expand-only mode VMA test mm/vma: add expand-only VMA merge mode and optimise do_brk_flags() resource,kexec: walk_system_ram_res_rev must retain resource flags nilfs2: fix kernel bug due to missing clearing of checked flag mm: numa_clear_kernel_node_hotplug: Add NUMA_NO_NODE check for node id ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow mm: shmem: fix data-race in shmem_getattr() mm: mark mas allocation in vms_abort_munmap_vmas as __GFP_NOFAIL x86/traps: move kmsan check after instrumentation_begin resource: remove dependency on SPARSEMEM from GET_FREE_REGION mm/mmap: fix race in mmap_region() with ftruncate() mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves fork: only invoke khugepaged, ksm hooks if no error fork: do not invoke uffd on fork if error occurs ...	2024-10-29 16:19:15 -10:00
Linus Torvalds	d5b2ee0fe8	Merge tag 'tpmdd-next-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fix from Jarkko Sakkinen: "Address a significant boot-time delay issue" Link: https://bugzilla.kernel.org/show_bug.cgi?id=219229 * tag 'tpmdd-next-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Lazily flush the auth session tpm: Rollback tpm2_load_null() tpm: Return tpm2_sessions_init() when null key creation fails	2024-10-29 16:04:24 -10:00
Jakub Kicinski	c05c62850a	Merge tag 'wireless-2024-10-29' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== wireless fixes for v6.12-rc6 Another set of fixes, mostly iwlwifi: * fix infinite loop in 6 GHz scan if more than 255 colocated APs were reported * revert removal of retry loops for now to work around issues with firmware initialization on some devices/platforms * fix SAR table issues with some BIOSes * fix race in suspend/debug collection * fix memory leak in fw recovery * fix link ID leak in AP mode for older devices * fix sending TX power constraints * fix link handling in FW restart And also the stack: * fix setting TX power from userspace with the new chanctx emulation code for old-style drivers * fix a memory corruption bug due to structure embedding * fix CQM configuration double-free when moving between net namespaces * tag 'wireless-2024-10-29' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: mac80211: ieee80211_i: Fix memory corruption bug in struct ieee80211_chanctx wifi: iwlwifi: mvm: fix 6 GHz scan construction wifi: cfg80211: clear wdev->cqm_config pointer on free mac80211: fix user-power when emulating chanctx Revert "wifi: iwlwifi: remove retry loops in start" wifi: iwlwifi: mvm: don't add default link in fw restart flow wifi: iwlwifi: mvm: Fix response handling in iwl_mvm_send_recovery_cmd() wifi: iwlwifi: mvm: SAR table alignment wifi: iwlwifi: mvm: Use the sync timepoint API in suspend wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd wifi: iwlwifi: mvm: don't leak a link on AP removal ==================== Link: https://patch.msgid.link/20241029093926.13750-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 18:57:12 -07:00
Wang Liang	9ab5cf19fb	net: fix crash when config small gso_max_size/gso_ipv4_max_size Config a small gso_max_size/gso_ipv4_max_size will lead to an underflow in sk_dst_gso_max_size(), which may trigger a BUG_ON crash, because sk->sk_gso_max_size would be much bigger than device limits. Call Trace: tcp_write_xmit tso_segs = tcp_init_tso_segs(skb, mss_now); tcp_set_skb_tso_segs tcp_skb_pcount_set // skb->len = 524288, mss_now = 8 // u16 tso_segs = 524288/8 = 65535 -> 0 tso_segs = DIV_ROUND_UP(skb->len, mss_now) BUG_ON(!tso_segs) Add check for the minimum value of gso_max_size and gso_ipv4_max_size. Fixes: `46e6b992c2` ("rtnetlink: allow GSO maximums to be set on device creation") Fixes: `9eefedd58a` ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device") Signed-off-by: Wang Liang <wangliang74@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241023035213.517386-1-wangliang74@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 16:18:23 -07:00
Zhihao Cheng	aec8e6bf83	btrfs: fix use-after-free of block device file in __btrfs_free_extra_devids() Mounting btrfs from two images (which have the same one fsid and two different dev_uuids) in certain executing order may trigger an UAF for variable 'device->bdev_file' in __btrfs_free_extra_devids(). And following are the details: 1. Attach image_1 to loop0, attach image_2 to loop1, and scan btrfs devices by ioctl(BTRFS_IOC_SCAN_DEV): / btrfs_device_1 → loop0 fs_device \ btrfs_device_2 → loop1 2. mount /dev/loop0 /mnt btrfs_open_devices btrfs_device_1->bdev_file = btrfs_get_bdev_and_sb(loop0) btrfs_device_2->bdev_file = btrfs_get_bdev_and_sb(loop1) btrfs_fill_super open_ctree fail: btrfs_close_devices // -ENOMEM btrfs_close_bdev(btrfs_device_1) fput(btrfs_device_1->bdev_file) // btrfs_device_1->bdev_file is freed btrfs_close_bdev(btrfs_device_2) fput(btrfs_device_2->bdev_file) 3. mount /dev/loop1 /mnt btrfs_open_devices btrfs_get_bdev_and_sb(&bdev_file) // EIO, btrfs_device_1->bdev_file is not assigned, // which points to a freed memory area btrfs_device_2->bdev_file = btrfs_get_bdev_and_sb(loop1) btrfs_fill_super open_ctree btrfs_free_extra_devids if (btrfs_device_1->bdev_file) fput(btrfs_device_1->bdev_file) // UAF ! Fix it by setting 'device->bdev_file' as 'NULL' after closing the btrfs_device in btrfs_close_one_device(). Fixes: `1423881941` ("btrfs: do not background blkdev_put()") CC: stable@vger.kernel.org # 4.19+ Link: https://bugzilla.kernel.org/show_bug.cgi?id=219408 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-29 21:59:25 +01:00
Byeonguk Jeong	d7f214aeac	selftests/bpf: Add test for trie_get_next_key() Add a test for out-of-bounds write in trie_get_next_key() when a full path from root to leaf exists and bpf_map_get_next_key() is called with the leaf node. It may crashes the kernel on failure, so please run in a VM. Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/Zxx4ep78tsbeWPVM@localhost.localdomain Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-29 13:41:41 -07:00
Byeonguk Jeong	13400ac8fb	bpf: Fix out-of-bounds write in trie_get_next_key() trie_get_next_key() allocates a node stack with size trie->max_prefixlen, while it writes (trie->max_prefixlen + 1) nodes to the stack when it has full paths from the root to leaves. For example, consider a trie with max_prefixlen is 8, and the nodes with key 0x00/0, 0x00/1, 0x00/2, ... 0x00/8 inserted. Subsequent calls to trie_get_next_key with _key with .prefixlen = 8 make 9 nodes be written on the node stack with size 8. Fixes: `b471f2f1de` ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map") Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org> Tested-by: Hou Tao <houtao1@huawei.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/Zxx384ZfdlFYnz6J@localhost.localdomain Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-29 13:41:40 -07:00
Chuck Lever	63fab04cbd	NFSD: Initialize struct nfsd4_copy earlier Ensure the refcount and async_copies fields are initialized early. cleanup_async_copy() will reference these fields if an error occurs in nfsd4_copy(). If they are not correctly initialized, at the very least, a refcount underflow occurs. Reported-by: Olga Kornievskaia <okorniev@redhat.com> Fixes: `aadc3bbea1` ("NFSD: Limit the number of concurrent async COPY operations") Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-10-29 15:31:18 -04:00
Mark Brown	2db63e9218	wcd937x codec fixes Merge series from Alexey Klimov <alexey.klimov@linaro.org>: This sent as RFC because of the following: - regarding the LO switch patch. I've got info about that from two persons independently hence not sure what tags to put there and who should be the author. Please let me know if that needs to be corrected. - the wcd937x pdm watchdog is a problem for audio playback and needs to be fixed. The minimal fix would be to at least increase timeout value but it will still trigger in case of plenty of dbg messages or other delay-generating things. Unfortunately, I can't test HPHL/R outputs hence the patch is only for AUX. The other options would be introducing module parameter for debugging and using HOLD_OFF bit for that or adding Kconfig option. Alexey Klimov (2): ASoC: codecs: wcd937x: add missing LO Switch control ASoC: codecs: wcd937x: relax the AUX PDM watchdog sound/soc/codecs/wcd937x.c \| 12 ++++++++++-- sound/soc/codecs/wcd937x.h \| 4 ++++ 2 files changed, 14 insertions(+), 2 deletions(-) -- 2.45.2	2024-10-29 19:18:48 +00:00
Benoît Monin	6b3f18a76b	net: usb: qmi_wwan: add Quectel RG650V Add support for Quectel RG650V which is based on Qualcomm SDX65 chip. The composition is DIAG / NMEA / AT / AT / QMI. T: Bus=02 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2c7c ProdID=0122 Rev=05.15 S: Manufacturer=Quectel S: Product=RG650V-EU S: SerialNumber=xxxxxxx C: #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=9ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=9ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=9ms Signed-off-by: Benoît Monin <benoit.monin@gmx.fr> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241024151113.53203-1-benoit.monin@gmx.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 11:49:29 -07:00
Vladimir Oltean	a13e690191	net/sched: sch_api: fix xa_insert() error path in tcf_block_get_ext() This command: $ tc qdisc replace dev eth0 ingress_block 1 egress_block 1 clsact Error: block dev insert failed: -EBUSY. fails because user space requests the same block index to be set for both ingress and egress. [ side note, I don't think it even failed prior to commit `913b47d342` ("net/sched: Introduce tc block netdev tracking infra"), because this is a command from an old set of notes of mine which used to work, but alas, I did not scientifically bisect this ] The problem is not that it fails, but rather, that the second time around, it fails differently (and irrecoverably): $ tc qdisc replace dev eth0 ingress_block 1 egress_block 1 clsact Error: dsa_core: Flow block cb is busy. [ another note: the extack is added by me for illustration purposes. the context of the problem is that clsact_init() obtains the same &q->ingress_block pointer as &q->egress_block, and since we call tcf_block_get_ext() on both of them, "dev" will be added to the block->ports xarray twice, thus failing the operation: once through the ingress block pointer, and once again through the egress block pointer. the problem itself is that when xa_insert() fails, we have emitted a FLOW_BLOCK_BIND command through ndo_setup_tc(), but the offload never sees a corresponding FLOW_BLOCK_UNBIND. ] Even correcting the bad user input, we still cannot recover: $ tc qdisc replace dev swp3 ingress_block 1 egress_block 2 clsact Error: dsa_core: Flow block cb is busy. Basically the only way to recover is to reboot the system, or unbind and rebind the net device driver. To fix the bug, we need to fill the correct error teardown path which was missed during code movement, and call tcf_block_offload_unbind() when xa_insert() fails. [ last note, fundamentally I blame the label naming convention in tcf_block_get_ext() for the bug. The labels should be named after what they do, not after the error path that jumps to them. This way, it is obviously wrong that two labels pointing to the same code mean something is wrong, and checking the code correctness at the goto site is also easier ] Fixes: `94e2557d08` ("net: sched: move block device tracking into tcf_block_get/put_ext()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20241023100541.974362-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 11:45:23 -07:00
Zichen Xie	4ce1f56a1e	netdevsim: Add trailing zero to terminate the string in nsim_nexthop_bucket_activity_write() This was found by a static analyzer. We should not forget the trailing zero after copy_from_user() if we will further do some string operations, sscanf() in this case. Adding a trailing zero will ensure that the function performs properly. Fixes: `c6385c0b67` ("netdevsim: Allow reporting activity on nexthop buckets") Signed-off-by: Zichen Xie <zichenxie0106@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20241022171907.8606-1-zichenxie0106@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 11:43:01 -07:00
Eduard Zingerman	1fb315892d	selftests/bpf: Test with a very short loop The test added is a simplified reproducer from syzbot report [1]. If verifier does not insert checkpoint somewhere inside the loop, verification of the program would take a very long time. This would happen because mark_chain_precision() for register r7 would constantly trace jump history of the loop back, processing many iterations for each mark_chain_precision() call. [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20241029172641.1042523-2-eddyz87@gmail.com	2024-10-29 11:42:23 -07:00
Eduard Zingerman	aa30eb3260	bpf: Force checkpoint when jmp history is too long A specifically crafted program might trick verifier into growing very long jump history within a single bpf_verifier_state instance. Very long jump history makes mark_chain_precision() unreasonably slow, especially in case if verifier processes a loop. Mitigate this by forcing new state in is_state_visited() in case if current state's jump history is too long. Use same constant as in `skip_inf_loop_check`, but multiply it by arbitrarily chosen value 2 to account for jump history containing not only information about jumps, but also information about stack access. For an example of problematic program consider the code below, w/o this patch the example is processed by verifier for ~15 minutes, before failing to allocate big-enough chunk for jmp_history. 0: r7 = (u16 )(r1 +0);" 1: r7 += 0x1ab064b9;" 2: if r7 & 0x702000 goto 1b; 3: r7 &= 0x1ee60e;" 4: r7 += r1;" 5: if r7 s> 0x37d2 goto +0;" 6: r0 = 0;" 7: exit;" Perf profiling shows that most of the time is spent in mark_chain_precision() ~95%. The easiest way to explain why this program causes problems is to apply the following patch: diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 0c216e71cec7..4b4823961abe 100644 \--- a/include/linux/bpf.h \+++ b/include/linux/bpf.h \@@ -1926,7 +1926,7 @@ struct bpf_array { }; }; -#define BPF_COMPLEXITY_LIMIT_INSNS 1000000 /* yes. 1M insns / +#define BPF_COMPLEXITY_LIMIT_INSNS 256 / yes. 1M insns / #define MAX_TAIL_CALL_CNT 33 / Maximum number of loops for bpf_loop and bpf_iter_num. diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f514247ba8ba..75e88be3bb3e 100644 \--- a/kernel/bpf/verifier.c \+++ b/kernel/bpf/verifier.c \@@ -18024,8 +18024,13 @@ static int is_state_visited(struct bpf_verifier_env env, int insn_idx) skip_inf_loop_check: if (!force_new_state && env->jmps_processed - env->prev_jmps_processed < 20 && - env->insn_processed - env->prev_insn_processed < 100) + env->insn_processed - env->prev_insn_processed < 100) { + verbose(env, "is_state_visited: suppressing checkpoint at %d, %d jmps processed, cur->jmp_history_cnt is %d\n", + env->insn_idx, + env->jmps_processed - env->prev_jmps_processed, + cur->jmp_history_cnt); add_new_state = false; + } goto miss; } / If sl->state is a part of a loop and this loop's entry is a part of \@@ -18142,6 +18147,9 @@ static int is_state_visited(struct bpf_verifier_env env, int insn_idx) if (!add_new_state) return 0; + verbose(env, "is_state_visited: new checkpoint at %d, resetting env->jmps_processed\n", + env->insn_idx); + / There were no equivalent states, remember the current one. * Technically the current state is not proven to be safe yet, * but it will either reach outer most bpf_exit (which means it's safe) And observe verification log: ... is_state_visited: new checkpoint at 5, resetting env->jmps_processed 5: R1=ctx() R7=ctx(...) 5: (65) if r7 s> 0x37d2 goto pc+0 ; R7=ctx(...) 6: (b7) r0 = 0 ; R0_w=0 7: (95) exit from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0 6: R1=ctx() R7=ctx(...) R10=fp0 6: (b7) r0 = 0 ; R0_w=0 7: (95) exit is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74 from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0 1: R1=ctx() R7_w=scalar(...) R10=fp0 1: (07) r7 += 447767737 is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75 2: R7_w=scalar(...) 2: (45) if r7 & 0x702000 goto pc-2 ... mark_precise 152 steps for r7 ... 2: R7_w=scalar(...) is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75 1: (07) r7 += 447767737 is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76 2: R7_w=scalar(...) 2: (45) if r7 & 0x702000 goto pc-2 ... BPF program is too large. Processed 257 insn The log output shows that checkpoint at label (1) is never created, because it is suppressed by `skip_inf_loop_check` logic: a. When 'if' at (2) is processed it pushes a state with insn_idx (1) onto stack and proceeds to (3); b. At (5) checkpoint is created, and this resets env->{jmps,insns}_processed. c. Verification proceeds and reaches `exit`; d. State saved at step (a) is popped from stack and is_state_visited() considers if checkpoint needs to be added, but because env->{jmps,insns}_processed had been just reset at step (b) the `skip_inf_loop_check` logic forces `add_new_state` to false. e. Verifier proceeds with current state, which slowly accumulates more and more entries in the jump history. The accumulation of entries in the jump history is a problem because of two factors: - it eventually exhausts memory available for kmalloc() allocation; - mark_chain_precision() traverses the jump history of a state, meaning that if `r7` is marked precise, verifier would iterate ever growing jump history until parent state boundary is reached. (note: the log also shows a REG INVARIANTS VIOLATION warning upon jset processing, but that's another bug to fix). With this patch applied, the example above is rejected by verifier under 1s of time, reaching 1M instructions limit. The program is a simplified reproducer from syzbot report. Previous discussion could be found at [1]. The patch does not cause any changes in verification performance, when tested on selftests from veristat.cfg and cilium programs taken from [2]. [1] https://lore.kernel.org/bpf/20241009021254.2805446-1-eddyz87@gmail.com/ [2] https://github.com/anakryiko/cilium Changelog: - v1 -> v2: - moved patch to bpf tree; - moved force_new_state variable initialization after declaration and shortened the comment. v1: https://lore.kernel.org/bpf/20241018020307.1766906-1-eddyz87@gmail.com/ Fixes: `2589726d12` ("bpf: introduce bounded loops") Reported-by: syzbot+7e46cdef14bf496a3ab4@syzkaller.appspotmail.com Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20241029172641.1042523-1-eddyz87@gmail.com Closes: https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/	2024-10-29 11:42:21 -07:00
Pedro Tammela	2e95c43844	net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT In qdisc_tree_reduce_backlog, Qdiscs with major handle ffff: are assumed to be either root or ingress. This assumption is bogus since it's valid to create egress qdiscs with major handle ffff: Budimir Markovic found that for qdiscs like DRR that maintain an active class list, it will cause a UAF with a dangling class pointer. In `066a3b5b23`, the concern was to avoid iterating over the ingress qdisc since its parent is itself. The proper fix is to stop when parent TC_H_ROOT is reached because the only way to retrieve ingress is when a hierarchy which does not contain a ffff: major handle call into qdisc_lookup with TC_H_MAJ(TC_H_ROOT). In the scenario where major ffff: is an egress qdisc in any of the tree levels, the updates will also propagate to TC_H_ROOT, which then the iteration must stop. Fixes: `066a3b5b23` ("[NET_SCHED] sch_api: fix qdisc_tree_decrease_qlen() loop") Reported-by: Budimir Markovic <markovicbudimir@gmail.com> Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> net/sched/sch_api.c \| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241024165547.418570-1-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 11:32:26 -07:00
Florian Westphal	c59d72d0a4	selftests: netfilter: nft_flowtable.sh: make first pass deterministic The CI occasionaly encounters a failing test run. Example: # PASS: ipsec tunnel mode for ns1/ns2 # re-run with random mtus: -o 10966 -l 19499 -r 31322 # PASS: flow offloaded for ns1/ns2 [..] # FAIL: ipsec tunnel ... counter 1157059 exceeds expected value 878489 This script will re-exec itself, on the second run, random MTUs are chosen for the involved links. This is done so we can cover different combinations (large mtu on client, small on server, link has lowest mtu, etc). Furthermore, file size is random, even for the first run. Rework this script and always use the same file size on initial run so that at least the first round can be expected to have reproducible behavior. Second round will use random mtu/filesize. Raise the failure limit to that of the file size, this should avoid all errneous test errors. Currently, first fin will remove the offload, so if one peer is already closing remaining data is handled by classic path, which result in larger-than-expected counter and a test failure. Given packet path also counts tcp/ip headers, in case offload is completely broken this test will still fail (as expected). The test counter limit could be made more strict again in the future once flowtable can keep a connection in offloaded state until FINs in both directions were seen. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241022152324.13554-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 11:26:09 -07:00
Pablo Neira Ayuso	7515e37bce	gtp: allow -1 to be specified as file description from userspace Existing user space applications maintained by the Osmocom project are breaking since a recent fix that addresses incorrect error checking. Restore operation for user space programs that specify -1 as file descriptor to skip GTPv0 or GTPv1 only sockets. Fixes: `defd8b3c37` ("gtp: fix a potential NULL pointer dereference") Reported-by: Pau Espin Pedrol <pespin@sysmocom.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Oliver Smith <osmith@sysmocom.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241022144825.66740-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 11:24:08 -07:00
Matt Johnston	01e215975f	mctp i2c: handle NULL header address daddr can be NULL if there is no neighbour table entry present, in that case the tx packet should be dropped. saddr will usually be set by MCTP core, but check for NULL in case a packet is transmitted by a different protocol. Fixes: `f5b8abf9fc` ("mctp i2c: MCTP I2C binding driver") Cc: stable@vger.kernel.org Reported-by: Dung Cao <dung@os.amperecomputing.com> Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241022-mctp-i2c-null-dest-v3-1-e929709956c5@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 11:22:38 -07:00
Ido Schimmel	90e0569dd3	ipv4: ip_tunnel: Fix suspicious RCU usage warning in ip_tunnel_find() The per-netns IP tunnel hash table is protected by the RTNL mutex and ip_tunnel_find() is only called from the control path where the mutex is taken. Add a lockdep expression to hlist_for_each_entry_rcu() in ip_tunnel_find() in order to validate that the mutex is held and to silence the suspicious RCU usage warning [1]. [1] WARNING: suspicious RCU usage 6.12.0-rc3-custom-gd95d9a31aceb #139 Not tainted ----------------------------- net/ipv4/ip_tunnel.c:221 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by ip/362: #0: ffffffff86fc7cb0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x377/0xf60 stack backtrace: CPU: 12 UID: 0 PID: 362 Comm: ip Not tainted 6.12.0-rc3-custom-gd95d9a31aceb #139 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0xba/0x110 lockdep_rcu_suspicious.cold+0x4f/0xd6 ip_tunnel_find+0x435/0x4d0 ip_tunnel_newlink+0x517/0x7a0 ipgre_newlink+0x14c/0x170 __rtnl_newlink+0x1173/0x19c0 rtnl_newlink+0x6c/0xa0 rtnetlink_rcv_msg+0x3cc/0xf60 netlink_rcv_skb+0x171/0x450 netlink_unicast+0x539/0x7f0 netlink_sendmsg+0x8c1/0xd80 ____sys_sendmsg+0x8f9/0xc20 ___sys_sendmsg+0x197/0x1e0 __sys_sendmsg+0x122/0x1f0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: `c544193214` ("GRE: Refactor GRE tunneling code.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241023123009.749764-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 11:12:53 -07:00
Ido Schimmel	ad4a3ca6a8	ipv4: ip_tunnel: Fix suspicious RCU usage warning in ip_tunnel_init_flow() There are code paths from which the function is called without holding the RCU read lock, resulting in a suspicious RCU usage warning [1]. Fix by using l3mdev_master_upper_ifindex_by_index() which will acquire the RCU read lock before calling l3mdev_master_upper_ifindex_by_index_rcu(). [1] WARNING: suspicious RCU usage 6.12.0-rc3-custom-gac8f72681cf2 #141 Not tainted ----------------------------- net/core/dev.c:876 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by ip/361: #0: ffffffff86fc7cb0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x377/0xf60 stack backtrace: CPU: 3 UID: 0 PID: 361 Comm: ip Not tainted 6.12.0-rc3-custom-gac8f72681cf2 #141 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0xba/0x110 lockdep_rcu_suspicious.cold+0x4f/0xd6 dev_get_by_index_rcu+0x1d3/0x210 l3mdev_master_upper_ifindex_by_index_rcu+0x2b/0xf0 ip_tunnel_bind_dev+0x72f/0xa00 ip_tunnel_newlink+0x368/0x7a0 ipgre_newlink+0x14c/0x170 __rtnl_newlink+0x1173/0x19c0 rtnl_newlink+0x6c/0xa0 rtnetlink_rcv_msg+0x3cc/0xf60 netlink_rcv_skb+0x171/0x450 netlink_unicast+0x539/0x7f0 netlink_sendmsg+0x8c1/0xd80 ____sys_sendmsg+0x8f9/0xc20 ___sys_sendmsg+0x197/0x1e0 __sys_sendmsg+0x122/0x1f0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: `db53cd3d88` ("net: Handle l3mdev in ip_tunnel_init_flow") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20241022063822.462057-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-29 11:12:25 -07:00
Kevin Brodsky	2e8a1acea8	arm64: signal: Improve POR_EL0 handling to avoid uaccess failures Reset POR_EL0 to "allow all" before writing the signal frame, preventing spurious uaccess failures. When POE is supported, the POR_EL0 register constrains memory accesses based on the target page's POIndex (pkey). This raises the question: what constraints should apply to a signal handler? The current answer is that POR_EL0 is reset to POR_EL0_INIT when invoking the handler, giving it full access to POIndex 0. This is in line with x86's MPK support and remains unchanged. This is only part of the story, though. POR_EL0 constrains all unprivileged memory accesses, meaning that uaccess routines such as put_user() are also impacted. As a result POR_EL0 may prevent the signal frame from being written to the signal stack (ultimately causing a SIGSEGV). This is especially concerning when an alternate signal stack is used, because userspace may want to prevent access to it outside of signal handlers. There is currently no provision for that: POR_EL0 is reset after writing to the stack, and POR_EL0_INIT only enables access to POIndex 0. This patch ensures that POR_EL0 is reset to its most permissive state before the signal stack is accessed. Once the signal frame has been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up to the signal handler to enable access to additional pkeys if needed. As to sigreturn(), it expects having access to the stack like any other syscall; we only need to ensure that POR_EL0 is restored from the signal frame after all uaccess calls. This approach is in line with the recent x86/pkeys series [1]. Resetting POR_EL0 early introduces some complications, in that we can no longer read the register directly in preserve_poe_context(). This is addressed by introducing a struct (user_access_state) and helpers to manage any such register impacting user accesses (uaccess and accesses in userspace). Things look like this on signal delivery: 1. Save original POR_EL0 into struct [save_reset_user_access_state()] 2. Set POR_EL0 to "allow all" [save_reset_user_access_state()] 3. Create signal frame 4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()] 5. Finalise signal frame 6. If all operations succeeded: a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()] b. Else reset POR_EL0 to its original value [restore_user_access_state()] If any step fails when setting up the signal frame, the process will be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures that the original POR_EL0 is saved in the signal frame when delivering that SIGSEGV (so that the original value is restored by sigreturn). The return path (sys_rt_sigreturn) doesn't strictly require any change since restore_poe_context() is already called last. However, to avoid uaccess calls being accidentally added after that point, we use the same approach as in the delivery path, i.e. separating uaccess from writing to the register: 1. Read saved POR_EL0 value from the signal frame [restore_poe_context()] 2. Set POR_EL0 to the saved value [restore_user_access_state()] [1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/ Fixes: `9160f7e909` ("arm64: add POE signal support") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-29 17:59:12 +00:00
Jiayuan Chen	a32aee8f0d	bpf: fix filed access without lock The tcp_bpf_recvmsg_parser() function, running in user context, retrieves seq_copied from tcp_sk without holding the socket lock, and stores it in a local variable seq. However, the softirq context can modify tcp_sk->seq_copied concurrently, for example, n tcp_read_sock(). As a result, the seq value is stale when it is assigned back to tcp_sk->copied_seq at the end of tcp_bpf_recvmsg_parser(), leading to incorrect behavior. Due to concurrency, the copied_seq field in tcp_bpf_recvmsg_parser() might be set to an incorrect value (less than the actual copied_seq) at the end of function: 'WRITE_ONCE(tcp->copied_seq, seq)'. This causes the 'offset' to be negative in tcp_read_sock()->tcp_recv_skb() when processing new incoming packets (sk->copied_seq - skb->seq becomes less than 0), and all subsequent packets will be dropped. Signed-off-by: Jiayuan Chen <mrpre@163.com> Link: https://lore.kernel.org/r/20241028065226.35568-1-mrpre@163.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-10-29 10:54:05 -07:00
Arnd Bergmann	fce9642c76	x86/amd_nb: Fix compile-testing without CONFIG_AMD_NB node_to_amd_nb() is defined to NULL in non-AMD configs: drivers/platform/x86/amd/hsmp/plat.c: In function 'init_platform_device': drivers/platform/x86/amd/hsmp/plat.c:165:68: error: dereferencing 'void *' pointer [-Werror] 165 \| sock->root = node_to_amd_nb(i)->root; \| ^~ drivers/platform/x86/amd/hsmp/plat.c:165:68: error: request for member 'root' in something not a structure or union Users of the interface who also allow COMPILE_TEST will cause the above build error so provide an inline stub to fix that. [ bp: Massage commit message. ] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241029092329.3857004-1-arnd@kernel.org	2024-10-29 18:16:05 +01:00
Jason Gunthorpe	f3c3ccc4fe	PCI: Fix pci_enable_acs() support for the ACS quirks There are ACS quirks that hijack the normal ACS processing and deliver to to special quirk code. The enable path needs to call pci_dev_specific_enable_acs() and then pci_dev_specific_acs_enabled() will report the hidden ACS state controlled by the quirk. The recent rework got this out of order and we should try to call pci_dev_specific_enable_acs() regardless of any actual ACS support in the device. As before command line parameters that effect standard PCI ACS don't interact with the quirk versions, including the new config_acs= option. Link: https://lore.kernel.org/r/0-v1-f96b686c625b+124-pci_acs_quirk_fix_jgg@nvidia.com Fixes: `47c8846a49` ("PCI: Extend ACS configurability") Reported-by: Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/all/e89107da-ac99-4d3a-9527-a4df9986e120@kernel.org Closes: https://bugzilla.suse.com/show_bug.cgi?id=1229019 Tested-by: Steffen Dirkwinkel <me@steffen.cc> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2024-10-29 11:35:52 -05:00
Paolo Abeni	bacccddbbc	Merge branch 'intel-wired-lan-driver-fixes-2024-10-21-igb-ice' Jacob Keller says: ==================== Intel Wired LAN Driver Fixes 2024-10-21 (igb, ice) This series includes fixes for the ice and igb drivers. Wander fixes an issue in igb when operating on PREEMPT_RT kernels due to the PREEMPT_RT kernel switching IRQs to be threaded by default. Michal fixes the ice driver to block subfunction port creation when the PF is operating in legacy (non-switchdev) mode. Arkadiusz fixes a crash when loading the ice driver on an E810 LOM which has DPLL enabled. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> ==================== Link: https://patch.msgid.link/20241021-iwl-2024-10-21-iwl-net-fixes-v1-0-a50cb3059f55@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-29 15:24:56 +01:00
Arkadiusz Kubalewski	6e58c33106	ice: fix crash on probe for DPLL enabled E810 LOM The E810 Lan On Motherboard (LOM) design is vendor specific. Intel provides the reference design, but it is up to vendor on the final product design. For some cases, like Linux DPLL support, the static values defined in the driver does not reflect the actual LOM design. Current implementation of dpll pins is causing the crash on probe of the ice driver for such DPLL enabled E810 LOM designs: WARNING: (...) at drivers/dpll/dpll_core.c:495 dpll_pin_get+0x2c4/0x330 ... Call Trace: <TASK> ? __warn+0x83/0x130 ? dpll_pin_get+0x2c4/0x330 ? report_bug+0x1b7/0x1d0 ? handle_bug+0x42/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? dpll_pin_get+0x117/0x330 ? dpll_pin_get+0x2c4/0x330 ? dpll_pin_get+0x117/0x330 ice_dpll_get_pins.isra.0+0x52/0xe0 [ice] ... The number of dpll pins enabled by LOM vendor is greater than expected and defined in the driver for Intel designed NICs, which causes the crash. Prevent the crash and allow generic pin initialization within Linux DPLL subsystem for DPLL enabled E810 LOM designs. Newly designed solution for described issue will be based on "per HW design" pin initialization. It requires pin information dynamically acquired from the firmware and is already in progress, planned for next-tree only. Fixes: `d7999f5ea6` ("ice: implement dpll interface to control cgu") Reviewed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-29 15:24:53 +01:00
Michal Swiatkowski	3e13a8c0a5	ice: block SF port creation in legacy mode There is no support for SF in legacy mode. Reflect it in the code. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Fixes: `eda69d654c` ("ice: add basic devlink subfunctions support") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-29 15:24:53 +01:00
Wander Lairson Costa	338c4d3902	igb: Disable threaded IRQ for igb_msix_other During testing of SR-IOV, Red Hat QE encountered an issue where the ip link up command intermittently fails for the igbvf interfaces when using the PREEMPT_RT variant. Investigation revealed that e1000_write_posted_mbx returns an error due to the lack of an ACK from e1000_poll_for_ack. The underlying issue arises from the fact that IRQs are threaded by default under PREEMPT_RT. While the exact hardware details are not available, it appears that the IRQ handled by igb_msix_other must be processed before e1000_poll_for_ack times out. However, e1000_write_posted_mbx is called with preemption disabled, leading to a scenario where the IRQ is serviced only after the failure of e1000_write_posted_mbx. To resolve this, we set IRQF_NO_THREAD for the affected interrupt, ensuring that the kernel handles it immediately, thereby preventing the aforementioned error. Reproducer: #!/bin/bash # echo 2 > /sys/class/net/ens14f0/device/sriov_numvfs ipaddr_vlan=3 nic_test=ens14f0 vf=${nic_test}v0 while true; do ip link set ${nic_test} mtu 1500 ip link set ${vf} mtu 1500 ip link set $vf up ip link set ${nic_test} vf 0 vlan ${ipaddr_vlan} ip addr add 172.30.${ipaddr_vlan}.1/24 dev ${vf} ip addr add 2021:db8:${ipaddr_vlan}::1/64 dev ${vf} if ! ip link show $vf \| grep 'state UP'; then echo 'Error found' break fi ip link set $vf down done Signed-off-by: Wander Lairson Costa <wander@redhat.com> Fixes: `9d5c824399` ("igb: PCI-Express 82575 Gigabit Ethernet driver") Reported-by: Yuying Ma <yuma@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-29 15:24:53 +01:00
Imre Deak	6a9d2e2988	drm/xe/display: Add missing HPD interrupt enabling during non-d3cold RPM resume Atm the display HPD interrupts that got disabled during runtime suspend, are re-enabled only if d3cold is enabled. Fix things by also re-enabling the interrupts if d3cold is disabled. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-5-imre.deak@intel.com (cherry picked from commit `bbc4a30de0`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-29 07:12:56 -07:00
Imre Deak	dcb6c1d071	drm/xe/display: Separate the d3cold and non-d3cold runtime PM handling For clarity separate the d3cold and non-d3cold runtime PM handling. The only change in behavior is disabling polling later during runtime resume. This shouldn't make a difference, since the poll disabling is handled from a work, which could run at any point wrt. the runtime resume handler. The work will also require a runtime PM reference, syncing it with the resume handler. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-4-imre.deak@intel.com (cherry picked from commit `a4de6beb83`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-29 07:12:56 -07:00
Maarten Lankhorst	25f2ff5383	drm/xe: Remove runtime argument from display s/r functions The previous change ensures that pm_suspend is only called when suspending or resuming. This ensures no further bugs like those in the previous commit. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240905150052.174895-3-maarten.lankhorst@linux.intel.com (cherry picked from commit `f90491d4b6`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-29 07:12:55 -07:00
Alexey Klimov	107a5c853e	ASoC: codecs: wcd937x: relax the AUX PDM watchdog On a system with wcd937x, rxmacro and Qualcomm audio DSP, which is pretty common set of devices on Qualcomm platforms, and due to the order of how DAPM widgets are powered on (they are sorted), there is a small time window when wcd937x chip is online and expects the flow of incoming data but rxmacro is not yet online. When wcd937x is programmed to receive data via AUX port then its AUX PDM watchdog is enabled in wcd937x_codec_enable_aux_pa(). If due to some reasons the rxmacro and soundwire machinery are delayed to start streaming data, then there is a chance for this AUX PDM watchdog to reset the wcd937x codec. Such event is not logged as a message and only wcd937x IRQ counter is increased however there could be a lot of other reasons for that IRQ. There is a similar opportunity for such delay during DAPM widgets power down sequence. If wcd937x codec reset happens on the start of the playback, then there will be no sound and if such reset happens at the end of a playback then it may generate additional clicks and pops noises. On qrb4210 RB2 board without any debugging bits the wcd937x resets are sometimes observed at the end of a playback though not always. With some debugging messages or with some tracing enabled the AUX PDM watchdog resets the wcd937x codec at the start of a playback and there is no sound output at all. In this patch: - TIMEOUT_SEL bit in PDM_WD_CTL2 register is set to increase the watchdog reset delay to 100ms which eliminates the AUX PDM watchdog IRQs on qrb4210 RB2 board completely and decreases the number of unwanted clicks noises; - HOLD_OFF bit postpones triggering such watchdog IRQ till wcd937x codec reset which usually happens at the end of a playback. This allows to actually output some sound in case of debugging. Cc: Adam Skladowski <a39.skl@gmail.com> Cc: Mohammad Rafi Shaik <quic_mohs@quicinc.com> Cc: Prasad Kumpatla <quic_pkumpatl@quicinc.com> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://patch.msgid.link/20241022033132.787416-3-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-29 14:11:00 +00:00
Alexey Klimov	041db4bbe0	ASoC: codecs: wcd937x: add missing LO Switch control The wcd937x supports also AUX input but the control that sets correct soundwire port for this is missing. This control is required for audio playback, for instance, on qrb4210 RB2 board as well as on other SoCs. Reported-by: Adam Skladowski <a39.skl@gmail.com> Reported-by: Prasad Kumpatla <quic_pkumpatl@quicinc.com> Suggested-by: Adam Skladowski <a39.skl@gmail.com> Suggested-by: Prasad Kumpatla <quic_pkumpatl@quicinc.com> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Mohammad Rafi Shaik <quic_mohs@quicinc.com> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://patch.msgid.link/20241022033132.787416-2-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-29 14:10:59 +00:00
Aboorva Devarajan	5db91545ef	sched: Pass correct scheduling policy to __setscheduler_class Commit `98442f0ccd` ("sched: Fix delayed_dequeue vs switched_from_fair()") overlooked that __setscheduler_prio(), now __setscheduler_class() relies on p->policy for task_should_scx(), and moved the call before __setscheduler_params() updates it, causing it to be using the old p->policy value. Resolve this by changing task_should_scx() to take the policy itself instead of a task pointer, such that __sched_setscheduler() can pass in the updated policy. Fixes: `98442f0ccd` ("sched: Fix delayed_dequeue vs switched_from_fair()") Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org>	2024-10-29 13:57:51 +01:00
Dmitry Yashin	cc8475a07c	ASoC: dt-bindings: rockchip,rk3308-codec: add port property Fix DTB warnings when rk3308-codec used with audio-graph-card by documenting port property: codec@ff560000: 'port' does not match any of the regexes: 'pinctrl-[0-9]+' Signed-off-by: Dmitry Yashin <dmt.yashin@gmail.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://patch.msgid.link/20241028213314.476776-2-dmt.yashin@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-29 12:46:31 +00:00
Pierre Gondois	1c10941e34	ACPI: CPPC: Make rmw_lock a raw_spin_lock The following BUG was triggered: ============================= [ BUG: Invalid wait context ] 6.12.0-rc2-XXX #406 Not tainted ----------------------------- kworker/1:1/62 is trying to lock: ffffff8801593030 (&cpc_ptr->rmw_lock){+.+.}-{3:3}, at: cpc_write+0xcc/0x370 other info that might help us debug this: context-{5:5} 2 locks held by kworker/1:1/62: #0: ffffff897ef5ec98 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2c/0x50 #1: ffffff880154e238 (&sg_policy->update_lock){....}-{2:2}, at: sugov_update_shared+0x3c/0x280 stack backtrace: CPU: 1 UID: 0 PID: 62 Comm: kworker/1:1 Not tainted 6.12.0-rc2-g9654bd3e8806 #406 Workqueue: 0x0 (events) Call trace: dump_backtrace+0xa4/0x130 show_stack+0x20/0x38 dump_stack_lvl+0x90/0xd0 dump_stack+0x18/0x28 __lock_acquire+0x480/0x1ad8 lock_acquire+0x114/0x310 _raw_spin_lock+0x50/0x70 cpc_write+0xcc/0x370 cppc_set_perf+0xa0/0x3a8 cppc_cpufreq_fast_switch+0x40/0xc0 cpufreq_driver_fast_switch+0x4c/0x218 sugov_update_shared+0x234/0x280 update_load_avg+0x6ec/0x7b8 dequeue_entities+0x108/0x830 dequeue_task_fair+0x58/0x408 __schedule+0x4f0/0x1070 schedule+0x54/0x130 worker_thread+0xc0/0x2e8 kthread+0x130/0x148 ret_from_fork+0x10/0x20 sugov_update_shared() locks a raw_spinlock while cpc_write() locks a spinlock. To have a correct wait-type order, update rmw_lock to a raw spinlock and ensure that interrupts will be disabled on the CPU holding it. Fixes: `60949b7b80` ("ACPI: CPPC: Fix MASK_VAL() usage") Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Link: https://patch.msgid.link/20241028125657.1271512-1-pierre.gondois@arm.com [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-10-29 12:56:19 +01:00
Furong Xu	66600fac7a	net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data In case the non-paged data of a SKB carries protocol header and protocol payload to be transmitted on a certain platform that the DMA AXI address width is configured to 40-bit/48-bit, or the size of the non-paged data is bigger than TSO_MAX_BUFF_SIZE on a certain platform that the DMA AXI address width is configured to 32-bit, then this SKB requires at least two DMA transmit descriptors to serve it. For example, three descriptors are allocated to split one DMA buffer mapped from one piece of non-paged data: dma_desc[N + 0], dma_desc[N + 1], dma_desc[N + 2]. Then three elements of tx_q->tx_skbuff_dma[] will be allocated to hold extra information to be reused in stmmac_tx_clean(): tx_q->tx_skbuff_dma[N + 0], tx_q->tx_skbuff_dma[N + 1], tx_q->tx_skbuff_dma[N + 2]. Now we focus on tx_q->tx_skbuff_dma[entry].buf, which is the DMA buffer address returned by DMA mapping call. stmmac_tx_clean() will try to unmap the DMA buffer _ONLY_IF_ tx_q->tx_skbuff_dma[entry].buf is a valid buffer address. The expected behavior that saves DMA buffer address of this non-paged data to tx_q->tx_skbuff_dma[entry].buf is: tx_q->tx_skbuff_dma[N + 0].buf = NULL; tx_q->tx_skbuff_dma[N + 1].buf = NULL; tx_q->tx_skbuff_dma[N + 2].buf = dma_map_single(); Unfortunately, the current code misbehaves like this: tx_q->tx_skbuff_dma[N + 0].buf = dma_map_single(); tx_q->tx_skbuff_dma[N + 1].buf = NULL; tx_q->tx_skbuff_dma[N + 2].buf = NULL; On the stmmac_tx_clean() side, when dma_desc[N + 0] is closed by the DMA engine, tx_q->tx_skbuff_dma[N + 0].buf is a valid buffer address obviously, then the DMA buffer will be unmapped immediately. There may be a rare case that the DMA engine does not finish the pending dma_desc[N + 1], dma_desc[N + 2] yet. Now things will go horribly wrong, DMA is going to access a unmapped/unreferenced memory region, corrupted data will be transmited or iommu fault will be triggered :( In contrast, the for-loop that maps SKB fragments behaves perfectly as expected, and that is how the driver should do for both non-paged data and paged frags actually. This patch corrects DMA map/unmap sequences by fixing the array index for tx_q->tx_skbuff_dma[entry].buf when assigning DMA buffer address. Tested and verified on DWXGMAC CORE 3.20a Reported-by: Suraj Jaiswal <quic_jsuraj@quicinc.com> Fixes: `f748be531d` ("stmmac: support new GMAC4") Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241021061023.2162701-1-0x1207@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-29 11:37:10 +01:00
Piotr Zalewski	3726a1970b	bcachefs: Fix NULL ptr dereference in btree_node_iter_and_journal_peek Add NULL check for key returned from bch2_btree_and_journal_iter_peek in btree_node_iter_and_journal_peek to avoid NULL ptr dereference in bch2_bkey_buf_reassemble. When key returned from bch2_btree_and_journal_iter_peek is NULL it means that btree topology needs repair. Print topology error message with position at which node wasn't found, its parent node information and btree_id with level. Return error code returned by bch2_topology_error to ensure that topology error is handled properly by recovery. Reported-by: syzbot+005ef9aa519f30d97657@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=005ef9aa519f30d97657 Fixes: `5222a4607c` ("bcachefs: BTREE_ITER_WITH_JOURNAL") Suggested-by: Alan Huang <mmpgouride@gmail.com> Suggested-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Piotr Zalewski <pZ010001011111@proton.me> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-29 06:34:11 -04:00
Gaosheng Cui	ca959e328b	bcachefs: fix possible null-ptr-deref in __bch2_ec_stripe_head_get() The function ec_new_stripe_head_alloc() returns nullptr if kzalloc() fails. It is crucial to verify its return value before dereferencing it to avoid a potential nullptr dereference. Fixes: `035d72f72c` ("bcachefs: bch2_ec_stripe_head_get() now checks for change in rw devices") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-29 06:34:10 -04:00
Kent Overstreet	778ac324cc	bcachefs: Fix deadlock on -ENOSPC w.r.t. partial open buckets Open buckets on the partial list should not count as allocated when we're trying to allocate from the partial list. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-29 06:34:10 -04:00
Kent Overstreet	e0fafac5c4	bcachefs: Don't filter partial list buckets in open_buckets_to_text() these are an important source of stranded buckets we need to be able to watch Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-29 06:34:10 -04:00
Kent Overstreet	a34eef6dd1	bcachefs: Don't keep tons of cached pointers around We had a bug report where the data update path was creating an extent that failed to validate because it had too many pointers; almost all of them were cached. To fix this, we have: - want_cached_ptr(), a new helper that checks if we even want a cached pointer (is on appropriate target, device is readable). - bch2_extent_set_ptr_cached() now only sets a pointer cached if we want it. - bch2_extent_normalize_by_opts() now ensures that we only have a single cached pointer that we want. While working on this, it was noticed that this doesn't work well with reflinked data and per-file options. Another patch series is coming that plumbs through additional io path options through bch_extent_rebalance, with improved option handling. Reported-by: Reed Riley <reed@riley.engineer> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-29 06:34:10 -04:00
Piotr Zalewski	3fd27e9c57	bcachefs: init freespace inited bits to 0 in bch2_fs_initialize Initialize freespace_initialized bits to 0 in member's flags and update member's cached version for each device in bch2_fs_initialize. It's possible for the bits to be set to 1 before fs is initialized and if call to bch2_trans_mark_dev_sbs (just before bch2_fs_freespace_init) fails bits remain to be 1 which can later indirectly trigger BUG condition in bch2_bucket_alloc_freelist during shutdown. Reported-by: syzbot+2b6a17991a6af64f9489@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2b6a17991a6af64f9489 Fixes: `bbe682c767` ("bcachefs: Ensure devices are always correctly initialized") Suggested-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Piotr Zalewski <pZ010001011111@proton.me> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-29 06:34:10 -04:00
Kent Overstreet	c1fa854acc	bcachefs: Fix unhandled transaction restart in fallocate This used to not matter, but now we're being more strict. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-29 06:34:10 -04:00
Ley Foon Tan	f84ef58e55	net: stmmac: dwmac4: Fix high address display by updating reg_space[] from register values The high address will display as 0 if the driver does not set the reg_space[]. To fix this, read the high address registers and update the reg_space[] accordingly. Fixes: `fbf68229ff` ("net: stmmac: unify registers dumps methods") Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241021054625.1791965-1-leyfoon.tan@starfivetech.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-29 11:29:40 +01:00
Qun-Wei Lin	704573851b	mm: krealloc: Fix MTE false alarm in __do_krealloc This patch addresses an issue introduced by commit `1a83a716ec` ("mm: krealloc: consider spare memory for __GFP_ZERO") which causes MTE (Memory Tagging Extension) to falsely report a slab-out-of-bounds error. The problem occurs when zeroing out spare memory in __do_krealloc. The original code only considered software-based KASAN and did not account for MTE. It does not reset the KASAN tag before calling memset, leading to a mismatch between the pointer tag and the memory tag, resulting in a false positive. Example of the error: ================================================================== swapper/0: BUG: KASAN: slab-out-of-bounds in __memset+0x84/0x188 swapper/0: Write at addr f4ffff8005f0fdf0 by task swapper/0/1 swapper/0: Pointer tag: [f4], memory tag: [fe] swapper/0: swapper/0: CPU: 4 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12. swapper/0: Hardware name: MT6991(ENG) (DT) swapper/0: Call trace: swapper/0: dump_backtrace+0xfc/0x17c swapper/0: show_stack+0x18/0x28 swapper/0: dump_stack_lvl+0x40/0xa0 swapper/0: print_report+0x1b8/0x71c swapper/0: kasan_report+0xec/0x14c swapper/0: __do_kernel_fault+0x60/0x29c swapper/0: do_bad_area+0x30/0xdc swapper/0: do_tag_check_fault+0x20/0x34 swapper/0: do_mem_abort+0x58/0x104 swapper/0: el1_abort+0x3c/0x5c swapper/0: el1h_64_sync_handler+0x80/0xcc swapper/0: el1h_64_sync+0x68/0x6c swapper/0: __memset+0x84/0x188 swapper/0: btf_populate_kfunc_set+0x280/0x3d8 swapper/0: __register_btf_kfunc_id_set+0x43c/0x468 swapper/0: register_btf_kfunc_id_set+0x48/0x60 swapper/0: register_nf_nat_bpf+0x1c/0x40 swapper/0: nf_nat_init+0xc0/0x128 swapper/0: do_one_initcall+0x184/0x464 swapper/0: do_initcall_level+0xdc/0x1b0 swapper/0: do_initcalls+0x70/0xc0 swapper/0: do_basic_setup+0x1c/0x28 swapper/0: kernel_init_freeable+0x144/0x1b8 swapper/0: kernel_init+0x20/0x1a8 swapper/0: ret_from_fork+0x10/0x20 ================================================================== Fixes: `1a83a716ec` ("mm: krealloc: consider spare memory for __GFP_ZERO") Signed-off-by: Qun-Wei Lin <qun-wei.lin@mediatek.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>	2024-10-29 10:40:53 +01:00
Piyush Raj Chouhan	ef5fbdf732	ALSA: hda/realtek: Add subwoofer quirk for Infinix ZERO BOOK 13 Infinix ZERO BOOK 13 has a 2+2 speaker system which isn't probed correctly. This patch adds a quirk with the proper pin connections. Also The mic in this laptop suffers too high gain resulting in mostly fan noise being recorded, This patch Also limit mic boost. HW Probe for device; https://linux-hardware.org/?probe=a2e892c47b Test: All 4 speaker works, Mic has low noise. Signed-off-by: Piyush Raj Chouhan <piyushchouhan1598@gmail.com> Link: https://patch.msgid.link/20241028155516.15552-1-piyuschouhan1598@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-29 10:02:42 +01:00
Barry Song	01626a1823	mm: avoid unconditional one-tick sleep when swapcache_prepare fails Commit `13ddaf26be` ("mm/swap: fix race when skipping swapcache") introduced an unconditional one-tick sleep when `swapcache_prepare()` fails, which has led to reports of UI stuttering on latency-sensitive Android devices. To address this, we can use a waitqueue to wake up tasks that fail `swapcache_prepare()` sooner, instead of always sleeping for a full tick. While tasks may occasionally be woken by an unrelated `do_swap_page()`, this method is preferable to two scenarios: rapid re-entry into page faults, which can cause livelocks, and multiple millisecond sleeps, which visibly degrade user experience. Oven's testing shows that a single waitqueue resolves the UI stuttering issue. If a 'thundering herd' problem becomes apparent later, a waitqueue hash similar to `folio_wait_table[PAGE_WAIT_TABLE_SIZE]` for page bit locks can be introduced. [v-songbaohua@oppo.com: wake_up only when swapcache_wq waitqueue is active] Link: https://lkml.kernel.org/r/20241008130807.40833-1-21cnbao@gmail.com Link: https://lkml.kernel.org/r/20240926211936.75373-1-21cnbao@gmail.com Fixes: `13ddaf26be` ("mm/swap: fix race when skipping swapcache") Signed-off-by: Barry Song <v-songbaohua@oppo.com> Reported-by: Oven Liyang <liyangouwen1@oppo.com> Tested-by: Oven Liyang <liyangouwen1@oppo.com> Cc: Kairui Song <kasong@tencent.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Yu Zhao <yuzhao@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Chris Li <chrisl@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: SeongJae Park <sj@kernel.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:41 -07:00
Jeff Xu	1834300798	mseal: update mseal.rst Pedro Falcato's optimization [1] for checking sealed VMAs, which replaces the can_modify_mm() function with an in-loop check, necessitates an update to the mseal.rst documentation to reflect this change. Furthermore, the document has received offline comments regarding the code sample and suggestions for sentence clarification to enhance reader comprehension. [1] https://lore.kernel.org/linux-mm/20240817-mseal-depessimize-v3-0-d8d2e037df30@gmail.com/ Update doc after in-loop change: mprotect/madvise can have partially updated and munmap is atomic. Fix indentation and clarify some sections to improve readability. Link: https://lkml.kernel.org/r/20241008040942.1478931-2-jeffxu@chromium.org Fixes: `df2a7df9a9` ("mm/munmap: replace can_modify_mm with can_modify_vma") Fixes: `4a2dd02b09` ("mm/mprotect: replace can_modify_mm with can_modify_vma") Fixes: `38075679b5` ("mm/mremap: replace can_modify_mm with can_modify_vma") Fixes: `23c57d1fa2` ("mseal: replace can_modify_mm_madv with a vma variant") Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Cc: Elliott Hughes <enh@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Kees Cook <keescook@chromium.org> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Stephen Röttger <sroettger@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: "Theo de Raadt" <deraadt@openbsd.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:41 -07:00
Kirill A. Shutemov	58a039e679	mm: split critical region in remap_file_pages() and invoke LSMs in between Commit `ea7e2d5e49` ("mm: call the security_mmap_file() LSM hook in remap_file_pages()") fixed a security issue, it added an LSM check when trying to remap file pages, so that LSMs have the opportunity to evaluate such action like for other memory operations such as mmap() and mprotect(). However, that commit called security_mmap_file() inside the mmap_lock lock, while the other calls do it before taking the lock, after commit `8b3ec6814c` ("take security_mmap_file() outside of ->mmap_sem"). This caused lock inversion issue with IMA which was taking the mmap_lock and i_mutex lock in the opposite way when the remap_file_pages() system call was called. Solve the issue by splitting the critical region in remap_file_pages() in two regions: the first takes a read lock of mmap_lock, retrieves the VMA and the file descriptor associated, and calculates the 'prot' and 'flags' variables; the second takes a write lock on mmap_lock, checks that the VMA flags and the VMA file descriptor are the same as the ones obtained in the first critical region (otherwise the system call fails), and calls do_mmap(). In between, after releasing the read lock and before taking the write lock, call security_mmap_file(), and solve the lock inversion issue. Link: https://lkml.kernel.org/r/20241018161415.3845146-1-roberto.sassu@huaweicloud.com Fixes: `ea7e2d5e49` ("mm: call the security_mmap_file() LSM hook in remap_file_pages()") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reported-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/ Tested-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Tested-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Eric Snowberg <eric.snowberg@oracle.com> Cc: James Morris <jmorris@namei.org> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Shu Han <ebpqwerty472123@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:41 -07:00
Edward Liaw	f2330b650e	selftests/mm: fix deadlock for fork after pthread_create with atomic_bool Some additional synchronization is needed on Android ARM64; we see a deadlock with pthread_create when the parent thread races forward before the child has a chance to start doing work. Link: https://lkml.kernel.org/r/20241018171734.2315053-4-edliaw@google.com Fixes: `cff2945827` ("selftests/mm: extend and rename uffd pagemap test") Signed-off-by: Edward Liaw <edliaw@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:41 -07:00
Edward Liaw	3673167a3a	Revert "selftests/mm: replace atomic_bool with pthread_barrier_t" This reverts commit `e61ef21e27`. uffd_poll_thread may be called by other tests that do not initialize the pthread_barrier, so this approach is not correct. This will revert to using atomic_bool instead. Link: https://lkml.kernel.org/r/20241018171734.2315053-3-edliaw@google.com Fixes: `e61ef21e27` ("selftests/mm: replace atomic_bool with pthread_barrier_t") Signed-off-by: Edward Liaw <edliaw@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:41 -07:00
Edward Liaw	5bb1f4c934	Revert "selftests/mm: fix deadlock for fork after pthread_create on ARM" Patch series "selftests/mm: revert pthread_barrier change" On Android arm, pthread_create followed by a fork caused a deadlock in the case where the fork required work to be completed by the created thread. The previous patches incorrectly assumed that the parent would always initialize the pthread_barrier for the child thread. This reverts the change and replaces the fix for wp-fork-with-event with the original use of atomic_bool. This patch (of 3): This reverts commit `e142cc87ac`. fork_event_consumer may be called by other tests that do not initialize the pthread_barrier, so this approach is not correct. The subsequent patch will revert to using atomic_bool instead. Link: https://lkml.kernel.org/r/20241018171734.2315053-1-edliaw@google.com Link: https://lkml.kernel.org/r/20241018171734.2315053-2-edliaw@google.com Fixes: `e142cc87ac` ("fix deadlock for fork after pthread_create on ARM") Signed-off-by: Edward Liaw <edliaw@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:40 -07:00
Lorenzo Stoakes	e8133a7799	tools: testing: add expand-only mode VMA test Add a test to assert that VMG_FLAG_JUST_EXPAND functions as expected - that is, when the VMA iterator is positioned at the previous VMA and no VMAs proceed it, we observe an expansion with all state as expected. Explicitly place a prior VMA that would otherwise fail this test if the mode were not enabled (as it would traverse to the previous-previous VMA). Link: https://lkml.kernel.org/r/d2f88330254a6448092412bf7dfe077a579ab0dc.1729174352.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: kernel test robot <oliver.sang@intel.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:40 -07:00
Lorenzo Stoakes	c4d91e225f	mm/vma: add expand-only VMA merge mode and optimise do_brk_flags() Patch series "introduce VMA merge mode to improve brk() performance". A ~5% performance regression was discovered on the aim9.brk_test.ops_per_sec by the linux kernel test bot [0]. In the past to satisfy brk() performance we duplicated VMA expansion code and special-cased do_brk_flags(). This is however horrid and undoes work to abstract this logic, so in resolving the issue I have endeavoured to avoid this. Investigating further I was able to observe that the use of a vma_iter_next_range() and vma_prev() pair, causing an unnecessary maple tree walk. In addition there is work that we do that is simply unnecessary for brk(). Therefore, add a special VMA merge mode VMG_FLAG_JUST_EXPAND to avoid doing any of this - it assumes the VMA iterator is pointing at the previous VMA and which skips logic that brk() does not require. This mostly eliminates the performance regression reducing it to ~2% which is in the realm of noise. In addition, the will-it-scale test brk2, written to be more representative of real-world brk() usage, shows a modest performance improvement - which gives me confidence that we are not meaningfully regressing real workloads here. This series includes a test asserting that the 'just expand' mode works as expected. With many thanks to Oliver Sang for helping with performance testing of candidate patch sets! [0]:https://lore.kernel.org/linux-mm/202409301043.629bea78-oliver.sang@intel.com This patch (of 2): We know in advance that do_brk_flags() wants only to perform a VMA expansion (if the prior VMA is compatible), and that we assume no mergeable VMA follows it. These are the semantics of this function prior to the recent rewrite of the VMA merging logic, however we are now doing more work than necessary - positioning the VMA iterator at the prior VMA and performing tasks that are not required. Add a new field to the vmg struct to permit merge flags and add a new merge flag VMG_FLAG_JUST_EXPAND which implies this behaviour, and have do_brk_flags() use this. This fixes a reported performance regression in a brk() benchmarking suite. Link: https://lkml.kernel.org/r/cover.1729174352.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/4e65d4395e5841c5acf8470dbcb714016364fd39.1729174352.git.lorenzo.stoakes@oracle.com Fixes: `cacded5e42` ("mm: avoid using vma_merge() for new VMAs") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/linux-mm/202409301043.629bea78-oliver.sang@intel.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:40 -07:00
Gregory Price	b125a0def2	resource,kexec: walk_system_ram_res_rev must retain resource flags walk_system_ram_res_rev() erroneously discards resource flags when passing the information to the callback. This causes systems with IORESOURCE_SYSRAM_DRIVER_MANAGED memory to have these resources selected during kexec to store kexec buffers if that memory happens to be at placed above normal system ram. This leads to undefined behavior after reboot. If the kexec buffer is never touched, nothing happens. If the kexec buffer is touched, it could lead to a crash (like below) or undefined behavior. Tested on a system with CXL memory expanders with driver managed memory, TPM enabled, and CONFIG_IMA_KEXEC=y. Adding printk's showed the flags were being discarded and as a result the check for IORESOURCE_SYSRAM_DRIVER_MANAGED passes. find_next_iomem_res: name(System RAM (kmem)) start(10000000000) end(1034fffffff) flags(83000200) locate_mem_hole_top_down: start(10000000000) end(1034fffffff) flags(0) [.] BUG: unable to handle page fault for address: ffff89834ffff000 [.] #PF: supervisor read access in kernel mode [.] #PF: error_code(0x0000) - not-present page [.] PGD c04c8bf067 P4D c04c8bf067 PUD c04c8be067 PMD 0 [.] Oops: 0000 [#1] SMP [.] RIP: 0010:ima_restore_measurement_list+0x95/0x4b0 [.] RSP: 0018:ffffc900000d3a80 EFLAGS: 00010286 [.] RAX: 0000000000001000 RBX: 0000000000000000 RCX: ffff89834ffff000 [.] RDX: 0000000000000018 RSI: ffff89834ffff000 RDI: ffff89834ffff018 [.] RBP: ffffc900000d3ba0 R08: 0000000000000020 R09: ffff888132b8a900 [.] R10: 4000000000000000 R11: 000000003a616d69 R12: 0000000000000000 [.] R13: ffffffff8404ac28 R14: 0000000000000000 R15: ffff89834ffff000 [.] FS: 0000000000000000(0000) GS:ffff893d44640000(0000) knlGS:0000000000000000 [.] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [.] ata5: SATA link down (SStatus 0 SControl 300) [.] CR2: ffff89834ffff000 CR3: 000001034d00f001 CR4: 0000000000770ef0 [.] PKRU: 55555554 [.] Call Trace: [.] <TASK> [.] ? __die+0x78/0xc0 [.] ? page_fault_oops+0x2a8/0x3a0 [.] ? exc_page_fault+0x84/0x130 [.] ? asm_exc_page_fault+0x22/0x30 [.] ? ima_restore_measurement_list+0x95/0x4b0 [.] ? template_desc_init_fields+0x317/0x410 [.] ? crypto_alloc_tfm_node+0x9c/0xc0 [.] ? init_ima_lsm+0x30/0x30 [.] ima_load_kexec_buffer+0x72/0xa0 [.] ima_init+0x44/0xa0 [.] __initstub__kmod_ima__373_1201_init_ima7+0x1e/0xb0 [.] ? init_ima_lsm+0x30/0x30 [.] do_one_initcall+0xad/0x200 [.] ? idr_alloc_cyclic+0xaa/0x110 [.] ? new_slab+0x12c/0x420 [.] ? new_slab+0x12c/0x420 [.] ? number+0x12a/0x430 [.] ? sysvec_apic_timer_interrupt+0xa/0x80 [.] ? asm_sysvec_apic_timer_interrupt+0x16/0x20 [.] ? parse_args+0xd4/0x380 [.] ? parse_args+0x14b/0x380 [.] kernel_init_freeable+0x1c1/0x2b0 [.] ? rest_init+0xb0/0xb0 [.] kernel_init+0x16/0x1a0 [.] ret_from_fork+0x2f/0x40 [.] ? rest_init+0xb0/0xb0 [.] ret_from_fork_asm+0x11/0x20 [.] </TASK> Link: https://lore.kernel.org/all/20231114091658.228030-1-bhe@redhat.com/ Link: https://lkml.kernel.org/r/20241017190347.5578-1-gourry@gourry.net Fixes: `7acf164b25` ("resource: add walk_system_ram_res_rev()") Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:40 -07:00
Ryusuke Konishi	41e192ad27	nilfs2: fix kernel bug due to missing clearing of checked flag Syzbot reported that in directory operations after nilfs2 detects filesystem corruption and degrades to read-only, __block_write_begin_int(), which is called to prepare block writes, may fail the BUG_ON check for accesses exceeding the folio/page size, triggering a kernel bug. This was found to be because the "checked" flag of a page/folio was not cleared when it was discarded by nilfs2's own routine, which causes the sanity check of directory entries to be skipped when the directory page/folio is reloaded. So, fix that. This was necessary when the use of nilfs2's own page discard routine was applied to more than just metadata files. Link: https://lkml.kernel.org/r/20241017193359.5051-1-konishi.ryusuke@gmail.com Fixes: `8c26c4e269` ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+d6ca2daf692c7a82f959@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d6ca2daf692c7a82f959 Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:40 -07:00
Nobuhiro Iwamatsu	d95fb348f0	mm: numa_clear_kernel_node_hotplug: Add NUMA_NO_NODE check for node id The acquired memory blocks for reserved may include blocks outside of memory management. In this case, the nid variable is set to NUMA_NO_NODE (-1), so an error occurs in node_set(). This adds a check using numa_valid_node() to numa_clear_kernel_node_hotplug() that skips node_set() when nid is set to NUMA_NO_NODE. Link: https://lkml.kernel.org/r/1729070461-13576-1-git-send-email-nobuhiro1.iwamatsu@toshiba.co.jp Fixes: `8748270821` ("mm: introduce numa_memblks") Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Suggested-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:40 -07:00
Edward Adam Davis	bc0a2f3a73	ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow Syzbot reported a kernel BUG in ocfs2_truncate_inline. There are two reasons for this: first, the parameter value passed is greater than ocfs2_max_inline_data_with_xattr, second, the start and end parameters of ocfs2_truncate_inline are "unsigned int". So, we need to add a sanity check for byte_start and byte_len right before ocfs2_truncate_inline() in ocfs2_remove_inode_range(), if they are greater than ocfs2_max_inline_data_with_xattr return -EINVAL. Link: https://lkml.kernel.org/r/tencent_D48DB5122ADDAEDDD11918CFB68D93258C07@qq.com Fixes: `1afc32b952` ("ocfs2: Write support for inline data") Signed-off-by: Edward Adam Davis <eadavis@qq.com> Reported-by: syzbot+81092778aac03460d6b7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=81092778aac03460d6b7 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:40 -07:00
Jeongjun Park	d949d1d14f	mm: shmem: fix data-race in shmem_getattr() I got the following KCSAN report during syzbot testing: ================================================================== BUG: KCSAN: data-race in generic_fillattr / inode_set_ctime_current write to 0xffff888102eb3260 of 4 bytes by task 6565 on cpu 1: inode_set_ctime_to_ts include/linux/fs.h:1638 [inline] inode_set_ctime_current+0x169/0x1d0 fs/inode.c:2626 shmem_mknod+0x117/0x180 mm/shmem.c:3443 shmem_create+0x34/0x40 mm/shmem.c:3497 lookup_open fs/namei.c:3578 [inline] open_last_lookups fs/namei.c:3647 [inline] path_openat+0xdbc/0x1f00 fs/namei.c:3883 do_filp_open+0xf7/0x200 fs/namei.c:3913 do_sys_openat2+0xab/0x120 fs/open.c:1416 do_sys_open fs/open.c:1431 [inline] __do_sys_openat fs/open.c:1447 [inline] __se_sys_openat fs/open.c:1442 [inline] __x64_sys_openat+0xf3/0x120 fs/open.c:1442 x64_sys_call+0x1025/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:258 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e read to 0xffff888102eb3260 of 4 bytes by task 3498 on cpu 0: inode_get_ctime_nsec include/linux/fs.h:1623 [inline] inode_get_ctime include/linux/fs.h:1629 [inline] generic_fillattr+0x1dd/0x2f0 fs/stat.c:62 shmem_getattr+0x17b/0x200 mm/shmem.c:1157 vfs_getattr_nosec fs/stat.c:166 [inline] vfs_getattr+0x19b/0x1e0 fs/stat.c:207 vfs_statx_path fs/stat.c:251 [inline] vfs_statx+0x134/0x2f0 fs/stat.c:315 vfs_fstatat+0xec/0x110 fs/stat.c:341 __do_sys_newfstatat fs/stat.c:505 [inline] __se_sys_newfstatat+0x58/0x260 fs/stat.c:499 __x64_sys_newfstatat+0x55/0x70 fs/stat.c:499 x64_sys_call+0x141f/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:263 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e value changed: 0x2755ae53 -> 0x27ee44d3 Reported by Kernel Concurrency Sanitizer on: CPU: 0 UID: 0 PID: 3498 Comm: udevd Not tainted 6.11.0-rc6-syzkaller-00326-gd1f2d51b711a-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 ================================================================== When calling generic_fillattr(), if you don't hold read lock, data-race will occur in inode member variables, which can cause unexpected behavior. Since there is no special protection when shmem_getattr() calls generic_fillattr(), data-race occurs by functions such as shmem_unlink() or shmem_mknod(). This can cause unexpected results, so commenting it out is not enough. Therefore, when calling generic_fillattr() from shmem_getattr(), it is appropriate to protect the inode using inode_lock_shared() and inode_unlock_shared() to prevent data-race. Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@gmail.com Fixes: `44a30220bc` ("shmem: recalculate file inode when fstat") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Reported-by: syzbot <syzkaller@googlegroup.com> Cc: Hugh Dickins <hughd@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:39 -07:00
Jann Horn	14611508cb	mm: mark mas allocation in vms_abort_munmap_vmas as __GFP_NOFAIL vms_abort_munmap_vmas() is a recovery path where, on entry, some VMAs have already been torn down halfway (in a way we can't undo) but are still present in the maple tree. At this point, we must remove the VMAs from the VMA tree, otherwise we get UAF. Because removing VMA tree nodes can require memory allocation, the existing code has an error path which tries to handle this by reattaching the VMAs; but that can't be done safely. A nicer way to fix it would probably be to preallocate enough maple tree nodes for the removal before the point of no return, or something like that; but for now, fix it the easy and kinda ugly way, by marking this allocation __GFP_NOFAIL. Link: https://lkml.kernel.org/r/20241016-fix-munmap-abort-v1-1-601c94b2240d@google.com Fixes: `4f87153e82` ("mm: change failure of MAP_FIXED to restoring the gap on failure") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:39 -07:00
Sabyrzhan Tasbolatov	1db272864f	x86/traps: move kmsan check after instrumentation_begin During x86_64 kernel build with CONFIG_KMSAN, the objtool warns following: AR built-in.a AR vmlinux.a LD vmlinux.o vmlinux.o: warning: objtool: handle_bug+0x4: call to kmsan_unpoison_entry_regs() leaves .noinstr.text section OBJCOPY modules.builtin.modinfo GEN modules.builtin MODPOST Module.symvers CC .vmlinux.export.o Moving kmsan_unpoison_entry_regs() _after_ instrumentation_begin() fixes the warning. There is decode_bug(regs->ip, &imm) is left before KMSAN unpoisoining, but it has the return condition and if we include it after instrumentation_begin() it results the warning "return with instrumentation enabled", hence, I'm concerned that regs will not be KMSAN unpoisoned if `ud_type == BUG_NONE` is true. Link: https://lkml.kernel.org/r/20241016152407.3149001-1-snovitoll@gmail.com Fixes: `ba54d194f8` ("x86/traps: avoid KMSAN bugs originating from handle_bug()") Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:39 -07:00
Huang Ying	b7c5f9a1fb	resource: remove dependency on SPARSEMEM from GET_FREE_REGION We want to use the functions (get_free_mem_region()) configured via GET_FREE_REGION in resource kunit tests. However, GET_FREE_REGION depends on SPARSEMEM now. This makes resource kunit tests cannot be built on some architectures lacking SPARSEMEM, or causes config warning as follows, WARNING: unmet direct dependencies detected for GET_FREE_REGION Depends on [n]: SPARSEMEM [=n] Selected by [y]: - RESOURCE_KUNIT_TEST [=y] && RUNTIME_TESTING_MENU [=y] && KUNIT [=y] When get_free_mem_region() was introduced the only consumers were those looking to pass the address range to memremap_pages(). That address range needed to be mindful of the maximum addressable platform physical address which at the time only SPARSMEM defined via MAX_PHYSMEM_BITS. Given that memremap_pages() also depended on SPARSEMEM via ZONE_DEVICE, it was easier to just depend on that definition than invent a general MAX_PHYSMEM_BITS concept outside of SPARSEMEM. Turns out that decision was buggy and did not account for KASAN consumption of physical address space. That problem was resolved recently with commit `ea72ce5da2` ("x86/kaslr: Expose and use the end of the physical memory address space"), and GET_FREE_REGION dropped its MAX_PHYSMEM_BITS dependency. Then commit `99185c10d5` ("resource, kunit: add test case for region_intersects()"), went ahead and fixed up the only remaining dependency on SPARSEMEM which was usage of the PA_SECTION_SHIFT macro for setting the default alignment. A PAGE_SIZE fallback is fine in the SPARSEMEM=n case. With those build dependencies gone GET_FREE_REGION no longer depends on SPARSEMEM. So, the patch removes dependency on SPARSEMEM from GET_FREE_REGION to fix the build issues. Link: https://lkml.kernel.org/r/20241016014730.339369-1-ying.huang@intel.com Link: https://lore.kernel.org/lkml/20240922225041.603186-1-linux@roeck-us.net/ Link: https://lkml.kernel.org/r/20241015051554.294734-1-ying.huang@intel.com Fixes: `99185c10d5` ("resource, kunit: add test case for region_intersects()") Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Co-developed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> # build Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:39 -07:00
Liam R. Howlett	79f3d123ca	mm/mmap: fix race in mmap_region() with ftruncate() Avoiding the zeroing of the vma tree in mmap_region() introduced a race with truncate in the page table walk. To avoid any races, create a hole in the rmap during the operation by clearing the pagetable entries earlier under the mmap write lock and (critically) before the new vma is installed into the vma tree. The result is that the old vma(s) are left in the vma tree, but free_pgtables() removes them from the rmap and clears the ptes while holding the necessary locks. This change extends the fix required for hugetblfs and the call_mmap() function by moving the cleanup higher in the function and running it unconditionally. Link: https://lkml.kernel.org/r/20241016013455.2241533-1-Liam.Howlett@oracle.com Fixes: `f8d112a4e6` ("mm/mmap: avoid zeroing vma tree in mmap_region()") Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reported-by: Jann Horn <jannh@google.com> Closes: https://lore.kernel.org/all/CAG48ez0ZpGzxi=-5O_uGQ0xKXOmbjeQ0LjZsRJ1Qtf2X5eOr1w@mail.gmail.com/ Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:39 -07:00
Matt Fleming	281dd25c1a	mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves Under memory pressure it's possible for GFP_ATOMIC order-0 allocations to fail even though free pages are available in the highatomic reserves. GFP_ATOMIC allocations cannot trigger unreserve_highatomic_pageblock() since it's only run from reclaim. Given that such allocations will pass the watermarks in __zone_watermark_unusable_free(), it makes sense to fallback to highatomic reserves the same way that ALLOC_OOM can. This fixes order-0 page allocation failures observed on Cloudflare's fleet when handling network packets: kswapd1: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0-7 CPU: 10 PID: 696 Comm: kswapd1 Kdump: loaded Tainted: G O 6.6.43-CUSTOM #1 Hardware name: MACHINE Call Trace: <IRQ> dump_stack_lvl+0x3c/0x50 warn_alloc+0x13a/0x1c0 __alloc_pages_slowpath.constprop.0+0xc9d/0xd10 __alloc_pages+0x327/0x340 __napi_alloc_skb+0x16d/0x1f0 bnxt_rx_page_skb+0x96/0x1b0 [bnxt_en] bnxt_rx_pkt+0x201/0x15e0 [bnxt_en] __bnxt_poll_work+0x156/0x2b0 [bnxt_en] bnxt_poll+0xd9/0x1c0 [bnxt_en] __napi_poll+0x2b/0x1b0 bpf_trampoline_6442524138+0x7d/0x1000 __napi_poll+0x5/0x1b0 net_rx_action+0x342/0x740 handle_softirqs+0xcf/0x2b0 irq_exit_rcu+0x6c/0x90 sysvec_apic_timer_interrupt+0x72/0x90 </IRQ> [mfleming@cloudflare.com: update comment] Link: https://lkml.kernel.org/r/20241015125158.3597702-1-matt@readmodwrite.com Link: https://lkml.kernel.org/r/20241011120737.3300370-1-matt@readmodwrite.com Link: https://lore.kernel.org/all/CAGis_TWzSu=P7QJmjD58WWiu3zjMTVKSzdOwWE8ORaGytzWJwQ@mail.gmail.com/ Fixes: `1d91df85f3` ("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs") Signed-off-by: Matt Fleming <mfleming@cloudflare.com> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:39 -07:00
Lorenzo Stoakes	985da552a9	fork: only invoke khugepaged, ksm hooks if no error There is no reason to invoke these hooks early against an mm that is in an incomplete state. The change in commit `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") makes this more pertinent as we may be in a state where entries in the maple tree are not yet consistent. Their placement early in dup_mmap() only appears to have been meaningful for early error checking, and since functionally it'd require a very small allocation to fail (in practice 'too small to fail') that'd only occur in the most dire circumstances, meaning the fork would fail or be OOM'd in any case. Since both khugepaged and KSM tracking are there to provide optimisations to memory performance rather than critical functionality, it doesn't really matter all that much if, under such dire memory pressure, we fail to register an mm with these. As a result, we follow the example of commit `d2081b2bf8` ("mm: khugepaged: make khugepaged_enter() void function") and make ksm_fork() a void function also. We only expose the mm to these functions once we are done with them and only if no error occurred in the fork operation. Link: https://lkml.kernel.org/r/e0cb8b840c9d1d5a6e84d4f8eff5f3f2022aa10c.1729014377.git.lorenzo.stoakes@oracle.com Fixes: `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Jann Horn <jannh@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Jann Horn <jannh@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:39 -07:00
Lorenzo Stoakes	f64e67e5d3	fork: do not invoke uffd on fork if error occurs Patch series "fork: do not expose incomplete mm on fork". During fork we may place the virtual memory address space into an inconsistent state before the fork operation is complete. In addition, we may encounter an error during the fork operation that indicates that the virtual memory address space is invalidated. As a result, we should not be exposing it in any way to external machinery that might interact with the mm or VMAs, machinery that is not designed to deal with incomplete state. We specifically update the fork logic to defer khugepaged and ksm to the end of the operation and only to be invoked if no error arose, and disallow uffd from observing fork events should an error have occurred. This patch (of 2): Currently on fork we expose the virtual address space of a process to userland unconditionally if uffd is registered in VMAs, regardless of whether an error arose in the fork. This is performed in dup_userfaultfd_complete() which is invoked unconditionally, and performs two duties - invoking registered handlers for the UFFD_EVENT_FORK event via dup_fctx(), and clearing down userfaultfd_fork_ctx objects established in dup_userfaultfd(). This is problematic, because the virtual address space may not yet be correctly initialised if an error arose. The change in commit `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") makes this more pertinent as we may be in a state where entries in the maple tree are not yet consistent. We address this by, on fork error, ensuring that we roll back state that we would otherwise expect to clean up through the event being handled by userland and perform the memory freeing duty otherwise performed by dup_userfaultfd_complete(). We do this by implementing a new function, dup_userfaultfd_fail(), which performs the same loop, only decrementing reference counts. Note that we perform mmgrab() on the parent and child mm's, however userfaultfd_ctx_put() will mmdrop() this once the reference count drops to zero, so we will avoid memory leaks correctly here. Link: https://lkml.kernel.org/r/cover.1729014377.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/d3691d58bb58712b6fb3df2be441d175bd3cdf07.1729014377.git.lorenzo.stoakes@oracle.com Fixes: `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Jann Horn <jannh@google.com> Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:38 -07:00
David Hildenbrand	7c18d48110	mm/pagewalk: fix usage of pmd_leaf()/pud_leaf() without present check pmd_leaf()/pud_leaf() only implies a pmd_present()/pud_present() check on some architectures. We really should check for pmd_present()/pud_present() first. This should explain the report we got on ppc64 (which has CONFIG_PGTABLE_HAS_HUGE_LEAVES set in the config) that triggered: VM_WARN_ON_ONCE(pmd_leaf(pmdp_get_lockless(pmdp))); Likely we had a PMD migration entry for which pmd_leaf() did not trigger. We raced with restoring the PMD migration entry, and suddenly saw a pmd_leaf(). In this case, pte_offset_map_lock() saved us from more trouble, because it rechecks the PMD value, but we would not have processed the migration entry -- which is not too bad because the only user of FW_MIGRATION is KSM for unsharing, and KSM only applies to small folios. Further, we shouldn't re-read the PMD/PUD value for our warning, the primary purpose of the VM_WARN_ON_ONCE() is to find spurious use of pmd_leaf()/pud_leaf() without CONFIG_PGTABLE_HAS_HUGE_LEAVES. As a side note, we are currently not implementing FW_MIGRATION support for PUD migration entries, which likely should exist due to hugetlb. Add a TODO so this won't fall through the cracks if more FW_MIGRATION users get added. Was able to write a quick reproducer and verify that the issue no longer triggers with this fix. https://gitlab.com/davidhildenbrand/scratchspace/-/blob/main/reproducers/move-pages-pmd-leaf.c Without this fix after a couple of seconds in a VM with 2 NUMA nodes: [ 54.333753] ------------[ cut here ]------------ [ 54.334901] WARNING: CPU: 20 PID: 1704 at mm/pagewalk.c:815 folio_walk_start+0x48f/0x6e0 [ 54.336455] Modules linked in: ... [ 54.345009] CPU: 20 UID: 0 PID: 1704 Comm: move-pages-pmd- Not tainted 6.12.0-rc2+ #81 [ 54.346529] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 [ 54.348191] RIP: 0010:folio_walk_start+0x48f/0x6e0 [ 54.349134] Code: b5 ad 48 8d 35 00 00 00 00 e8 6d 59 d7 ff e8 08 74 da ff e9 9c fe ff ff 4c 8b 7c 24 08 4c 89 ff e8 26 2b be 00 e9 8a fe ff ff <0f> 0b e9 ec fe ff ff f7 c2 ff 0f 00 00 0f 85 81 fe ff ff 48 8b 02 [ 54.352660] RSP: 0018:ffffb7e4c430bc78 EFLAGS: 00010282 [ 54.353679] RAX: 80000002a3e008e7 RBX: ffff9946039aa580 RCX: ffff994380000000 [ 54.355056] RDX: ffff994606aec000 RSI: 00007f004b000000 RDI: 0000000000000000 [ 54.356440] RBP: 00007f004b000000 R08: 0000000000000591 R09: 0000000000000001 [ 54.357820] R10: 0000000000000200 R11: 0000000000000001 R12: ffffb7e4c430bd10 [ 54.359198] R13: ffff994606aec2c0 R14: 0000000000000002 R15: ffff994604a89b00 [ 54.360564] FS: 00007f004ae006c0(0000) GS:ffff9947f7400000(0000) knlGS:0000000000000000 [ 54.362111] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 54.363242] CR2: 00007f004adffe58 CR3: 0000000281e12005 CR4: 0000000000770ef0 [ 54.364615] PKRU: 55555554 [ 54.365153] Call Trace: [ 54.365646] <TASK> [ 54.366073] ? __warn.cold+0xb7/0x14d [ 54.366796] ? folio_walk_start+0x48f/0x6e0 [ 54.367628] ? report_bug+0xff/0x140 [ 54.368324] ? handle_bug+0x58/0x90 [ 54.369019] ? exc_invalid_op+0x17/0x70 [ 54.369771] ? asm_exc_invalid_op+0x1a/0x20 [ 54.370606] ? folio_walk_start+0x48f/0x6e0 [ 54.371415] ? folio_walk_start+0x9e/0x6e0 [ 54.372227] do_pages_move+0x1c5/0x680 [ 54.372972] kernel_move_pages+0x1a1/0x2b0 [ 54.373804] __x64_sys_move_pages+0x25/0x30 Link: https://lkml.kernel.org/r/20241015111236.1290921-1-david@redhat.com Fixes: `aa39ca6940` ("mm/pagewalk: introduce folio_walk_start() + folio_walk_end()") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: syzbot+7d917f67c05066cec295@syzkaller.appspotmail.com Closes: https://lkml.kernel.org/r/670d3248.050a0220.3e960.0064.GAE@google.com Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-28 21:40:38 -07:00
Amit Sunil Dhamne	afb92ad873	usb: typec: tcpm: restrict SNK_WAIT_CAPABILITIES_TIMEOUT transitions to non self-powered devices PD3.1 spec ("8.3.3.3.3 PE_SNK_Wait_for_Capabilities State") mandates that the policy engine perform a hard reset when SinkWaitCapTimer expires. Instead the code explicitly does a GET_SOURCE_CAP when the timer expires as part of SNK_WAIT_CAPABILITIES_TIMEOUT. Due to this the following compliance test failures are reported by the compliance tester (added excerpts from the PD Test Spec): * COMMON.PROC.PD.2#1: The Tester receives a Get_Source_Cap Message from the UUT. This message is valid except the following conditions: [COMMON.PROC.PD.2#1] a. The check fails if the UUT sends this message before the Tester has established an Explicit Contract ... * TEST.PD.PROT.SNK.4: ... 4. The check fails if the UUT does not send a Hard Reset between tTypeCSinkWaitCap min and max. [TEST.PD.PROT.SNK.4#1] The delay is between the VBUS present vSafe5V min and the time of the first bit of Preamble of the Hard Reset sent by the UUT. For the purpose of interoperability, restrict the quirk introduced in https://lore.kernel.org/all/20240523171806.223727-1-sebastian.reichel@collabora.com/ to only non self-powered devices as battery powered devices will not have the issue mentioned in that commit. Cc: stable@vger.kernel.org Fixes: `122968f8dd` ("usb: typec: tcpm: avoid resets for missing source capability messages") Reported-by: Badhri Jagan Sridharan <badhri@google.com> Closes: https://lore.kernel.org/all/CAPTae5LAwsVugb0dxuKLHFqncjeZeJ785nkY4Jfd+M-tCjHSnQ@mail.gmail.com/ Signed-off-by: Amit Sunil Dhamne <amitsd@google.com> Reviewed-by: Badhri Jagan Sridharan <badhri@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Link: https://lore.kernel.org/r/20241024022233.3276995-1-amitsd@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:38:00 +01:00
Zijun Hu	fdce49b5da	usb: phy: Fix API devm_usb_put_phy() can not release the phy For devm_usb_put_phy(), its comment says it needs to invoke usb_put_phy() to release the phy, but it does not do that actually, so it can not fully undo what the API devm_usb_get_phy() does, that is wrong, fixed by using devres_release() instead of devres_destroy() within the API. Fixes: `cedf860237` ("usb: phy: move bulk of otg/otg.c to phy/phy.c") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241020-usb_phy_fix-v1-1-7f79243b8e1e@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:37:48 +01:00
Li Zhijian	dc1308bee1	selftests/watchdog-test: Fix system accidentally reset after watchdog-test When running watchdog-test with 'make run_tests', the watchdog-test will be terminated by a timeout signal(SIGTERM) due to the test timemout. And then, a system reboot would happen due to watchdog not stop. see the dmesg as below: ``` [ 1367.185172] watchdog: watchdog0: watchdog did not stop! ``` Fix it by registering more signals(including SIGTERM) in watchdog-test, where its signal handler will stop the watchdog. After that # timeout 1 ./watchdog-test Watchdog Ticking Away! . Stopping watchdog ticks... Link: https://lore.kernel.org/all/20241029031324.482800-1-lizhijian@fujitsu.com/ Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-10-28 21:34:43 -06:00
Javier Carrasco	1ab0b9ae58	usb: typec: use cleanup facility for 'altmodes_node' Use the __free() macro for 'altmodes_node' to automatically release the node when it goes out of scope, removing the need for explicit calls to fwnode_handle_put(). Suggested-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20241021-typec-class-fwnode_handle_put-v2-2-3281225d3d27@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:28:51 +01:00
Javier Carrasco	9581acb91e	usb: typec: fix unreleased fwnode_handle in typec_port_register_altmodes() The 'altmodes_node' fwnode_handle is never released after it is no longer required, which leaks the resource. Add the required call to fwnode_handle_put() when 'altmodes_node' is no longer required. Cc: stable@vger.kernel.org Fixes: `7b458a4c5d` ("usb: typec: Add typec_port_register_altmodes()") Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://lore.kernel.org/r/20241021-typec-class-fwnode_handle_put-v2-1-3281225d3d27@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:28:51 +01:00
Javier Carrasco	b8423a2f58	usb: typec: qcom-pmic-typec: fix missing fwnode removal in error path If drm_dp_hpd_bridge_register() fails, the probe function returns without removing the fwnode via fwnode_handle_put(), leaking the resource. Jump to fwnode_remove if drm_dp_hpd_bridge_register() fails to remove the fwnode acquired with device_get_named_child_node(). Cc: stable@vger.kernel.org Fixes: `7d9f1b72b2` ("usb: typec: qcom-pmic-typec: switch to DRM_AUX_HPD_BRIDGE") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Acked-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20241020-qcom_pmic_typec-fwnode_remove-v2-2-7054f3d2e215@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:28:33 +01:00
Javier Carrasco	7f02b8a5b6	usb: typec: qcom-pmic-typec: use fwnode_handle_put() to release fwnodes The right function to release a fwnode acquired via device_get_named_child_node() is fwnode_handle_put(), and not fwnode_remove_software_node(), as no software node is being handled. Replace the calls to fwnode_remove_software_node() with fwnode_handle_put() in qcom_pmic_typec_probe() and qcom_pmic_typec_remove(). Cc: stable@vger.kernel.org Fixes: `a4422ff221` ("usb: typec: qcom: Add Qualcomm PMIC Type-C driver") Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Acked-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241020-qcom_pmic_typec-fwnode_remove-v2-1-7054f3d2e215@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:28:33 +01:00
Mathias Nyman	623dae3e70	usb: acpi: fix boot hang due to early incorrect 'tunneled' USB3 device links Fix a boot hang issue triggered when a USB3 device is incorrectly assumed to be tunneled over USB4, thus attempting to create a device link between the USB3 "consumer" device and the USB4 "supplier" Host Interface before the USB4 side is properly bound to a driver. This could happen if xhci isn't capable of detecting tunneled devices, but ACPI tables contain all info needed to assume device is tunneled. i.e. udev->tunnel_mode == USB_LINK_UNKNOWN. It turns out that even for actual tunneled USB3 devices it can't be assumed that the thunderbolt driver providing the tunnel is loaded before the tunneled USB3 device is created. The tunnel can be created by BIOS and remain in use by thunderbolt/USB4 host driver once it loads. Solve this by making the device link "stateless", which doesn't create a driver presence order dependency between the supplier and consumer drivers. It still guarantees correct suspend/resume and shutdown ordering. cc: Mario Limonciello <mario.limonciello@amd.com> Fixes: `f1bfb4a6fe` ("usb: acpi: add device link between tunneled USB3 device and USB4 Host Interface") Tested-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241024131355.3836538-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:26:26 +01:00
Stefan Wahren	f3b311325f	Revert "usb: dwc2: Skip clock gating on Broadcom SoCs" The commit `d483f034f0` ("usb: dwc2: Skip clock gating on Broadcom SoCs") introduced a regression on Raspberry Pi 3 B Plus, which prevents enumeration of the onboard Microchip LAN7800 in case no external USB device is connected during boot. Fixes: `d483f034f0` ("usb: dwc2: Skip clock gating on Broadcom SoCs") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://lore.kernel.org/r/20241025103621.4780-2-wahrenst@gmx.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:24:54 +01:00
Faisal Hassan	075919f6df	xhci: Fix Link TRB DMA in command ring stopped completion event During the aborting of a command, the software receives a command completion event for the command ring stopped, with the TRB pointing to the next TRB after the aborted command. If the command we abort is located just before the Link TRB in the command ring, then during the 'command ring stopped' completion event, the xHC gives the Link TRB in the event's cmd DMA, which causes a mismatch in handling command completion event. To address this situation, move the 'command ring stopped' completion event check slightly earlier, since the specific command it stopped on isn't of significant concern. Fixes: `7f84eef0da` ("USB: xhci: No-op command queueing and irq handler.") Cc: stable@vger.kernel.org Signed-off-by: Faisal Hassan <quic_faisalh@quicinc.com> Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241022155631.1185-1-quic_faisalh@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:23:59 +01:00
Basavaraj Natikar	31004740e4	xhci: Use pm_runtime_get to prevent RPM on unsupported systems Use pm_runtime_put in the remove function and pm_runtime_get to disable RPM on platforms that don't support runtime D3, as re-enabling it through sysfs auto power control may cause the controller to malfunction. This can lead to issues such as hotplug devices not being detected due to failed interrupt generation. Fixes: `a5d6264b63` ("xhci: Enable RPM on controllers that support low-power states") Cc: stable <stable@kernel.org> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241024133718.723846-1-Basavaraj.Natikar@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:23:36 +01:00
Zongmin Zhou	e7cd4b811c	usbip: tools: Fix detach_port() invalid port error path The detach_port() doesn't return error when detach is attempted on an invalid port. Fixes: `40ecdeb1a1` ("usbip: usbip_detach: fix to check for invalid ports") Cc: stable@vger.kernel.org Reviewed-by: Hongren Zheng <i@zenithal.me> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Zongmin Zhou <zhouzongmin@kylinos.cn> Link: https://lore.kernel.org/r/20241024022700.1236660-1-min_halo@163.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:23:23 +01:00
Alessandro Zanni	722d89c34c	selftests/intel_pstate: check if cpupower is installed Running "make kselftest TARGETS=intel_pstate" results in the following errors: - ./run.sh: line 89: cpupower: command not found - ./run.sh: line 91: cpupower: command not found if the cpupower is not installed. Since the test depends on cpupower, this patch stops the test if the cpupower is not installed. Link: https://lore.kernel.org/all/cc01753c8dab0f33669a5a0fc162544078055bd1.1730141362.git.alessandro.zanni87@gmail.com/ Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-10-28 21:18:57 -06:00
Alessandro Zanni	6553bfcb84	selftests/intel_pstate: fix operand expected error Running "make kselftest TARGETS=intel_pstate" results in the following errors: - ./run.sh: line 90: / 1000: syntax error: operand expected (error token is "/ 1000") - ./run.sh: line 92: / 1000: syntax error: operand expected (error token is "/ 1000") This fix allows to have cross-platform compatibility when using arithmetic expression with command substitutions. Link: https://lore.kernel.org/r/f37df23888cd5ea6b3976f19d3e25796129dd090.1730141362.git.alessandro.zanni87@gmail.com Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-10-28 21:18:52 -06:00
zhouyuhang	fa0122eaca	selftests/mount_setattr: fix idmap_mount_tree_invalid failed to run Test case idmap_mount_tree_invalid failed to run on the newer kernel with the following output: # RUN mount_setattr_idmapped.idmap_mount_tree_invalid ... # mount_setattr_test.c:1428:idmap_mount_tree_invalid:Expected sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr)) (0) ! = 0 (0) # idmap_mount_tree_invalid: Test terminated by assertion This is because tmpfs is mounted at "/mnt/A", and tmpfs already contains the flag FS_ALLOW_IDMAP after the commit `7a80e5b8c6` ("shmem: support idmapped mounts for tmpfs"). So calling sys_mount_setattr here returns 0 instead of -EINVAL as expected. Ramfs does not support idmap mounts, so we can use it here to test invalid mounts, which allows the test case to pass with the following output: # Starting 1 tests from 1 test cases. # RUN mount_setattr_idmapped.idmap_mount_tree_invalid ... # OK mount_setattr_idmapped.idmap_mount_tree_invalid ok 1 mount_setattr_idmapped.idmap_mount_tree_invalid # PASSED: 1 / 1 tests passed. Link: https://lore.kernel.org/all/20241028084132.3212598-1-zhouyuhang1010@163.com/ Signed-off-by: zhouyuhang <zhouyuhang@kylinos.cn> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-10-28 21:16:47 -06:00
Greg Kroah-Hartman	5963e0786a	Merge tag 'thunderbolt-for-v6.12-rc5' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-linus Mika writes: thunderbolt: Fixes for v6.12-rc5 This includes following USB4/Thunderbolt fixes for v6.12-rc5: - Fix KASAN reported stack out-of-bounds read - Honor Time Management Unit (TMU) requirements in the domain when configuring TMU mode of a newly plugged router. Both have been in linux-next with no reported issues. * tag 'thunderbolt-for-v6.12-rc5' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: thunderbolt: Honor TMU requirements in the domain when setting TMU mode thunderbolt: Fix KASAN reported stack out-of-bounds read in tb_retimer_scan()	2024-10-29 04:12:04 +01:00
Greg Kroah-Hartman	d0bc3b92fb	Merge tag 'iio-fixes-for-6.12b' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next Jonathan writes: IIO: Fixes for 6.12, set 2 Usual mixed back of fixes for ancient bugs and some more recently introduced problems. gts-helper module - Memory leak fixes for this library code to handle complex gain cases. adi,ad7124 - Fix a divide by zero that can be triggered from userspace. adi,ad7380 - Various supply fixes. Includes some minor rework that simplifies the fix though increases the apparent scale of the change. adi,ad9832 - Avoid a potential divide by zero if clk_get_rate() returns 0. adi,ltc2642 - Fix wrong Kconfig regmap dependency. vishay,veml6030 - Fix a scaling problem with decimal part of processed channel. Note that only the illuminance channel is fixed as a larger series of cleanups not suitable for this point in the rc cycle removes the intensity channel anyway. * tag 'iio-fixes-for-6.12b' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: iio: dac: Kconfig: Fix build error for ltc2664 iio: adc: ad7124: fix division by zero in ad7124_set_channel_odr() staging: iio: frequency: ad9832: fix division by zero in ad9832_calc_freqreg() docs: iio: ad7380: fix supply for ad7380-4 iio: adc: ad7380: fix supplies for ad7380-4 iio: adc: ad7380: add missing supplies iio: adc: ad7380: use devm_regulator_get_enable_read_voltage() dt-bindings: iio: adc: ad7380: fix ad7380-4 reference supply iio: light: veml6030: fix microlux value calculation iio: gts-helper: Fix memory leaks for the error path of iio_gts_build_avail_scale_table() iio: gts-helper: Fix memory leaks in iio_gts_build_avail_scale_table()	2024-10-29 04:10:12 +01:00
Alexander Usyskin	4adf613e01	mei: use kvmalloc for read buffer Read buffer is allocated according to max message size, reported by the firmware and may reach 64K in systems with pxp client. Contiguous 64k allocation may fail under memory pressure. Read buffer is used as in-driver message storage and not required to be contiguous. Use kvmalloc to allow kernel to allocate non-contiguous memory. Fixes: `3030dc0564` ("mei: add wrapper for queuing control commands.") Cc: stable <stable@kernel.org> Reported-by: Rohit Agarwal <rohiagar@chromium.org> Closes: https://lore.kernel.org/all/20240813084542.2921300-1-rohiagar@chromium.org/ Tested-by: Brian Geffon <bgeffon@google.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Acked-by: Tomas Winkler <tomasw@gmail.com> Link: https://lore.kernel.org/r/20241015123157.2337026-1-alexander.usyskin@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:01:40 +01:00
Abylay Ospan	cb617e148b	MAINTAINERS: add netup_unidvb maintainer Adding/restoring maintainership for the following drivers: F: drivers/media/pci/netup_unidvb/* F: drivers/media/dvb-frontends/helene* F: drivers/media/dvb-frontends/horus3a* F: drivers/media/dvb-frontends/lnbh25* F: drivers/media/dvb-frontends/ascot2e* F: drivers/media/dvb-frontends/cxd2841er* Signed-off-by: Abylay Ospan <aospan@amazon.com> Link: https://lore.kernel.org/r/20241023163425.30492-1-aospan@amazon.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 04:00:09 +01:00
Cong Wang	740be3b9a6	sock_map: fix a NULL pointer dereference in sock_map_link_update_prog() The following race condition could trigger a NULL pointer dereference: sock_map_link_detach(): sock_map_link_update_prog(): mutex_lock(&sockmap_mutex); ... sockmap_link->map = NULL; mutex_unlock(&sockmap_mutex); mutex_lock(&sockmap_mutex); ... sock_map_prog_link_lookup(sockmap_link->map); mutex_unlock(&sockmap_mutex); <continue> Fix it by adding a NULL pointer check. In this specific case, it makes no sense to update a link which is being released. Reported-by: Ruan Bonan <bonan.ruan@u.nus.edu> Fixes: `699c23f02c` ("bpf: Add bpf_link support for sk_msg and sk_skb progs") Cc: Yonghong Song <yonghong.song@linux.dev> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/20241026185522.338562-1-xiyou.wangcong@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-10-28 18:44:14 -07:00
Greg Kroah-Hartman	9a71892cbc	Revert "driver core: Fix uevent_show() vs driver detach race" This reverts commit `15fffc6a56`. This commit causes a regression, so revert it for now until it can come back in a way that works for everyone. Link: https://lore.kernel.org/all/172790598832.1168608.4519484276671503678.stgit@dwillia2-xfh.jf.intel.com/ Fixes: `15fffc6a56` ("driver core: Fix uevent_show() vs driver detach race") Cc: stable <stable@kernel.org> Cc: Ashish Sangwan <a.sangwan@samsung.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Dirk Behme <dirk.behme@de.bosch.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-29 01:23:43 +01:00
Aleksei Vetrov	2ef9439f7a	ASoC: dapm: fix bounds checker error in dapm_widget_list_create The widgets array in the snd_soc_dapm_widget_list has a __counted_by attribute attached to it, which points to the num_widgets variable. This attribute is used in bounds checking, and if it is not set before the array is filled, then the bounds sanitizer will issue a warning or a kernel panic if CONFIG_UBSAN_TRAP is set. This patch sets the size of the widgets list calculated with list_for_each as the initial value for num_widgets as it is used for allocating memory for the array. It is updated with the actual number of added elements after the array is filled. Signed-off-by: Aleksei Vetrov <vvvvvv@google.com> Fixes: `80e698e2df` ("ASoC: soc-dapm: Annotate struct snd_soc_dapm_widget_list with __counted_by") Link: https://patch.msgid.link/20241028-soc-dapm-bounds-checker-fix-v1-1-262b0394e89e@google.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-28 23:30:33 +00:00
Benjamin Große	94c11e8529	usb: add support for new USB device ID 0x17EF:0x3098 for the r8152 driver This patch adds support for another Lenovo Mini dock 0x17EF:0x3098 to the r8152 driver. The device has been tested on NixOS, hotplugging and sleep included. Signed-off-by: Benjamin Große <ste3ls@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241020174128.160898-1-ste3ls@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-28 16:22:35 -07:00
Jianbo Liu	f1e54d11b2	macsec: Fix use-after-free while sending the offloading packet KASAN reports the following UAF. The metadata_dst, which is used to store the SCI value for macsec offload, is already freed by metadata_dst_free() in macsec_free_netdev(), while driver still use it for sending the packet. To fix this issue, dst_release() is used instead to release metadata_dst. So it is not freed instantly in macsec_free_netdev() if still referenced by skb. BUG: KASAN: slab-use-after-free in mlx5e_xmit+0x1e8f/0x4190 [mlx5_core] Read of size 2 at addr ffff88813e42e038 by task kworker/7:2/714 [...] Workqueue: mld mld_ifc_work Call Trace: <TASK> dump_stack_lvl+0x51/0x60 print_report+0xc1/0x600 kasan_report+0xab/0xe0 mlx5e_xmit+0x1e8f/0x4190 [mlx5_core] dev_hard_start_xmit+0x120/0x530 sch_direct_xmit+0x149/0x11e0 __qdisc_run+0x3ad/0x1730 __dev_queue_xmit+0x1196/0x2ed0 vlan_dev_hard_start_xmit+0x32e/0x510 [8021q] dev_hard_start_xmit+0x120/0x530 __dev_queue_xmit+0x14a7/0x2ed0 macsec_start_xmit+0x13e9/0x2340 dev_hard_start_xmit+0x120/0x530 __dev_queue_xmit+0x14a7/0x2ed0 ip6_finish_output2+0x923/0x1a70 ip6_finish_output+0x2d7/0x970 ip6_output+0x1ce/0x3a0 NF_HOOK.constprop.0+0x15f/0x190 mld_sendpack+0x59a/0xbd0 mld_ifc_work+0x48a/0xa80 process_one_work+0x5aa/0xe50 worker_thread+0x79c/0x1290 kthread+0x28f/0x350 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x11/0x20 </TASK> Allocated by task 3922: kasan_save_stack+0x20/0x40 kasan_save_track+0x10/0x30 __kasan_kmalloc+0x77/0x90 __kmalloc_noprof+0x188/0x400 metadata_dst_alloc+0x1f/0x4e0 macsec_newlink+0x914/0x1410 __rtnl_newlink+0xe08/0x15b0 rtnl_newlink+0x5f/0x90 rtnetlink_rcv_msg+0x667/0xa80 netlink_rcv_skb+0x12c/0x360 netlink_unicast+0x551/0x770 netlink_sendmsg+0x72d/0xbd0 __sock_sendmsg+0xc5/0x190 ____sys_sendmsg+0x52e/0x6a0 ___sys_sendmsg+0xeb/0x170 __sys_sendmsg+0xb5/0x140 do_syscall_64+0x4c/0x100 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Freed by task 4011: kasan_save_stack+0x20/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x50 poison_slab_object+0x10c/0x190 __kasan_slab_free+0x11/0x30 kfree+0xe0/0x290 macsec_free_netdev+0x3f/0x140 netdev_run_todo+0x450/0xc70 rtnetlink_rcv_msg+0x66f/0xa80 netlink_rcv_skb+0x12c/0x360 netlink_unicast+0x551/0x770 netlink_sendmsg+0x72d/0xbd0 __sock_sendmsg+0xc5/0x190 ____sys_sendmsg+0x52e/0x6a0 ___sys_sendmsg+0xeb/0x170 __sys_sendmsg+0xb5/0x140 do_syscall_64+0x4c/0x100 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: `0a28bfd497` ("net/macsec: Add MACsec skb_metadata_dst Tx Data path support") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20241021100309.234125-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-28 16:13:50 -07:00
Jakub Kicinski	b5abbf6120	Merge branch 'mptcp-sched-fix-some-lock-issues' Matthieu Baerts says: ==================== mptcp: sched: fix some lock issues Two small fixes related to the MPTCP packets scheduler: - Patch 1: add missing rcu_read_(un)lock(). A fix for >= 6.6. And some modifications in the MPTCP selftests: - Patch 2: a small addition to the MPTCP selftests to cover more code. ==================== Link: https://patch.msgid.link/20241021-net-mptcp-sched-lock-v1-0-637759cf061c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-28 15:50:58 -07:00
Matthieu Baerts (NGI0)	5513dc1d8f	selftests: mptcp: list sysctl data Listing all the values linked to the MPTCP sysctl knobs was not exercised in MPTCP test suite. Let's do that to avoid any regressions, but also to have a kernel with a debug kconfig verifying more assumptions. For the moment, we are not interested by the output, only to avoid crashes and warnings. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241021-net-mptcp-sched-lock-v1-3-637759cf061c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-28 15:50:57 -07:00
Matthieu Baerts (NGI0)	3deb12c788	mptcp: init: protect sched with rcu_read_lock Enabling CONFIG_PROVE_RCU_LIST with its dependence CONFIG_RCU_EXPERT creates this splat when an MPTCP socket is created: ============================= WARNING: suspicious RCU usage 6.12.0-rc2+ #11 Not tainted ----------------------------- net/mptcp/sched.c:44 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by mptcp_connect/176. stack backtrace: CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822) mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7)) mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1)) ? sock_init_data_uid (arch/x86/include/asm/atomic.h:28) inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386) ? __sock_create (include/linux/rcupdate.h:347 (discriminator 1)) __sock_create (net/socket.c:1576) __sys_socket (net/socket.c:1671) ? __pfx___sys_socket (net/socket.c:1712) ? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1)) __x64_sys_socket (net/socket.c:1728) do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1)) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) That's because when the socket is initialised, rcu_read_lock() is not used despite the explicit comment written above the declaration of mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the warning. Fixes: `1730b2b2c5` ("mptcp: add sched in mptcp_sock") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/523 Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241021-net-mptcp-sched-lock-v1-1-637759cf061c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-28 15:50:54 -07:00
Levi Zim	b935252cc2	docs: networking: packet_mmap: replace dead links with archive.org links The original link returns 404 now. This commit replaces the dead google site link with archive.org link. Signed-off-by: Levi Zim <rsworktech@outlook.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241021-packet_mmap_fix_link-v1-1-dffae4a174c0@outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-28 15:47:10 -07:00
Jarkko Sakkinen	df745e2509	tpm: Lazily flush the auth session Move the allocation of chip->auth to tpm2_start_auth_session() so that this field can be used as flag to tell whether auth session is active or not. Instead of flushing and reloading the auth session for every transaction separately, keep the session open unless /dev/tpm0 is used. Reported-by: Pengyu Ma <mapengyu@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219229 Cc: stable@vger.kernel.org # v6.10+ Fixes: `7ca110f267` ("tpm: Address !chip->auth in tpm_buf_append_hmac_session*()") Tested-by: Pengyu Ma <mapengyu@gmail.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2024-10-29 00:46:20 +02:00
Alex Deucher	935abb86a9	drm/amdgpu/smu13: fix profile reporting The following 3 commits landed in parallel: commit `d7d2688bf4` ("drm/amd/pm: update workload mask after the setting") commit `7a1613e47e` ("drm/amdgpu/smu13: always apply the powersave optimization") commit `7c210ca5a2` ("drm/amdgpu: handle default profile on on devices without fullscreen 3D") While everything is set correctly, this caused the profile to be reported incorrectly because both the powersave and fullscreen3d bits were set in the mask and when the driver prints the profile, it looks for the first bit set. Fixes: `d7d2688bf4` ("drm/amd/pm: update workload mask after the setting") Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `ecfe9b2376`) Cc: stable@vger.kernel.org	2024-10-28 17:19:45 -04:00
Linus Torvalds	e42b1a9a25	Merge tag 'spi-fix-v6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A small collection of driver specific fixes for SPI, there's nothing particularly remarkable about any of them" * tag 'spi-fix-v6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-fsl-dspi: Fix crash when not using GPIO chip select spi: geni-qcom: Fix boot warning related to pm_runtime and devres spi: mtk-snfi: fix kerneldoc for mtk_snand_is_page_ops() spi: stm32: fix missing device mode capability in stm32mp25	2024-10-28 11:16:33 -10:00
Tvrtko Ursulin	4aa923a6e6	drm/amd/pm: Vangogh: Fix kernel memory out of bounds write KASAN reports that the GPU metrics table allocated in vangogh_tables_init() is not large enough for the memset done in smu_cmn_init_soft_gpu_metrics(). Condensed report follows: [ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu] [ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067 ... [ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544 [ 33.861816] Tainted: [W]=WARN [ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023 [ 33.861822] Call Trace: [ 33.861826] <TASK> [ 33.861829] dump_stack_lvl+0x66/0x90 [ 33.861838] print_report+0xce/0x620 [ 33.861853] kasan_report+0xda/0x110 [ 33.862794] kasan_check_range+0xfd/0x1a0 [ 33.862799] __asan_memset+0x23/0x40 [ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.867135] dev_attr_show+0x43/0xc0 [ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0 [ 33.867155] seq_read_iter+0x3f8/0x1140 [ 33.867173] vfs_read+0x76c/0xc50 [ 33.867198] ksys_read+0xfb/0x1d0 [ 33.867214] do_syscall_64+0x90/0x160 ... [ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s: [ 33.867358] kasan_save_stack+0x33/0x50 [ 33.867364] kasan_save_track+0x17/0x60 [ 33.867367] __kasan_kmalloc+0x87/0x90 [ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu] [ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu] [ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu] [ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu] [ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu] [ 33.869608] local_pci_probe+0xda/0x180 [ 33.869614] pci_device_probe+0x43f/0x6b0 Empirically we can confirm that the former allocates 152 bytes for the table, while the latter memsets the 168 large block. Root cause appears that when GPU metrics tables for v2_4 parts were added it was not considered to enlarge the table to fit. The fix in this patch is rather "brute force" and perhaps later should be done in a smarter way, by extracting and consolidating the part version to size logic to a common helper, instead of brute forcing the largest possible allocation. Nevertheless, for now this works and fixes the out of bounds write. v2: * Drop impossible v3_0 case. (Mario) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `41cec40bc9` ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: Wenyou Yang <WenYou.Yang@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241025145639.19124-1-tursulin@igalia.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `0880f58f96`) Cc: stable@vger.kernel.org # v6.6+	2024-10-28 17:14:08 -04:00
Ovidiu Bunea	1b6063a577	Revert "drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35" This reverts commit `9dad21f910` ("drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35") [why & how] The offending commit exposes a hang with lid close/open behavior. Both issues seem to be related to ODM 2:1 mode switching, so there is another issue generic to that sequence that needs to be investigated. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `68bf95317e`) Cc: stable@vger.kernel.org	2024-10-28 17:13:25 -04:00
Christoph Hellwig	be0e822bb3	block: fix queue limits checks in blk_rq_map_user_bvec for real blk_rq_map_user_bvec currently only has ad-hoc checks for queue limits, and the last fix to it enabled valid NVMe I/O to pass, but also allowed invalid one for drivers that set a max_segment_size or seg_boundary limit. Fix it once for all by using the bio_split_rw_at helper from the I/O path that indicates if and where a bio would be have to be split to adhere to the queue limits, and it returns a positive value, turn that into -EREMOTEIO to retry using the copy path. Fixes: `2ff9494418` ("block: fix sanity checks in blk_rq_map_user_bvec") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241028090840.446180-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-28 12:35:05 -06:00
Matthew Brost	746ae46c11	drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler work queue with WQ_MEM_RECLAIM to ensure forward progress during reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward progress during reclaim. v2: - Fixes tags (Philipp) - Reword commit message (Philipp) Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Philipp Stanner <pstanner@redhat.com> Cc: stable@vger.kernel.org Fixes: `34f50cc644` ("drm/sched: Use drm sched lockdep map for submit_wq") Fixes: `a6149f0393` ("drm/sched: Convert drm scheduler to use a work queue rather than kthread") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023235917.1836428-1-matthew.brost@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-28 14:12:56 -04:00
Hans de Goede	c1895ba181	ASoC: Intel: sst: Fix used of uninitialized ctx to log an error Fix the new "LPE0F28" code path using the uninitialized ctx variable to log an error. Fixes: `6668610b4d` ("ASoC: Intel: sst: Support LPE0F28 ACPI HID") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410261106.EBx49ssy-lkp@intel.com/ Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241026143615.171821-1-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-28 16:44:52 +00:00
Ian Rogers	a5384c4267	perf cap: Add __NR_capget to arch/x86 unistd As there are duplicated kernel headers in tools/include libc can pick up the wrong definitions. This was causing the wrong system call for capget in perf. Reported-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: `e25ebda78e` ("perf cap: Tidy up and improve capability testing") Closes: https://lore.kernel.org/lkml/cc7d6bdf-1aeb-4179-9029-4baf50b59342@intel.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241026055448.312247-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-28 13:04:52 -03:00
Arnaldo Carvalho de Melo	55f1b540d8	tools headers: Update the linux/unaligned.h copy with the kernel sources To pick up the changes in: `7f053812da` ("random: vDSO: minimize and simplify header includes") That required adding a copy of include/vdso/unaligned.h and its checking in tools/perf/check-headers.h. Addressing this perf tools build warning: Warning: Kernel ABI header differences: diff -u tools/include/linux/unaligned.h include/linux/unaligned.h Please see tools/include/uapi/README for further details. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Ian Rogers <irogers@google.com> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Zx-uHvAbPAESofEN@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-28 12:34:28 -03:00
Arnaldo Carvalho de Melo	93e4b86b3e	tools headers arm64: Sync arm64's cputype.h with the kernel sources To get the changes in: `924725707d` ("arm64: cputype: Add Neoverse-N3 definitions") That makes this perf source code to be rebuilt: CC /tmp/build/perf-tools/util/arm-spe.o The changes in the above patch add MIDR_NEOVERSE_N3, that probably need changes in arm-spe.c, so probably we need to add it to that array? Or maybe we need to leave this for later when this is all tested on those machines? static const struct midr_range neoverse_spe[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), {}, }; Mark Rutland recommended about arm-spe.c in a previous update to this file: "I would not touch this for now -- someone would have to go audit the TRMs to check that those other cores have the same encoding, and I think it'd be better to do that as a follow-up." That addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Zx-dffKdGsgkhG96@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-28 12:33:50 -03:00
Arnaldo Carvalho de Melo	21a3a3d015	tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources To pick up the changes in this cset: `947697c6f0` ("uapi: Define GENMASK_U128") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/bits.h include/uapi/linux/bits.h diff -u tools/include/linux/bits.h include/linux/bits.h Please see tools/include/uapi/README for further details. Acked-by: Yury Norov <yury.norov@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Zx-ZVH7bHqtFn8Dv@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-28 12:32:25 -03:00
Jarkko Sakkinen	cc7d859434	tpm: Rollback tpm2_load_null() Do not continue on tpm2_create_primary() failure in tpm2_load_null(). Cc: stable@vger.kernel.org # v6.10+ Fixes: `eb24c9788c` ("tpm: disable the TPM if NULL name changes") Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2024-10-28 17:22:01 +02:00
Jarkko Sakkinen	d658d59471	tpm: Return tpm2_sessions_init() when null key creation fails Do not continue tpm2_sessions_init() further if the null key pair creation fails. Cc: stable@vger.kernel.org # v6.10+ Fixes: `d2add27cf2` ("tpm: Add NULL primary creation") Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2024-10-28 17:22:01 +02:00
Hugh Dickins	c749d9b7eb	iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP generic/077 on x86_32 CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y with highmem, on huge=always tmpfs, issues a warning and then hangs (interruptibly): WARNING: CPU: 5 PID: 3517 at mm/highmem.c:622 kunmap_local_indexed+0x62/0xc9 CPU: 5 UID: 0 PID: 3517 Comm: cp Not tainted 6.12.0-rc4 #2 ... copy_page_from_iter_atomic+0xa6/0x5ec generic_perform_write+0xf6/0x1b4 shmem_file_write_iter+0x54/0x67 Fix copy_page_from_iter_atomic() by limiting it in that case (include/linux/skbuff.h skb_frag_must_loop() does similar). But going forward, perhaps CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is too surprising, has outlived its usefulness, and should just be removed? Fixes: `908a1ad894` ("iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic()") Signed-off-by: Hugh Dickins <hughd@google.com> Link: https://lore.kernel.org/r/dd5f0c89-186e-18e1-4f43-19a60f5a9774@google.com Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-28 13:39:35 +01:00
Christophe JAILLET	d221b844ee	ASoC: cs42l51: Fix some error handling paths in cs42l51_probe() If devm_gpiod_get_optional() fails, we need to disable previously enabled regulators, as done in the other error handling path of the function. Also, gpiod_set_value_cansleep(, 1) needs to be called to undo a potential gpiod_set_value_cansleep(, 0). If the "reset" gpio is not defined, this additional call is just a no-op. This behavior is the same as the one already in the .remove() function. Fixes: `11b9cd748e` ("ASoC: cs42l51: add reset management") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/a5e5f4b9fb03f46abd2c93ed94b5c395972ce0d1.1729975570.git.christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-28 12:21:17 +00:00
Ian Kent	f19910006e	autofs: fix thinko in validate_dev_ioctl() I was so sure the per-dentry expire timeout patch worked ok but my testing was flawed. In validate_dev_ioctl() the check for ioctl AUTOFS_DEV_IOCTL_TIMEOUT_CMD should use the ioctl number not the passed in ioctl command. Fixes: `433f9d76a0` ("autofs: add per dentry expire timeout") Cc: <stable@vger.kernel.org> # mainline only Signed-off-by: Ian Kent <raven@themaw.net> Link: https://lore.kernel.org/r/20241027224732.5507-1-raven@themaw.net Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-28 13:16:56 +01:00
Miguel Ojeda	c38a04ecb6	kbuild: rust: avoid errors with old `rustc`s without LLVM patch version Some old versions of `rustc` did not report the LLVM version without the patch version, e.g.: $ rustc --version --verbose rustc 1.48.0 (7eac88abb 2020-11-16) binary: rustc commit-hash: 7eac88abb2e57e752f3302f02be5f3ce3d7adfb4 commit-date: 2020-11-16 host: x86_64-unknown-linux-gnu release: 1.48.0 LLVM version: 11.0 Which would make the new `scripts/rustc-llvm-version.sh` fail and, in turn, the build: $ make LLVM=1 SYNC include/config/auto.conf.cmd ./scripts/rustc-llvm-version.sh: 13: arithmetic expression: expecting primary: "10000 * 10 + 100 * 0 + " init/Kconfig:83: syntax error init/Kconfig:83: invalid statement make[3]: * [scripts/kconfig/Makefile:85: syncconfig] Error 1 make[2]: * [Makefile:679: syncconfig] Error 2 make[1]: * [/home/cam/linux/Makefile:780: include/config/auto.conf.cmd] Error 2 make: * [Makefile:224: __sub-make] Error 2 Since we do not need to support such binaries, we can avoid adding logic for computing `rustc`'s LLVM version for those old binaries. Thus, instead, just make the match stricter. Other `rustc` binaries (even newer) did not report the LLVM version at all, but that was fine, since it would not match "LLVM", e.g.: $ rustc --version --verbose rustc 1.49.0 (e1884a8e3 2020-12-29) binary: rustc commit-hash: e1884a8e3c3e813aada8254edfa120e85bf5ffca commit-date: 2020-12-29 host: x86_64-unknown-linux-gnu release: 1.49.0 Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: Gary Guo <gary@garyguo.net> Reported-by: Cameron MacPherson <cameron.macpherson@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219423 Fixes: `af0121c2d3` ("kbuild: rust: add `CONFIG_RUSTC_LLVM_VERSION`") Tested-by: Cameron MacPherson <cameron.macpherson@gmail.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20241027145636.416030-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2024-10-28 00:27:16 +01:00
Linus Torvalds	8198375843	Linux 6.12-rc5	2024-10-27 12:52:02 -10:00
Linus Torvalds	ea1fda89f5	Merge tag 'x86_urgent_for_v6.12_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Prevent a certain range of pages which get marked as hypervisor-only, to get allocated to a CoCo (SNP) guest which cannot use them and thus fail booting - Fix the microcode loader on AMD to pay attention to the stepping of a patch and to handle the case where a BIOS config option splits the machine into logical NUMA nodes per L3 cache slice - Disable LAM from being built by default due to security concerns * tag 'x86_urgent_for_v6.12_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Ensure that RMP table fixups are reserved x86/microcode/AMD: Split load_microcode_amd() x86/microcode/AMD: Pay attention to the stepping dynamically x86/lam: Disable ADDRESS_MASKING in most cases	2024-10-27 09:01:36 -10:00
Linus Torvalds	f69a1accfe	Merge tag 'ftrace-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace fixes from Steven Rostedt: - Fix missing mutex unlock in error path of register_ftrace_graph() A previous fix added a return on an error path and forgot to unlock the mutex. Instead of dealing with error paths, use guard(mutex) as the mutex is just released at the exit of the function anyway. Other functions in this file should be updated with this, but that's a cleanup and not a fix. - Change cpuhp setup name to be consistent with other cpuhp states The same fix that the above patch fixes added a cpuhp_setup_state() call with the name of "fgraph_idle_init". I was informed that it should instead be something like: "fgraph:online". Update that too. * tag 'ftrace-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: fgraph: Change the name of cpuhp state to "fgraph:online" fgraph: Fix missing unlock in register_ftrace_graph()	2024-10-27 08:56:22 -10:00
Linus Torvalds	284a2f8996	Merge tag 'platform-drivers-x86-v6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - Asus thermal profile fix, fixing performance issues on Lunar Lake - Intel PMC: one revert for a lockdep issue and one bugfix - Dell WMI: Ignore some WMI events on suspend/resume to silence warnings * tag 'platform-drivers-x86-v6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: asus-wmi: Fix thermal profile initialization platform/x86: dell-wmi: Ignore suspend notifications platform/x86/intel/pmc: Fix pmc_core_iounmap to call iounmap for valid addresses platform/x86:intel/pmc: Revert "Enable the ACPI PM Timer to be turned off when suspended"	2024-10-27 08:40:33 -10:00
Linus Torvalds	7bec4657b0	Merge tag 'firewire-fixes-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Takashi Sakamoto: "A single commit to resolve a regression existing in v6.11 or later. The change in 1394 OHCI driver in v6.11 kernel could cause general protection faults when rediscovering nodes in IEEE 1394 bus while holding a spin lock. Consequently, watchdog checks can report a hard lockup. Currently, this issue is observed primarily during the system resume phase when using an extra node with three ports or more is used. However, it could potentially occur in the other cases as well" * tag 'firewire-fixes-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: core: fix invalid port index for parent device	2024-10-27 08:36:01 -10:00
Linus Torvalds	75f8b2f526	Merge tag 'block-6.12-20241026' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - Pull request for MD via Song fixing a few issues - Fix a wrong check in blk_rq_map_user_bvec(), causing IO errors on passthrough IO (Xinyu) * tag 'block-6.12-20241026' of git://git.kernel.dk/linux: block: fix sanity checks in blk_rq_map_user_bvec md/raid10: fix null ptr dereference in raid10_size() md: ensure child flush IO does not affect origin bio->bi_status	2024-10-27 08:29:36 -10:00
Linus Torvalds	a8b3be2617	Merge tag 'xfs-6.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Carlos Maiolino: - Fix recovery of allocator ops after a growfs - Do not fail repairs on metadata files with no attr fork * tag 'xfs-6.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: update the pag for the last AG at recovery time xfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag xfs: error out when a superblock buffer update reduces the agcount xfs: update the file system geometry after recoverying superblock buffers xfs: merge the perag freeing helpers xfs: pass the exact range to initialize to xfs_initialize_perag xfs: don't fail repairs on metadata files with no attr fork	2024-10-27 08:23:49 -10:00
Marc Zyngier	e6c24e2d05	irqchip/gic-v4: Correctly deal with set_affinity on lazily-mapped VPEs Zenghui points out that a recent change to the way set_affinity is handled for VPEs has the potential to return an error if the VPE hasn't been mapped yet (because the guest hasn't emited a MAPTI command yet), affecting GICv4.0 implementations that rely on the ITSList feature. Fix this by making the set_affinity succeed in this case, and return early, without trying to touch the HW. Fixes: `1442ee0011` ("irqchip/gic-v4: Don't allow a VMOVP on a dying VPE") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/all/20241027102220.1858558-1-maz@kernel.org Link: https://lore.kernel.org/r/aab45cd3-e5ca-58cf-e081-e32a17f5b4e7@huawei.com	2024-10-27 17:30:16 +01:00
Jinjie Ruan	5f994f5341	genirq/msi: Fix off-by-one error in msi_domain_alloc() The error path in msi_domain_alloc(), frees the already allocated MSI interrupts in a loop, but the loop condition terminates when the index reaches zero, which fails to free the first allocated MSI interrupt at index zero. Check for >= 0 so that msi[0] is freed as well. Fixes: `f3cf8bb0d6` ("genirq: Add generic msi irq domain support") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241026063639.10711-1-ruanjinjie@huawei.com	2024-10-27 10:40:47 +01:00
Benjamin Segall	b5413156ba	posix-cpu-timers: Clear TICK_DEP_BIT_POSIX_TIMER on clone When cloning a new thread, its posix_cputimers are not inherited, and are cleared by posix_cputimers_init(). However, this does not clear the tick dependency it creates in tsk->tick_dep_mask, and the handler does not reach the code to clear the dependency if there were no timers to begin with. Thus if a thread has a cputimer running before clone/fork, all descendants will prevent nohz_full unless they create a cputimer of their own. Fix this by entirely clearing the tick_dep_mask in copy_process(). (There is currently no inherited state that needs a tick dependency) Process-wide timers do not have this problem because fork does not copy signal_struct as a baseline, it creates one from scratch. Fixes: `b78783000d` ("posix-cpu-timers: Migrate to use new tick dependency mask model") Signed-off-by: Ben Segall <bsegall@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/xm26o737bq8o.fsf@google.com	2024-10-27 10:36:04 +01:00
Takashi Sakamoto	f6a6780e0b	firewire: core: fix invalid port index for parent device In a commit `24b7f8e5cd` ("firewire: core: use helper functions for self ID sequence"), the enumeration over self ID sequence was refactored with some helper functions with KUnit tests. These helper functions are guaranteed to work expectedly by the KUnit tests, however their application includes a mistake to assign invalid value to the index of port connected to parent device. This bug affects the case that any extra node devices which has three or more ports are connected to 1394 OHCI controller. In the case, the path to update the tree cache could hits WARN_ON(), and gets general protection fault due to the access to invalid address computed by the invalid value. This commit fixes the bug to assign correct port index. Cc: stable@vger.kernel.org Reported-by: Edmund Raile <edmund.raile@proton.me> Closes: https://lore.kernel.org/lkml/8a9902a4ece9329af1e1e42f5fea76861f0bf0e8.camel@proton.me/ Fixes: `24b7f8e5cd` ("firewire: core: use helper functions for self ID sequence") Link: https://lore.kernel.org/r/20241025034137.99317-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-10-27 11:14:35 +09:00
Armin Wolf	b012170fed	platform/x86: asus-wmi: Fix thermal profile initialization When support for vivobook fan profiles was added, the initial call to throttle_thermal_policy_set_default() was removed, which however is necessary for full initialization. Fix this by calling throttle_thermal_policy_set_default() again when setting up the platform profile. Fixes: `bcbfcebda2` ("platform/x86: asus-wmi: add support for vivobook fan profiles") Reported-by: Michael Larabel <Michael@phoronix.com> Closes: https://www.phoronix.com/review/lunar-lake-xe2/5 Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20241025191514.15032-2-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-10-26 13:03:10 +02:00
Shawn Wang	9c70b2a33c	sched/numa: Fix the potential null pointer dereference in task_numa_work() When running stress-ng-vm-segv test, we found a null pointer dereference error in task_numa_work(). Here is the backtrace: [323676.066985] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ...... [323676.067108] CPU: 35 PID: 2694524 Comm: stress-ng-vm-se ...... [323676.067113] pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--) [323676.067115] pc : vma_migratable+0x1c/0xd0 [323676.067122] lr : task_numa_work+0x1ec/0x4e0 [323676.067127] sp : ffff8000ada73d20 [323676.067128] x29: ffff8000ada73d20 x28: 0000000000000000 x27: 000000003e89f010 [323676.067130] x26: 0000000000080000 x25: ffff800081b5c0d8 x24: ffff800081b27000 [323676.067133] x23: 0000000000010000 x22: 0000000104d18cc0 x21: ffff0009f7158000 [323676.067135] x20: 0000000000000000 x19: 0000000000000000 x18: ffff8000ada73db8 [323676.067138] x17: 0001400000000000 x16: ffff800080df40b0 x15: 0000000000000035 [323676.067140] x14: ffff8000ada73cc8 x13: 1fffe0017cc72001 x12: ffff8000ada73cc8 [323676.067142] x11: ffff80008001160c x10: ffff000be639000c x9 : ffff8000800f4ba4 [323676.067145] x8 : ffff000810375000 x7 : ffff8000ada73974 x6 : 0000000000000001 [323676.067147] x5 : 0068000b33e26707 x4 : 0000000000000001 x3 : ffff0009f7158000 [323676.067149] x2 : 0000000000000041 x1 : 0000000000004400 x0 : 0000000000000000 [323676.067152] Call trace: [323676.067153] vma_migratable+0x1c/0xd0 [323676.067155] task_numa_work+0x1ec/0x4e0 [323676.067157] task_work_run+0x78/0xd8 [323676.067161] do_notify_resume+0x1ec/0x290 [323676.067163] el0_svc+0x150/0x160 [323676.067167] el0t_64_sync_handler+0xf8/0x128 [323676.067170] el0t_64_sync+0x17c/0x180 [323676.067173] Code: d2888001 910003fd f9000bf3 aa0003f3 (f9401000) [323676.067177] SMP: stopping secondary CPUs [323676.070184] Starting crashdump kernel... stress-ng-vm-segv in stress-ng is used to stress test the SIGSEGV error handling function of the system, which tries to cause a SIGSEGV error on return from unmapping the whole address space of the child process. Normally this program will not cause kernel crashes. But before the munmap system call returns to user mode, a potential task_numa_work() for numa balancing could be added and executed. In this scenario, since the child process has no vma after munmap, the vma_next() in task_numa_work() will return a null pointer even if the vma iterator restarts from 0. Recheck the vma pointer before dereferencing it in task_numa_work(). Fixes: `214dbc4281` ("sched: convert to vma iterator") Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org # v6.2+ Link: https://lkml.kernel.org/r/20241025022208.125527-1-shawnwang@linux.alibaba.com	2024-10-26 09:28:37 +02:00
Dmitry Torokhov	2860586c58	Input: adp5588-keys - do not try to disable interrupt 0 Commit `dc748812fc` ("Input: adp5588-keys - add support for pure gpio") made having interrupt line optional for the device, however it neglected to update suspend and resume handlers that try to disable interrupts for the duration of suspend. Fix this by checking if interrupt number assigned to the i2c device is not 0 before trying to disable or reenable it. Fixes: `dc748812fc` ("Input: adp5588-keys - add support for pure gpio") Link: https://lore.kernel.org/r/Zv_2jEMYSWDw2gKs@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2024-10-25 15:52:45 -07:00
Gustavo A. R. Silva	cf44e74504	wifi: mac80211: ieee80211_i: Fix memory corruption bug in struct ieee80211_chanctx Move the `struct ieee80211_chanctx_conf conf` to the end of `struct ieee80211_chanctx` and fix a memory corruption bug triggered e.g. in `hwsim_set_chanctx_magic()`: `radar_detected` is being overwritten when `cp->magic = HWSIM_CHANCTX_MAGIC;` See the function call sequence below: drv_add_chanctx(... struct ieee80211_chanctx ctx) -> local->ops->add_chanctx(&local->hw, &ctx->conf) -> mac80211_hwsim_add_chanctx(... struct ieee80211_chanctx_conf ctx) -> hwsim_set_chanctx_magic(ctx) This also happens in a number of other drivers. Also, add a code comment to try to prevent people from introducing new members after `struct ieee80211_chanctx_conf conf`. Notice that `struct ieee80211_chanctx_conf` is a flexible structure --a structure that contains a flexible-array member, so it should always be at the end of any other containing structures. This change also fixes 50 of the following warnings: net/mac80211/ieee80211_i.h:895:39: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] -Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Fixes: `bca8bc0399` ("wifi: mac80211: handle ieee80211_radar_detected() for MLO") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://patch.msgid.link/ZxwWPrncTeSi1UTq@kspp [also refer to other drivers in commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-26 00:42:49 +02:00
Linus Torvalds	850925a813	Merge tag '9p-for-6.12-rc5' of https://github.com/martinetd/linux Pull more 9p reverts from Dominique Martinet: "Revert patches causing inode collision problems. The code simplification introduced significant regressions on servers that do not remap inode numbers when exporting multiple underlying filesystems with colliding inodes. See the top-most revert (commit `be2ca38253`) for details. This problem had been ignored for too long and the reverts will also head to stable (6.9+). I'm confident this set of patches gets us back to previous behaviour (another related patch had already been reverted back in April and we're almost back to square 1, and the rest didn't touch inode lifecycle)" * tag '9p-for-6.12-rc5' of https://github.com/martinetd/linux: Revert "fs/9p: simplify iget to remove unnecessary paths" Revert "fs/9p: fix uaf in in v9fs_stat2inode_dotl" Revert "fs/9p: remove redundant pointer v9ses" Revert " fs/9p: mitigate inode collisions"	2024-10-25 15:25:02 -07:00
Tejun Heo	c31f2ee5cd	sched_ext: Fix enq_last_no_enq_fails selftest `cc9877fb76` ("sched_ext: Improve error reporting during loading") changed how load failures are reported so that more error context can be communicated. This breaks the enq_last_no_enq_fails test as attach no longer fails. The scheduler is guaranteed to be ejected on attach completion with full error information. Update enq_last_no_enq_fails so that it checks that the scheduler is ejected using ops.exit(). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Vishal Chourasia <vishalc@linux.ibm.com> Link: http://lkml.kernel.org/r/Zxknp7RAVNjmdJSc@linux.ibm.com Fixes: `cc9877fb76` ("sched_ext: Improve error reporting during loading")	2024-10-25 12:20:29 -10:00
Tejun Heo	7724abf0ca	sched_ext: Make cast_mask() inline cast_mask() doesn't do any actual work and is defined in a header file. Force it to be inline. When it is not inlined and the function is not used, it can cause verificaiton failures like the following: # tools/testing/selftests/sched_ext/runner -t minimal ===== START ===== TEST: minimal DESCRIPTION: Verify we can load a fully minimal scheduler OUTPUT: libbpf: prog 'cast_mask': missing BPF prog type, check ELF section name '.text' libbpf: prog 'cast_mask': failed to load: -22 libbpf: failed to load object 'minimal' libbpf: failed to load BPF skeleton 'minimal': -22 ERR: minimal.c:20 Failed to open and load skel not ok 1 minimal # ===== END ===== Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: `a748db0c8c` ("tools/sched_ext: Receive misc updates from SCX repo")	2024-10-25 12:19:44 -10:00
David Vernet	0e7ffff1b8	scx: Fix raciness in scx_ops_bypass() scx_ops_bypass() can currently race on the ops enable / disable path as follows: 1. scx_ops_bypass(true) called on enable path, bypass depth is set to 1 2. An op on the init path exits, which schedules scx_ops_disable_workfn() 3. scx_ops_bypass(false) is called on the disable path, and bypass depth is decremented to 0 4. kthread is scheduled to execute scx_ops_disable_workfn() 5. scx_ops_bypass(true) called, bypass depth set to 1 6. scx_ops_bypass() races when iterating over CPUs While it's not safe to take any blocking locks on the bypass path, it is safe to take a raw spinlock which cannot be preempted. This patch therefore updates scx_ops_bypass() to use a raw spinlock to synchronize, and changes scx_ops_bypass_depth to be a regular int. Without this change, we observe the following warnings when running the 'exit' sched_ext selftest (sometimes requires a couple of runs): .[root@virtme-ng sched_ext]# ./runner -t exit ===== START ===== TEST: exit ... [ 14.935078] WARNING: CPU: 2 PID: 360 at kernel/sched/ext.c:4332 scx_ops_bypass+0x1ca/0x280 [ 14.935126] Modules linked in: [ 14.935150] CPU: 2 UID: 0 PID: 360 Comm: sched_ext_ops_h Not tainted 6.11.0-virtme #24 [ 14.935192] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 14.935242] Sched_ext: exit (enabling+all) [ 14.935244] RIP: 0010:scx_ops_bypass+0x1ca/0x280 [ 14.935300] Code: ff ff ff e8 48 96 10 00 fb e9 08 ff ff ff c6 05 7b 34 e8 01 01 90 48 c7 c7 89 86 88 87 e8 be 1d f8 ff 90 0f 0b 90 90 eb 95 90 <0f> 0b 90 41 8b 84 24 24 0a 00 00 eb 97 90 0f 0b 90 41 8b 84 24 24 [ 14.935394] RSP: 0018:ffffb706c0957ce0 EFLAGS: 00010002 [ 14.935424] RAX: 0000000000000009 RBX: 0000000000000001 RCX: 00000000e3fb8b2a [ 14.935465] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffffff88a4c080 [ 14.935512] RBP: 0000000000009b56 R08: 0000000000000004 R09: 00000003f12e520a [ 14.935555] R10: ffffffff863a9795 R11: 0000000000000000 R12: ffff8fc5fec31300 [ 14.935598] R13: ffff8fc5fec31318 R14: 0000000000000286 R15: 0000000000000018 [ 14.935642] FS: 0000000000000000(0000) GS:ffff8fc5fe680000(0000) knlGS:0000000000000000 [ 14.935684] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 14.935721] CR2: 0000557d92890b88 CR3: 000000002464a000 CR4: 0000000000750ef0 [ 14.935765] PKRU: 55555554 [ 14.935782] Call Trace: [ 14.935802] <TASK> [ 14.935823] ? __warn+0xce/0x220 [ 14.935850] ? scx_ops_bypass+0x1ca/0x280 [ 14.935881] ? report_bug+0xc1/0x160 [ 14.935909] ? handle_bug+0x61/0x90 [ 14.935934] ? exc_invalid_op+0x1a/0x50 [ 14.935959] ? asm_exc_invalid_op+0x1a/0x20 [ 14.935984] ? raw_spin_rq_lock_nested+0x15/0x30 [ 14.936019] ? scx_ops_bypass+0x1ca/0x280 [ 14.936046] ? srso_alias_return_thunk+0x5/0xfbef5 [ 14.936081] ? __pfx_scx_ops_disable_workfn+0x10/0x10 [ 14.936111] scx_ops_disable_workfn+0x146/0xac0 [ 14.936142] ? finish_task_switch+0xa9/0x2c0 [ 14.936172] ? srso_alias_return_thunk+0x5/0xfbef5 [ 14.936211] ? __pfx_scx_ops_disable_workfn+0x10/0x10 [ 14.936244] kthread_worker_fn+0x101/0x2c0 [ 14.936268] ? __pfx_kthread_worker_fn+0x10/0x10 [ 14.936299] kthread+0xec/0x110 [ 14.936327] ? __pfx_kthread+0x10/0x10 [ 14.936351] ret_from_fork+0x37/0x50 [ 14.936374] ? __pfx_kthread+0x10/0x10 [ 14.936400] ret_from_fork_asm+0x1a/0x30 [ 14.936427] </TASK> [ 14.936443] irq event stamp: 21002 [ 14.936467] hardirqs last enabled at (21001): [<ffffffff863aa35f>] resched_cpu+0x9f/0xd0 [ 14.936521] hardirqs last disabled at (21002): [<ffffffff863dd0ba>] scx_ops_bypass+0x11a/0x280 [ 14.936571] softirqs last enabled at (20642): [<ffffffff863683d7>] __irq_exit_rcu+0x67/0xd0 [ 14.936622] softirqs last disabled at (20637): [<ffffffff863683d7>] __irq_exit_rcu+0x67/0xd0 [ 14.936672] ---[ end trace 0000000000000000 ]--- [ 14.953282] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 14.953352] ------------[ cut here ]------------ [ 14.953383] WARNING: CPU: 2 PID: 360 at kernel/sched/ext.c:4335 scx_ops_bypass+0x1d8/0x280 [ 14.953428] Modules linked in: [ 14.953453] CPU: 2 UID: 0 PID: 360 Comm: sched_ext_ops_h Tainted: G W 6.11.0-virtme #24 [ 14.953505] Tainted: [W]=WARN [ 14.953527] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 14.953574] RIP: 0010:scx_ops_bypass+0x1d8/0x280 [ 14.953603] Code: c6 05 7b 34 e8 01 01 90 48 c7 c7 89 86 88 87 e8 be 1d f8 ff 90 0f 0b 90 90 eb 95 90 0f 0b 90 41 8b 84 24 24 0a 00 00 eb 97 90 <0f> 0b 90 41 8b 84 24 24 0a 00 00 eb 92 f3 0f 1e fa 49 8d 84 24 f0 [ 14.953693] RSP: 0018:ffffb706c0957ce0 EFLAGS: 00010046 [ 14.953722] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001 [ 14.953763] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8fc5fec31318 [ 14.953804] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 14.953845] R10: ffffffff863a9795 R11: 0000000000000000 R12: ffff8fc5fec31300 [ 14.953888] R13: ffff8fc5fec31318 R14: 0000000000000286 R15: 0000000000000018 [ 14.953934] FS: 0000000000000000(0000) GS:ffff8fc5fe680000(0000) knlGS:0000000000000000 [ 14.953974] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 14.954009] CR2: 0000557d92890b88 CR3: 000000002464a000 CR4: 0000000000750ef0 [ 14.954052] PKRU: 55555554 [ 14.954068] Call Trace: [ 14.954085] <TASK> [ 14.954102] ? __warn+0xce/0x220 [ 14.954126] ? scx_ops_bypass+0x1d8/0x280 [ 14.954150] ? report_bug+0xc1/0x160 [ 14.954178] ? handle_bug+0x61/0x90 [ 14.954203] ? exc_invalid_op+0x1a/0x50 [ 14.954226] ? asm_exc_invalid_op+0x1a/0x20 [ 14.954250] ? raw_spin_rq_lock_nested+0x15/0x30 [ 14.954285] ? scx_ops_bypass+0x1d8/0x280 [ 14.954311] ? __mutex_unlock_slowpath+0x3a/0x260 [ 14.954343] scx_ops_disable_workfn+0xa3e/0xac0 [ 14.954381] ? __pfx_scx_ops_disable_workfn+0x10/0x10 [ 14.954413] kthread_worker_fn+0x101/0x2c0 [ 14.954442] ? __pfx_kthread_worker_fn+0x10/0x10 [ 14.954479] kthread+0xec/0x110 [ 14.954507] ? __pfx_kthread+0x10/0x10 [ 14.954530] ret_from_fork+0x37/0x50 [ 14.954553] ? __pfx_kthread+0x10/0x10 [ 14.954576] ret_from_fork_asm+0x1a/0x30 [ 14.954603] </TASK> [ 14.954621] irq event stamp: 21002 [ 14.954644] hardirqs last enabled at (21001): [<ffffffff863aa35f>] resched_cpu+0x9f/0xd0 [ 14.954686] hardirqs last disabled at (21002): [<ffffffff863dd0ba>] scx_ops_bypass+0x11a/0x280 [ 14.954735] softirqs last enabled at (20642): [<ffffffff863683d7>] __irq_exit_rcu+0x67/0xd0 [ 14.954782] softirqs last disabled at (20637): [<ffffffff863683d7>] __irq_exit_rcu+0x67/0xd0 [ 14.954829] ---[ end trace 0000000000000000 ]--- [ 15.022283] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 15.092282] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 15.149282] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) ok 1 exit # ===== END ===== And with it, the test passes without issue after 1000s of runs: .[root@virtme-ng sched_ext]# ./runner -t exit ===== START ===== TEST: exit DESCRIPTION: Verify we can cleanly exit a scheduler in multiple places OUTPUT: [ 7.412856] sched_ext: BPF scheduler "exit" enabled [ 7.427924] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.466677] sched_ext: BPF scheduler "exit" enabled [ 7.475923] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.512803] sched_ext: BPF scheduler "exit" enabled [ 7.532924] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.586809] sched_ext: BPF scheduler "exit" enabled [ 7.595926] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.661923] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.723923] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) ok 1 exit # ===== END ===== ============================= RESULTS: PASSED: 1 SKIPPED: 0 FAILED: 0 Fixes: `f0e1a0643a` ("sched_ext: Implement BPF extensible scheduler class") Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-25 11:10:51 -10:00
Dan Williams	3a2b97b321	cxl/test: Improve init-order fidelity relative to real-world systems The investigation of an initialization failure [1] highlighted that cxl_test does not reflect the init-order of real world systems. The expected order is root/bus first then async probing of the memory devices. Fix up cxl_test to reflect that order. While it did not reproduce the initial bug report (since that is dependent on built-in vs modular builds), it did reveal a separate latent bug in the subsystem's decoder shutdown flow. Fix for that sent separately. Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/172964784521.81806.15791069994065969243.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>	2024-10-25 16:07:04 -05:00
Dan Williams	105b6235ad	cxl/port: Prevent out-of-order decoder allocation With the recent change to allow out-of-order decoder de-commit it highlights a need to strengthen the in-order decoder commit guarantees. As it stands match_free_decoder() ensures that if 2 regions are racing decoder allocations the one that wins the race will get the lower id decoder, but that still leaves the race to commit the decoder. Rather than have this complicated case of "reserved in-order, but may still commit out-of-order", just arrange for the reservation order to match the commit-order. In other words, prevent subsequent allocations until the last reservation is committed. This precludes overlapping region creation events and requires the previous regionN to either move forward to the decoder commit stage or drop its reservation before regionN+1 can move forward. That is, provided that regionN and regionN+1 decode through the same switch port. As a side effect this allows match_free_decoder() to drop its dependency on needing write access to the device_find_child() @data parameter [1]. Reported-by: Zijun Hu <quic_zijuhu@quicinc.com> Closes: http://lore.kernel.org/20240905-const_dfc_prepare-v4-0-4180e1d5a244@quicinc.com Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964783668.81806.14962699553881333486.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>	2024-10-25 16:07:03 -05:00
Dan Williams	101c268bd2	cxl/port: Fix use-after-free, permit out-of-order decoder shutdown In support of investigating an initialization failure report [1], cxl_test was updated to register mock memory-devices after the mock root-port/bus device had been registered. That led to cxl_test crashing with a use-after-free bug with the following signature: cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1 cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1 cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0 1) cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1 [..] cxld_unregister: cxl decoder14.0: cxl_region_decode_reset: cxl_region region3: mock_decoder_reset: cxl_port port3: decoder3.0 reset 2) mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1 cxl_endpoint_decoder_release: cxl decoder14.0: [..] cxld_unregister: cxl decoder7.0: 3) cxl_region_decode_reset: cxl_region region3: Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI [..] RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core] [..] Call Trace: <TASK> cxl_region_decode_reset+0x69/0x190 [cxl_core] cxl_region_detach+0xe8/0x210 [cxl_core] cxl_decoder_kill_region+0x27/0x40 [cxl_core] cxld_unregister+0x5d/0x60 [cxl_core] At 1) a region has been established with 2 endpoint decoders (7.0 and 14.0). Those endpoints share a common switch-decoder in the topology (3.0). At teardown, 2), decoder14.0 is the first to be removed and hits the "out of order reset case" in the switch decoder. The effect though is that region3 cleanup is aborted leaving it in-tact and referencing decoder14.0. At 3) the second attempt to teardown region3 trips over the stale decoder14.0 object which has long since been deleted. The fix here is to recognize that the CXL specification places no mandate on in-order shutdown of switch-decoders, the driver enforces in-order allocation, and hardware enforces in-order commit. So, rather than fail and leave objects dangling, always remove them. In support of making cxl_region_decode_reset() always succeed, cxl_region_invalidate_memregion() failures are turned into warnings. Crashing the kernel is ok there since system integrity is at risk if caches cannot be managed around physical address mutation events like CXL region destruction. A new device_for_each_child_reverse_from() is added to cleanup port->commit_end after all dependent decoders have been disabled. In other words if decoders are allocated 0->1->2 and disabled 1->2->0 then port->commit_end only decrements from 2 after 2 has been disabled, and it decrements all the way to zero since 1 was disabled previously. Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Cc: stable@vger.kernel.org Fixes: `176baefb2e` ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>	2024-10-25 16:07:03 -05:00
Dan Williams	48f62d38a0	cxl/acpi: Ensure ports ready at cxl_acpi_probe() return In order to ensure root CXL ports are enabled upon cxl_acpi_probe() when the 'cxl_port' driver is built as a module, arrange for the module to be pre-loaded or built-in. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: `8dd2bc0f8e` ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964781969.81806.17276352414854540808.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>	2024-10-25 16:07:03 -05:00
Dan Williams	3d6ebf1643	cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices() It turns out since its original introduction, pre-2.6.12, bus_rescan_devices() has skipped devices that might be in the process of attaching or detaching from their driver. For CXL this behavior is unwanted and expects that cxl_bus_rescan() is a probe barrier. That behavior is simple enough to achieve with bus_for_each_dev() paired with call to device_attach(), and it is unclear why bus_rescan_devices() took the position of lockless consumption of dev->driver which is racy. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: `8dd2bc0f8e` ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>	2024-10-25 16:07:03 -05:00
Dan Williams	6575b26815	cxl/port: Fix CXL port initialization order when the subsystem is built-in When the CXL subsystem is built-in the module init order is determined by Makefile order. That order violates expectations. The expectation is that cxl_acpi and cxl_mem can race to attach. If cxl_acpi wins the race, cxl_mem will find the enabled CXL root ports it needs. If cxl_acpi loses the race it will retrigger cxl_mem to attach via cxl_bus_rescan(). That flow only works if cxl_acpi can assume ports are enabled immediately upon cxl_acpi_probe() return. That in turn can only happen in the CONFIG_CXL_ACPI=y case if the cxl_port driver is registered before cxl_acpi_probe() runs. Fix up the order to prevent initialization failures. Ensure that cxl_port is built-in when cxl_acpi is also built-in, arrange for Makefile order to resolve the subsys_initcall() order of cxl_port and cxl_acpi, and arrange for Makefile order to resolve the device_initcall() (module_init()) order of the remaining objects. As for what contributed to this not being found earlier, the CXL regression environment, cxl_test, builds all CXL functionality as a module to allow to symbol mocking and other dynamic reload tests. As a result there is no regression coverage for the built-in case. Reported-by: Gregory Price <gourry@gourry.net> Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net Tested-by: Gregory Price <gourry@gourry.net> Fixes: `8dd2bc0f8e` ("cxl/mem: Add the cxl_mem driver") Cc: stable@vger.kernel.org Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/172988474904.476062.7961350937442459266.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>	2024-10-25 16:06:49 -05:00
Peter Wang	cb7e509c4e	scsi: ufs: core: Fix another deadlock during RTC update If ufshcd_rtc_work calls ufshcd_rpm_put_sync() and the pm's usage_count is 0, we will enter the runtime suspend callback. However, the runtime suspend callback will wait to flush ufshcd_rtc_work, causing a deadlock. Replace ufshcd_rpm_put_sync() with ufshcd_rpm_put() to avoid the deadlock. Fixes: `6bf999e0eb` ("scsi: ufs: core: Add UFS RTC support") Cc: stable@vger.kernel.org #6.11.x Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241024015453.21684-1-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-10-25 14:51:34 -04:00
John Garry	d28d17a845	scsi: scsi_debug: Fix do_device_access() handling of unexpected SG copy length If the sg_copy_buffer() call returns less than sdebug_sector_size, then we drop out of the copy loop. However, we still report that we copied the full expected amount, which is not proper. Fix by keeping a running total and return that value. Fixes: `84f3a3c01d` ("scsi: scsi_debug: Atomic write support") Reported-by: Colin Ian King <colin.i.king@gmail.com> Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241018101655.4207-1-john.g.garry@oracle.com Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-10-25 14:48:27 -04:00
Linus Torvalds	c71f8fb4dc	Merge tag 'v6.12-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - Fix init module error caseb - Fix memory allocation error path (for passwords) in mount * tag 'v6.12-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix warning when destroy 'cifs_io_request_pool' smb: client: Handle kstrdup failures for passwords	2024-10-25 11:45:22 -07:00
Linus Torvalds	81dcc79758	Merge tag 'fuse-fixes-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: - Fix cached size after passthrough writes This fix needed a trivial change in the backing-file API, which resulted in some non-fuse files being touched. - Revert a commit meant as a cleanup but which triggered a WARNING - Remove a stray debug line left-over * tag 'fuse-fixes-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: remove stray debug line Revert "fuse: move initialization of fuse_file to fuse_writepages() instead of in callback" fuse: update inode size after extending passthrough write fs: pass offset and result to backing_file end_write() callback	2024-10-25 11:41:18 -07:00
Linus Torvalds	f647053312	Merge tag 'nfsd-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a couple of use-after-free bugs * tag 'nfsd-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net nfsd: fix race between laundromat and free_stateid	2024-10-25 11:38:15 -07:00
Linus Torvalds	b423f5a9a6	Merge tag 'acpi-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix an ACPI PRM (Platform Runtime Mechanism) issue and add two new DMI quirks, one for an ACPI IRQ override and one for lid switch detection: - Make acpi_parse_prmt() look for EFI_MEMORY_RUNTIME memory regions only to comply with the UEFI specification and make PRM use efi_guid_t instead of guid_t to avoid a compiler warning triggered by that change (Koba Ko, Dan Carpenter) - Add an ACPI IRQ override quirk for LG 16T90SP (Christian Heusel) - Add a lid switch detection quirk for Samsung Galaxy Book2 (Shubham Panwar)" * tag 'acpi-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: PRM: Clean up guid type in struct prm_handler_info ACPI: button: Add DMI quirk for Samsung Galaxy Book2 to fix initial lid detection issue ACPI: resource: Add LG 16T90SP to irq1_level_low_skip_override[] ACPI: PRM: Find EFI_MEMORY_RUNTIME block for PRM handler and context	2024-10-25 11:04:34 -07:00
Linus Torvalds	8c76163fff	Merge tag 'pm-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Update cpufreq documentation to match the code after recent changes (Christian Loehle), fix a units conversion issue in the CPPC cpufreq driver (liwei), and fix an error check in the dtpm_devfreq power capping driver (Yuan Can)" * tag 'pm-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: CPPC: fix perf_to_khz/khz_to_perf conversion exception powercap: dtpm_devfreq: Fix error check against dev_pm_qos_add_request() cpufreq: docs: Reflect latency changes in docs	2024-10-25 11:00:50 -07:00
Linus Torvalds	48005a5a74	Merge tag 'pci-v6.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Hold the rescan lock while adding devices to avoid race with concurrent pwrctl rescan that can lead to a crash (Bartosz Golaszewski) - Avoid binding pwrctl driver to QCom WCN wifi if the DT lacks the necessary PMU regulator descriptions (Bartosz Golaszewski) * tag 'pci-v6.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/pwrctl: Abandon QCom WCN probe on pre-pwrseq device-trees PCI: Hold rescan lock while adding devices during host probe	2024-10-25 10:56:06 -07:00
Linus Torvalds	86d6688e60	Merge tag 'fbdev-for-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes from Helge Deller: - Fix some build warnings and failures with CONFIG_FB_IOMEM_FOPS and CONFIG_FB_DEVICE - Remove the da8xx fbdev driver - Constify struct sbus_mmap_map and fix indentation warning * tag 'fbdev-for-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: wm8505fb: select CONFIG_FB_IOMEM_FOPS fbdev: da8xx: remove the driver fbdev: Constify struct sbus_mmap_map fbdev: nvidiafb: fix inconsistent indentation warning fbdev: sstfb: Make CONFIG_FB_DEVICE optional	2024-10-25 10:51:58 -07:00
Linus Torvalds	f0560f974e	Merge tag 'gpio-fixes-for-v6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fix from Bartosz Golaszewski: "Update MAINTAINERS with a keyword pattern for legacy GPIO API The goal is to alert us to anyone trying to use the deprecated, legacy API (this happens almost every release)" * tag 'gpio-fixes-for-v6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: MAINTAINERS: add a keyword entry for the GPIO subsystem	2024-10-25 10:47:51 -07:00
Linus Torvalds	7a7aecd9c0	Merge tag 'ata-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Niklas Cassel: - Fix the handling of ATA commands that timeout (command that did not receive a completion interrupt within the configured timeout time). Commands that timeout, while also having either the FAILFAST flag set, or the command being a passthrough command, should never be retried. Restore this behavior (as it was before v6.12-rc1). * tag 'ata-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata: Set DID_TIME_OUT for commands that actually timed out	2024-10-25 10:42:29 -07:00
Linus Torvalds	01154cc30e	Merge tag 'sound-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The majority of changes here are about ASoC. There are two core changes in ASoC (the bump of minimal topology ABI version and the fix for references of components in DAPM code), and others are mostly various device-specific fixes for SoundWire, AMD, Intel, SOF, Qualcomm and FSL, in addition to a few usual HD-audio quirks and fixes" * tag 'sound-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (33 commits) ALSA: hda/realtek: Update default depop procedure ASoC: qcom: sc7280: Fix missing Soundwire runtime stream alloc ASoC: fsl_micfil: Add sample rate constraint ASoC: rt722-sdca: increase clk_stop_timeout to fix clock stop issue ALSA: hda/tas2781: select CRC32 instead of CRC32_SARWATE ALSA: hda/realtek: Add subwoofer quirk for Acer Predator G9-593 ALSA: firewire-lib: Avoid division by zero in apply_constraint_to_size() ASoC: fsl_micfil: Add a flag to distinguish with different volume control types ASoC: codecs: lpass-rx-macro: fix RXn(rx,n) macro for DSM_CTL and SEC7 regs ASoC: Change my e-mail to gmail ASoC: Intel: soc-acpi: lnl: Add match entry for TM2 laptops ASoC: amd: yc: Fix non-functional mic on ASUS E1404FA ASoC: SOF: Intel: hda: Always clean up link DMA during stop soundwire: intel_ace2x: Send PDI stream number during prepare ASoC: SOF: Intel: hda: Handle prepare without close for non-HDA DAI's ASoC: SOF: ipc4-topology: Do not set ALH node_id for aggregated DAIs MAINTAINERS: Update maintainer list for MICROCHIP ASOC, SSC and MCP16502 drivers ASoC: qcom: Select missing common Soundwire module code on SDM845 ASoC: fsl_esai: change dev_warn to dev_dbg in irq handler ASoC: rsnd: Fix probe failure on HiHope boards due to endpoint parsing ...	2024-10-25 10:35:29 -07:00
Linus Torvalds	fd143856b0	Merge tag 'drm-fixes-2024-10-25' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Weekly drm fixes, mostly amdgpu and xe, with minor bridge and an i915 Kconfig fix. Nothing too scary and it seems to be pretty quiet. amdgpu: - ACPI method handling fixes - SMU 14.x fixes - Display idle optimization fix - DP link layer compliance fix - SDMA 7.x fix - PSR-SU fix - SWSMU fix i915: - Fix DRM_I915_GVT_KVMGT dependencies in Kconfig xe: - Increase invalidation timeout to avoid errors in some hosts - Flush worker on timeout - Better handling for force wake failure - Improve argument check on user fence creation - Don't restart parallel queues multiple times on GT reset bridge: - aux: Fix assignment of OF node - tc358767: Add missing of_node_put() in error path" * tag 'drm-fixes-2024-10-25' of https://gitlab.freedesktop.org/drm/kernel: drm/xe: Don't restart parallel queues multiple times on GT reset drm/xe/ufence: Prefetch ufence addr to catch bogus address drm/xe: Handle unreliable MMIO reads during forcewake drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout drm/xe: Enlarge the invalidation timeout from 150 to 500 drm/amdgpu: handle default profile on on devices without fullscreen 3D drm/amd/display: Disable PSR-SU on Parade 08-01 TCON too drm/amdgpu: fix random data corruption for sdma 7 drm/amd/display: temp w/a for DP Link Layer compliance drm/amd/display: temp w/a for dGPU to enter idle optimizations drm/amd/pm: update deep sleep status on smu v14.0.2/3 drm/amd/pm: update overdrive function on smu v14.0.2/3 drm/amd/pm: update the driver-fw interface file for smu v14.0.2/3 drm/amd: Guard against bad data for ATIF ACPI method drm/bridge: tc358767: fix missing of_node_put() in for_each_endpoint_of_node() drm/bridge: Fix assignment of the of_node of the parent to aux bridge i915: fix DRM_I915_GVT_KVMGT dependencies	2024-10-25 10:29:51 -07:00
Kent Overstreet	8e910ca20e	bcachefs: Fix UAF in bch2_reconstruct_alloc() write_super() -> sb_counters_from_cpu() may reallocate the superblock Reported-by: syzbot+9fc4dac4775d07bcfe34@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-25 13:17:23 -04:00
Jeongjun Park	a25a83de45	bcachefs: fix null-ptr-deref in have_stripes() c->btree_roots_known[i].b can be NULL. In this case, a NULL pointer dereference occurs, so you need to add code to check the variable. Reported-by: syzbot+b468b9fef56949c3b528@syzkaller.appspotmail.com Fixes: `7773df19c3` ("bcachefs: metadata version bucket_stripe_sectors") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-25 13:17:06 -04:00
David Vernet	895669fd0d	scx: Fix exit selftest to use custom DSQ In commit `63fb3ec805` ("sched_ext: Allow only user DSQs for scx_bpf_consume(), scx_bpf_dsq_nr_queued() and bpf_iter_scx_dsq_new()"), we updated the consume path to only accept user DSQs, thus making it invalid to consume SCX_DSQ_GLOBAL. This selftest was doing that, so let's create a custom DSQ and use that instead. The test now passes: [root@virtme-ng sched_ext]# ./runner -t exit ===== START ===== TEST: exit DESCRIPTION: Verify we can cleanly exit a scheduler in multiple places OUTPUT: [ 12.387229] sched_ext: BPF scheduler "exit" enabled [ 12.406064] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.453325] sched_ext: BPF scheduler "exit" enabled [ 12.474064] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.515241] sched_ext: BPF scheduler "exit" enabled [ 12.532064] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.592063] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.654063] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.715062] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) ok 1 exit # ===== END ===== Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-25 07:03:36 -10:00
Linus Torvalds	4dc1f31ec3	x86: fix whitespace in runtime-const assembler output The x86 user pointer validation changes made me look at compiler output a lot, and the wrong indentation for the ".popsection" in the generated assembler triggered me. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-10-25 09:53:03 -07:00
Linus Torvalds	86e6b1547b	x86: fix user address masking non-canonical speculation issue It turns out that AMD has a "Meltdown Lite(tm)" issue with non-canonical accesses in kernel space. And so using just the high bit to decide whether an access is in user space or kernel space ends up with the good old "leak speculative data" if you have the right gadget using the result: CVE-2020-12965 “Transient Execution of Non-Canonical Accesses“ Now, the kernel surrounds the access with a STAC/CLAC pair, and those instructions end up serializing execution on older Zen architectures, which closes the speculation window. But that was true only up until Zen 5, which renames the AC bit [1]. That improves performance of STAC/CLAC a lot, but also means that the speculation window is now open. Note that this affects not just the new address masking, but also the regular valid_user_address() check used by access_ok(), and the asm version of the sign bit check in the get_user() helpers. It does not affect put_user() or clear_user() variants, since there's no speculative result to be used in a gadget for those operations. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Link: https://lore.kernel.org/all/80d94591-1297-4afb-b510-c665efd37f10@citrix.com/ Link: https://lore.kernel.org/all/20241023094448.GAZxjFkEOOF_DM83TQ@fat_crate.local/ [1] Link: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-1010.html Link: https://arxiv.org/pdf/2108.10771 Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> # LAM case Fixes: `2865baf540` ("x86: support user address masking instead of non-speculative conditional") Fixes: `6014bc2756` ("x86-64: make access_ok() independent of LAM") Fixes: `b19b74bc99` ("x86/mm: Rework address range check in get_user() and put_user()") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-10-25 09:53:03 -07:00
Shiju Jose	53ab8678e7	cxl/events: Fix Trace DRAM Event Record CXL spec rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record. Fix decode memory event type field of DRAM Event Record. For e.g. if value is 0x1 it will be reported as an Invalid Address (General Media Event Record - Memory Event Type) instead of Scrub Media ECC Error (DRAM Event Record - Memory Event Type) and so on. Fixes: `2d6c1e6d60` ("cxl/mem: Trace DRAM Event Record") Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20241014143003.1170-1-shiju.jose@huawei.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>	2024-10-25 11:23:26 -05:00
Johannes Berg	7245012f0f	wifi: iwlwifi: mvm: fix 6 GHz scan construction If more than 255 colocated APs exist for the set of all APs found during 2.4/5 GHz scanning, then the 6 GHz scan construction will loop forever since the loop variable has type u8, which can never reach the number found when that's bigger than 255, and is stored in a u32 variable. Also move it into the loops to have a smaller scope. Using a u32 there is fine, we limit the number of APs in the scan list and each has a limit on the number of RNR entries due to the frame size. With a limit of 1000 scan results, a frame size upper bound of 4096 (really it's more like ~2300) and a TBTT entry size of at least 11, we get an upper bound for the number of ~372k, well in the bounds of a u32. Cc: stable@vger.kernel.org Fixes: `eae94cf82d` ("iwlwifi: mvm: add support for 6GHz") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375 Link: https://patch.msgid.link/20241023091744.f4baed5c08a1.I8b417148bbc8c5d11c101e1b8f5bf372e17bf2a7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:47 +02:00
Johannes Berg	d5fee261df	wifi: cfg80211: clear wdev->cqm_config pointer on free When we free wdev->cqm_config when unregistering, we also need to clear out the pointer since the same wdev/netdev may get re-registered in another network namespace, then destroyed later, running this code again, which results in a double-free. Reported-by: syzbot+36218cddfd84b5cc263e@syzkaller.appspotmail.com Fixes: `37c20b2eff` ("wifi: cfg80211: fix cqm_config access race") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20241022161742.7c34b2037726.I121b9cdb7eb180802eafc90b493522950d57ee18@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:40 +02:00
Ben Greear	9b15c6cf8d	mac80211: fix user-power when emulating chanctx ieee80211_calc_hw_conf_chan was ignoring the configured user_txpower. If it is set, use it to potentially decrease txpower as requested. Signed-off-by: Ben Greear <greearb@candelatech.com> Link: https://patch.msgid.link/20241010203954.1219686-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:38 +02:00
Emmanuel Grumbach	bfc0ed73e0	Revert "wifi: iwlwifi: remove retry loops in start" Revert commit `dfdfe4be18` ("wifi: iwlwifi: remove retry loops in start"), it turns out that there's an issue with the PNVM load notification from firmware not getting processed, that this patch has been somewhat successfully papering over. Since this is being reported, revert the loop removal for now. We will later at least clean this up to only attempt to retry if there was a timeout, but currently we don't even bubble up the failure reason to the correct layer, only returning NULL. Fixes: `dfdfe4be18` ("wifi: iwlwifi: remove retry loops in start") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://patch.msgid.link/20241022092212.4aa82a558a00.Ibdeff9c8f0d608bc97fc42024392ae763b6937b7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:37 +02:00
Emmanuel Grumbach	734a377e1e	wifi: iwlwifi: mvm: don't add default link in fw restart flow When we add the vif (and its default link) in fw restart we may override the link that already exists. We take care of this but if link 0 is a valid MLO link, then we will re-create a default link on mvmvif->link[0] and we'll loose the real link we had there. In non-MLO, we need to re-create the default link upon the interface creation, this is fine. In MLO, we'll just wait for change_vif_links() to re-build the links. Fixes: `bf976c814c` ("wifi: iwlwifi: mvm: implement link change ops") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.385bfea1b2e9.I4a127312285ccb529cc95cc4edf6fbe1e0a136ad@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:35 +02:00
Daniel Gabay	07a6e3b78a	wifi: iwlwifi: mvm: Fix response handling in iwl_mvm_send_recovery_cmd() 1. The size of the response packet is not validated. 2. The response buffer is not freed. Resolve these issues by switching to iwl_mvm_send_cmd_status(), which handles both size validation and frees the buffer. Fixes: `f130bb75d8` ("iwlwifi: add FW recovery flow") Signed-off-by: Daniel Gabay <daniel.gabay@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.76c73185951e.Id3b6ca82ced2081f5ee4f33c997491d0ebda83f7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:33 +02:00
Anjaneyulu	32d95ab330	wifi: iwlwifi: mvm: SAR table alignment SAR table format in ACPI and local data base are different, So modified code to read data properly. Signed-off-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.f077aced4dee.I4dc618f12d01f7ad19f9f8881f6e09eea77e9a14@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:31 +02:00
Daniel Gabay	9715246ca0	wifi: iwlwifi: mvm: Use the sync timepoint API in suspend When starting the suspend flow, HOST_D3_START triggers an _async_ firmware dump collection for debugging purposes. The async worker may race with suspend flow and fail to get NIC access, resulting in the following warning: "Timeout waiting for hardware access (CSR_GP_CNTRL 0xffffffff)" Fix this by switching to the sync version to ensure the dump completes before proceeding with the suspend flow, avoiding potential race issues. Signed-off-by: Daniel Gabay <daniel.gabay@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.9aae318cd593.I4b322009f39489c0b1d8893495c887870f73ed9c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:29 +02:00
Miri Korenblit	cbe84e9ad5	wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd iwl_mvm_send_ap_tx_power_constraint_cmd is a no-op if the link is not active (we need to know the band etc.) However, for the station case it will be called just before we set the link to active (by calling iwl_mvm_link_changed with the LINK_CONTEXT_MODIFY_ACTIVE bit set in the 'changed' flags and active = true), so it will end up doing nothing. Fix this by calling iwl_mvm_send_ap_tx_power_constraint_cmd before iwl_mvm_link_changed. Fixes: `6b82f4e119` ("wifi: iwlwifi: mvm: handle TPE advertised by AP") Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.5c235fccd3f1.I2d40dea21e5547eba458565edcb4c354d094d82a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:26 +02:00
Emmanuel Grumbach	3ed092997a	wifi: iwlwifi: mvm: don't leak a link on AP removal Release the link mapping resource in AP removal. This impacted devices that do not support the MLD API (9260 and down). On those devices, we couldn't start the AP again after the AP has been already started and stopped. Fixes: `a8b5d4809b` ("wifi: iwlwifi: mvm: Configure the link mapping for non-MLD FW") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.c54c42779882.Ied79e0d6244dc5a372e8b6ffa8ee9c6e1379ec1d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-25 17:53:24 +02:00
Rafael J. Wysocki	1646a3f2b1	Merge branch 'pm-powercap' Merge a dtpm_devfreq power capping driver fix for 6.12-rc5: - Fix a dev_pm_qos_add_request() return value check in __dtpm_devfreq_setup() to prevent it from failing if a positive number is returned (Yuan Can). * pm-powercap: powercap: dtpm_devfreq: Fix error check against dev_pm_qos_add_request()	2024-10-25 17:27:19 +02:00
Rafael J. Wysocki	54774abb55	Merge branches 'acpi-resource' and 'acpi-button' Merge new DMI quirks for 6.12-rc5: - Add an ACPI IRQ override quirk for LG 16T90SP (Christian Heusel). - Add a lid switch detection quirk for Samsung Galaxy Book2 (Shubham Panwar). * acpi-resource: ACPI: resource: Add LG 16T90SP to irq1_level_low_skip_override[] * acpi-button: ACPI: button: Add DMI quirk for Samsung Galaxy Book2 to fix initial lid detection issue	2024-10-25 17:08:14 +02:00
Miklos Szeredi	d34a5575e6	fuse: remove stray debug line It wasn't there when the patch was posted for review, but somehow made it into the pull. Link: https://lore.kernel.org/all/20240913104703.1673180-1-mszeredi@redhat.com/ Fixes: `efad7153bf` ("fuse: allow O_PATH fd for FUSE_DEV_IOC_BACKING_OPEN") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2024-10-25 17:05:49 +02:00
Thomas Zimmermann	fc5ced75d6	Merge drm/drm-fixes into drm-misc-fixes Backmerging to get the latest fixes from upstream. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-10-25 15:24:08 +02:00
Palmer Dabbelt	5f153a692b	Merge commit 'bf40167d54d5' into fixes This fix is part of a series on for-next, but it fixes broken builds so I'm picking it up as a fix. * commit 'bf40167d54d5': riscv: vdso: Prevent the compiler from inserting calls to memset()	2024-10-25 06:18:43 -07:00
Chunyan Zhang	164f66de6b	riscv: Remove duplicated GET_RM The macro GET_RM defined twice in this file, one can be removed. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> Fixes: `956d705dd2` ("riscv: Unaligned load/store handling for M_MODE") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241008094141.549248-3-zhangchunyan@iscas.ac.cn Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-25 06:18:42 -07:00
Chunyan Zhang	46d4e5ac6f	riscv: Remove unused GENERATING_ASM_OFFSETS The macro is not used in the current version of kernel, it looks like can be removed to avoid a build warning: ../arch/riscv/kernel/asm-offsets.c: At top level: ../arch/riscv/kernel/asm-offsets.c:7: warning: macro "GENERATING_ASM_OFFSETS" is not used [-Wunused-macros] 7 \| #define GENERATING_ASM_OFFSETS Fixes: `9639a44394` ("RISC-V: Provide a cleaner raw_smp_processor_id()") Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> Link: https://lore.kernel.org/r/20241008094141.549248-2-zhangchunyan@iscas.ac.cn Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-25 06:18:41 -07:00
WangYuli	e0872ab726	riscv: Use '%u' to format the output of 'cpu' 'cpu' is an unsigned integer, so its conversion specifier should be %u, not %d. Suggested-by: Wentao Guan <guanwentao@uniontech.com> Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/all/alpine.DEB.2.21.2409122309090.40372@angie.orcam.me.uk/ Signed-off-by: WangYuli <wangyuli@uniontech.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Charlie Jenkins <charlie@rivosinc.com> Fixes: `f1e58583b9` ("RISC-V: Support cpu hotplug") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/4C127DEECDA287C8+20241017032010.96772-1-wangyuli@uniontech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-25 06:18:40 -07:00
Miquel Sabaté Solà	37233169a6	riscv: Prevent a bad reference count on CPU nodes When populating cache leaves we previously fetched the CPU device node at the very beginning. But when ACPI is enabled we go through a specific branch which returns early and does not call 'of_node_put' for the node that was acquired. Since we are not using a CPU device node for the ACPI code anyways, we can simply move the initialization of it just passed the ACPI block, and we are guaranteed to have an 'of_node_put' call for the acquired node. This prevents a bad reference count of the CPU device node. Moreover, the previous function did not check for errors when acquiring the device node, so a return -ENOENT has been added for that case. Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Fixes: `604f32ea69` ("riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240913080053.36636-1-mikisabate@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-25 06:18:39 -07:00
Heinrich Schuchardt	d41373a4b9	riscv: efi: Set NX compat flag in PE/COFF header The IMAGE_DLLCHARACTERISTICS_NX_COMPAT informs the firmware that the EFI binary does not rely on pages that are both executable and writable. The flag is used by some distro versions of GRUB to decide if the EFI binary may be executed. As the Linux kernel neither has RWX sections nor needs RWX pages for relocation we should set the flag. Cc: Ard Biesheuvel <ardb@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Fixes: `cb7d2dd561` ("RISC-V: Add PE/COFF header for EFI stub") Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240929140233.211800-1-heinrich.schuchardt@canonical.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-25 06:18:38 -07:00
Conor Dooley	33549fcf37	RISC-V: disallow gcc + rust builds During the discussion before supporting rust on riscv, it was decided not to support gcc yet, due to differences in extension handling compared to llvm (only the version of libclang matching the c compiler is supported). Recently Jason Montleon reported [1] that building with gcc caused build issues, due to unsupported arguments being passed to libclang. After some discussion between myself and Miguel, it is better to disable gcc + rust builds to match the original intent, and subsequently support it when an appropriate set of extensions can be deduced from the version of libclang. Closes: https://lore.kernel.org/all/20240917000848.720765-2-jmontleo@redhat.com/ [1] Link: https://lore.kernel.org/all/20240926-battering-revolt-6c6a7827413e@spud/ [2] Fixes: `70a57b2472` ("RISC-V: enable building 64-bit kernels with rust support") Cc: stable@vger.kernel.org Reported-by: Jason Montleon <jmontleo@redhat.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20241001-playlist-deceiving-16ece2f440f5@spud Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-25 06:18:37 -07:00
Alexandre Ghiti	afedc3126e	riscv: Do not use fortify in early code Early code designates the code executed when the MMU is not yet enabled, and this comes with some limitations (see Documentation/arch/riscv/boot.rst, section "Pre-MMU execution"). FORTIFY_SOURCE must be disabled then since it can trigger kernel panics as reported in [1]. Reported-by: Jason Montleon <jmontleo@redhat.com> Closes: https://lore.kernel.org/linux-riscv/CAJD_bPJes4QhmXY5f63GHV9B9HFkSCoaZjk-qCT2NGS7Q9HODg@mail.gmail.com/ [1] Fixes: `a35707c3d8` ("riscv: add memory-type errata for T-Head") Fixes: `26e7aacb83` ("riscv: Allow to downgrade paging mode from the command line") Cc: stable@vger.kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241009072749.45006-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-25 06:18:36 -07:00
Yunhui Cui	1966db682f	RISC-V: ACPI: fix early_ioremap to early_memremap When SVPBMT is enabled, __acpi_map_table() will directly access the data in DDR through the IO attribute, rather than through hardware cache consistency, resulting in incorrect data in the obtained ACPI table. The log: ACPI: [ACPI:0x18] Invalid zero length. We do not assume whether the bootloader flushes or not. We should access in a cacheable way instead of maintaining cache consistency by software. Fixes: `3b426d4b5b` ("RISC-V: ACPI : Fix for usage of pointers in different address space") Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Reviewed-by: Sunil V L <sunilvl@ventanamicro.com> Link: https://lore.kernel.org/r/20241014130141.86426-1-cuiyunhui@bytedance.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-25 06:18:31 -07:00
Hans de Goede	6668610b4d	ASoC: Intel: sst: Support LPE0F28 ACPI HID Some old Bay Trail tablets which shipped with Android as factory OS have the SST/LPE audio engine described by an ACPI device with a HID (Hardware-ID) of LPE0F28 instead of 80860F28. Add support for this. Note this uses a new sst_res_info for just the LPE0F28 case because it has a different layout for the IO-mem ACPI resources then the 80860F28. An example of a tablet which needs this is the Vexia EDU ATLA 10 tablet, which has been distributed to schools in the Spanish Andalucía region. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241025090221.52198-1-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-25 14:10:11 +01:00
David S. Miller	e31a8219fb	Merge tag 'wireless-2024-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless wireless fixes for v6.12-rc5 The first set of wireless fixes for v6.12. We have been busy and have not been able to send this earlier, so there are more fixes than usual. The fixes are all over, both in stack and in drivers, but nothing special really standing out.	2024-10-25 10:44:41 +01:00
Kailang Yang	78e7be0187	ALSA: hda/realtek: Limit internal Mic boost on Dell platform Dell want to limit internal Mic boost on all Dell platform. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/561fc5f5eff04b6cbd79ed173cd1c1db@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-25 10:47:42 +02:00
Dave Airlie	4d95a12beb	Merge tag 'drm-xe-fixes-2024-10-24-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Increase invalidation timeout to avoid errors in some hosts (Shuicheng) - Flush worker on timeout (Badal) - Better handling for force wake failure (Shuicheng) - Improve argument check on user fence creation (Nirmoy) - Don't restart parallel queues multiple times on GT reset (Nirmoy) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/trlkoiewtc4x2cyhsxmj3atayyq4zwto4iryea5pvya2ymc3yp@fdx5nhwmiyem	2024-10-25 16:55:39 +10:00
Steven Rostedt	a574e7f80e	fgraph: Change the name of cpuhp state to "fgraph:online" The cpuhp state name given to cpuhp_setup_state() is "fgraph_idle_init" which doesn't really conform to the names that are used for cpu hotplug setups. Instead rename it to "fgraph:online" to be in line with other states. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/20241024222944.473d88c5@rorschach.local.home Suggested-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: `2c02f7375e` ("fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-10-24 23:41:14 -04:00
Li Huafei	bd3734db86	fgraph: Fix missing unlock in register_ftrace_graph() Use guard(mutex)() to acquire and automatically release ftrace_lock, fixing the issue of not unlocking when calling cpuhp_setup_state() fails. Fixes smatch warning: kernel/trace/fgraph.c:1317 register_ftrace_graph() warn: inconsistent returns '&ftrace_lock'. Link: https://lore.kernel.org/20241024155917.1019580-1-lihuafei1@huawei.com Fixes: `2c02f7375e` ("fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202410220121.wxg0olfd-lkp@intel.com/ Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Li Huafei <lihuafei1@huawei.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-10-24 22:26:06 -04:00
Dmitry Torokhov	bffdf9d7e5	Input: edt-ft5x06 - fix regmap leak when probe fails The driver neglects to free the instance of I2C regmap constructed at the beginning of the edt_ft5x06_ts_probe() method when probe fails. Additionally edt_ft5x06_ts_remove() is freeing the regmap too early, before the rest of the device resources that are managed by devm are released. Fix this by installing a custom devm action that will ensure that the regmap is released at the right time during normal teardown as well as in case of probe failure. Note that devm_regmap_init_i2c() could not be used because the driver may replace the original regmap with a regmap specific for M06 devices in the middle of the probe, and using devm_regmap_init_i2c() would result in releasing the M06 regmap too early. Reported-by: Li Zetao <lizetao1@huawei.com> Fixes: `9dfd9708ff` ("Input: edt-ft5x06 - convert to use regmap API") Cc: stable@vger.kernel.org Reviewed-by: Oliver Graute <oliver.graute@kococonnector.com> Link: https://lore.kernel.org/r/ZxL6rIlVlgsAu-Jv@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2024-10-24 18:38:07 -07:00
Dave Airlie	e3e1cfe33f	Merge tag 'drm-misc-fixes-2024-10-24' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: bridge: - aux: Fix assignment of OF node - tc358767: Add missing of_node_put() in error path Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241024124921.GA20475@localhost.localdomain	2024-10-25 11:11:59 +10:00
Linus Torvalds	ae90f6a617	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Daniel Borkmann: - Fix an out-of-bounds read in bpf_link_show_fdinfo for BPF sockmap link file descriptors (Hou Tao) - Fix BPF arm64 JIT's address emission with tag-based KASAN enabled reserving not enough size (Peter Collingbourne) - Fix BPF verifier do_misc_fixups patching for inlining of the bpf_get_branch_snapshot BPF helper (Andrii Nakryiko) - Fix a BPF verifier bug and reject BPF program write attempts into read-only marked BPF maps (Daniel Borkmann) - Fix perf_event_detach_bpf_prog error handling by removing an invalid check which would skip BPF program release (Jiri Olsa) - Fix memory leak when parsing mount options for the BPF filesystem (Hou Tao) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Check validity of link->type in bpf_link_show_fdinfo() bpf: Add the missing BPF_LINK_TYPE invocation for sockmap bpf: fix do_misc_fixups() for bpf_get_branch_snapshot() bpf,perf: Fix perf_event_detach_bpf_prog error handling selftests/bpf: Add test for passing in uninit mtu_len selftests/bpf: Add test for writes to .rodata bpf: Remove MEM_UNINIT from skb/xdp MTU helpers bpf: Fix overloading of MEM_UNINIT's meaning bpf: Add MEM_WRITE attribute bpf: Preserve param->string when parsing mount options bpf, arm64: Fix address emission with tag-based KASAN enabled	2024-10-24 16:53:20 -07:00
Hans de Goede	0107f28f13	ASoC: Intel: bytcr_rt5640: Add DMI quirk for Vexia Edu Atla 10 tablet The Vexia Edu Atla 10 tablet mostly uses the BYTCR tablet defaults, but as happens on more models it is using IN1 instead of IN3 for its internal mic and JD_SRC_JD2_IN4N instead of JD_SRC_JD1_IN4P for jack-detection. Add a DMI quirk for this to fix the internal-mic and jack-detection. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241024211615.79518-2-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-25 00:50:15 +01:00
Hans de Goede	d48696b915	ASoC: Intel: bytcr_rt5640: Add support for non ACPI instantiated codec On some x86 Bay Trail tablets which shipped with Android as factory OS, the DSDT is so broken that the codec needs to be manually instantatiated by the special x86-android-tablets.ko "fixup" driver for cases like this. This means that the codec-dev cannot be retrieved through its ACPI fwnode, add support to the bytcr_rt5640 machine driver for such manually instantiated rt5640 i2c_clients. An example of a tablet which needs this is the Vexia EDU ATLA 10 tablet, which has been distributed to schools in the Spanish Andalucía region. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241024211615.79518-1-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-25 00:50:14 +01:00
Hans de Goede	032532f91a	ASoC: codecs: rt5640: Always disable IRQs from rt5640_cancel_work() Disable IRQs from rt5640_cancel_work(), this fixes a crash caused by the IRQ never getting freed when the driver is unbound from the i2c_client with jack-detection active: [ 193.138780] rt5640 i2c-rt5640: ASoC: unknown pin LDO2 [ 193.138830] rt5640 i2c-rt5640: ASoC: unknown pin MICBIAS1 [ 193.671218] BUG: kernel NULL pointer dereference, address: 0000000000000078 [ 193.671239] #PF: supervisor read access in kernel mode [ 193.671248] #PF: error_code(0x0000) - not-present page ... [ 193.671531] ? asm_exc_page_fault+0x22/0x30 [ 193.671551] ? rt5640_jack_inserted+0x10/0x80 [snd_soc_rt5640] [ 193.671574] rt5640_detect_headset+0x93/0x130 [snd_soc_rt5640] [ 193.671596] rt5640_jack_work+0x93/0x355 [snd_soc_rt5640] Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241024215612.92147-1-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-25 00:50:13 +01:00
Linus Torvalds	d44cd82264	Merge tag 'net-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfiler, xfrm and bluetooth. Oddly this includes a fix for a posix clock regression; in our previous PR we included a change there as a pre-requisite for networking one. That fix proved to be buggy and requires the follow-up included here. Thomas suggested we should send it, given we sent the buggy patch. Current release - regressions: - posix-clock: Fix unbalanced locking in pc_clock_settime() - netfilter: fix typo causing some targets not to load on IPv6 Current release - new code bugs: - xfrm: policy: remove last remnants of pernet inexact list Previous releases - regressions: - core: fix races in netdev_tx_sent_queue()/dev_watchdog() - bluetooth: fix UAF on sco_sock_timeout - eth: hv_netvsc: fix VF namespace also in synthetic NIC NETDEV_REGISTER event - eth: usbnet: fix name regression - eth: be2net: fix potential memory leak in be_xmit() - eth: plip: fix transmit path breakage Previous releases - always broken: - sched: deny mismatched skip_sw/skip_hw flags for actions created by classifiers - netfilter: bpf: must hold reference on net namespace - eth: virtio_net: fix integer overflow in stats - eth: bnxt_en: replace ptp_lock with irqsave variant - eth: octeon_ep: add SKB allocation failures handling in __octep_oq_process_rx() Misc: - MAINTAINERS: add Simon as an official reviewer" * tag 'net-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits) net: dsa: mv88e6xxx: support 4000ps cycle counter period net: dsa: mv88e6xxx: read cycle counter period from hardware net: dsa: mv88e6xxx: group cycle counter coefficients net: usb: qmi_wwan: add Fibocom FG132 0x0112 composition hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event net: dsa: microchip: disable EEE for KSZ879x/KSZ877x/KSZ876x Bluetooth: ISO: Fix UAF on iso_sock_timeout Bluetooth: SCO: Fix UAF on sco_sock_timeout Bluetooth: hci_core: Disable works on hci_unregister_dev posix-clock: posix-clock: Fix unbalanced locking in pc_clock_settime() r8169: avoid unsolicited interrupts net: sched: use RCU read-side critical section in taprio_dump() net: sched: fix use-after-free in taprio_change() net/sched: act_api: deny mismatched skip_sw/skip_hw flags for actions created by classifiers net: usb: usbnet: fix name regression mlxsw: spectrum_router: fix xa_store() error checking virtio_net: fix integer overflow in stats net: fix races in netdev_tx_sent_queue()/dev_watchdog() net: wwan: fix global oob in wwan_rtnl_policy netfilter: xtables: fix typo causing some targets not to load on IPv6 ...	2024-10-24 16:43:50 -07:00
Linus Torvalds	c9a50b9090	Merge tag 'hid-for-linus-20241024' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: "Device-specific functionality quirks for Thinkpad X1 Gen3, Logitech Bolt and some Goodix touchpads (Bartłomiej Maryńczak, Hans de Goede and Kenneth Albanowski)" * tag 'hid-for-linus-20241024' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: lenovo: Add support for Thinkpad X1 Tablet Gen 3 keyboard HID: multitouch: Add quirk for Logitech Bolt receiver w/ Casa touchpad HID: i2c-hid: Delayed i2c resume wakeup for 0x0d42 Goodix touchpad	2024-10-24 16:31:58 -07:00
Dave Airlie	2ba1f81ec7	Merge tag 'drm-intel-fixes-2024-10-24' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix DRM_I915_GVT_KVMGT dependencies in Kconfig Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZxniUlDg59RxOO-6@jlahtine-mobl.ger.corp.intel.com	2024-10-25 07:43:41 +10:00
Jeongjun Park	5c41f75d1b	bcachefs: fix shift oob in alloc_lru_idx_fragmentation The size of a.data_type is set abnormally large, causing shift-out-of-bounds. To fix this, we need to add validation on a.data_type in alloc_lru_idx_fragmentation(). Reported-by: syzbot+7f45fa9805c40db3f108@syzkaller.appspotmail.com Fixes: `260af1562e` ("bcachefs: Kill alloc_v4.fragmentation_lru") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-24 17:41:43 -04:00
Gianfranco Trad	2045fc4295	bcachefs: Fix invalid shift in validate_sb_layout() Add check on layout->sb_max_size_bits against BCH_SB_LAYOUT_SIZE_BITS_MAX to prevent UBSAN shift-out-of-bounds in validate_sb_layout(). Reported-by: syzbot+089fad5a3a5e77825426@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=089fad5a3a5e77825426 Fixes: `03ef80b469` ("bcachefs: Ignore unknown mount options") Tested-by: syzbot+089fad5a3a5e77825426@syzkaller.appspotmail.com Signed-off-by: Gianfranco Trad <gianf.trad@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-24 17:41:43 -04:00
Dominique Martinet	be2ca38253	Revert "fs/9p: simplify iget to remove unnecessary paths" This reverts commit `724a08450f`. This code simplification introduced significant regressions on servers that do not remap inode numbers when exporting multiple underlying filesystems with colliding inodes, as can be illustrated with simple tmpfs exports in qemu with remapping disabled: ``` # host side cd /tmp/linux-test mkdir m1 m2 mount -t tmpfs tmpfs m1 mount -t tmpfs tmpfs m2 mkdir m1/dir m2/dir echo foo > m1/dir/foo echo bar > m2/dir/bar # guest side # started with -virtfs local,path=/tmp/linux-test,mount_tag=tmp,security_model=mapped-file mount -t 9p -o trans=virtio,debug=1 tmp /mnt/t ls /mnt/t/m1/dir # foo ls /mnt/t/m2/dir # bar (works ok if directry isn't open) # cd to keep first dir's inode alive cd /mnt/t/m1/dir ls /mnt/t/m2/dir # foo (should be bar) ``` Other examples can be crafted with regular files with fscache enabled, in which case I/Os just happen to the wrong file leading to corruptions, or guest failing to boot with: \| VFS: Lookup of 'com.android.runtime' in 9p 9p would have caused loop In theory, we'd want the servers to be smart enough and ensure they never send us two different files with the same 'qid.path', but while qemu has an option to remap that is recommended (and qemu prints a warning if this case happens), there are many other servers which do not (kvmtool, nfs-ganesha, probably diod...), we should at least ensure we don't cause regressions on this: - assume servers can't be trusted and operations that should get a 'new' inode properly do so. commit `d05dcfdf5e` (" fs/9p: mitigate inode collisions") attempted to do this, but v9fs_fid_iget_dotl() was not called so some higher level of caching got in the way; this needs to be fixed properly before we can re-apply the patches. - if we ever want to really simplify this code, we will need to add some negotiation with the server at mount time where the server could claim they handle this properly, at which point we could optimize this out. (but that might not be needed at all if we properly handle the 'new' check?) Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/all/20240408141436.GA17022@redhat.com/ Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-4-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:09 +09:00
Dominique Martinet	26f8dd2dde	Revert "fs/9p: fix uaf in in v9fs_stat2inode_dotl" This reverts commit `11763a8598`. This is a requirement to revert commit `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths"), see that revert for details. Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-3-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:09 +09:00
Dominique Martinet	fedd06210b	Revert "fs/9p: remove redundant pointer v9ses" This reverts commit `10211b4a23`. This is a requirement to revert commit `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths"), see that revert for details. Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-2-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:09 +09:00
Dominique Martinet	f69999b5f9	Revert " fs/9p: mitigate inode collisions" This reverts commit `d05dcfdf5e`. This is a requirement to revert commit `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths"), see that revert for details. Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-1-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:08 +09:00
Dave Airlie	19c6890c3d	Merge tag 'amd-drm-fixes-6.12-2024-10-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.12-2024-10-23: amdgpu: - ACPI method handling fixes - SMU 14.x fixes - Display idle optimization fix - DP link layer compliance fix - SDMA 7.x fix - PSR-SU fix - SWSMU fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023180208.452636-1-alexander.deucher@amd.com	2024-10-25 07:17:49 +10:00
Linus Torvalds	3964f82a4d	Merge tag 'loongarch-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Get correct cores_per_package for SMT systems, enable IRQ if do_ale() triggered in irq-enabled context, and fix some bugs about vDSO, memory managenent, hrtimer in KVM, etc" * tag 'loongarch-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Mark hrtimer to expire in hard interrupt context LoongArch: Make KASAN usable for variable cpu_vabits LoongArch: Set initial pte entry with PAGE_GLOBAL for kernel space LoongArch: Don't crash in stack_top() for tasks without vDSO LoongArch: Set correct size for vDSO code mapping LoongArch: Enable IRQ if do_ale() triggered in irq-enabled context LoongArch: Get correct cores_per_package for SMT systems LoongArch: Use "Exception return address" to comment ERA	2024-10-24 14:17:34 -07:00
Li Zhijian	cce3cd6477	cxl/core: Return error when cxl_endpoint_gather_bandwidth() handles a non-PCI device The function cxl_endpoint_gather_bandwidth() invokes pci_bus_read/write_XXX(), however, not all CXL devices are presently implemented via PCI. It is recognized that the cxl_test has realized a CXL device using a platform device. Calling pci_bus_read/write_XXX() in cxl_test will cause kernel panic: platform cxl_host_bridge.3: host supports CXL (restricted) Oops: general protection fault, probably for non-canonical address 0x3ef17856fcae4fbd: 0000 [#1] PREEMPT SMP PTI Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? die_addr+0x38/0x60 ? exc_general_protection+0x1f5/0x4b0 ? asm_exc_general_protection+0x22/0x30 ? pci_bus_read_config_word+0x1c/0x60 pcie_capability_read_word+0x93/0xb0 pcie_link_speed_mbps+0x18/0x50 cxl_pci_get_bandwidth+0x18/0x60 [cxl_core] cxl_endpoint_gather_bandwidth.constprop.0+0xf4/0x230 [cxl_core] ? xas_store+0x54/0x660 ? preempt_count_add+0x69/0xa0 ? _raw_spin_lock+0x13/0x40 ? __kmalloc_cache_noprof+0xe7/0x270 cxl_region_shared_upstream_bandwidth_update+0x9c/0x790 [cxl_core] cxl_region_attach+0x520/0x7e0 [cxl_core] store_targetN+0xf2/0x120 [cxl_core] kernfs_fop_write_iter+0x13a/0x1f0 vfs_write+0x23b/0x410 ksys_write+0x53/0xd0 do_syscall_64+0x62/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e And Ying also reported a KASAN error with similar calltrace. Reported-by: Huang, Ying <ying.huang@intel.com> Closes: http://lore.kernel.org/87y12w9vp5.fsf@yhuang6-desk2.ccr.corp.intel.com Fixes: `a5ab0de0eb` ("cxl: Calculate region bandwidth of targets with shared upstream link") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Huang, Ying <ying.huang@intel.com> Link: https://patch.msgid.link/20241022030054.258942-1-lizhijian@fujitsu.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>	2024-10-24 16:05:52 -05:00
Linus Torvalds	c2cd8e4592	Merge tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - objpool: Fix choosing allocation for percpu slots Fixes to allocate objpool's percpu slots correctly according to the GFP flag. It checks whether "any bit" in GFP_ATOMIC is set to choose the vmalloc source, but it should check "all bits" in GFP_ATOMIC flag is set, because GFP_ATOMIC is a combined flag. - tracing/probes: Fix MAX_TRACE_ARGS limit handling If more than MAX_TRACE_ARGS are passed for creating a probe event, the entries over MAX_TRACE_ARG in trace_arg array are not initialized. Thus if the kernel accesses those entries, it crashes. This rejects creating event if the number of arguments is over MAX_TRACE_ARGS. - tracing: Consider the NUL character when validating the event length A strlen() is used when parsing the event name, and the original code does not consider the terminal null byte. Thus it can pass the name one byte longer than the buffer. This fixes to check it correctly. * tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Consider the NULL character when validating the event length tracing/probes: Fix MAX_TRACE_ARGS limit handling objpool: fix choosing allocation for percpu slots	2024-10-24 13:51:58 -07:00
Linus Torvalds	4e46774408	Merge tag 'for-6.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - mount option fixes: - fix handling of compression mount options on remount - reject rw remount in case there are options that don't work in read-write mode (like rescue options) - fix zone accounting of unusable space - fix in-memory corruption when merging extent maps - fix delalloc range locking for sector < page - use more convenient default value of drop subtree threshold, clean more subvolumes without the fallback to marking quotas inconsistent - fix smatch warning about incorrect value passed to ERR_PTR * tag 'for-6.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix passing 0 to ERR_PTR in btrfs_search_dir_index_item() btrfs: reject ro->rw reconfiguration if there are hard ro requirements btrfs: fix read corruption due to race with extent map merging btrfs: fix the delalloc range locking if sector size < page size btrfs: qgroup: set a more sane default value for subtree drop threshold btrfs: clear force-compress on remount when compress mount option is given btrfs: zoned: fix zone unusable accounting for freed reserved extent	2024-10-24 13:04:15 -07:00
Linus Torvalds	6cc65abee8	Merge tag 'jfs-6.12-rc5' of github.com:kleikamp/linux-shaggy Pull jfs fix from David Kleikamp: "Fix a regression introduced in 6.12-rc1" * tag 'jfs-6.12-rc5' of github.com:kleikamp/linux-shaggy: jfs: Fix sanity check in dbMount	2024-10-24 12:47:01 -07:00
Linus Torvalds	c1e822754c	Merge tag 'bcachefs-2024-10-22' of https://github.com/koverstreet/bcachefs Pull bcachefs fixes from Kent Overstreet: "Lots of hotfixes: - transaction restart injection has been shaking out a few things - fix a data corruption in the buffered write path on -ENOSPC, found by xfstests generic/299 - Some small show_options fixes - Repair mismatches in inode hash type, seed: different snapshot versions of an inode must have the same hash/type seed, used for directory entries and xattrs. We were checking the hash seed, but not the type, and a user contributed a filesystem where the hash type on one inode had somehow been flipped; these fixes allow his filesystem to repair. Additionally, the hash type flip made some directory entries invisible, which were then recreated by userspace; so the hash check code now checks for duplicate non dangling dirents, and renames one of them if necessary. - Don't use wait_event_interruptible() in recovery: this fixes some filesystems failing to mount with -ERESTARTSYS - Workaround for kvmalloc not supporting > INT_MAX allocations, causing an -ENOMEM when allocating the sorted array of journal keys: this allows a 75 TB filesystem to mount - Make sure bch_inode_unpacked.bi_snapshot is set in the old inode compat path: this alllows Marcin's filesystem (in use since before 6.7) to repair and mount" * tag 'bcachefs-2024-10-22' of https://github.com/koverstreet/bcachefs: (26 commits) bcachefs: Set bch_inode_unpacked.bi_snapshot in old inode path bcachefs: Mark more errors as AUTOFIX bcachefs: Workaround for kvmalloc() not supporting > INT_MAX allocations bcachefs: Don't use wait_event_interruptible() in recovery bcachefs: Fix __bch2_fsck_err() warning bcachefs: fsck: Improve hash_check_key() bcachefs: bch2_hash_set_or_get_in_snapshot() bcachefs: Repair mismatches in inode hash seed, type bcachefs: Add hash seed, type to inode_to_text() bcachefs: INODE_STR_HASH() for bch_inode_unpacked bcachefs: Run in-kernel offline fsck without ratelimit errors bcachefs: skip mount option handle for empty string. bcachefs: fix incorrect show_options results bcachefs: Fix data corruption on -ENOSPC in buffered write path bcachefs: bch2_folio_reservation_get_partial() is now better behaved bcachefs: fix disk reservation accounting in bch2_folio_reservation_get() bcachefS: ec: fix data type on stripe deletion bcachefs: Don't use commit_do() unnecessarily bcachefs: handle restarts in bch2_bucket_io_time_reset() bcachefs: fix restart handling in __bch2_resume_logged_op_finsert() ...	2024-10-24 12:38:59 -07:00
Dominique Martinet	f009e946c1	Revert "9p: Enable multipage folios" This reverts commit `1325e4a91a`. using multipage folios apparently break some madvise operations like MADV_PAGEOUT which do not reliably unload the specified page anymore, Revert the patch until that is figured out. Reported-by: Andrii Nakryiko <andrii@kernel.org> Fixes: `1325e4a91a` ("9p: Enable multipage folios") Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-10-24 11:24:05 -07:00
Alexandre Ghiti	bf40167d54	riscv: vdso: Prevent the compiler from inserting calls to memset() The compiler is smart enough to insert a call to memset() in riscv_vdso_get_cpus(), which generates a dynamic relocation. So prevent this by using -fno-builtin option. Fixes: `e2c0cdfba7` ("RISC-V: User-facing API") Cc: stable@vger.kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20241016083625.136311-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-24 10:52:52 -07:00
Jinjie Ruan	7bd4923940	iio: dac: Kconfig: Fix build error for ltc2664 If REGMAP_SPI is n and LTC2664 is y, the following build error occurs: riscv64-unknown-linux-gnu-ld: drivers/iio/dac/ltc2664.o: in function `ltc2664_probe': ltc2664.c:(.text+0x714): undefined reference to `__devm_regmap_init_spi' Select REGMAP_SPI instead of REGMAP for LTC2664 to fix it. Fixes: `4cc2fc445d` ("iio: dac: ltc2664: Add driver for LTC2664 and LTC2672") Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20241024015553.1111253-1-ruanjinjie@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-24 18:46:04 +01:00
Nirmoy Das	cdc21021f0	drm/xe: Don't restart parallel queues multiple times on GT reset In case of parallel submissions multiple GuC id will point to the same exec queue and on GT reset such exec queues will get restarted multiple times which is not desirable. v2: don't use exec_queue_enabled() which could race, do the same for xe_guc_submit_stop (Matt B) Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2295 Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022103555.731557-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit `c8b0acd6d8`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-24 12:42:52 -05:00
Nirmoy Das	9c1813b325	drm/xe/ufence: Prefetch ufence addr to catch bogus address access_ok() only checks for addr overflow so also try to read the addr to catch invalid addr sent from userspace. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630 Cc: Francois Dugast <francois.dugast@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016082304.66009-2-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit `9408c45084`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-24 12:42:52 -05:00
Shuicheng Lin	69418db678	drm/xe: Handle unreliable MMIO reads during forcewake In some cases, when the driver attempts to read an MMIO register, the hardware may return 0xFFFFFFFF. The current force wake path code treats this as a valid response, as it only checks the BIT. However, 0xFFFFFFFF should be considered an invalid value, indicating a potential issue. To address this, we should add a log entry to highlight this condition and return failure. The force wake failure log level is changed from notice to err to match the failure return value. v2 (Matt Brost): - set ret value (-EIO) to kick the error to upper layers v3 (Rodrigo): - add commit message for the log level promotion from notice to err v4: - update reviewed info Suggested-by: Alex Zuo <alex.zuo@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by: Badal Nilawar <badal.nilawar@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241017221547.1564029-1-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `a9fbeabe72`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-24 12:42:52 -05:00
Badal Nilawar	22ef43c786	drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout In case if g2h worker doesn't get opportunity to within specified timeout delay then flush the g2h worker explicitly. v2: - Describe change in the comment and add TODO (Matt B/John H) - Add xe_gt_warn on fence done after G2H flush (John H) v3: - Updated the comment with root cause - Clean up xe_gt_warn message (John H) Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1620 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2902 Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241017111410.2553784-2-badal.nilawar@intel.com (cherry picked from commit `e515272338`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-24 12:42:52 -05:00
Shuicheng Lin	c8fb95e7a5	drm/xe: Enlarge the invalidation timeout from 150 to 500 There are error messages like below that are occurring during stress testing: "[ 31.004009] xe 0000:03:00.0: [drm] ERROR GT0: Global invalidation timeout". Previously it was hitting this 3 out of 1000 executions of warm reboot. After raising it to 500, 1000 warm reboot executions passed and it didn't fail. Due to the way xe_mmio_wait32() is implemented, the timeout is able to expire early when the register matches the expected value due to the wait increments starting small. So, the larger timeout value should have no effect during normal use cases. v2 (Jonathan): - rework the commit message v3 (Lucas): - add conclusive message for the fail rate and test case v4: - add suggested-by Suggested-by: Jia Yao <jia.yao@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Tested-by: Zongyao Bai <zongyao.bai@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015161207.1373401-1-shuicheng.lin@intel.com (cherry picked from commit `2eb460ab9f`) [ Fix conflict with gt->mmio ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-24 12:42:52 -05:00
Zicheng Qu	efa353ae1b	iio: adc: ad7124: fix division by zero in ad7124_set_channel_odr() In the ad7124_write_raw() function, parameter val can potentially be zero. This may lead to a division by zero when DIV_ROUND_CLOSEST() is called within ad7124_set_channel_odr(). The ad7124_write_raw() function is invoked through the sequence: iio_write_channel_raw() -> iio_write_channel_attribute() -> iio_channel_write(), with no checks in place to ensure val is non-zero. Cc: stable@vger.kernel.org Fixes: `7b8d045e49` ("iio: adc: ad7124: allow more than 8 channels") Signed-off-by: Zicheng Qu <quzicheng@huawei.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20241022134330.574601-1-quzicheng@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-24 18:30:47 +01:00
Zicheng Qu	6bd301819f	staging: iio: frequency: ad9832: fix division by zero in ad9832_calc_freqreg() In the ad9832_write_frequency() function, clk_get_rate() might return 0. This can lead to a division by zero when calling ad9832_calc_freqreg(). The check if (fout > (clk_get_rate(st->mclk) / 2)) does not protect against the case when fout is 0. The ad9832_write_frequency() function is called from ad9832_write(), and fout is derived from a text buffer, which can contain any value. Link: https://lore.kernel.org/all/2024100904-CVE-2024-47663-9bdc@gregkh/ Fixes: `ea707584ba` ("Staging: IIO: DDS: AD9832 / AD9835 driver") Cc: stable@vger.kernel.org Signed-off-by: Zicheng Qu <quzicheng@huawei.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/20241022134354.574614-1-quzicheng@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-24 18:30:47 +01:00
Julien Stephan	795114e849	docs: iio: ad7380: fix supply for ad7380-4 ad7380-4 is the only device from ad738x family that doesn't have an internal reference. Moreover it's external reference is called REFIN in the datasheet while all other use REFIO as an optional external reference. Update documentation to highlight this. Fixes: `3e82dfc82f` ("docs: iio: new docs for ad7380 driver") Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-5-f0cefe1b7fa6@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-24 18:30:47 +01:00
Julien Stephan	05f9c67179	iio: adc: ad7380: fix supplies for ad7380-4 ad7380-4 is the only device in the family that does not have an internal reference. It uses "refin" as a required external reference. All other devices in the family use "refio"" as an optional external reference. Fixes: `737413da87` ("iio: adc: ad7380: add support for ad738x-4 4 channels variants") Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-4-f0cefe1b7fa6@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-24 18:30:47 +01:00
Julien Stephan	7ddbc27287	iio: adc: ad7380: add missing supplies vcc and vlogic are required but are not retrieved and enabled in the probe. Add them. In order to prepare support for additional parts requiring different supplies, add vcc and vlogic to the platform specific structures Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-3-f0cefe1b7fa6@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-24 18:30:47 +01:00
Julien Stephan	2ac6b2e823	iio: adc: ad7380: use devm_regulator_get_enable_read_voltage() Use devm_regulator_get_enable_read_voltage() to simplify the code. Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-2-f0cefe1b7fa6@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-24 18:30:47 +01:00
Julien Stephan	fbe5956e88	dt-bindings: iio: adc: ad7380: fix ad7380-4 reference supply ad7380-4 is the only device from ad738x family that doesn't have an internal reference. Moreover its external reference is called REFIN in the datasheet while all other use REFIO as an optional external reference. If refio-supply is omitted the internal reference is used. Fix the binding by adding refin-supply and makes it required for ad7380-4 only. Fixes: `1a291cc8ee` ("dt-bindings: iio: adc: ad7380: add support for ad738x-4 4 channels variants") Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-1-f0cefe1b7fa6@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-24 18:30:47 +01:00
Javier Carrasco	63dd163cd6	iio: light: veml6030: fix microlux value calculation The raw value conversion to obtain a measurement in lux as INT_PLUS_MICRO does not calculate the decimal part properly to display it as micro (in this case microlux). It only calculates the module to obtain the decimal part from a resolution that is 10000 times the provided in the datasheet (0.5376 lux/cnt for the veml6030). The resulting value must still be multiplied by 100 to make it micro. This bug was introduced with the original implementation of the driver. Only the illuminance channel is fixed becuase the scale is non sensical for the intensity channels anyway. Cc: stable@vger.kernel.org Fixes: `7b779f573c` ("iio: light: add driver for veml6030 ambient light sensor") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241016-veml6030-fix-processed-micro-v1-1-4a5644796437@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-24 18:30:03 +01:00
Andrii Nakryiko	d5fb316e2a	Merge branch 'add-the-missing-bpf_link_type-invocation-for-sockmap' Hou Tao says: ==================== Add the missing BPF_LINK_TYPE invocation for sockmap From: Hou Tao <houtao1@huawei.com> Hi, The tiny patch set fixes the out-of-bound read problem when reading the fdinfo of sock map link fd. And in order to spot such omission early for the newly-added link type in the future, it also checks the validity of the link->type and adds a WARN_ONCE() for missed invocation. Please see individual patches for more details. And comments are always welcome. v3: * patch #2: check and warn the validity of link->type instead of adding a static assertion for bpf_link_type_strs array. v2: http://lore.kernel.org/bpf/d49fa2f4-f743-c763-7579-c3cab4dd88cb@huaweicloud.com ==================== Link: https://lore.kernel.org/r/20241024013558.1135167-1-houtao@huaweicloud.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-10-24 10:17:13 -07:00
Hou Tao	8421d4c876	bpf: Check validity of link->type in bpf_link_show_fdinfo() If a newly-added link type doesn't invoke BPF_LINK_TYPE(), accessing bpf_link_type_strs[link->type] may result in an out-of-bounds access. To spot such missed invocations early in the future, checking the validity of link->type in bpf_link_show_fdinfo() and emitting a warning when such invocations are missed. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241024013558.1135167-3-houtao@huaweicloud.com	2024-10-24 10:17:12 -07:00
Hou Tao	c2f803052b	bpf: Add the missing BPF_LINK_TYPE invocation for sockmap There is an out-of-bounds read in bpf_link_show_fdinfo() for the sockmap link fd. Fix it by adding the missing BPF_LINK_TYPE invocation for sockmap link Also add comments for bpf_link_type to prevent missing updates in the future. Fixes: `699c23f02c` ("bpf: Add bpf_link support for sk_msg and sk_skb progs") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241024013558.1135167-2-houtao@huaweicloud.com	2024-10-24 10:17:12 -07:00
Vishal Chourasia	4f7f417042	sched_ext: Fix function pointer type mismatches in BPF selftests Fix incompatible function pointer type warnings in sched_ext BPF selftests by explicitly casting the function pointers when initializing struct_ops. This addresses multiple -Wincompatible-function-pointer-types warnings from the clang compiler where function signatures didn't match exactly. The void * cast ensures the compiler accepts the function pointer assignment despite minor type differences in the parameters. Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-24 06:56:17 -10:00
Dan Carpenter	a85df8c7b5	drm/tegra: Fix NULL vs IS_ERR() check in probe() The iommu_paging_domain_alloc() function doesn't return NULL pointers, it returns error pointers. Update the check to match. Fixes: `45c690aea8` ("drm/tegra: Use iommu_paging_domain_alloc()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/ba31cf3a-af3d-4ff1-87a8-f05aaf8c780b@stanley.mountain	2024-10-24 18:50:04 +02:00
liwei	d93df29bda	cpufreq: CPPC: fix perf_to_khz/khz_to_perf conversion exception When the nominal_freq recorded by the kernel is equal to the lowest_freq, and the frequency adjustment operation is triggered externally, there is a logic error in cppc_perf_to_khz()/cppc_khz_to_perf(), resulting in perf and khz conversion errors. Fix this by adding a branch processing logic when nominal_freq is equal to lowest_freq. Fixes: `ec1c7ad476` ("cpufreq: CPPC: Fix performance/frequency conversion") Signed-off-by: liwei <liwei728@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/20241024022952.2627694-1-liwei728@huawei.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-10-24 17:35:13 +02:00
Dan Carpenter	3d1c651272	ACPI: PRM: Clean up guid type in struct prm_handler_info Clang 19 prints a warning when we pass &th->guid to efi_pa_va_lookup(): drivers/acpi/prmt.c:156:29: error: passing 1-byte aligned argument to 4-byte aligned parameter 1 of 'efi_pa_va_lookup' may result in an unaligned pointer access [-Werror,-Walign-mismatch] 156 \| (void *)efi_pa_va_lookup(&th->guid, handler_info->handler_address); \| ^ The problem is that efi_pa_va_lookup() takes a efi_guid_t and &th->guid is a regular guid_t. The difference between the two types is the alignment. efi_guid_t is a typedef. typedef guid_t efi_guid_t __aligned(__alignof__(u32)); It's possible that this a bug in Clang 19. Even though the alignment of &th->guid is not explicitly specified, it will still end up being aligned at 4 or 8 bytes. Anyway, as Ard points out, it's cleaner to change guid to efi_guid_t type and that also makes the warning go away. Fixes: `088984c8d5` ("ACPI: PRM: Find EFI_MEMORY_RUNTIME block for PRM handler and context") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://patch.msgid.link/3777d71b-9e19-45f4-be4e-17bf4fa7a834@stanley.mountain [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-10-24 17:31:41 +02:00
Arnaldo Carvalho de Melo	08a7d25255	tools arch x86: Sync the msr-index.h copy with the kernel sources To pick up the changes from these csets: `dc1e67f70f` ("KVM VMX: Move MSR_IA32_VMX_MISC bit defines to asm/vmx.h") `d7bfc9ffd5` ("KVM: VMX: Move MSR_IA32_VMX_BASIC bit defines to asm/vmx.h") `beb2e44604` ("x86/cpu: KVM: Move macro to encode PAT value to common header") `e7e80b66fb` ("x86/cpu: KVM: Add common defines for architectural memory types (PAT, MTRRs, etc.)") That cause no changes to tooling: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after $ To see how this works take a look at this previous update: https://git.kernel.org/torvalds/c/174372668933ede5 `1743726689` ("tools arch x86: Sync the msr-index.h copy with the kernel sources to pick IA32_MKTME_KEYID_PARTITIONING") Just silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Please see tools/include/uapi/README for further details. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Xin Li <xin3.li@intel.com> Link: https://lore.kernel.org/lkml/ZxpLSBzGin3vjs3b@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-24 10:27:59 -03:00
David Howells	e65a0dc1ca	iov_iter: Fix iov_iter_get_pages*() for folio_queue p9_get_mapped_pages() uses iov_iter_get_pages_alloc2() to extract pages from an iterator when performing a zero-copy request and under some circumstances, this crashes with odd page errors[1], for example, I see: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbcf0 flags: 0x2000000000000000(zone=1) ... page dumped because: VM_BUG_ON_FOLIO(((unsigned int) folio_ref_count(folio) + 127u <= 127u)) ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1444! This is because, unlike in iov_iter_extract_folioq_pages(), the iter_folioq_get_pages() helper function doesn't skip the current folio when iov_offset points to the end of it, but rather extracts the next page beyond the end of the folio and adds it to the list. Reading will then clobber the contents of this page, leading to system corruption, and if the page is not in use, put_page() may try to clean up the unused page. This can be worked around by copying the iterator before each extraction[2] and using iov_iter_advance() on the original as the advance function steps over the page we're at the end of. Fix this by skipping the page extraction if we're at the end of the folio. This was reproduced in the ktest environment[3] by forcing 9p to use the fscache caching mode and then reading a file through 9p. Fixes: `db0aa2e956` ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios") Reported-by: Antony Antony <antony@phenome.org> Closes: https://lore.kernel.org/r/ZxFQw4OI9rrc7UYc@Antony2201.local/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/ZxFEi1Tod43pD6JC@moon.secunet.de/ [1] Link: https://lore.kernel.org/r/2299159.1729543103@warthog.procyon.org.uk/ [2] Link: https://github.com/koverstreet/ktest.git [3] Tested-by: Antony Antony <antony.antony@secunet.com> Link: https://lore.kernel.org/r/3327438.1729678025@warthog.procyon.org.uk Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-24 13:50:27 +02:00
David Howells	247d65fb12	afs: Fix missing subdir edit when renamed between parent dirs When rename moves an AFS subdirectory between parent directories, the subdir also needs a bit of editing: the ".." entry needs updating to point to the new parent (though I don't make use of the info) and the DV needs incrementing by 1 to reflect the change of content. The server also sends a callback break notification on the subdirectory if we have one, but we can take care of recovering the promise next time we access the subdir. This can be triggered by something like: mount -t afs %example.com:xfstest.test20 /xfstest.test/ mkdir /xfstest.test/{aaa,bbb,aaa/ccc} touch /xfstest.test/bbb/ccc/d mv /xfstest.test/{aaa/ccc,bbb/ccc} touch /xfstest.test/bbb/ccc/e When the pathwalk for the second touch hits "ccc", kafs spots that the DV is incorrect and downloads it again (so the fix is not critical). Fix this, if the rename target is a directory and the old and new parents are different, by: (1) Incrementing the DV number of the target locally. (2) Editing the ".." entry in the target to refer to its new parent's vnode ID and uniquifier. Link: https://lore.kernel.org/r/3340431.1729680010@warthog.procyon.org.uk Fixes: `63a4681ff3` ("afs: Locally edit directory data for mkdir/create/unlink/...") cc: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-24 13:50:27 +02:00
Hongbo Li	6b51b9f65c	doc: correcting the debug path for cachefiles The original debug path is under "/sys/modules", that's wrong. The real path in kernel is "/sys/module". So we can correct it. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Link: https://lore.kernel.org/r/20241022013812.2880883-1-lihongbo22@huawei.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-24 13:50:27 +02:00
Paolo Abeni	9efc44fb2d	Merge branch 'net-dsa-mv88e6xxx-fix-mv88e6393x-phc-frequency-on-internal-clock' Shenghao Yang says: ==================== net: dsa: mv88e6xxx: fix MV88E6393X PHC frequency on internal clock The MV88E6393X family of switches can additionally run their cycle counters using a 250MHz internal clock instead of the usual 125MHz external clock [1]. The driver currently assumes all designs utilize that external clock, but MikroTik's RB5009 uses the internal source - causing the PHC to be seen running at 2x real time in userspace, making synchronization with ptp4l impossible. This series adds support for reading off the cycle counter frequency known to the hardware in the TAI_CLOCK_PERIOD register and picking an appropriate set of scaling coefficients instead of using a fixed set for each switch family. Patch 1 groups those cycle counter coefficients into a new structure to make it easier to pass them around. Patch 2 modifies PTP initialization to probe TAI_CLOCK_PERIOD and use an appropriate set of coefficients. Patch 3 adds support for 4000ps cycle counter periods. Changes since v2 [2]: - Patch 1: "net: dsa: mv88e6xxx: group cycle counter coefficients" - Moved declaration of mv88e6xxx_cc_coeffs to avoid moving that in Patch 2. - Patch 2: "net: dsa: mv88e6xxx: read cycle counter period from hardware" - Removed move of mv88e6xxx_cc_coeffs declaration. - Patch 3: "net: dsa: mv88e6xxx: support 4000ps cycle counter periods" - No change. [1] https://lore.kernel.org/netdev/d6622575-bf1b-445a-b08f-2739e3642aae@lunn.ch/ [2] https://lore.kernel.org/netdev/20241006145951.719162-1-me@shenghaoyang.info/ ==================== Link: https://patch.msgid.link/20241020063833.5425-1-me@shenghaoyang.info Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-24 12:57:48 +02:00
Shenghao Yang	3e65ede526	net: dsa: mv88e6xxx: support 4000ps cycle counter period The MV88E6393X family of devices can run its cycle counter off an internal 250MHz clock instead of an external 125MHz one. Add support for this cycle counter period by adding another set of coefficients and lowering the periodic cycle counter read interval to compensate for faster overflows at the increased frequency. Otherwise, the PHC runs at 2x real time in userspace and cannot be synchronized. Fixes: `de776d0d31` ("net: dsa: mv88e6xxx: add support for mv88e6393x family") Signed-off-by: Shenghao Yang <me@shenghaoyang.info> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-24 12:57:46 +02:00
Shenghao Yang	7e3c18097a	net: dsa: mv88e6xxx: read cycle counter period from hardware Instead of relying on a fixed mapping of hardware family to cycle counter frequency, pull this information from the MV88E6XXX_TAI_CLOCK_PERIOD register. This lets us support switches whose cycle counter frequencies depend on board design. Fixes: `de776d0d31` ("net: dsa: mv88e6xxx: add support for mv88e6393x family") Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Shenghao Yang <me@shenghaoyang.info> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-24 12:57:46 +02:00
Shenghao Yang	67af86afff	net: dsa: mv88e6xxx: group cycle counter coefficients Instead of having them as individual fields in ptp_ops, wrap the coefficients in a separate struct so they can be referenced together. Fixes: `de776d0d31` ("net: dsa: mv88e6xxx: add support for mv88e6393x family") Signed-off-by: Shenghao Yang <me@shenghaoyang.info> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-24 12:57:46 +02:00
Reinhard Speyerer	64761c980c	net: usb: qmi_wwan: add Fibocom FG132 0x0112 composition Add Fibocom FG132 0x0112 composition: T: Bus=03 Lev=02 Prnt=06 Port=01 Cnt=02 Dev#= 10 Spd=12 MxCh= 0 D: Ver= 2.01 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2cb7 ProdID=0112 Rev= 5.15 S: Manufacturer=Fibocom Wireless Inc. S: Product=Fibocom Module S: SerialNumber=xxxxxxxx C:* #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=81(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=86(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms Signed-off-by: Reinhard Speyerer <rspmn@arcor.de> Link: https://patch.msgid.link/ZxLKp5YZDy-OM0-e@arcor.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-24 12:47:20 +02:00
Haiyang Zhang	4c262801ea	hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event The existing code moves VF to the same namespace as the synthetic NIC during netvsc_register_vf(). But, if the synthetic device is moved to a new namespace after the VF registration, the VF won't be moved together. To make the behavior more consistent, add a namespace check for synthetic NIC's NETDEV_REGISTER event (generated during its move), and move the VF if it is not in the same namespace. Cc: stable@vger.kernel.org Fixes: `c0a41b887c` ("hv_netvsc: move VF to same namespace as netvsc device") Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1729275922-17595-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-24 12:43:20 +02:00
Tim Harvey	ee76eb2434	net: dsa: microchip: disable EEE for KSZ879x/KSZ877x/KSZ876x The well-known errata regarding EEE not being functional on various KSZ switches has been refactored a few times. Recently the refactoring has excluded several switches that the errata should also apply to. Disable EEE for additional switches with this errata and provide additional comments referring to the public errata document. The original workaround for the errata was applied with a register write to manually disable the EEE feature in MMD 7:60 which was being applied for KSZ9477/KSZ9897/KSZ9567 switch ID's. Then came commit `26dd2974c5` ("net: phy: micrel: Move KSZ9477 errata fixes to PHY driver") and commit `6068e6d7ba` ("net: dsa: microchip: remove KSZ9477 PHY errata handling") which moved the errata from the switch driver to the PHY driver but only for PHY_ID_KSZ9477 (PHY ID) however that PHY code was dead code because an entry was never added for PHY_ID_KSZ9477 via MODULE_DEVICE_TABLE. This was apparently realized much later and commit `54a4e5c163` ("net: phy: micrel: add Microchip KSZ 9477 to the device table") added the PHY_ID_KSZ9477 to the PHY driver but as the errata was only being applied to PHY_ID_KSZ9477 it's not completely clear what switches that relates to. Later commit `6149db4997` ("net: phy: micrel: fix KSZ9477 PHY issues after suspend/resume") breaks this again for all but KSZ9897 by only applying the errata for that PHY ID. Following that this was affected with commit 08c6d8bae48c("net: phy: Provide Module 4 KSZ9477 errata (DS80000754C)") which removes the blatant register write to MMD 7:60 and replaces it by setting phydev->eee_broken_modes = -1 so that the generic phy-c45 code disables EEE but this is only done for the KSZ9477_CHIP_ID (Switch ID). Lastly commit `0411f73c13` ("net: dsa: microchip: disable EEE for KSZ8567/KSZ9567/KSZ9896/KSZ9897.") adds some additional switches that were missing to the errata due to the previous changes. This commit adds an additional set of switches. Fixes: `0411f73c13` ("net: dsa: microchip: disable EEE for KSZ8567/KSZ9567/KSZ9896/KSZ9897.") Signed-off-by: Tim Harvey <tharvey@gateworks.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20241018160658.781564-1-tharvey@gateworks.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-24 12:39:39 +02:00
Paolo Abeni	1876479d98	Merge tag 'for-net-2024-10-23' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci_core: Disable works on hci_unregister_dev - SCO: Fix UAF on sco_sock_timeout - ISO: Fix UAF on iso_sock_timeout * tag 'for-net-2024-10-23' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: ISO: Fix UAF on iso_sock_timeout Bluetooth: SCO: Fix UAF on sco_sock_timeout Bluetooth: hci_core: Disable works on hci_unregister_dev ==================== Link: https://patch.msgid.link/20241023143005.2297694-1-luiz.dentz@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-24 12:30:23 +02:00
Niklas Cassel	8e59a2a545	ata: libata: Set DID_TIME_OUT for commands that actually timed out When ata_qc_complete() schedules a command for EH using ata_qc_schedule_eh(), blk_abort_request() will be called, which leads to req->q->mq_ops->timeout() / scsi_timeout() being called. scsi_timeout(), if the LLDD has no abort handler (libata has no abort handler), will set host byte to DID_TIME_OUT, and then call scsi_eh_scmd_add() to add the command to EH. Thus, when commands first enter libata's EH strategy_handler, all the commands that have been added to EH will have DID_TIME_OUT set. Commit `e5dd410acb` ("ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data") clears this bogus DID_TIME_OUT flag for all commands that reached libata's EH strategy_handler. libata has its own flag (AC_ERR_TIMEOUT), that it sets for commands that have not received a completion at the time of entering EH. ata_eh_worth_retry() has no special handling for AC_ERR_TIMEOUT, so by default timed out commands will get flag ATA_QCFLAG_RETRY set, and will be retried after the port has been reset (ata_eh_link_autopsy() always triggers a port reset if any command has AC_ERR_TIMEOUT set). For a command that has ATA_QCFLAG_RETRY set, while also having an error flag set (e.g. AC_ERR_TIMEOUT), ata_eh_finish() will not increment scmd->allowed, so the command will at most be retried scmd->allowed number of times (which by default is set to 3). However, scsi_eh_flush_done_q() will only retry commands for which scsi_noretry_cmd() returns false. For a command that has DID_TIME_OUT set, while also having either the FAILFAST flag set, or the command being a passthrough command, scsi_noretry_cmd() will return true. Thus, such a command will never be retried. Thus, make sure that libata sets SCSI's DID_TIME_OUT flag for commands that actually timed out (libata's AC_ERR_TIMEOUT flag), such that timed out commands will once again not be retried if they are also a FAILFAST or passthrough command. Cc: stable@vger.kernel.org Fixes: `e5dd410acb` ("ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data") Reported-by: Lai, Yi <yi1.lai@linux.intel.com> Closes: https://lore.kernel.org/linux-ide/ZxYz871I3Blsi30F@ly-workstation/ Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20241023105540.1070012-2-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org>	2024-10-24 11:14:00 +02:00
Paolo Abeni	1e424d08d3	Merge tag 'ipsec-2024-10-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2024-10-22 1) Fix routing behavior that relies on L4 information for xfrm encapsulated packets. From Eyal Birger. 2) Remove leftovers of pernet policy_inexact lists. From Florian Westphal. 3) Validate new SA's prefixlen when the selector family is not set from userspace. From Sabrina Dubroca. 4) Fix a kernel-infoleak when dumping an auth algorithm. From Petr Vaganov. Please pull or let me know if there are problems. ipsec-2024-10-22 * tag 'ipsec-2024-10-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: xfrm: fix one more kernel-infoleak in algo dumping xfrm: validate new SA's prefixlen using SA family when sel.family is unset xfrm: policy: remove last remnants of pernet inexact list xfrm: respect ip protocols rules criteria when performing dst lookups xfrm: extract dst lookup parameters into a struct ==================== Link: https://patch.msgid.link/20241022092226.654370-1-steffen.klassert@secunet.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-24 11:11:33 +02:00
Takashi Iwai	c9f7a144e7	Merge tag 'asoc-fix-v6.12-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.12 An uncomfortably large set of fixes due to me not getting round to sending them for longer than I should due to travel and illness. This is mostly smaller driver specific changes, but there are a couple of generic changes: - Bumping the minimal topology ABI we check for during validation, the code had support for v4 removed previously but the update of the define used for initial validation was missed. - Fix the assumption that DAPM structs will be embedded in a component which isn't true for card widgets when doing name comparisons, though fortunately this is rarely triggered. We've pulled in one Soundwire fix which was part of a larger series fixing cleanup issues in on Intel Soundwire systems.	2024-10-24 07:57:39 +02:00
Andrii Nakryiko	9806f28314	bpf: fix do_misc_fixups() for bpf_get_branch_snapshot() We need `goto next_insn;` at the end of patching instead of `continue;`. It currently works by accident by making verifier re-process patched instructions. Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Fixes: `314a53623c` ("bpf: inline bpf_get_branch_snapshot() helper") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://lore.kernel.org/r/20241023161916.2896274-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-23 22:16:45 -07:00
Xinyu Zhang	2ff9494418	block: fix sanity checks in blk_rq_map_user_bvec blk_rq_map_user_bvec contains a check bytes + bv->bv_len > nr_iter which causes unnecessary failures in NVMe passthrough I/O, reproducible as follows: - register a 2 page, page-aligned buffer against a ring - use that buffer to do a 1 page io_uring NVMe passthrough read The second (i = 1) iteration of the loop in blk_rq_map_user_bvec will then have nr_iter == 1 page, bytes == 1 page, bv->bv_len == 1 page, so the check bytes + bv->bv_len > nr_iter will succeed, causing the I/O to fail. This failure is unnecessary, as when the check succeeds, it means we've checked the entire buffer that will be used by the request - i.e. blk_rq_map_user_bvec should complete successfully. Therefore, terminate the loop early and return successfully when the check bytes + bv->bv_len > nr_iter succeeds. While we're at it, also remove the check that all segments in the bvec are single-page. While this seems to be true for all users of the function, it doesn't appear to be required anywhere downstream. CC: stable@vger.kernel.org Signed-off-by: Xinyu Zhang <xizhang@purestorage.com> Co-developed-by: Uday Shankar <ushankar@purestorage.com> Signed-off-by: Uday Shankar <ushankar@purestorage.com> Fixes: `3798754793` ("block: extend functionality to map bvec iterator") Link: https://lore.kernel.org/r/20241023211519.4177873-1-ushankar@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-23 17:02:48 -06:00
Arnaldo Carvalho de Melo	758f181589	perf python: Fix up the build on architectures without HAVE_KVM_STAT_SUPPORT Noticed while building on a raspbian arm 32-bit system. There was also this other case, fixed by adding a missing util/stat.h with the prototypes: /tmp/tmp.MbiSHoF3dj/perf-6.12.0-rc3/tools/perf/util/python.c:1396:6: error: no previous prototype for ‘perf_stat__set_no_csv_summary’ [-Werror=missing-prototypes] 1396 \| void perf_stat__set_no_csv_summary(int set __maybe_unused) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /tmp/tmp.MbiSHoF3dj/perf-6.12.0-rc3/tools/perf/util/python.c:1400:6: error: no previous prototype for ‘perf_stat__set_big_num’ [-Werror=missing-prototypes] 1400 \| void perf_stat__set_big_num(int set __maybe_unused) \| ^~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors In other architectures this must be building due to some lucky indirect inclusion of that header. Fixes: `9dabf40034` ("perf python: Switch module to linking libraries from building source") Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZxllAtpmEw5fg9oy@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-23 19:29:50 -03:00
Frank Li	25f00a13dc	spi: spi-fsl-dspi: Fix crash when not using GPIO chip select Add check for the return value of spi_get_csgpiod() to avoid passing a NULL pointer to gpiod_direction_output(), preventing a crash when GPIO chip select is not used. Fix below crash: [ 4.251960] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 4.260762] Mem abort info: [ 4.263556] ESR = 0x0000000096000004 [ 4.267308] EC = 0x25: DABT (current EL), IL = 32 bits [ 4.272624] SET = 0, FnV = 0 [ 4.275681] EA = 0, S1PTW = 0 [ 4.278822] FSC = 0x04: level 0 translation fault [ 4.283704] Data abort info: [ 4.286583] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 4.292074] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 4.297130] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 4.302445] [0000000000000000] user address but active_mm is swapper [ 4.308805] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 4.315072] Modules linked in: [ 4.318124] CPU: 2 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc4-next-20241023-00008-ga20ec42c5fc1 #359 [ 4.328130] Hardware name: LS1046A QDS Board (DT) [ 4.332832] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 4.339794] pc : gpiod_direction_output+0x34/0x5c [ 4.344505] lr : gpiod_direction_output+0x18/0x5c [ 4.349208] sp : ffff80008003b8f0 [ 4.352517] x29: ffff80008003b8f0 x28: 0000000000000000 x27: ffffc96bcc7e9068 [ 4.359659] x26: ffffc96bcc6e00b0 x25: ffffc96bcc598398 x24: ffff447400132810 [ 4.366800] x23: 0000000000000000 x22: 0000000011e1a300 x21: 0000000000020002 [ 4.373940] x20: 0000000000000000 x19: 0000000000000000 x18: ffffffffffffffff [ 4.381081] x17: ffff44740016e600 x16: 0000000500000003 x15: 0000000000000007 [ 4.388221] x14: 0000000000989680 x13: 0000000000020000 x12: 000000000000001e [ 4.395362] x11: 0044b82fa09b5a53 x10: 0000000000000019 x9 : 0000000000000008 [ 4.402502] x8 : 0000000000000002 x7 : 0000000000000007 x6 : 0000000000000000 [ 4.409641] x5 : 0000000000000200 x4 : 0000000002000000 x3 : 0000000000000000 [ 4.416781] x2 : 0000000000022202 x1 : 0000000000000000 x0 : 0000000000000000 [ 4.423921] Call trace: [ 4.426362] gpiod_direction_output+0x34/0x5c (P) [ 4.431067] gpiod_direction_output+0x18/0x5c (L) [ 4.435771] dspi_setup+0x220/0x334 Fixes: `9e264f3f85` ("spi: Replace all spi->chip_select and spi->cs_gpiod references with function call") Cc: stable@vger.kernel.org Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20241023203032.1388491-1-Frank.Li@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-23 22:37:54 +01:00
Jiri Olsa	0ee288e69d	bpf,perf: Fix perf_event_detach_bpf_prog error handling Peter reported that perf_event_detach_bpf_prog might skip to release the bpf program for -ENOENT error from bpf_prog_array_copy. This can't happen because bpf program is stored in perf event and is detached and released only when perf event is freed. Let's drop the -ENOENT check and make sure the bpf program is released in any case. Fixes: `170a7e3ea0` ("bpf: bpf_prog_array_copy() should return -ENOENT if exclude_prog not found") Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241023200352.3488610-1-jolsa@kernel.org Closes: https://lore.kernel.org/lkml/20241022111638.GC16066@noisy.programming.kicks-ass.net/	2024-10-23 14:33:02 -07:00
Veronika Molnarova	06a130e42a	perf test: Handle perftool-testsuite_probe failure due to broken DWARF Test case test_adding_blacklisted ends in failure if the blacklisted probe is of an assembler function with no DWARF available. At the same time, probing the blacklisted function with ASM DWARF doesn't test the blacklist itself as the failure is a result of the broken DWARF. When the broken DWARF output is encountered, check if the probed function was compiled by the assembler. If so, the broken DWARF message is expected and does not report a perf issue, else report a failure. If the ASM DWARF affected the probe, try the next probe on the blacklist. If the first 5 probes are defective due to broken DWARF, skip the test case. Fixes: `def5480d63` ("perf testsuite probe: Add test for blacklisted kprobes handling") Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Veronika Molnarova <vmolnaro@redhat.com> Link: https://lore.kernel.org/r/20241017161555.236769-1-vmolnaro@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-23 17:23:09 -03:00
Ihor Solodrai	9b3c11a867	selftests/sched_ext: add order-only dependency of runner.o on BPFOBJ The runner.o may start building before libbpf headers are installed, and as a result build fails. This happened a couple of times on libbpf/ci test jobs: * https://github.com/libbpf/ci/actions/runs/11447667257/job/31849533100 * https://github.com/theihor/libbpf-ci/actions/runs/11445162764/job/31841649552 Headers are installed in a recipe for $(BPFOBJ) target, and adding an order-only dependency should ensure this doesn't happen. Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-23 08:56:32 -10:00
Peter Zijlstra	b55945c500	sched: Fix pick_next_task_fair() vs try_to_wake_up() race Syzkaller robot reported KCSAN tripping over the ASSERT_EXCLUSIVE_WRITER(p->on_rq) in __block_task(). The report noted that both pick_next_task_fair() and try_to_wake_up() were concurrently trying to write to the same p->on_rq, violating the assertion -- even though both paths hold rq->__lock. The logical consequence is that both code paths end up holding a different rq->__lock. And looking through ttwu(), this is possible when the __block_task() 'p->on_rq = 0' store is visible to the ttwu() 'p->on_rq' load, which then assumes the task is not queued and continues to migrate it. Rearrange things such that __block_task() releases @p with the store and no code thereafter will use @p again. Fixes: `152e11f6df` ("sched/fair: Implement delayed dequeue") Reported-by: syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Marco Elver <elver@google.com> Link: https://lkml.kernel.org/r/20241023093641.GE16066@noisy.programming.kicks-ass.net	2024-10-23 20:52:26 +02:00
Kan Liang	e3dfd64c1f	perf: Fix missing RCU reader protection in perf_event_clear_cpumask() Running rcutorture scenario TREE05, the below warning is triggered. [ 32.604594] WARNING: suspicious RCU usage [ 32.605928] 6.11.0-rc5-00040-g4ba4f1afb6a9 #55238 Not tainted [ 32.607812] ----------------------------- [ 32.609140] kernel/events/core.c:13946 RCU-list traversed in non-reader section!! [ 32.611595] other info that might help us debug this: [ 32.614247] rcu_scheduler_active = 2, debug_locks = 1 [ 32.616392] 3 locks held by cpuhp/4/35: [ 32.617687] #0: ffffffffb666a650 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x4e/0x200 [ 32.620563] #1: ffffffffb666cd20 (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x4e/0x200 [ 32.623412] #2: ffffffffb677c288 (pmus_lock){+.+.}-{3:3}, at: perf_event_exit_cpu_context+0x32/0x2f0 In perf_event_clear_cpumask(), uses list_for_each_entry_rcu() without an obvious RCU read-side critical section. Either pmus_srcu or pmus_lock is good enough to protect the pmus list. In the current context, pmus_lock is already held. The list_for_each_entry_rcu() is not required. Fixes: `4ba4f1afb6` ("perf: Generic hotplug support for a PMU with a scope") Closes: https://lore.kernel.org/lkml/2b66dff8-b827-494b-b151-1ad8d56f13e6@paulmck-laptop/ Closes: https://lore.kernel.org/oe-lkp/202409131559.545634cc-oliver.sang@intel.com Reported-by: "Paul E. McKenney" <paulmck@kernel.org> Reported-by: kernel test robot <oliver.sang@intel.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: "Paul E. McKenney" <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240913162340.2142976-1-kan.liang@linux.intel.com	2024-10-23 20:52:25 +02:00
Bartosz Golaszewski	ad783b9f8e	PCI/pwrctl: Abandon QCom WCN probe on pre-pwrseq device-trees Old device trees for some platforms already define wifi nodes for the WCN family of chips since before power sequencing was added upstream. These nodes don't consume the regulator outputs from the PMU, and if we allow this driver to bind to one of such "incomplete" nodes, we'll see a kernel log error about the infinite probe deferral. Extend the driver by adding a platform data struct matched against the compatible. This struct contains the pwrseq target string as well as a validation function called right after entering probe(). For Qualcomm WCN models, check the existence of the regulator supply property that indicates the DT is already using power sequencing and return -ENODEV if it's not there, indicating to the driver model that the device should not be bound to the pwrctl driver. Link: https://lore.kernel.org/r/20241007092447.18616-1-brgl@bgdev.pl Fixes: `6140d185a4` ("PCI/pwrctl: Add a PCI power control driver for power sequenced devices") Reported-by: Johan Hovold <johan@kernel.org> Closes: https://lore.kernel.org/all/Zv565olMDDGHyYVt@hovoldconsulting.com/ Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2024-10-23 13:34:41 -05:00
Naohiro Aota	d48e1dea39	btrfs: fix error propagation of split bios The purpose of btrfs_bbio_propagate_error() shall be propagating an error of split bio to its original btrfs_bio, and tell the error to the upper layer. However, it's not working well on some cases. * Case 1. Immediate (or quick) end_bio with an error When btrfs sends btrfs_bio to mirrored devices, btrfs calls btrfs_bio_end_io() when all the mirroring bios are completed. If that btrfs_bio was split, it is from btrfs_clone_bioset and its end_io function is btrfs_orig_write_end_io. For this case, btrfs_bbio_propagate_error() accesses the orig_bbio's bio context to increase the error count. That works well in most cases. However, if the end_io is called enough fast, orig_bbio's (remaining part after split) bio context may not be properly set at that time. Since the bio context is set when the orig_bbio (the last btrfs_bio) is sent to devices, that might be too late for earlier split btrfs_bio's completion. That will result in NULL pointer dereference. That bug is easily reproducible by running btrfs/146 on zoned devices [1] and it shows the following trace. [1] You need raid-stripe-tree feature as it create "-d raid0 -m raid1" FS. BUG: kernel NULL pointer dereference, address: 0000000000000020 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 UID: 0 PID: 13 Comm: kworker/u32:1 Not tainted 6.11.0-rc7-BTRFS-ZNS+ #474 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: writeback wb_workfn (flush-btrfs-5) RIP: 0010:btrfs_bio_end_io+0xae/0xc0 [btrfs] BTRFS error (device dm-0): bdev /dev/mapper/error-test errs: wr 2, rd 0, flush 0, corrupt 0, gen 0 RSP: 0018:ffffc9000006f248 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888005a7f080 RCX: ffffc9000006f1dc RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff888005a7f080 RBP: ffff888011dfc540 R08: 0000000000000000 R09: 0000000000000001 R10: ffffffff82e508e0 R11: 0000000000000005 R12: ffff88800ddfbe58 R13: ffff888005a7f080 R14: ffff888005a7f158 R15: ffff888005a7f158 FS: 0000000000000000(0000) GS:ffff88803ea80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000020 CR3: 0000000002e22006 CR4: 0000000000370ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __die_body.cold+0x19/0x26 ? page_fault_oops+0x13e/0x2b0 ? _printk+0x58/0x73 ? do_user_addr_fault+0x5f/0x750 ? exc_page_fault+0x76/0x240 ? asm_exc_page_fault+0x22/0x30 ? btrfs_bio_end_io+0xae/0xc0 [btrfs] ? btrfs_log_dev_io_error+0x7f/0x90 [btrfs] btrfs_orig_write_end_io+0x51/0x90 [btrfs] dm_submit_bio+0x5c2/0xa50 [dm_mod] ? find_held_lock+0x2b/0x80 ? blk_try_enter_queue+0x90/0x1e0 __submit_bio+0xe0/0x130 ? ktime_get+0x10a/0x160 ? lockdep_hardirqs_on+0x74/0x100 submit_bio_noacct_nocheck+0x199/0x410 btrfs_submit_bio+0x7d/0x150 [btrfs] btrfs_submit_chunk+0x1a1/0x6d0 [btrfs] ? lockdep_hardirqs_on+0x74/0x100 ? __folio_start_writeback+0x10/0x2c0 btrfs_submit_bbio+0x1c/0x40 [btrfs] submit_one_bio+0x44/0x60 [btrfs] submit_extent_folio+0x13f/0x330 [btrfs] ? btrfs_set_range_writeback+0xa3/0xd0 [btrfs] extent_writepage_io+0x18b/0x360 [btrfs] extent_write_locked_range+0x17c/0x340 [btrfs] ? __pfx_end_bbio_data_write+0x10/0x10 [btrfs] run_delalloc_cow+0x71/0xd0 [btrfs] btrfs_run_delalloc_range+0x176/0x500 [btrfs] ? find_lock_delalloc_range+0x119/0x260 [btrfs] writepage_delalloc+0x2ab/0x480 [btrfs] extent_write_cache_pages+0x236/0x7d0 [btrfs] btrfs_writepages+0x72/0x130 [btrfs] do_writepages+0xd4/0x240 ? find_held_lock+0x2b/0x80 ? wbc_attach_and_unlock_inode+0x12c/0x290 ? wbc_attach_and_unlock_inode+0x12c/0x290 __writeback_single_inode+0x5c/0x4c0 ? do_raw_spin_unlock+0x49/0xb0 writeback_sb_inodes+0x22c/0x560 __writeback_inodes_wb+0x4c/0xe0 wb_writeback+0x1d6/0x3f0 wb_workfn+0x334/0x520 process_one_work+0x1ee/0x570 ? lock_is_held_type+0xc6/0x130 worker_thread+0x1d1/0x3b0 ? __pfx_worker_thread+0x10/0x10 kthread+0xee/0x120 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Modules linked in: dm_mod btrfs blake2b_generic xor raid6_pq rapl CR2: 0000000000000020 * Case 2. Earlier completion of orig_bbio for mirrored btrfs_bios btrfs_bbio_propagate_error() assumes the end_io function for orig_bbio is called last among split bios. In that case, btrfs_orig_write_end_io() sets the bio->bi_status to BLK_STS_IOERR by seeing the bioc->error [2]. Otherwise, the increased orig_bio's bioc->error is not checked by anyone and return BLK_STS_OK to the upper layer. [2] Actually, this is not true. Because we only increases orig_bioc->errors by max_errors, the condition "atomic_read(&bioc->error) > bioc->max_errors" is still not met if only one split btrfs_bio fails. * Case 3. Later completion of orig_bbio for un-mirrored btrfs_bios In contrast to the above case, btrfs_bbio_propagate_error() is not working well if un-mirrored orig_bbio is completed last. It sets orig_bbio->bio.bi_status to the btrfs_bio's error. But, that is easily over-written by orig_bbio's completion status. If the status is BLK_STS_OK, the upper layer would not know the failure. * Solution Considering the above cases, we can only save the error status in the orig_bbio (remaining part after split) itself as it is always available. Also, the saved error status should be propagated when all the split btrfs_bios are finished (i.e, bbio->pending_ios == 0). This commit introduces "status" to btrfs_bbio and saves the first error of split bios to original btrfs_bio's "status" variable. When all the split bios are finished, the saved status is loaded into original btrfs_bio's status. With this commit, btrfs/146 on zoned devices does not hit the NULL pointer dereference anymore. Fixes: `852eee62d3` ("btrfs: allow btrfs_submit_bio to split bios") CC: stable@vger.kernel.org # 6.6+ Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-23 18:17:43 +02:00
David Sterba	90a88784cd	MIPS: export __cmpxchg_small() Export the symbol __cmpxchg_small() for btrfs.ko that uses it to store blk_status_t, which is u8. Reported by LKP: >> ERROR: modpost: "__cmpxchg_small" [fs/btrfs/btrfs.ko] undefined! Patch using the cmpxchg() https://lore.kernel.org/linux-btrfs/1d4f72f7fee285b2ddf4bf62b0ac0fd89def5417.1728575379.git.naohiro.aota@wdc.com/ Link: https://lore.kernel.org/all/20241016134919.GO1609@suse.cz/ Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-23 18:10:24 +02:00
Xiongfeng Wang	c83212d79b	firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state() In sdei_device_freeze(), the input parameter of cpuhp_remove_state() is passed as 'sdei_entry_point' by mistake. Change it to 'sdei_hp_state'. Fixes: `d2c48b2387` ("firmware: arm_sdei: Fix sleep from invalid context BUG") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20241016084740.183353-1-wangxiongfeng2@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-23 16:19:03 +01:00
Marco Elver	237ab03e30	Revert "kasan: Disable Software Tag-Based KASAN with GCC" This reverts commit `7aed6a2c51`. Now that __no_sanitize_address attribute is fixed for KASAN_SW_TAGS with GCC, allow re-enabling KASAN_SW_TAGS with GCC. Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrew Pinski <pinskia@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://lore.kernel.org/r/20241021120013.3209481-2-elver@google.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-23 16:04:30 +01:00
Marco Elver	894b00a335	kasan: Fix Software Tag-Based KASAN with GCC Per [1], -fsanitize=kernel-hwaddress with GCC currently does not disable instrumentation in functions with __attribute__((no_sanitize_address)). However, __attribute__((no_sanitize("hwaddress"))) does correctly disable instrumentation. Use it instead. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117196 [1] Link: https://lore.kernel.org/r/000000000000f362e80620e27859@google.com Link: https://lore.kernel.org/r/ZvFGwKfoC4yVjN_X@J2N7QTR9R3 Link: https://bugzilla.kernel.org/show_bug.cgi?id=218854 Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com Tested-by: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrew Pinski <pinskia@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Fixes: `7b861a53e4` ("kasan: Bump required compiler version") Link: https://lore.kernel.org/r/20241021120013.3209481-1-elver@google.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-23 16:04:30 +01:00
Moudy Ho	3ad0edc46f	dt-bindings: display: mediatek: split: add subschema property constraints The display node in mt8195.dtsi was triggering a CHECK_DTBS error due to an excessively long 'clocks' property: display@14f06000: clocks: [[31, 14], [31, 43], [31, 44]] is too long To resolve this issue, the constraints for 'clocks' and other properties within the subschemas will be reinforced. Fixes: `739058a9c5` ("dt-bindings: display: mediatek: split: add compatible for MT8195") Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com> Signed-off-by: Moudy Ho <moudy.ho@mediatek.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241007022834.4609-1-moudy.ho@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-23 14:47:39 +00:00
Arnaldo Carvalho de Melo	d822ca29a4	tools headers UAPI: Sync kvm headers with the kernel sources To pick the changes in: `aa8d1f48d3` ("KVM: x86/mmu: Introduce a quirk to control memslot zap behavior") That don't change functionality in tools/perf, as no new ioctl is added for the 'perf trace' scripts to harvest. This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Please see tools/include/uapi/README for further details. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/lkml/ZxgN0O02YrAJ2qIC@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-23 11:34:56 -03:00
Jiri Slaby	5d35634ecc	perf trace: Fix non-listed archs in the syscalltbl routines This fixes a build breakage on 32-bit arm, where the syscalltbl__id_at_idx() function was missing. Committer notes: Generating a proper syscall table from a copy of arch/arm/tools/syscall.tbl ends up being too big a patch for this rc stage, I started doing it but while testing noticed some other problems with using BPF to collect pointer args on arm7 (32-bit) will maybe continue trying to make it work on the next cycle... Fixes: `7a2fb5619c` ("perf trace: Fix iteration of syscall ids in syscalltbl->entries") Suggested-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: <jslaby@suse.cz> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/lkml/3a592835-a14f-40be-8961-c0cee7720a94@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-23 11:34:56 -03:00
Howard Chu	7fbff3c0e0	perf build: Change the clang check back to 12.0.1 This serves as a revert for this patch: https://lore.kernel.org/linux-perf-users/ZuGL9ROeTV2uXoSp@x1/ Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: James Clark <james.clark@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241011021403.4089793-2-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-23 11:34:56 -03:00
Howard Chu	395d38419f	perf trace augmented_raw_syscalls: Add more checks to pass the verifier Add some more checks to pass the verifier in more kernels. Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241011021403.4089793-3-howardchu95@gmail.com [ Reduced the patch removing things that can be done later ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-23 11:34:56 -03:00
Arnaldo Carvalho de Melo	ecabac70ff	perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers In a RHEL8 kernel (4.18.0-513.11.1.el8_9.x86_64), that, as enterprise kernels go, have backports from modern kernels, the verifier complains about lack of bounds check for the index into the array of syscall arguments, on a BPF bytecode generated by clang 17, with: ; } else if (size < 0 && size >= -6) { /* buffer / 116: (b7) r1 = -6 117: (2d) if r1 > r6 goto pc-30 R0=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=inv-6 R2=map_value(id=0,off=16,ks=4,vs=8272,imm=0) R3=inv(id=0) R5=inv40 R6=inv(id=0,umin_value=18446744073709551610,var_off=(0xffffffff00000000; 0xffffffff)) R7=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8=invP6 R9=map_value(id=0,off=20,ks=4,vs=24,imm=0) R10=fp0 fp-8=mmmmmmmm fp-16=map_value fp-24=map_value fp-32=inv40 fp-40=ctx fp-48=map_value fp-56=inv1 fp-64=map_value fp-72=map_value fp-80=map_value ; index = -(size + 1); 118: (a7) r6 ^= -1 119: (67) r6 <<= 32 120: (77) r6 >>= 32 ; aug_size = args->args[index]; 121: (67) r6 <<= 3 122: (79) r1 = (u64 )(r10 -24) 123: (0f) r1 += r6 last_idx 123 first_idx 116 regs=40 stack=0 before 122: (79) r1 = (u64 )(r10 -24) regs=40 stack=0 before 121: (67) r6 <<= 3 regs=40 stack=0 before 120: (77) r6 >>= 32 regs=40 stack=0 before 119: (67) r6 <<= 32 regs=40 stack=0 before 118: (a7) r6 ^= -1 regs=40 stack=0 before 117: (2d) if r1 > r6 goto pc-30 regs=42 stack=0 before 116: (b7) r1 = -6 R0_w=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=inv1 R2_w=map_value(id=0,off=16,ks=4,vs=8272,imm=0) R3_w=inv(id=0) R5_w=inv40 R6_rw=invP(id=0,smin_value=-2147483648,smax_value=0) R7_w=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8_w=invP6 R9_w=map_value(id=0,off=20,ks=4,vs=24,imm=0) R10=fp0 fp-8=mmmmmmmm fp-16_w=map_value fp-24_r=map_value fp-32_w=inv40 fp-40=ctx fp-48=map_value fp-56_w=inv1 fp-64_w=map_value fp-72=map_value fp-80=map_value parent didn't have regs=40 stack=0 marks last_idx 110 first_idx 98 regs=40 stack=0 before 110: (6d) if r1 s> r6 goto pc+5 regs=42 stack=0 before 109: (b7) r1 = 1 regs=40 stack=0 before 108: (65) if r6 s> 0x1000 goto pc+7 regs=40 stack=0 before 98: (55) if r6 != 0x1 goto pc+9 R0_w=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=invP12 R2_w=map_value(id=0,off=16,ks=4,vs=8272,imm=0) R3_rw=inv(id=0) R5_w=inv24 R6_rw=invP(id=0,smin_value=-2147483648,smax_value=2147483647) R7_w=map_value(id=0,off=40,ks=4,vs=8272,imm=0) R8_rw=invP4 R9_w=map_value(id=0,off=12,ks=4,vs=24,imm=0) R10=fp0 fp-8=mmmmmmmm fp-16_rw=map_value fp-24_r=map_value fp-32_rw=invP24 fp-40_r=ctx fp-48_r=map_value fp-56_w=invP1 fp-64_rw=map_value fp-72_r=map_value fp-80_r=map_value parent already had regs=40 stack=0 marks 124: (79) r6 = (u64 )(r1 +16) R0=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=map_value(id=0,off=0,ks=4,vs=8272,umax_value=34359738360,var_off=(0x0; 0x7fffffff8),s32_max_value=2147483640,u32_max_value=-8) R2=map_value(id=0,off=16,ks=4,vs=8272,imm=0) R3=inv(id=0) R5=inv40 R6_w=invP(id=0,umax_value=34359738360,var_off=(0x0; 0x7fffffff8),s32_max_value=2147483640,u32_max_value=-8) R7=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8=invP6 R9=map_value(id=0,off=20,ks=4,vs=24,imm=0) R10=fp0 fp-8=mmmmmmmm fp-16=map_value fp-24=map_value fp-32=inv40 fp-40=ctx fp-48=map_value fp-56=inv1 fp-64=map_value fp-72=map_value fp-80=map_value R1 unbounded memory access, make sure to bounds check any such access processed 466 insns (limit 1000000) max_states_per_insn 2 total_states 20 peak_states 20 mark_read 3 If we add this line, as used in other BPF programs, to cap that index: index &= 7; The generated BPF program is considered safe by that version of the BPF verifier, allowing perf to collect the syscall args in one more kernel using the BPF based pointer contents collector. With the above one-liner it works with that kernel: [root@dell-per740-01 ~]# uname -a Linux dell-per740-01.khw.eng.rdu2.dc.redhat.com 4.18.0-513.11.1.el8_9.x86_64 #1 SMP Thu Dec 7 03:06:13 EST 2023 x86_64 x86_64 x86_64 GNU/Linux [root@dell-per740-01 ~]# ~acme/bin/perf trace -e sleep* sleep 1.234567890 0.000 (1234.704 ms): sleep/3863610 nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 234567890 }) = 0 [root@dell-per740-01 ~]# As well as with the one in Fedora 40: root@number:~# uname -a Linux number 6.11.3-200.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Oct 10 22:31:19 UTC 2024 x86_64 GNU/Linux root@number:~# perf trace -e sleep sleep 1.234567890 0.000 (1234.722 ms): sleep/14873 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 234567890 }, rmtp: 0x7ffe87311a40) = 0 root@number:~# Song Liu reported that this one-liner was being optimized out by clang 18, so I suggested and he tested that adding a compiler barrier before it made clang v18 to keep it and the verifier in the kernel in Song's case (Meta's 5.12 based kernel) also was happy with the resulting bytecode. I'll investigate using virtme-ng[1] to have all the perf BPF based functionality thoroughly tested over multiple kernels and clang versions. [1] https://kernel-recipes.org/en/2024/virtme-ng/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <andrea.righi@linux.dev> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/lkml/Zw7JgJc0LOwSpuvx@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-23 11:34:56 -03:00
Macpaul Lin	af6ab107ce	dt-bindings: display: mediatek: dpi: correct power-domains property The MediaTek DPI module is typically associated with one of the following multimedia power domains: - POWER_DOMAIN_DISPLAY - POWER_DOMAIN_VDOSYS - POWER_DOMAIN_MM The specific power domain used varies depending on the SoC design. These power domains are shared by multiple devices within the SoC. In most cases, these power domains are enabled by other devices. As a result, the DPI module of legacy SoCs often functions correctly even without explicit configuration. It is recommended to explicitly add the appropriate power domain property to the DPI node in the device tree. Hence drop the compatible checking for specific SoCs. Fixes: `5474d49b2f` ("dt-bindings: display: mediatek: dpi: Add power domains") Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com> Signed-off-by: Jitao Shi <jitao.shi@mediatek.com> Signed-off-by: Pablo Sun <pablo.sun@mediatek.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241003030919.17980-4-macpaul.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-23 14:31:15 +00:00
Luiz Augusto von Dentz	246b435ad6	Bluetooth: ISO: Fix UAF on iso_sock_timeout conn->sk maybe have been unlinked/freed while waiting for iso_conn_lock so this checks if the conn->sk is still valid by checking if it part of iso_sk_list. Fixes: `ccf74f2390` ("Bluetooth: Add BTPROTO_ISO socket type") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-10-23 10:21:14 -04:00
Luiz Augusto von Dentz	1bf4470a39	Bluetooth: SCO: Fix UAF on sco_sock_timeout conn->sk maybe have been unlinked/freed while waiting for sco_conn_lock so this checks if the conn->sk is still valid by checking if it part of sco_sk_list. Reported-by: syzbot+4c0d0c4cde787116d465@syzkaller.appspotmail.com Tested-by: syzbot+4c0d0c4cde787116d465@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4c0d0c4cde787116d465 Fixes: `ba316be1b6` ("Bluetooth: schedule SCO timeouts with delayed_work") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-10-23 10:20:29 -04:00
Luiz Augusto von Dentz	989fa5171f	Bluetooth: hci_core: Disable works on hci_unregister_dev This make use of disable_work_* on hci_unregister_dev since the hci_dev is about to be freed new submissions are not disarable. Fixes: `0d151a1037` ("Bluetooth: hci_core: cancel all works upon hci_unregister_dev()") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-10-23 10:19:44 -04:00
Huacai Chen	73adbd92f3	LoongArch: KVM: Mark hrtimer to expire in hard interrupt context Like commit `2c0d278f32` ("KVM: LAPIC: Mark hrtimer to expire in hard interrupt context") and commit `9090825fa9` ("KVM: arm/arm64: Let the timer expire in hardirq context on RT"), On PREEMPT_RT enabled kernels unmarked hrtimers are moved into soft interrupt expiry mode by default. Then the timers are canceled from an preempt-notifier which is invoked with disabled preemption which is not allowed on PREEMPT_RT. The timer callback is short so in could be invoked in hard-IRQ context. So let the timer expire on hard-IRQ context even on -RT. This fix a "scheduling while atomic" bug for PREEMPT_RT enabled kernels: BUG: scheduling while atomic: qemu-system-loo/1011/0x00000002 Modules linked in: amdgpu rfkill nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ns CPU: 1 UID: 0 PID: 1011 Comm: qemu-system-loo Tainted: G W 6.12.0-rc2+ #1774 Tainted: [W]=WARN Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022 Stack : ffffffffffffffff 0000000000000000 9000000004e3ea38 9000000116744000 90000001167475a0 0000000000000000 90000001167475a8 9000000005644830 90000000058dc000 90000000058dbff8 9000000116747420 0000000000000001 0000000000000001 6a613fc938313980 000000000790c000 90000001001c1140 00000000000003fe 0000000000000001 000000000000000d 0000000000000003 0000000000000030 00000000000003f3 000000000790c000 9000000116747830 90000000057ef000 0000000000000000 9000000005644830 0000000000000004 0000000000000000 90000000057f4b58 0000000000000001 9000000116747868 900000000451b600 9000000005644830 9000000003a13998 0000000010000020 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d ... Call Trace: [<9000000003a13998>] show_stack+0x38/0x180 [<9000000004e3ea34>] dump_stack_lvl+0x84/0xc0 [<9000000003a71708>] __schedule_bug+0x48/0x60 [<9000000004e45734>] __schedule+0x1114/0x1660 [<9000000004e46040>] schedule_rtlock+0x20/0x60 [<9000000004e4e330>] rtlock_slowlock_locked+0x3f0/0x10a0 [<9000000004e4f038>] rt_spin_lock+0x58/0x80 [<9000000003b02d68>] hrtimer_cancel_wait_running+0x68/0xc0 [<9000000003b02e30>] hrtimer_cancel+0x70/0x80 [<ffff80000235eb70>] kvm_restore_timer+0x50/0x1a0 [kvm] [<ffff8000023616c8>] kvm_arch_vcpu_load+0x68/0x2a0 [kvm] [<ffff80000234c2d4>] kvm_sched_in+0x34/0x60 [kvm] [<9000000003a749a0>] finish_task_switch.isra.0+0x140/0x2e0 [<9000000004e44a70>] __schedule+0x450/0x1660 [<9000000004e45cb0>] schedule+0x30/0x180 [<ffff800002354c70>] kvm_vcpu_block+0x70/0x120 [kvm] [<ffff800002354d80>] kvm_vcpu_halt+0x60/0x3e0 [kvm] [<ffff80000235b194>] kvm_handle_gspr+0x3f4/0x4e0 [kvm] [<ffff80000235f548>] kvm_handle_exit+0x1c8/0x260 [kvm] Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-10-23 22:15:44 +08:00
Huacai Chen	3c252263be	LoongArch: Make KASAN usable for variable cpu_vabits Currently, KASAN on LoongArch assume the CPU VA bits is 48, which is true for Loongson-3 series, but not for Loongson-2 series (only 40 or lower), this patch fix that issue and make KASAN usable for variable cpu_vabits. Solution is very simple: Just define XRANGE_SHADOW_SHIFT which means valid address length from VA_BITS to min(cpu_vabits, VA_BITS). Cc: stable@vger.kernel.org Signed-off-by: Kanglong Wang <wangkanglong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-10-23 22:15:30 +08:00
Dan Carpenter	4018651ba5	drm/mediatek: Fix potential NULL dereference in mtk_crtc_destroy() In mtk_crtc_create(), if the call to mbox_request_channel() fails then we set the "mtk_crtc->cmdq_client.chan" pointer to NULL. In that situation, we do not call cmdq_pkt_create(). During the cleanup, we need to check if the "mtk_crtc->cmdq_client.chan" is NULL first before calling cmdq_pkt_destroy(). Calling cmdq_pkt_destroy() is unnecessary if we didn't call cmdq_pkt_create() and it will result in a NULL pointer dereference. Fixes: `7627122fd1` ("drm/mediatek: Add cmdq_handle in mtk_crtc") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/cc537bd6-837f-4c85-a37b-1a007e268310@stanley.mountain/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-23 14:09:13 +00:00
Jinjie Ruan	6e62807c7f	posix-clock: posix-clock: Fix unbalanced locking in pc_clock_settime() If get_clock_desc() succeeds, it calls fget() for the clockid's fd, and get the clk->rwsem read lock, so the error path should release the lock to make the lock balance and fput the clockid's fd to make the refcount balance and release the fd related resource. However the below commit left the error path locked behind resulting in unbalanced locking. Check timespec64_valid_strict() before get_clock_desc() to fix it, because the "ts" is not changed after that. Fixes: `d8794ac20a` ("posix-clock: Fix missing timespec64 check in pc_clock_settime()") Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Anna-Maria Behnsen <anna-maria@linutronix.de> [pabeni@redhat.com: fixed commit message typo] Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-23 16:05:01 +02:00
Liankun Yang	3ded11b5c1	drm/mediatek: Fix get efuse issue for MT8188 DPTX Update efuse data for MT8188 displayport. The DP monitor can not display when DUT connected to USB-c to DP dongle. Analysis view is invalid DP efuse data. Fixes: `350c3fe907` ("drm/mediatek: dp: Add support MT8188 dp/edp function") Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Liankun Yang <liankun.yang@mediatek.com> Reviewed-by: Fei Shao <fshao@chromium.org> Tested-by: Fei Shao <fshao@chromium.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20240923132521.22785-1-liankun.yang@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-23 14:00:56 +00:00
Heiner Kallweit	10ce0db787	r8169: avoid unsolicited interrupts It was reported that after resume from suspend a PCI error is logged and connectivity is broken. Error message is: PCI error (cmd = 0x0407, status_errs = 0x0000) The message seems to be a red herring as none of the error bits is set, and the PCI command register value also is normal. Exception handling for a PCI error includes a chip reset what apparently brakes connectivity here. The interrupt status bit triggering the PCI error handling isn't actually used on PCIe chip versions, so it's not clear why this bit is set by the chip. Fix this by ignoring this bit on PCIe chip versions. Fixes: `0e4851502f` ("r8169: merge with version 8.001.00 of Realtek's r8168 driver") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219388 Tested-by: Atlas Yu <atlas.yu@canonical.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/78e2f535-438f-4212-ad94-a77637ac6c9c@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-23 15:41:27 +02:00
Ye Bin	2ce1007f42	cifs: fix warning when destroy 'cifs_io_request_pool' There's a issue as follows: WARNING: CPU: 1 PID: 27826 at mm/slub.c:4698 free_large_kmalloc+0xac/0xe0 RIP: 0010:free_large_kmalloc+0xac/0xe0 Call Trace: <TASK> ? __warn+0xea/0x330 mempool_destroy+0x13f/0x1d0 init_cifs+0xa50/0xff0 [cifs] do_one_initcall+0xdc/0x550 do_init_module+0x22d/0x6b0 load_module+0x4e96/0x5ff0 init_module_from_file+0xcd/0x130 idempotent_init_module+0x330/0x620 __x64_sys_finit_module+0xb3/0x110 do_syscall_64+0xc1/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Obviously, 'cifs_io_request_pool' is not created by mempool_create(). So just use mempool_exit() to revert 'cifs_io_request_pool'. Fixes: `edea94a697` ("cifs: Add mempools for cifs_io_request and cifs_io_subrequest structs") Signed-off-by: Ye Bin <yebin10@huawei.com> Acked-by: David Howells <dhowells@redhat.com Signed-off-by: Steve French <stfrench@microsoft.com>	2024-10-23 07:42:44 -05:00
Henrique Carvalho	9a5dd61151	smb: client: Handle kstrdup failures for passwords In smb3_reconfigure(), after duplicating ctx->password and ctx->password2 with kstrdup(), we need to check for allocation failures. If ses->password allocation fails, return -ENOMEM. If ses->password2 allocation fails, free ses->password, set it to NULL, and return -ENOMEM. Fixes: `c1eb537bf4` ("cifs: allow changing password during remount") Reviewed-by: David Howells <dhowells@redhat.com Signed-off-by: Haoxiang Li <make24@iscas.ac.cn> Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-10-23 07:42:22 -05:00
Kailang Yang	e3ea2757c3	ALSA: hda/realtek: Update default depop procedure Old procedure has a chance to meet Headphone no output. Fixes: `c2d6af53a4` ("ALSA: hda/realtek - Add default procedure for suspend and resume state") Signed-off-by: Kailang Yang <kailang@realtek.com> Link: https://lore.kernel.org/17b717a0a0b04a77aea4a8ec820cba13@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-23 14:21:10 +02:00
Dmitry Antipov	b22db8b8be	net: sched: use RCU read-side critical section in taprio_dump() Fix possible use-after-free in 'taprio_dump()' by adding RCU read-side critical section there. Never seen on x86 but found on a KASAN-enabled arm64 system when investigating https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa: [T15862] BUG: KASAN: slab-use-after-free in taprio_dump+0xa0c/0xbb0 [T15862] Read of size 4 at addr ffff0000d4bb88f8 by task repro/15862 [T15862] [T15862] CPU: 0 UID: 0 PID: 15862 Comm: repro Not tainted 6.11.0-rc1-00293-gdefaf1a2113a-dirty #2 [T15862] Hardware name: QEMU QEMU Virtual Machine, BIOS edk2-20240524-5.fc40 05/24/2024 [T15862] Call trace: [T15862] dump_backtrace+0x20c/0x220 [T15862] show_stack+0x2c/0x40 [T15862] dump_stack_lvl+0xf8/0x174 [T15862] print_report+0x170/0x4d8 [T15862] kasan_report+0xb8/0x1d4 [T15862] __asan_report_load4_noabort+0x20/0x2c [T15862] taprio_dump+0xa0c/0xbb0 [T15862] tc_fill_qdisc+0x540/0x1020 [T15862] qdisc_notify.isra.0+0x330/0x3a0 [T15862] tc_modify_qdisc+0x7b8/0x1838 [T15862] rtnetlink_rcv_msg+0x3c8/0xc20 [T15862] netlink_rcv_skb+0x1f8/0x3d4 [T15862] rtnetlink_rcv+0x28/0x40 [T15862] netlink_unicast+0x51c/0x790 [T15862] netlink_sendmsg+0x79c/0xc20 [T15862] __sock_sendmsg+0xe0/0x1a0 [T15862] ____sys_sendmsg+0x6c0/0x840 [T15862] ___sys_sendmsg+0x1ac/0x1f0 [T15862] __sys_sendmsg+0x110/0x1d0 [T15862] __arm64_sys_sendmsg+0x74/0xb0 [T15862] invoke_syscall+0x88/0x2e0 [T15862] el0_svc_common.constprop.0+0xe4/0x2a0 [T15862] do_el0_svc+0x44/0x60 [T15862] el0_svc+0x50/0x184 [T15862] el0t_64_sync_handler+0x120/0x12c [T15862] el0t_64_sync+0x190/0x194 [T15862] [T15862] Allocated by task 15857: [T15862] kasan_save_stack+0x3c/0x70 [T15862] kasan_save_track+0x20/0x3c [T15862] kasan_save_alloc_info+0x40/0x60 [T15862] __kasan_kmalloc+0xd4/0xe0 [T15862] __kmalloc_cache_noprof+0x194/0x334 [T15862] taprio_change+0x45c/0x2fe0 [T15862] tc_modify_qdisc+0x6a8/0x1838 [T15862] rtnetlink_rcv_msg+0x3c8/0xc20 [T15862] netlink_rcv_skb+0x1f8/0x3d4 [T15862] rtnetlink_rcv+0x28/0x40 [T15862] netlink_unicast+0x51c/0x790 [T15862] netlink_sendmsg+0x79c/0xc20 [T15862] __sock_sendmsg+0xe0/0x1a0 [T15862] ____sys_sendmsg+0x6c0/0x840 [T15862] ___sys_sendmsg+0x1ac/0x1f0 [T15862] __sys_sendmsg+0x110/0x1d0 [T15862] __arm64_sys_sendmsg+0x74/0xb0 [T15862] invoke_syscall+0x88/0x2e0 [T15862] el0_svc_common.constprop.0+0xe4/0x2a0 [T15862] do_el0_svc+0x44/0x60 [T15862] el0_svc+0x50/0x184 [T15862] el0t_64_sync_handler+0x120/0x12c [T15862] el0t_64_sync+0x190/0x194 [T15862] [T15862] Freed by task 6192: [T15862] kasan_save_stack+0x3c/0x70 [T15862] kasan_save_track+0x20/0x3c [T15862] kasan_save_free_info+0x4c/0x80 [T15862] poison_slab_object+0x110/0x160 [T15862] __kasan_slab_free+0x3c/0x74 [T15862] kfree+0x134/0x3c0 [T15862] taprio_free_sched_cb+0x18c/0x220 [T15862] rcu_core+0x920/0x1b7c [T15862] rcu_core_si+0x10/0x1c [T15862] handle_softirqs+0x2e8/0xd64 [T15862] __do_softirq+0x14/0x20 Fixes: `18cdd2f099` ("net/sched: taprio: taprio_dump and taprio_change are protected by rtnl_mutex") Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://patch.msgid.link/20241018051339.418890-2-dmantipov@yandex.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-23 13:26:15 +02:00
Dmitry Antipov	f504465970	net: sched: fix use-after-free in taprio_change() In 'taprio_change()', 'admin' pointer may become dangling due to sched switch / removal caused by 'advance_sched()', and critical section protected by 'q->current_entry_lock' is too small to prevent from such a scenario (which causes use-after-free detected by KASAN). Fix this by prefer 'rcu_replace_pointer()' over 'rcu_assign_pointer()' to update 'admin' immediately before an attempt to schedule freeing. Fixes: `a3d43c0d56` ("taprio: Add support adding an admin schedule") Reported-by: syzbot+b65e0af58423fc8a73aa@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://patch.msgid.link/20241018051339.418890-1-dmantipov@yandex.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-23 13:26:15 +02:00
Ashish Kalra	88a921aa3c	x86/sev: Ensure that RMP table fixups are reserved The BIOS reserves RMP table memory via e820 reservations. This can still lead to RMP page faults during kexec if the host tries to access memory within the same 2MB region. Commit `400fea4b96` ("x86/sev: Add callback to apply RMP table fixups for kexec" adjusts the e820 reservations for the RMP table so that the entire 2MB range at the start/end of the RMP table is marked reserved. The e820 reservations are then passed to firmware via SNP_INIT where they get marked HV-Fixed. The RMP table fixups are done after the e820 ranges have been added to memblock, allowing the fixup ranges to still be allocated and used by the system. The problem is that this memory range is now marked reserved in the e820 tables and during SNP initialization these reserved ranges are marked as HV-Fixed. This means that the pages cannot be used by an SNP guest, only by the hypervisor. However, the memory management subsystem does not make this distinction and can allocate one of those pages to an SNP guest. This will ultimately result in RMPUPDATE failures associated with the guest, causing it to fail to start or terminate when accessing the HV-Fixed page. The issue is captured below with memblock=debug: [ 0.000000] SEV-SNP: *** DEBUG: snp_probe_rmptable_info:352 - rmp_base=0x280d4800000, rmp_end=0x28357efffff ... [ 0.000000] BIOS-provided physical RAM map: ... [ 0.000000] BIOS-e820: [mem 0x00000280d4800000-0x0000028357efffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000028357f00000-0x0000028357ffffff] usable ... ... [ 0.183593] memblock add: [0x0000028357f00000-0x0000028357ffffff] e820__memblock_setup+0x74/0xb0 ... [ 0.203179] MEMBLOCK configuration: [ 0.207057] memory size = 0x0000027d0d194000 reserved size = 0x0000000009ed2c00 [ 0.215299] memory.cnt = 0xb ... [ 0.311192] memory[0x9] [0x0000028357f00000-0x0000028357ffffff], 0x0000000000100000 bytes flags: 0x0 ... ... [ 0.419110] SEV-SNP: Reserving start/end of RMP table on a 2MB boundary [0x0000028357e00000] [ 0.428514] e820: update [mem 0x28357e00000-0x28357ffffff] usable ==> reserved [ 0.428517] e820: update [mem 0x28357e00000-0x28357ffffff] usable ==> reserved [ 0.428520] e820: update [mem 0x28357e00000-0x28357ffffff] usable ==> reserved ... ... [ 5.604051] MEMBLOCK configuration: [ 5.607922] memory size = 0x0000027d0d194000 reserved size = 0x0000000011faae02 [ 5.616163] memory.cnt = 0xe ... [ 5.754525] memory[0xc] [0x0000028357f00000-0x0000028357ffffff], 0x0000000000100000 bytes on node 0 flags: 0x0 ... ... [ 10.080295] Early memory node ranges[ 10.168065] ... node 0: [mem 0x0000028357f00000-0x0000028357ffffff] ... ... [ 8149.348948] SEV-SNP: RMPUPDATE failed for PFN 28357f7c, pg_level: 1, ret: 2 As shown above, the memblock allocations show 1MB after the end of the RMP as available for allocation, which is what the RMP table fixups have reserved. This memory range subsequently gets allocated as SNP guest memory, resulting in an RMPUPDATE failure. This can potentially be fixed by not reserving the memory range in the e820 table, but that causes kexec failures when using the KEXEC_FILE_LOAD syscall. The solution is to use memblock_reserve() to mark the memory reserved for the system, ensuring that it cannot be allocated to an SNP guest. Since HV-Fixed memory is still readable/writable by the host, this only ends up being a problem if the memory in this range requires a page state change, which generally will only happen when allocating memory in this range to be used for running SNP guests, which is now possible with the SNP hypervisor support in kernel 6.11. Backporter note: Fixes tag points to a 6.9 change but as the last paragraph above explains, this whole thing can happen after 6.11 received SNP HV support, therefore backporting to 6.9 is not really necessary. [ bp: Massage commit message. ] Fixes: `400fea4b96` ("x86/sev: Add callback to apply RMP table fixups for kexec") Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: <stable@kernel.org> # 6.11, see Backporter note above. Link: https://lore.kernel.org/r/20240815221630.131133-1-Ashish.Kalra@amd.com	2024-10-23 12:34:06 +02:00
Vladimir Oltean	34d35b4edb	net/sched: act_api: deny mismatched skip_sw/skip_hw flags for actions created by classifiers tcf_action_init() has logic for checking mismatches between action and filter offload flags (skip_sw/skip_hw). AFAIU, this is intended to run on the transition between the new tc_act_bind(flags) returning true (aka now gets bound to classifier) and tc_act_bind(act->tcfa_flags) returning false (aka action was not bound to classifier before). Otherwise, the check is skipped. For the case where an action is not standalone, but rather it was created by a classifier and is bound to it, tcf_action_init() skips the check entirely, and this means it allows mismatched flags to occur. Taking the matchall classifier code path as an example (with mirred as an action), the reason is the following: 1 \| mall_change() 2 \| -> mall_replace_hw_filter() 3 \| -> tcf_exts_validate_ex() 4 \| -> flags \|= TCA_ACT_FLAGS_BIND; 5 \| -> tcf_action_init() 6 \| -> tcf_action_init_1() 7 \| -> a_o->init() 8 \| -> tcf_mirred_init() 9 \| -> tcf_idr_create_from_flags() 10 \| -> tcf_idr_create() 11 \| -> p->tcfa_flags = flags; 12 \| -> tc_act_bind(flags)) 13 \| -> tc_act_bind(act->tcfa_flags) When invoked from tcf_exts_validate_ex() like matchall does (but other classifiers validate their extensions as well), tcf_action_init() runs in a call path where "flags" always contains TCA_ACT_FLAGS_BIND (set by line 4). So line 12 is always true, and line 13 is always true as well. No transition ever takes place, and the check is skipped. The code was added in this form in commit `c86e0209dc` ("flow_offload: validate flags of filter and actions"), but I'm attributing the blame even earlier in that series, to when TCA_ACT_FLAGS_SKIP_HW and TCA_ACT_FLAGS_SKIP_SW were added to the UAPI. Following the development process of this change, the check did not always exist in this form. A change took place between v3 [1] and v4 [2], AFAIU due to review feedback that it doesn't make sense for action flags to be different than classifier flags. I think I agree with that feedback, but it was translated into code that omits enforcing this for "classic" actions created at the same time with the filters themselves. There are 3 more important cases to discuss. First there is this command: $ tc qdisc add dev eth0 clasct $ tc filter add dev eth0 ingress matchall skip_sw \ action mirred ingress mirror dev eth1 which should be allowed, because prior to the concept of dedicated action flags, it used to work and it used to mean the action inherited the skip_sw/skip_hw flags from the classifier. It's not a mismatch. Then we have this command: $ tc qdisc add dev eth0 clasct $ tc filter add dev eth0 ingress matchall skip_sw \ action mirred ingress mirror dev eth1 skip_hw where there is a mismatch and it should be rejected. Finally, we have: $ tc qdisc add dev eth0 clasct $ tc filter add dev eth0 ingress matchall skip_sw \ action mirred ingress mirror dev eth1 skip_sw where the offload flags coincide, and this should be treated the same as the first command based on inheritance, and accepted. [1]: https://lore.kernel.org/netdev/20211028110646.13791-9-simon.horman@corigine.com/ [2]: https://lore.kernel.org/netdev/20211118130805.23897-10-simon.horman@corigine.com/ Fixes: `7adc576512` ("flow_offload: add skip_hw and skip_sw to control if offload the action") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20241017161049.3570037-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-23 11:31:27 +02:00
Leo Yan	0b6e2e22cb	tracing: Consider the NULL character when validating the event length strlen() returns a string length excluding the null byte. If the string length equals to the maximum buffer length, the buffer will have no space for the NULL terminating character. This commit checks this condition and returns failure for it. Link: https://lore.kernel.org/all/20241007144724.920954-1-leo.yan@arm.com/ Fixes: `dec65d79fd` ("tracing/probe: Check event name length correctly") Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-10-23 17:24:47 +09:00
Mikel Rychliski	73f3508047	tracing/probes: Fix MAX_TRACE_ARGS limit handling When creating a trace_probe we would set nr_args prior to truncating the arguments to MAX_TRACE_ARGS. However, we would only initialize arguments up to the limit. This caused invalid memory access when attempting to set up probes with more than 128 fetchargs. BUG: kernel NULL pointer dereference, address: 0000000000000020 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 UID: 0 PID: 1769 Comm: cat Not tainted 6.11.0-rc7+ #8 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 RIP: 0010:__set_print_fmt+0x134/0x330 Resolve the issue by applying the MAX_TRACE_ARGS limit earlier. Return an error when there are too many arguments instead of silently truncating. Link: https://lore.kernel.org/all/20240930202656.292869-1-mikel@mikelr.com/ Fixes: `035ba76014` ("tracing/probes: cleanup: Set trace_probe::nr_args at trace_probe_init") Signed-off-by: Mikel Rychliski <mikel@mikelr.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-10-23 17:24:44 +09:00
Pei Xiao	2b059d0d1e	slub/kunit: fix a WARNING due to unwrapped __kmalloc_cache_noprof 'modprobe slub_kunit' will have a warning as shown below. The root cause is that __kmalloc_cache_noprof was directly used, which resulted in no alloc_tag being allocated. This caused current->alloc_tag to be null, leading to a warning in alloc_tag_add_check. Let's add an alloc_hook layer to __kmalloc_cache_noprof specifically within lib/slub_kunit.c, which is the only user of this internal slub function outside kmalloc implementation itself. [58162.947016] WARNING: CPU: 2 PID: 6210 at ./include/linux/alloc_tag.h:125 alloc_tagging_slab_alloc_hook+0x268/0x27c [58162.957721] Call trace: [58162.957919] alloc_tagging_slab_alloc_hook+0x268/0x27c [58162.958286] __kmalloc_cache_noprof+0x14c/0x344 [58162.958615] test_kmalloc_redzone_access+0x50/0x10c [slub_kunit] [58162.959045] kunit_try_run_case+0x74/0x184 [kunit] [58162.959401] kunit_generic_run_threadfn_adapter+0x2c/0x4c [kunit] [58162.959841] kthread+0x10c/0x118 [58162.960093] ret_from_fork+0x10/0x20 [58162.960363] ---[ end trace 0000000000000000 ]--- Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Fixes: `a0a44d9175` ("mm, slab: don't wrap internal functions with alloc_hooks()") Signed-off-by: Vlastimil Babka <vbabka@suse.cz>	2024-10-23 09:50:58 +02:00
Elena Salomatkina	4b60a56555	sumversion: Fix a memory leak in get_src_version() strsep() modifies its first argument - buf. An invalid pointer will be passed to the free() function. Make the pointer passed to free() match the return value of read_text_file(). Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `9413e76405` ("kbuild: split the second line of .mod into .usyms") Signed-off-by: Elena Salomatkina <esalomatkina@ispras.ru> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-10-23 16:11:13 +09:00
Daniel Borkmann	82bbe13331	selftests/bpf: Add test for passing in uninit mtu_len Add a small test to pass an uninitialized mtu_len to the bpf_check_mtu() helper to probe whether the verifier rejects it under !CAP_PERFMON. # ./vmtest.sh -- ./test_progs -t verifier_mtu [...] ./test_progs -t verifier_mtu [ 1.414712] tsc: Refined TSC clocksource calibration: 3407.993 MHz [ 1.415327] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcd52370, max_idle_ns: 440795242006 ns [ 1.416463] clocksource: Switched to clocksource tsc [ 1.429842] bpf_testmod: loading out-of-tree module taints kernel. [ 1.430283] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel #510/1 verifier_mtu/uninit/mtu: write rejected:OK #510 verifier_mtu:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241021152809.33343-5-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-22 15:42:56 -07:00
Daniel Borkmann	baa802d2aa	selftests/bpf: Add test for writes to .rodata Add a small test to write a (verification-time) fixed vs unknown but bounded-sized buffer into .rodata BPF map and assert that both get rejected. # ./vmtest.sh -- ./test_progs -t verifier_const [...] ./test_progs -t verifier_const [ 1.418717] tsc: Refined TSC clocksource calibration: 3407.994 MHz [ 1.419113] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcde90a1, max_idle_ns: 440795222066 ns [ 1.419972] clocksource: Switched to clocksource tsc [ 1.449596] bpf_testmod: loading out-of-tree module taints kernel. [ 1.449958] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel #475/1 verifier_const/rodata/strtol: write rejected:OK #475/2 verifier_const/bss/strtol: write accepted:OK #475/3 verifier_const/data/strtol: write accepted:OK #475/4 verifier_const/rodata/mtu: write rejected:OK #475/5 verifier_const/bss/mtu: write accepted:OK #475/6 verifier_const/data/mtu: write accepted:OK #475/7 verifier_const/rodata/mark: write with unknown reg rejected:OK #475/8 verifier_const/rodata/mark: write with unknown reg rejected:OK #475 verifier_const:OK #476/1 verifier_const_or/constant register \|= constant should keep constant type:OK #476/2 verifier_const_or/constant register \|= constant should not bypass stack boundary checks:OK #476/3 verifier_const_or/constant register \|= constant register should keep constant type:OK #476/4 verifier_const_or/constant register \|= constant register should not bypass stack boundary checks:OK #476 verifier_const_or:OK Summary: 2/12 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241021152809.33343-4-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-22 15:42:56 -07:00
Daniel Borkmann	14a3d3ef02	bpf: Remove MEM_UNINIT from skb/xdp MTU helpers We can now undo parts of `4b3786a6c5` ("bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error") as discussed in [0]. Given the BPF helpers now have MEM_WRITE tag, the MEM_UNINIT can be cleared. The mtu_len is an input as well as output argument, meaning, the BPF program has to set it to something. It cannot be uninitialized. Therefore, allowing uninitialized memory and zeroing it on error would be odd. It was done as an interim step in `4b3786a6c5` as the desired behavior could not have been expressed before the introduction of MEM_WRITE tag. Fixes: `4b3786a6c5` ("bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/a86eb76d-f52f-dee4-e5d2-87e45de3e16f@iogearbox.net [0] Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241021152809.33343-3-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-22 15:42:56 -07:00
Daniel Borkmann	8ea607330a	bpf: Fix overloading of MEM_UNINIT's meaning Lonial reported an issue in the BPF verifier where check_mem_size_reg() has the following code: if (!tnum_is_const(reg->var_off)) /* For unprivileged variable accesses, disable raw * mode so that the program is required to * initialize all the memory that the helper could * just partially fill up. */ meta = NULL; This means that writes are not checked when the register containing the size of the passed buffer has not a fixed size. Through this bug, a BPF program can write to a map which is marked as read-only, for example, .rodata global maps. The problem is that MEM_UNINIT's initial meaning that "the passed buffer to the BPF helper does not need to be initialized" which was added back in commit `435faee1aa` ("bpf, verifier: add ARG_PTR_TO_RAW_STACK type") got overloaded over time with "the passed buffer is being written to". The problem however is that checks such as the above which were added later via `06c1c04972` ("bpf: allow helpers access to variable memory") set meta to NULL in order force the user to always initialize the passed buffer to the helper. Due to the current double meaning of MEM_UNINIT, this bypasses verifier write checks to the memory (not boundary checks though) and only assumes the latter memory is read instead. Fix this by reverting MEM_UNINIT back to its original meaning, and having MEM_WRITE as an annotation to BPF helpers in order to then trigger the BPF verifier checks for writing to memory. Some notes: check_arg_pair_ok() ensures that for ARG_CONST_SIZE{,_OR_ZERO} we can access fn->arg_type[arg - 1] since it must contain a preceding ARG_PTR_TO_MEM. For check_mem_reg() the meta argument can be removed altogether since we do check both BPF_READ and BPF_WRITE. Same for the equivalent check_kfunc_mem_size_reg(). Fixes: `7b3552d3f9` ("bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access") Fixes: `97e6d7dab1` ("bpf: Check PTR_TO_MEM \| MEM_RDONLY in check_helper_mem_access") Fixes: `15baa55ff5` ("bpf/verifier: allow all functions to read user provided context") Reported-by: Lonial Con <kongln9170@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241021152809.33343-2-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-22 15:42:56 -07:00
Daniel Borkmann	6fad274f06	bpf: Add MEM_WRITE attribute Add a MEM_WRITE attribute for BPF helper functions which can be used in bpf_func_proto to annotate an argument type in order to let the verifier know that the helper writes into the memory passed as an argument. In the past MEM_UNINIT has been (ab)used for this function, but the latter merely tells the verifier that the passed memory can be uninitialized. There have been bugs with overloading the latter but aside from that there are also cases where the passed memory is read + written which currently cannot be expressed, see also `4b3786a6c5` ("bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error"). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241021152809.33343-1-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-22 15:42:56 -07:00
Alex Deucher	7c210ca5a2	drm/amdgpu: handle default profile on on devices without fullscreen 3D Some devices do not support fullscreen 3D. v2: Make the check generic. Fixes: `ec1aab7816` ("drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Kenneth Feng <kenneth.feng@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> (cherry picked from commit `1cdd67510e`)	2024-10-22 18:21:51 -04:00
Mario Limonciello	ba1959f711	drm/amd/display: Disable PSR-SU on Parade 08-01 TCON too Stuart Hayhurst has found that both at bootup and fullscreen VA-API video is leading to black screens for around 1 second and kernel WARNING [1] traces when calling dmub_psr_enable() with Parade 08-01 TCON. These symptoms all go away with PSR-SU disabled for this TCON, so disable it for now while DMUB traces [2] from the failure can be analyzed and the failure state properly root caused. Cc: Marc Rossi <Marc.Rossi@amd.com> Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/uploads/a832dd515b571ee171b3e3b566e99a13/dmesg.log [1] Link: https://gitlab.freedesktop.org/drm/amd/uploads/8f13ff3b00963c833e23e68aa8116959/output.log [2] Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2645 Reviewed-by: Leo Li <sunpeng.li@amd.com> Link: https://lore.kernel.org/r/20240205211233.2601-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `afb634a682`) Cc: stable@vger.kernel.org	2024-10-22 18:13:03 -04:00
Frank Min	108bc59fe8	drm/amdgpu: fix random data corruption for sdma 7 There is random data corruption caused by const fill, this is caused by write compression mode not correctly configured. So correct compression mode for const fill. Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `75400f8d6e`) Cc: stable@vger.kernel.org # 6.11.x	2024-10-22 18:11:43 -04:00
Aurabindo Pillai	63feb35cd2	drm/amd/display: temp w/a for DP Link Layer compliance [Why&How] Disabling P-State support on full updates for DCN401 results in introducing additional communication with SMU. A UCLK hard min message to SMU takes 4 seconds to go through, which was due to DCN not allowing pstate switch, which was caused by incorrect value for TTU watermark before blanking the HUBP prior to DPG on for servicing the test request. Fix the issue temporarily by disallowing pstate changes for compliance test while test request handler is reworked for a proper fix. Fixes: `67ea53a4bd` ("drm/amd/display: Disable DCN401 UCLK P-State support on full updates") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8a79f7cdbb`) Cc: stable@vger.kernel.org	2024-10-22 18:11:20 -04:00
Aurabindo Pillai	23d16ede33	drm/amd/display: temp w/a for dGPU to enter idle optimizations [Why&How] vblank immediate disable currently does not work for all asics. On DCN401, the vblank interrupts never stop coming, and hence we never get a chance to trigger idle optimizations. Add a workaround to enable immediate disable only on APUs for now. This adds a 2-frame delay for triggering idle optimization, which is a negligible overhead. Fixes: `58a261bfc9` ("drm/amd/display: use a more lax vblank enable policy for older ASICs") Fixes: `e45b6716de` ("drm/amd/display: use a more lax vblank enable policy for DCN35+") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `9b47278cec`) Cc: stable@vger.kernel.org	2024-10-22 18:10:55 -04:00
Kenneth Feng	f67644b219	drm/amd/pm: update deep sleep status on smu v14.0.2/3 disable deep sleep during the compute workload for the potential performance loss on smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `7d9af459f4`)	2024-10-22 18:10:17 -04:00
Kenneth Feng	f888e3d34b	drm/amd/pm: update overdrive function on smu v14.0.2/3 update overdrive function on smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `dcf822fca5`)	2024-10-22 18:10:08 -04:00
Kenneth Feng	9515e74d75	drm/amd/pm: update the driver-fw interface file for smu v14.0.2/3 update the driver-fw interface file for smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `0642c95efb`)	2024-10-22 18:09:06 -04:00
Mario Limonciello	bf58f03931	drm/amd: Guard against bad data for ATIF ACPI method If a BIOS provides bad data in response to an ATIF method call this causes a NULL pointer dereference in the caller. ``` ? show_regs (arch/x86/kernel/dumpstack.c:478 (discriminator 1)) ? __die (arch/x86/kernel/dumpstack.c:423 arch/x86/kernel/dumpstack.c:434) ? page_fault_oops (arch/x86/mm/fault.c:544 (discriminator 2) arch/x86/mm/fault.c:705 (discriminator 2)) ? do_user_addr_fault (arch/x86/mm/fault.c:440 (discriminator 1) arch/x86/mm/fault.c:1232 (discriminator 1)) ? acpi_ut_update_object_reference (drivers/acpi/acpica/utdelete.c:642) ? exc_page_fault (arch/x86/mm/fault.c:1542) ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) ? amdgpu_atif_query_backlight_caps.constprop.0 (drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:387 (discriminator 2)) amdgpu ? amdgpu_atif_query_backlight_caps.constprop.0 (drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:386 (discriminator 1)) amdgpu ``` It has been encountered on at least one system, so guard for it. Fixes: `d38ceaf99e` ("drm/amdgpu: add core driver (v4)") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `c9b7c809b8`) Cc: stable@vger.kernel.org	2024-10-22 18:08:12 -04:00
Krzysztof Kozlowski	db7e59e6a3	ASoC: qcom: sc7280: Fix missing Soundwire runtime stream alloc Commit `15c7fab0e0` ("ASoC: qcom: Move Soundwire runtime stream alloc to soundcards") moved the allocation of Soundwire stream runtime from the Qualcomm Soundwire driver to each individual machine sound card driver, except that it forgot to update SC7280 card. Just like for other Qualcomm sound cards using Soundwire, the card driver should allocate and release the runtime. Otherwise sound playback will result in a NULL pointer dereference or other effect of uninitialized memory accesses (which was confirmed on SDM845 having similar issue). Cc: stable@vger.kernel.org Cc: Alexey Klimov <alexey.klimov@linaro.org> Cc: Steev Klimaszewski <steev@kali.org> Fixes: `15c7fab0e0` ("ASoC: qcom: Move Soundwire runtime stream alloc to soundcards") Link: https://lore.kernel.org/r/20241010054109.16938-1-krzysztof.kozlowski@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241012101108.129476-1-krzysztof.kozlowski@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-22 21:00:38 +01:00
Hou Tao	1f97c03f43	bpf: Preserve param->string when parsing mount options In bpf_parse_param(), keep the value of param->string intact so it can be freed later. Otherwise, the kmalloc area pointed to by param->string will be leaked as shown below: unreferenced object 0xffff888118c46d20 (size 8): comm "new_name", pid 12109, jiffies 4295580214 hex dump (first 8 bytes): 61 6e 79 00 38 c9 5c 7e any.8.\~ backtrace (crc e1b7f876): [<00000000c6848ac7>] kmemleak_alloc+0x4b/0x80 [<00000000de9f7d00>] __kmalloc_node_track_caller_noprof+0x36e/0x4a0 [<000000003e29b886>] memdup_user+0x32/0xa0 [<0000000007248326>] strndup_user+0x46/0x60 [<0000000035b3dd29>] __x64_sys_fsconfig+0x368/0x3d0 [<0000000018657927>] x64_sys_call+0xff/0x9f0 [<00000000c0cabc95>] do_syscall_64+0x3b/0xc0 [<000000002f331597>] entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: `6c1752e0b6` ("bpf: Support symbolic BPF FS delegation mount options") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20241022130133.3798232-1-houtao@huaweicloud.com	2024-10-22 12:56:38 -07:00
Georgi Djakov	d0ccf760a4	spi: geni-qcom: Fix boot warning related to pm_runtime and devres During boot, users sometimes observe the following warning: [7.841431] WARNING: CPU: 4 PID: 492 at drivers/interconnect/core.c:685 __icc_enable (drivers/interconnect/core.c:685 (discriminator 7)) [..] [7.841541] Call trace: [7.841542] __icc_enable (drivers/interconnect/core.c:685 (discriminator 7)) [7.841545] icc_disable (drivers/interconnect/core.c:708) [7.841547] geni_icc_disable (drivers/soc/qcom/qcom-geni-se.c:862) [7.841553] spi_geni_runtime_suspend+0x3c/0x4c spi_geni_qcom This occurs when the spi-geni driver receives an -EPROBE_DEFER error from spi_geni_grab_gpi_chan(), causing devres to start releasing all resources as shown below: [7.138679] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_icc_release (8 bytes) [7.138751] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_icc_release (8 bytes) [7.138827] geni_spi 880000.spi: DEVRES REL ffff800081443800 pm_runtime_disable_action (16 bytes) [7.139494] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_pm_opp_config_release (16 bytes) [7.139512] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_spi_release_controller (8 bytes) [7.139516] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_clk_release (16 bytes) [7.139519] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_ioremap_release (8 bytes) [7.139524] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_region_release (24 bytes) [7.139527] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_kzalloc_release (22 bytes) [7.139530] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_pinctrl_release (8 bytes) [7.139539] geni_spi 880000.spi: DEVRES REL ffff800081443800 devm_kzalloc_release (40 bytes) The issue here is that pm_runtime_disable_action() results in a call to spi_geni_runtime_suspend(), which attempts to suspend the device and disable an interconnect path that devm_icc_release() has just released. Resolve this by calling geni_icc_get() before enabling runtime PM. This approach ensures that when devres releases resources in reverse order, it will start with pm_runtime_disable_action(), suspending the device, and then proceed to free the remaining resources. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/r/CA+G9fYtsjFtddG8i+k-SpV8U6okL0p4zpsTiwGfNH5GUA8dWAA@mail.gmail.com Fixes: `89e362c883` ("spi: geni-qcom: Undo runtime PM changes at driver exit time") Signed-off-by: Georgi Djakov <djakov@kernel.org> Link: https://patch.msgid.link/20241008231615.430073-1-djakov@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-22 20:49:02 +01:00
Shengjiu Wang	b9a8ecf810	ASoC: fsl_micfil: Add sample rate constraint On some platforms, for example i.MX93, there is only one audio PLL source, so some sample rate can't be supported. If the PLL source is used for 8kHz series rates, then 11kHz series rates can't be supported. So add constraints according to the frequency of available clock sources, then alsa-lib will help to convert the unsupported rate for the driver. Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/1728884313-6778-1-git-send-email-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-22 17:36:30 +01:00
Keith Busch	f54f0d0e2b	nvme: enhance cns version checking The number of CNS bits in the command is specific to the nvme spec version compliance. The existing check is not sufficient for possible CNS values the driver uses that may create confusion between host and device, so enhance the check to consider the version and desired CNS value. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-22 09:11:03 -07:00
Borislav Petkov (AMD)	1d81d85d1a	x86/microcode/AMD: Split load_microcode_amd() This function should've been split a long time ago because it is used in two paths: 1) On the late loading path, when the microcode is loaded through the request_firmware interface 2) In the save_microcode_in_initrd() path which collects all the microcode patches which are relevant for the current system before the initrd with the microcode container has been jettisoned. In that path, it is not really necessary to iterate over the nodes on a system and match a patch however it didn't cause any trouble so it was left for a later cleanup However, that later cleanup was expedited by the fact that Jens was enabling "Use L3 as a NUMA node" in the BIOS setting in his machine and so this causes the NUMA CPU masks used in cpumask_of_node() to be generated after 2) above happened on the first node. Which means, all those masks were funky, wrong, uninitialized and whatnot, leading to explosions when dereffing c->microcode in load_microcode_amd(). So split that function and do only the necessary work needed at each stage. Fixes: `94838d230a` ("x86/microcode/AMD: Use the family,model,stepping encoded in the patch ID") Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/91194406-3fdf-4e38-9838-d334af538f74@kernel.dk	2024-10-22 16:48:00 +02:00
Dave Kleikamp	67373ca840	jfs: Fix sanity check in dbMount MAXAG is a legitimate value for bmp->db_numag Fixes: `e63866a475` ("jfs: fix out-of-bounds in dbNextAG() and diAlloc()") Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>	2024-10-22 09:40:37 -05:00
Borislav Petkov (AMD)	d1744a4c97	x86/microcode/AMD: Pay attention to the stepping dynamically Commit in Fixes changed how a microcode patch is loaded on Zen and newer but the patch matching needs to happen with different rigidity, depending on what is being done: 1) When the patch is added to the patches cache, the stepping must be ignored because the driver still supports different steppings per system 2) When the patch is matched for loading, then the stepping must be taken into account because each CPU needs the patch matching its exact stepping Take care of that by making the matching smarter. Fixes: `94838d230a` ("x86/microcode/AMD: Use the family,model,stepping encoded in the patch ID") Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/91194406-3fdf-4e38-9838-d334af538f74@kernel.dk	2024-10-22 16:37:13 +02:00
Yue Haibing	75f49c3dc7	btrfs: fix passing 0 to ERR_PTR in btrfs_search_dir_index_item() The ret may be zero in btrfs_search_dir_index_item() and should not passed to ERR_PTR(). Now btrfs_unlink_subvol() is the only caller to this, reconstructed it to check ERR_PTR(-ENOENT) while ret >= 0. This fixes smatch warnings: fs/btrfs/dir-item.c:353 btrfs_search_dir_index_item() warn: passing zero to 'ERR_PTR' Fixes: `9dcbe16fcc` ("btrfs: use btrfs_for_each_slot in btrfs_search_dir_index_item") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-22 16:10:55 +02:00
Qu Wenruo	3c36a72c1d	btrfs: reject ro->rw reconfiguration if there are hard ro requirements [BUG] Syzbot reports the following crash: BTRFS info (device loop0 state MCS): disabling free space tree BTRFS info (device loop0 state MCS): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1) BTRFS info (device loop0 state MCS): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2) Oops: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:backup_super_roots fs/btrfs/disk-io.c:1691 [inline] RIP: 0010:write_all_supers+0x97a/0x40f0 fs/btrfs/disk-io.c:4041 Call Trace: <TASK> btrfs_commit_transaction+0x1eae/0x3740 fs/btrfs/transaction.c:2530 btrfs_delete_free_space_tree+0x383/0x730 fs/btrfs/free-space-tree.c:1312 btrfs_start_pre_rw_mount+0xf28/0x1300 fs/btrfs/disk-io.c:3012 btrfs_remount_rw fs/btrfs/super.c:1309 [inline] btrfs_reconfigure+0xae6/0x2d40 fs/btrfs/super.c:1534 btrfs_reconfigure_for_mount fs/btrfs/super.c:2020 [inline] btrfs_get_tree_subvol fs/btrfs/super.c:2079 [inline] btrfs_get_tree+0x918/0x1920 fs/btrfs/super.c:2115 vfs_get_tree+0x90/0x2b0 fs/super.c:1800 do_new_mount+0x2be/0xb40 fs/namespace.c:3472 do_mount fs/namespace.c:3812 [inline] __do_sys_mount fs/namespace.c:4020 [inline] __se_sys_mount+0x2d6/0x3c0 fs/namespace.c:3997 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f [CAUSE] To support mounting different subvolume with different RO/RW flags for the new mount APIs, btrfs introduced two workaround to support this feature: - Skip mount option/feature checks if we are mounting a different subvolume - Reconfigure the fs to RW if the initial mount is RO Combining these two, we can have the following sequence: - Mount the fs ro,rescue=all,clear_cache,space_cache=v1 rescue=all will mark the fs as hard read-only, so no v2 cache clearing will happen. - Mount a subvolume rw of the same fs. We go into btrfs_get_tree_subvol(), but fc_mount() returns EBUSY because our new fc is RW, different from the original fs. Now we enter btrfs_reconfigure_for_mount(), which switches the RO flag first so that we can grab the existing fs_info. Then we reconfigure the fs to RW. - During reconfiguration, option/features check is skipped This means we will restart the v2 cache clearing, and convert back to v1 cache. This will trigger fs writes, and since the original fs has "rescue=all" option, it skips the csum tree read. And eventually causing NULL pointer dereference in super block writeback. [FIX] For reconfiguration caused by different subvolume RO/RW flags, ensure we always run btrfs_check_options() to ensure we have proper hard RO requirements met. In fact the function btrfs_check_options() doesn't really do many complex checks, but hard RO requirement and some feature dependency checks, thus there is no special reason not to do the check for mount reconfiguration. Reported-by: syzbot+56360f93efa90ff15870@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/0000000000008c5d090621cb2770@google.com/ Fixes: `f044b31867` ("btrfs: handle the ro->rw transition for mounting different subvolumes") CC: stable@vger.kernel.org # 6.8+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-22 16:10:51 +02:00
Boris Burkov	7a2339058e	btrfs: fix read corruption due to race with extent map merging In debugging some corrupt squashfs files, we observed symptoms of corrupt page cache pages but correct on-disk contents. Further investigation revealed that the exact symptom was a correct page followed by an incorrect, duplicate, page. This got us thinking about extent maps. commit `ac05ca913e` ("Btrfs: fix race between using extent maps and merging them") enforces a reference count on the primary `em` extent_map being merged, as that one gets modified. However, since, commit `3d2ac99224` ("btrfs: introduce new members for extent_map") both 'em' and 'merge' get modified, which started modifying 'merge' and thus introduced the same race. We were able to reproduce this by looping the affected squashfs workload in parallel on a bunch of separate btrfs-es while also dropping caches. We are still working on a simple enough reproducer to make into an fstest. The simplest fix is to stop modifying 'merge', which is not essential, as it is dropped immediately after the merge. This behavior is simply a consequence of the order of the two extent maps being important in computing the new values. Modify merge_ondisk_extents to take prev and next by const* and also take a third merged parameter that it puts the results in. Note that this introduces the rather odd behavior of passing 'em' to merge_ondisk_extents as a const * and as a regular ptr. Fixes: `3d2ac99224` ("btrfs: introduce new members for extent_map") CC: stable@vger.kernel.org # 6.11+ Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-22 16:10:13 +02:00
Qu Wenruo	f10f59f91a	btrfs: fix the delalloc range locking if sector size < page size Inside lock_delalloc_folios(), there are several problems related to sector size < page size handling: - Set the writer locks without checking if the folio is still valid We call btrfs_folio_start_writer_lock() just like it's folio_lock(). But since the folio may not even be the folio of the current mapping, we can easily screw up the folio->private. - The range is not clamped inside the page This means we can over write other bitmaps if the start/len is not properly handled, and trigger the btrfs_subpage_assert(). - @processed_end is always rounded up to page end If the delalloc range is not page aligned, and we need to retry (returning -EAGAIN), then we will unlock to the page end. Thankfully this is not a huge problem, as now btrfs_folio_end_writer_lock() can handle range larger than the locked range, and only unlock what is already locked. Fix all these problems by: - Lock and check the folio first, then call btrfs_folio_set_writer_lock() So that if we got a folio not belonging to the inode, we won't touch folio->private. - Properly truncate the range inside the page - Update @processed_end to the locked range end Fixes: `1e1de38792` ("btrfs: make process_one_page() to handle subpage locking") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-22 16:09:44 +02:00
Qu Wenruo	5f9062a48d	btrfs: qgroup: set a more sane default value for subtree drop threshold Since commit `011b46c304` ("btrfs: skip subtree scan if it's too high to avoid low stall in btrfs_commit_transaction()"), btrfs qgroup can automatically skip large subtree scan at the cost of marking qgroup inconsistent. It's designed to address the final performance problem of snapshot drop with qgroup enabled, but to be safe the default value is BTRFS_MAX_LEVEL, requiring a user space daemon to set a different value to make it work. I'd say it's not a good idea to rely on user space tool to set this default value, especially when some operations (snapshot dropping) can be triggered immediately after mount, leaving a very small window to that that sysfs interface. So instead of disabling this new feature by default, enable it with a low threshold (3), so that large subvolume tree drop at mount time won't cause huge qgroup workload. CC: stable@vger.kernel.org # 6.1 Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-22 16:09:11 +02:00
Filipe Manana	3510e684b8	btrfs: clear force-compress on remount when compress mount option is given After the migration to use fs context for processing mount options we had a slight change in the semantics for remounting a filesystem that was mounted with compress-force. Before we could clear compress-force by passing only "-o compress[=algo]" during a remount, but after that change that does not work anymore, force-compress is still present and one needs to pass "-o compress-force=no,compress[=algo]" to the mount command. Example, when running on a kernel 6.8+: $ mount -o compress-force=zlib:9 /dev/sdi /mnt/sdi $ mount \| grep sdi /dev/sdi on /mnt/sdi type btrfs (rw,relatime,compress-force=zlib:9,discard=async,space_cache=v2,subvolid=5,subvol=/) $ mount -o remount,compress=zlib:5 /mnt/sdi $ mount \| grep sdi /dev/sdi on /mnt/sdi type btrfs (rw,relatime,compress-force=zlib:5,discard=async,space_cache=v2,subvolid=5,subvol=/) On a 6.7 kernel (or older): $ mount -o compress-force=zlib:9 /dev/sdi /mnt/sdi $ mount \| grep sdi /dev/sdi on /mnt/sdi type btrfs (rw,relatime,compress-force=zlib:9,discard=async,space_cache=v2,subvolid=5,subvol=/) $ mount -o remount,compress=zlib:5 /mnt/sdi $ mount \| grep sdi /dev/sdi on /mnt/sdi type btrfs (rw,relatime,compress=zlib:5,discard=async,space_cache=v2,subvolid=5,subvol=/) So update btrfs_parse_param() to clear "compress-force" when "compress" is given, providing the same semantics as kernel 6.7 and older. Reported-by: Roman Mamedov <rm@romanrm.net> Link: https://lore.kernel.org/linux-btrfs/20241014182416.13d0f8b0@nvm/ CC: stable@vger.kernel.org # 6.8+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-22 16:07:53 +02:00
Hsin-Te Yuan	655c6c1b7a	drm/mediatek: Fix color format MACROs in OVL In commit `9f428b95ac` ("drm/mediatek: Add new color format MACROs in OVL"), some new color formats are defined in the MACROs to make the switch statement more concise. That commit was intended to be a no-op cleanup. However, there are typos in these formats MACROs, which cause the return value to be incorrect. Fix the typos to ensure the return value remains unchanged. Fixes: `9f428b95ac` ("drm/mediatek: Add new color format MACROs in OVL") Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Hsin-Te Yuan <yuanhsinte@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241016-color-v3-1-e0f5f44a72d8@chromium.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-22 12:50:51 +00:00
Jason-JH.Lin	e6411bf2ae	drm/mediatek: Add blend_modes to mtk_plane_init() for different SoCs Since some SoCs support premultiplied pixel formats but some do not, the blend_modes parameter is added to mtk_plane_init(), which is obtained from the mtk_ddp_comp_get_blend_modes function implemented in different blending supported components. The blending supported components can use driver data to set the blend mode capabilities for different SoCs. Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241009034646.13143-6-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-22 12:47:47 +00:00
Jason-JH.Lin	333ab43616	drm/mediatek: ovl: Add blend_modes to driver data OVL_CON_CLRFMT_MAN is a configuration for extending color format settings of DISP_REG_OVL_CON(n). It will change some of the original color format settings. Take the settings of (3 << 12) for example. - If OVL_CON_CLRFMT_MAN = 0 means OVL_CON_CLRFMT_RGBA8888. - If OVL_CON_CLRFMT_MAN = 1 means OVL_CON_CLRFMT_PARGB8888. Since previous SoCs did not support OVL_CON_CLRFMT_MAN, this means that the SoC does not support the premultiplied color format. It will break the original color format setting of MT8173. Therefore, the blend_modes is added to the driver data and then mtk_ovl_fmt_convert() will check the blend_modes to see if pre-multiplied is supported in the current platform. If it is not supported, use coverage mode to set it to the supported color formats to solve the degradation problem. Fixes: `a3f7f7ef4b` ("drm/mediatek: Support "Pre-multiplied" blending in OVL") Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241009034646.13143-5-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-22 12:46:15 +00:00
Jason-JH.Lin	41607c3ceb	drm/mediatek: ovl: Remove the color format comment for ovl_fmt_convert() Since we changed MACROs to be consistent with DRM input color format naming, the comment for ovl_fmt_conver() is no longer needed. Fixes: `9f428b95ac` ("drm/mediatek: Add new color format MACROs in OVL") Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241009034646.13143-4-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-22 12:44:44 +00:00
Jason-JH.Lin	28fbc3293f	drm/mediatek: ovl: Refine ignore_pixel_alpha comment and placement Refine the comment for ignore_pixel_alpha flag and move it to if(state->fb) statement to make it less conditional. Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241009034646.13143-3-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-22 12:43:15 +00:00
Jason-JH.Lin	995d4d558e	drm/mediatek: ovl: Fix XRGB format breakage for blend_modes unsupported SoCs OVL_CON_AEN is for alpha blending enable. For the SoC that is supported the blend_modes, OVL_CON_AEN will always enabled to use constant alpha and then use the ignore_pixel_alpha bit to do the alpha blending for XRGB8888 format. Note that ignore pixel alpha bit is not supported if the SoC is not supported the blend_modes. So it will break the original setting of XRGB8888 format for the blend_modes unsupported SoCs, such as MT8173. To fix the downgrade issue, enable alpha blending only when a valid blend_mode or has_alpha is set. Fixes: `bc46eb5d5d` ("drm/mediatek: Support DRM plane alpha in OVL") Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241009034646.13143-2-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-10-22 12:41:35 +00:00
Christoph Hellwig	4a201dcfa1	xfs: update the pag for the last AG at recovery time Currently log recovery never updates the in-core perag values for the last allocation group when they were grown by growfs. This leads to btree record validation failures for the alloc, ialloc or finotbt trees if a transaction references this new space. Found by Brian's new growfs recovery stress test. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-22 13:37:19 +02:00
Christoph Hellwig	069cf5e32b	xfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag __GFP_RETRY_MAYFAIL increases the likelyhood of allocations to fail, which isn't really helpful during log recovery. Remove the flag and stick to the default GFP_KERNEL policies. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-22 13:37:18 +02:00
Christoph Hellwig	b882b0f813	xfs: error out when a superblock buffer update reduces the agcount XFS currently does not support reducing the agcount, so error out if a logged sb buffer tries to shrink the agcount. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-22 13:37:18 +02:00
Christoph Hellwig	6a18765b54	xfs: update the file system geometry after recoverying superblock buffers Primary superblock buffers that change the file system geometry after a growfs operation can affect the operation of later CIL checkpoints that make use of the newly added space and allocation groups. Apply the changes to the in-memory structures as part of recovery pass 2, to ensure recovery works fine for such cases. In the future we should apply the logic to other updates such as features bits as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-22 13:37:18 +02:00
Christoph Hellwig	aa67ec6a25	xfs: merge the perag freeing helpers There is no good reason to have two different routines for freeing perag structures for the unmount and error cases. Add two arguments to specify the range of AGs to free to xfs_free_perag, and use that to replace xfs_free_unused_perag_range. The addition RCU grace period for the error case is harmless, and the extra check for the AG to actually exist is not required now that the callers pass the exact known allocated range. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-22 13:37:18 +02:00
Christoph Hellwig	82742f8c3f	xfs: pass the exact range to initialize to xfs_initialize_perag Currently only the new agcount is passed to xfs_initialize_perag, which requires lookups of existing AGs to skip them and complicates error handling. Also pass the previous agcount so that the range that xfs_initialize_perag operates on is exactly defined. That way the extra lookups can be avoided, and error handling can clean up the exact range from the old count to the last added perag structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-22 13:37:18 +02:00
Darrick J. Wong	af8512c527	xfs: don't fail repairs on metadata files with no attr fork Fix a minor bug where we fail repairs on metadata files that do not have attr forks because xrep_metadata_inode_subtype doesn't filter ENOENT. Cc: stable@vger.kernel.org # v6.8 Fixes: `5a8e07e799` ("xfs: repair the inode core and forks of a metadata inode") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-22 13:37:18 +02:00
Oliver Neukum	8a7d12d674	net: usb: usbnet: fix name regression The fix for MAC addresses broke detection of the naming convention because it gave network devices no random MAC before bind() was called. This means that the check for the local assignment bit was always negative as the address was zeroed from allocation, instead of from overwriting the MAC with a unique hardware address. The correct check for whether bind() has altered the MAC is done with is_zero_ether_addr Signed-off-by: Oliver Neukum <oneukum@suse.com> Reported-by: Greg Thelen <gthelen@google.com> Diagnosed-by: John Sperbeck <jsperbeck@google.com> Fixes: `bab8eb0dd4` ("usbnet: modern method to get random MAC") Link: https://patch.msgid.link/20241017071849.389636-1-oneukum@suse.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-22 13:24:26 +02:00
Yuan Can	f7b4cf0306	mlxsw: spectrum_router: fix xa_store() error checking It is meant to use xa_err() to extract the error encoded in the return value of xa_store(). Fixes: `44c2fbebe1` ("mlxsw: spectrum_router: Share nexthop counters in resilient groups") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20241017023223.74180-1-yuancan@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-22 13:01:55 +02:00
Paolo Abeni	fa287557e6	Merge tag 'nf-24-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== This patchset contains Netfilter fixes for net: 1) syzkaller managed to triger UaF due to missing reference on netns in bpf infrastructure, from Florian Westphal. 2) Fix incorrect conversion from NFPROTO_UNSPEC to NFPROTO_{IPV4,IPV6} in the following xtables targets: MARK and NFLOG. Moreover, add missing I have my half share in this mistake, I did not take the necessary time to review this: For several years I have been struggling to keep working on Netfilter, juggling a myriad of side consulting projects to stop burning my own savings. I have extended the iptables-tests.py test infrastructure to improve the coverage of ip6tables and detect similar problems in the future. This is a v2 including a extended PR with one more fix. netfilter pull request 24-10-21 * tag 'nf-24-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: xtables: fix typo causing some targets not to load on IPv6 netfilter: bpf: must hold reference on net namespace ==================== Link: https://patch.msgid.link/20241021094536.81487-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-22 12:43:42 +02:00
Bartosz Golaszewski	7e336a6c15	MAINTAINERS: add a keyword entry for the GPIO subsystem Every now and then - despite being clearly documented as deprecated - the legacy GPIO API is being used in some new drivers in the kernel. Add a keyword pattern matching the unwanted functions so that I get Cc'ed anytime they're being used and get the chance to object. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20241017071835.19069-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-10-22 09:13:10 +02:00
Viktor Malik	aff1871bfc	objpool: fix choosing allocation for percpu slots objpool intends to use vmalloc for default (non-atomic) allocations of percpu slots and objects. However, the condition checking if GFP flags set any bit of GFP_ATOMIC is wrong b/c GFP_ATOMIC is a combination of bits (__GFP_HIGH\|__GFP_KSWAPD_RECLAIM) and so `pool->gfp & GFP_ATOMIC` will be true if either bit is set. Since GFP_ATOMIC and GFP_KERNEL share the ___GFP_KSWAPD_RECLAIM bit, kmalloc will be used in cases when GFP_KERNEL is specified, i.e. in all current usages of objpool. This may lead to unexpected OOM errors since kmalloc cannot allocate large amounts of memory. For instance, objpool is used by fprobe rethook which in turn is used by BPF kretprobe.multi and kprobe.session probe types. Trying to attach these to all kernel functions with libbpf using SEC("kprobe.session/") int kprobe(struct pt_regs ctx) { [...] } fails on objpool slot allocation with ENOMEM. Fix the condition to truly use vmalloc by default. Link: https://lore.kernel.org/all/20240826060718.267261-1-vmalik@redhat.com/ Fixes: `b4edb8d2d4` ("lib: objpool added: ring-array based lockless MPMC") Signed-off-by: Viktor Malik <vmalik@redhat.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Matt Wu <wuqiang.matt@bytedance.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-10-22 14:22:42 +09:00
Mark Brown	c2ee9f594d	KVM: selftests: Fix build on on non-x86 architectures Commit `9a400068a1` ("KVM: selftests: x86: Avoid using SSE/AVX instructions") unconditionally added -march=x86-64-v2 to the CFLAGS used to build the KVM selftests which does not work on non-x86 architectures: cc1: error: unknown value ‘x86-64-v2’ for ‘-march’ Fix this by making the addition of this x86 specific command line flag conditional on building for x86. Fixes: `9a400068a1` ("KVM: selftests: x86: Avoid using SSE/AVX instructions") Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-10-21 15:49:33 -07:00
Linus Torvalds	a360f311f5	9p: fix slab cache name creation for real This was attempted by using the dev_name in the slab cache name, but as Omar Sandoval pointed out, that can be an arbitrary string, eg something like "/dev/root". Which in turn trips verify_dirent_name(), which fails if a filename contains a slash. So just make it use a sequence counter, and make it an atomic_t to avoid any possible races or locking issues. Reported-and-tested-by: Omar Sandoval <osandov@fb.com> Link: https://lore.kernel.org/all/ZxafcO8KWMlXaeWE@telecaster.dhcp.thefacebook.com/ Fixes: `79efebae4a` ("9p: Avoid creating multiple slab caches with the same name") Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-10-21 15:41:29 -07:00
Pawan Gupta	3267cb6d3a	x86/lam: Disable ADDRESS_MASKING in most cases Linear Address Masking (LAM) has a weakness related to transient execution as described in the SLAM paper[1]. Unless Linear Address Space Separation (LASS) is enabled this weakness may be exploitable. Until kernel adds support for LASS[2], only allow LAM for COMPILE_TEST, or when speculation mitigations have been disabled at compile time, otherwise keep LAM disabled. There are no processors in market that support LAM yet, so currently nobody is affected by this issue. [1] SLAM: https://download.vusec.net/papers/slam_sp24.pdf [2] LASS: https://lore.kernel.org/lkml/20230609183632.48706-1-alexander.shishkin@linux.intel.com/ [ dhansen: update SPECULATION_MITIGATIONS -> CPU_MITIGATIONS ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/5373262886f2783f054256babdf5a98545dc986b.1706068222.git.pawan.kumar.gupta%40linux.intel.com	2024-10-21 15:05:43 -07:00
Linus Torvalds	d129377639	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "ARM64: - Fix the guest view of the ID registers, making the relevant fields writable from userspace (affecting ID_AA64DFR0_EL1 and ID_AA64PFR1_EL1) - Correcly expose S1PIE to guests, fixing a regression introduced in 6.12-rc1 with the S1POE support - Fix the recycling of stage-2 shadow MMUs by tracking the context (are we allowed to block or not) as well as the recycling state - Address a couple of issues with the vgic when userspace misconfigures the emulation, resulting in various splats. Headaches courtesy of our Syzkaller friends - Stop wasting space in the HYP idmap, as we are dangerously close to the 4kB limit, and this has already exploded in -next - Fix another race in vgic_init() - Fix a UBSAN error when faking the cache topology with MTE enabled RISCV: - RISCV: KVM: use raw_spinlock for critical section in imsic x86: - A bandaid for lack of XCR0 setup in selftests, which causes trouble if the compiler is configured to have x86-64-v3 (with AVX) as the default ISA. Proper XCR0 setup will come in the next merge window. - Fix an issue where KVM would not ignore low bits of the nested CR3 and potentially leak up to 31 bytes out of the guest memory's bounds - Fix case in which an out-of-date cached value for the segments could by returned by KVM_GET_SREGS. - More cleanups for KVM_X86_QUIRK_SLOT_ZAP_ALL - Override MTRR state for KVM confidential guests, making it WB by default as is already the case for Hyper-V guests. Generic: - Remove a couple of unused functions" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (27 commits) RISCV: KVM: use raw_spinlock for critical section in imsic KVM: selftests: Fix out-of-bounds reads in CPUID test's array lookups KVM: selftests: x86: Avoid using SSE/AVX instructions KVM: nSVM: Ignore nCR3[4:0] when loading PDPTEs from memory KVM: VMX: reset the segment cache after segment init in vmx_vcpu_reset() KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL KVM: x86/mmu: Add lockdep assert to enforce safe usage of kvm_unmap_gfn_range() KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot x86/kvm: Override default caching mode for SEV-SNP and TDX KVM: Remove unused kvm_vcpu_gfn_to_pfn_atomic KVM: Remove unused kvm_vcpu_gfn_to_pfn KVM: arm64: Ensure vgic_ready() is ordered against MMIO registration KVM: arm64: vgic: Don't check for vgic_ready() when setting NR_IRQS KVM: arm64: Fix shift-out-of-bounds bug KVM: arm64: Shave a few bytes from the EL2 idmap code KVM: arm64: Don't eagerly teardown the vgic on init error KVM: arm64: Expose S1PIE to guests KVM: arm64: nv: Clarify safety of allowing TLBI unmaps to reschedule KVM: arm64: nv: Punt stage-2 recycling to a vCPU request KVM: arm64: nv: Do not block when unmapping stage-2 if disallowed ...	2024-10-21 11:22:04 -07:00
Linus Torvalds	c1bc09d7bf	Merge tag 'probes-fixes-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull uprobe fix from Masami Hiramatsu: - uprobe: avoid out-of-bounds memory access of fetching args Uprobe trace events can cause out-of-bounds memory access when fetching user-space data which is bigger than one page, because it does not check the local CPU buffer size when reading the data. This checks the read data size and cut it down to the local CPU buffer size. * tag 'probes-fixes-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: uprobe: avoid out-of-bounds memory access of fetching args	2024-10-21 11:08:05 -07:00
Dipendra Khadka	e70d2677ef	phy: tegra: xusb: Add error pointer check in xusb.c Add error pointer check after tegra_xusb_find_lane(). Fixes: `e8f7d2f409` ("phy: tegra: xusb: Add usb-phy support") Signed-off-by: Dipendra Khadka <kdipendra88@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20240930191101.13184-1-kdipendra88@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-21 23:34:42 +05:30
Abel Vesa	16fde3e076	dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Fix X1E80100 resets entries The PCIe 6a PHY is actually Gen4 4-lanes capable. So the gen4x4 compatible describes it. But according to the schema, currently the gen4x4 compatible doesn't require both PHY and PHY-nocsr resets, while the HW does. So fix that by adding the gen4x4 compatible alongside the gen4x2 one for the resets description. Fixes: `0c5f4d23f7` ("dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Document the X1E80100 QMP PCIe PHY Gen4 x4") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410182029.n2zPkuGx-lkp@intel.com/ Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20241021-phy-qcom-qmp-pcie-fix-x1e80100-gen4x4-resets-v3-1-1918c46fc37c@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-21 23:33:25 +05:30
Richard Zhu	f89263b697	phy: freescale: imx8m-pcie: Do CMN_RST just before PHY PLL lock check When enable initcall_debug together with higher debug level below. CONFIG_CONSOLE_LOGLEVEL_DEFAULT=9 CONFIG_CONSOLE_LOGLEVEL_QUIET=9 CONFIG_MESSAGE_LOGLEVEL_DEFAULT=7 The initialization of i.MX8MP PCIe PHY might be timeout failed randomly. To fix this issue, adjust the sequence of the resets refer to the power up sequence listed below. i.MX8MP PCIe PHY power up sequence: /--------------------------------------------- 1.8v supply ---------/ /--------------------------------------------------- 0.8v supply ---/ ---\ /-------------------------------------------------- X REFCLK Valid Reference Clock ---/ \-------------------------------------------------- ------------------------------------------- \| i_init_restn -------------- ------------------------------------ \| i_cmn_rstn --------------------- ------------------------------- \| o_pll_lock_done -------------------------- Logs: imx6q-pcie 33800000.pcie: host bridge /soc@0/pcie@33800000 ranges: imx6q-pcie 33800000.pcie: IO 0x001ff80000..0x001ff8ffff -> 0x0000000000 imx6q-pcie 33800000.pcie: MEM 0x0018000000..0x001fefffff -> 0x0018000000 probe of clk_imx8mp_audiomix.reset.0 returned 0 after 1052 usecs probe of 30e20000.clock-controller returned 0 after 32971 usecs phy phy-32f00000.pcie-phy.4: phy poweron failed --> -110 probe of 30e10000.dma-controller returned 0 after 10235 usecs imx6q-pcie 33800000.pcie: waiting for PHY ready timeout! dwhdmi-imx 32fd8000.hdmi: Detected HDMI TX controller v2.13a with HDCP (samsung_dw_hdmi_phy2) imx6q-pcie 33800000.pcie: probe with driver imx6q-pcie failed with error -110 Fixes: `dce9edff16` ("phy: freescale: imx8m-pcie: Add i.MX8MP PCIe PHY support") Cc: stable@vger.kernel.org Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> v2 changes: - Rebase to latest fixes branch of linux-phy git repo. - Richard's environment have problem and can't sent out patch. So I help post this fix patch. Link: https://lore.kernel.org/r/20241021155241.943665-1-Frank.Li@nxp.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-21 23:18:29 +05:30
Linus Torvalds	7166c32651	Merge tag 'vfs-6.12-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "afs: - Fix a lock recursion in afs_wake_up_async_call() on ->notify_lock netfs: - Drop the references to a folio immediately after the folio has been extracted to prevent races with future I/O collection - Fix a documenation build error - Downgrade the i_rwsem for buffered writes to fix a cifs reported performance regression when switching to netfslib vfs: - Explicitly return -E2BIG from openat2() if the specified size is unexpectedly large. This aligns openat2() with other extensible struct based system calls - When copying a mount namespace ensure that we only try to remove the new copy from the mount namespace rbtree if it has already been added to it nilfs: - Clear the buffer delay flag when clearing the buffer state clags when a buffer head is discarded to prevent a kernel OOPs ocfs2: - Fix an unitialized value warning in ocfs2_setattr() proc: - Fix a kernel doc warning" * tag 'vfs-6.12-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: proc: Fix W=1 build kernel-doc warning afs: Fix lock recursion fs: Fix uninitialized value issue in from_kuid and from_kgid fs: don't try and remove empty rbtree node netfs: Downgrade i_rwsem for a buffered write nilfs2: fix kernel bug due to missing clearing of buffer delay flag openat2: explicitly return -E2BIG for (usize > PAGE_SIZE) netfs: fix documentation build error netfs: In readahead, put the folio refs as soon extracted	2024-10-21 10:48:24 -07:00
Linus Torvalds	a777c32ca4	Merge tag 'v6.12-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a regression in mpi that broke RSA" * tag 'v6.12-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: lib/mpi - Fix an "Uninitialized scalar variable" issue	2024-10-21 09:59:43 -07:00
Selvin Xavier	76d3ddff71	RDMA/bnxt_re: synchronize the qp-handle table array There is a race between the CREQ tasklet and destroy qp when accessing the qp-handle table. There is a chance of reading a valid qp-handle in the CREQ tasklet handler while the QP is already moving ahead with the destruction. Fixing this race by implementing a table-lock to synchronize the access. Fixes: `f218d67ef0` ("RDMA/bnxt_re: Allow posting when QPs are in error") Fixes: `84cf229f40` ("RDMA/bnxt_re: Fix the qp table indexing") Link: https://patch.msgid.link/r/1728912975-19346-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:28:15 -03:00
Selvin Xavier	d71f4acd58	RDMA/bnxt_re: Fix the usage of control path spin locks Control path completion processing always runs in tasklet context. To synchronize with the posting thread, there is no need to use the irq variant of spin lock. Use spin_lock_bh instead. Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Link: https://patch.msgid.link/r/1728912975-19346-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:28:15 -03:00
Patrisious Haddad	78ed28e08e	RDMA/mlx5: Round max_rd_atomic/max_dest_rd_atomic up instead of down After the cited commit below max_dest_rd_atomic and max_rd_atomic values are being rounded down to the next power of 2. As opposed to the old behavior and mlx4 driver where they used to be rounded up instead. In order to stay consistent with older code and other drivers, revert to using fls round function which rounds up to the next power of 2. Fixes: `f18e26af6a` ("RDMA/mlx5: Convert modify QP to use MLX5_SET macros") Link: https://patch.msgid.link/r/d85515d6ef21a2fa8ef4c8293dce9b58df8a6297.1728550179.git.leon@kernel.org Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:25:44 -03:00
Leon Romanovsky	89f8c6f197	RDMA/cxgb4: Dump vendor specific QP details Restore the missing functionality to dump vendor specific QP details, which was mistakenly removed in the commit mentioned in Fixes line. Fixes: `5cc34116cc` ("RDMA: Add dedicated QP resource tracker function") Link: https://patch.msgid.link/r/ed9844829135cfdcac7d64285688195a5cd43f82.1728323026.git.leonro@nvidia.com Reported-by: Dr. David Alan Gilbert <linux@treblig.org> Closes: https://lore.kernel.org/all/Zv_4qAxuC0dLmgXP@gallifrey Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:22:33 -03:00
Christoph Hellwig	6db388585e	iomap: turn iomap_want_unshare_iter into an inline function iomap_want_unshare_iter currently sits in fs/iomap/buffered-io.c, which depends on CONFIG_BLOCK. It is also in used in fs/dax.c whіch has no such dependency. Given that it is a trivial check turn it into an inline in include/linux/iomap.h to fix the DAX && !BLOCK build. Fixes: `6ef6a0e821` ("iomap: share iomap_unshare_iter predicate code with fsdax") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241015041350.118403-1-hch@lst.de Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-21 17:01:01 +02:00
Bartosz Golaszewski	f2b5b8201b	spi: mtk-snfi: fix kerneldoc for mtk_snand_is_page_ops() The op argument is missing the colon and is not picked up by the kerneldoc generator. Fix it to address the following build warning: drivers/spi/spi-mtk-snfi.c:1201: warning: Function parameter or struct member 'op' not described in 'mtk_snand_is_page_ops' Fixes: `764f1b7481` ("spi: add driver for MTK SPI NAND Flash Interface") Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://patch.msgid.link/20241021142113.71081-1-brgl@bgdev.pl Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-21 15:33:19 +01:00
Armin Wolf	a7990957fa	platform/x86: dell-wmi: Ignore suspend notifications Some machines like the Dell G15 5155 emit WMI events when suspending/resuming. Ignore those WMI events. Tested-by: siddharth.manthan@gmail.com Signed-off-by: Armin Wolf <W_Armin@gmx.de> Acked-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20241014220529.397390-1-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-10-21 16:31:59 +02:00
Vamsi Krishna Brahmajosyula	48771da480	platform/x86/intel/pmc: Fix pmc_core_iounmap to call iounmap for valid addresses Commit `50c6dbdfd1` ("x86/ioremap: Improve iounmap() address range checks") introduces a WARN when adrress ranges of iounmap are invalid. On Thinkpad P1 Gen 7 (Meteor Lake-P) this caused the following warning to appear: WARNING: CPU: 7 PID: 713 at arch/x86/mm/ioremap.c:461 iounmap+0x58/0x1f0 Modules linked in: rfkill(+) snd_timer(+) fjes(+) snd soundcore intel_pmc_core(+) int3403_thermal(+) int340x_thermal_zone intel_vsec pmt_telemetry acpi_pad pmt_class acpi_tad int3400_thermal acpi_thermal_rel joydev loop nfnetlink zram xe drm_suballoc_helper nouveau i915 mxm_wmi drm_ttm_helper gpu_sched drm_gpuvm drm_exec drm_buddy i2c_algo_bit crct10dif_pclmul crc32_pclmul ttm crc32c_intel polyval_clmulni rtsx_pci_sdmmc ucsi_acpi polyval_generic mmc_core hid_multitouch drm_display_helper ghash_clmulni_intel typec_ucsi nvme sha512_ssse3 video sha256_ssse3 nvme_core intel_vpu sha1_ssse3 rtsx_pci cec typec nvme_auth i2c_hid_acpi i2c_hid wmi pinctrl_meteorlake serio_raw ip6_tables ip_tables fuse CPU: 7 UID: 0 PID: 713 Comm: (udev-worker) Not tainted 6.12.0-rc2iounmap+ #42 Hardware name: LENOVO 21KWCTO1WW/21KWCTO1WW, BIOS N48ET19W (1.06 ) 07/18/2024 RIP: 0010:iounmap+0x58/0x1f0 Code: 85 6a 01 00 00 48 8b 05 e6 e2 28 04 48 39 c5 72 19 eb 26 cc cc cc 48 ba 00 00 00 00 00 00 32 00 48 8d 44 02 ff 48 39 c5 72 23 <0f> 0b 48 83 c4 08 5b 5d 41 5c c3 cc cc cc cc 48 ba 00 00 00 00 00 RSP: 0018:ffff888131eff038 EFLAGS: 00010207 RAX: ffffc90000000000 RBX: 0000000000000000 RCX: ffff888e33b80000 RDX: dffffc0000000000 RSI: ffff888e33bc29c0 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff8881598a8000 R09: ffff888e2ccedc10 R10: 0000000000000003 R11: ffffffffb3367634 R12: 00000000fe000000 R13: ffff888101d0da28 R14: ffffffffc2e437e0 R15: ffff888110b03b28 FS: 00007f3c1d4b3980(0000) GS:ffff888e33b80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005651cfc93578 CR3: 0000000124e4c002 CR4: 0000000000f70ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0xb6/0x176 ? iounmap+0x58/0x1f0 ? report_bug+0x1f4/0x2b0 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x17/0x40 ? asm_exc_invalid_op+0x1a/0x20 ? iounmap+0x58/0x1f0 pmc_core_ssram_get_pmc+0x477/0x6c0 [intel_pmc_core] ? __pfx_pmc_core_ssram_get_pmc+0x10/0x10 [intel_pmc_core] ? __pfx_do_pci_enable_device+0x10/0x10 ? pci_wait_for_pending+0x60/0x110 ? pci_enable_device_flags+0x1e3/0x2e0 ? __pfx_mtl_core_init+0x10/0x10 [intel_pmc_core] pmc_core_ssram_init+0x7f/0x110 [intel_pmc_core] mtl_core_init+0xda/0x130 [intel_pmc_core] ? __mutex_init+0xb9/0x130 pmc_core_probe+0x27e/0x10b0 [intel_pmc_core] ? _raw_spin_lock_irqsave+0x96/0xf0 ? __pfx_pmc_core_probe+0x10/0x10 [intel_pmc_core] ? __pfx_mutex_unlock+0x10/0x10 ? __pfx_mutex_lock+0x10/0x10 ? device_pm_check_callbacks+0x82/0x370 ? acpi_dev_pm_attach+0x234/0x2b0 platform_probe+0x9f/0x150 really_probe+0x1e0/0x8a0 __driver_probe_device+0x18c/0x370 ? __pfx___driver_attach+0x10/0x10 driver_probe_device+0x4a/0x120 __driver_attach+0x190/0x4a0 ? __pfx___driver_attach+0x10/0x10 bus_for_each_dev+0x103/0x180 ? __pfx_bus_for_each_dev+0x10/0x10 ? klist_add_tail+0x136/0x270 bus_add_driver+0x2fc/0x540 driver_register+0x1a5/0x360 ? __pfx_pmc_core_driver_init+0x10/0x10 [intel_pmc_core] do_one_initcall+0xa4/0x380 ? __pfx_do_one_initcall+0x10/0x10 ? kasan_unpoison+0x44/0x70 do_init_module+0x296/0x800 load_module+0x5090/0x6ce0 ? __pfx_load_module+0x10/0x10 ? ima_post_read_file+0x193/0x200 ? __pfx_ima_post_read_file+0x10/0x10 ? rw_verify_area+0x152/0x4c0 ? kernel_read_file+0x257/0x750 ? __pfx_kernel_read_file+0x10/0x10 ? __pfx_filemap_get_read_batch+0x10/0x10 ? init_module_from_file+0xd1/0x130 init_module_from_file+0xd1/0x130 ? __pfx_init_module_from_file+0x10/0x10 ? __pfx__raw_spin_lock+0x10/0x10 ? __pfx_cred_has_capability.isra.0+0x10/0x10 idempotent_init_module+0x236/0x770 ? __pfx_idempotent_init_module+0x10/0x10 ? fdget+0x58/0x3f0 ? security_capable+0x7d/0x110 __x64_sys_finit_module+0xbe/0x130 do_syscall_64+0x82/0x160 ? __pfx_filemap_read+0x10/0x10 ? __pfx___fsnotify_parent+0x10/0x10 ? vfs_read+0x3a6/0xa30 ? vfs_read+0x3a6/0xa30 ? __seccomp_filter+0x175/0xc60 ? __pfx___seccomp_filter+0x10/0x10 ? fdget_pos+0x1ce/0x500 ? syscall_exit_to_user_mode_prepare+0x149/0x170 ? syscall_exit_to_user_mode+0x10/0x210 ? do_syscall_64+0x8e/0x160 ? switch_fpu_return+0xe3/0x1f0 ? syscall_exit_to_user_mode+0x1d5/0x210 ? do_syscall_64+0x8e/0x160 ? exc_page_fault+0x76/0xf0 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f3c1d6d155d Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 83 58 0f 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe6309db38 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 0000557c212550a0 RCX: 00007f3c1d6d155d RDX: 0000000000000000 RSI: 00007f3c1cd943bd RDI: 0000000000000025 RBP: 00007ffe6309dbf0 R08: 00007f3c1d7c7b20 R09: 00007ffe6309db80 R10: 0000557c21255270 R11: 0000000000000246 R12: 00007f3c1cd943bd R13: 0000000000020000 R14: 0000557c21255c80 R15: 0000557c21255240 </TASK> no_free_ptr(tmp_ssram) sets tmp_ssram NULL while assigning ssram. pmc_core_iounmap calls iounmap unconditionally causing the above warning to appear during boot. Fix it by checking for a valid address before calling iounmap. Also in the function pmc_core_ssram_get_pmc return -ENOMEM when ioremap fails similar to other instances in the file. Fixes: `a01486dc4b` ("platform/x86/intel/pmc: Cleanup SSRAM discovery") Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Vamsi Krishna Brahmajosyula <vamsikrishna.brahmajosyula@gmail.com> Link: https://lore.kernel.org/r/20241018104958.14195-1-vamsikrishna.brahmajosyula@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-10-21 16:31:59 +02:00
Yang Erkun	d5ff2fb2e7	nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net In the normal case, when we excute `echo 0 > /proc/fs/nfsd/threads`, the function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will release all resources related to the hashed `nfs4_client`. If the `nfsd_client_shrinker` is running concurrently, the `expire_client` function will first unhash this client and then destroy it. This can lead to the following warning. Additionally, numerous use-after-free errors may occur as well. nfsd_client_shrinker echo 0 > /proc/fs/nfsd/threads expire_client nfsd_shutdown_net unhash_client ... nfs4_state_shutdown_net /* won't wait shrinker exit / / cancel_work(&nn->nfsd_shrinker_work) * nfsd_file for this /* won't destroy unhashed client1 / client1 still alive nfs4_state_destroy_net / nfsd_file_cache_shutdown / trigger warning / kmem_cache_destroy(nfsd_file_slab) kmem_cache_destroy(nfsd_file_mark_slab) / release nfsd_file and mark */ __destroy_client ==================================================================== BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on __kmem_cache_shutdown() -------------------------------------------------------------------- CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1 dump_stack_lvl+0x53/0x70 slab_err+0xb0/0xf0 __kmem_cache_shutdown+0x15c/0x310 kmem_cache_destroy+0x66/0x160 nfsd_file_cache_shutdown+0xac/0x210 [nfsd] nfsd_destroy_serv+0x251/0x2a0 [nfsd] nfsd_svc+0x125/0x1e0 [nfsd] write_threads+0x16a/0x2a0 [nfsd] nfsctl_transaction_write+0x74/0xa0 [nfsd] vfs_write+0x1a5/0x6d0 ksys_write+0xc1/0x160 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e ==================================================================== BUG nfsd_file_mark (Tainted: G B W ): Objects remaining nfsd_file_mark on __kmem_cache_shutdown() -------------------------------------------------------------------- dump_stack_lvl+0x53/0x70 slab_err+0xb0/0xf0 __kmem_cache_shutdown+0x15c/0x310 kmem_cache_destroy+0x66/0x160 nfsd_file_cache_shutdown+0xc8/0x210 [nfsd] nfsd_destroy_serv+0x251/0x2a0 [nfsd] nfsd_svc+0x125/0x1e0 [nfsd] write_threads+0x16a/0x2a0 [nfsd] nfsctl_transaction_write+0x74/0xa0 [nfsd] vfs_write+0x1a5/0x6d0 ksys_write+0xc1/0x160 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e To resolve this issue, cancel `nfsd_shrinker_work` using synchronous mode in nfs4_state_shutdown_net. Fixes: `7c24fa2250` ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker") Signed-off-by: Yang Erkun <yangerkun@huaweicloud.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-10-21 10:27:36 -04:00
Bibo Mao	d2f8671045	LoongArch: Set initial pte entry with PAGE_GLOBAL for kernel space There are two pages in one TLB entry on LoongArch system. For kernel space, it requires both two pte entries (buddies) with PAGE_GLOBAL bit set, otherwise HW treats it as non-global tlb, there will be potential problems if tlb entry for kernel space is not global. Such as fail to flush kernel tlb with the function local_flush_tlb_kernel_range() which supposed only flush tlb with global bit. Kernel address space areas include percpu, vmalloc, vmemmap, fixmap and kasan areas. For these areas both two consecutive page table entries should be enabled with PAGE_GLOBAL bit. So with function set_pte() and pte_clear(), pte buddy entry is checked and set besides its own pte entry. However it is not atomic operation to set both two pte entries, there is problem with test_vmalloc test case. So function kernel_pte_init() is added to init a pte table when it is created for kernel address space, and the default initial pte value is PAGE_GLOBAL rather than zero at beginning. Then only its own pte entry need update with function set_pte() and pte_clear(), nothing to do with the pte buddy entry. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-10-21 22:11:19 +08:00
Thomas Weißschuh	134475a9ab	LoongArch: Don't crash in stack_top() for tasks without vDSO Not all tasks have a vDSO mapped, for example kthreads never do. If such a task ever ends up calling stack_top(), it will derefence the NULL vdso pointer and crash. This can for example happen when using kunit: [<9000000000203874>] stack_top+0x58/0xa8 [<90000000002956cc>] arch_pick_mmap_layout+0x164/0x220 [<90000000003c284c>] kunit_vm_mmap_init+0x108/0x12c [<90000000003c1fbc>] __kunit_add_resource+0x38/0x8c [<90000000003c2704>] kunit_vm_mmap+0x88/0xc8 [<9000000000410b14>] usercopy_test_init+0xbc/0x25c [<90000000003c1db4>] kunit_try_run_case+0x5c/0x184 [<90000000003c3d54>] kunit_generic_run_threadfn_adapter+0x24/0x48 [<900000000022e4bc>] kthread+0xc8/0xd4 [<9000000000200ce8>] ret_from_kernel_thread+0xc/0xa4 Fixes: `803b0fc5c3` ("LoongArch: Add process management") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-10-21 22:11:19 +08:00
Huacai Chen	2ed119aef6	LoongArch: Set correct size for vDSO code mapping The current size of vDSO code mapping is hardcoded to PAGE_SIZE. This cannot work for 4KB page size after commit `18efd0b10e` ("LoongArch: vDSO: Wire up getrandom() vDSO implementation") because the code size increases to 8KB. Thus set the code mapping size to its real size, i.e. PAGE_ALIGN(vdso_end - vdso_start). Fixes: `18efd0b10e` ("LoongArch: vDSO: Wire up getrandom() vDSO implementation") Reviewed-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-10-21 22:11:19 +08:00
Huacai Chen	69cc6fad5d	LoongArch: Enable IRQ if do_ale() triggered in irq-enabled context Unaligned access exception can be triggered in irq-enabled context such as user mode, in this case do_ale() may call get_user() which may cause sleep. Then we will get: BUG: sleeping function called from invalid context at arch/loongarch/kernel/access-helper.h:7 in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 129, name: modprobe preempt_count: 0, expected: 0 RCU nest depth: 0, expected: 0 CPU: 0 UID: 0 PID: 129 Comm: modprobe Tainted: G W 6.12.0-rc1+ #1723 Tainted: [W]=WARN Stack : 9000000105e0bd48 0000000000000000 9000000003803944 9000000105e08000 9000000105e0bc70 9000000105e0bc78 0000000000000000 0000000000000000 9000000105e0bc78 0000000000000001 9000000185e0ba07 9000000105e0b890 ffffffffffffffff 9000000105e0bc78 73924b81763be05b 9000000100194500 000000000000020c 000000000000000a 0000000000000000 0000000000000003 00000000000023f0 00000000000e1401 00000000072f8000 0000007ffbb0e260 0000000000000000 0000000000000000 9000000005437650 90000000055d5000 0000000000000000 0000000000000003 0000007ffbb0e1f0 0000000000000000 0000005567b00490 0000000000000000 9000000003803964 0000007ffbb0dfec 00000000000000b0 0000000000000007 0000000000000003 0000000000071c1d ... Call Trace: [<9000000003803964>] show_stack+0x64/0x1a0 [<9000000004c57464>] dump_stack_lvl+0x74/0xb0 [<9000000003861ab4>] __might_resched+0x154/0x1a0 [<900000000380c96c>] emulate_load_store_insn+0x6c/0xf60 [<9000000004c58118>] do_ale+0x78/0x180 [<9000000003801bc8>] handle_ale+0x128/0x1e0 So enable IRQ if unaligned access exception is triggered in irq-enabled context to fix it. Cc: stable@vger.kernel.org Reported-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-10-21 22:11:19 +08:00
Huacai Chen	b7296f9d5b	LoongArch: Get correct cores_per_package for SMT systems In loongson_sysconf, The "core" of cores_per_node and cores_per_package stands for a logical core, which means in a SMT system it stands for a thread indeed. This information is gotten from SMBIOS Type4 Structure, so in order to get a correct cores_per_package for both SMT and non-SMT systems in parse_cpu_table() we should use SMBIOS_THREAD_PACKAGE_OFFSET instead of SMBIOS_CORE_PACKAGE_OFFSET. Cc: stable@vger.kernel.org Reported-by: Chao Li <lichao@loongson.cn> Tested-by: Chao Li <lichao@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-10-21 22:11:18 +08:00
Yanteng Si	b69269c870	LoongArch: Use "Exception return address" to comment ERA The information contained in the comment for LOONGARCH_CSR_ERA is even less informative than the macro itself, which can cause confusion for junior developers. Let's use the full English term. Signed-off-by: Yanteng Si <siyanteng@cqsoftware.com.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-10-21 22:11:18 +08:00
Marek Maslanka	5fa6078801	platform/x86:intel/pmc: Revert "Enable the ACPI PM Timer to be turned off when suspended" Commit `e86c8186d0` ("platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended") can cause the suspend process to hang as the pmcdev->lock in the pmc_core_acpi_pm_timer_suspend_resume might already be held by the pmc_core_mphy_pg_show or pmc_core_pll_show if the userspace gets frozen when these functions are being executed. Also, pmc_core_acpi_pm_timer_suspend_resume must not sleep, as this function is called indirectly by the tick_freeze function in kernel/time/tick-common.c, which holds the spinlock. Revert the changes for now to fix these issues. Fixes: `e86c8186d0` ("platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended") Reported-by: Luca Coelho <luca@coelho.fi> Closes: https://lore.kernel.org/lkml/40555604c3f4be43bf72e72d5409eaece4be9320.camel@coelho.fi/ Signed-off-by: Marek Maslanka <mmaslanka@google.com> Link: https://lore.kernel.org/r/20241012182656.2107178-1-mmaslanka@google.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-10-21 16:04:00 +02:00
Javier Carrasco	5c23878252	drm/bridge: tc358767: fix missing of_node_put() in for_each_endpoint_of_node() for_each_endpoint_of_node() requires a call to of_node_put() for every early exit. A new error path was added to the loop without observing this requirement. Add the missing call to of_node_put() in the error path. Fixes: `1fb4dceeed` ("drm/bridge: tc358767: Add configurable default preemphasis") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: Marek Vasut <marex@denx.de> Link: https://lore.kernel.org/r/20241013-tc358767-of_node_put-v1-1-97431772c0ff@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241013-tc358767-of_node_put-v1-1-97431772c0ff@gmail.com	2024-10-21 15:00:35 +02:00
Abel Vesa	85e444a681	drm/bridge: Fix assignment of the of_node of the parent to aux bridge The assignment of the of_node to the aux bridge needs to mark the of_node as reused as well, otherwise resource providers like pinctrl will report a gpio as already requested by a different device when both pinconf and gpios property are present. Fix that by using the device_set_of_node_from_dev() helper instead. Fixes: `6914968a0b` ("drm/bridge: properly refcount DT nodes in aux bridge drivers") Cc: stable@vger.kernel.org # 6.8 Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20241018-drm-aux-bridge-mark-of-node-reused-v2-1-aeed1b445c7d@linaro.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241018-drm-aux-bridge-mark-of-node-reused-v2-1-aeed1b445c7d@linaro.org	2024-10-21 14:59:48 +02:00
Christian Brauner	35100ae2dc	Merge patch series "fs/super.c: introduce get_tree_bdev_flags()" Allison Karlitskaya <allison.karlitskaya@redhat.com> says: In context of my work on composefs/bootc I've been testing the new support for directly mounting files with erofs (ie: without a loopback device) and it's working well. Thanks for adding this feature --- it's a huge quality of life improvement for us. I've observed a strange behaviour, though: when mounting a file as an erofs, if you read() the filesystem context fd, you always get the following error message reported: Can't lookup blockdev. That's caused by the code in erofs_fc_get_tree() trying to call get_tree_bdev() and recovering from the error in case it was ENOTBLK and CONFIG_EROFS_FS_BACKED_BY_FILE. Unfortunately, get_tree_bdev() logs the error directly on the fs_context, so you get the error message even on successful mounts. It looks something like this at the syscall level: fsopen("erofs", FSOPEN_CLOEXEC) = 3 fsconfig(3, FSCONFIG_SET_FLAG, "ro", NULL, 0) = 0 fsconfig(3, FSCONFIG_SET_STRING, "source", "/home/lis/src/mountcfs/cfs", 0) = 0 fsconfig(3, FSCONFIG_CMD_CREATE, NULL, NULL, 0) = 0 fsmount(3, FSMOUNT_CLOEXEC, 0) = 5 move_mount(5, "", AT_FDCWD, "/tmp/composefs.upper.KuT5aV", MOVE_MOUNT_F_EMPTY_PATH) = 0 read(3, "e /home/lis/src/mountcfs/cfs: Can't lookup blockdev\n", 1024) = 52 This is kernel 6.12.0-0.rc0.20240926git11a299a7933e.13.fc42.x86_64 from Fedora Rawhide. It's a pretty minor issue, but it sent me on a wild goose chase for an hour or two, so probably it should get fixed before the final release. Gao Xiang <hsiangkao@linux.alibaba.com>: Fix this by providing a get_tree_bdev_flags() helper which can be used to silence such warnings. * patches from https://lore.kernel.org/r/20241009033151.2334888-1-hsiangkao@linux.alibaba.com: erofs: use get_tree_bdev_flags() to avoid misleading messages fs/super.c: introduce get_tree_bdev_flags() Link: https://lore.kernel.org/r/20241009033151.2334888-1-hsiangkao@linux.alibaba.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-21 14:30:29 +02:00
Gao Xiang	14c2d97265	erofs: use get_tree_bdev_flags() to avoid misleading messages Users can pass in an arbitrary source path for the proper type of a mount then without "Can't lookup blockdev" error message. Reported-by: Allison Karlitskaya <allison.karlitskaya@redhat.com> Closes: https://lore.kernel.org/r/CAOYeF9VQ8jKVmpy5Zy9DNhO6xmWSKMB-DO8yvBB0XvBE7=3Ugg@mail.gmail.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241009033151.2334888-2-hsiangkao@linux.alibaba.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-21 14:30:27 +02:00
Gao Xiang	4021e68513	fs/super.c: introduce get_tree_bdev_flags() As Allison reported [1], currently get_tree_bdev() will store "Can't lookup blockdev" error message. Although it makes sense for pure bdev-based fses, this message may mislead users who try to use EROFS file-backed mounts since get_tree_nodev() is used as a fallback then. Add get_tree_bdev_flags() to specify extensible flags [2] and GET_TREE_BDEV_QUIET_LOOKUP to silence "Can't lookup blockdev" message since it's misleading to EROFS file-backed mounts now. [1] https://lore.kernel.org/r/CAOYeF9VQ8jKVmpy5Zy9DNhO6xmWSKMB-DO8yvBB0XvBE7=3Ugg@mail.gmail.com [2] https://lore.kernel.org/r/ZwUkJEtwIpUA4qMz@infradead.org Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241009033151.2334888-1-hsiangkao@linux.alibaba.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-21 14:30:26 +02:00
Shubham Panwar	8fa73ee44d	ACPI: button: Add DMI quirk for Samsung Galaxy Book2 to fix initial lid detection issue Add a DMI quirk for Samsung Galaxy Book2 to fix an initial lid state detection issue. The _LID device incorrectly returns the lid status as "closed" during boot, causing the system to enter a suspend loop right after booting. The quirk ensures that the correct lid state is reported initially, preventing the system from immediately suspending after startup. It only addresses the initial lid state detection and ensures proper system behavior upon boot. Signed-off-by: Shubham Panwar <shubiisp8@gmail.com> Link: https://patch.msgid.link/20241020095045.6036-2-shubiisp8@gmail.com [ rjw: Changelog edits ] Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-10-21 14:00:59 +02:00
Christian Heusel	53f1a907d3	ACPI: resource: Add LG 16T90SP to irq1_level_low_skip_override[] The LG Gram Pro 16 2-in-1 (2024) the 16T90SP has its keybopard IRQ (1) described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh which breaks the keyboard. Add the 16T90SP to the irq1_level_low_skip_override[] quirk table to fix this. Reported-by: Dirk Holten <dirk.holten@gmx.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219382 Cc: All applicable <stable@vger.kernel.org> Suggested-by: Dirk Holten <dirk.holten@gmx.de> Signed-off-by: Christian Heusel <christian@heusel.eu> Link: https://patch.msgid.link/20241017-lg-gram-pro-keyboard-v2-1-7c8fbf6ff718@heusel.eu Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-10-21 13:54:57 +02:00
Jack Yu	038fa6ddf5	ASoC: rt722-sdca: increase clk_stop_timeout to fix clock stop issue clk_stop_timeout should be increased to 900ms to fix clock stop issue. Signed-off-by: Jack Yu <jack.yu@realtek.com> Link: https://patch.msgid.link/cd26275d9fc54374a18dc016755cb72d@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-21 12:49:15 +01:00
Koba Ko	088984c8d5	ACPI: PRM: Find EFI_MEMORY_RUNTIME block for PRM handler and context PRMT needs to find the correct type of block to translate the PA-VA mapping for EFI runtime services. The issue arises because the PRMT is finding a block of type EFI_CONVENTIONAL_MEMORY, which is not appropriate for runtime services as described in Section 2.2.2 (Runtime Services) of the UEFI Specification [1]. Since the PRM handler is a type of runtime service, this causes an exception when the PRM handler is called. [Firmware Bug]: Unable to handle paging request in EFI runtime service WARNING: CPU: 22 PID: 4330 at drivers/firmware/efi/runtime-wrappers.c:341 __efi_queue_work+0x11c/0x170 Call trace: Let PRMT find a block with EFI_MEMORY_RUNTIME for PRM handler and PRM context. If no suitable block is found, a warning message will be printed, but the procedure continues to manage the next PRM handler. However, if the PRM handler is actually called without proper allocation, it would result in a failure during error handling. By using the correct memory types for runtime services, ensure that the PRM handler and the context are properly mapped in the virtual address space during runtime, preventing the paging request error. The issue is really that only memory that has been remapped for runtime by the firmware can be used by the PRM handler, and so the region needs to have the EFI_MEMORY_RUNTIME attribute. Link: https://uefi.org/sites/default/files/resources/UEFI_Spec_2_10_Aug29.pdf # [1] Fixes: `cefc7ca462` ("ACPI: PRM: implement OperationRegion handler for the PlatformRtMechanism subtype") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Koba Ko <kobak@nvidia.com> Reviewed-by: Matthew R. Ochs <mochs@nvidia.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://patch.msgid.link/20241012205010.4165798-1-kobak@nvidia.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-10-21 13:46:50 +02:00
Yuan Can	5209d1b654	powercap: dtpm_devfreq: Fix error check against dev_pm_qos_add_request() The caller of the function dev_pm_qos_add_request() checks again a non zero value but dev_pm_qos_add_request() can return '1' if the request already exists. Therefore, the setup function fails while the QoS request actually did not failed. Fix that by changing the check against a negative value like all the other callers of the function. Fixes: `e446556173` ("powercap/drivers/dtpm: Add dtpm devfreq with energy model support") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241018021205.46460-1-yuancan@huawei.com [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-10-21 13:23:06 +02:00
Christian Loehle	29dcbea924	cpufreq: docs: Reflect latency changes in docs There were two changes related to transition latency recently. Namely commit `e13aa799c2` ("cpufreq: Change default transition delay to 2ms") and commit `37c6dccd68` ("cpufreq: Remove LATENCY_MULTIPLIER"). Both changed the defaults / maximums so let the documentation reflect that. Signed-off-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/46853b6e-bad5-4ace-9b23-ff157f234ae3@arm.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-10-21 13:20:03 +02:00
Michael S. Tsirkin	d95d9a31ac	virtio_net: fix integer overflow in stats Static analysis on linux-next has detected the following issue in function virtnet_stats_ctx_init, in drivers/net/virtio_net.c : if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_CVQ) { queue_type = VIRTNET_Q_TYPE_CQ; ctx->bitmap[queue_type] \|= VIRTIO_NET_STATS_TYPE_CVQ; ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_cvq_desc); ctx->size[queue_type] += sizeof(struct virtio_net_stats_cvq); } ctx->bitmap is declared as a u32 however it is being bit-wise or'd with VIRTIO_NET_STATS_TYPE_CVQ and this is defined as 1 << 32: include/uapi/linux/virtio_net.h:#define VIRTIO_NET_STATS_TYPE_CVQ (1ULL << 32) ..and hence the bit-wise or operation won't set any bits in ctx->bitmap because 1ULL < 32 is too wide for a u32. In fact, the field is read into a u64: u64 offset, bitmap; .... bitmap = ctx->bitmap[queue_type]; so to fix, it is enough to make bitmap an array of u64. Fixes: `941168f8b4` ("virtio_net: support device stats") Reported-by: "Colin King (gmail)" <colin.i.king@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/53e2bd6728136d5916e384a7840e5dc7eebff832.1729099611.git.mst@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-21 13:09:00 +02:00
Eric Dumazet	95ecba62e2	net: fix races in netdev_tx_sent_queue()/dev_watchdog() Some workloads hit the infamous dev_watchdog() message: "NETDEV WATCHDOG: eth0 (xxxx): transmit queue XX timed out" It seems possible to hit this even for perfectly normal BQL enabled drivers: 1) Assume a TX queue was idle for more than dev->watchdog_timeo (5 seconds unless changed by the driver) 2) Assume a big packet is sent, exceeding current BQL limit. 3) Driver ndo_start_xmit() puts the packet in TX ring, and netdev_tx_sent_queue() is called. 4) QUEUE_STATE_STACK_XOFF could be set from netdev_tx_sent_queue() before txq->trans_start has been written. 5) txq->trans_start is written later, from netdev_start_xmit() if (rc == NETDEV_TX_OK) txq_trans_update(txq) dev_watchdog() running on another cpu could read the old txq->trans_start, and then see QUEUE_STATE_STACK_XOFF, because 5) did not happen yet. To solve the issue, write txq->trans_start right before one XOFF bit is set : - _QUEUE_STATE_DRV_XOFF from netif_tx_stop_queue() - __QUEUE_STATE_STACK_XOFF from netdev_tx_sent_queue() From dev_watchdog(), we have to read txq->state before txq->trans_start. Add memory barriers to enforce correct ordering. In the future, we could avoid writing over txq->trans_start for normal operations, and rename this field to txq->xoff_start_time. Fixes: `bec251bc8b` ("net: no longer stop all TX queues in dev_watchdog()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20241015194118.3951657-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-21 12:54:25 +02:00
Lin Ma	47dd5447ca	net: wwan: fix global oob in wwan_rtnl_policy The variable wwan_rtnl_link_ops assign a bigger maxtype which leads to a global out-of-bounds read when parsing the netlink attributes. Exactly same bug cause as the oob fixed in commit `b33fb5b801` ("net: qualcomm: rmnet: fix global oob in rmnet_policy"). ================================================================== BUG: KASAN: global-out-of-bounds in validate_nla lib/nlattr.c:388 [inline] BUG: KASAN: global-out-of-bounds in __nla_validate_parse+0x19d7/0x29a0 lib/nlattr.c:603 Read of size 1 at addr ffffffff8b09cb60 by task syz.1.66276/323862 CPU: 0 PID: 323862 Comm: syz.1.66276 Not tainted 6.1.70 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x177/0x231 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:284 [inline] print_report+0x14f/0x750 mm/kasan/report.c:395 kasan_report+0x139/0x170 mm/kasan/report.c:495 validate_nla lib/nlattr.c:388 [inline] __nla_validate_parse+0x19d7/0x29a0 lib/nlattr.c:603 __nla_parse+0x3c/0x50 lib/nlattr.c:700 nla_parse_nested_deprecated include/net/netlink.h:1269 [inline] __rtnl_newlink net/core/rtnetlink.c:3514 [inline] rtnl_newlink+0x7bc/0x1fd0 net/core/rtnetlink.c:3623 rtnetlink_rcv_msg+0x794/0xef0 net/core/rtnetlink.c:6122 netlink_rcv_skb+0x1de/0x420 net/netlink/af_netlink.c:2508 netlink_unicast_kernel net/netlink/af_netlink.c:1326 [inline] netlink_unicast+0x74b/0x8c0 net/netlink/af_netlink.c:1352 netlink_sendmsg+0x882/0xb90 net/netlink/af_netlink.c:1874 sock_sendmsg_nosec net/socket.c:716 [inline] __sock_sendmsg net/socket.c:728 [inline] ____sys_sendmsg+0x5cc/0x8f0 net/socket.c:2499 ___sys_sendmsg+0x21c/0x290 net/socket.c:2553 __sys_sendmsg net/socket.c:2582 [inline] __do_sys_sendmsg net/socket.c:2591 [inline] __se_sys_sendmsg+0x19e/0x270 net/socket.c:2589 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x45/0x90 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f67b19a24ad RSP: 002b:00007f67b17febb8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f67b1b45f80 RCX: 00007f67b19a24ad RDX: 0000000000000000 RSI: 0000000020005e40 RDI: 0000000000000004 RBP: 00007f67b1a1e01d R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffd2513764f R14: 00007ffd251376e0 R15: 00007f67b17fed40 </TASK> The buggy address belongs to the variable: wwan_rtnl_policy+0x20/0x40 The buggy address belongs to the physical page: page:ffffea00002c2700 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xb09c flags: 0xfff00000001000(reserved\|node=0\|zone=1\|lastcpupid=0x7ff) raw: 00fff00000001000 ffffea00002c2708 ffffea00002c2708 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner info is not present (never set?) Memory state around the buggy address: ffffffff8b09ca00: 05 f9 f9 f9 05 f9 f9 f9 00 01 f9 f9 00 01 f9 f9 ffffffff8b09ca80: 00 00 00 05 f9 f9 f9 f9 00 00 03 f9 f9 f9 f9 f9 >ffffffff8b09cb00: 00 00 00 00 05 f9 f9 f9 00 00 00 00 f9 f9 f9 f9 ^ ffffffff8b09cb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== According to the comment of `nla_parse_nested_deprecated`, use correct size `IFLA_WWAN_MAX` here to fix this issue. Fixes: `88b710532e` ("wwan: add interface creation support") Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241015131621.47503-1-linma@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-21 11:38:44 +02:00
Pablo Neira Ayuso	306ed1728e	netfilter: xtables: fix typo causing some targets not to load on IPv6 - There is no NFPROTO_IPV6 family for mark and NFLOG. - TRACE is also missing module autoload with NFPROTO_IPV6. This results in ip6tables failing to restore a ruleset. This issue has been reported by several users providing incomplete patches. Very similar to Ilya Katsnelson's patch including a missing chunk in the TRACE extension. Fixes: `0bfcb7b71e` ("netfilter: xtables: avoid NFPROTO_UNSPEC where needed") Reported-by: Ignat Korchagin <ignat@cloudflare.com> Reported-by: Ilya Katsnelson <me@0upti.me> Reported-by: Krzysztof Olędzki <ole@ans.pl> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-10-21 11:31:26 +02:00
Arnd Bergmann	51521d2e2c	fbdev: wm8505fb: select CONFIG_FB_IOMEM_FOPS The fb_io_mmap() function is used in the file operations but not enabled in all configurations unless FB_IOMEM_FOPS gets selected: ld.lld-20: error: undefined symbol: fb_io_mmap referenced by wm8505fb.c drivers/video/fbdev/wm8505fb.o:(wm8505fb_ops) in archive vmlinux.a Fixes: `11754a5046` ("fbdev/wm8505fb: Initialize fb_ops to fbdev I/O-memory helpers") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Helge Deller <deller@gmx.de>	2024-10-21 11:16:51 +02:00
Paolo Abeni	374d4106cb	Merge branch 'fsl-fman-fix-refcount-handling-of-fman-related-devices' Aleksandr Mishin says: ==================== fsl/fman: Fix refcount handling of fman-related devices The series is intended to fix refcount handling for fman-related "struct device" objects - the devices are not released upon driver removal or in the error paths during probe. This leads to device reference leaks. The device pointers are now saved to struct mac_device and properly handled in the driver's probe and removal functions. Originally reported by Simon Horman <horms@kernel.org> (https://lore.kernel.org/all/20240702133651.GK598357@kernel.org/) Compile tested only. ==================== Link: https://patch.msgid.link/20241015060122.25709-1-amishin@t-argos.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-21 10:50:18 +02:00
Aleksandr Mishin	1dec67e0d9	fsl/fman: Fix refcount handling of fman-related devices In mac_probe() there are multiple calls to of_find_device_by_node(), fman_bind() and fman_port_bind() which takes references to of_dev->dev. Not all references taken by these calls are released later on error path in mac_probe() and in mac_remove() which lead to reference leaks. Add references release. Fixes: `3933961682` ("fsl/fman: Add FMan MAC driver") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-21 10:50:16 +02:00
Aleksandr Mishin	efeddd552e	fsl/fman: Save device references taken in mac_probe() In mac_probe() there are calls to of_find_device_by_node() which takes references to of_dev->dev. These references are not saved and not released later on error path in mac_probe() and in mac_remove(). Add new fields into mac_device structure to save references taken for future use in mac_probe() and mac_remove(). This is a preparation for further reference leaks fix. Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-21 10:50:16 +02:00
Miklos Szeredi	184429a17f	Revert "fuse: move initialization of fuse_file to fuse_writepages() instead of in callback" This reverts commit `672c3b7457`. fuse_writepages() might be called with no dirty pages after all writable opens were closed. In this case __fuse_write_file_get() will return NULL which will trigger the WARNING. The exact conditions under which this is triggered is unclear and syzbot didn't find a reproducer yet. Reported-by: syzbot+217a976dc26ef2fa8711@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/CAJnrk1aQwfvb51wQ5rUSf9N8j1hArTFeSkHqC_3T-mU6_BCD=A@mail.gmail.com/ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2024-10-21 10:02:51 +02:00
Peter Collingbourne	a552e2ef5f	bpf, arm64: Fix address emission with tag-based KASAN enabled When BPF_TRAMP_F_CALL_ORIG is enabled, the address of a bpf_tramp_image struct on the stack is passed during the size calculation pass and an address on the heap is passed during code generation. This may cause a heap buffer overflow if the heap address is tagged because emit_a64_mov_i64() will emit longer code than it did during the size calculation pass. The same problem could occur without tag-based KASAN if one of the 16-bit words of the stack address happened to be all-ones during the size calculation pass. Fix the problem by assuming the worst case (4 instructions) when calculating the size of the bpf_tramp_image address emission. Fixes: `19d3c179a3` ("bpf, arm64: Fix trampoline for BPF_TRAMP_F_CALL_ORIG") Signed-off-by: Peter Collingbourne <pcc@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Xu Kuohai <xukuohai@huawei.com> Link: https://linux-review.googlesource.com/id/I1496f2bc24fba7a1d492e16e2b94cf43714f2d3c Link: https://lore.kernel.org/bpf/20241018221644.3240898-1-pcc@google.com	2024-10-21 09:45:19 +02:00
Eric Biggers	86c96e7289	ALSA: hda/tas2781: select CRC32 instead of CRC32_SARWATE Fix the kconfig option for the tas2781 HDA driver to select CRC32 rather than CRC32_SARWATE. CRC32_SARWATE is an option from the kconfig 'choice' that selects the specific CRC32 implementation. Selecting a 'choice' option seems to have no effect, but even if it did work, it would be incorrect for a random driver to override the user's choice. CRC32 is the correct option to select for crc32() to be available. Fixes: `5be27f1e3e` ("ALSA: hda/tas2781: Add tas2781 HDA driver") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://patch.msgid.link/20241020175624.7095-1-ebiggers@kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-21 09:21:08 +02:00
José Relvas	35fdc6e1c1	ALSA: hda/realtek: Add subwoofer quirk for Acer Predator G9-593 The Acer Predator G9-593 has a 2+1 speaker system which isn't probed correctly. This patch adds a quirk with the proper pin connections. Note that I do not own this laptop, so I cannot guarantee that this fixes the issue. Testing was done by other users here: https://discussion.fedoraproject.org/t/-/118482 This model appears to have two different dev IDs... - 0x1177 (as seen on the forum link above) - 0x1178 (as seen on https://linux-hardware.org/?probe=127df9999f) I don't think the audio system was changed between model revisions, so the patch applies for both IDs. Signed-off-by: José Relvas <josemonsantorelvas@gmail.com> Link: https://patch.msgid.link/20241020102756.225258-1-josemonsantorelvas@gmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-21 09:20:04 +02:00
Andrey Shumilin	72cafe63b3	ALSA: firewire-lib: Avoid division by zero in apply_constraint_to_size() The step variable is initialized to zero. It is changed in the loop, but if it's not changed it will remain zero. Add a variable check before the division. The observed behavior was introduced by commit `826b5de90c` ("ALSA: firewire-lib: fix insufficient PCM rule for period/buffer size"), and it is difficult to show that any of the interval parameters will satisfy the snd_interval_test() condition with data from the amdtp_rate_table[] table. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `826b5de90c` ("ALSA: firewire-lib: fix insufficient PCM rule for period/buffer size") Signed-off-by: Andrey Shumilin <shum.sdl@nppct.ru> Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://patch.msgid.link/20241018060018.1189537-1-shum.sdl@nppct.ru Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-21 09:09:21 +02:00
Arnd Bergmann	338b655a11	i915: fix DRM_I915_GVT_KVMGT dependencies Depending on x86 and KVM is not enough, as the kvm helper functions that get called here are controlled by CONFIG_KVM_X86, which is disabled if both KVM_INTEL and KVM_AMD are turned off. ERROR: modpost: "kvm_write_track_remove_gfn" [drivers/gpu/drm/i915/kvmgt.ko] undefined! ERROR: modpost: "kvm_page_track_register_notifier" [drivers/gpu/drm/i915/kvmgt.ko] undefined! ERROR: modpost: "kvm_page_track_unregister_notifier" [drivers/gpu/drm/i915/kvmgt.ko] undefined! ERROR: modpost: "kvm_write_track_add_gfn" [drivers/gpu/drm/i915/kvmgt.ko] undefined! Change the dependency to CONFIG_KVM_X86 instead. Fixes: `ea4290d77b` ("KVM: x86: leave kvm.ko out of the build if no vendor module is requested") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241015152157.2955229-1-arnd@kernel.org Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `341e402303`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2024-10-21 09:51:05 +03:00
Gil Fine	3cea8af2d1	thunderbolt: Honor TMU requirements in the domain when setting TMU mode Currently, when configuring TMU (Time Management Unit) mode of a given router, we take into account only its own TMU requirements ignoring other routers in the domain. This is problematic if the router we are configuring has lower TMU requirements than what is already configured in the domain. In the scenario below, we have a host router with two USB4 ports: A and B. Port A connected to device router #1 (which supports CL states) and existing DisplayPort tunnel, thus, the TMU mode is HiFi uni-directional. 1. Initial topology [Host] A/ / [Device #1] / Monitor 2. Plug in device #2 (that supports CL states) to downstream port B of the host router [Host] A/ B\ / \ [Device #1] [Device #2] / Monitor The TMU mode on port B and port A will be configured to LowRes which is not what we want and will cause monitor to start flickering. To address this we first scan the domain and search for any router configured to HiFi uni-directional mode, and if found, configure TMU mode of the given router to HiFi uni-directional as well. Cc: stable@vger.kernel.org Signed-off-by: Gil Fine <gil.fine@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2024-10-21 09:42:42 +03:00
Qiao Ma	373b9338c9	uprobe: avoid out-of-bounds memory access of fetching args Uprobe needs to fetch args into a percpu buffer, and then copy to ring buffer to avoid non-atomic context problem. Sometimes user-space strings, arrays can be very large, but the size of percpu buffer is only page size. And store_trace_args() won't check whether these data exceeds a single page or not, caused out-of-bounds memory access. It could be reproduced by following steps: 1. build kernel with CONFIG_KASAN enabled 2. save follow program as test.c ``` \#include <stdio.h> \#include <stdlib.h> \#include <string.h> // If string length large than MAX_STRING_SIZE, the fetch_store_strlen() // will return 0, cause __get_data_size() return shorter size, and // store_trace_args() will not trigger out-of-bounds access. // So make string length less than 4096. \#define STRLEN 4093 void generate_string(char str, int n) { int i; for (i = 0; i < n; ++i) { char c = i % 26 + 'a'; str[i] = c; } str[n-1] = '\0'; } void print_string(char str) { printf("%s\n", str); } int main() { char tmp[STRLEN]; generate_string(tmp, STRLEN); print_string(tmp); return 0; } ``` 3. compile program `gcc -o test test.c` 4. get the offset of `print_string()` ``` objdump -t test \| grep -w print_string 0000000000401199 g F .text 000000000000001b print_string ``` 5. configure uprobe with offset 0x1199 ``` off=0x1199 cd /sys/kernel/debug/tracing/ echo "p /root/test:${off} arg1=+0(%di):ustring arg2=\$comm arg3=+0(%di):ustring" > uprobe_events echo 1 > events/uprobes/enable echo 1 > tracing_on ``` 6. run `test`, and kasan will report error. ================================================================== BUG: KASAN: use-after-free in strncpy_from_user+0x1d6/0x1f0 Write of size 8 at addr ffff88812311c004 by task test/499CPU: 0 UID: 0 PID: 499 Comm: test Not tainted 6.12.0-rc3+ #18 Hardware name: Red Hat KVM, BIOS 1.16.0-4.al8 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x55/0x70 print_address_description.constprop.0+0x27/0x310 kasan_report+0x10f/0x120 ? strncpy_from_user+0x1d6/0x1f0 strncpy_from_user+0x1d6/0x1f0 ? rmqueue.constprop.0+0x70d/0x2ad0 process_fetch_insn+0xb26/0x1470 ? __pfx_process_fetch_insn+0x10/0x10 ? _raw_spin_lock+0x85/0xe0 ? __pfx__raw_spin_lock+0x10/0x10 ? __pte_offset_map+0x1f/0x2d0 ? unwind_next_frame+0xc5f/0x1f80 ? arch_stack_walk+0x68/0xf0 ? is_bpf_text_address+0x23/0x30 ? kernel_text_address.part.0+0xbb/0xd0 ? __kernel_text_address+0x66/0xb0 ? unwind_get_return_address+0x5e/0xa0 ? __pfx_stack_trace_consume_entry+0x10/0x10 ? arch_stack_walk+0xa2/0xf0 ? _raw_spin_lock_irqsave+0x8b/0xf0 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 ? depot_alloc_stack+0x4c/0x1f0 ? _raw_spin_unlock_irqrestore+0xe/0x30 ? stack_depot_save_flags+0x35d/0x4f0 ? kasan_save_stack+0x34/0x50 ? kasan_save_stack+0x24/0x50 ? mutex_lock+0x91/0xe0 ? __pfx_mutex_lock+0x10/0x10 prepare_uprobe_buffer.part.0+0x2cd/0x500 uprobe_dispatcher+0x2c3/0x6a0 ? __pfx_uprobe_dispatcher+0x10/0x10 ? __kasan_slab_alloc+0x4d/0x90 handler_chain+0xdd/0x3e0 handle_swbp+0x26e/0x3d0 ? __pfx_handle_swbp+0x10/0x10 ? uprobe_pre_sstep_notifier+0x151/0x1b0 irqentry_exit_to_user_mode+0xe2/0x1b0 asm_exc_int3+0x39/0x40 RIP: 0033:0x401199 Code: 01 c2 0f b6 45 fb 88 02 83 45 fc 01 8b 45 fc 3b 45 e4 7c b7 8b 45 e4 48 98 48 8d 50 ff 48 8b 45 e8 48 01 d0 ce RSP: 002b:00007ffdf00576a8 EFLAGS: 00000206 RAX: 00007ffdf00576b0 RBX: 0000000000000000 RCX: 0000000000000ff2 RDX: 0000000000000ffc RSI: 0000000000000ffd RDI: 00007ffdf00576b0 RBP: 00007ffdf00586b0 R08: 00007feb2f9c0d20 R09: 00007feb2f9c0d20 R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000401040 R13: 00007ffdf0058780 R14: 0000000000000000 R15: 0000000000000000 </TASK> This commit enforces the buffer's maxlen less than a page-size to avoid store_trace_args() out-of-memory access. Link: https://lore.kernel.org/all/20241015060148.1108331-1-mqaio@linux.alibaba.com/ Fixes: `dcad1a204f` ("tracing/uprobes: Fetch args before reserving a ring buffer") Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-10-21 13:15:28 +09:00
Linus Torvalds	42f7652d3e	Linux 6.12-rc4	2024-10-20 15:19:38 -07:00
Kent Overstreet	a069f01479	bcachefs: Set bch_inode_unpacked.bi_snapshot in old inode path This fixes a fsck bug on a very old filesystem (pre mainline merge). Fixes: `72350ee0ea` ("bcachefs: Kill snapshot arg to fsck_write_inode()") Reported-by: Marcin Mirosław <marcin@mejor.pl> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-20 18:09:09 -04:00
Kent Overstreet	e04ee86089	bcachefs: Mark more errors as AUTOFIX Reported-by: Marcin Mirosław <marcin@mejor.pl> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-20 18:08:53 -04:00
Linus Torvalds	d7f513ae7b	Merge tag 'for-net-2024-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Pull bluetooth fixes from Luiz Augusto Von Dentz: - ISO: Fix multiple init when debugfs is disabled - Call iso_exit() on module unload - Remove debugfs directory on module init failure - btusb: Fix not being able to reconnect after suspend - btusb: Fix regression with fake CSR controllers 0a12:0001 - bnep: fix wild-memory-access in proto_unregister Note: normally the bluetooth fixes go through the networking tree, but this missed the weekly merge, and two of the commits fix regressions that have caused a fair amount of noise and have now hit stable too: https://lore.kernel.org/all/4e1977ca-6166-4891-965e-34a6f319035f@leemhuis.info/ So I'm pulling it directly just to expedite things and not miss yet another -rc release. This is not meant to become a new pattern. * tag 'for-net-2024-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001 Bluetooth: bnep: fix wild-memory-access in proto_unregister Bluetooth: btusb: Fix not being able to reconnect after suspend Bluetooth: Remove debugfs directory on module init failure Bluetooth: Call iso_exit() on module unload Bluetooth: ISO: Fix multiple init when debugfs is disabled	2024-10-20 14:08:17 -07:00
Linus Torvalds	dd4f50373e	Merge tag 'pinctrl-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Mostly error path fixes, but one pretty serious interrupt problem in the Ocelot driver as well: - Fix two error paths and a missing semicolon in the Intel driver - Add a missing ACPI ID for the Intel Panther Lake - Check return value of devm_kasprintf() in the Apple and STM32 drivers - Add a missing mutex_destroy() in the aw9523 driver - Fix a double free in cv1800_pctrl_dt_node_to_map() in the Sophgo driver - Fix a double free in ma35_pinctrl_dt_node_to_map_func() in the Nuvoton driver - Fix a bug in the Ocelot interrupt handler making the system hang" * tag 'pinctrl-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: ocelot: fix system hang on level based interrupts pinctrl: nuvoton: fix a double free in ma35_pinctrl_dt_node_to_map_func() pinctrl: sophgo: fix double free in cv1800_pctrl_dt_node_to_map() pinctrl: intel: platform: Add Panther Lake to the list of supported pinctrl: aw9523: add missing mutex_destroy pinctrl: stm32: check devm_kasprintf() returned value pinctrl: apple: check devm_kasprintf() returned value pinctrl: intel: platform: use semicolon instead of comma in ncommunities assignment pinctrl: intel: platform: fix error path in device_for_each_child_node()	2024-10-20 13:55:46 -07:00
Kent Overstreet	f0d3302073	bcachefs: Workaround for kvmalloc() not supporting > INT_MAX allocations kvmalloc() doesn't support allocations > INT_MAX, but vmalloc() does - the limit should be lifted, but we can work around this for now. A user with a 75 TB filesystem reported the following journal replay error: https://github.com/koverstreet/bcachefs/issues/769 In journal replay we have to sort and dedup all the keys from the journal, which means we need a large contiguous allocation. Given that the user has 128GB of ram, the 2GB limit on allocation size has become far too small. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-20 16:50:14 -04:00
Kent Overstreet	3956ff8bc2	bcachefs: Don't use wait_event_interruptible() in recovery Fix a bug where mount was failing with -ERESTARTSYS: https://github.com/koverstreet/bcachefs/issues/741 We only want the interruptible wait when called from fsync. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-20 16:50:14 -04:00
Kent Overstreet	eb5db64c45	bcachefs: Fix __bch2_fsck_err() warning We only warn about having a btree_trans that wasn't passed in if we'll be prompting. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-20 16:50:14 -04:00
Linus Torvalds	c55228220d	Merge tag 'char-misc-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull misc driver fixes from Greg KH: "Here are a number of small char/misc/iio driver fixes for 6.12-rc4: - loads of small iio driver fixes for reported problems - parport driver out-of-bounds fix - Kconfig description and MAINTAINERS file updates All of these, except for the Kconfig and MAINTAINERS file updates have been in linux-next all week. Those other two are just documentation changes and will have no runtime issues and were merged on Friday" * tag 'char-misc-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (39 commits) misc: rtsx: list supported models in Kconfig help MAINTAINERS: Remove some entries due to various compliance requirements. misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for OTP device misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for EEPROM device parport: Proper fix for array out-of-bounds access iio: frequency: admv4420: fix missing select REMAP_SPI in Kconfig iio: frequency: {admv4420,adrf6780}: format Kconfig entries iio: adc: ad4695: Add missing Kconfig select iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency() iioc: dac: ltc2664: Fix span variable usage in ltc2664_channel_config() iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig iio: frequency: adf4377: add missing select REMAP_SPI in Kconfig iio: resolver: ad2s1210: add missing select (TRIGGERED_)BUFFER in Kconfig iio: resolver: ad2s1210 add missing select REGMAP in Kconfig iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: pressure: bm1390: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig ...	2024-10-20 13:10:44 -07:00
Linus Torvalds	c01ac4b944	Merge tag 'tty-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are some small tty and serial driver fixes for 6.12-rc4: - qcom-geni serial driver fixes, wow what a mess of a UART chip that thing is... - vt infoleak fix for odd font sizes - imx serial driver bugfix - yet-another n_gsm ldisc bugfix, slowly chipping down the issues in that piece of code All of these have been in linux-next for over a week with no reported issues" * tag 'tty-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: qcom-geni: rename suspend functions serial: qcom-geni: drop unused receive parameter serial: qcom-geni: drop flip buffer WARN() serial: qcom-geni: fix rx cancel dma status bit serial: qcom-geni: fix receiver enable serial: qcom-geni: fix dma rx cancellation serial: qcom-geni: fix shutdown race serial: qcom-geni: revert broken hibernation support serial: qcom-geni: fix polled console initialisation serial: imx: Update mctrl old_status on RTSD interrupt tty: n_gsm: Fix use-after-free in gsm_cleanup_mux vt: prevent kernel-infoleak in con_font_get()	2024-10-20 13:03:30 -07:00
Linus Torvalds	b68c189570	Merge tag 'usb-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB driver fixes from Greg KH: "Here are some small USB driver fixes and new device ids for 6.12-rc4: - xhci driver fixes for a number of reported issues - new usb-serial driver ids - dwc3 driver fixes for reported problems. - usb gadget driver fixes for reported problems - typec driver fixes - MAINTAINER file updates All of these have been in linux-next this week with no reported issues" * tag 'usb-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: option: add Telit FN920C04 MBIM compositions USB: serial: option: add support for Quectel EG916Q-GL xhci: dbc: honor usb transfer size boundaries. usb: xhci: Fix handling errors mid TD followed by other errors xhci: Mitigate failed set dequeue pointer commands xhci: Fix incorrect stream context type macro USB: gadget: dummy-hcd: Fix "task hung" problem usb: gadget: f_uac2: fix return value for UAC2_ATTRIBUTE_STRING store usb: dwc3: core: Fix system suspend on TI AM62 platforms xhci: tegra: fix checked USB2 port number usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEF usb: typec: altmode should keep reference to parent MAINTAINERS: usb: raw-gadget: add bug tracker link MAINTAINERS: Add an entry for the LJCA drivers	2024-10-20 12:57:53 -07:00
Linus Torvalds	db87114dcf	Merge tag 'x86_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Explicitly disable the TSC deadline timer when going idle to address some CPU errata in that area - Do not apply the Zenbleed fix on anything else except AMD Zen2 on the late microcode loading path - Clear CPU buffers later in the NMI exit path on 32-bit to avoid register clearing while they still contain sensitive data, for the RDFS mitigation - Do not clobber EFLAGS.ZF with VERW on the opportunistic SYSRET exit path on 32-bit - Fix parsing issues of memory bandwidth specification in sysfs for resctrl's memory bandwidth allocation feature - Other small cleanups and improvements * tag 'x86_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Always explicitly disarm TSC-deadline timer x86/CPU/AMD: Only apply Zenbleed fix for Zen2 during late microcode load x86/bugs: Use code segment selector for VERW operand x86/entry_32: Clear CPU buffers after register restore in NMI return x86/entry_32: Do not clobber user EFLAGS.ZF x86/resctrl: Annotate get_mem_config() functions as __init x86/resctrl: Avoid overflow in MB settings in bw_validate() x86/amd_nb: Add new PCI ID for AMD family 1Ah model 20h	2024-10-20 12:04:32 -07:00
Linus Torvalds	949c9ef59b	Merge tag 'irq_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix a case for sifive-plic where an interrupt gets disabled and masked and remains masked when it gets reenabled later - Plug a small race in GIC-v4 where userspace can force an affinity change of a virtual CPU (vPE) in its unmapping path - Do not mix the two sets of ocelot irqchip's registers in the mask calculation of the main interrupt sticky register - Other smaller fixlets and cleanups * tag 'irq_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/renesas-rzg2l: Fix missing put_device irqchip/riscv-intc: Fix SMP=n boot with ACPI irqchip/sifive-plic: Unmask interrupt in plic_irq_enable() irqchip/gic-v4: Don't allow a VMOVP on a dying VPE irqchip/sifive-plic: Return error code on failure irqchip/riscv-imsic: Fix output text of base address irqchip/ocelot: Comment sticky register clearing code irqchip/ocelot: Fix trigger register address irqchip: Remove obsolete config ARM_GIC_V3_ITS_PCI	2024-10-20 11:44:07 -07:00
Linus Torvalds	2b4d25010d	Merge tag 'sched_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduling fixes from Borislav Petkov: - Add PREEMPT_RT maintainers - Fix another aspect of delayed dequeued tasks wrt determining their state, i.e., whether they're runnable or blocked - Handle delayed dequeued tasks and their migration wrt PSI properly - Fix the situation where a delayed dequeue task gets enqueued into a new class, which should not happen - Fix a case where memory allocation would happen while the runqueue lock is held, which is a no-no - Do not over-schedule when tasks with shorter slices preempt the currently running task - Make sure delayed to deque entities are properly handled before unthrottling - Other smaller cleanups and improvements * tag 'sched_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Add an entry for PREEMPT_RT. sched/fair: Fix external p->on_rq users sched/psi: Fix mistaken CPU pressure indication after corrupted task state bug sched/core: Dequeue PSI signals for blocked tasks that are delayed sched: Fix delayed_dequeue vs switched_from_fair() sched/core: Disable page allocation in task_tick_mm_cid() sched/deadline: Use hrtick_enabled_dl() before start_hrtick_dl() sched/eevdf: Fix wakeup-preempt by checking cfs_rq->nr_running sched: Fix sched_delayed vs cfs_bandwidth	2024-10-20 11:30:56 -07:00
Linus Torvalds	a5ee44c829	Merge tag 'for-linus-6.12a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "A single fix for a build failure introduced this merge window" * tag 'for-linus-6.12a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Remove dependency between pciback and privcmd	2024-10-20 11:25:58 -07:00
Linus Torvalds	10e93e1900	Merge tag 'dma-mapping-6.12-2024-10-20' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: "Just another small tracing fix from Sean" * tag 'dma-mapping-6.12-2024-10-20' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: fix tracing dma_alloc/free with vmalloc'd memory	2024-10-20 10:56:42 -07:00
Paolo Bonzini	e9001a382f	Merge tag 'kvmarm-fixes-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.12, take #3 - Stop wasting space in the HYP idmap, as we are dangerously close to the 4kB limit, and this has already exploded in -next - Fix another race in vgic_init() - Fix a UBSAN error when faking the cache topology with MTE enabled	2024-10-20 12:10:59 -04:00
Paolo Bonzini	ddd5c58201	Merge tag 'kvmarm-fixes-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.12, take #2 - Fix the guest view of the ID registers, making the relevant fields writable from userspace (affecting ID_AA64DFR0_EL1 and ID_AA64PFR1_EL1) - Correcly expose S1PIE to guests, fixing a regression introduced in 6.12-rc1 with the S1POE support - Fix the recycling of stage-2 shadow MMUs by tracking the context (are we allowed to block or not) as well as the recycling state - Address a couple of issues with the vgic when userspace misconfigures the emulation, resulting in various splats. Headaches courtesy of our Syzkaller friends	2024-10-20 12:10:56 -04:00
Cyan Yang	3ec4350d4e	RISCV: KVM: use raw_spinlock for critical section in imsic For the external interrupt updating procedure in imsic, there was a spinlock to protect it already. But since it should not be preempted in any cases, we should turn to use raw_spinlock to prevent any preemption in case PREEMPT_RT was enabled. Signed-off-by: Cyan Yang <cyan.yang@sifive.com> Reviewed-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Message-ID: <20240919160126.44487-1-cyan.yang@sifive.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 12:10:44 -04:00
Sean Christopherson	773cca1834	KVM: selftests: Fix out-of-bounds reads in CPUID test's array lookups When looking for a "mangled", i.e. dynamic, CPUID entry, terminate the walk based on the number of array _entries_, not the size in bytes of the array. Iterating based on the total size of the array can result in false passes, e.g. if the random data beyond the array happens to match a CPUID entry's function and index. Fixes: `fb18d053b7` ("selftest: kvm: x86: test KVM_GET_CPUID2 and guest visible CPUIDs against KVM_GET_SUPPORTED_CPUID") Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-ID: <20241003234337.273364-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 12:10:44 -04:00
Vitaly Kuznetsov	9a400068a1	KVM: selftests: x86: Avoid using SSE/AVX instructions Some distros switched gcc to '-march=x86-64-v3' by default and while it's hard to find a CPU which doesn't support it today, many KVM selftests fail with ==== Test Assertion Failure ==== lib/x86_64/processor.c:570: Unhandled exception in guest pid=72747 tid=72747 errno=4 - Interrupted system call Unhandled exception '0x6' at guest RIP '0x4104f7' The failure is easy to reproduce elsewhere with $ make clean && CFLAGS='-march=x86-64-v3' make -j && ./x86_64/kvm_pv_test The root cause of the problem seems to be that with '-march=x86-64-v3' GCC uses AVX* instructions (VMOVQ in the example above) and without prior XSETBV() in the guest this results in #UD. It is certainly possible to add it there, e.g. the following saves the day as well: Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-ID: <20240920154422.2890096-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 12:10:27 -04:00
Hangbin Liu	3b05b9c36d	MAINTAINERS: add samples/pktgen to NETWORKING [GENERAL] samples/pktgen is missing in the MAINTAINERS file. Suggested-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Message-ID: <20241018005301.10052-1-liuhangbin@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-20 09:41:10 -05:00
Jesper Dangaard Brouer	3e14d8ebaa	mailmap: update entry for Jesper Dangaard Brouer Mapping all my previously used emails to my kernel.org email. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Message-ID: <172909057364.2452383.8019986488234344607.stgit@firesoul> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-20 09:41:10 -05:00
Peter Rashleigh	12bc14949c	net: dsa: mv88e6xxx: Fix error when setting port policy on mv88e6393x mv88e6393x_port_set_policy doesn't correctly shift the ptr value when converting the policy format between the old and new styles, so the target register ends up with the ptr being written over the data bits. Shift the pointer to align with the format expected by mv88e6393x_port_policy_write(). Fixes: `6584b26020` ("net: dsa: mv88e6xxx: implement .port_set_policy for Amethyst") Signed-off-by: Peter Rashleigh <peter@rashleigh.ca> Reviewed-by: Simon Horman <horms@kernel.org> Message-ID: <20241016040822.3917-1-peter@rashleigh.ca> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-20 09:41:10 -05:00
Sean Christopherson	f559b2e9c5	KVM: nSVM: Ignore nCR3[4:0] when loading PDPTEs from memory Ignore nCR3[4:0] when loading PDPTEs from memory for nested SVM, as bits 4:0 of CR3 are ignored when PAE paging is used, and thus VMRUN doesn't enforce 32-byte alignment of nCR3. In the absolute worst case scenario, failure to ignore bits 4:0 can result in an out-of-bounds read, e.g. if the target page is at the end of a memslot, and the VMM isn't using guard pages. Per the APM: The CR3 register points to the base address of the page-directory-pointer table. The page-directory-pointer table is aligned on a 32-byte boundary, with the low 5 address bits 4:0 assumed to be 0. And the SDM's much more explicit: 4:0 Ignored Note, KVM gets this right when loading PDPTRs, it's only the nSVM flow that is broken. Fixes: `e4e517b4be` ("KVM: MMU: Do not unconditionally read PDPTE from guest memory") Reported-by: Kirk Swidowski <swidowski@google.com> Cc: Andy Nguyen <theflow@google.com> Cc: 3pvd <3pvd@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20241009140838.1036226-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 07:31:06 -04:00
Maxim Levitsky	731285fbb6	KVM: VMX: reset the segment cache after segment init in vmx_vcpu_reset() Reset the segment cache after segment initialization in vmx_vcpu_reset() to harden KVM against caching stale/uninitialized data. Without the recent fix to bypass the cache in kvm_arch_vcpu_put(), the following scenario is possible: - vCPU is just created, and the vCPU thread is preempted before SS.AR_BYTES is written in vmx_vcpu_reset(). - When scheduling out the vCPU task, kvm_arch_vcpu_in_kernel() => vmx_get_cpl() reads and caches '0' for SS.AR_BYTES. - vmx_vcpu_reset() => seg_setup() configures SS.AR_BYTES, but doesn't invoke vmx_segment_cache_clear() to invalidate the cache. As a result, KVM retains a stale value in the cache, which can be read, e.g. via KVM_GET_SREGS. Usually this is not a problem because the VMX segment cache is reset on each VM-Exit, but if the userspace VMM (e.g KVM selftests) reads and writes system registers just after the vCPU was created, _without_ modifying SS.AR_BYTES, userspace will write back the stale '0' value and ultimately will trigger a VM-Entry failure due to incorrect SS segment type. Invalidating the cache after writing the VMCS doesn't address the general issue of cache accesses from IRQ context being unsafe, but it does prevent KVM from clobbering the VMCS, i.e. mitigates the harm done _if_ KVM has a bug that results in an unsafe cache access. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Fixes: `2fb92db1ec` ("KVM: VMX: Cache vmcs segment fields") [sean: rework changelog to account for previous patch] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20241009175002.1118178-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 07:31:06 -04:00
Sean Christopherson	5a27984244	KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL Massage the documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL to call out that it applies to moved memslots as well as deleted memslots, to avoid KVM's "fast zap" terminology (which has no meaning for userspace), and to reword the documented targeted zap behavior to specifically say that KVM _may_ zap a subset of all SPTEs. As evidenced by the fix to zap non-leafs SPTEs with gPTEs, formally documenting KVM's exact internal behavior is risky and unnecessary. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20241009192345.1148353-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 07:31:05 -04:00
Sean Christopherson	28cf497881	KVM: x86/mmu: Add lockdep assert to enforce safe usage of kvm_unmap_gfn_range() Add a lockdep assertion in kvm_unmap_gfn_range() to ensure that either mmu_invalidate_in_progress is elevated, or that the range is being zapped due to memslot removal (loosely detected by slots_lock being held). Zapping SPTEs without mmu_invalidate_{in_progress,seq} protection is unsafe as KVM's page fault path snapshots state before acquiring mmu_lock, and thus can create SPTEs with stale information if vCPUs aren't forced to retry faults (due to seeing an in-progress or past MMU invalidation). Memslot removal is a special case, as the memslot is retrieved outside of mmu_invalidate_seq, i.e. doesn't use the "standard" protections, and instead relies on SRCU synchronization to ensure any in-flight page faults are fully resolved before zapping SPTEs. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20241009192345.1148353-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 07:31:05 -04:00
Sean Christopherson	58a20a9435	KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot When performing a targeted zap on memslot removal, zap only MMU pages that shadow guest PTEs, as zapping all SPs that "match" the gfn is inexact and unnecessary. Furthermore, for_each_gfn_valid_sp() arguably shouldn't exist, because it doesn't do what most people would it expect it to do. The "round gfn for level" adjustment that is done for direct SPs (no gPTE) means that the exact gfn comparison will not get a match, even when a SP does "cover" a gfn, or was even created specifically for a gfn. For memslot deletion specifically, KVM's behavior will vary significantly based on the size and alignment of a memslot, and in weird ways. E.g. for a 4KiB memslot, KVM will zap more SPs if the slot is 1GiB aligned than if it's only 4KiB aligned. And as described below, zapping SPs in the aligned case overzaps for direct MMUs, as odds are good the upper-level SPs are serving other memslots. To iterate over all potentially-relevant gfns, KVM would need to make a pass over the hash table for each level, with the gfn used for lookup rounded for said level. And then check that the SP is of the correct level, too, e.g. to avoid over-zapping. But even then, KVM would massively overzap, as processing every level is all but guaranteed to zap SPs that serve other memslots, especially if the memslot being removed is relatively small. KVM could mitigate that issue by processing only levels that can be possible guest huge pages, i.e. are less likely to be re-used for other memslot, but while somewhat logical, that's quite arbitrary and would be a bit of a mess to implement. So, zap only SPs with gPTEs, as the resulting behavior is easy to describe, is predictable, and is explicitly minimal, i.e. KVM only zaps SPs that absolutely must be zapped. Cc: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Message-ID: <20241009192345.1148353-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 07:08:17 -04:00
Kirill A. Shutemov	8e690b817e	x86/kvm: Override default caching mode for SEV-SNP and TDX AMD SEV-SNP and Intel TDX have limited access to MTRR: either it is not advertised in CPUID or it cannot be programmed (on TDX, due to #VE on CR0.CD clear). This results in guests using uncached mappings where it shouldn't and pmd/pud_set_huge() failures due to non-uniform memory type reported by mtrr_type_lookup(). Override MTRR state, making it WB by default as the kernel does for Hyper-V guests. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Suggested-by: Binbin Wu <binbin.wu@intel.com> Cc: Juergen Gross <jgross@suse.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Juergen Gross <jgross@suse.com> Message-ID: <20241015095818.357915-1-kirill.shutemov@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 07:07:02 -04:00
Dr. David Alan Gilbert	bc07eea2f3	KVM: Remove unused kvm_vcpu_gfn_to_pfn_atomic The last use of kvm_vcpu_gfn_to_pfn_atomic was removed by commit `1bbc60d0c7` ("KVM: x86/mmu: Remove MMU auditing") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Message-ID: <20241001141354.18009-3-linux@treblig.org> [Adjust Documentation/virt/kvm/locking.rst. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 07:05:51 -04:00
Dr. David Alan Gilbert	88a387cf9e	KVM: Remove unused kvm_vcpu_gfn_to_pfn The last use of kvm_vcpu_gfn_to_pfn was removed by commit `b1624f99aa` ("KVM: Remove kvm_vcpu_gfn_to_page() and kvm_vcpu_gpa_to_page()") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Message-ID: <20241001141354.18009-2-linux@treblig.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 07:04:52 -04:00
Linus Torvalds	715ca9dd68	Merge tag 'io_uring-6.12-20241019' of git://git.kernel.dk/linux Pull one more io_uring fix from Jens Axboe: "Fix for a regression introduced in 6.12-rc2, where a condition check was negated and hence -EAGAIN would bubble back up up to userspace rather than trigger a retry condition" * tag 'io_uring-6.12-20241019' of git://git.kernel.dk/linux: io_uring/rw: fix wrong NOWAIT check in io_rw_init_file()	2024-10-19 17:04:52 -07:00
Aleksandr Mishin	eb592008f7	octeon_ep: Add SKB allocation failures handling in __octep_oq_process_rx() build_skb() returns NULL in case of a memory allocation failure so handle it inside __octep_oq_process_rx() to avoid NULL pointer dereference. __octep_oq_process_rx() is called during NAPI polling by the driver. If skb allocation fails, keep on pulling packets out of the Rx DMA queue: we shouldn't break the polling immediately and thus falsely indicate to the octep_napi_poll() that the Rx pressure is going down. As there is no associated skb in this case, don't process the packets and don't push them up the network stack - they are skipped. Helper function is implemented to unmmap/flush all the fragment buffers used by the dropped packet. 'alloc_failures' counter is incremented to mark the skb allocation error in driver statistics. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `37d79d0596` ("octeon_ep: add Tx/Rx processing and interrupt support") Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-19 16:20:07 -05:00
Aleksandr Mishin	bd28df2619	octeon_ep: Implement helper for iterating packets in Rx queue The common code with some packet and index manipulations is extracted and moved to newly implemented helper to make the code more readable and avoid duplication. This is a preparation for skb allocation failure handling. Found by Linux Verification Center (linuxtesting.org) with SVACE. Suggested-by: Simon Horman <horms@kernel.org> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-19 16:20:07 -05:00
Vadim Fedorenko	4ab3e4983b	bnxt_en: replace ptp_lock with irqsave variant In netpoll configuration the completion processing can happen in hard irq context which will break with spin_lock_bh() for fullfilling RX timestamp in case of all packets timestamping. Replace it with spin_lock_irqsave() variant. Fixes: `7f5515d19c` ("bnxt_en: Get the RX packet timestamp") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Message-ID: <20241016195234.2622004-1-vadfed@meta.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-19 16:16:25 -05:00
Michel Alex	de96f6a300	net: phy: dp83822: Fix reset pin definitions This change fixes a rare issue where the PHY fails to detect a link due to incorrect reset behavior. The SW_RESET definition was incorrectly assigned to bit 14, which is the Digital Restart bit according to the datasheet. This commit corrects SW_RESET to bit 15 and assigns DIG_RESTART to bit 14 as per the datasheet specifications. The SW_RESET define is only used in the phy_reset function, which fully re-initializes the PHY after the reset is performed. The change in the bit definitions should not have any negative impact on the functionality of the PHY. v2: - added Fixes tag - improved commit message Cc: stable@vger.kernel.org Fixes: `5dc39fd5ef` ("net: phy: DP83822: Add ability to advertise Fiber connection") Signed-off-by: Alex Michel <alex.michel@wiedemann-group.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Message-ID: <AS1P250MB0608A798661549BF83C4B43EA9462@AS1P250MB0608.EURP250.PROD.OUTLOOK.COM> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-19 16:14:21 -05:00
Jakub Kicinski	9f86df0e75	MAINTAINERS: add Simon as an official reviewer Simon has been diligently and consistently reviewing networking changes for at least as long as our development statistics go back. Often if not usually topping the list of reviewers. Make his role official. Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Message-ID: <20241015153005.2854018-1-kuba@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-19 16:07:46 -05:00
Jakub Boehm	f99cf996ba	net: plip: fix break; causing plip to never transmit Since commit `71ae2cb305` ("net: plip: Fix fall-through warnings for Clang") plip was not able to send any packets, this patch replaces one unintended break; with fallthrough; which was originally missed by commit `9525d69a36` ("net: plip: mark expected switch fall-throughs"). I have verified with a real hardware PLIP connection that everything works once again after applying this patch. Fixes: `71ae2cb305` ("net: plip: Fix fall-through warnings for Clang") Signed-off-by: Jakub Boehm <boehm.jakub@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Message-ID: <20241015-net-plip-tx-fix-v1-1-32d8be1c7e0b@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-19 16:06:55 -05:00
Wang Hai	e4dd8bfe0f	be2net: fix potential memory leak in be_xmit() The be_xmit() returns NETDEV_TX_OK without freeing skb in case of be_xmit_enqueue() fails, add dev_kfree_skb_any() to fix it. Fixes: `760c295e0e` ("be2net: Support for OS2BMC.") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Message-ID: <20241015144802.12150-1-wanghai38@huawei.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-19 16:06:08 -05:00
Wang Hai	2cb3f56e82	net/sun3_82586: fix potential memory leak in sun3_82586_send_packet() The sun3_82586_send_packet() returns NETDEV_TX_OK without freeing skb in case of skb->len being too long, add dev_kfree_skb() to fix it. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Message-ID: <20241015144148.7918-1-wanghai38@huawei.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-19 16:04:20 -05:00
Kory Maincent	f2767a4195	net: pse-pd: Fix out of bound for loop Adjust the loop limit to prevent out-of-bounds access when iterating over PI structures. The loop should not reach the index pcdev->nr_lines since we allocate exactly pcdev->nr_lines number of PI structures. This fix ensures proper bounds are maintained during iterations. Fixes: `9be9567a7c` ("net: pse-pd: Add support for PSE PIs") Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Message-ID: <20241015130255.125508-1-kory.maincent@bootlin.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>	2024-10-19 15:55:56 -05:00
Linus Torvalds	531643fcd9	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Fixes all in drivers. The largest is the mpi3mr which corrects a phy count limit that should only apply to the controller but was being incorrectly applied to expander phys" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target: core: Fix null-ptr-deref in target_alloc_device() scsi: mpi3mr: Validate SAS port assignments scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down scsi: ufs: core: Requeue aborted request scsi: ufs: core: Fix the issue of ICU failure	2024-10-19 12:52:19 -07:00
Linus Torvalds	06526daaff	Merge tag 'ftrace-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace fixes from Steven Rostedt: "A couple of fixes to function graph infrastructure: - Fix allocation of idle shadow stack allocation during hotplug If function graph tracing is started when a CPU is offline, if it were come online during the trace then the idle task that represents the CPU will not get a shadow stack allocated for it. This means all function graph hooks that happen while that idle task is running (including in interrupt mode) will have all its events dropped. Switch over to the CPU hotplug mechanism that will have any newly brought on line CPU get a callback that can allocate the shadow stack for its idle task. - Fix allocation size of the ret_stack_list array When function graph tracing converted over to allowing more than one user at a time, it had to convert its shadow stack from an array of ret_stack structures to an array of unsigned longs. The shadow stacks are allocated in batches of 32 at a time and assigned to every running task. The batch is held by the ret_stack_list array. But when the conversion happened, instead of allocating an array of 32 pointers, it was allocated as a ret_stack itself (PAGE_SIZE). This ret_stack_list gets passed to a function that iterates over what it believes is its size defined by the FTRACE_RETSTACK_ALLOC_SIZE macro (which is 32). Luckily (PAGE_SIZE) is greater than 32 * sizeof(long), otherwise this would have been an array overflow. This still should be fixed and the ret_stack_list should be allocated to the size it is expected to be as someday it may end up being bigger than SHADOW_STACK_SIZE" * tag 'ftrace-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: fgraph: Allocate ret_stack_list with proper size fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks	2024-10-19 12:42:14 -07:00
Linus Torvalds	8203ca3809	Merge tag 'ipe-pr-20241018' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe Pull ipe fixes from Fan Wu: "This addresses several issues identified by Luca when attempting to enable IPE on Debian and systemd: - address issues with IPE policy update errors and policy update version check, improving the clarity of error messages for better understanding by userspace programs. - enable IPE policies to be signed by secondary and platform keyrings, facilitating broader use across general Linux distributions like Debian. - updates the IPE entry in the MAINTAINERS file to reflect the new tree URL and my updated email from kernel.org" * tag 'ipe-pr-20241018' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe: MAINTAINERS: update IPE tree url and Fan Wu's email ipe: fallback to platform keyring also if key in trusted keyring is rejected ipe: allow secondary and platform keyrings to install/update policies ipe: also reject policy updates with the same version ipe: return -ESTALE instead of -EINVAL on update when new policy has a lower version	2024-10-19 11:48:14 -07:00
Linus Torvalds	f9e4825524	Merge tag 'input-for-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a fix for Zinitix driver to not fail probing if the property enabling touch keys functionality is not defined. Support for touch keys was added in 6.12 merge window so this issue does not affect users of released kernels - a couple new vendor/device IDs in xpad driver to enable support for more hardware * tag 'input-for-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: zinitix - don't fail if linux,keycodes prop is absent Input: xpad - add support for MSI Claw A1M Input: xpad - add support for 8BitDo Ultimate 2C Wireless Controller	2024-10-19 10:18:03 -07:00
Linus Torvalds	9197b73fd7	Merge tag '9p-for-6.12-rc4' of https://github.com/martinetd/linux Pull 9p fixes from Dominique Martinet: "Mashed-up update that I sat on too long: - fix for multiple slabs created with the same name - enable multipage folios - theorical fix to also look for opened fids by inode if none was found by dentry" [ Enabling multi-page folios should have been done during the merge window, but it's a one-liner, and the actual meat of the enablement is in netfs and already in use for other filesystems... - Linus ] * tag '9p-for-6.12-rc4' of https://github.com/martinetd/linux: 9p: Avoid creating multiple slab caches with the same name 9p: Enable multipage folios 9p: v9fs_fid_find: also lookup by inode if not found dentry	2024-10-19 08:44:10 -07:00
Linus Torvalds	4e6bd4a33a	Merge tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux Pull rust fixes from Miguel Ojeda: "Toolchain and infrastructure: - Fix several issues with the 'rustc-option' macro. It includes a refactor from Masahiro of three '{cc,rust}-' macros, which is not a fix but avoids repeating the same commands (which would be several lines in the case of 'rustc-option'). - Fix conditions for 'CONFIG_HAVE_CFI_ICALL_NORMALIZE_INTEGERS'. It includes the addition of 'CONFIG_RUSTC_LLVM_VERSION', which is not a fix but is needed for the actual fix. And a trivial grammar fix" tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux: cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERS kbuild: rust: add `CONFIG_RUSTC_LLVM_VERSION` kbuild: fix issues with rustc-option kbuild: refactor cc-option-yn, cc-disable-warning, rust-option-yn macros lib/Kconfig.debug: fix grammar in RUST_BUILD_ASSERT_ALLOW	2024-10-19 08:32:47 -07:00
Jens Axboe	ae6a888a43	io_uring/rw: fix wrong NOWAIT check in io_rw_init_file() A previous commit improved how !FMODE_NOWAIT is dealt with, but inadvertently negated a check whilst doing so. This caused -EAGAIN to be returned from reading files with O_NONBLOCK set. Fix up the check for REQ_F_SUPPORT_NOWAIT. Reported-by: Julian Orth <ju.orth@gmail.com> Link: https://github.com/axboe/liburing/issues/1270 Fixes: `f7c9134385` ("io_uring/rw: allow pollable non-blocking attempts for !FMODE_NOWAIT") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-19 09:25:45 -06:00
Jinjie Ruan	369f056889	iio: gts-helper: Fix memory leaks for the error path of iio_gts_build_avail_scale_table() If per_time_scales[i] or per_time_gains[i] kcalloc fails in the for loop of iio_gts_build_avail_scale_table(), the err_free_out will fail to call kfree() each time when i is reduced to 0, so all the per_time_scales[0] and per_time_gains[0] will not be freed, which will cause memory leaks. Fix it by checking if i >= 0. Cc: stable@vger.kernel.org Fixes: `38416c28e1` ("iio: light: Add gain-time-scale helpers") Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20241016012453.2013302-1-ruanjinjie@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-19 15:10:33 +01:00
Jinjie Ruan	691e79ffc4	iio: gts-helper: Fix memory leaks in iio_gts_build_avail_scale_table() modprobe iio-test-gts and rmmod it, then the following memory leak occurs: unreferenced object 0xffffff80c810be00 (size 64): comm "kunit_try_catch", pid 1654, jiffies 4294913981 hex dump (first 32 bytes): 02 00 00 00 08 00 00 00 20 00 00 00 40 00 00 00 ........ ...@... 80 00 00 00 00 02 00 00 00 04 00 00 00 08 00 00 ................ backtrace (crc a63d875e): [<0000000028c1b3c2>] kmemleak_alloc+0x34/0x40 [<000000001d6ecc87>] __kmalloc_noprof+0x2bc/0x3c0 [<00000000393795c1>] devm_iio_init_iio_gts+0x4b4/0x16f4 [<0000000071bb4b09>] 0xffffffdf052a62e0 [<000000000315bc18>] 0xffffffdf052a6488 [<00000000f9dc55b5>] kunit_try_run_case+0x13c/0x3ac [<00000000175a3fd4>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000f505065d>] kthread+0x2e8/0x374 [<00000000bbfb0e5d>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80cbfe9e70 (size 16): comm "kunit_try_catch", pid 1658, jiffies 4294914015 hex dump (first 16 bytes): 10 00 00 00 40 00 00 00 80 00 00 00 00 00 00 00 ....@........... backtrace (crc 857f0cb4): [<0000000028c1b3c2>] kmemleak_alloc+0x34/0x40 [<000000001d6ecc87>] __kmalloc_noprof+0x2bc/0x3c0 [<00000000393795c1>] devm_iio_init_iio_gts+0x4b4/0x16f4 [<0000000071bb4b09>] 0xffffffdf052a62e0 [<000000007d089d45>] 0xffffffdf052a6864 [<00000000f9dc55b5>] kunit_try_run_case+0x13c/0x3ac [<00000000175a3fd4>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000f505065d>] kthread+0x2e8/0x374 [<00000000bbfb0e5d>] ret_from_fork+0x10/0x20 ...... It includes 55 times "size 64" memory leaks, which correspond to 5 times test_init_iio_gain_scale() calls with gts_test_gains size 10 (10size(int)) and gts_test_itimes size 5. It also includes 51 times "size 16" memory leak, which correspond to one time __test_init_iio_gain_scale() call with gts_test_gains_gain_low size 3 (3size(int)) and gts_test_itimes size 5. The reason is that the per_time_gains[i] is not freed which is allocated in the "gts->num_itime" for loop in iio_gts_build_avail_scale_table(). Cc: stable@vger.kernel.org Fixes: `38416c28e1` ("iio: light: Add gain-time-scale helpers") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://patch.msgid.link/20241011095512.3667549-1-ruanjinjie@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-19 15:08:26 +01:00
Steven Rostedt	fae4078c28	fgraph: Allocate ret_stack_list with proper size The ret_stack_list is an array of ret_stack shadow stacks for the function graph usage. When the first function graph is enabled, all tasks in the system get a shadow stack. The ret_stack_list is a 32 element array of pointers to these shadow stacks. It allocates the shadow stack in batches (32 stacks at a time), assigns them to running tasks, and continues until all tasks are covered. When the function graph shadow stack changed from an array of ftrace_ret_stack structures to an array of longs, the allocation of ret_stack_list went from allocating an array of 32 elements to just a block defined by SHADOW_STACK_SIZE. Luckily, that's defined as PAGE_SIZE and is much more than enough to hold 32 pointers. But it is way overkill for the amount needed to allocate. Change the allocation of ret_stack_list back to a kcalloc() of FTRACE_RETSTACK_ALLOC_SIZE pointers. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241018215212.23f13f40@rorschach Fixes: `42675b723b` ("function_graph: Convert ret_stack to a series of longs") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-10-18 21:57:20 -04:00
Steven Rostedt	2c02f7375e	fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks The function graph infrastructure allocates a shadow stack for every task when enabled. This includes the idle tasks. The first time the function graph is invoked, the shadow stacks are created and never freed until the task exits. This includes the idle tasks. Only the idle tasks that were for online CPUs had their shadow stacks created when function graph tracing started. If function graph tracing is enabled and a CPU comes online, the idle task representing that CPU will not have its shadow stack created, and all function graph tracing for that idle task will be silently dropped. Instead, use the CPU hotplug mechanism to allocate the idle shadow stacks. This will include idle tasks for CPUs that come online during tracing. This issue can be reproduced by: # cd /sys/kernel/tracing # echo 0 > /sys/devices/system/cpu/cpu1/online # echo 0 > set_ftrace_pid # echo function_graph > current_tracer # echo 1 > options/funcgraph-proc # echo 1 > /sys/devices/system/cpu/cpu1 # grep '<idle>' per_cpu/cpu1/trace \| head Before, nothing would show up. After: 1) <idle>-0 \| 0.811 us \| __enqueue_entity(); 1) <idle>-0 \| 5.626 us \| } /* enqueue_entity */ 1) <idle>-0 \| \| dl_server_update_idle_time() { 1) <idle>-0 \| \| dl_scaled_delta_exec() { 1) <idle>-0 \| 0.450 us \| arch_scale_cpu_capacity(); 1) <idle>-0 \| 1.242 us \| } 1) <idle>-0 \| 1.908 us \| } 1) <idle>-0 \| \| dl_server_start() { 1) <idle>-0 \| \| enqueue_dl_entity() { 1) <idle>-0 \| \| task_contending() { Note, if tracing stops and restarts, the old way would then initialize the onlined CPUs. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/20241018214300.6df82178@rorschach Fixes: `868baf07b1` ("ftrace: Fix memory leak with function graph and cpu hotplug") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-10-18 21:56:56 -04:00
Linus Torvalds	3d5ad2d4ec	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Daniel Borkmann: - Fix BPF verifier to not affect subreg_def marks in its range propagation (Eduard Zingerman) - Fix a truncation bug in the BPF verifier's handling of coerce_reg_to_size_sx (Dimitar Kanaliev) - Fix the BPF verifier's delta propagation between linked registers under 32-bit addition (Daniel Borkmann) - Fix a NULL pointer dereference in BPF devmap due to missing rxq information (Florian Kauer) - Fix a memory leak in bpf_core_apply (Jiri Olsa) - Fix an UBSAN-reported array-index-out-of-bounds in BTF parsing for arrays of nested structs (Hou Tao) - Fix build ID fetching where memory areas backing the file were created with memfd_secret (Andrii Nakryiko) - Fix BPF task iterator tid filtering which was incorrectly using pid instead of tid (Jordan Rome) - Several fixes for BPF sockmap and BPF sockhash redirection in combination with vsocks (Michal Luczaj) - Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered (Andrea Parri) - Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the possibility of an infinite BPF tailcall (Pu Lehui) - Fix a build warning from resolve_btfids that bpf_lsm_key_free cannot be resolved (Thomas Weißschuh) - Fix a bug in kfunc BTF caching for modules where the wrong BTF object was returned (Toke Høiland-Jørgensen) - Fix a BPF selftest compilation error in cgroup-related tests with musl libc (Tony Ambardar) - Several fixes to BPF link info dumps to fill missing fields (Tyrone Wu) - Add BPF selftests for kfuncs from multiple modules, checking that the correct kfuncs are called (Simon Sundberg) - Ensure that internal and user-facing bpf_redirect flags don't overlap (Toke Høiland-Jørgensen) - Switch to use kvzmalloc to allocate BPF verifier environment (Rik van Riel) - Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic splat under RT (Wander Lairson Costa) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (38 commits) lib/buildid: Handle memfd_secret() files in build_id_parse() selftests/bpf: Add test case for delta propagation bpf: Fix print_reg_state's constant scalar dump bpf: Fix incorrect delta propagation between linked registers bpf: Properly test iter/task tid filtering bpf: Fix iter/task tid filtering riscv, bpf: Make BPF_CMPXCHG fully ordered bpf, vsock: Drop static vsock_bpf_prot initialization vsock: Update msg_count on read_skb() vsock: Update rx_bytes on read_skb() bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock selftests/bpf: Add asserts for netfilter link info bpf: Fix link info netfilter flags to populate defrag flag selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx() selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx() bpf: Fix truncation bug in coerce_reg_to_size_sx() selftests/bpf: Assert link info uprobe_multi count & path_size if unset bpf: Fix unpopulated path_size when uprobe_multi fields unset selftests/bpf: Fix cross-compiling urandom_read selftests/bpf: Add test for kfunc module order ...	2024-10-18 16:27:14 -07:00
Linus Torvalds	dbafeddb95	Merge tag 'linux_kselftest-fixes-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fix from Shuah Khan: - fix test makefile to install tests directory without which the test fails with errors * tag 'linux_kselftest-fixes-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftest: hid: add the missing tests directory	2024-10-18 16:11:17 -07:00
Nikita Travkin	2de01e0e57	Input: zinitix - don't fail if linux,keycodes prop is absent When initially adding the touchkey support, a mistake was made in the property parsing code. The possible negative errno from device_property_count_u32() was never checked, which was an oversight left from converting to it from the of_property as part of the review fixes. Re-add the correct handling of the absent property, in which case zero touchkeys should be assumed, which would disable the feature. Reported-by: Jakob Hauser <jahau@rocketmail.com> Tested-by: Jakob Hauser <jahau@rocketmail.com> Fixes: `075d9b22c8` ("Input: zinitix - add touchkey support") Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Nikita Travkin <nikita@trvn.ru> Tested-by: Yassine Oudjana <y.oudjana@protonmail.com> Link: https://lore.kernel.org/r/20241004-zinitix-no-keycodes-v2-1-876dc9fea4b6@trvn.ru Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2024-10-18 16:03:04 -07:00
Linus Torvalds	f8eacd8ad7	Merge tag 'block-6.12-20241018' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Fix target passthrough identifier (Nilay) - Fix tcp locking (Hannes) - Replace list with sbitmap for tracking RDMA rsp tags (Guixen) - Remove unnecessary fallthrough statements (Tokunori) - Remove ready-without-media support (Greg) - Fix multipath partition scan deadlock (Keith) - Fix concurrent PCI reset and remove queue mapping (Maurizio) - Fabrics shutdown fixes (Nilay) - Fix for a kerneldoc warning (Keith) - Fix a race with blk-rq-qos and wakeups (Omar) - Cleanup of checking for always-set tag_set (SurajSonawane2415) - Fix for a crash with CPU hotplug notifiers (Ming) - Don't allow zero-copy ublk on unprivileged device (Ming) - Use array_index_nospec() for CDROM (Josh) - Remove dead code in drbd (David) - Tweaks to elevator loading (Breno) * tag 'block-6.12-20241018' of git://git.kernel.dk/linux: cdrom: Avoid barrier_nospec() in cdrom_ioctl_media_changed() nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function nvme: make keep-alive synchronous operation nvme-loop: flush off pending I/O while shutting down loop controller nvme-pci: fix race condition between reset and nvme_dev_disable() ublk: don't allow user copy for unprivileged device blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race nvme-multipath: defer partition scanning blk-mq: setup queue ->tag_set before initializing hctx elevator: Remove argument from elevator_find_get elevator: do not request_module if elevator exists drbd: Remove unused conn_lowest_minor nvme: disable CC.CRIME (NVME_CC_CRIME) nvme: delete unnecessary fallthru comment nvmet-rdma: use sbitmap to replace rsp free list block: Fix elevator_get_default() checking for NULL q->tag_set nvme: tcp: avoid race between queue_lock lock and destroy nvmet-passthru: clear EUID/NGUID/UUID while using loop target block: fix blk_rq_map_integrity_sg kernel-doc	2024-10-18 15:53:00 -07:00
Linus Torvalds	a041f47898	Merge tag 'io_uring-6.12-20241018' of git://git.kernel.dk/linux Pull io_uring fixes from Jens Axboe: - Fix a regression this merge window where cloning of registered buffers didn't take into account the dummy_ubuf - Fix a race with reading how many SQRING entries are available, causing userspace to need to loop around io_uring_sqring_wait() rather than being able to rely on SQEs being available when it returned - Ensure that the SQPOLL thread is TASK_RUNNING before running task_work off the cancelation exit path * tag 'io_uring-6.12-20241018' of git://git.kernel.dk/linux: io_uring/sqpoll: ensure task state is TASK_RUNNING when running task_work io_uring/rsrc: ignore dummy_ubuf for buffer cloning io_uring/sqpoll: close race on waiting for sqring entries	2024-10-18 15:38:37 -07:00
John Edwards	22a18935d7	Input: xpad - add support for MSI Claw A1M Add MSI Claw A1M controller to xpad_device match table when in xinput mode. Add MSI VID as XPAD_XBOX360_VENDOR. Signed-off-by: John Edwards <uejji@uejji.net> Reviewed-by: Derek J. Clark <derekjohn.clark@gmail.com> Reviewed-by: Christopher Snowhill <kode54@gmail.com> Link: https://lore.kernel.org/r/20241010232020.3292284-4-uejji@uejji.net Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2024-10-18 14:34:57 -07:00
Mark Brown	d641c164f8	ASoC/SoundWire: clean up link DMA during stop for IPC4 Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>: Clean up the link DMA for playback during stop for IPC4 is required to reset the DMA read/write pointers when the stream is prepared and restarted after a call to snd_pcm_drain()/snd_pcm_drop(). The change is mainly on ASoC. We may go via ASoC tree with Vinod's Acked-by tag Ranjani Sridharan (4): ASoC: SOF: ipc4-topology: Do not set ALH node_id for aggregated DAIs ASoC: SOF: Intel: hda: Handle prepare without close for non-HDA DAI's soundwire: intel_ace2x: Send PDI stream number during prepare ASoC: SOF: Intel: hda: Always clean up link DMA during stop drivers/soundwire/intel_ace2x.c \| 19 +++++----------- sound/soc/sof/intel/hda-dai-ops.c \| 23 +++++++++---------- sound/soc/sof/intel/hda-dai.c \| 37 +++++++++++++++++++++++++++---- sound/soc/sof/ipc4-topology.c \| 15 +++++++++++-- 4 files changed, 62 insertions(+), 32 deletions(-) -- 2.43.0	2024-10-18 21:47:03 +01:00
Olga Kornievskaia	8dd91e8d31	nfsd: fix race between laundromat and free_stateid There is a race between laundromat handling of revoked delegations and a client sending free_stateid operation. Laundromat thread finds that delegation has expired and needs to be revoked so it marks the delegation stid revoked and it puts it on a reaper list but then it unlock the state lock and the actual delegation revocation happens without the lock. Once the stid is marked revoked a racing free_stateid processing thread does the following (1) it calls list_del_init() which removes it from the reaper list and (2) frees the delegation stid structure. The laundromat thread ends up not calling the revoke_delegation() function for this particular delegation but that means it will no release the lock lease that exists on the file. Now, a new open for this file comes in and ends up finding that lease list isn't empty and calls nfsd_breaker_owns_lease() which ends up trying to derefence a freed delegation stateid. Leading to the followint use-after-free KASAN warning: kernel: ================================================================== kernel: BUG: KASAN: slab-use-after-free in nfsd_breaker_owns_lease+0x140/0x160 [nfsd] kernel: Read of size 8 at addr ffff0000e73cd0c8 by task nfsd/6205 kernel: kernel: CPU: 2 UID: 0 PID: 6205 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc7+ #9 kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 2069.0.0.0.0 08/03/2024 kernel: Call trace: kernel: dump_backtrace+0x98/0x120 kernel: show_stack+0x1c/0x30 kernel: dump_stack_lvl+0x80/0xe8 kernel: print_address_description.constprop.0+0x84/0x390 kernel: print_report+0xa4/0x268 kernel: kasan_report+0xb4/0xf8 kernel: __asan_report_load8_noabort+0x1c/0x28 kernel: nfsd_breaker_owns_lease+0x140/0x160 [nfsd] kernel: nfsd_file_do_acquire+0xb3c/0x11d0 [nfsd] kernel: nfsd_file_acquire_opened+0x84/0x110 [nfsd] kernel: nfs4_get_vfs_file+0x634/0x958 [nfsd] kernel: nfsd4_process_open2+0xa40/0x1a40 [nfsd] kernel: nfsd4_open+0xa08/0xe80 [nfsd] kernel: nfsd4_proc_compound+0xb8c/0x2130 [nfsd] kernel: nfsd_dispatch+0x22c/0x718 [nfsd] kernel: svc_process_common+0x8e8/0x1960 [sunrpc] kernel: svc_process+0x3d4/0x7e0 [sunrpc] kernel: svc_handle_xprt+0x828/0xe10 [sunrpc] kernel: svc_recv+0x2cc/0x6a8 [sunrpc] kernel: nfsd+0x270/0x400 [nfsd] kernel: kthread+0x288/0x310 kernel: ret_from_fork+0x10/0x20 This patch proposes a fixed that's based on adding 2 new additional stid's sc_status values that help coordinate between the laundromat and other operations (nfsd4_free_stateid() and nfsd4_delegreturn()). First to make sure, that once the stid is marked revoked, it is not removed by the nfsd4_free_stateid(), the laundromat take a reference on the stateid. Then, coordinating whether the stid has been put on the cl_revoked list or we are processing FREE_STATEID and need to make sure to remove it from the list, each check that state and act accordingly. If laundromat has added to the cl_revoke list before the arrival of FREE_STATEID, then nfsd4_free_stateid() knows to remove it from the list. If nfsd4_free_stateid() finds that operations arrived before laundromat has placed it on cl_revoke list, it marks the state freed and then laundromat will no longer add it to the list. Also, for nfsd4_delegreturn() when looking for the specified stid, we need to access stid that are marked removed or freeable, it means the laundromat has started processing it but hasn't finished and this delegreturn needs to return nfserr_deleg_revoked and not nfserr_bad_stateid. The latter will not trigger a FREE_STATEID and the lack of it will leave this stid on the cl_revoked list indefinitely. Fixes: `2d4a532d38` ("nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock") CC: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-10-18 16:40:37 -04:00
Fan Wu	917a15c37d	MAINTAINERS: update IPE tree url and Fan Wu's email Update Integrity Policy Enforcement (IPE) LSM tree url and maintainer's email to the newly issued kernel.org tree/email. Signed-off-by: Fan Wu <wufan@kernel.org>	2024-10-18 12:15:37 -07:00
Luca Boccassi	f40998a8e6	ipe: fallback to platform keyring also if key in trusted keyring is rejected If enabled, we fallback to the platform keyring if the trusted keyring doesn't have the key used to sign the ipe policy. But if pkcs7_verify() rejects the key for other reasons, such as usage restrictions, we do not fallback. Do so, following the same change in dm-verity. Signed-off-by: Luca Boccassi <bluca@debian.org> Suggested-by: Serge Hallyn <serge@hallyn.com> [FW: fixed some line length issues and a typo in the commit message] Signed-off-by: Fan Wu <wufan@kernel.org>	2024-10-18 12:14:53 -07:00
Linus Torvalds	b04ae0f451	Merge tag 'v6.12-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - Fix possible double free setting xattrs - Fix slab out of bounds with large ioctl payload - Remove three unused functions, and an unused variable that could be confusing * tag 'v6.12-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Remove unused functions smb/client: Fix logically dead code smb: client: fix OOBs when building SMB2_IOCTL request smb: client: fix possible double free in smb2_set_ea()	2024-10-18 11:37:12 -07:00
Linus Torvalds	568570fdf2	Merge tag 'xfs-6.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Carlos Maiolino: - Fix integer overflow in xrep_bmap - Fix stale dealloc punching for COW IO * tag 'xfs-6.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: punch delalloc extents from the COW fork for COW writes xfs: set IOMAP_F_SHARED for all COW fork allocations xfs: share more code in xfs_buffered_write_iomap_begin xfs: support the COW fork in xfs_bmap_punch_delalloc_range xfs: IOMAP_ZERO and IOMAP_UNSHARE already hold invalidate_lock xfs: take XFS_MMAPLOCK_EXCL xfs_file_write_zero_eof xfs: factor out a xfs_file_write_zero_eof helper iomap: move locking out of iomap_write_delalloc_release iomap: remove iomap_file_buffered_write_punch_delalloc iomap: factor out a iomap_last_written_block helper xfs: fix integer overflow in xrep_bmap	2024-10-18 11:28:39 -07:00
Linus Torvalds	5e9ab267be	Merge tag 'pm-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix two issues in the amd-pstate cpufreq driver and update the intel_rapl power capping driver with a new processor ID. Specifics: - Enable ACPI CPPC in amd_pstate_register_driver() after disabling it in amd_pstate_unregister_driver() when switching driver operation modes (Dhananjay Ugwekar) - Make amd-pstate use nominal performance as the maximum performance level when boost is disabled (Mario Limonciello) - Add ArrowLake-H to the list of processors where PL4 is supported in the MSR part of the intel_rapl power capping driver (Srinivas Pandruvada)" * tag 'pm-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: powercap: intel_rapl_msr: Add PL4 support for ArrowLake-H cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled cpufreq/amd-pstate: Fix amd_pstate mode switch on shared memory systems	2024-10-18 11:16:01 -07:00
Linus Torvalds	3b3a0ef6ae	Merge tag 'hwmon-for-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fix from Guenter Roeck: "Fix auto-detect regression in jc42 driver" * tag 'hwmon-for-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: [PATCH} hwmon: (jc42) Properly detect TSE2004-compliant devices again	2024-10-18 11:13:53 -07:00
Linus Torvalds	5d97dde4d5	Merge tag 'drm-fixes-2024-10-18' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Weekly fixes, msm and xe are the two main ones, with a bunch of scattered fixes including a largish revert in mgag200, then amdgpu, vmwgfx and scattering of other minor ones. All seems pretty regular. msm: - Display: - move CRTC resource assignment to atomic_check otherwise to make consecutive calls to atomic_check() consistent - fix rounding / sign-extension issues with pclk calculation in case of DSC - cleanups to drop incorrect null checks in dpu snapshots - fix to use kvzalloc in dpu snapshot to avoid allocation issues in heavily loaded system cases - Fix to not program merge_3d block if dual LM is not being used - Fix to not flush merge_3d block if its not enabled otherwise this leads to false timeouts - GPU: - a7xx: add a fence wait before SMMU table update xe: - New workaround to Xe2 (Aradhya) - Fix unbalanced rpm put (Matthew Auld) - Remove fragile lock optimization (Matthew Brost) - Fix job release, delegating it to the drm scheduler (Matthew Brost) - Fix timestamp bit width for Xe2 (Lucas) - Fix external BO's dma-resv usag (Matthew Brost) - Fix returning success for timeout in wait_token (Nirmoy) - Initialize fence to avoid it being detected as signaled (Matthew Auld) - Improve cache flush for BMG (Matthew Auld) - Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka) amdgpu: - SR-IOV fix - CS chunk handling fix - MES fixes - SMU13 fixes amdkfd: - VRAM usage reporting fix radeon: - Fix possible_clones handling i915: - Two DP bandwidth related MST fixes ast: - Clear EDID on unplugged connectors host1x: - Fix boot on Tegra186 - Set DMA parameters mgag200: - Revert VBLANK support panel: - himax-hx83192: Adjust power and gamma qaic: - Sgtable loop fixes vmwgfx: - Limit display layout allocatino size - Handle allocation errors in connector checks - Clean up KMS code for 2d-only setup - Report surface-check errors correctly - Remove NULL test around kvfree()" * tag 'drm-fixes-2024-10-18' of https://gitlab.freedesktop.org/drm/kernel: (45 commits) drm/ast: vga: Clear EDID if no display is connected drm/ast: sil164: Clear EDID if no display is connected Revert "drm/mgag200: Add vblank support" drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater drm/xe/bmg: improve cache flushing behaviour drm/xe/xe_sync: initialise ufence.signalled drm/xe/ufence: ufence can be signaled right after wait_woken drm/xe: Use bookkeep slots for external BO's in exec IOCTL drm/xe/query: Increase timestamp width drm/xe: Don't free job in TDR drm/xe: Take job list lock in xe_sched_add_pending_job drm/xe: fix unbalanced rpm put() with declare_wedged() drm/xe: fix unbalanced rpm put() with fence_fini() drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg drm/i915/dp_mst: Don't require DSC hblank quirk for a non-DSC compatible mode drm/i915/dp_mst: Handle error during DSC BW overhead/slice calculation drm/msm/a6xx+: Insert a fence wait before SMMU table update drm/msm/dpu: don't always program merge_3d block drm/msm/dpu: Don't always set merge_3d pending flush ...	2024-10-18 11:03:21 -07:00
Chancel Liu	da95e891dd	ASoC: fsl_micfil: Add a flag to distinguish with different volume control types On i.MX8MM the register of volume control has positive and negative values. It is different from other platforms like i.MX8MP and i.MX93 which only have positive values. Add a volume_sx flag to use SX_TLV volume control for this kind of platform. Use common TLV volume control for other platforms. Fixes: `cdfa92eb90` ("ASoC: fsl_micfil: Correct the number of steps on SX controls") Signed-off-by: Chancel Liu <chancel.liu@nxp.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Link: https://patch.msgid.link/20241017071507.2577786-1-chancel.liu@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-18 18:44:14 +01:00
Alexey Klimov	9fc9ef0572	ASoC: codecs: lpass-rx-macro: fix RXn(rx,n) macro for DSM_CTL and SEC7 regs Turns out some registers of pre-2.5 version of rxmacro codecs are not located at the expected offsets but 0xc further away in memory. So far the detected registers are CDC_RX_RX2_RX_PATH_SEC7 and CDC_RX_RX2_RX_PATH_DSM_CTL. CDC_RX_RXn_RX_PATH_DSM_CTL(rx, n) macro incorrectly generates the address 0x540 for RX2 but it should be 0x54C and it also overwrites CDC_RX_RX2_RX_PATH_SEC7 which is located at 0x540. The same goes for CDC_RX_RXn_RX_PATH_SEC7(rx, n). Fix this by introducing additional rxn_reg_stride2 offset. For 2.5 version and above this offset will be equal to 0. With such change the corresponding RXn() macros will generate the same values for 2.5 codec version for all RX paths and the same old values for pre-2.5 version for RX0 and RX1. However for the latter case with RX2 path it will also add rxn_reg_stride2 on top. While at this, also remove specific if-check for INTERP_AUX from rx_macro_digital_mute() and rx_macro_enable_interp_clk(). These if-check was used to handle such special offset for AUX interpolator but since CDC_RX_RXn_RX_PATH_SEC7(rx, n) and CDC_RX_RXn_RX_PATH_DSM_CTL(rx, n) macros will generate the correst addresses of dsm register, they are no longer needed. Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patch.msgid.link/20241016221049.1145101-1-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-18 18:44:13 +01:00
Jens Axboe	49c234b50a	Merge tag 'md-6.12-20241018' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-6.12 Pull MD fixes from Song. * tag 'md-6.12-20241018' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md/raid10: fix null ptr dereference in raid10_size() md: ensure child flush IO does not affect origin bio->bi_status	2024-10-18 10:58:24 -06:00
Linus Torvalds	b1b4675167	mm: fix follow_pfnmap API lockdep assert The lockdep asserts for the new follow_pfnmap() API "knows" that a pfnmap always has a vma->vm_file, since that's the only way to create such a mapping. And that's actually true for all the normal cases. But not for the mmap failure case, where the incomplete mapping is torn down and we have cleared vma->vm_file because the failure occured before the file was linked to the vma. So this codepath does actually need to check for vm_file being NULL. Reported-by: Jann Horn <jannh@google.com> Fixes: `6da8e9634b` ("mm: new follow_pfnmap API") Cc: Peter Xu <peterx@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-10-18 09:50:05 -07:00
Rafael J. Wysocki	cf8679bb77	Merge branch 'pm-cpufreq' Merge amd-pstate driver fixes for 6.12-rc4: - Enable ACPI CPPC in amd_pstate_register_driver() after disabling it in amd_pstate_unregister_driver() during driver operation mode switch (Dhananjay Ugwekar). - Make amd-pstate use nominal performance as the maximum performance level when boost is disabled (Mario Limonciello). * pm-cpufreq: cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled cpufreq/amd-pstate: Fix amd_pstate mode switch on shared memory systems	2024-10-18 18:22:43 +02:00
Linus Torvalds	75aa74d52f	Merge tag 'iommu-fixes-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fixes from Joerg Roedel: "ARM-SMMU fixes from Will Deacon: - Clarify warning message when failing to disable the MMU-500 prefetcher - Fix undefined behaviour in calculation of L1 stream-table index when 32-bit StreamIDs are implemented - Replace a rogue comma with a semicolon Intel VT-d fix from Lu Baolu: - Fix incorrect pci_for_each_dma_alias() for non-PCI devices" * tag 'iommu-fixes-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices iommu/arm-smmu-v3: Convert comma to semicolon iommu/arm-smmu-v3: Fix last_sid_idx calculation for sid_bits==32 iommu/arm-smmu: Clarify MMU-500 CPRE workaround	2024-10-18 07:13:24 -07:00
Linus Torvalds	ef444a0aba	Merge tag 'powerpc-6.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Madhavan Srinivasan: - To prevent possible memory leak, free "name" on error in opal_event_init() Thanks to Michael Ellerman and 2639161967. * tag 'powerpc-6.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/powernv: Free name on error in opal_event_init()	2024-10-18 07:07:13 -07:00
Linus Torvalds	c91c14618f	Merge tag 's390-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - Fix PCI error recovery by handling error events correctly - Fix CCA crypto card behavior within protected execution environment - Two KVM commits which fix virtual vs physical address handling bugs in KVM pfault handling - Fix return code handling in pckmo_key2protkey() - Deactivate sclp console as late as possible so that outstanding messages appear on the console instead of being dropped on reboot - Convert newlines to CRLF instead of LFCR for the sclp vt220 driver, as required by the vt220 specification - Initialize also psw mask in perf_arch_fetch_caller_regs() to make sure that user_mode(regs) will return false - Update defconfigs * tag 's390-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: Update defconfigs s390: Initialize psw mask in perf_arch_fetch_caller_regs() s390/sclp_vt220: Convert newlines to CRLF instead of LFCR s390/sclp: Deactivate sclp after all its users s390/pkey_pckmo: Return with success for valid protected key types KVM: s390: Change virtual to physical address access in diag 0x258 handler KVM: s390: gaccess: Check if guest address is in memslot s390/ap: Fix CCA crypto card behavior within protected execution environment s390/pci: Handle PCI error codes other than 0x3a	2024-10-18 07:01:59 -07:00
Yo-Jung (Leo) Lin	9b673c7551	misc: rtsx: list supported models in Kconfig help rts5228, rts5261, rts5264 are supported by the rtsx_pci driver, but they are not mentioned in the Kconfig help when the code was added. List those models in the Kconfig help accordingly. Signed-off-by: Yo-Jung Lin (Leo) <0xff07@gmail.com> Link: https://lore.kernel.org/r/20241017144747.15966-1-0xff07@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-18 13:40:17 +02:00
Greg Kroah-Hartman	6e90b675cf	MAINTAINERS: Remove some entries due to various compliance requirements. Remove some entries due to various compliance requirements. They can come back in the future if sufficient documentation is provided. Link: https://lore.kernel.org/r/2024101835-tiptop-blip-09ed@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-18 13:40:03 +02:00
Thorsten Blum	197231da7f	proc: Fix W=1 build kernel-doc warning Building the kernel with W=1 generates the following warning: fs/proc/fd.c:81: warning: This comment starts with '/**', but isn't a kernel-doc comment. Use a normal comment for the helper function proc_fdinfo_permission(). Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20241018102705.92237-2-thorsten.blum@linux.dev Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-18 13:02:47 +02:00
Hans de Goede	51268879eb	HID: lenovo: Add support for Thinkpad X1 Tablet Gen 3 keyboard The Thinkpad X1 Tablet Gen 3 keyboard has the same Lenovo specific quirks as the original Thinkpad X1 Tablet keyboard. Add the PID for the "Thinkpad X1 Tablet Gen 3 keyboard" to the hid-lenovo driver to fix the FnLock, Mute and media buttons not working. Suggested-by: Izhar Firdaus <izhar@fedoraproject.org> Closes https://bugzilla.redhat.com/show_bug.cgi?id=2315395 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-10-18 12:36:11 +02:00
Kenneth Albanowski	526748b925	HID: multitouch: Add quirk for Logitech Bolt receiver w/ Casa touchpad The Logitech Casa Touchpad does not reliably send touch release signals when communicating through the Logitech Bolt wireless-to-USB receiver. Adjusting the device class to add MT_QUIRK_NOT_SEEN_MEANS_UP to make sure that no touches become stuck, MT_QUIRK_FORCE_MULTI_INPUT is not needed, but harmless. Linux does not have information on which devices are connected to the Bolt receiver, so we have to enable this for the entire device. Signed-off-by: Kenneth Albanowski <kenalba@chromium.org> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-10-18 12:33:47 +02:00
Bartłomiej Maryńczak	293c485cba	HID: i2c-hid: Delayed i2c resume wakeup for 0x0d42 Goodix touchpad Patch for Goodix 27c6:0d42 touchpads found in Inspiron 5515 laptops. After resume from suspend, one can communicate with this device just fine. We can read data from it or request a reset, but for some reason the interrupt line will not go up when new events are available. (it can correctly respond to a reset with an interrupt tho) The only way I found to wake this device up is to send anything to it after ~1.5s mark, for example a simple read request, or power mode change. In this patch, I simply delay the resume steps with msleep, this will cause the set_power request to happen after the ~1.5s barrier causing the device to resume its event interrupts. Sleep was used rather than delayed_work to make this workaround as non-invasive as possible. [jkosina@suse.com: shortlog update] Signed-off-by: Bartłomiej Maryńczak <marynczakbartlomiej@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-10-18 12:32:06 +02:00
Greg Kroah-Hartman	1154a59921	Merge tag 'usb-serial-6.12-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial device ids for 6.12-rc4 Here are some new modem device ids. Everything has been in linux-next over night with no reported issues. * tag 'usb-serial-6.12-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: add Telit FN920C04 MBIM compositions USB: serial: option: add support for Quectel EG916Q-GL	2024-10-18 12:11:28 +02:00
Jiqian Chen	0fd2a74330	xen: Remove dependency between pciback and privcmd Commit `2fae6bb7be` ("xen/privcmd: Add new syscall to get gsi from dev") adds a weak reverse dependency to the config XEN_PRIVCMD definition, that dependency causes xen-privcmd can't be loaded on domU, because dependent xen-pciback isn't always be loaded successfully on domU. To solve above problem, remove that dependency, and do not call pcistub_get_gsi_from_sbdf() directly, instead add a hook in drivers/xen/apci.c, xen-pciback register the real call function, then in privcmd_ioctl_pcidev_get_gsi call that hook. Fixes: `2fae6bb7be` ("xen/privcmd: Add new syscall to get gsi from dev") Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Reviewed-by: Juergen Gross <jgross@suse.com> Message-ID: <20241012084537.1543059-1-Jiqian.Chen@amd.com> Signed-off-by: Juergen Gross <jgross@suse.com>	2024-10-18 11:59:04 +02:00
Kent Overstreet	bc6d2d1041	bcachefs: fsck: Improve hash_check_key() hash_check_key() checks and repairs the hash table btrees: dirents and xattrs are open addressing hash tables. We recently had a corruption reported where the hash type on an inode somehow got flipped, which made the existing dirents invisible and allowed new ones to be created with the same name. Now, hash_check_key() can repair duplicates: it will delete one of them, if it has an xattr or dangling dirent, but if it has two valid dirents one of them gets renamed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	dc96656b20	bcachefs: bch2_hash_set_or_get_in_snapshot() Add a variant of bch2_hash_set_in_snapshot() that returns the existing key on -EEXIST. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	15a3836c8e	bcachefs: Repair mismatches in inode hash seed, type Different versions of the same inode (same inode number, different snapshot ID) must have the same hash seed and type - lookups require this, since they see keys from different snapshots simultaneously. To repair we only need to make the inodes consistent, hash_check_key() will do the rest. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	d8e879377f	bcachefs: Add hash seed, type to inode_to_text() This helped with discovering some filesystem corruption fsck has having trouble with: the str_hash type had gotten flipped on one snapshot's version of an inode. All versions of a given inode number have the same hash seed and hash type, since lookups will be done with a single hash/seed and type and see dirents/xattrs from multiple snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	78cf0ae636	bcachefs: INODE_STR_HASH() for bch_inode_unpacked Trivial cleanup - add a normal BITMASK() helper for bch_inode_unpacked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	b96f8cd387	bcachefs: Run in-kernel offline fsck without ratelimit errors Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Hongbo Li	489ecc4cfd	bcachefs: skip mount option handle for empty string. The options parse in get_tree will split the options buffer, it will get the empty string for last one by strsep(). After commit ea0eeb89b1d5 ("bcachefs: reject unknown mount options") is merged, unknown mount options is not allowed (here is empty string), and this causes this errors. This can be reproduced just by the following steps: bcachefs format /dev/loop mount -t bcachefs -o metadata_target=loop1 /dev/loop1 /mnt/bcachefs/ Fixes: ea0eeb89b1d5 ("bcachefs: reject unknown mount options") Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Hongbo Li	07cf8bac2d	bcachefs: fix incorrect show_options results When call show_options in bcachefs, the options buffer is appeneded to the seq variable. In fact, it requires an additional comma to be appended first. This will affect the remount process when reading existing mount options. Fixes: 9305cf91d05e ("bcachefs: bch2_opts_to_text()") Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	97535cd84f	bcachefs: Fix data corruption on -ENOSPC in buffered write path Found by generic/299: When we have to truncate a write due to -ENOSPC, we may have to read in the folio we're writing to if we're now no longer doing a complete write to a !uptodate folio. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	335d318ef5	bcachefs: bch2_folio_reservation_get_partial() is now better behaved bch2_folio_reservation_get_partial(), on partial success, will now return a reservation that's aligned to the filesystem blocksize. This is a partial fix for fstests generic/299 - fio verify is badly behaved in the presence of short writes that aren't aligned to its blocksize. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	81e0b6c7c1	bcachefs: fix disk reservation accounting in bch2_folio_reservation_get() bch2_disk_reservation_put() zeroes out the reservation - oops. This fixes a disk reservation leak when getting a quota reservation returned an error. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	4007bbb203	bcachefS: ec: fix data type on stripe deletion Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	a0d11feefb	bcachefs: Don't use commit_do() unnecessarily Using commit_do() to call alloc_sectors_start_trans() breaks when we're randomly injecting transaction restarts - the restart in the commit causes us to leak the lock that alloc_sectorS_start_trans() takes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	6bee2a04c5	bcachefs: handle restarts in bch2_bucket_io_time_reset() bch2_bucket_io_time_reset() doesn't need to succeed, which is why it didn't previously retry on transaction restart - but we're now treating these as errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	29fd10a36a	bcachefs: fix restart handling in __bch2_resume_logged_op_finsert() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:48 -04:00
Kent Overstreet	d8b5059774	bcachefs: fix restart handling in bch2_alloc_write_key() This is ugly: We may discover in alloc_write_key that the data type we calculated is wrong, because BCH_DATA_need_discard is checked/set elsewhere, and the disk accounting counters we calculated need to be updated. But bch2_alloc_key_to_dev_counters(..., BTREE_TRIGGER_gc) is not safe w.r.t. transaction restarts, so we need to propagate the fixup back to our gc state in case we take a transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:47 -04:00
Kent Overstreet	7ee4be9c62	bcachefs: fix restart handling in bch2_do_invalidates_work() this one is fairly harmless since the invalidate worker will just run again later if it needs to, but still worth fixing Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:47 -04:00
Kent Overstreet	028f3c1d9b	bcachefs: fix missing restart handling in bch2_read_retry_nodecode() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:47 -04:00
Kent Overstreet	e1c4d2f082	bcachefs: fix restart handling in bch2_fiemap() We were leaking transaction restart errors to userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:47 -04:00
Kent Overstreet	94bdeec8f5	bcachefs: fix bch2_hash_delete() error path we were exiting an iterator that hadn't been initialized Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:47 -04:00
Kent Overstreet	74ec2f3024	bcachefs: fix restart handling in bch2_rename2() This should be impossible to hit in practice; the first lookup within a transaction won't return a restart due to lock ordering, but we're adding fault injection for transaction restarts and shaking out bugs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-18 00:49:47 -04:00
Dave Airlie	83f0007848	Merge tag 'drm-xe-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - New workaround to Xe2 (Aradhya) - Fix unbalanced rpm put (Matthew Auld) - Remove fragile lock optimization (Matthew Brost) - Fix job release, delegating it to the drm scheduler (Matthew Brost) - Fix timestamp bit width for Xe2 (Lucas) - Fix external BO's dma-resv usag (Matthew Brost) - Fix returning success for timeout in wait_token (Nirmoy) - Initialize fence to avoid it being detected as signaled (Matthew Auld) - Improve cache flush for BMG (Matthew Auld) - Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/jkldrex5733ldxrla75b4ayvhujjhw2kccmasl5rotoufoacj4@pkvlrrv4orc7	2024-10-18 13:53:41 +10:00
Linus Torvalds	ade8ff3b6a	Merge tag 'x86_bugs_post_ibpb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 IBPB fixes from Borislav Petkov: "This fixes the IBPB implementation of older AMDs (< gen4) that do not flush the RSB (Return Address Stack) so you can still do some leaking when using a "=ibpb" mitigation for Retbleed or SRSO. Fix it by doing the flushing in software on those generations. IBPB is not the default setting so this is not likely to affect anybody in practice" * tag 'x86_bugs_post_ibpb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/bugs: Do not use UNTRAIN_RET with IBPB on entry x86/bugs: Skip RSB fill at VMEXIT x86/entry: Have entry_ibpb() invalidate return predictions x86/cpufeatures: Add a IBPB_NO_RET BUG flag x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET	2024-10-17 19:12:38 -07:00
Josh Poimboeuf	b0bf1afde7	cdrom: Avoid barrier_nospec() in cdrom_ioctl_media_changed() The barrier_nospec() after the array bounds check is overkill and painfully slow for arches which implement it. Furthermore, most arches don't implement it, so they remain exposed to Spectre v1 (which can affect pretty much any CPU with branch prediction). Instead, clamp the user pointer to a valid range so it's guaranteed to be a valid array index even when the bounds check mispredicts. Fixes: `8270cb10c0` ("cdrom: Fix spectre-v1 gadget") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/1d86f4d9d8fba68e5ca64cdeac2451b95a8bf872.1729202937.git.jpoimboe@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-17 19:47:15 -06:00
Linus Torvalds	4d939780b7	Merge tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "28 hotfixes. 13 are cc:stable. 23 are MM. It is the usual shower of unrelated singletons - please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (28 commits) maple_tree: add regression test for spanning store bug maple_tree: correct tree corruption on spanning store mm/mglru: only clear kswapd_failures if reclaimable mm/swapfile: skip HugeTLB pages for unuse_vma selftests: mm: fix the incorrect usage() info of khugepaged MAINTAINERS: add Jann as memory mapping/VMA reviewer mm: swap: prevent possible data-race in __try_to_reclaim_swap mm: khugepaged: fix the incorrect statistics when collapsing large file folios MAINTAINERS: kasan, kcov: add bugzilla links mm: don't install PMD mappings when THPs are disabled by the hw/process/vma mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw() Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs Docs/damon/maintainer-profile: add missing '_' suffixes for external web links maple_tree: check for MA_STATE_BULK on setting wr_rebalance mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace point mm/damon/tests/sysfs-kunit.h: fix memory leak in damon_sysfs_test_add_targets() mm: remove unused stub for can_swapin_thp() mailmap: add an entry for Andy Chiu MAINTAINERS: add memory mapping/VMA co-maintainers fs/proc: fix build with GCC 15 due to -Werror=unterminated-string-initialization ...	2024-10-17 16:33:06 -07:00
Linus Torvalds	d4b82e5808	Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Two clk driver fixes and a unit test fix: - Terminate the of_device_id table in the Samsung exynosautov920 clk driver so that device matching logic doesn't run off the end of the array into other memory and break matching for any kernel with this driver loaded - Properly limit the max clk ID in the Rockchip clk driver - Use clk kunit helpers in the clk tests so that memory isn't leaked after the test concludes" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: test: Fix some memory leaks clk: rockchip: fix finding of maximum clock ID clk: samsung: Fix out-of-bound access of of_match_node()	2024-10-17 16:24:42 -07:00
Dave Airlie	49ff3e79a7	Merge tag 'drm-misc-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: ast: - Clear EDID on unplugged connectors host1x: - Fix boot on Tegra186 - Set DMA parameters mgag200: - Revert VBLANK support panel: - himax-hx83192: Adjust power and gamma qaic: - Sgtable loop fixes vmwgfx: - Limit display layout allocatino size - Handle allocation errors in connector checks - Clean up KMS code for 2d-only setup - Report surface-check errors correctly - Remove NULL test around kvfree() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241017115516.GA196624@linux.fritz.box	2024-10-18 06:43:17 +10:00
Dave Airlie	7626b4e96b	Merge tag 'drm-intel-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Two DP bandwidth related MST fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZxDLdML9Dwqkb1AW@jlahtine-mobl.ger.corp.intel.com	2024-10-18 06:41:13 +10:00
Dave Airlie	01541a8706	Merge tag 'amd-drm-fixes-6.12-2024-10-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.12-2024-10-16: amdgpu: - SR-IOV fix - CS chunk handling fix - MES fixes - SMU13 fixes amdkfd: - VRAM usage reporting fix radeon: - Fix possible_clones handling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016200514.3520286-1-alexander.deucher@amd.com	2024-10-18 06:13:19 +10:00
Sebastian Andrzej Siewior	5ec36fe24b	MAINTAINERS: Add an entry for PREEMPT_RT. Add a maintainers entry now that the PREEMPT_RT bits are merged. Steven volunteered and asked for the list. There are no files associated with this entry since it is spread over the kernel. It serves as entry for people knowing what they look for. There is a keyword added so if PREEMPT_RT is mentioned somewhere, then the entry will be picked up. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Pavel Machek <pavel@denx.de> Link: https://lore.kernel.org/all/20241015151132.Erx81G9f@linutronix.de	2024-10-17 21:33:30 +02:00
Andrii Nakryiko	5ac9b4e935	lib/buildid: Handle memfd_secret() files in build_id_parse() >From memfd_secret(2) manpage: The memory areas backing the file created with memfd_secret(2) are visible only to the processes that have access to the file descriptor. The memory region is removed from the kernel page tables and only the page tables of the processes holding the file descriptor map the corresponding physical memory. (Thus, the pages in the region can't be accessed by the kernel itself, so that, for example, pointers to the region can't be passed to system calls.) We need to handle this special case gracefully in build ID fetching code. Return -EFAULT whenever secretmem file is passed to build_id_parse() family of APIs. Original report and repro can be found in [0]. [0] https://lore.kernel.org/bpf/ZwyG8Uro%2FSyTXAni@ly-workstation/ Fixes: `de3ec364c3` ("lib/buildid: add single folio-based file reader abstraction") Reported-by: Yi Lai <yi1.lai@intel.com> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Link: https://lore.kernel.org/bpf/20241017175431.6183-A-hca@linux.ibm.com Link: https://lore.kernel.org/bpf/20241017174713.2157873-1-andrii@kernel.org	2024-10-17 21:30:32 +02:00
Jens Axboe	de7007e9e6	Merge tag 'nvme-6.12-2024-10-18' of git://git.infradead.org/nvme into block-6.12 Pull NVMe fixes from Keith: "nvme fixes for Linux 6.12 - Fix target passthrough identifier (Nilay) - Fix tcp locking (Hannes) - Replace list with sbitmap for tracking RDMA rsp tags (Guixen) - Remove unnecessary fallthrough statements (Tokunori) - Remove ready-without-media support (Greg) - Fix multipath partition scan deadlock (Keith) - Fix concurrent PCI reset and remove queue mapping (Maurizio) - Fabrics shutdown fixes (Nilay)" * tag 'nvme-6.12-2024-10-18' of git://git.infradead.org/nvme: nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function nvme: make keep-alive synchronous operation nvme-loop: flush off pending I/O while shutting down loop controller nvme-pci: fix race condition between reset and nvme_dev_disable() nvme-multipath: defer partition scanning nvme: disable CC.CRIME (NVME_CC_CRIME) nvme: delete unnecessary fallthru comment nvmet-rdma: use sbitmap to replace rsp free list nvme: tcp: avoid race between queue_lock lock and destroy nvmet-passthru: clear EUID/NGUID/UUID while using loop target block: fix blk_rq_map_integrity_sg kernel-doc	2024-10-17 12:49:27 -06:00
Luca Boccassi	02e2f9aa33	ipe: allow secondary and platform keyrings to install/update policies The current policy management makes it impossible to use IPE in a general purpose distribution. In such cases the users are not building the kernel, the distribution is, and access to the private key included in the trusted keyring is, for obvious reason, not available. This means that users have no way to enable IPE, since there will be no built-in generic policy, and no access to the key to sign updates validated by the trusted keyring. Just as we do for dm-verity, kernel modules and more, allow the secondary and platform keyrings to also validate policies. This allows users enrolling their own keys in UEFI db or MOK to also sign policies, and enroll them. This makes it sensible to enable IPE in general purpose distributions, as it becomes usable by any user wishing to do so. Keys in these keyrings can already load kernels and kernel modules, so there is no security downgrade. Add a kconfig each, like dm-verity does, but default to enabled if the dependencies are available. Signed-off-by: Luca Boccassi <bluca@debian.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> [FW: fixed some style issues] Signed-off-by: Fan Wu <wufan@kernel.org>	2024-10-17 11:46:10 -07:00
Yu Kuai	825711e001	md/raid10: fix null ptr dereference in raid10_size() In raid10_run() if raid10_set_queue_limits() succeed, the return value is set to zero, and if following procedures failed raid10_run() will return zero while mddev->private is still NULL, causing null ptr dereference in raid10_size(). Fix the problem by only overwrite the return value if raid10_set_queue_limits() failed. Fixes: `3d8466ba68` ("md/raid10: use the atomic queue limit update APIs") Cc: stable@vger.kernel.org Reported-and-tested-by: ValdikSS <iam@valdikss.org.ru> Closes: https://lore.kernel.org/all/0dd96820-fe52-4841-bc58-dbf14d6bfcc8@valdikss.org.ru/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241009014914.1682037-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-10-17 11:43:43 -07:00
Luca Boccassi	5ceecb301e	ipe: also reject policy updates with the same version Currently IPE accepts an update that has the same version as the policy being updated, but it doesn't make it a no-op nor it checks that the old and new policyes are the same. So it is possible to change the content of a policy, without changing its version. This is very confusing from userspace when managing policies. Instead change the update logic to reject updates that have the same version with ESTALE, as that is much clearer and intuitive behaviour. Signed-off-by: Luca Boccassi <bluca@debian.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Fan Wu <wufan@kernel.org>	2024-10-17 11:38:15 -07:00
Luca Boccassi	579941899d	ipe: return -ESTALE instead of -EINVAL on update when new policy has a lower version When loading policies in userspace we want a recognizable error when an update attempts to use an old policy, as that is an error that needs to be treated differently from an invalid policy. Use -ESTALE as it is clear enough for an update mechanism. Signed-off-by: Luca Boccassi <bluca@debian.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Fan Wu <wufan@kernel.org>	2024-10-17 11:37:13 -07:00
Li Nan	62ce0782bb	md: ensure child flush IO does not affect origin bio->bi_status When a flush is issued to an RAID array, a child flush IO is created and issued for each member disk in the RAID array. Since commit `b75197e86e` ("md: Remove flush handling"), each child flush IO has been chained with the original bio. As a result, the failure of any child IO could modify the bi_status of the original bio, potentially impacting the upper-layer filesystem. Fix the issue by preventing child flush IO from altering the original bio->bi_status as before. However, this design introduces a known issue: in the event of a power failure, if a flush IO on a member disk fails, the upper layers may not be informed. This issue is not easy to fix and will not be addressed for the time being in this issue. Fixes: `b75197e86e` ("md: Remove flush handling") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240919063048.2887579-1-linan666@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-10-17 11:35:36 -07:00
Nilay Shroff	599d9f3a10	nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function We no more need acquiring ctrl->lock before accessing the NVMe controller state and instead we can now use the helper nvme_ctrl_state. So replace the use of ctrl->lock from nvme_keep_alive_finish function with nvme_ctrl_state call. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-17 11:07:37 -07:00
Nilay Shroff	d06923670b	nvme: make keep-alive synchronous operation The nvme keep-alive operation, which executes at a periodic interval, could potentially sneak in while shutting down a fabric controller. This may lead to a race between the fabric controller admin queue destroy code path (invoked while shutting down controller) and hw/hctx queue dispatcher called from the nvme keep-alive async request queuing operation. This race could lead to the kernel crash shown below: Call Trace: autoremove_wake_function+0x0/0xbc (unreliable) __blk_mq_sched_dispatch_requests+0x114/0x24c blk_mq_sched_dispatch_requests+0x44/0x84 blk_mq_run_hw_queue+0x140/0x220 nvme_keep_alive_work+0xc8/0x19c [nvme_core] process_one_work+0x200/0x4e0 worker_thread+0x340/0x504 kthread+0x138/0x140 start_kernel_thread+0x14/0x18 While shutting down fabric controller, if nvme keep-alive request sneaks in then it would be flushed off. The nvme_keep_alive_end_io function is then invoked to handle the end of the keep-alive operation which decrements the admin->q_usage_counter and assuming this is the last/only request in the admin queue then the admin->q_usage_counter becomes zero. If that happens then blk-mq destroy queue operation (blk_mq_destroy_ queue()) which could be potentially running simultaneously on another cpu (as this is the controller shutdown code path) would forward progress and deletes the admin queue. So, now from this point onward we are not supposed to access the admin queue resources. However the issue here's that the nvme keep-alive thread running hw/hctx queue dispatch operation hasn't yet finished its work and so it could still potentially access the admin queue resource while the admin queue had been already deleted and that causes the above crash. This fix helps avoid the observed crash by implementing keep-alive as a synchronous operation so that we decrement admin->q_usage_counter only after keep-alive command finished its execution and returns the command status back up to its caller (blk_execute_rq()). This would ensure that fabric shutdown code path doesn't destroy the fabric admin queue until keep-alive request finished execution and also keep-alive thread is not running hw/hctx queue dispatch operation. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-17 11:07:37 -07:00
Nilay Shroff	c199fac88f	nvme-loop: flush off pending I/O while shutting down loop controller While shutting down loop controller, we first quiesce the admin/IO queue, delete the admin/IO tag-set and then at last destroy the admin/IO queue. However it's quite possible that during the window between quiescing and destroying of the admin/IO queue, some admin/IO request might sneak in and if that happens then we could potentially encounter a hung task because shutdown operation can't forward progress until any pending I/O is flushed off. This commit helps ensure that before destroying the admin/IO queue, we unquiesce the admin/IO queue so that any outstanding requests, which are added after the admin/IO queue is quiesced, are now flushed to its completion. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-17 11:07:37 -07:00
Daniel Borkmann	db123e4230	selftests/bpf: Add test case for delta propagation Add a small BPF verifier test case to ensure that alu32 additions to registers are not subject to linked scalar delta tracking. # ./vmtest.sh -- ./test_progs -t verifier_linked_scalars [...] ./test_progs -t verifier_linked_scalars [ 1.413138] tsc: Refined TSC clocksource calibration: 3407.993 MHz [ 1.413524] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcd52370, max_idle_ns: 440795242006 ns [ 1.414223] clocksource: Switched to clocksource tsc [ 1.419640] bpf_testmod: loading out-of-tree module taints kernel. [ 1.420025] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel #500/1 verifier_linked_scalars/scalars: find linked scalars:OK #500 verifier_linked_scalars:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED [ 1.590858] ACPI: PM: Preparing to enter system sleep state S5 [ 1.591402] reboot: Power down [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20241016134913.32249-3-daniel@iogearbox.net	2024-10-17 11:06:34 -07:00
Daniel Borkmann	3e9e708757	bpf: Fix print_reg_state's constant scalar dump print_reg_state() should not consider adding reg->off to reg->var_off.value when dumping scalars. Scalars can be produced with reg->off != 0 through BPF_ADD_CONST, and thus as-is this can skew the register log dump. Fixes: `98d7ca374b` ("bpf: Track delta between "linked" registers.") Reported-by: Nathaniel Theis <nathaniel.theis@nccgroup.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241016134913.32249-2-daniel@iogearbox.net	2024-10-17 11:06:34 -07:00
Daniel Borkmann	3878ae04e9	bpf: Fix incorrect delta propagation between linked registers Nathaniel reported a bug in the linked scalar delta tracking, which can lead to accepting a program with OOB access. The specific code is related to the sync_linked_regs() function and the BPF_ADD_CONST flag, which signifies a constant offset between two scalar registers tracked by the same register id. The verifier attempts to track "similar" scalars in order to propagate bounds information learned about one scalar to others. For instance, if r1 and r2 are known to contain the same value, then upon encountering 'if (r1 != 0x1234) goto xyz', not only does it know that r1 is equal to 0x1234 on the path where that conditional jump is not taken, it also knows that r2 is. Additionally, with env->bpf_capable set, the verifier will track scalars which should be a constant delta apart (if r1 is known to be one greater than r2, then if r1 is known to be equal to 0x1234, r2 must be equal to 0x1233.) The code path for the latter in adjust_reg_min_max_vals() is reached when processing both 32 and 64-bit addition operations. While adjust_reg_min_max_vals() knows whether dst_reg was produced by a 32 or a 64-bit addition (based on the alu32 bool), the only information saved in dst_reg is the id of the source register (reg->id, or'ed by BPF_ADD_CONST) and the value of the constant offset (reg->off). Later, the function sync_linked_regs() will attempt to use this information to propagate bounds information from one register (known_reg) to others, meaning, for all R in linked_regs, it copies known_reg range (and possibly adjusting delta) into R for the case of R->id == known_reg->id. For the delta adjustment, meaning, matching reg->id with BPF_ADD_CONST, the verifier adjusts the register as reg = known_reg; reg += delta where delta is computed as (s32)reg->off - (s32)known_reg->off and placed as a scalar into a fake_reg to then simulate the addition of reg += fake_reg. This is only correct, however, if the value in reg was created by a 64-bit addition. When reg contains the result of a 32-bit addition operation, its upper 32 bits will always be zero. sync_linked_regs() on the other hand, may cause the verifier to believe that the addition between fake_reg and reg overflows into those upper bits. For example, if reg was generated by adding the constant 1 to known_reg using a 32-bit alu operation, then reg->off is 1 and known_reg->off is 0. If known_reg is known to be the constant 0xFFFFFFFF, sync_linked_regs() will tell the verifier that reg is equal to the constant 0x100000000. This is incorrect as the actual value of reg will be 0, as the 32-bit addition will wrap around. Example: 0: (b7) r0 = 0; R0_w=0 1: (18) r1 = 0x80000001; R1_w=0x80000001 3: (37) r1 /= 1; R1_w=scalar() 4: (bf) r2 = r1; R1_w=scalar(id=1) R2_w=scalar(id=1) 5: (bf) r4 = r1; R1_w=scalar(id=1) R4_w=scalar(id=1) 6: (04) w2 += 2147483647; R2_w=scalar(id=1+2147483647,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 7: (04) w4 += 0 ; R4_w=scalar(id=1+0,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 8: (15) if r2 == 0x0 goto pc+1 10: R0=0 R1=0xffffffff80000001 R2=0x7fffffff R4=0xffffffff80000001 R10=fp0 What can be seen here is that r1 is copied to r2 and r4, such that {r1,r2,r4}.id are all the same which later lets sync_linked_regs() to be invoked. Then, in a next step constants are added with alu32 to r2 and r4, setting their ->off, as well as id \|= BPF_ADD_CONST. Next, the conditional will bind r2 and propagate ranges to its linked registers. The verifier now believes the upper 32 bits of r4 are r4=0xffffffff80000001, while actually r4=r1=0x80000001. One approach for a simple fix suitable also for stable is to limit the constant delta tracking to only 64-bit alu addition. If necessary at some later point, BPF_ADD_CONST could be split into BPF_ADD_CONST64 and BPF_ADD_CONST32 to avoid mixing the two under the tradeoff to further complicate sync_linked_regs(). However, none of the added tests from `dedf56d775` ("selftests/bpf: Add tests for add_const") make this necessary at this point, meaning, BPF CI also passes with just limiting tracking to 64-bit alu addition. Fixes: `98d7ca374b` ("bpf: Track delta between "linked" registers.") Reported-by: Nathaniel Theis <nathaniel.theis@nccgroup.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20241016134913.32249-1-daniel@iogearbox.net	2024-10-17 11:06:34 -07:00
Jordan Rome	ee8c7c6c3f	bpf: Properly test iter/task tid filtering Previously test_task_tid was setting `linfo.task.tid` to `getpid()` which is the same as `gettid()` for the parent process. Instead create a new child thread and set `linfo.task.tid` to `gettid()` to make sure the tid filtering logic is working as expected. Signed-off-by: Jordan Rome <linux@jordanrome.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241016210048.1213935-2-linux@jordanrome.com	2024-10-17 10:52:18 -07:00
Jordan Rome	9495a5b731	bpf: Fix iter/task tid filtering In userspace, you can add a tid filter by setting the "task.tid" field for "bpf_iter_link_info". However, `get_pid_task` when called for the `BPF_TASK_ITER_TID` type should have been using `PIDTYPE_PID` (tid) instead of `PIDTYPE_TGID` (pid). Fixes: `f0d74c4da1` ("bpf: Parameterize task iterators.") Signed-off-by: Jordan Rome <linux@jordanrome.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241016210048.1213935-1-linux@jordanrome.com	2024-10-17 10:52:18 -07:00
Maurizio Lombardi	26bc0a81f6	nvme-pci: fix race condition between reset and nvme_dev_disable() nvme_dev_disable() modifies the dev->online_queues field, therefore nvme_pci_update_nr_queues() should avoid racing against it, otherwise we could end up passing invalid values to blk_mq_update_nr_hw_queues(). WARNING: CPU: 39 PID: 61303 at drivers/pci/msi/api.c:347 pci_irq_get_affinity+0x187/0x210 Workqueue: nvme-reset-wq nvme_reset_work [nvme] RIP: 0010:pci_irq_get_affinity+0x187/0x210 Call Trace: <TASK> ? blk_mq_pci_map_queues+0x87/0x3c0 ? pci_irq_get_affinity+0x187/0x210 blk_mq_pci_map_queues+0x87/0x3c0 nvme_pci_map_queues+0x189/0x460 [nvme] blk_mq_update_nr_hw_queues+0x2a/0x40 nvme_reset_work+0x1be/0x2a0 [nvme] Fix the bug by locking the shutdown_lock mutex before using dev->online_queues. Give up if nvme_dev_disable() is running or if it has been executed already. Fixes: `949928c1c7` ("NVMe: Fix possible queue use after freed") Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-17 09:55:22 -07:00
Linus Torvalds	6efbea77b3	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - Disable software tag-based KASAN when compiling with GCC, as functions are incorrectly instrumented leading to a crash early during boot - Fix pkey configuration for kernel threads when POE is enabled - Fix invalid memory accesses in uprobes when targetting load-literal instructions * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: kasan: Disable Software Tag-Based KASAN with GCC Documentation/protection-keys: add AArch64 to documentation arm64: set POR_EL0 for kernel threads arm64: probes: Fix uprobes for big-endian kernels arm64: probes: Fix simulate_ldr*_literal() arm64: probes: Remove broken LDR (literal) uprobe support	2024-10-17 09:51:03 -07:00
Linus Torvalds	c16e5c94c8	Merge tag 'arm-fixes-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Most of the fixes this time are for platform specific drivers, addressing issues found through build testing on freescale, ep93xx, starfive, and npcm platforms, as as well as the ffa firmware. The fixes for the scmi firmware driver address compatibility problems found on broadcom machines. There are only two devicetree fixes, addressing incorrect in configuration on broadcom and marvell machines. The changes to the Documentation and MAINTAINERS files are for clarification only" * tag 'arm-fixes-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firmware: arm_ffa: Avoid string-fortify warning caused by memcpy() firmware: arm_scmi: Queue in scmi layer for mailbox implementation firmware: arm_ffa: Avoid string-fortify warning in export_uuid() firmware: arm_scmi: Give SMC transport precedence over mailbox firmware: arm_scmi: Fix the double free in scmi_debugfs_common_setup() Documentation/process: maintainer-soc: clarify submitting patches dmaengine: cirrus: check that output may be truncated dmaengine: cirrus: ERR_CAST() ioremap error MAINTAINERS: use the canonical soc mailing list address and mark it as L: ARM: dts: bcm2837-rpi-cm3-io3: Fix HDMI hpd-gpio pin arm64: dts: marvell: cn9130-sr-som: fix cp0 mdio pin numbers soc: fsl: cpm1: qmc: Fix unused data compilation warning soc: fsl: cpm1: qmc: Do not use IS_ERR_VALUE() on error pointers reset: starfive: jh71x0: Fix accessing the empty member on JH7110 SoC reset: npcm: convert comma to semicolon	2024-10-17 09:43:36 -07:00
Linus Torvalds	5c94bdab3a	Merge tag 'sound-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes, nothing really stands out: - Usual HD-audio quirks / device-specific fixes - Kconfig dependency fix for UM - A series of minor fixes for SoundWire - Updates of USB-audio LINE6 contact address" * tag 'sound-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2 ALSA/hda: intel-sdw-acpi: add support for sdw-manager-list property read ALSA/hda: intel-sdw-acpi: simplify sdw-master-count property read ALSA/hda: intel-sdw-acpi: fetch fwnode once in sdw_intel_scan_controller() ALSA/hda: intel-sdw-acpi: cleanup sdw_intel_scan_controller ALSA: hda/tas2781: Add new quirk for Lenovo, ASUS, Dell projects ALSA: scarlett2: Add error check after retrieving PEQ filter values ALSA: hda/cs8409: Fix possible NULL dereference sound: Make CONFIG_SND depend on INDIRECT_IOMEM instead of UML ALSA: line6: update contact information ALSA: usb-audio: Fix NULL pointer deref in snd_usb_power_domain_set() ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2 ALSA: hda: Sound support for HP Spectre x360 16 inch model 2024	2024-10-17 09:36:59 -07:00
Linus Torvalds	07d6bf634b	Merge tag 'net-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Current release - new code bugs: - eth: mlx5: HWS, don't destroy more bwc queue locks than allocated Previous releases - regressions: - ipv4: give an IPv4 dev to blackhole_netdev - udp: compute L4 checksum as usual when not segmenting the skb - tcp/dccp: don't use timer_pending() in reqsk_queue_unlink(). - eth: mlx5e: don't call cleanup on profile rollback failure - eth: microchip: vcap api: fix memory leaks in vcap_api_encode_rule_test() - eth: enetc: disable Tx BD rings after they are empty - eth: macb: avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY Previous releases - always broken: - posix-clock: fix missing timespec64 check in pc_clock_settime() - genetlink: hold RCU in genlmsg_mcast() - mptcp: prevent MPC handshake on port-based signal endpoints - eth: vmxnet3: fix packet corruption in vmxnet3_xdp_xmit_frame - eth: stmmac: dwmac-tegra: fix link bring-up sequence - eth: bcmasp: fix potential memory leak in bcmasp_xmit() Misc: - add Andrew Lunn as a co-maintainer of all networking drivers" * tag 'net-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) net/mlx5e: Don't call cleanup on profile rollback failure net/mlx5: Unregister notifier on eswitch init failure net/mlx5: Fix command bitmask initialization net/mlx5: Check for invalid vector index on EQ creation net/mlx5: HWS, use lock classes for bwc locks net/mlx5: HWS, don't destroy more bwc queue locks than allocated net/mlx5: HWS, fixed double free in error flow of definer layout net/mlx5: HWS, removed wrong access to a number of rules variable mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow net: ethernet: mtk_eth_soc: fix memory corruption during fq dma init vmxnet3: Fix packet corruption in vmxnet3_xdp_xmit_frame net: dsa: vsc73xx: fix reception from VLAN-unaware bridges net: ravb: Only advertise Rx/Tx timestamps if hardware supports it net: microchip: vcap api: Fix memory leaks in vcap_api_encode_rule_test() net: phy: mdio-bcm-unimac: Add BCM6846 support dt-bindings: net: brcm,unimac-mdio: Add bcm6846-mdio udp: Compute L4 checksum as usual when not segmenting the skb genetlink: hold RCU in genlmsg_mcast() net: dsa: mv88e6xxx: Fix the max_vid definition for the MV88E6361 tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink(). ...	2024-10-17 09:31:18 -07:00
Sean Anderson	78b2770c93	dma-mapping: fix tracing dma_alloc/free with vmalloc'd memory Not all virtual addresses have physical addresses, such as if they were vmalloc'd. Just trace the virtual address instead of trying to trace a physical address. This aligns with the API, and is good enough to associate dma_alloc with dma_free. Fixes: `038eb433dc` ("dma-mapping: add tracing for dma-mapping API calls") Reported-by: syzbot+b4bfacdec173efaa8567@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/670ebde5.050a0220.d9b66.0154.GAE@google.com/ Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Signed-off-by: Christoph Hellwig <hch@lst.de>	2024-10-17 17:41:29 +02:00
Lorenzo Stoakes	e993457df6	maple_tree: add regression test for spanning store bug Add a regression test to assert that, when performing a spanning store which consumes the entirety of the rightmost right leaf node does not result in maple tree corruption when doing so. This achieves this by building a test tree of 3 levels and establishing a store which ultimately results in a spanned store of this nature. Link: https://lkml.kernel.org/r/30cdc101a700d16e03ba2f9aa5d83f2efa894168.1728314403.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: Bert Karwatzki <spasswolf@web.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 08:35:10 -07:00
Lorenzo Stoakes	bea07fd631	maple_tree: correct tree corruption on spanning store Patch series "maple_tree: correct tree corruption on spanning store", v3. There has been a nasty yet subtle maple tree corruption bug that appears to have been in existence since the inception of the algorithm. This bug seems far more likely to happen since commit `f8d112a4e6` ("mm/mmap: avoid zeroing vma tree in mmap_region()"), which is the point at which reports started to be submitted concerning this bug. We were made definitely aware of the bug thanks to the kind efforts of Bert Karwatzki who helped enormously in my being able to track this down and identify the cause of it. The bug arises when an attempt is made to perform a spanning store across two leaf nodes, where the right leaf node is the rightmost child of the shared parent, AND the store completely consumes the right-mode node. This results in mas_wr_spanning_store() mitakenly duplicating the new and existing entries at the maximum pivot within the range, and thus maple tree corruption. The fix patch corrects this by detecting this scenario and disallowing the mistaken duplicate copy. The fix patch commit message goes into great detail as to how this occurs. This series also includes a test which reliably reproduces the issue, and asserts that the fix works correctly. Bert has kindly tested the fix and confirmed it resolved his issues. Also Mikhail Gavrilov kindly reported what appears to be precisely the same bug, which this fix should also resolve. This patch (of 2): There has been a subtle bug present in the maple tree implementation from its inception. This arises from how stores are performed - when a store occurs, it will overwrite overlapping ranges and adjust the tree as necessary to accommodate this. A range may always ultimately span two leaf nodes. In this instance we walk the two leaf nodes, determine which elements are not overwritten to the left and to the right of the start and end of the ranges respectively and then rebalance the tree to contain these entries and the newly inserted one. This kind of store is dubbed a 'spanning store' and is implemented by mas_wr_spanning_store(). In order to reach this stage, mas_store_gfp() invokes mas_wr_preallocate(), mas_wr_store_type() and mas_wr_walk() in turn to walk the tree and update the object (mas) to traverse to the location where the write should be performed, determining its store type. When a spanning store is required, this function returns false stopping at the parent node which contains the target range, and mas_wr_store_type() marks the mas->store_type as wr_spanning_store to denote this fact. When we go to perform the store in mas_wr_spanning_store(), we first determine the elements AFTER the END of the range we wish to store (that is, to the right of the entry to be inserted) - we do this by walking to the NEXT pivot in the tree (i.e. r_mas.last + 1), starting at the node we have just determined contains the range over which we intend to write. We then turn our attention to the entries to the left of the entry we are inserting, whose state is represented by l_mas, and copy these into a 'big node', which is a special node which contains enough slots to contain two leaf node's worth of data. We then copy the entry we wish to store immediately after this - the copy and the insertion of the new entry is performed by mas_store_b_node(). After this we copy the elements to the right of the end of the range which we are inserting, if we have not exceeded the length of the node (i.e. r_mas.offset <= r_mas.end). Herein lies the bug - under very specific circumstances, this logic can break and corrupt the maple tree. Consider the following tree: Height 0 Root Node / \ pivot = 0xffff / \ pivot = ULONG_MAX / \ 1 A [-----] ... / \ pivot = 0x4fff / \ pivot = 0xffff / \ 2 (LEAVES) B [-----] [-----] C ^--- Last pivot 0xffff. Now imagine we wish to store an entry in the range [0x4000, 0xffff] (note that all ranges expressed in maple tree code are inclusive): 1. mas_store_gfp() descends the tree, finds node A at <=0xffff, then determines that this is a spanning store across nodes B and C. The mas state is set such that the current node from which we traverse further is node A. 2. In mas_wr_spanning_store() we try to find elements to the right of pivot 0xffff by searching for an index of 0x10000: - mas_wr_walk_index() invokes mas_wr_walk_descend() and mas_wr_node_walk() in turn. - mas_wr_node_walk() loops over entries in node A until EITHER it finds an entry whose pivot equals or exceeds 0x10000 OR it reaches the final entry. - Since no entry has a pivot equal to or exceeding 0x10000, pivot 0xffff is selected, leading to node C. - mas_wr_walk_traverse() resets the mas state to traverse node C. We loop around and invoke mas_wr_walk_descend() and mas_wr_node_walk() in turn once again. - Again, we reach the last entry in node C, which has a pivot of 0xffff. 3. We then copy the elements to the left of 0x4000 in node B to the big node via mas_store_b_node(), and insert the new [0x4000, 0xffff] entry too. 4. We determine whether we have any entries to copy from the right of the end of the range via - and with r_mas set up at the entry at pivot 0xffff, r_mas.offset <= r_mas.end, and then we DUPLICATE the entry at pivot 0xffff. 5. BUG! The maple tree is corrupted with a duplicate entry. This requires a very specific set of circumstances - we must be spanning the last element in a leaf node, which is the last element in the parent node. spanning store across two leaf nodes with a range that ends at that shared pivot. A potential solution to this problem would simply be to reset the walk each time we traverse r_mas, however given the rarity of this situation it seems that would be rather inefficient. Instead, this patch detects if the right hand node is populated, i.e. has anything we need to copy. We do so by only copying elements from the right of the entry being inserted when the maximum value present exceeds the last, rather than basing this on offset position. The patch also updates some comments and eliminates the unused bool return value in mas_wr_walk_index(). The work performed in commit `f8d112a4e6` ("mm/mmap: avoid zeroing vma tree in mmap_region()") seems to have made the probability of this event much more likely, which is the point at which reports started to be submitted concerning this bug. The motivation for this change arose from Bert Karwatzki's report of encountering mm instability after the release of kernel v6.12-rc1 which, after the use of CONFIG_DEBUG_VM_MAPLE_TREE and similar configuration options, was identified as maple tree corruption. After Bert very generously provided his time and ability to reproduce this event consistently, I was able to finally identify that the issue discussed in this commit message was occurring for him. Link: https://lkml.kernel.org/r/cover.1728314402.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/48b349a2a0f7c76e18772712d0997a5e12ab0a3b.1728314403.git.lorenzo.stoakes@oracle.com Fixes: `54a611b605` ("Maple Tree: add new data structure") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Bert Karwatzki <spasswolf@web.de> Closes: https://lore.kernel.org/all/20241001023402.3374-1-spasswolf@web.de/ Tested-by: Bert Karwatzki <spasswolf@web.de> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Closes: https://lore.kernel.org/all/CABXGCsOPwuoNOqSMmAvWO2Fz4TEmPnjFj-b7iF+XFRu1h7-+Dg@mail.gmail.com/ Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 08:35:10 -07:00
Cristian Ciocaltea	d8f9d6d826	phy: phy-rockchip-samsung-hdptx: Depend on CONFIG_COMMON_CLK Ensure CONFIG_PHY_ROCKCHIP_SAMSUNG_HDPTX depends on CONFIG_COMMON_CLK to fix the following link errors when compile testing some random kernel configurations: m68k-linux-ld: drivers/phy/rockchip/phy-rockchip-samsung-hdptx.o: in function `rk_hdptx_phy_clk_register': drivers/phy/rockchip/phy-rockchip-samsung-hdptx.c:1031:(.text+0x470): undefined reference to `__clk_get_name' m68k-linux-ld: drivers/phy/rockchip/phy-rockchip-samsung-hdptx.c:1036:(.text+0x4ba): undefined reference to `devm_clk_hw_register' m68k-linux-ld: drivers/phy/rockchip/phy-rockchip-samsung-hdptx.c:1040:(.text+0x4d2): undefined reference to `of_clk_hw_simple_get' m68k-linux-ld: drivers/phy/rockchip/phy-rockchip-samsung-hdptx.c:1040:(.text+0x4da): undefined reference to `devm_of_clk_add_hw_provider' Fixes: `c4b09c5620` ("phy: phy-rockchip-samsung-hdptx: Add clock provider support") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409180305.53PXymZn-lkp@intel.com/ Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240923-sam-hdptx-link-fix-v1-1-8d10d7456305@collabora.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-17 20:49:16 +05:30
Andrea Parri	e59db0623f	riscv, bpf: Make BPF_CMPXCHG fully ordered According to the prototype formal BPF memory consistency model discussed e.g. in [1] and following the ordering properties of the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation with the BPF_CMPXCHG modifier is fully ordered. However, the current RISC-V JIT lowerings fail to meet such memory ordering property. This is illustrated by the following litmus test: BPF BPF__MP+success_cmpxchg+fence { 0:r1=x; 0:r3=y; 0:r5=1; 1:r2=y; 1:r4=f; 1:r7=x; } P0 \| P1 ; (u64 )(r1 + 0) = 1 \| r1 = (u64 )(r2 + 0) ; r2 = cmpxchg_64 (r3 + 0, r4, r5) \| r3 = atomic_fetch_add((u64 )(r4 + 0), r5) ; \| r6 = (u64 *)(r7 + 0) ; exists (1:r1=1 /\ 1:r6=0) whose "exists" clause is not satisfiable according to the BPF memory model. Using the current RISC-V JIT lowerings, the test can be mapped to the following RISC-V litmus test: RISCV RISCV__MP+success_cmpxchg+fence { 0:x1=x; 0:x3=y; 0:x5=1; 1:x2=y; 1:x4=f; 1:x7=x; } P0 \| P1 ; sd x5, 0(x1) \| ld x1, 0(x2) ; L00: \| amoadd.d.aqrl x3, x5, 0(x4) ; lr.d x2, 0(x3) \| ld x6, 0(x7) ; bne x2, x4, L01 \| ; sc.d x6, x5, 0(x3) \| ; bne x6, x4, L00 \| ; fence rw, rw \| ; L01: \| ; exists (1:x1=1 /\ 1:x6=0) where the two stores in P0 can be reordered. Update the RISC-V JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with RELEASE ("rl") annotation in order to meet the expected memory ordering guarantees. The resulting RISC-V JIT lowerings of BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg(). Other lowerings were fixed via `20a759df3b` ("riscv, bpf: make some atomic operations fully ordered"). Fixes: `dd642ccb45` ("riscv, bpf: Implement more atomic operations for RV64") Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lpc.events/event/18/contributions/1949/attachments/1665/3441/bpfmemmodel.2024.09.19p.pdf [1] Link: https://lore.kernel.org/bpf/20241017143628.2673894-1-parri.andrea@gmail.com	2024-10-17 17:14:48 +02:00
Siddharth Vadapalli	b4b32423b6	phy: ti: phy-j721e-wiz: fix usxgmii configuration Commit `b64a85fb8f` ("phy: ti: phy-j721e-wiz.c: Add usxgmii support in wiz driver") added support for USXGMII mode. In doing so, P0_REFCLK_SEL was set to "pcs_mac_clk_divx1_ln_0" (0x3) and P0_STANDARD_MODE was set to LANE_MODE_GEN1, which results in a data rate of 5.15625 Gbps. However, since the USXGMII mode can support up to 10.3125 Gbps data rate, the aforementioned fields should be set to "pcs_mac_clk_divx0_ln_0" (0x2) and LANE_MODE_GEN2 respectively. The signal corresponding to the USXGMII lane of the SERDES has been measured as 5 Gbps without the change and 10 Gbps with the change. Hence, fix the configuration accordingly to support USXGMII up to 10G. Fixes: `b64a85fb8f` ("phy: ti: phy-j721e-wiz.c: Add usxgmii support in wiz driver") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/20241012053937.3596885-1-s-vadapalli@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-17 20:23:02 +05:30
Jan Kiszka	e10c52e7e0	phy: starfive: jh7110-usb: Fix link configuration to controller In order to connect the USB 2.0 PHY to its controller, we also need to set "u0_pdrstn_split_sw_usbpipe_plugen" [1]. Some downstream U-Boot versions did that, but upstream firmware does not, and the kernel must not rely on such behavior anyway. Failing to set this left the USB gadget port invisible to connected hosts behind. Link: https://doc-en.rvspace.org/JH7110/TRM/JH7110_TRM/sys_syscon.html#sys_syscon__section_b3l_fqs_wsb [1] Fixes: `16d3a71c20` ("phy: starfive: Add JH7110 USB 2.0 PHY driver") Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Minda Chen <minda.chen@starfivetech.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20241015070444.20972-2-minda.chen@starfivetech.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-17 20:19:44 +05:30
Johan Hovold	031b46b472	phy: qcom: qmp-pcie: drop bogus x1e80100 qref supplies The PCIe PHYs on x1e80100 do not a have a qref supply so stop requesting one. This also avoids the follow warning at boot: qcom-qmp-pcie-phy 1bfc000.phy: supply vdda-qref not found, using dummy regulator Fixes: `9dab00ee95` ("phy: qcom: qmp-pcie: Add Gen4 4-lanes mode for X1E80100") Fixes: `606060ce8f` ("phy: qcom-qmp-pcie: Add support for X1E80100 g3x2 and g4x2 PCIE") Cc: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20241015121406.15033-1-johan+linaro@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-17 20:12:05 +05:30
Jens Axboe	8f7033aa40	io_uring/sqpoll: ensure task state is TASK_RUNNING when running task_work When the sqpoll is exiting and cancels pending work items, it may need to run task_work. If this happens from within io_uring_cancel_generic(), then it may be under waiting for the io_uring_task waitqueue. This results in the below splat from the scheduler, as the ring mutex may be attempted grabbed while in a TASK_INTERRUPTIBLE state. Ensure that the task state is set appropriately for that, just like what is done for the other cases in io_run_task_work(). do not call blocking ops when !TASK_RUNNING; state=1 set at [<0000000029387fd2>] prepare_to_wait+0x88/0x2fc WARNING: CPU: 6 PID: 59939 at kernel/sched/core.c:8561 __might_sleep+0xf4/0x140 Modules linked in: CPU: 6 UID: 0 PID: 59939 Comm: iou-sqp-59938 Not tainted 6.12.0-rc3-00113-g8d020023b155 #7456 Hardware name: linux,dummy-virt (DT) pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : __might_sleep+0xf4/0x140 lr : __might_sleep+0xf4/0x140 sp : ffff80008c5e7830 x29: ffff80008c5e7830 x28: ffff0000d93088c0 x27: ffff60001c2d7230 x26: dfff800000000000 x25: ffff0000e16b9180 x24: ffff80008c5e7a50 x23: 1ffff000118bcf4a x22: ffff0000e16b9180 x21: ffff0000e16b9180 x20: 000000000000011b x19: ffff80008310fac0 x18: 1ffff000118bcd90 x17: 30303c5b20746120 x16: 74657320313d6574 x15: 0720072007200720 x14: 0720072007200720 x13: 0720072007200720 x12: ffff600036c64f0b x11: 1fffe00036c64f0a x10: ffff600036c64f0a x9 : dfff800000000000 x8 : 00009fffc939b0f6 x7 : ffff0001b6327853 x6 : 0000000000000001 x5 : ffff0001b6327850 x4 : ffff600036c64f0b x3 : ffff8000803c35bc x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000e16b9180 Call trace: __might_sleep+0xf4/0x140 mutex_lock+0x84/0x124 io_handle_tw_list+0xf4/0x260 tctx_task_work_run+0x94/0x340 io_run_task_work+0x1ec/0x3c0 io_uring_cancel_generic+0x364/0x524 io_sq_thread+0x820/0x124c ret_from_fork+0x10/0x20 Cc: stable@vger.kernel.org Fixes: `af5d68f889` ("io_uring/sqpoll: manage task_work privately") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-17 08:38:04 -06:00
Daniele Palmas	6d951576ee	USB: serial: option: add Telit FN920C04 MBIM compositions Add the following Telit FN920C04 compositions: 0x10a2: MBIM + tty (AT/NMEA) + tty (AT) + tty (diag) T: Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10a2 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FN920 S: SerialNumber=92c4c4d8 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x10a7: MBIM + tty (AT) + tty (AT) + tty (diag) T: Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 18 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10a7 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FN920 S: SerialNumber=92c4c4d8 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x10aa: MBIM + tty (AT) + tty (diag) + DPL (data packet logging) + adb T: Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 15 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10aa Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FN920 S: SerialNumber=92c4c4d8 C: #Ifs= 6 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2024-10-17 16:38:02 +02:00
Benjamin B. Frost	540eff5d7f	USB: serial: option: add support for Quectel EG916Q-GL Add Quectel EM916Q-GL with product ID 0x6007 T: Bus=01 Lev=02 Prnt=02 Port=01 Cnt=01 Dev#= 3 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=2c7c ProdID=6007 Rev= 2.00 S: Manufacturer=Quectel S: Product=EG916Q-GL C:* #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=200mA A: FirstIf#= 4 IfCount= 2 Cls=02(comm.) Sub=06 Prot=00 I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 16 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=86(I) Atr=03(Int.) MxPS= 16 Ivl=32ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=cdc_ether E: Ad=88(I) Atr=03(Int.) MxPS= 32 Ivl=32ms I: If#= 5 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether I:* If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms MI_00 Quectel USB Diag Port MI_01 Quectel USB NMEA Port MI_02 Quectel USB AT Port MI_03 Quectel USB Modem Port MI_04 Quectel USB Net Port Signed-off-by: Benjamin B. Frost <benjamin@geanix.com> Reviewed-by: Lars Melin <larsm17@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2024-10-17 16:34:44 +02:00
Kalle Valo	a940b3a1ad	Merge tag 'ath-current-20241016' of git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath ath.git patches for v6.12-rc4 Fix two instances of memory leaks, one in ath10k and one in ath11k.	2024-10-17 17:25:37 +03:00
Bitterblue Smith	a95d28a8a2	wifi: rtlwifi: rtl8192du: Don't claim USB ID 0bda:8171 This ID appears to be RTL8188SU, not RTL8192DU. This is the wrong driver for RTL8188SU. The r8712u driver from staging handles this ID. I think this ID comes from the original rtl8192du driver from Realtek. I don't know if they added it by mistake, or it was actually used for two different chips. RTL8188SU with this ID exists in the wild. RTL8192DU with this ID probably doesn't. Fixes: `b5dc8873b6` ("wifi: rtlwifi: Add rtl8192du/sw.c") Cc: stable@vger.kernel.org # v6.11 Closes: https://github.com/lwfinger/rtl8192du/issues/105 Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/40245564-41fe-4a5e-881f-cd517255b20a@gmail.com	2024-10-17 17:25:03 +03:00
Bitterblue Smith	4aefde403d	wifi: rtw88: Fix the RX aggregation in USB 3 mode RTL8822CU, RTL8822BU, and RTL8821CU don't need BIT_EN_PRE_CALC. In fact, RTL8822BU in USB 3 mode doesn't pass all the frames to the driver, resulting in much lower download speed than normal: $ iperf3 -c 192.168.0.1 -R Connecting to host 192.168.0.1, port 5201 Reverse mode, remote host 192.168.0.1 is sending [ 5] local 192.168.0.50 port 43062 connected to 192.168.0.1 port 5201 [ ID] Interval Transfer Bitrate [ 5] 0.00-1.00 sec 26.9 MBytes 225 Mbits/sec [ 5] 1.00-2.00 sec 7.50 MBytes 62.9 Mbits/sec [ 5] 2.00-3.00 sec 8.50 MBytes 71.3 Mbits/sec [ 5] 3.00-4.00 sec 8.38 MBytes 70.3 Mbits/sec [ 5] 4.00-5.00 sec 7.75 MBytes 65.0 Mbits/sec [ 5] 5.00-6.00 sec 8.00 MBytes 67.1 Mbits/sec [ 5] 6.00-7.00 sec 8.00 MBytes 67.1 Mbits/sec [ 5] 7.00-8.00 sec 7.75 MBytes 65.0 Mbits/sec [ 5] 8.00-9.00 sec 7.88 MBytes 66.1 Mbits/sec [ 5] 9.00-10.00 sec 7.88 MBytes 66.1 Mbits/sec - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.02 sec 102 MBytes 85.1 Mbits/sec 224 sender [ 5] 0.00-10.00 sec 98.6 MBytes 82.7 Mbits/sec receiver Don't set BIT_EN_PRE_CALC. Then the speed is much better: % iperf3 -c 192.168.0.1 -R Connecting to host 192.168.0.1, port 5201 Reverse mode, remote host 192.168.0.1 is sending [ 5] local 192.168.0.50 port 39000 connected to 192.168.0.1 port 5201 [ ID] Interval Transfer Bitrate [ 5] 0.00-1.00 sec 52.8 MBytes 442 Mbits/sec [ 5] 1.00-2.00 sec 71.9 MBytes 603 Mbits/sec [ 5] 2.00-3.00 sec 74.8 MBytes 627 Mbits/sec [ 5] 3.00-4.00 sec 75.9 MBytes 636 Mbits/sec [ 5] 4.00-5.00 sec 76.0 MBytes 638 Mbits/sec [ 5] 5.00-6.00 sec 74.1 MBytes 622 Mbits/sec [ 5] 6.00-7.00 sec 74.0 MBytes 621 Mbits/sec [ 5] 7.00-8.00 sec 76.0 MBytes 638 Mbits/sec [ 5] 8.00-9.00 sec 74.4 MBytes 624 Mbits/sec [ 5] 9.00-10.00 sec 63.9 MBytes 536 Mbits/sec - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 717 MBytes 601 Mbits/sec 24 sender [ 5] 0.00-10.00 sec 714 MBytes 599 Mbits/sec receiver Fixes: `002a5db9a5` ("wifi: rtw88: Enable USB RX aggregation for 8822c/8822b/8821c") Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/afb94a82-3d18-459e-97fc-1a217608cdf0@gmail.com	2024-10-17 17:24:17 +03:00
Geert Uytterhoeven	b73b206952	wifi: brcm80211: BRCM_TRACING should depend on TRACING When tracing is disabled, there is no point in asking the user about enabling Broadcom wireless device tracing. Fixes: `f5c4f10852` ("brcm80211: Allow trace support to be enabled separately from debug") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/81a29b15eaacc1ac1fb421bdace9ac0c3385f40f.1727179742.git.geert@linux-m68k.org	2024-10-17 17:23:42 +03:00
Ping-Ke Shih	aa70ff0945	wifi: rtw89: pci: early chips only enable 36-bit DMA on specific PCI hosts The early chips including RTL8852A, RTL8851B, RTL8852B and RTL8852BT have interoperability problems of 36-bit DMA with some PCI hosts. Rollback to 32-bit DMA by default, and only enable 36-bit DMA for tested platforms. Since all Intel platforms we have can work correctly, add the vendor ID to white list. Otherwise, list vendor/device ID of bridge we have tested. Fixes: `1fd4b3fe52` ("wifi: rtw89: pci: support 36-bit PCI DMA address") Reported-by: Marcel Weißenbach <mweissenbach@ignaz.org> Closes: https://lore.kernel.org/linux-wireless/20240918073237.Horde.VLueh0_KaiDw-9asEEcdM84@ignaz.org/T/#m07c5694df1acb173a42e1a0bab7ac22bd231a2b8 Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Tested-by: Marcel Weißenbach <mweissenbach@ignaz.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20240924021633.19861-1-pkshih@realtek.com	2024-10-17 17:23:15 +03:00
Naohiro Aota	bf9821ba47	btrfs: zoned: fix zone unusable accounting for freed reserved extent When btrfs reserves an extent and does not use it (e.g, by an error), it calls btrfs_free_reserved_extent() to free the reserved extent. In the process, it calls btrfs_add_free_space() and then it accounts the region bytes as block_group->zone_unusable. However, it leaves the space_info->bytes_zone_unusable side not updated. As a result, ENOSPC can happen while a space_info reservation succeeded. The reservation is fine because the freed region is not added in space_info->bytes_zone_unusable, leaving that space as "free". OTOH, corresponding block group counts it as zone_unusable and its allocation pointer is not rewound, we cannot allocate an extent from that block group. That will also negate space_info's async/sync reclaim process, and cause an ENOSPC error from the extent allocation process. Fix that by returning the space to space_info->bytes_zone_unusable. Ideally, since a bio is not submitted for this reserved region, we should return the space to free space and rewind the allocation pointer. But, it needs rework on extent allocation handling, so let it work in this way for now. Fixes: `169e0da91a` ("btrfs: zoned: track unusable bytes for zones") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-17 16:16:46 +02:00
Arnaldo Carvalho de Melo	39c6a35620	perf trace: The return from 'write' isn't a pid When adding a explicit beautifier for the 'write' syscall when the BPF based buffer collector was introduced there was a cut'n'paste error that carried the syscall_fmt->errpid setting from a nearby syscall (waitid) that returns a pid. So the write return was being suppressed by the return pretty printer, remove that field, reverting it back to the default return handler, that prints positive numbers as-is and interpret negative values as errnos. I actually introduced the problem while making Howard's original patch work just with the 'write' syscall, as we couldn't just look for any buffers, the ones that are filled in by the kernel couldn't use the same sys_enter BPF collector. Fixes: `b257fac12f` ("perf trace: Pretty print buffer data") Reported-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/lkml/bcf50648-3c7e-4513-8717-0d14492c53b9@linaro.org Link: https://lore.kernel.org/all/Zt8jTfzDYgBPvFCd@x1/#t Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-17 10:34:43 -03:00
Arnaldo Carvalho de Melo	ab8aaab874	tools headers UAPI: Sync linux/const.h with the kernel headers To pick up the changes in: `947697c6f0` ("uapi: Define GENMASK_U128") That causes no changes in tooling, just addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/lkml/ZwltGNJwujKu1Fgn@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-10-17 10:34:43 -03:00
David Howells	610a79ffea	afs: Fix lock recursion afs_wake_up_async_call() can incur lock recursion. The problem is that it is called from AF_RXRPC whilst holding the ->notify_lock, but it tries to take a ref on the afs_call struct in order to pass it to a work queue - but if the afs_call is already queued, we then have an extraneous ref that must be put... calling afs_put_call() may call back down into AF_RXRPC through rxrpc_kernel_shutdown_call(), however, which might try taking the ->notify_lock again. This case isn't very common, however, so defer it to a workqueue. The oops looks something like: BUG: spinlock recursion on CPU#0, krxrpcio/7001/1646 lock: 0xffff888141399b30, .magic: dead4ead, .owner: krxrpcio/7001/1646, .owner_cpu: 0 CPU: 0 UID: 0 PID: 1646 Comm: krxrpcio/7001 Not tainted 6.12.0-rc2-build3+ #4351 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Call Trace: <TASK> dump_stack_lvl+0x47/0x70 do_raw_spin_lock+0x3c/0x90 rxrpc_kernel_shutdown_call+0x83/0xb0 afs_put_call+0xd7/0x180 rxrpc_notify_socket+0xa0/0x190 rxrpc_input_split_jumbo+0x198/0x1d0 rxrpc_input_data+0x14b/0x1e0 ? rxrpc_input_call_packet+0xc2/0x1f0 rxrpc_input_call_event+0xad/0x6b0 rxrpc_input_packet_on_conn+0x1e1/0x210 rxrpc_input_packet+0x3f2/0x4d0 rxrpc_io_thread+0x243/0x410 ? __pfx_rxrpc_io_thread+0x10/0x10 kthread+0xcf/0xe0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x24/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/1394602.1729162732@warthog.procyon.org.uk cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-17 15:33:46 +02:00
Alessandro Zanni	15f3434748	fs: Fix uninitialized value issue in from_kuid and from_kgid ocfs2_setattr() uses attr->ia_mode, attr->ia_uid and attr->ia_gid in a trace point even though ATTR_MODE, ATTR_UID and ATTR_GID aren't set. Initialize all fields of newattrs to avoid uninitialized variables, by checking if ATTR_MODE, ATTR_UID, ATTR_GID are initialized, otherwise 0. Reported-by: syzbot+6c55f725d1bdc8c52058@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6c55f725d1bdc8c52058 Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com> Link: https://lore.kernel.org/r/20241017120553.55331-1-alessandro.zanni87@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-17 15:33:43 +02:00
Christian Brauner	229fd15908	fs: don't try and remove empty rbtree node When copying a namespace we won't have added the new copy into the namespace rbtree until after the copy succeeded. Calling free_mnt_ns() will try to remove the copy from the rbtree which is invalid. Simply free the namespace skeleton directly. Link: https://lore.kernel.org/r/20241016-adapter-seilwinde-83c508a7bde1@brauner Fixes: `1901c92497` ("fs: keep an index of current mount namespaces") Tested-by: Brad Spengler <spender@grsecurity.net> Cc: stable@vger.kernel.org # v6.11+ Reported-by: Brad Spengler <spender@grsecurity.net> Suggested-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-17 15:33:43 +02:00
David Howells	d6a77668a7	netfs: Downgrade i_rwsem for a buffered write In the I/O locking code borrowed from NFS into netfslib, i_rwsem is held locked across a buffered write - but this causes a performance regression in cifs as it excludes buffered reads for the duration (cifs didn't use any locking for buffered reads). Mitigate this somewhat by downgrading the i_rwsem to a read lock across the buffered write. This at least allows parallel reads to occur whilst excluding other writes, DIO, truncate and setattr. Note that this shouldn't be a problem for a buffered write as a read through an mmap can circumvent i_rwsem anyway. Also note that we might want to make this change in NFS also. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/1317958.1729096113@warthog.procyon.org.uk cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Trond Myklebust <trondmy@kernel.org> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-cifs@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-17 15:33:42 +02:00
Johan Hovold	1dd196f900	phy: qcom: qmp-combo: move driver data initialisation earlier Commit `44aff8e310` ("phy: qcom-qmp-combo: clean up probe initialisation") removed most users of the platform device driver data, but mistakenly also removed the initialisation despite the data still being used in the runtime PM callbacks. The initialisation was soon after restored by commit `83a0bbe39b` ("phy: qcom-qmp-combo: add support for updated sc8280xp binding") but now happens slightly later during probe. This should not cause any trouble currently as runtime PM needs to be enabled manually through sysfs and the platform device would not be suspended before the PHY has been registered anyway. Move the driver data initialisation to avoid a NULL-pointer dereference on runtime suspend if runtime PM is ever enabled by default in this driver. Fixes: `44aff8e310` ("phy: qcom-qmp-combo: clean up probe initialisation") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240911115253.10920-5-johan+linaro@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-17 18:34:16 +05:30
Johan Hovold	34c21f94fa	phy: qcom: qmp-usbc: fix NULL-deref on runtime suspend Commit `413db06c05` ("phy: qcom-qmp-usb: clean up probe initialisation") removed most users of the platform device driver data from the qcom-qmp-usb driver, but mistakenly also removed the initialisation despite the data still being used in the runtime PM callbacks. This bug was later reproduced when the driver was copied to create the qmp-usbc driver. Restore the driver data initialisation at probe to avoid a NULL-pointer dereference on runtime suspend. Apparently no one uses runtime PM, which currently needs to be enabled manually through sysfs, with these drivers. Fixes: `19281571a4` ("phy: qcom: qmp-usb: split USB-C PHY driver") Cc: stable@vger.kernel.org # 6.9 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240911115253.10920-4-johan+linaro@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-17 18:33:46 +05:30
Johan Hovold	29240130ab	phy: qcom: qmp-usb-legacy: fix NULL-deref on runtime suspend Commit `413db06c05` ("phy: qcom-qmp-usb: clean up probe initialisation") removed most users of the platform device driver data from the qcom-qmp-usb driver, but mistakenly also removed the initialisation despite the data still being used in the runtime PM callbacks. This bug was later reproduced when the driver was copied to create the qmp-usb-legacy driver. Restore the driver data initialisation at probe to avoid a NULL-pointer dereference on runtime suspend. Apparently no one uses runtime PM, which currently needs to be enabled manually through sysfs, with these drivers. Fixes: `e464a3180a` ("phy: qcom-qmp-usb: split off the legacy USB+dp_com support") Cc: stable@vger.kernel.org # 6.6 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240911115253.10920-3-johan+linaro@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-17 18:33:46 +05:30
Johan Hovold	bd9e4d4a3b	phy: qcom: qmp-usb: fix NULL-deref on runtime suspend Commit `413db06c05` ("phy: qcom-qmp-usb: clean up probe initialisation") removed most users of the platform device driver data, but mistakenly also removed the initialisation despite the data still being used in the runtime PM callbacks. Restore the driver data initialisation at probe to avoid a NULL-pointer dereference on runtime suspend. Apparently no one uses runtime PM, which currently needs to be enabled manually through sysfs, with this driver. Fixes: `413db06c05` ("phy: qcom-qmp-usb: clean up probe initialisation") Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240911115253.10920-2-johan+linaro@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-17 18:33:46 +05:30
Johan Hovold	938ade15ab	dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: add missing x1e80100 pipediv2 clocks The x1e80100 QMP PCIe PHYs all have a pipediv2 clock that needs to be described. Fixes: `e94b29f2bd` ("dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Document the X1E80100 QMP PCIe PHYs") Cc: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20240916082307.29393-2-johan+linaro@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-17 18:31:17 +05:30
Florian Westphal	1230fe7ad3	netfilter: bpf: must hold reference on net namespace BUG: KASAN: slab-use-after-free in __nf_unregister_net_hook+0x640/0x6b0 Read of size 8 at addr ffff8880106fe400 by task repro/72= bpf_nf_link_release+0xda/0x1e0 bpf_link_free+0x139/0x2d0 bpf_link_release+0x68/0x80 __fput+0x414/0xb60 Eric says: It seems that bpf was able to defer the __nf_unregister_net_hook() after exit()/close() time. Perhaps a netns reference is missing, because the netns has been dismantled/freed already. bpf_nf_link_attach() does : link->net = net; But I do not see a reference being taken on net. Add such a reference and release it after hook unreg. Note that I was unable to get syzbot reproducer to work, so I do not know if this resolves this splat. Fixes: `84601d6ee6` ("bpf: add bpf_link support for BPF_NETFILTER programs") Diagnosed-by: Eric Dumazet <edumazet@google.com> Reported-by: Lai, Yi <yi1.lai@linux.intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-10-17 13:58:57 +02:00
Kirill Marinushkin	740883fa6c	ASoC: Change my e-mail to gmail Change my contact e-mail in pcm3060 driver and MAINTAINERS Signed-off-by: Kirill Marinushkin <k.marinushkin@gmail.com> Cc: Kirill Marinushkin <kmarinushkin@birdec.com> Cc: Liam Girdwood <lgirdwood@gmail.com> Cc: Mark Brown <broonie@kernel.org> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Cc: linux-kernel@vger.kernel.org Cc: linux-sound@vger.kernel.org Link: https://patch.msgid.link/20241016215810.1544222-1-k.marinushkin@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-17 12:11:33 +01:00
Derek Fang	6924565a04	ASoC: Intel: soc-acpi: lnl: Add match entry for TM2 laptops Add a new match table entry on Lunarlake for the TM2 laptops with rt713 and rt1318. Signed-off-by: Derek Fang <derek.fang@realtek.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241016030703.13669-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-17 12:11:32 +01:00
Ilya Dudikov	b0867999e3	ASoC: amd: yc: Fix non-functional mic on ASUS E1404FA ASUS Vivobook E1404FA needs a quirks-table entry for the internal microphone to function properly. Signed-off-by: Ilya Dudikov <ilyadud@mail.ru> Link: https://patch.msgid.link/20241016034038.13481-1-ilyadud25@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-17 12:11:31 +01:00
Ranjani Sridharan	ab5593793e	ASoC: SOF: Intel: hda: Always clean up link DMA during stop This is required to reset the DMA read/write pointers when the stream is prepared and restarted after a call to snd_pcm_drain()/snd_pcm_drop(). Also, now that the stream is reset during stop, do not save LLP registers in the case of STOP/suspend to avoid erroneous delay reporting. Link: https://github.com/thesofproject/sof/issues/9502 Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> All: stable@vger.kernel.org # 6.10.x 6.11.x Link: https://patch.msgid.link/20241016032910.14601-5-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-17 12:11:20 +01:00
Ranjani Sridharan	c78f1e15e4	soundwire: intel_ace2x: Send PDI stream number during prepare In the case of a prepare callback after an xrun or when the PCM is restarted after a call to snd_pcm_drain/snd_pcm_drop, avoid reprogramming the SHIM registers but send the PDI stream number so that the link DMA data can be set. This is needed for the case that the DMA data is cleared when the PCM is stopped and restarted without being closed. Link: https://github.com/thesofproject/sof/issues/9502 Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> All: stable@vger.kernel.org # 6.10.x 6.11.x Link: https://patch.msgid.link/20241016032910.14601-4-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-17 12:11:19 +01:00
Ranjani Sridharan	6e38a7e098	ASoC: SOF: Intel: hda: Handle prepare without close for non-HDA DAI's When a PCM is restarted after a snd_pcm_drain/snd_pcm_drop(), the prepare callback will be invoked and the hw_params will be set again. For the HDA DAI's, the hw_params function handles this case already but not for the non-HDA DAI's. So, add the check for link_prepared to verify if the hw_params should be done again or not. Additionally, for SDW DAI's reset the PCMSyCM registers as would be done in the case of a start after a hw_free. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> All: stable@vger.kernel.org # 6.10.x 6.11.x Link: https://patch.msgid.link/20241016032910.14601-3-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-17 12:11:18 +01:00
Ranjani Sridharan	9822b4c90d	ASoC: SOF: ipc4-topology: Do not set ALH node_id for aggregated DAIs For aggregated DAIs, the node ID is set to the group_id during the DAI widget's ipc_prepare op. With the current logic, setting the dai_index for node_id in the dai_config is redundant as it will be overwritten with the group_id anyway. Removing it will also prevent any accidental clearing/resetting of the group_id for aggregated DAIs due to the dai_config calls could that happen before the allocated group_id is freed. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> All: stable@vger.kernel.org # 6.10.x 6.11.x Link: https://patch.msgid.link/20241016032910.14601-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-17 12:11:17 +01:00
Michal Luczaj	19039f2797	bpf, vsock: Drop static vsock_bpf_prot initialization vsock_bpf_prot is set up at runtime. Remove the superfluous init. No functional change intended. Fixes: `634f1a7110` ("vsock: support sockmap") Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241013-vsock-fixes-for-redir-v2-4-d6577bbfe742@rbox.co	2024-10-17 13:02:55 +02:00
Michal Luczaj	6dafde852d	vsock: Update msg_count on read_skb() Dequeuing via vsock_transport::read_skb() left msg_count outdated, which then confused SOCK_SEQPACKET recv(). Decrease the counter. Fixes: `634f1a7110` ("vsock: support sockmap") Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241013-vsock-fixes-for-redir-v2-3-d6577bbfe742@rbox.co	2024-10-17 13:02:54 +02:00
Michal Luczaj	3543152f2d	vsock: Update rx_bytes on read_skb() Make sure virtio_transport_inc_rx_pkt() and virtio_transport_dec_rx_pkt() calls are balanced (i.e. virtio_vsock_sock::rx_bytes doesn't lie) after vsock_transport::read_skb(). While here, also inform the peer that we've freed up space and it has more credit. Failing to update rx_bytes after packet is dequeued leads to a warning on SOCK_STREAM recv(): [ 233.396654] rx_queue is empty, but rx_bytes is non-zero [ 233.396702] WARNING: CPU: 11 PID: 40601 at net/vmw_vsock/virtio_transport_common.c:589 Fixes: `634f1a7110` ("vsock: support sockmap") Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241013-vsock-fixes-for-redir-v2-2-d6577bbfe742@rbox.co	2024-10-17 13:02:54 +02:00
Michal Luczaj	9c5bd93edf	bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock Don't mislead the callers of bpf_{sk,msg}_redirect_{map,hash}(): make sure to immediately and visibly fail the forwarding of unsupported af_vsock packets. Fixes: `634f1a7110` ("vsock: support sockmap") Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241013-vsock-fixes-for-redir-v2-1-d6577bbfe742@rbox.co	2024-10-17 13:02:54 +02:00
Paolo Abeni	cb560795c8	Merge branch 'mlx5-misc-fixes-2024-10-15' Tariq Toukan says: ==================== mlx5 misc fixes 2024-10-15 This patchset provides misc bug fixes from the team to the mlx5 core and Eth drivers. Series generated against: commit `174714f0e5` ("selftests: drivers: net: fix name not defined") ==================== Link: https://patch.msgid.link/20241015093208.197603-1-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:14:11 +02:00
Cosmin Ratiu	4dbc1d1a9f	net/mlx5e: Don't call cleanup on profile rollback failure When profile rollback fails in mlx5e_netdev_change_profile, the netdev profile var is left set to NULL. Avoid a crash when unloading the driver by not calling profile->cleanup in such a case. This was encountered while testing, with the original trigger that the wq rescuer thread creation got interrupted (presumably due to Ctrl+C-ing modprobe), which gets converted to ENOMEM (-12) by mlx5e_priv_init, the profile rollback also fails for the same reason (signal still active) so the profile is left as NULL, leading to a crash later in _mlx5e_remove. [ 732.473932] mlx5_core 0000:08:00.1: E-Switch: Unload vfs: mode(OFFLOADS), nvfs(2), necvfs(0), active vports(2) [ 734.525513] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR [ 734.557372] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12 [ 734.559187] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: new profile init failed, -12 [ 734.560153] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR [ 734.589378] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12 [ 734.591136] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12 [ 745.537492] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 745.538222] #PF: supervisor read access in kernel mode <snipped> [ 745.551290] Call Trace: [ 745.551590] <TASK> [ 745.551866] ? __die+0x20/0x60 [ 745.552218] ? page_fault_oops+0x150/0x400 [ 745.555307] ? exc_page_fault+0x79/0x240 [ 745.555729] ? asm_exc_page_fault+0x22/0x30 [ 745.556166] ? mlx5e_remove+0x6b/0xb0 [mlx5_core] [ 745.556698] auxiliary_bus_remove+0x18/0x30 [ 745.557134] device_release_driver_internal+0x1df/0x240 [ 745.557654] bus_remove_device+0xd7/0x140 [ 745.558075] device_del+0x15b/0x3c0 [ 745.558456] mlx5_rescan_drivers_locked.part.0+0xb1/0x2f0 [mlx5_core] [ 745.559112] mlx5_unregister_device+0x34/0x50 [mlx5_core] [ 745.559686] mlx5_uninit_one+0x46/0xf0 [mlx5_core] [ 745.560203] remove_one+0x4e/0xd0 [mlx5_core] [ 745.560694] pci_device_remove+0x39/0xa0 [ 745.561112] device_release_driver_internal+0x1df/0x240 [ 745.561631] driver_detach+0x47/0x90 [ 745.562022] bus_remove_driver+0x84/0x100 [ 745.562444] pci_unregister_driver+0x3b/0x90 [ 745.562890] mlx5_cleanup+0xc/0x1b [mlx5_core] [ 745.563415] __x64_sys_delete_module+0x14d/0x2f0 [ 745.563886] ? kmem_cache_free+0x1b0/0x460 [ 745.564313] ? lockdep_hardirqs_on_prepare+0xe2/0x190 [ 745.564825] do_syscall_64+0x6d/0x140 [ 745.565223] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 745.565725] RIP: 0033:0x7f1579b1288b Fixes: `3ef14e463f` ("net/mlx5e: Separate between netdev objects and mlx5e profiles initialization") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:14:07 +02:00
Cosmin Ratiu	1da9cfd6c4	net/mlx5: Unregister notifier on eswitch init failure It otherwise remains registered and a subsequent attempt at eswitch enabling might trigger warnings of the sort: [ 682.589148] ------------[ cut here ]------------ [ 682.590204] notifier callback eswitch_vport_event [mlx5_core] already registered [ 682.590256] WARNING: CPU: 13 PID: 2660 at kernel/notifier.c:31 notifier_chain_register+0x3e/0x90 [...snipped] [ 682.610052] Call Trace: [ 682.610369] <TASK> [ 682.610663] ? __warn+0x7c/0x110 [ 682.611050] ? notifier_chain_register+0x3e/0x90 [ 682.611556] ? report_bug+0x148/0x170 [ 682.611977] ? handle_bug+0x36/0x70 [ 682.612384] ? exc_invalid_op+0x13/0x60 [ 682.612817] ? asm_exc_invalid_op+0x16/0x20 [ 682.613284] ? notifier_chain_register+0x3e/0x90 [ 682.613789] atomic_notifier_chain_register+0x25/0x40 [ 682.614322] mlx5_eswitch_enable_locked+0x1d4/0x3b0 [mlx5_core] [ 682.614965] mlx5_eswitch_enable+0xc9/0x100 [mlx5_core] [ 682.615551] mlx5_device_enable_sriov+0x25/0x340 [mlx5_core] [ 682.616170] mlx5_core_sriov_configure+0x50/0x170 [mlx5_core] [ 682.616789] sriov_numvfs_store+0xb0/0x1b0 [ 682.617248] kernfs_fop_write_iter+0x117/0x1a0 [ 682.617734] vfs_write+0x231/0x3f0 [ 682.618138] ksys_write+0x63/0xe0 [ 682.618536] do_syscall_64+0x4c/0x100 [ 682.618958] entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: `7624e58a8b` ("net/mlx5: E-switch, register event handler before arming the event") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:14:07 +02:00
Shay Drory	d62b14045c	net/mlx5: Fix command bitmask initialization Command bitmask have a dedicated bit for MANAGE_PAGES command, this bit isn't Initialize during command bitmask Initialization, only during MANAGE_PAGES. In addition, mlx5_cmd_trigger_completions() is trying to trigger completion for MANAGE_PAGES command as well. Hence, in case health error occurred before any MANAGE_PAGES command have been invoke (for example, during mlx5_enable_hca()), mlx5_cmd_trigger_completions() will try to trigger completion for MANAGE_PAGES command, which will result in null-ptr-deref error.[1] Fix it by Initialize command bitmask correctly. While at it, re-write the code for better understanding. [1] BUG: KASAN: null-ptr-deref in mlx5_cmd_trigger_completions+0x1db/0x600 [mlx5_core] Write of size 4 at addr 0000000000000214 by task kworker/u96:2/12078 CPU: 10 PID: 12078 Comm: kworker/u96:2 Not tainted 6.9.0-rc2_for_upstream_debug_2024_04_07_19_01 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: mlx5_health0000:08:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core] Call Trace: <TASK> dump_stack_lvl+0x7e/0xc0 kasan_report+0xb9/0xf0 kasan_check_range+0xec/0x190 mlx5_cmd_trigger_completions+0x1db/0x600 [mlx5_core] mlx5_cmd_flush+0x94/0x240 [mlx5_core] enter_error_state+0x6c/0xd0 [mlx5_core] mlx5_fw_fatal_reporter_err_work+0xf3/0x480 [mlx5_core] process_one_work+0x787/0x1490 ? lockdep_hardirqs_on_prepare+0x400/0x400 ? pwq_dec_nr_in_flight+0xda0/0xda0 ? assign_work+0x168/0x240 worker_thread+0x586/0xd30 ? rescuer_thread+0xae0/0xae0 kthread+0x2df/0x3b0 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x2d/0x70 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork_asm+0x11/0x20 </TASK> Fixes: `9b98d395b8` ("net/mlx5: Start health poll at earlier stage of driver load") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:14:07 +02:00
Maher Sanalla	d4f25be27e	net/mlx5: Check for invalid vector index on EQ creation Currently, mlx5 driver does not enforce vector index to be lower than the maximum number of supported completion vectors when requesting a new completion EQ. Thus, mlx5_comp_eqn_get() fails when trying to acquire an IRQ with an improper vector index. To prevent the case above, enforce that vector index value is valid and lower than maximum in mlx5_comp_eqn_get() before handling the request. Fixes: `f14c1a14e6` ("net/mlx5: Allocate completion EQs dynamically") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:14:07 +02:00
Cosmin Ratiu	9addffa343	net/mlx5: HWS, use lock classes for bwc locks The HWS BWC API uses one lock per queue and usually acquires one of them, except when doing changes which require locking all queues in order. Naturally, lockdep isn't too happy about acquiring the same lock class multiple times, so inform it that each queue lock is a different class to avoid false positives. Fixes: `2ca62599aa` ("net/mlx5: HWS, added send engine and context handling") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:14:07 +02:00
Cosmin Ratiu	45bcbd4922	net/mlx5: HWS, don't destroy more bwc queue locks than allocated hws_send_queues_bwc_locks_destroy destroyed more queue locks than allocated, leading to memory corruption (occasionally) and warnings such as DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) in __mutex_destroy because sometimes, the 'mutex' being destroyed was random memory. The severity of this problem is proportional to the number of queues configured because the code overreaches beyond the end of the bwc_send_queue_locks array by 2x its length. Fix that by using the correct number of bwc queues. Fixes: `2ca62599aa` ("net/mlx5: HWS, added send engine and context handling") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:14:07 +02:00
Yevgeny Kliteynik	5aa2184e29	net/mlx5: HWS, fixed double free in error flow of definer layout Fix error flow bug that could lead to double free of a buffer during a failure to calculate a suitable definer layout. Fixes: `74a778b4a6` ("net/mlx5: HWS, added definers handling") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:14:07 +02:00
Yevgeny Kliteynik	65b4eb9f3d	net/mlx5: HWS, removed wrong access to a number of rules variable Removed wrong access to the num_of_rules field of the matcher. This is a usual u32 variable, but the access was as if it was atomic. This fixes the following CI warnings: mlx5hws_bwc.c:708:17: warning: large atomic operation may incur significant performance penalty; the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Watomic-alignment] Fixes: `510f9f61a1` ("net/mlx5: HWS, added API and enabled HWS support") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409291101.6NdtMFVC-lkp@intel.com/ Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:14:07 +02:00
Matthieu Baerts (NGI0)	7decd1f590	mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow Syzkaller reported this splat: ================================================================== BUG: KASAN: slab-use-after-free in mptcp_pm_nl_rm_addr_or_subflow+0xb44/0xcc0 net/mptcp/pm_netlink.c:881 Read of size 4 at addr ffff8880569ac858 by task syz.1.2799/14662 CPU: 0 UID: 0 PID: 14662 Comm: syz.1.2799 Not tainted 6.12.0-rc2-syzkaller-00307-g36c254515dc6 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0xc3/0x620 mm/kasan/report.c:488 kasan_report+0xd9/0x110 mm/kasan/report.c:601 mptcp_pm_nl_rm_addr_or_subflow+0xb44/0xcc0 net/mptcp/pm_netlink.c:881 mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:914 [inline] mptcp_nl_remove_id_zero_address+0x305/0x4a0 net/mptcp/pm_netlink.c:1572 mptcp_pm_nl_del_addr_doit+0x5c9/0x770 net/mptcp/pm_netlink.c:1603 genl_family_rcv_msg_doit+0x202/0x2f0 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x565/0x800 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x165/0x410 net/netlink/af_netlink.c:2551 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline] netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1357 netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901 sock_sendmsg_nosec net/socket.c:729 [inline] __sock_sendmsg net/socket.c:744 [inline] ____sys_sendmsg+0x9ae/0xb40 net/socket.c:2607 ___sys_sendmsg+0x135/0x1e0 net/socket.c:2661 __sys_sendmsg+0x117/0x1f0 net/socket.c:2690 do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline] __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386 do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411 entry_SYSENTER_compat_after_hwframe+0x84/0x8e RIP: 0023:0xf7fe4579 Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00 RSP: 002b:00000000f574556c EFLAGS: 00000296 ORIG_RAX: 0000000000000172 RAX: ffffffffffffffda RBX: 000000000000000b RCX: 0000000020000140 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Allocated by task 5387: kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:394 kmalloc_noprof include/linux/slab.h:878 [inline] kzalloc_noprof include/linux/slab.h:1014 [inline] subflow_create_ctx+0x87/0x2a0 net/mptcp/subflow.c:1803 subflow_ulp_init+0xc3/0x4d0 net/mptcp/subflow.c:1956 __tcp_set_ulp net/ipv4/tcp_ulp.c:146 [inline] tcp_set_ulp+0x326/0x7f0 net/ipv4/tcp_ulp.c:167 mptcp_subflow_create_socket+0x4ae/0x10a0 net/mptcp/subflow.c:1764 __mptcp_subflow_connect+0x3cc/0x1490 net/mptcp/subflow.c:1592 mptcp_pm_create_subflow_or_signal_addr+0xbda/0x23a0 net/mptcp/pm_netlink.c:642 mptcp_pm_nl_fully_established net/mptcp/pm_netlink.c:650 [inline] mptcp_pm_nl_work+0x3a1/0x4f0 net/mptcp/pm_netlink.c:943 mptcp_worker+0x15a/0x1240 net/mptcp/protocol.c:2777 process_one_work+0x958/0x1b30 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Freed by task 113: kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x51/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:230 [inline] slab_free_hook mm/slub.c:2342 [inline] slab_free mm/slub.c:4579 [inline] kfree+0x14f/0x4b0 mm/slub.c:4727 kvfree+0x47/0x50 mm/util.c:701 kvfree_rcu_list+0xf5/0x2c0 kernel/rcu/tree.c:3423 kvfree_rcu_drain_ready kernel/rcu/tree.c:3563 [inline] kfree_rcu_monitor+0x503/0x8b0 kernel/rcu/tree.c:3632 kfree_rcu_shrink_scan+0x245/0x3a0 kernel/rcu/tree.c:3966 do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435 shrink_slab+0x32b/0x12a0 mm/shrinker.c:662 shrink_one+0x47e/0x7b0 mm/vmscan.c:4818 shrink_many mm/vmscan.c:4879 [inline] lru_gen_shrink_node mm/vmscan.c:4957 [inline] shrink_node+0x2452/0x39d0 mm/vmscan.c:5937 kswapd_shrink_node mm/vmscan.c:6765 [inline] balance_pgdat+0xc19/0x18f0 mm/vmscan.c:6957 kswapd+0x5ea/0xbf0 mm/vmscan.c:7226 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Last potentially related work creation: kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 __kasan_record_aux_stack+0xba/0xd0 mm/kasan/generic.c:541 kvfree_call_rcu+0x74/0xbe0 kernel/rcu/tree.c:3810 subflow_ulp_release+0x2ae/0x350 net/mptcp/subflow.c:2009 tcp_cleanup_ulp+0x7c/0x130 net/ipv4/tcp_ulp.c:124 tcp_v4_destroy_sock+0x1c5/0x6a0 net/ipv4/tcp_ipv4.c:2541 inet_csk_destroy_sock+0x1a3/0x440 net/ipv4/inet_connection_sock.c:1293 tcp_done+0x252/0x350 net/ipv4/tcp.c:4870 tcp_rcv_state_process+0x379b/0x4f30 net/ipv4/tcp_input.c:6933 tcp_v4_do_rcv+0x1ad/0xa90 net/ipv4/tcp_ipv4.c:1938 sk_backlog_rcv include/net/sock.h:1115 [inline] __release_sock+0x31b/0x400 net/core/sock.c:3072 __tcp_close+0x4f3/0xff0 net/ipv4/tcp.c:3142 __mptcp_close_ssk+0x331/0x14d0 net/mptcp/protocol.c:2489 mptcp_close_ssk net/mptcp/protocol.c:2543 [inline] mptcp_close_ssk+0x150/0x220 net/mptcp/protocol.c:2526 mptcp_pm_nl_rm_addr_or_subflow+0x2be/0xcc0 net/mptcp/pm_netlink.c:878 mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:914 [inline] mptcp_nl_remove_id_zero_address+0x305/0x4a0 net/mptcp/pm_netlink.c:1572 mptcp_pm_nl_del_addr_doit+0x5c9/0x770 net/mptcp/pm_netlink.c:1603 genl_family_rcv_msg_doit+0x202/0x2f0 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x565/0x800 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x165/0x410 net/netlink/af_netlink.c:2551 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline] netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1357 netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901 sock_sendmsg_nosec net/socket.c:729 [inline] __sock_sendmsg net/socket.c:744 [inline] ____sys_sendmsg+0x9ae/0xb40 net/socket.c:2607 ___sys_sendmsg+0x135/0x1e0 net/socket.c:2661 __sys_sendmsg+0x117/0x1f0 net/socket.c:2690 do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline] __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386 do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411 entry_SYSENTER_compat_after_hwframe+0x84/0x8e The buggy address belongs to the object at ffff8880569ac800 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 88 bytes inside of freed 512-byte region [ffff8880569ac800, ffff8880569aca00) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x569ac head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x4fff00000000040(head\|node=1\|zone=1\|lastcpupid=0x7ff) page_type: f5(slab) raw: 04fff00000000040 ffff88801ac42c80 dead000000000100 dead000000000122 raw: 0000000000000000 0000000080100010 00000001f5000000 0000000000000000 head: 04fff00000000040 ffff88801ac42c80 dead000000000100 dead000000000122 head: 0000000000000000 0000000080100010 00000001f5000000 0000000000000000 head: 04fff00000000002 ffffea00015a6b01 ffffffffffffffff 0000000000000000 head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO\|__GFP_FS\|__GFP_NOWARN\|__GFP_NORETRY\|__GFP_COMP\|__GFP_NOMEMALLOC), pid 10238, tgid 10238 (kworker/u32:6), ts 597403252405, free_ts 597177952947 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x2d1/0x350 mm/page_alloc.c:1537 prep_new_page mm/page_alloc.c:1545 [inline] get_page_from_freelist+0x101e/0x3070 mm/page_alloc.c:3457 __alloc_pages_noprof+0x223/0x25a0 mm/page_alloc.c:4733 alloc_pages_mpol_noprof+0x2c9/0x610 mm/mempolicy.c:2265 alloc_slab_page mm/slub.c:2412 [inline] allocate_slab mm/slub.c:2578 [inline] new_slab+0x2ba/0x3f0 mm/slub.c:2631 ___slab_alloc+0xd1d/0x16f0 mm/slub.c:3818 __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3908 __slab_alloc_node mm/slub.c:3961 [inline] slab_alloc_node mm/slub.c:4122 [inline] __kmalloc_cache_noprof+0x2c5/0x310 mm/slub.c:4290 kmalloc_noprof include/linux/slab.h:878 [inline] kzalloc_noprof include/linux/slab.h:1014 [inline] mld_add_delrec net/ipv6/mcast.c:743 [inline] igmp6_leave_group net/ipv6/mcast.c:2625 [inline] igmp6_group_dropped+0x4ab/0xe40 net/ipv6/mcast.c:723 __ipv6_dev_mc_dec+0x281/0x360 net/ipv6/mcast.c:979 addrconf_leave_solict net/ipv6/addrconf.c:2253 [inline] __ipv6_ifa_notify+0x3f6/0xc30 net/ipv6/addrconf.c:6283 addrconf_ifdown.isra.0+0xef9/0x1a20 net/ipv6/addrconf.c:3982 addrconf_notify+0x220/0x19c0 net/ipv6/addrconf.c:3781 notifier_call_chain+0xb9/0x410 kernel/notifier.c:93 call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1996 call_netdevice_notifiers_extack net/core/dev.c:2034 [inline] call_netdevice_notifiers net/core/dev.c:2048 [inline] dev_close_many+0x333/0x6a0 net/core/dev.c:1589 page last free pid 13136 tgid 13136 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1108 [inline] free_unref_page+0x5f4/0xdc0 mm/page_alloc.c:2638 stack_depot_save_flags+0x2da/0x900 lib/stackdepot.c:666 kasan_save_stack+0x42/0x60 mm/kasan/common.c:48 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:319 [inline] __kasan_slab_alloc+0x89/0x90 mm/kasan/common.c:345 kasan_slab_alloc include/linux/kasan.h:247 [inline] slab_post_alloc_hook mm/slub.c:4085 [inline] slab_alloc_node mm/slub.c:4134 [inline] kmem_cache_alloc_noprof+0x121/0x2f0 mm/slub.c:4141 skb_clone+0x190/0x3f0 net/core/skbuff.c:2084 do_one_broadcast net/netlink/af_netlink.c:1462 [inline] netlink_broadcast_filtered+0xb11/0xef0 net/netlink/af_netlink.c:1540 netlink_broadcast+0x39/0x50 net/netlink/af_netlink.c:1564 uevent_net_broadcast_untagged lib/kobject_uevent.c:331 [inline] kobject_uevent_net_broadcast lib/kobject_uevent.c:410 [inline] kobject_uevent_env+0xacd/0x1670 lib/kobject_uevent.c:608 device_del+0x623/0x9f0 drivers/base/core.c:3882 snd_card_disconnect.part.0+0x58a/0x7c0 sound/core/init.c:546 snd_card_disconnect+0x1f/0x30 sound/core/init.c:495 snd_usx2y_disconnect+0xe9/0x1f0 sound/usb/usx2y/usbusx2y.c:417 usb_unbind_interface+0x1e8/0x970 drivers/usb/core/driver.c:461 device_remove drivers/base/dd.c:569 [inline] device_remove+0x122/0x170 drivers/base/dd.c:561 That's because 'subflow' is used just after 'mptcp_close_ssk(subflow)', which will initiate the release of its memory. Even if it is very likely the release and the re-utilisation will be done later on, it is of course better to avoid any issues and read the content of 'subflow' before closing it. Fixes: `1c1f721375` ("mptcp: pm: only decrement add_addr_accepted for MPJ req") Cc: stable@vger.kernel.org Reported-by: syzbot+3c8b7a8e7df6a2a226ca@syzkaller.appspotmail.com Closes: https://lore.kernel.org/670d7337.050a0220.4cbc0.004f.GAE@google.com Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20241015-net-mptcp-uaf-pm-rm-v1-1-c4ee5d987a64@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:06:55 +02:00
Felix Fietkau	88806efc03	net: ethernet: mtk_eth_soc: fix memory corruption during fq dma init The loop responsible for allocating up to MTK_FQ_DMA_LENGTH buffers must only touch as many descriptors, otherwise it ends up corrupting unrelated memory. Fix the loop iteration count accordingly. Fixes: `c57e558194` ("net: ethernet: mtk_eth_soc: handle dma buffer size soc specific") Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241015081755.31060-1-nbd@nbd.name Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 12:02:29 +02:00
Daniel Borkmann	4678adf94d	vmxnet3: Fix packet corruption in vmxnet3_xdp_xmit_frame Andrew and Nikolay reported connectivity issues with Cilium's service load-balancing in case of vmxnet3. If a BPF program for native XDP adds an encapsulation header such as IPIP and transmits the packet out the same interface, then in case of vmxnet3 a corrupted packet is being sent and subsequently dropped on the path. vmxnet3_xdp_xmit_frame() which is called e.g. via vmxnet3_run_xdp() through vmxnet3_xdp_xmit_back() calculates an incorrect DMA address: page = virt_to_page(xdpf->data); tbi->dma_addr = page_pool_get_dma_addr(page) + VMXNET3_XDP_HEADROOM; dma_sync_single_for_device(&adapter->pdev->dev, tbi->dma_addr, buf_size, DMA_TO_DEVICE); The above assumes a fixed offset (VMXNET3_XDP_HEADROOM), but the XDP BPF program could have moved xdp->data. While the passed buf_size is correct (xdpf->len), the dma_addr needs to have a dynamic offset which can be calculated as xdpf->data - (void *)xdpf, that is, xdp->data - xdp->data_hard_start. Fixes: `54f00cce11` ("vmxnet3: Add XDP support.") Reported-by: Andrew Sauber <andrew.sauber@isovalent.com> Reported-by: Nikolay Nikolaev <nikolay.nikolaev@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Nikolay Nikolaev <nikolay.nikolaev@isovalent.com> Acked-by: Anton Protopopov <aspsk@isovalent.com> Cc: William Tu <witu@nvidia.com> Cc: Ronak Doshi <ronak.doshi@broadcom.com> Link: https://patch.msgid.link/a0888656d7f09028f9984498cc698bb5364d89fc.1728931137.git.daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-17 11:27:17 +02:00
Oliver Upton	78a0055555	KVM: arm64: Ensure vgic_ready() is ordered against MMIO registration kvm_vgic_map_resources() prematurely marks the distributor as 'ready', potentially allowing vCPUs to enter the guest before the distributor's MMIO registration has been made visible. Plug the race by marking the distributor as ready only after MMIO registration is completed. Rely on the implied ordering of synchronize_srcu() to ensure the MMIO registration is visible before vgic_dist::ready. This also means that writers to vgic_dist::ready are now serialized by the slots_lock, which was effectively the case already as all writers held the slots_lock in addition to the config_lock. Fixes: `59112e9c39` ("KVM: arm64: vgic: Fix a circular locking issue") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241017001947.2707312-3-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-17 09:20:48 +01:00
Oliver Upton	5978d4ec7e	KVM: arm64: vgic: Don't check for vgic_ready() when setting NR_IRQS KVM commits to a particular sizing of SPIs when the vgic is initialized, which is before the point a vgic becomes ready. On top of that, KVM supplies a default amount of SPIs should userspace not explicitly configure this. As such, the check for vgic_ready() in the handling of KVM_DEV_ARM_VGIC_GRP_NR_IRQS is completely wrong, and testing if nr_spis is nonzero is sufficient for preventing userspace from playing games with us. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241017001947.2707312-2-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-17 09:20:48 +01:00
Ilkka Koskinen	c6c167afa0	KVM: arm64: Fix shift-out-of-bounds bug Fix a shift-out-of-bounds bug reported by UBSAN when running VM with MTE enabled host kernel. UBSAN: shift-out-of-bounds in arch/arm64/kvm/sys_regs.c:1988:14 shift exponent 33 is too large for 32-bit type 'int' CPU: 26 UID: 0 PID: 7629 Comm: qemu-kvm Not tainted 6.12.0-rc2 #34 Hardware name: IEI NF5280R7/Mitchell MB, BIOS 00.00. 2024-10-12 09:28:54 10/14/2024 Call trace: dump_backtrace+0xa0/0x128 show_stack+0x20/0x38 dump_stack_lvl+0x74/0x90 dump_stack+0x18/0x28 __ubsan_handle_shift_out_of_bounds+0xf8/0x1e0 reset_clidr+0x10c/0x1c8 kvm_reset_sys_regs+0x50/0x1c8 kvm_reset_vcpu+0xec/0x2b0 __kvm_vcpu_set_target+0x84/0x158 kvm_vcpu_set_target+0x138/0x168 kvm_arch_vcpu_ioctl_vcpu_init+0x40/0x2b0 kvm_arch_vcpu_ioctl+0x28c/0x4b8 kvm_vcpu_ioctl+0x4bc/0x7a8 __arm64_sys_ioctl+0xb4/0x100 invoke_syscall+0x70/0x100 el0_svc_common.constprop.0+0x48/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x3c/0x158 el0t_64_sync_handler+0x120/0x130 el0t_64_sync+0x194/0x198 Fixes: `7af0c2534f` ("KVM: arm64: Normalize cache configuration") Cc: stable@vger.kernel.org Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20241017025701.67936-1-ilkka@os.amperecomputing.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-17 09:20:13 +01:00
Marc Zyngier	afa9b48f32	KVM: arm64: Shave a few bytes from the EL2 idmap code Our idmap is becoming too big, to the point where it doesn't fit in a 4kB page anymore. There are some low-hanging fruits though, such as the el2_init_state horror that is expanded 3 times in the kernel. Let's at least limit ourselves to two copies, which makes the kernel link again. At some point, we'll have to have a better way of doing this. Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241009204903.GA3353168@thelio-3990X	2024-10-17 09:17:56 +01:00
Ingo Molnar	be602cde65	Merge branch 'linus' into sched/urgent, to resolve conflict Conflicts: kernel/sched/ext.c There's a context conflict between this upstream commit: `3fdb9ebcec` sched_ext: Start schedulers with consistent p->scx.slice values ... and this fix in sched/urgent: `98442f0ccd` sched: Fix delayed_dequeue vs switched_from_fair() Resolve it. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2024-10-17 09:58:07 +02:00
Dave Airlie	4cd33d972e	Merge tag 'drm-msm-fixes-2024-10-16' of https://gitlab.freedesktop.org/drm/msm into drm-fixes Fixes for v6.12 Display: - move CRTC resource assignment to atomic_check otherwise to make consecutive calls to atomic_check() consistent - fix rounding / sign-extension issues with pclk calculation in case of DSC - cleanups to drop incorrect null checks in dpu snapshots - fix to use kvzalloc in dpu snapshot to avoid allocation issues in heavily loaded system cases - Fix to not program merge_3d block if dual LM is not being used - Fix to not flush merge_3d block if its not enabled otherwise this leads to false timeouts GPU: - a7xx: add a fence wait before SMMU table update Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGsp3Zbd_H3FhHdRz9yCYA4wxX4SenpYRSk=Mx2d8GMSuQ@mail.gmail.com	2024-10-17 17:40:55 +10:00
Wei Xu	b130ba4a62	mm/mglru: only clear kswapd_failures if reclaimable lru_gen_shrink_node() unconditionally clears kswapd_failures, which can prevent kswapd from sleeping and cause 100% kswapd cpu usage even when kswapd repeatedly fails to make progress in reclaim. Only clear kswap_failures in lru_gen_shrink_node() if reclaim makes some progress, similar to shrink_node(). I happened to run into this problem in one of my tests recently. It requires a combination of several conditions: The allocator needs to allocate a right amount of pages such that it can wake up kswapd without itself being OOM killed; there is no memory for kswapd to reclaim (My test disables swap and cleans page cache first); no other process frees enough memory at the same time. Link: https://lkml.kernel.org/r/20241014221211.832591-1-weixugc@google.com Fixes: `e4dde56cd2` ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Wei Xu <weixugc@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Jan Alexander Steffens <heftig@archlinux.org> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:12 -07:00
Liu Shixin	7528c4fb12	mm/swapfile: skip HugeTLB pages for unuse_vma I got a bad pud error and lost a 1GB HugeTLB when calling swapoff. The problem can be reproduced by the following steps: 1. Allocate an anonymous 1GB HugeTLB and some other anonymous memory. 2. Swapout the above anonymous memory. 3. run swapoff and we will get a bad pud error in kernel message: mm/pgtable-generic.c:42: bad pud 00000000743d215d(84000001400000e7) We can tell that pud_clear_bad is called by pud_none_or_clear_bad in unuse_pud_range() by ftrace. And therefore the HugeTLB pages will never be freed because we lost it from page table. We can skip HugeTLB pages for unuse_vma to fix it. Link: https://lkml.kernel.org/r/20241015014521.570237-1-liushixin2@huawei.com Fixes: `0fe6e20b9c` ("hugetlb, rmap: add reverse mapping for hugepage") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:11 -07:00
Nanyong Sun	3e822bed2f	selftests: mm: fix the incorrect usage() info of khugepaged The mount option of tmpfs should be huge=advise, not madvise which is not supported and may mislead the users. Link: https://lkml.kernel.org/r/20241015020257.139235-1-sunnanyong@huawei.com Fixes: `1b03d0d558` ("selftests/vm: add thp collapse file and tmpfs testing") Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:11 -07:00
Jann Horn	cb2bb9c564	MAINTAINERS: add Jann as memory mapping/VMA reviewer Add myself as a reviewer for memory mapping / VMA code. I will probably only reply to patches sporadically, but hopefully this will help me keep up with changes that look interesting security-wise. Link: https://lkml.kernel.org/r/20241014-maintainers-mmap-reviewer-v1-1-50dce0514752@google.com Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:11 -07:00
Jeongjun Park	818f916e3a	mm: swap: prevent possible data-race in __try_to_reclaim_swap A report [1] was uploaded from syzbot. In the previous commit `862590ac37` ("mm: swap: allow cache reclaim to skip slot cache"), the __try_to_reclaim_swap() function reads offset and folio->entry from folio without folio_lock protection. In the currently reported KCSAN log, it is assumed that the actual data-race will not occur because the calltrace that does WRITE already obtains the folio_lock and then writes. However, the existing __try_to_reclaim_swap() function was already implemented to perform reads under folio_lock protection [1], and there is a risk of a data-race occurring through a function other than the one shown in the KCSAN log. Therefore, I think it is appropriate to change read operations for folio to be performed under folio_lock. [1] ================================================================== BUG: KCSAN: data-race in __delete_from_swap_cache / __try_to_reclaim_swap write to 0xffffea0004c90328 of 8 bytes by task 5186 on cpu 0: __delete_from_swap_cache+0x1f0/0x290 mm/swap_state.c:163 delete_from_swap_cache+0x72/0xe0 mm/swap_state.c:243 folio_free_swap+0x1d8/0x1f0 mm/swapfile.c:1850 free_swap_cache mm/swap_state.c:293 [inline] free_pages_and_swap_cache+0x1fc/0x410 mm/swap_state.c:325 __tlb_batch_free_encoded_pages mm/mmu_gather.c:136 [inline] tlb_batch_pages_flush mm/mmu_gather.c:149 [inline] tlb_flush_mmu_free mm/mmu_gather.c:366 [inline] tlb_flush_mmu+0x2cf/0x440 mm/mmu_gather.c:373 zap_pte_range mm/memory.c:1700 [inline] zap_pmd_range mm/memory.c:1739 [inline] zap_pud_range mm/memory.c:1768 [inline] zap_p4d_range mm/memory.c:1789 [inline] unmap_page_range+0x1f3c/0x22d0 mm/memory.c:1810 unmap_single_vma+0x142/0x1d0 mm/memory.c:1856 unmap_vmas+0x18d/0x2b0 mm/memory.c:1900 exit_mmap+0x18a/0x690 mm/mmap.c:1864 __mmput+0x28/0x1b0 kernel/fork.c:1347 mmput+0x4c/0x60 kernel/fork.c:1369 exit_mm+0xe4/0x190 kernel/exit.c:571 do_exit+0x55e/0x17f0 kernel/exit.c:926 do_group_exit+0x102/0x150 kernel/exit.c:1088 get_signal+0xf2a/0x1070 kernel/signal.c:2917 arch_do_signal_or_restart+0x95/0x4b0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x59/0x130 kernel/entry/common.c:218 do_syscall_64+0xd6/0x1c0 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f read to 0xffffea0004c90328 of 8 bytes by task 5189 on cpu 1: __try_to_reclaim_swap+0x9d/0x510 mm/swapfile.c:198 free_swap_and_cache_nr+0x45d/0x8a0 mm/swapfile.c:1915 zap_pte_range mm/memory.c:1656 [inline] zap_pmd_range mm/memory.c:1739 [inline] zap_pud_range mm/memory.c:1768 [inline] zap_p4d_range mm/memory.c:1789 [inline] unmap_page_range+0xcf8/0x22d0 mm/memory.c:1810 unmap_single_vma+0x142/0x1d0 mm/memory.c:1856 unmap_vmas+0x18d/0x2b0 mm/memory.c:1900 exit_mmap+0x18a/0x690 mm/mmap.c:1864 __mmput+0x28/0x1b0 kernel/fork.c:1347 mmput+0x4c/0x60 kernel/fork.c:1369 exit_mm+0xe4/0x190 kernel/exit.c:571 do_exit+0x55e/0x17f0 kernel/exit.c:926 __do_sys_exit kernel/exit.c:1055 [inline] __se_sys_exit kernel/exit.c:1053 [inline] __x64_sys_exit+0x1f/0x20 kernel/exit.c:1053 x64_sys_call+0x2d46/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:61 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f value changed: 0x0000000000000242 -> 0x0000000000000000 Link: https://lkml.kernel.org/r/20241007070623.23340-1-aha310510@gmail.com Reported-by: syzbot+fa43f1b63e3aa6f66329@syzkaller.appspotmail.com Fixes: `862590ac37` ("mm: swap: allow cache reclaim to skip slot cache") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Acked-by: Chris Li <chrisl@kernel.org> Reviewed-by: Kairui Song <kasong@tencent.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:11 -07:00
Baolin Wang	d60fcaf00d	mm: khugepaged: fix the incorrect statistics when collapsing large file folios Khugepaged already supports collapsing file large folios (including shmem mTHP) by commit `7de856ffd0` ("mm: khugepaged: support shmem mTHP collapse"), and the control parameters in khugepaged: 'khugepaged_max_ptes_swap' and 'khugepaged_max_ptes_none', still compare based on PTE granularity to determine whether a file collapse is needed. However, the statistics for 'present' and 'swap' in hpage_collapse_scan_file() do not take into account the large folios, which may lead to incorrect judgments regarding the khugepaged_max_ptes_swap/none parameters, resulting in unnecessary file collapses. To fix this issue, take into account the large folios' statistics for 'present' and 'swap' variables in the hpage_collapse_scan_file(). Link: https://lkml.kernel.org/r/c76305d96d12d030a1a346b50503d148364246d2.1728901391.git.baolin.wang@linux.alibaba.com Fixes: `7de856ffd0` ("mm: khugepaged: support shmem mTHP collapse") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Barry Song <baohua@kernel.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:10 -07:00
Andrey Konovalov	22ff9b0ff1	MAINTAINERS: kasan, kcov: add bugzilla links Add links to the Bugzilla component that's used to track KASAN and KCOV issues. Link: https://lkml.kernel.org/r/20241012225524.117871-1-andrey.konovalov@linux.dev Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:10 -07:00
David Hildenbrand	2b0f922323	mm: don't install PMD mappings when THPs are disabled by the hw/process/vma We (or rather, readahead logic :) ) might be allocating a THP in the pagecache and then try mapping it into a process that explicitly disabled THP: we might end up installing PMD mappings. This is a problem for s390x KVM, which explicitly remaps all PMD-mapped THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before starting the VM. For example, starting a VM backed on a file system with large folios supported makes the VM crash when the VM tries accessing such a mapping using KVM. Is it also a problem when the HW disabled THP using TRANSPARENT_HUGEPAGE_UNSUPPORTED? At least on x86 this would be the case without X86_FEATURE_PSE. In the future, we might be able to do better on s390x and only disallow PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED really wants. For now, fix it by essentially performing the same check as would be done in __thp_vma_allowable_orders() or in shmem code, where this works as expected, and disallow PMD mappings, making us fallback to PTE mappings. Link: https://lkml.kernel.org/r/20241011102445.934409-3-david@redhat.com Fixes: `793917d997` ("mm/readahead: Add large folio readahead") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Leo Fu <bfu@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Cc: Thomas Huth <thuth@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:10 -07:00
Kefeng Wang	963756aac1	mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw() Patch series "mm: don't install PMD mappings when THPs are disabled by the hw/process/vma". During testing, it was found that we can get PMD mappings in processes where THP (and more precisely, PMD mappings) are supposed to be disabled. While it works as expected for anon+shmem, the pagecache is the problematic bit. For s390 KVM this currently means that a VM backed by a file located on filesystem with large folio support can crash when KVM tries accessing the problematic page, because the readahead logic might decide to use a PMD-sized THP and faulting it into the page tables will install a PMD mapping, something that s390 KVM cannot tolerate. This might also be a problem with HW that does not support PMD mappings, but I did not try reproducing it. Fix it by respecting the ways to disable THPs when deciding whether we can install a PMD mapping. khugepaged should already be taking care of not collapsing if THPs are effectively disabled for the hw/process/vma. This patch (of 2): Add vma_thp_disabled() and thp_disabled_by_hw() helpers to be shared by shmem_allowable_huge_orders() and __thp_vma_allowable_orders(). [david@redhat.com: rename to vma_thp_disabled(), split out thp_disabled_by_hw() ] Link: https://lkml.kernel.org/r/20241011102445.934409-2-david@redhat.com Fixes: `793917d997` ("mm/readahead: Add large folio readahead") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Leo Fu <bfu@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Cc: Boqiao Fu <bfu@redhat.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:10 -07:00
SeongJae Park	f4050ccab7	Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs DAMON GitHub repos have moved from awslabs GitHub org to damonitor org[1]. Following the change, URLs on documents are also updated[2]. However, commit `2e9b3d6e2e` ("Docs/damon/maintainer-profile: add links in place"), which was added just after the update, was using the deprecated GitHub URLs. Update those to use damonitor GitHub URLs instead. [1] https://lore.kernel.org/20240813232158.83903-1-sj@kernel.org [2] https://lore.kernel.org/20240826015741.80707-2-sj@kernel.org Link: https://lkml.kernel.org/r/20241011170154.70651-3-sj@kernel.org Fixes: `2e9b3d6e2e` ("Docs/damon/maintainer-profile: add links in place") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:09 -07:00
SeongJae Park	46e10f644a	Docs/damon/maintainer-profile: add missing '_' suffixes for external web links Patch series "Docs/damon/maintainer-profile: a couple of minor hotfixes". DAMON maintainer-profile.rst file patches[1] that were merged into the v6.12-rc1 have a couple of minor mistakes. Fix those. [1] https://lore.kernel.org/20240826015741.80707-1-sj@kernel.org This patch (of 2): Links to external web pages on DAMON's maintainer-profile.rst are missing '_' suffixes. As a result, rendered document is having only verbose URLs that cannot be clicked. Fix those. Also, update the link texts for git trees to contain the names of the trees, for better readability and avoiding below Sphinx warning. maintainer-profile.rst:4: WARNING: Duplicate explicit target name: "tree". Link: https://lkml.kernel.org/r/20241011170154.70651-1-sj@kernel.org Link: https://lkml.kernel.org/r/20241011170154.70651-2-sj@kernel.org Fixes: `2e9b3d6e2e` ("Docs/damon/maintainer-profile: add links in place") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:09 -07:00
Sidhartha Kumar	a6e0ceb7bf	maple_tree: check for MA_STATE_BULK on setting wr_rebalance It is possible for a bulk operation (MA_STATE_BULK is set) to enter the new_end < mt_min_slots[type] case and set wr_rebalance as a store type. This is incorrect as bulk stores do not rebalance per write, but rather after the all of the writes are done through the mas_bulk_rebalance() path. Therefore, add a check to make sure MA_STATE_BULK is not set before we return wr_rebalance as the store type. Also add a test to make sure wr_rebalance is never the store type when doing bulk operations via mas_expected_entries() This is a hotfix for this rc however it has no userspace effects as there are no users of the bulk insertion mode. Link: https://lkml.kernel.org/r/20241011214451.7286-1-sidhartha.kumar@oracle.com Fixes: `5d659bbb52` ("maple_tree: introduce mas_wr_store_type()") Suggested-by: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Sidhartha <sidhartha.kumar@oracle.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:09 -07:00
Yang Shi	37f0b47c51	mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace point The "addr" and "is_shmem" arguments have different order in TP_PROTO and TP_ARGS. This resulted in the incorrect trace result: text-hugepage-644429 [276] 392092.878683: mm_khugepaged_collapse_file: mm=0xffff20025d52c440, hpage_pfn=0x200678c00, index=512, addr=1, is_shmem=0, filename=text-hugepage, nr=512, result=failed The value of "addr" is wrong because it was treated as bool value, the type of is_shmem. Fix the order in TP_PROTO to keep "addr" is before "is_shmem" since the original patch review suggested this order to achieve best packing. And use "lx" for "addr" instead of "ld" in TP_printk because address is typically shown in hex. After the fix, the trace result looks correct: text-hugepage-7291 [004] 128.627251: mm_khugepaged_collapse_file: mm=0xffff0001328f9500, hpage_pfn=0x20016ea00, index=512, addr=0x400000, is_shmem=0, filename=text-hugepage, nr=512, result=failed Link: https://lkml.kernel.org/r/20241012011702.1084846-1-yang@os.amperecomputing.com Fixes: `4c9473e87e` ("mm/khugepaged: add tracepoint to collapse_file()") Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Cc: Gautam Menghani <gautammenghani201@gmail.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> [6.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:09 -07:00
Jinjie Ruan	2d6a1c8356	mm/damon/tests/sysfs-kunit.h: fix memory leak in damon_sysfs_test_add_targets() The sysfs_target->regions allocated in damon_sysfs_regions_alloc() is not freed in damon_sysfs_test_add_targets(), which cause the following memory leak, free it to fix it. unreferenced object 0xffffff80c2a8db80 (size 96): comm "kunit_try_catch", pid 187, jiffies 4294894363 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 0): [<0000000001e3714d>] kmemleak_alloc+0x34/0x40 [<000000008e6835c1>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000001286d9f8>] damon_sysfs_test_add_targets+0x1cc/0x738 [<0000000032ef8f77>] kunit_try_run_case+0x13c/0x3ac [<00000000f3edea23>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000adf936cf>] kthread+0x2e8/0x374 [<0000000041bb1628>] ret_from_fork+0x10/0x20 Link: https://lkml.kernel.org/r/20241010125323.3127187-1-ruanjinjie@huawei.com Fixes: `b8ee5575f7` ("mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:08 -07:00
Andy Shevchenko	a5e8eb2513	mm: remove unused stub for can_swapin_thp() When can_swapin_thp() is unused, it prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y: mm/memory.c:4184:20: error: unused function 'can_swapin_thp' [-Werror,-Wunused-function] Fix this by removing the unused stub. See also commit `6863f5643d` ("kbuild: allow Clang to find unused static inline functions for W=1 build"). Link: https://lkml.kernel.org/r/20241008191329.2332346-1-andriy.shevchenko@linux.intel.com Fixes: `242d12c981` ("mm: support large folios swap-in for sync io devices") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Barry Song <baohua@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Chuanhua Han <hanchuanhua@oppo.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:08 -07:00
Andy Chiu	3f4e74cb3f	mailmap: add an entry for Andy Chiu Map my outdated addresses within mailmap. Link: https://lkml.kernel.org/r/20241009144934.43027-1-andybnac@gmail.com Signed-off-by: Andy Chiu <andybnac@gmail.com> Cc: Greentime Hu <greentime.hu@sifive.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Leon Chien <leonchien@synology.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:08 -07:00
Lorenzo Stoakes	f8dc524e59	MAINTAINERS: add memory mapping/VMA co-maintainers Add myself and Liam as co-maintainers of the memory mapping and VMA code alongside Andrew as we are heavily involved in its implementation and maintenance. Link: https://lkml.kernel.org/r/20241009201032.6130-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:08 -07:00
Brahmajit Das	5778ace04e	fs/proc: fix build with GCC 15 due to -Werror=unterminated-string-initialization show show_smap_vma_flags() has been a using misspelled initializer in mnemonics[] - it needed to initialize 2 element array of char and it used NUL-padded 2 character string literals (i.e. 3-element initializer). This has been spotted by gcc-15[]; prior to that gcc quietly dropped the 3rd eleemnt of initializers. To fix this we are increasing the size of mnemonics[] (from mnemonics[BITS_PER_LONG][2] to mnemonics[BITS_PER_LONG][3]) to accomodate the NUL-padded string literals. This also helps us in simplyfying the logic for printing of the flags as instead of printing each character from the mnemonics[], we can just print the mnemonics[] using seq_printf. []: fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization] 917 \| [0 ... (BITS_PER_LONG-1)] = "??", \| ^~~~ fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization] fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization] fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization] fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization] fs/proc/task_mmu.c:917:49: error: initializer-string for array of `char' is too long [-Werror=unterminate d-string-initialization] ... Stephen pointed out: : The C standard explicitly allows for a string initializer to be too long : due to the NUL byte at the end ... so this warning may be overzealous. but let's make the warning go away anwyay. Link: https://lkml.kernel.org/r/20241005063700.2241027-1-brahmajit.xyz@gmail.com Link: https://lkml.kernel.org/r/20241003093040.47c08382@canb.auug.org.au Signed-off-by: Brahmajit Das <brahmajit.xyz@gmail.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: David Hildenbrand <david@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:07 -07:00
Florian Westphal	dc783ba4b9	lib: alloc_tag_module_unload must wait for pending kfree_rcu calls Ben Greear reports following splat: ------------[ cut here ]------------ net/netfilter/nf_nat_core.c:1114 module nf_nat func:nf_nat_register_fn has 256 allocated at module unload WARNING: CPU: 1 PID: 10421 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x22b/0x3f0 Modules linked in: nf_nat(-) btrfs ufs qnx4 hfsplus hfs minix vfat msdos fat ... Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020 RIP: 0010:alloc_tag_module_unload+0x22b/0x3f0 codetag_unload_module+0x19b/0x2a0 ? codetag_load_module+0x80/0x80 nf_nat module exit calls kfree_rcu on those addresses, but the free operation is likely still pending by the time alloc_tag checks for leaks. Wait for outstanding kfree_rcu operations to complete before checking resolves this warning. Reproducer: unshare -n iptables-nft -t nat -A PREROUTING -p tcp grep nf_nat /proc/allocinfo # will list 4 allocations rmmod nft_chain_nat rmmod nf_nat # will WARN. [akpm@linux-foundation.org: add comment] Link: https://lkml.kernel.org/r/20241007205236.11847-1-fw@strlen.de Fixes: `a473573964` ("lib: code tagging module support") Signed-off-by: Florian Westphal <fw@strlen.de> Reported-by: Ben Greear <greearb@candelatech.com> Closes: https://lore.kernel.org/netdev/bdaaef9d-4364-4171-b82b-bcfc12e207eb@candelatech.com/ Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:07 -07:00
Jann Horn	6fa1066fc5	mm/mremap: fix move_normal_pmd/retract_page_tables race In mremap(), move_page_tables() looks at the type of the PMD entry and the specified address range to figure out by which method the next chunk of page table entries should be moved. At that point, the mmap_lock is held in write mode, but no rmap locks are held yet. For PMD entries that point to page tables and are fully covered by the source address range, move_pgt_entry(NORMAL_PMD, ...) is called, which first takes rmap locks, then does move_normal_pmd(). move_normal_pmd() takes the necessary page table locks at source and destination, then moves an entire page table from the source to the destination. The problem is: The rmap locks, which protect against concurrent page table removal by retract_page_tables() in the THP code, are only taken after the PMD entry has been read and it has been decided how to move it. So we can race as follows (with two processes that have mappings of the same tmpfs file that is stored on a tmpfs mount with huge=advise); note that process A accesses page tables through the MM while process B does it through the file rmap: process A process B ========= ========= mremap mremap_to move_vma move_page_tables get_old_pmd alloc_new_pmd * PREEMPT * madvise(MADV_COLLAPSE) do_madvise madvise_walk_vmas madvise_vma_behavior madvise_collapse hpage_collapse_scan_file collapse_file retract_page_tables i_mmap_lock_read(mapping) pmdp_collapse_flush i_mmap_unlock_read(mapping) move_pgt_entry(NORMAL_PMD, ...) take_rmap_locks move_normal_pmd drop_rmap_locks When this happens, move_normal_pmd() can end up creating bogus PMD entries in the line `pmd_populate(mm, new_pmd, pmd_pgtable(pmd))`. The effect depends on arch-specific and machine-specific details; on x86, you can end up with physical page 0 mapped as a page table, which is likely exploitable for user->kernel privilege escalation. Fix the race by letting process B recheck that the PMD still points to a page table after the rmap locks have been taken. Otherwise, we bail and let the caller fall back to the PTE-level copying path, which will then bail immediately at the pmd_none() check. Bug reachability: Reaching this bug requires that you can create shmem/file THP mappings - anonymous THP uses different code that doesn't zap stuff under rmap locks. File THP is gated on an experimental config flag (CONFIG_READ_ONLY_THP_FOR_FS), so on normal distro kernels you need shmem THP to hit this bug. As far as I know, getting shmem THP normally requires that you can mount your own tmpfs with the right mount flags, which would require creating your own user+mount namespace; though I don't know if some distros maybe enable shmem THP by default or something like that. Bug impact: This issue can likely be used for user->kernel privilege escalation when it is reachable. Link: https://lkml.kernel.org/r/20241007-move_normal_pmd-vs-collapse-fix-2-v1-1-5ead9631f2ea@google.com Fixes: `1d65b771bc` ("mm/khugepaged: retract_page_tables() without mmap or vma lock") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: David Hildenbrand <david@redhat.com> Co-developed-by: David Hildenbrand <david@redhat.com> Closes: https://project-zero.issues.chromium.org/371047675 Acked-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:07 -07:00
Sebastian Andrzej Siewior	8f3ce3d996	mm: percpu: increase PERCPU_DYNAMIC_SIZE_SHIFT on certain builds. Arnd reported a build failure due to the BUILD_BUG_ON() statement in alloc_kmem_cache_cpus(). The test PERCPU_DYNAMIC_EARLY_SIZE < NR_KMALLOC_TYPES * KMALLOC_SHIFT_HIGH * sizeof(struct kmem_cache_cpu) The factors that increase the right side of the equation: - PAGE_SIZE > 4KiB increases KMALLOC_SHIFT_HIGH - For the local_lock_t in kmem_cache_cpu: - PREEMPT_RT adds an actual lock. - LOCKDEP increases the size of the lock. - LOCK_STAT adds additional bytes plus padding to the lockdep structure. The net difference with and without PREEMPT_RT is 88 bytes for the lock_lock_t, 96 bytes for kmem_cache_cpu due to additional padding. This is enough to exceed the 80KiB limit with 16KiB page size - the 8KiB page size is fine. Increase PERCPU_DYNAMIC_SIZE_SHIFT to 13 on configs with PAGE_SIZE larger than 4KiB and LOCKDEP enabled. Link: https://lkml.kernel.org/r/20241007143049.gyMpEu89@linutronix.de Fixes: `d8fccd9ca5` ("arm64: Allow to enable PREEMPT_RT.") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410020326.iaZIteIx-lkp@intel.com/ Reported-by: Arnd Bergmann <arnd@kernel.org> Closes: https://lore.kernel.org/20241004095702.637528-1-arnd@kernel.org Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:07 -07:00
Edward Liaw	e142cc87ac	selftests/mm: fix deadlock for fork after pthread_create on ARM On Android with arm, there is some synchronization needed to avoid a deadlock when forking after pthread_create. Link: https://lkml.kernel.org/r/20241003211716.371786-3-edliaw@google.com Fixes: `cff2945827` ("selftests/mm: extend and rename uffd pagemap test") Signed-off-by: Edward Liaw <edliaw@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:06 -07:00
Edward Liaw	e61ef21e27	selftests/mm: replace atomic_bool with pthread_barrier_t Patch series "selftests/mm: fix deadlock after pthread_create". On Android arm, pthread_create followed by a fork caused a deadlock in the case where the fork required work to be completed by the created thread. Update the synchronization primitive to use pthread_barrier instead of atomic_bool. Apply the same fix to the wp-fork-with-event test. This patch (of 2): Swap synchronization primitive with pthread_barrier, so that stdatomic.h does not need to be included. The synchronization is needed on Android ARM64; we see a deadlock with pthread_create when the parent thread races forward before the child has a chance to start doing work. Link: https://lkml.kernel.org/r/20241003211716.371786-1-edliaw@google.com Link: https://lkml.kernel.org/r/20241003211716.371786-2-edliaw@google.com Fixes: `cff2945827` ("selftests/mm: extend and rename uffd pagemap test") Signed-off-by: Edward Liaw <edliaw@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:06 -07:00
OGAWA Hirofumi	963a7f4d3b	fat: fix uninitialized variable syszbot produced this with a corrupted fs image. In theory, however an IO error would trigger this also. This affects just an error report, so should not be a serious error. Link: https://lkml.kernel.org/r/87r08wjsnh.fsf@mail.parknet.co.jp Link: https://lkml.kernel.org/r/66ff2c95.050a0220.49194.03e9.GAE@google.com Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reported-by: syzbot+ef0d7bc412553291aa86@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:06 -07:00
Ryusuke Konishi	08cfa12adf	nilfs2: propagate directory read errors from nilfs_find_entry() Syzbot reported that a task hang occurs in vcs_open() during a fuzzing test for nilfs2. The root cause of this problem is that in nilfs_find_entry(), which searches for directory entries, ignores errors when loading a directory page/folio via nilfs_get_folio() fails. If the filesystem images is corrupted, and the i_size of the directory inode is large, and the directory page/folio is successfully read but fails the sanity check, for example when it is zero-filled, nilfs_check_folio() may continue to spit out error messages in bursts. Fix this issue by propagating the error to the callers when loading a page/folio fails in nilfs_find_entry(). The current interface of nilfs_find_entry() and its callers is outdated and cannot propagate error codes such as -EIO and -ENOMEM returned via nilfs_find_entry(), so fix it together. Link: https://lkml.kernel.org/r/20241004033640.6841-1-konishi.ryusuke@gmail.com Fixes: `2ba466d74e` ("nilfs2: directory entry operations") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: Lizhi Xu <lizhi.xu@windriver.com> Closes: https://lkml.kernel.org/r/20240927013806.3577931-1-lizhi.xu@windriver.com Reported-by: syzbot+8a192e8d090fa9a31135@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8a192e8d090fa9a31135 Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:06 -07:00
Lorenzo Stoakes	74874c5793	mm/mmap: correct error handling in mmap_region() Commit `f8d112a4e6` ("mm/mmap: avoid zeroing vma tree in mmap_region()") changed how error handling is performed in mmap_region(). The error value defaults to -ENOMEM, but then gets reassigned immediately to the result of vms_gather_munmap_vmas() if we are performing a MAP_FIXED mapping over existing VMAs (and thus unmapping them). This overwrites the error value, potentially clearing it. After this, we invoke may_expand_vm() and possibly vm_area_alloc(), and check to see if they failed. If they do so, then we perform error-handling logic, but importantly, we do NOT update the error code. This means that, if vms_gather_munmap_vmas() succeeds, but one of these calls does not, the function will return indicating no error, but rather an address value of zero, which is entirely incorrect. Correct this and avoid future confusion by strictly setting error on each and every occasion we jump to the error handling logic, and set the error code immediately prior to doing so. This way we can see at a glance that the error code is always correct. Many thanks to Vegard Nossum who spotted this issue in discussion around this problem. Link: https://lkml.kernel.org/r/20241002073932.13482-1-lorenzo.stoakes@oracle.com Fixes: `f8d112a4e6` ("mm/mmap: avoid zeroing vma tree in mmap_region()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Suggested-by: Vegard Nossum <vegard.nossum@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-10-17 00:28:05 -07:00
Thomas Zimmermann	c09c4f2a97	drm/ast: vga: Clear EDID if no display is connected Do not keep the obsolete EDID around after unplugging the display from the connector. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: `2a2391f857` ("drm/ast: vga: Transparently handle BMC support") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015065113.11790-3-tzimmermann@suse.de	2024-10-17 08:50:14 +02:00
Thomas Zimmermann	5b3c0209e8	drm/ast: sil164: Clear EDID if no display is connected Do not keep the obsolete EDID around after unplugging the display from the connector. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: `d20c2f8464` ("drm/ast: sil164: Transparently handle BMC support") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015065113.11790-2-tzimmermann@suse.de	2024-10-17 08:50:14 +02:00
Thomas Zimmermann	e5a3c24bca	Revert "drm/mgag200: Add vblank support" This reverts commit `6c9e14ee9f`. This reverts commit `d5070c9b29`. This reverts commit `89c6ea2006`. The VLINE interrupt doesn't work correctly on G200SE-A (at least). We have also seen missing interrupts on G200ER. So revert vblank support. Fixes frozen displays and warnings about missed vblanks. [ 33.818362] [CRTC:34:crtc-0] vblank wait timed out From the vblank code, the driver only keeps the register constants and the line that disables all interrupts in mgag200_device_init(). Both is still useful without vblank handling. Reported-by: Tony Luck <tony.luck@intel.com> Closes: https://lore.kernel.org/dri-devel/Zvx6lSi7oq5xvTZb@agluck-desk3.sc.intel.com/raw Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015063932.8620-1-tzimmermann@suse.de	2024-10-17 08:49:45 +02:00
Mathias Nyman	30c9ae5ece	xhci: dbc: honor usb transfer size boundaries. Treat each completed full size write to /dev/ttyDBC0 as a separate usb transfer. Make sure the size of the TRBs matches the size of the tty write by first queuing as many max packet size TRBs as possible up to the last TRB which will be cut short to match the size of the tty write. This solves an issue where userspace writes several transfers back to back via /dev/ttyDBC0 into a kfifo before dbgtty can find available request to turn that kfifo data into TRBs on the transfer ring. The boundary between transfer was lost as xhci-dbgtty then turned everyting in the kfifo into as many 'max packet size' TRBs as possible. DbC would then send more data to the host than intended for that transfer, causing host to issue a babble error. Refuse to write more data to kfifo until previous tty write data is turned into properly sized TRBs with data size boundaries matching tty write size Tested-by: Uday M Bhat <uday.m.bhat@intel.com> Tested-by: Łukasz Bartosik <ukaszb@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-17 08:45:36 +02:00
Michal Pecio	f42a36bae0	usb: xhci: Fix handling errors mid TD followed by other errors Some host controllers fail to produce the final completion event on an isochronous TD which experienced an error mid TD. We deal with it by flagging such TDs and checking if the next event points at the flagged TD or at the next one, and giving back the flagged TD if the latter. This is not enough, because the next TD may be missed by the xHC. Or there may be no next TD but a ring underrun. We also need to get such TD quickly out of the way, or errors on later TDs may be handled wrong. If the next TD experiences a Missed Service Error, we will set the skip flag on the endpoint and then attempt skipping TDs when yet another event arrives. In such scenario, we ought to report the 'error mid TD' transfer as such rather than skip it. Another problem case are Stopped events. If we see one after an error mid TD, we naively assume that it's a Force Stopped Event because it doesn't match the pending TD, but in reality it might be an ordinary Stopped event for the next TD, which we fail to recognize and handle. Fix this by moving error mid TD handling before the whole TD skipping loop. Remove unnecessary conditions, always give back the TD if the new event points to any TRB outside it or if the pointer is NULL, as may be the case in Ring Underrun and Overrun events on 1st gen hardware. Only if the pending TD isn't flagged, consider other actions like skipping. As a side effect of reordering with skip and FSE cases, error mid TD is reordered with last_td_was_short check. This is harmless, because the two cases are mutually exclusive - only one can happen in any given run of handle_tx_event(). Tested on the NEC host and a USB camera with flaky cable. Dynamic debug confirmed that Transaction Errors are sometimes seen, sometimes mid-TD, sometimes followed by Missed Service. In such cases, they were finished properly before skipping began. [Rebase on 6.12-rc1 -Mathias] Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-17 08:45:36 +02:00
Mathias Nyman	fe49df60cd	xhci: Mitigate failed set dequeue pointer commands Avoid xHC host from processing a cancelled URB by always turning cancelled URB TDs into no-op TRBs before queuing a 'Set TR Deq' command. If the command fails then xHC will start processing the cancelled TD instead of skipping it once endpoint is restarted, causing issues like Babble error. This is not a complete solution as a failed 'Set TR Deq' command does not guarantee xHC TRB caches are cleared. Fixes: `4db356924a` ("xhci: turn cancelled td cleanup to its own function") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-17 08:45:36 +02:00
Mathias Nyman	6599b6a6fa	xhci: Fix incorrect stream context type macro The stream contex type (SCT) bitfield is used both in the stream context data structure, and in the 'Set TR Dequeue pointer' command TRB. In both cases it uses bits 3:1 The SCT_FOR_TRB(p) macro used to set the stream context type (SCT) field for the 'Set TR Dequeue pointer' command TRB incorrectly shifts the value 1 bit left before masking the three bits. Fix this by first masking and rshifting, just like the similar SCT_FOR_CTX(p) macro does This issue has not been visibile as the lost bit 3 is only used with secondary stream arrays (SSA). Xhci driver currently only supports using a primary stream array with Linear stream addressing. Fixes: `95241dbdf8` ("xhci: Set SCT field for Set TR dequeue on streams") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-17 08:45:36 +02:00
Alan Stern	5189df7b80	USB: gadget: dummy-hcd: Fix "task hung" problem The syzbot fuzzer has been encountering "task hung" problems ever since the dummy-hcd driver was changed to use hrtimers instead of regular timers. It turns out that the problems are caused by a subtle difference between the timer_pending() and hrtimer_active() APIs. The changeover blindly replaced the first by the second. However, timer_pending() returns True when the timer is queued but not when its callback is running, whereas hrtimer_active() returns True when the hrtimer is queued _or_ its callback is running. This difference occasionally caused dummy_urb_enqueue() to think that the callback routine had not yet started when in fact it was almost finished. As a result the hrtimer was not restarted, which made it impossible for the driver to dequeue later the URB that was just enqueued. This caused usb_kill_urb() to hang, and things got worse from there. Since hrtimers have no API for telling when they are queued and the callback isn't running, the driver must keep track of this for itself. That's what this patch does, adding a new "timer_pending" flag and setting or clearing it at the appropriate times. Reported-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/6709234e.050a0220.3e960.0011.GAE@google.com/ Tested-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Fixes: `a7f3813e58` ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler") Cc: Marcello Sylvester Bauer <sylv@sylv.io> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/2dab644e-ef87-4de8-ac9a-26f100b2c609@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-17 08:45:10 +02:00
Yun Lu	fe05c40ca9	selftest: hid: add the missing tests directory Commit `160c826b4d` ("selftest: hid: add missing run-hid-tools-tests.sh") has added the run-hid-tools-tests.sh script for it to be installed, but I forgot to add the tests directory together. If running the test case without the tests directory, will results in the following error message: make -C tools/testing/selftests/ TARGETS=hid install \ INSTALL_PATH=$KSFT_INSTALL_PATH cd $KSFT_INSTALL_PATH ./run_kselftest.sh -t hid:hid-core.sh /usr/lib/python3.11/site-packages/_pytest/config/__init__.py:331: PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown. Plugin: helpconfig, Hook: pytest_cmdline_parse UsageError: usage: __main__.py [options] [file_or_dir] [file_or_dir] [...] __main__.py: error: unrecognized arguments: --udevd inifile: None rootdir: /root/linux/kselftest_install/hid In fact, the run-hid-tools-tests.sh script uses the scripts in the tests directory to run tests. The tests directory also needs to be added to be installed. Fixes: `ffb85d5c9e` ("selftests: hid: import hid-tools hid-core tests") Cc: stable@vger.kernel.org Signed-off-by: Yun Lu <luyun@kylinos.cn> Acked-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-10-16 15:55:14 -06:00
Jinjie Ruan	6b5cca7868	clk: test: Fix some memory leaks CONFIG_CLK_KUNIT_TEST=y, CONFIG_DEBUG_KMEMLEAK=y and CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y, the following memory leak occurs. If the KUNIT_ASSERT_*() fails, the latter (exit() or testcases) clk_put() or clk_hw_unregister() will fail to release the clk resource and cause memory leaks, use new clk_hw_register_kunit() and clk_hw_get_clk_kunit() to automatically release them. unreferenced object 0xffffff80c6af5000 (size 512): comm "kunit_try_catch", pid 371, jiffies 4294896001 hex dump (first 32 bytes): 20 4c c0 86 e1 ff ff ff e0 1a c0 86 e1 ff ff ff L.............. c0 75 e3 c6 80 ff ff ff 00 00 00 00 00 00 00 00 .u.............. backtrace (crc 8ca788fa): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4 [<00000000d1bc850c>] __clk_register+0x80/0x1ecc [<00000000b08c78c5>] clk_hw_register+0xc4/0x110 [<00000000b16d6df8>] clk_multiple_parents_mux_test_init+0x238/0x288 [<0000000014a7e804>] kunit_try_run_case+0x10c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c6e37880 (size 96): comm "kunit_try_catch", pid 371, jiffies 4294896002 hex dump (first 32 bytes): 00 50 af c6 80 ff ff ff 00 00 00 00 00 00 00 00 .P.............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc b4b766dd): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4 [<0000000086e7dd64>] clk_hw_create_clk.part.0.isra.0+0x58/0x2f4 [<00000000dcf1ac31>] clk_hw_get_clk+0x8c/0x114 [<000000006fab5bfa>] clk_test_multiple_parents_mux_set_range_set_parent_get_rate+0x3c/0xa0 [<00000000c97db55a>] kunit_try_run_case+0x13c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c2b56900 (size 96): comm "kunit_try_catch", pid 395, jiffies 4294896107 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 e0 49 c0 86 e1 ff ff ff .........I...... backtrace (crc 2e59b327): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<00000000c6c715a8>] __kmalloc_noprof+0x2bc/0x3c0 [<00000000f04a7951>] __clk_register+0x70c/0x1ecc [<00000000b08c78c5>] clk_hw_register+0xc4/0x110 [<00000000cafa9563>] clk_orphan_transparent_multiple_parent_mux_test_init+0x1a8/0x1dc [<0000000014a7e804>] kunit_try_run_case+0x10c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c87c9400 (size 512): comm "kunit_try_catch", pid 483, jiffies 4294896907 hex dump (first 32 bytes): a0 44 c0 86 e1 ff ff ff e0 1a c0 86 e1 ff ff ff .D.............. 20 05 a8 c8 80 ff ff ff 00 00 00 00 00 00 00 00 ............... backtrace (crc c25b43fb): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4 [<00000000d1bc850c>] __clk_register+0x80/0x1ecc [<00000000b08c78c5>] clk_hw_register+0xc4/0x110 [<000000002688be48>] clk_single_parent_mux_test_init+0x1a0/0x1d4 [<0000000014a7e804>] kunit_try_run_case+0x10c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c6dd2380 (size 96): comm "kunit_try_catch", pid 483, jiffies 4294896908 hex dump (first 32 bytes): 00 94 7c c8 80 ff ff ff 00 00 00 00 00 00 00 00 ..\|............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc `4401212`): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4 [<0000000086e7dd64>] clk_hw_create_clk.part.0.isra.0+0x58/0x2f4 [<00000000dcf1ac31>] clk_hw_get_clk+0x8c/0x114 [<0000000063eb2c90>] clk_test_single_parent_mux_set_range_disjoint_child_last+0x3c/0xa0 [<00000000c97db55a>] kunit_try_run_case+0x13c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 ...... Fixes: `02cdeace1e` ("clk: tests: Add tests for single parent mux") Fixes: `2e9cad1abc` ("clk: tests: Add some tests for orphan with multiple parents") Fixes: `433fb8a611` ("clk: tests: Add missing test case for ranges") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20241016022658.2131826-1-ruanjinjie@huawei.com Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2024-10-16 14:39:18 -07:00
Linus Torvalds	c964ced772	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma fixes from Jason Gunthorpe: "Several miscellaneous fixes. A lot of bnxt_re activity, there will be more rc patches there coming. - Many bnxt_re bug fixes - Memory leaks, kasn, NULL pointer deref, soft lockups, error unwinding and some small functional issues - Error unwind bug in rdma netlink - Two issues with incorrect VLAN detection for iWarp - skb_splice_from_iter() splat in siw - Give SRP slab caches unique names to resolve the merge window WARN_ON regression" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/bnxt_re: Fix the GID table length RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages RDMA/bnxt_re: Change the sequence of updating the CQ toggle value RDMA/bnxt_re: Fix an error path in bnxt_re_add_device RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop RDMA/bnxt_re: Fix a possible NULL pointer dereference RDMA/bnxt_re: Return more meaningful error RDMA/bnxt_re: Fix incorrect dereference of srq in async event RDMA/bnxt_re: Fix out of bound check RDMA/bnxt_re: Fix the max CQ WQEs for older adapters RDMA/srpt: Make slab cache names unique RDMA/irdma: Fix misspelling of "accept*" RDMA/cxgb4: Fix RDMA_CM_EVENT_UNREACHABLE error for iWARP RDMA/siw: Add sendpage_ok() check to disable MSG_SPLICE_PAGES RDMA/core: Fix ENODEV error for iWARP test over vlan RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event RDMA/bnxt_re: Fix the max WQEs used in Static WQE mode RDMA/bnxt_re: Add a check for memory allocation RDMA/bnxt_re: Fix incorrect AVID type in WQE structure RDMA/bnxt_re: Fix a possible memory leak	2024-10-16 13:37:59 -07:00
Srinivas Pandruvada	3ebe9c1255	powercap: intel_rapl_msr: Add PL4 support for ArrowLake-H Add ArrowLake-H to the list of processors where PL4 is supported. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20241016154851.1293654-1-srinivas.pandruvada@linux.intel.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-10-16 22:34:03 +02:00
Rafael J. Wysocki	702dedf758	Merge tag 'amd-pstate-v6.12-2024-10-16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux Merge an amd-pstate driver fix for 6.12-rc4 from Mario Limonciello: "Fix a regression introduced where boost control malfunctioned in amd-pstate" * tag 'amd-pstate-v6.12-2024-10-16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled	2024-10-16 22:29:34 +02:00
Luiz Augusto von Dentz	2c1dda2acc	Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001 Fake CSR controllers don't seem to handle short-transfer properly which cause command to time out: kernel: usb 1-1: new full-speed USB device number 19 using xhci_hcd kernel: usb 1-1: New USB device found, idVendor=0a12, idProduct=0001, bcdDevice=88.91 kernel: usb 1-1: New USB device strings: Mfr=0, Product=2, SerialNumber=0 kernel: usb 1-1: Product: BT DONGLE10 ... Bluetooth: hci1: Opcode 0x1004 failed: -110 kernel: Bluetooth: hci1: command 0x1004 tx timeout According to USB Spec 2.0 Section 5.7.3 Interrupt Transfer Packet Size Constraints a interrupt transfer is considered complete when the size is 0 (ZPL) or < wMaxPacketSize: 'When an interrupt transfer involves more data than can fit in one data payload of the currently established maximum size, all data payloads are required to be maximum-sized except for the last data payload, which will contain the remaining data. An interrupt transfer is complete when the endpoint does one of the following: • Has transferred exactly the amount of data expected • Transfers a packet with a payload size less than wMaxPacketSize or transfers a zero-length packet' Link: https://bugzilla.kernel.org/show_bug.cgi?id=219365 Fixes: `7b05933340` ("Bluetooth: btusb: Fix not handling ZPL/short-transfer") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-10-16 16:10:25 -04:00
Ye Bin	64a90991ba	Bluetooth: bnep: fix wild-memory-access in proto_unregister There's issue as follows: KASAN: maybe wild-memory-access in range [0xdead...108-0xdead...10f] CPU: 3 UID: 0 PID: 2805 Comm: rmmod Tainted: G W RIP: 0010:proto_unregister+0xee/0x400 Call Trace: <TASK> __do_sys_delete_module+0x318/0x580 do_syscall_64+0xc1/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f As bnep_init() ignore bnep_sock_init()'s return value, and bnep_sock_init() will cleanup all resource. Then when remove bnep module will call bnep_sock_cleanup() to cleanup sock's resource. To solve above issue just return bnep_sock_init()'s return value in bnep_exit(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-10-16 16:10:03 -04:00
Luiz Augusto von Dentz	4084286151	Bluetooth: btusb: Fix not being able to reconnect after suspend This partially reverts 81b3e33bb054 ("Bluetooth: btusb: Don't fail external suspend requests") as it introduced a call to hci_suspend_dev that assumes the system-suspend which doesn't work well when just the device is being suspended because wakeup flag is only set for remote devices that can wakeup the system. Reported-by: Rafael J. Wysocki <rafael@kernel.org> Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Reported-by: Kenneth Crudup <kenny@panix.com> Fixes: `610712298b` ("Bluetooth: btusb: Don't fail external suspend requests") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: Rafael J. Wysocki <rafael@kernel.org>	2024-10-16 16:09:43 -04:00
Aaron Thompson	1db4564f10	Bluetooth: Remove debugfs directory on module init failure If bt_init() fails, the debugfs directory currently is not removed. If the module is loaded again after that, the debugfs directory is not set up properly due to the existing directory. # modprobe bluetooth # ls -laF /sys/kernel/debug/bluetooth total 0 drwxr-xr-x 2 root root 0 Sep 27 14:26 ./ drwx------ 31 root root 0 Sep 27 14:25 ../ -r--r--r-- 1 root root 0 Sep 27 14:26 l2cap -r--r--r-- 1 root root 0 Sep 27 14:26 sco # modprobe -r bluetooth # ls -laF /sys/kernel/debug/bluetooth ls: cannot access '/sys/kernel/debug/bluetooth': No such file or directory # # modprobe bluetooth modprobe: ERROR: could not insert 'bluetooth': Invalid argument # dmesg \| tail -n 6 Bluetooth: Core ver 2.22 NET: Registered PF_BLUETOOTH protocol family Bluetooth: HCI device and connection manager initialized Bluetooth: HCI socket layer initialized Bluetooth: Faking l2cap_init() failure for testing NET: Unregistered PF_BLUETOOTH protocol family # ls -laF /sys/kernel/debug/bluetooth total 0 drwxr-xr-x 2 root root 0 Sep 27 14:31 ./ drwx------ 31 root root 0 Sep 27 14:26 ../ # # modprobe bluetooth # dmesg \| tail -n 7 Bluetooth: Core ver 2.22 debugfs: Directory 'bluetooth' with parent '/' already present! NET: Registered PF_BLUETOOTH protocol family Bluetooth: HCI device and connection manager initialized Bluetooth: HCI socket layer initialized Bluetooth: L2CAP socket layer initialized Bluetooth: SCO socket layer initialized # ls -laF /sys/kernel/debug/bluetooth total 0 drwxr-xr-x 2 root root 0 Sep 27 14:31 ./ drwx------ 31 root root 0 Sep 27 14:26 ../ # Cc: stable@vger.kernel.org Fixes: `ffcecac6a7` ("Bluetooth: Create root debugfs directory during module init") Signed-off-by: Aaron Thompson <dev@aaront.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-10-16 16:09:25 -04:00
Aaron Thompson	d458cd1221	Bluetooth: Call iso_exit() on module unload If iso_init() has been called, iso_exit() must be called on module unload. Without that, the struct proto that iso_init() registered with proto_register() becomes invalid, which could cause unpredictable problems later. In my case, with CONFIG_LIST_HARDENED and CONFIG_BUG_ON_DATA_CORRUPTION enabled, loading the module again usually triggers this BUG(): list_add corruption. next->prev should be prev (ffffffffb5355fd0), but was 0000000000000068. (next=ffffffffc0a010d0). ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:29! Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 4159 Comm: modprobe Not tainted 6.10.11-4+bt2-ao-desktop #1 RIP: 0010:__list_add_valid_or_report+0x61/0xa0 ... __list_add_valid_or_report+0x61/0xa0 proto_register+0x299/0x320 hci_sock_init+0x16/0xc0 [bluetooth] bt_init+0x68/0xd0 [bluetooth] __pfx_bt_init+0x10/0x10 [bluetooth] do_one_initcall+0x80/0x2f0 do_init_module+0x8b/0x230 __do_sys_init_module+0x15f/0x190 do_syscall_64+0x68/0x110 ... Cc: stable@vger.kernel.org Fixes: `ccf74f2390` ("Bluetooth: Add BTPROTO_ISO socket type") Signed-off-by: Aaron Thompson <dev@aaront.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-10-16 16:09:03 -04:00
Aaron Thompson	a9b7b535ba	Bluetooth: ISO: Fix multiple init when debugfs is disabled If bt_debugfs is not created successfully, which happens if either CONFIG_DEBUG_FS or CONFIG_DEBUG_FS_ALLOW_ALL is unset, then iso_init() returns early and does not set iso_inited to true. This means that a subsequent call to iso_init() will result in duplicate calls to proto_register(), bt_sock_register(), etc. With CONFIG_LIST_HARDENED and CONFIG_BUG_ON_DATA_CORRUPTION enabled, the duplicate call to proto_register() triggers this BUG(): list_add double add: new=ffffffffc0b280d0, prev=ffffffffbab56250, next=ffffffffc0b280d0. ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:35! Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 2 PID: 887 Comm: bluetoothd Not tainted 6.10.11-1-ao-desktop #1 RIP: 0010:__list_add_valid_or_report+0x9a/0xa0 ... __list_add_valid_or_report+0x9a/0xa0 proto_register+0x2b5/0x340 iso_init+0x23/0x150 [bluetooth] set_iso_socket_func+0x68/0x1b0 [bluetooth] kmem_cache_free+0x308/0x330 hci_sock_sendmsg+0x990/0x9e0 [bluetooth] __sock_sendmsg+0x7b/0x80 sock_write_iter+0x9a/0x110 do_iter_readv_writev+0x11d/0x220 vfs_writev+0x180/0x3e0 do_writev+0xca/0x100 ... This change removes the early return. The check for iso_debugfs being NULL was unnecessary, it is always NULL when iso_inited is false. Cc: stable@vger.kernel.org Fixes: `ccf74f2390` ("Bluetooth: Add BTPROTO_ISO socket type") Signed-off-by: Aaron Thompson <dev@aaront.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-10-16 16:08:43 -04:00
Alex Deucher	ec1aab7816	drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs This uses more aggressive hueristics than the the bootup default profile. On windows the OS has a special fullscreen 3D mode where this is used. Since we don't have the equivalent on Linux default to this profile for dGPUs. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1500 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131 Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `336568de91`)	2024-10-16 15:51:10 -04:00
Stefan Kerkmann	ea330429a0	Input: xpad - add support for 8BitDo Ultimate 2C Wireless Controller This XBOX360 compatible gamepad uses the new product id 0x310a under the 8BitDo's vendor id 0x2dc8. The change was tested using the gamepad in a wired and wireless dongle configuration. Signed-off-by: Stefan Kerkmann <s.kerkmann@pengutronix.de> Link: https://lore.kernel.org/r/20241015-8bitdo_2c_ultimate_wireless-v1-1-9c9f9db2e995@pengutronix.de Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2024-10-16 12:39:39 -07:00
Linus Torvalds	667b1d41b2	Merge tag 'for-6.12-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - regression fix: dirty extents tracked in xarray for qgroups must be adjusted for 32bit platforms - fix potentially freeing uninitialized name in fscrypt structure - fix warning about unneeded variable in a send callback * tag 'for-6.12-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix uninitialized pointer free on read_alloc_one_name() error btrfs: send: cleanup unneeded return variable in changed_verity() btrfs: fix uninitialized pointer free in add_inode_ref() btrfs: use sector numbers as keys for the dirty extents xarray	2024-10-16 09:30:20 -07:00
Linus Torvalds	9f635d44d7	Merge tag 'v6.12-rc3-ksmbd-fixes' of git://git.samba.org/ksmbd Pull smb server fixes from Steve French: - fix race between session setup and session logoff - add supplementary group support * tag 'v6.12-rc3-ksmbd-fixes' of git://git.samba.org/ksmbd: ksmbd: add support for supplementary groups ksmbd: fix user-after-free from session log off	2024-10-16 09:15:43 -07:00
Linus Torvalds	6f6fc393f4	Merge tag 'v6.12-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Remove bogus testmgr ENOENT error messages - Ensure algorithm is still alive before marking it as tested - Disable buggy hash algorithms in marvell/cesa * tag 'v6.12-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: marvell/cesa - Disable hash algorithms crypto: testmgr - Hide ENOENT errors better crypto: api - Fix liveliness check in crypto_alg_tested	2024-10-16 08:42:54 -07:00
Tyrone Wu	2aa587fd66	selftests/bpf: Add asserts for netfilter link info Add assertions/tests to verify `bpf_link_info` fields for netfilter link are correctly populated. Signed-off-by: Tyrone Wu <wudevelops@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20241011193252.178997-2-wudevelops@gmail.com	2024-10-16 17:04:38 +02:00
Tyrone Wu	92f3715e1e	bpf: Fix link info netfilter flags to populate defrag flag This fix correctly populates the `bpf_link_info.netfilter.flags` field when user passes the `BPF_F_NETFILTER_IP_DEFRAG` flag. Fixes: `91721c2d02` ("netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link") Signed-off-by: Tyrone Wu <wudevelops@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Florian Westphal <fw@strlen.de> Cc: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/bpf/20241011193252.178997-1-wudevelops@gmail.com	2024-10-16 17:04:33 +02:00
Remi Pommarel	befd716ed4	wifi: ath11k: Fix invalid ring usage in full monitor mode On full monitor HW the monitor destination rxdma ring does not have the same descriptor format as in the "classical" mode. The full monitor destination entries are of hal_sw_monitor_ring type and fetched using ath11k_dp_full_mon_process_rx while the classical ones are of type hal_reo_entrance_ring and fetched with ath11k_dp_rx_mon_dest_process. Although both hal_sw_monitor_ring and hal_reo_entrance_ring are of same size, the offset to useful info (such as sw_cookie, paddr, etc) are different. Thus if ath11k_dp_rx_mon_dest_process gets called on full monitor destination ring, invalid skb buffer id will be fetched from DMA ring causing issues such as the following rcu_sched stall: rcu: INFO: rcu_sched self-detected stall on CPU rcu: 0-....: (1 GPs behind) idle=c67/0/0x7 softirq=45768/45769 fqs=1012 (t=2100 jiffies g=14817 q=8703) Task dump for CPU 0: task:swapper/0 state:R running task stack: 0 pid: 0 ppid: 0 flags:0x0000000a Call trace: dump_backtrace+0x0/0x160 show_stack+0x14/0x20 sched_show_task+0x158/0x184 dump_cpu_task+0x40/0x4c rcu_dump_cpu_stacks+0xec/0x12c rcu_sched_clock_irq+0x6c8/0x8a0 update_process_times+0x88/0xd0 tick_sched_timer+0x74/0x1e0 __hrtimer_run_queues+0x150/0x204 hrtimer_interrupt+0xe4/0x240 arch_timer_handler_phys+0x30/0x40 handle_percpu_devid_irq+0x80/0x130 handle_domain_irq+0x5c/0x90 gic_handle_irq+0x8c/0xb4 do_interrupt_handler+0x30/0x54 el1_interrupt+0x2c/0x4c el1h_64_irq_handler+0x14/0x1c el1h_64_irq+0x74/0x78 do_raw_spin_lock+0x60/0x100 _raw_spin_lock_bh+0x1c/0x2c ath11k_dp_rx_mon_mpdu_pop.constprop.0+0x174/0x650 ath11k_dp_rx_process_mon_status+0x8b4/0xa80 ath11k_dp_rx_process_mon_rings+0x244/0x510 ath11k_dp_service_srng+0x190/0x300 ath11k_pcic_ext_grp_napi_poll+0x30/0xc0 __napi_poll+0x34/0x174 net_rx_action+0xf8/0x2a0 _stext+0x12c/0x2ac irq_exit+0x94/0xc0 handle_domain_irq+0x60/0x90 gic_handle_irq+0x8c/0xb4 call_on_irq_stack+0x28/0x44 do_interrupt_handler+0x4c/0x54 el1_interrupt+0x2c/0x4c el1h_64_irq_handler+0x14/0x1c el1h_64_irq+0x74/0x78 arch_cpu_idle+0x14/0x20 do_idle+0xf0/0x130 cpu_startup_entry+0x24/0x50 rest_init+0xf8/0x104 arch_call_rest_init+0xc/0x14 start_kernel+0x56c/0x58c __primary_switched+0xa0/0xa8 Thus ath11k_dp_rx_mon_dest_process(), which use classical destination entry format, should no be called on full monitor capable HW. Fixes: `67a9d399fc` ("ath11k: enable RX PPDU stats in monitor co-exist mode") Signed-off-by: Remi Pommarel <repk@triplefau.lt> Reviewed-by: Praneesh P <quic_ppranees@quicinc.com> Link: https://patch.msgid.link/20240924194119.15942-1-repk@triplefau.lt Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>	2024-10-16 07:30:31 -07:00
Manikanta Pubbisetty	e15d84b3bb	wifi: ath10k: Fix memory leak in management tx In the current logic, memory is allocated for storing the MSDU context during management packet TX but this memory is not being freed during management TX completion. Similar leaks are seen in the management TX cleanup logic. Kmemleak reports this problem as below, unreferenced object 0xffffff80b64ed250 (size 16): comm "kworker/u16:7", pid 148, jiffies 4294687130 (age 714.199s) hex dump (first 16 bytes): 00 2b d8 d8 80 ff ff ff c4 74 e9 fd 07 00 00 00 .+.......t...... backtrace: [<ffffffe6e7b245dc>] __kmem_cache_alloc_node+0x1e4/0x2d8 [<ffffffe6e7adde88>] kmalloc_trace+0x48/0x110 [<ffffffe6bbd765fc>] ath10k_wmi_tlv_op_gen_mgmt_tx_send+0xd4/0x1d8 [ath10k_core] [<ffffffe6bbd3eed4>] ath10k_mgmt_over_wmi_tx_work+0x134/0x298 [ath10k_core] [<ffffffe6e78d5974>] process_scheduled_works+0x1ac/0x400 [<ffffffe6e78d60b8>] worker_thread+0x208/0x328 [<ffffffe6e78dc890>] kthread+0x100/0x1c0 [<ffffffe6e78166c0>] ret_from_fork+0x10/0x20 Free the memory during completion and cleanup to fix the leak. Protect the mgmt_pending_tx idr_remove() operation in ath10k_wmi_tlv_op_cleanup_mgmt_tx_send() using ar->data_lock similar to other instances. Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.2.0-01387-QCAHLSWMTPLZ-1 Fixes: `dc405152bb` ("ath10k: handle mgmt tx completion event") Fixes: `c730c47717` ("ath10k: Remove msdu from idr when management pkt send fails") Cc: stable@vger.kernel.org Signed-off-by: Manikanta Pubbisetty <quic_mpubbise@quicinc.com> Link: https://patch.msgid.link/20241015064103.6060-1-quic_mpubbise@quicinc.com Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>	2024-10-16 07:30:31 -07:00
Ming Lei	42aafd8b48	ublk: don't allow user copy for unprivileged device UBLK_F_USER_COPY requires userspace to call write() on ublk char device for filling request buffer, and unprivileged device can't be trusted. So don't allow user copy for unprivileged device. Cc: stable@vger.kernel.org Fixes: `1172d5b8be` ("ublk: support user copy") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241016134847.2911721-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-16 08:08:18 -06:00
Juha-Pekka Heikkila	ffafd12696	drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater On display ver 20 onwards tile4 is not supported with horizontal flip Bspec: 69853 Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Signed-off-by: Mika Kahola <mika.kahola@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007182841.2104740-1-juhapekka.heikkila@gmail.com (cherry picked from commit `73e8e2f9a3`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:07:09 -05:00
Matthew Auld	6df106e93f	drm/xe/bmg: improve cache flushing behaviour The BSpec says that EN_L3_RW_CCS_CACHE_FLUSH must be toggled on for manual global invalidation to take effect and actually flush device cache, however this also turns on flushing for things like pipecontrol, which occurs between submissions for compute/render. This sounds like massive overkill for our needs, where we already have the manual flushing on the display side with the global invalidation. Some observations on BMG: 1. Disabling l2 caching for host writes and stubbing out the driver global invalidation but keeping EN_L3_RW_CCS_CACHE_FLUSH enabled, has no impact on wb-transient-vs-display IGT, which makes sense since the pipecontrol is now flushing the device cache after the render copy. Without EN_L3_RW_CCS_CACHE_FLUSH the test then fails, which is also expected since device cache is now dirty and display engine can't see the writes. 2. Disabling EN_L3_RW_CCS_CACHE_FLUSH, but keeping the driver global invalidation also has no impact on wb-transient-vs-display. This suggests that the global invalidation still works as expected and is flushing the device cache without EN_L3_RW_CCS_CACHE_FLUSH turned on. With that drop EN_L3_RW_CCS_CACHE_FLUSH. This helps some workloads since we no longer flush the device cache between submissions as part of pipecontrol. Edit: We now also have clarification from HW side that BSpec was indeed wrong here. v2: - Rebase and update commit message. BSpec: 71718 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Vitasta Wattal <vitasta.wattal@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007074541.33937-2-matthew.auld@intel.com (cherry picked from commit `67ec9f87bd`) [ Fix conflict due to changed xe_mmio_write32() signature ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Matthew Auld	816b186ce2	drm/xe/xe_sync: initialise ufence.signalled We can incorrectly think that the fence has signalled, if we get a non-zero value here from the kmalloc, which is quite plausible. Just use kzalloc to prevent stuff like this. Fixes: `977e5b82e0` ("drm/xe: Expose user fence from xe_sync_entry") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011133633.388008-2-matthew.auld@intel.com (cherry picked from commit `26f69e88dc`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Nirmoy Das	4e8b5a1651	drm/xe/ufence: ufence can be signaled right after wait_woken do_comapre() can return success after a timedout wait_woken() which was treated as -ETIME. The loop calling wait_woken() sets correct err so there is no need to re-evaluate err. v2: Remove entire check that reevaluate err at the end(Matt) Fixes: `e670f0b4ef` ("drm/xe/uapi: Return correct error code for xe_wait_user_fence_ioctl") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630 Cc: stable@vger.kernel.org # v6.8+ Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011151029.4160630-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit `ec7e6a1d52`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Matthew Brost	e7518276e9	drm/xe: Use bookkeep slots for external BO's in exec IOCTL Fix external BO's dma-resv usage in exec IOCTL using bookkeep slots rather than write slots. This leaves syncing to user space rather than the KMD blindly enforcing write semantics on every external BO. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Kenneth Graunke <kenneth.w.graunke@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reported-by: Simona Vetter <simona.vetter@ffwll.ch> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2673 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240911152622.903058-1-matthew.brost@intel.com (cherry picked from commit `b8b1163248`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Lucas De Marchi	477d665e9b	drm/xe/query: Increase timestamp width Starting with Xe2 the timestamp is a full 64 bit counter, contrary to the 36 bit that was available before. Although 36 should be sufficient for any reasonable delta calculation (for Xe2, of about 30min), it's surprising to userspace to get something truncated. Also if the timestamp being compared to is coming from the GPU and the application is not careful enough to apply the width there, a delta calculation would be wrong. Extend it to full 64-bits starting with Xe2. v2: Expand width=64 to media gt, as it's just a wrong tagging in the spec - empirical tests show it goes beyond 36 bits and match the engines for the main gt Bspec: 60411 Cc: Szymon Morek <szymon.morek@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011035618.1057602-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `9d559cdcb2`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Matthew Brost	82926f52d7	drm/xe: Don't free job in TDR Freeing job in TDR is not safe as TDR can pass the run_job thread resulting in UAF. It is only safe for free job to naturally be called by the scheduler. Rather free job in TDR, add to pending list. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2811 Cc: Matthew Auld <matthew.auld@intel.com> Fixes: `e275d61c5f` ("drm/xe/guc: Handle timing out of signaled jobs gracefully") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003001657.3517883-3-matthew.brost@intel.com (cherry picked from commit `ea2f6a77d0`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Matthew Brost	ed931fb40e	drm/xe: Take job list lock in xe_sched_add_pending_job A fragile micro optimization in xe_sched_add_pending_job relied on both the GPU scheduler being stopped and fence signaling stopped to safely add a job to the pending list without the job list lock in xe_sched_add_pending_job. Remove this optimization and just take the job list lock. Fixes: `7ddb9403dd` ("drm/xe: Sample ctx timestamp to determine if jobs have timed out") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003001657.3517883-2-matthew.brost@intel.com (cherry picked from commit `90521df5fc`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Matthew Auld	761f916af4	drm/xe: fix unbalanced rpm put() with declare_wedged() Technically the or_reset() means we call the action on failure, however that would lead to unbalanced rpm put(). Move the get() earlier to fix this. It should be extremely unlikely to ever trigger this in practice. Fixes: `90936a0a4c` ("drm/xe: Don't suspend device upon wedge") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009084808.204432-4-matthew.auld@intel.com (cherry picked from commit `a187c1b0a8`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Matthew Auld	03a86c24ae	drm/xe: fix unbalanced rpm put() with fence_fini() Currently we can call fence_fini() twice if something goes wrong when sending the GuC CT for the tlb request, since we signal the fence and return an error, leading to the caller also calling fini() on the error path in the case of stack version of the flow, which leads to an extra rpm put() which might later cause device to enter suspend when it shouldn't. It looks like we can just drop the fini() call since the fence signaller side will already call this for us. There are known mysterious splats with device going to sleep even with an rpm ref, and this could be one candidate. v2 (Matt B): - Prefer warning if we detect double fini() Fixes: `f002702290` ("drm/xe: Hold a PM ref when GT TLB invalidations are inflight") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009084808.204432-3-matthew.auld@intel.com (cherry picked from commit `cfcbc0520d`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Aradhya Bhatia	4ceead37ca	drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg Add workaround (wa) 15016589081 which applies to Xe2_v3_LPG_MD. Xe2_v3_LPG_MD is a Lunar Lake platform with GFX version: 20.04. This wa is type: permanent, and hence is applicable on all steppings. Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009065542.283151-1-aradhya.bhatia@intel.com (cherry picked from commit `8fb1da9f9b`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-16 09:00:22 -05:00
Omar Sandoval	e972b08b91	blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race We're seeing crashes from rq_qos_wake_function that look like this: BUG: unable to handle page fault for address: ffffafe180a40084 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0 Oops: Oops: 0002 [#1] PREEMPT SMP PTI CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40 Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00 RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084 RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011 R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002 R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> try_to_wake_up+0x5a/0x6a0 rq_qos_wake_function+0x71/0x80 __wake_up_common+0x75/0xa0 __wake_up+0x36/0x60 scale_up.part.0+0x50/0x110 wb_timer_fn+0x227/0x450 ... So rq_qos_wake_function() calls wake_up_process(data->task), which calls try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock). p comes from data->task, and data comes from the waitqueue entry, which is stored on the waiter's stack in rq_qos_wait(). Analyzing the core dump with drgn, I found that the waiter had already woken up and moved on to a completely unrelated code path, clobbering what was previously data->task. Meanwhile, the waker was passing the clobbered garbage in data->task to wake_up_process(), leading to the crash. What's happening is that in between rq_qos_wake_function() deleting the waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding that it already got a token and returning. The race looks like this: rq_qos_wait() rq_qos_wake_function() ============================================================== prepare_to_wait_exclusive() data->got_token = true; list_del_init(&curr->entry); if (data.got_token) break; finish_wait(&rqw->wait, &data.wq); ^- returns immediately because list_empty_careful(&wq_entry->entry) is true ... return, go do something else ... wake_up_process(data->task) (NO LONGER VALID!)-^ Normally, finish_wait() is supposed to synchronize against the waker. But, as noted above, it is returning immediately because the waitqueue entry has already been removed from the waitqueue. The bug is that rq_qos_wake_function() is accessing the waitqueue entry AFTER deleting it. Note that autoremove_wake_function() wakes the waiter and THEN deletes the waitqueue entry, which is the proper order. Fix it by swapping the order. We also need to use list_del_init_careful() to match the list_empty_careful() in finish_wait(). Fixes: `38cfb5a45e` ("blk-wbt: improve waking of tasks") Cc: stable@vger.kernel.org Signed-off-by: Omar Sandoval <osandov@fb.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-16 07:20:14 -06:00
Jens Axboe	858e686a30	io_uring/rsrc: ignore dummy_ubuf for buffer cloning For placeholder buffers, &dummy_ubuf is assigned which is a static value. When buffers are attempted cloned, don't attempt to grab a reference to it, as we both don't need it and it'll actively fail as dummy_ubuf doesn't have a valid reference count setup. Link: https://lore.kernel.org/io-uring/Zw8dkUzsxQ5LgAJL@ly-workstation/ Reported-by: Lai, Yi <yi1.lai@linux.intel.com> Fixes: `7cc2a6eadc` ("io_uring: add IORING_REGISTER_COPY_BUFFERS method") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-16 07:09:25 -06:00
Ryusuke Konishi	6ed469df0b	nilfs2: fix kernel bug due to missing clearing of buffer delay flag Syzbot reported that after nilfs2 reads a corrupted file system image and degrades to read-only, the BUG_ON check for the buffer delay flag in submit_bh_wbc() may fail, causing a kernel bug. This is because the buffer delay flag is not cleared when clearing the buffer state flags to discard a page/folio or a buffer head. So, fix this. This became necessary when the use of nilfs2's own page clear routine was expanded. This state inconsistency does not occur if the buffer is written normally by log writing. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Link: https://lore.kernel.org/r/20241015213300.7114-1-konishi.ryusuke@gmail.com Fixes: `8c26c4e269` ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption") Reported-by: syzbot+985ada84bf055a575c07@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=985ada84bf055a575c07 Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-16 15:05:32 +02:00
Imre Deak	2f54e71359	drm/i915/dp_mst: Don't require DSC hblank quirk for a non-DSC compatible mode If an MST branch device doesn't support DSC for a given mode, but the MST link has enough BW for the mode, assume that the branch device does support the mode using an uncompressed stream. Fixes: `55eaef1641` ("drm/i915/dp_mst: Handle the Synaptics HBlank expansion quirk") Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009110135.1216498-2-imre.deak@intel.com (cherry picked from commit `4e75c3e208`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2024-10-16 14:56:40 +03:00
Imre Deak	69b3d87212	drm/i915/dp_mst: Handle error during DSC BW overhead/slice calculation The MST branch device may not support the number of DSC slices a mode requires, handle the error in this case. Fixes: `4e0837a8d0` ("drm/i915/dp_mst: Account for FEC and DSC overhead during BW allocation") Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009110135.1216498-1-imre.deak@intel.com (cherry picked from commit `802a69b6b8`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2024-10-16 14:56:15 +03:00
Amir Goldstein	20121d3f58	fuse: update inode size after extending passthrough write yangyun reported that libfuse test test_copy_file_range() copies zero bytes from a newly written file when fuse passthrough is enabled. The reason is that extending passthrough write is not updating the fuse inode size and when vfs_copy_file_range() observes a zero size inode, it returns without calling the filesystem copy_file_range() method. Fix this by adjusting the fuse inode size after an extending passthrough write. This does not provide cache coherency of fuse inode attributes and backing inode attributes, but it should prevent situations where fuse inode size is too small, causing read/copy to be wrongly shortened. Reported-by: yangyun <yangyun50@huawei.com> Closes: https://github.com/libfuse/libfuse/issues/1048 Fixes: `57e1176e60` ("fuse: implement read/write passthrough") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2024-10-16 13:18:21 +02:00
Amir Goldstein	f03b296e8b	fs: pass offset and result to backing_file end_write() callback This is needed for extending fuse inode size after fuse passthrough write. Suggested-by: Miklos Szeredi <miklos@szeredi.hu> Link: https://lore.kernel.org/linux-fsdevel/CAJfpegs=cvZ_NYy6Q_D42XhYS=Sjj5poM1b5TzXzOVvX=R36aA@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2024-10-16 13:17:45 +02:00
Heiko Carstens	b4fa00fd42	s390: Update defconfigs Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2024-10-16 11:32:32 +02:00
Heiko Carstens	223e7fb979	s390: Initialize psw mask in perf_arch_fetch_caller_regs() Also initialize regs->psw.mask in perf_arch_fetch_caller_regs(). This way user_mode(regs) will return false, like it should. It looks like all current users initialize regs to zero, so that this doesn't fix a bug currently. However it is better to not rely on callers to do this. Fixes: `914d52e464` ("s390: implement perf_arch_fetch_caller_regs") Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2024-10-16 11:32:32 +02:00
Thomas Weißschuh	dee3df68ab	s390/sclp_vt220: Convert newlines to CRLF instead of LFCR According to the VT220 specification the possible character combinations sent on RETURN are only CR or CRLF [0]. The Return key sends either a CR character (0/13) or a CR character (0/13) and an LF character (0/10), depending on the set/reset state of line feed/new line mode (LNM). The sclp/vt220 driver however uses LFCR. This can confuse tools, for example the kunit runner. Link: https://vt100.net/docs/vt220-rm/chapter3.html#S3.2 Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Link: https://lore.kernel.org/r/20241014-s390-kunit-v1-2-941defa765a6@linutronix.de Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2024-10-16 11:32:32 +02:00
Thomas Weißschuh	0d9dc27df2	s390/sclp: Deactivate sclp after all its users On reboot the SCLP interface is deactivated through a reboot notifier. This happens before other components using SCLP have the chance to run their own reboot notifiers. Two of those components are the SCLP console and tty drivers which try to flush the last outstanding messages. At that point the SCLP interface is already unusable and the messages are discarded. Execute sclp_deactivate() as late as possible to avoid this issue. Fixes: `4ae46db99c` ("s390/consoles: improve panic notifiers reliability") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Link: https://lore.kernel.org/r/20241014-s390-kunit-v1-1-941defa765a6@linutronix.de Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2024-10-16 11:32:32 +02:00
Holger Dengler	9b52ddeb46	s390/pkey_pckmo: Return with success for valid protected key types The key_to_protkey handler function in module pkey_pckmo should return with success on all known protected key types, including the new types introduced by `fd197556ee` ("s390/pkey: Add AES xts and HMAC clear key token support"). Fixes: `fd197556ee` ("s390/pkey: Add AES xts and HMAC clear key token support") Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2024-10-16 11:32:32 +02:00
Vasiliy Kovalev	164cd0e077	ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2 The cached version avoids redundant commands to the codec, improving stability and reducing unnecessary operations. This change ensures better power management and reliable restoration of pin configurations, especially after hibernation (S4) and other power transitions. Fixes: `9988844c45` ("ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2") Suggested-by: Kai-Heng Feng <kaihengf@nvidia.com> Suggested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org> Link: https://patch.msgid.link/20241016080713.46801-1-kovalev@altlinux.org Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-16 10:29:57 +02:00
Kevin Groeneveld	9499327714	usb: gadget: f_uac2: fix return value for UAC2_ATTRIBUTE_STRING store The configfs store callback should return the number of bytes consumed not the total number of bytes we actually stored. These could differ if for example the passed in string had a newline we did not store. If the returned value does not match the number of bytes written the writer might assume a failure or keep trying to write the remaining bytes. For example the following command will hang trying to write the final newline over and over again (tested on bash 2.05b): echo foo > function_name Fixes: `993a44fa85` ("usb: gadget: f_uac2: allow changing interface name via configfs") Cc: stable <stable@kernel.org> Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com> Link: https://lore.kernel.org/r/20241006232637.4267-1-kgroeneveld@lenbrook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-16 10:28:28 +02:00
Roger Quadros	705e3ce37b	usb: dwc3: core: Fix system suspend on TI AM62 platforms Since commit `6d73572206` ("usb: dwc3: core: Prevent phy suspend during init"), system suspend is broken on AM62 TI platforms. Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY bits (hence forth called 2 SUSPHY bits) were being set during core initialization and even during core re-initialization after a system suspend/resume. These bits are required to be set for system suspend/resume to work correctly on AM62 platforms. Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget driver is not loaded and started. For Host mode, the 2 SUSPHY bits are set before the first system suspend but get cleared at system resume during core re-init and are never set again. This patch resovles these two issues by ensuring the 2 SUSPHY bits are set before system suspend and restored to the original state during system resume. Cc: stable@vger.kernel.org # v6.9+ Fixes: `6d73572206` ("usb: dwc3: core: Prevent phy suspend during init") Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/ Signed-off-by: Roger Quadros <rogerq@kernel.org> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Tested-by: Markus Schneider-Pargmann <msp@baylibre.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-16 10:25:48 +02:00
Henry Lin	7d381137cb	xhci: tegra: fix checked USB2 port number If USB virtualizatoin is enabled, USB2 ports are shared between all Virtual Functions. The USB2 port number owned by an USB2 root hub in a Virtual Function may be less than total USB2 phy number supported by the Tegra XUSB controller. Using total USB2 phy number as port number to check all PORTSC values would cause invalid memory access. [ 116.923438] Unable to handle kernel paging request at virtual address 006c622f7665642f ... [ 117.213640] Call trace: [ 117.216783] tegra_xusb_enter_elpg+0x23c/0x658 [ 117.222021] tegra_xusb_runtime_suspend+0x40/0x68 [ 117.227260] pm_generic_runtime_suspend+0x30/0x50 [ 117.232847] __rpm_callback+0x84/0x3c0 [ 117.237038] rpm_suspend+0x2dc/0x740 [ 117.241229] pm_runtime_work+0xa0/0xb8 [ 117.245769] process_scheduled_works+0x24c/0x478 [ 117.251007] worker_thread+0x23c/0x328 [ 117.255547] kthread+0x104/0x1b0 [ 117.259389] ret_from_fork+0x10/0x20 [ 117.263582] Code: 54000222 f9461ae8 f8747908 b4ffff48 (f9400100) Cc: stable@vger.kernel.org # v6.3+ Fixes: `a30951d31b` ("xhci: tegra: USB2 pad power controls") Signed-off-by: Henry Lin <henryl@nvidia.com> Link: https://lore.kernel.org/r/20241014042134.27664-1-henryl@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-16 10:25:43 +02:00
Prashanth K	c96e312521	usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG DWC3 programming guide mentions that when operating in USB2.0 speeds, if GUSB2PHYCFG[6] or GUSB2PHYCFG[8] is set, it must be cleared prior to issuing commands and may be set again after the command completes. But currently while issuing EndXfer command without CmdIOC set, we wait for 1ms after GUSB2PHYCFG is restored. This results in cases where EndXfer command doesn't get completed and causes SMMU faults since requests are unmapped afterwards. Hence restore GUSB2PHYCFG after waiting for EndXfer command completion. Cc: stable@vger.kernel.org Fixes: `1d26ba0944` ("usb: dwc3: Wait unconditionally after issuing EndXfer command") Signed-off-by: Prashanth K <quic_prashk@quicinc.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20240924093208.2524531-1-quic_prashk@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-16 10:25:28 +02:00
Jonathan Marek	ffe85c24d7	usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEF This line is overwriting the result of the above switch-case. This fixes the tcpm driver getting stuck in a "Sink TX No Go" loop. Fixes: `a4422ff221` ("usb: typec: qcom: Add Qualcomm PMIC Type-C driver") Cc: stable <stable@kernel.org> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241005144146.2345-1-jonathan@marek.ca Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-16 10:25:03 +02:00
Thadeu Lima de Souza Cascardo	befab3a278	usb: typec: altmode should keep reference to parent The altmode device release refers to its parent device, but without keeping a reference to it. When registering the altmode, get a reference to the parent and put it in the release function. Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues like this: [ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000) [ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000) [ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000) [ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000) [ 46.612867] ================================================================== [ 46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129 [ 46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48 [ 46.614538] [ 46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535 [ 46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 [ 46.616042] Workqueue: events kobject_delayed_cleanup [ 46.616446] Call Trace: [ 46.616648] <TASK> [ 46.616820] dump_stack_lvl+0x5b/0x7c [ 46.617112] ? typec_altmode_release+0x38/0x129 [ 46.617470] print_report+0x14c/0x49e [ 46.617769] ? rcu_read_unlock_sched+0x56/0x69 [ 46.618117] ? __virt_addr_valid+0x19a/0x1ab [ 46.618456] ? kmem_cache_debug_flags+0xc/0x1d [ 46.618807] ? typec_altmode_release+0x38/0x129 [ 46.619161] kasan_report+0x8d/0xb4 [ 46.619447] ? typec_altmode_release+0x38/0x129 [ 46.619809] ? process_scheduled_works+0x3cb/0x85f [ 46.620185] typec_altmode_release+0x38/0x129 [ 46.620537] ? process_scheduled_works+0x3cb/0x85f [ 46.620907] device_release+0xaf/0xf2 [ 46.621206] kobject_delayed_cleanup+0x13b/0x17a [ 46.621584] process_scheduled_works+0x4f6/0x85f [ 46.621955] ? __pfx_process_scheduled_works+0x10/0x10 [ 46.622353] ? hlock_class+0x31/0x9a [ 46.622647] ? lock_acquired+0x361/0x3c3 [ 46.622956] ? move_linked_works+0x46/0x7d [ 46.623277] worker_thread+0x1ce/0x291 [ 46.623582] ? __kthread_parkme+0xc8/0xdf [ 46.623900] ? __pfx_worker_thread+0x10/0x10 [ 46.624236] kthread+0x17e/0x190 [ 46.624501] ? kthread+0xfb/0x190 [ 46.624756] ? __pfx_kthread+0x10/0x10 [ 46.625015] ret_from_fork+0x20/0x40 [ 46.625268] ? __pfx_kthread+0x10/0x10 [ 46.625532] ret_from_fork_asm+0x1a/0x30 [ 46.625805] </TASK> [ 46.625953] [ 46.626056] Allocated by task 678: [ 46.626287] kasan_save_stack+0x24/0x44 [ 46.626555] kasan_save_track+0x14/0x2d [ 46.626811] __kasan_kmalloc+0x3f/0x4d [ 46.627049] __kmalloc_noprof+0x1bf/0x1f0 [ 46.627362] typec_register_port+0x23/0x491 [ 46.627698] cros_typec_probe+0x634/0xbb6 [ 46.628026] platform_probe+0x47/0x8c [ 46.628311] really_probe+0x20a/0x47d [ 46.628605] device_driver_attach+0x39/0x72 [ 46.628940] bind_store+0x87/0xd7 [ 46.629213] kernfs_fop_write_iter+0x1aa/0x218 [ 46.629574] vfs_write+0x1d6/0x29b [ 46.629856] ksys_write+0xcd/0x13b [ 46.630128] do_syscall_64+0xd4/0x139 [ 46.630420] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 46.630820] [ 46.630946] Freed by task 48: [ 46.631182] kasan_save_stack+0x24/0x44 [ 46.631493] kasan_save_track+0x14/0x2d [ 46.631799] kasan_save_free_info+0x3f/0x4d [ 46.632144] __kasan_slab_free+0x37/0x45 [ 46.632474] kfree+0x1d4/0x252 [ 46.632725] device_release+0xaf/0xf2 [ 46.633017] kobject_delayed_cleanup+0x13b/0x17a [ 46.633388] process_scheduled_works+0x4f6/0x85f [ 46.633764] worker_thread+0x1ce/0x291 [ 46.634065] kthread+0x17e/0x190 [ 46.634324] ret_from_fork+0x20/0x40 [ 46.634621] ret_from_fork_asm+0x1a/0x30 Fixes: `8a37d87d72` ("usb: typec: Bus type for alternate modes") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241004123738.2964524-1-cascardo@igalia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-16 10:22:52 +02:00
Andrey Konovalov	92682f3460	MAINTAINERS: usb: raw-gadget: add bug tracker link Add a link to the GitHub repository where Raw Gadget issues are managed. Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://lore.kernel.org/r/20241012225853.118217-1-andrey.konovalov@linux.dev Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-16 10:18:58 +02:00
Sakari Ailus	0240b293ec	MAINTAINERS: Add an entry for the LJCA drivers Add a MAINTAINERS entry for the Intel La Jolla Cove Adapter (LJCA) set of drivers. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20241011070414.3124-1-sakari.ailus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-16 10:18:54 +02:00
Qianqiang Liu	cd843399d7	crypto: lib/mpi - Fix an "Uninitialized scalar variable" issue The "err" variable may be returned without an initialized value. Fixes: `8e3a67f2de` ("crypto: lib/mpi - Add error checks to extension") Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2024-10-16 13:38:16 +08:00
Dr. David Alan Gilbert	6aca91c416	cifs: Remove unused functions cifs_ses_find_chan() has been unused since commit `f486ef8e20` ("cifs: use the chans_need_reconnect bitmap for reconnect status") cifs_read_page_from_socket() has been unused since commit `d08089f649` ("cifs: Change the I/O paths to use an iterator rather than a page list") cifs_chan_in_reconnect() has been unused since commit `bc962159e8` ("cifs: avoid race conditions with parallel reconnects") Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-10-16 00:30:52 -05:00
Advait Dhamorikar	3dfea293f4	smb/client: Fix logically dead code The if condition in collect_sample: can never be satisfied because of a logical contradiction. The indicated dead code may have performed some action; that action will never occur. Fixes: `94ae8c3fee` ("smb: client: compress: LZ77 code improvements cleanup") Signed-off-by: Advait Dhamorikar <advaitdhamorikar@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-10-16 00:30:52 -05:00
Paulo Alcantara	1ab60323c5	smb: client: fix OOBs when building SMB2_IOCTL request When using encryption, either enforced by the server or when using 'seal' mount option, the client will squash all compound request buffers down for encryption into a single iov in smb2_set_next_command(). SMB2_ioctl_init() allocates a small buffer (448 bytes) to hold the SMB2_IOCTL request in the first iov, and if the user passes an input buffer that is greater than 328 bytes, smb2_set_next_command() will end up writing off the end of @rqst->iov[0].iov_base as shown below: mount.cifs //srv/share /mnt -o ...,seal ln -s $(perl -e "print('a')for 1..1024") /mnt/link BUG: KASAN: slab-out-of-bounds in smb2_set_next_command.cold+0x1d6/0x24c [cifs] Write of size 4116 at addr ffff8881148fcab8 by task ln/859 CPU: 1 UID: 0 PID: 859 Comm: ln Not tainted 6.12.0-rc3 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x5d/0x80 ? smb2_set_next_command.cold+0x1d6/0x24c [cifs] print_report+0x156/0x4d9 ? smb2_set_next_command.cold+0x1d6/0x24c [cifs] ? __virt_addr_valid+0x145/0x310 ? __phys_addr+0x46/0x90 ? smb2_set_next_command.cold+0x1d6/0x24c [cifs] kasan_report+0xda/0x110 ? smb2_set_next_command.cold+0x1d6/0x24c [cifs] kasan_check_range+0x10f/0x1f0 __asan_memcpy+0x3c/0x60 smb2_set_next_command.cold+0x1d6/0x24c [cifs] smb2_compound_op+0x238c/0x3840 [cifs] ? kasan_save_track+0x14/0x30 ? kasan_save_free_info+0x3b/0x70 ? vfs_symlink+0x1a1/0x2c0 ? do_symlinkat+0x108/0x1c0 ? __pfx_smb2_compound_op+0x10/0x10 [cifs] ? kmem_cache_free+0x118/0x3e0 ? cifs_get_writable_path+0xeb/0x1a0 [cifs] smb2_get_reparse_inode+0x423/0x540 [cifs] ? __pfx_smb2_get_reparse_inode+0x10/0x10 [cifs] ? rcu_is_watching+0x20/0x50 ? __kmalloc_noprof+0x37c/0x480 ? smb2_create_reparse_symlink+0x257/0x490 [cifs] ? smb2_create_reparse_symlink+0x38f/0x490 [cifs] smb2_create_reparse_symlink+0x38f/0x490 [cifs] ? __pfx_smb2_create_reparse_symlink+0x10/0x10 [cifs] ? find_held_lock+0x8a/0xa0 ? hlock_class+0x32/0xb0 ? __build_path_from_dentry_optional_prefix+0x19d/0x2e0 [cifs] cifs_symlink+0x24f/0x960 [cifs] ? __pfx_make_vfsuid+0x10/0x10 ? __pfx_cifs_symlink+0x10/0x10 [cifs] ? make_vfsgid+0x6b/0xc0 ? generic_permission+0x96/0x2d0 vfs_symlink+0x1a1/0x2c0 do_symlinkat+0x108/0x1c0 ? __pfx_do_symlinkat+0x10/0x10 ? strncpy_from_user+0xaa/0x160 __x64_sys_symlinkat+0xb9/0xf0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f08d75c13bb Reported-by: David Howells <dhowells@redhat.com> Fixes: `e77fe73c7e` ("cifs: we can not use small padding iovs together with encryption") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-10-16 00:30:52 -05:00
Su Hui	19ebc1e6ca	smb: client: fix possible double free in smb2_set_ea() Clang static checker(scan-build) warning： fs/smb/client/smb2ops.c:1304:2: Attempt to free released memory. 1304 \| kfree(ea); \| ^~~~~~~~~ There is a double free in such case: 'ea is initialized to NULL' -> 'first successful memory allocation for ea' -> 'something failed, goto sea_exit' -> 'first memory release for ea' -> 'goto replay_again' -> 'second goto sea_exit before allocate memory for ea' -> 'second memory release for ea resulted in double free'. Re-initialie 'ea' to NULL near to the replay_again label, it can fix this double free problem. Fixes: `4f1fffa237` ("cifs: commands that are retried should have replay flag set") Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-10-16 00:25:54 -05:00
Mario Limonciello	18d9b52271	cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled When boost has been disabled the limit for perf should be nominal perf not the highest perf. Using the latter to do calculations will lead to incorrect values that are still above nominal. Fixes: `ad4caad58d` ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()") Reported-by: Peter Jung <ptr1337@cachyos.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219348 Reviewed-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Link: https://lore.kernel.org/r/20241012174519.897-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>	2024-10-15 23:54:15 -05:00
Linus Torvalds	dff6584301	Merge tag 'sched_ext-for-6.12-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - More issues reported in the enable/disable paths on large machines with many tasks due to scx_tasks_lock being held too long. Break up the task iterations - Remove ops.select_cpu() dependency in bypass mode so that a misbehaving implementation can't live-lock the machine by pushing all tasks to few CPUs in bypass mode - Other misc fixes * tag 'sched_ext-for-6.12-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Remove unnecessary cpu_relax() sched_ext: Don't hold scx_tasks_lock for too long sched_ext: Move scx_tasks_lock handling into scx_task_iter helpers sched_ext: bypass mode shouldn't depend on ops.select_cpu() sched_ext: Move scx_buildin_idle_enabled check to scx_bpf_select_cpu_dfl() sched_ext: Start schedulers with consistent p->scx.slice values Revert "sched_ext: Use shorter slice while bypassing" sched_ext: use correct function name in pick_task_scx() warning message selftests: sched_ext: Add sched_ext as proper selftest target	2024-10-15 19:47:19 -07:00
Wang Hai	fca6caeb4a	scsi: target: core: Fix null-ptr-deref in target_alloc_device() There is a null-ptr-deref issue reported by KASAN: BUG: KASAN: null-ptr-deref in target_alloc_device+0xbc4/0xbe0 [target_core_mod] ... kasan_report+0xb9/0xf0 target_alloc_device+0xbc4/0xbe0 [target_core_mod] core_dev_setup_virtual_lun0+0xef/0x1f0 [target_core_mod] target_core_init_configfs+0x205/0x420 [target_core_mod] do_one_initcall+0xdd/0x4e0 ... entry_SYSCALL_64_after_hwframe+0x76/0x7e In target_alloc_device(), if allocing memory for dev queues fails, then dev will be freed by dev->transport->free_device(), but dev->transport is not initialized at that time, which will lead to a null pointer reference problem. Fixing this bug by freeing dev with hba->backend->ops->free_device(). Fixes: `1526d9f10c` ("scsi: target: Make state_list per CPU") Signed-off-by: Wang Hai <wanghai38@huawei.com> Link: https://lore.kernel.org/r/20241011113444.40749-1-wanghai38@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-10-15 22:25:52 -04:00
Ranjan Kumar	b9e63d6c7c	scsi: mpi3mr: Validate SAS port assignments A sanity check on phy_mask was added in commit `3668651def` ("scsi: mpi3mr: Sanitise num_phys"). This causes warning messages when more than 64 phys are detected and devices connected to phys greater than 64 are dropped. The phy_mask bitmap is only needed for controller phys and not required for expander phys. Controller phys can go up to a maximum of 64 and therefore u64 is good enough to contain phy_mask bitmap. To suppress those warnings and allow devices to be discovered as before the offending commit, restrict the phy_mask setting and lowest phy setting only to the controller phys. Fixes: `3668651def` ("scsi: mpi3mr: Sanitise num_phys") Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410051943.Mp9o5DlF-lkp@intel.com/ Reported-by: Alexander Motin <mav@ixsystems.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241008074353.200379-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-10-15 22:17:17 -04:00
Seunghwan Baek	19a198b677	scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down There is a history of deadlock if reboot is performed at the beginning of booting. SDEV_QUIESCE was set for all LU's scsi_devices by UFS shutdown, and at that time the audio driver was waiting on blk_mq_submit_bio() holding a mutex_lock while reading the fw binary. After that, a deadlock issue occurred while audio driver shutdown was waiting for mutex_unlock of blk_mq_submit_bio(). To solve this, set SDEV_OFFLINE for all LUs except WLUN, so that any I/O that comes down after a UFS shutdown will return an error. [ 31.907781]I[0: swapper/0: 0] 1 130705007 1651079834 11289729804 0 D( 2) 3 ffffff882e208000 * init [device_shutdown] [ 31.907793]I[0: swapper/0: 0] Mutex: 0xffffff8849a2b8b0: owner[0xffffff882e28cb00 kworker/6:0 :49] [ 31.907806]I[0: swapper/0: 0] Call trace: [ 31.907810]I[0: swapper/0: 0] __switch_to+0x174/0x338 [ 31.907819]I[0: swapper/0: 0] __schedule+0x5ec/0x9cc [ 31.907826]I[0: swapper/0: 0] schedule+0x7c/0xe8 [ 31.907834]I[0: swapper/0: 0] schedule_preempt_disabled+0x24/0x40 [ 31.907842]I[0: swapper/0: 0] __mutex_lock+0x408/0xdac [ 31.907849]I[0: swapper/0: 0] __mutex_lock_slowpath+0x14/0x24 [ 31.907858]I[0: swapper/0: 0] mutex_lock+0x40/0xec [ 31.907866]I[0: swapper/0: 0] device_shutdown+0x108/0x280 [ 31.907875]I[0: swapper/0: 0] kernel_restart+0x4c/0x11c [ 31.907883]I[0: swapper/0: 0] __arm64_sys_reboot+0x15c/0x280 [ 31.907890]I[0: swapper/0: 0] invoke_syscall+0x70/0x158 [ 31.907899]I[0: swapper/0: 0] el0_svc_common+0xb4/0xf4 [ 31.907909]I[0: swapper/0: 0] do_el0_svc+0x2c/0xb0 [ 31.907918]I[0: swapper/0: 0] el0_svc+0x34/0xe0 [ 31.907928]I[0: swapper/0: 0] el0t_64_sync_handler+0x68/0xb4 [ 31.907937]I[0: swapper/0: 0] el0t_64_sync+0x1a0/0x1a4 [ 31.908774]I[0: swapper/0: 0] 49 0 11960702 11236868007 0 D( 2) 6 ffffff882e28cb00 * kworker/6:0 [__bio_queue_enter] [ 31.908783]I[0: swapper/0: 0] Call trace: [ 31.908788]I[0: swapper/0: 0] __switch_to+0x174/0x338 [ 31.908796]I[0: swapper/0: 0] __schedule+0x5ec/0x9cc [ 31.908803]I[0: swapper/0: 0] schedule+0x7c/0xe8 [ 31.908811]I[0: swapper/0: 0] __bio_queue_enter+0xb8/0x178 [ 31.908818]I[0: swapper/0: 0] blk_mq_submit_bio+0x194/0x67c [ 31.908827]I[0: swapper/0: 0] __submit_bio+0xb8/0x19c Fixes: `b294ff3e34` ("scsi: ufs: core: Enable power management for wlun") Cc: stable@vger.kernel.org Signed-off-by: Seunghwan Baek <sh8267.baek@samsung.com> Link: https://lore.kernel.org/r/20240829093913.6282-2-sh8267.baek@samsung.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-10-15 22:11:11 -04:00
Peter Wang	8fa075804c	scsi: ufs: core: Requeue aborted request After the SQ cleanup fix, the CQ will receive a response with the corresponding tag marked as OCS: ABORTED. To align with the behavior of Legacy SDB mode, the handling of OCS: ABORTED has been changed to match that of OCS_INVALID_COMMAND_STATUS (SDB), with both returning a SCSI result of DID_REQUEUE. Furthermore, the workaround implemented before the SQ cleanup fix can be removed. Fixes: `ab248643d3` ("scsi: ufs: core: Add error handling for MCQ mode") Cc: stable@vger.kernel.org Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241001091917.6917-3-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-10-15 22:07:32 -04:00
Peter Wang	bf0c6cc73f	scsi: ufs: core: Fix the issue of ICU failure When setting the ICU bit without using read-modify-write, SQRTCy will restart SQ again and receive an RTC return error code 2 (Failure - SQ not stopped). Additionally, the error log has been modified so that this type of error can be observed. Fixes: `ab248643d3` ("scsi: ufs: core: Add error handling for MCQ mode") Cc: stable@vger.kernel.org Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241001091917.6917-2-peter.wang@mediatek.com Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-10-15 22:07:31 -04:00
Vladimir Oltean	11d06f0aae	net: dsa: vsc73xx: fix reception from VLAN-unaware bridges Similar to the situation described for sja1105 in commit `1f9fc48fd3` ("net: dsa: sja1105: fix reception from VLAN-unaware bridges"), the vsc73xx driver uses tag_8021q and doesn't need the ds->untag_bridge_pvid request. In fact, this option breaks packet reception. The ds->untag_bridge_pvid option strips VLANs from packets received on VLAN-unaware bridge ports. But those VLANs should already be stripped by tag_vsc73xx_8021q.c as part of vsc73xx_rcv() - they are not VLANs in VLAN-unaware mode, but DSA tags. Thus, dsa_software_vlan_untag() tries to untag a VLAN that doesn't exist, corrupting the packet. Fixes: `93e4649efa` ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges") Tested-by: Pawel Dembicki <paweldembicki@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://patch.msgid.link/20241014153041.1110364-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 18:41:52 -07:00
Niklas Söderlund	126e799602	net: ravb: Only advertise Rx/Tx timestamps if hardware supports it Recent work moving the reporting of Rx software timestamps to the core [1] highlighted an issue where hardware time stamping was advertised for the platforms where it is not supported. Fix this by covering advertising support for hardware timestamps only if the hardware supports it. Due to the Tx implementation in RAVB software Tx timestamping is also only considered if the hardware supports hardware timestamps. This should be addressed in future, but this fix only reflects what the driver currently implements. 1. Commit `277901ee3a` ("ravb: Remove setting of RX software timestamp") Fixes: `7e09a052dc` ("ravb: Exclude gPTP feature support for RZ/G2L") Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com> Tested-by: Paul Barker <paul.barker.ct@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://patch.msgid.link/20241014124343.3875285-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 18:41:10 -07:00
Jinjie Ruan	217a3d98d1	net: microchip: vcap api: Fix memory leaks in vcap_api_encode_rule_test() Commit `a3c1e45156` ("net: microchip: vcap: Fix use-after-free error in kunit test") fixed the use-after-free error, but introduced below memory leaks by removing necessary vcap_free_rule(), add it to fix it. unreferenced object 0xffffff80ca58b700 (size 192): comm "kunit_try_catch", pid 1215, jiffies 4294898264 hex dump (first 32 bytes): 00 12 7a 00 05 00 00 00 0a 00 00 00 64 00 00 00 ..z.........d... 00 00 00 00 00 00 00 00 00 04 0b cc 80 ff ff ff ................ backtrace (crc 9c09c3fe): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<0000000040a01b8d>] vcap_alloc_rule+0x3cc/0x9c4 [<000000003fe86110>] vcap_api_encode_rule_test+0x1ac/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80cc0b0400 (size 64): comm "kunit_try_catch", pid 1215, jiffies 4294898265 hex dump (first 32 bytes): 80 04 0b cc 80 ff ff ff 18 b7 58 ca 80 ff ff ff ..........X..... 39 00 00 00 02 00 00 00 06 05 04 03 02 01 ff ff 9............... backtrace (crc daf014e9): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528 [<00000000dfdb1e81>] vcap_api_encode_rule_test+0x224/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80cc0b0700 (size 64): comm "kunit_try_catch", pid 1215, jiffies 4294898265 hex dump (first 32 bytes): 80 07 0b cc 80 ff ff ff 28 b7 58 ca 80 ff ff ff ........(.X..... 3c 00 00 00 00 00 00 00 01 2f 03 b3 ec ff ff ff <......../...... backtrace (crc 8d877792): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000006eadfab7>] vcap_rule_add_action+0x2d0/0x52c [<00000000323475d1>] vcap_api_encode_rule_test+0x4d4/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80cc0b0900 (size 64): comm "kunit_try_catch", pid 1215, jiffies 4294898266 hex dump (first 32 bytes): 80 09 0b cc 80 ff ff ff 80 06 0b cc 80 ff ff ff ................ 7d 00 00 00 01 00 00 00 00 00 00 00 ff 00 00 00 }............... backtrace (crc 34181e56): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528 [<00000000991e3564>] vcap_val_rule+0xcf0/0x13e8 [<00000000fc9868e5>] vcap_api_encode_rule_test+0x678/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80cc0b0980 (size 64): comm "kunit_try_catch", pid 1215, jiffies 4294898266 hex dump (first 32 bytes): 18 b7 58 ca 80 ff ff ff 00 09 0b cc 80 ff ff ff ..X............. 67 00 00 00 00 00 00 00 01 01 74 88 c0 ff ff ff g.........t..... backtrace (crc 275fd9be): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528 [<000000001396a1a2>] test_add_def_fields+0xb0/0x100 [<000000006e7621f0>] vcap_val_rule+0xa98/0x13e8 [<00000000fc9868e5>] vcap_api_encode_rule_test+0x678/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 ...... Cc: stable@vger.kernel.org Fixes: `a3c1e45156` ("net: microchip: vcap: Fix use-after-free error in kunit test") Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20241014121922.1280583-1-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 18:39:00 -07:00
Jakub Kicinski	9626c18209	Merge branch 'net-phy-mdio-bcm-unimac-add-bcm6846-variant' Linus Walleij says: ==================== net: phy: mdio-bcm-unimac: Add BCM6846 variant As pointed out by Florian: https://lore.kernel.org/linux-devicetree/b542b2e8-115c-4234-a464-e73aa6bece5c@broadcom.com/ The BCM6846 has a few extra registers and cannot reuse the compatible string from other variants of the Unimac MDIO block: we need to be able to tell them apart. ==================== Link: https://patch.msgid.link/20241012-bcm6846-mdio-v1-0-c703ca83e962@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 18:23:55 -07:00
Linus Walleij	906b77ca91	net: phy: mdio-bcm-unimac: Add BCM6846 support Add Unimac mdio compatible string for the special BCM6846 variant. This variant has a few extra registers compared to other versions. Suggested-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/linux-devicetree/b542b2e8-115c-4234-a464-e73aa6bece5c@broadcom.com/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://patch.msgid.link/20241012-bcm6846-mdio-v1-2-c703ca83e962@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 18:23:53 -07:00
Linus Walleij	6ed97afd75	dt-bindings: net: brcm,unimac-mdio: Add bcm6846-mdio The MDIO block in the BCM6846 is not identical to any of the previous versions, but has extended registers not present in the other variants. For this reason we need to use a new compatible especially for this SoC. Suggested-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/linux-devicetree/b542b2e8-115c-4234-a464-e73aa6bece5c@broadcom.com/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20241012-bcm6846-mdio-v1-1-c703ca83e962@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 18:23:53 -07:00
Jakub Sitnicki	d96016a764	udp: Compute L4 checksum as usual when not segmenting the skb If: 1) the user requested USO, but 2) there is not enough payload for GSO to kick in, and 3) the egress device doesn't offer checksum offload, then we want to compute the L4 checksum in software early on. In the case when we are not taking the GSO path, but it has been requested, the software checksum fallback in skb_segment doesn't get a chance to compute the full checksum, if the egress device can't do it. As a result we end up sending UDP datagrams with only a partial checksum filled in, which the peer will discard. Fixes: `10154dbded` ("udp: Allow GSO transmit from devices with no checksum offload") Reported-by: Ivan Babrou <ivan@cloudflare.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20241011-uso-swcsum-fixup-v2-1-6e1ddc199af9@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 18:12:33 -07:00
Eric Dumazet	56440d7ec2	genetlink: hold RCU in genlmsg_mcast() While running net selftests with CONFIG_PROVE_RCU_LIST=y I saw one lockdep splat [1]. genlmsg_mcast() uses for_each_net_rcu(), and must therefore hold RCU. Instead of letting all callers guard genlmsg_multicast_allns() with a rcu_read_lock()/rcu_read_unlock() pair, do it in genlmsg_mcast(). This also means the @flags parameter is useless, we need to always use GFP_ATOMIC. [1] [10882.424136] ============================= [10882.424166] WARNING: suspicious RCU usage [10882.424309] 6.12.0-rc2-virtme #1156 Not tainted [10882.424400] ----------------------------- [10882.424423] net/netlink/genetlink.c:1940 RCU-list traversed in non-reader section!! [10882.424469] other info that might help us debug this: [10882.424500] rcu_scheduler_active = 2, debug_locks = 1 [10882.424744] 2 locks held by ip/15677: [10882.424791] #0: ffffffffb6b491b0 (cb_lock){++++}-{3:3}, at: genl_rcv (net/netlink/genetlink.c:1219) [10882.426334] #1: ffffffffb6b49248 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg (net/netlink/genetlink.c:61 net/netlink/genetlink.c:57 net/netlink/genetlink.c:1209) [10882.426465] stack backtrace: [10882.426805] CPU: 14 UID: 0 PID: 15677 Comm: ip Not tainted 6.12.0-rc2-virtme #1156 [10882.426919] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [10882.427046] Call Trace: [10882.427131] <TASK> [10882.427244] dump_stack_lvl (lib/dump_stack.c:123) [10882.427335] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822) [10882.427387] genlmsg_multicast_allns (net/netlink/genetlink.c:1940 (discriminator 7) net/netlink/genetlink.c:1977 (discriminator 7)) [10882.427436] l2tp_tunnel_notify.constprop.0 (net/l2tp/l2tp_netlink.c:119) l2tp_netlink [10882.427683] l2tp_nl_cmd_tunnel_create (net/l2tp/l2tp_netlink.c:253) l2tp_netlink [10882.427748] genl_family_rcv_msg_doit (net/netlink/genetlink.c:1115) [10882.427834] genl_rcv_msg (net/netlink/genetlink.c:1195 net/netlink/genetlink.c:1210) [10882.427877] ? __pfx_l2tp_nl_cmd_tunnel_create (net/l2tp/l2tp_netlink.c:186) l2tp_netlink [10882.427927] ? __pfx_genl_rcv_msg (net/netlink/genetlink.c:1201) [10882.427959] netlink_rcv_skb (net/netlink/af_netlink.c:2551) [10882.428069] genl_rcv (net/netlink/genetlink.c:1220) [10882.428095] netlink_unicast (net/netlink/af_netlink.c:1332 net/netlink/af_netlink.c:1357) [10882.428140] netlink_sendmsg (net/netlink/af_netlink.c:1901) [10882.428210] ____sys_sendmsg (net/socket.c:729 (discriminator 1) net/socket.c:744 (discriminator 1) net/socket.c:2607 (discriminator 1)) Fixes: `33f72e6f0c` ("l2tp : multicast notification to the registered listeners") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Cc: Tom Parkin <tparkin@katalix.com> Cc: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20241011171217.3166614-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 17:52:58 -07:00
Peter Rashleigh	1833d8a26f	net: dsa: mv88e6xxx: Fix the max_vid definition for the MV88E6361 According to the Marvell datasheet the 88E6361 has two VTU pages (4k VIDs per page) so the max_vid should be 8191, not 4095. In the current implementation mv88e6xxx_vtu_walk() gives unexpected results because of this error. I verified that mv88e6xxx_vtu_walk() works correctly on the MV88E6361 with this patch in place. Fixes: `12899f2998` ("net: dsa: mv88e6xxx: enable support for 88E6361 switch") Signed-off-by: Peter Rashleigh <peter@rashleigh.ca> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241014204342.5852-1-peter@rashleigh.ca Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 17:50:51 -07:00
Kuniyuki Iwashima	e8c526f2bd	tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink(). Martin KaFai Lau reported use-after-free [0] in reqsk_timer_handler(). """ We are seeing a use-after-free from a bpf prog attached to trace_tcp_retransmit_synack. The program passes the req->sk to the bpf_sk_storage_get_tracing kernel helper which does check for null before using it. """ The commit `83fccfc394` ("inet: fix potential deadlock in reqsk_queue_unlink()") added timer_pending() in reqsk_queue_unlink() not to call del_timer_sync() from reqsk_timer_handler(), but it introduced a small race window. Before the timer is called, expire_timers() calls detach_timer(timer, true) to clear timer->entry.pprev and marks it as not pending. If reqsk_queue_unlink() checks timer_pending() just after expire_timers() calls detach_timer(), TCP will miss del_timer_sync(); the reqsk timer will continue running and send multiple SYN+ACKs until it expires. The reported UAF could happen if req->sk is close()d earlier than the timer expiration, which is 63s by default. The scenario would be 1. inet_csk_complete_hashdance() calls inet_csk_reqsk_queue_drop(), but del_timer_sync() is missed 2. reqsk timer is executed and scheduled again 3. req->sk is accept()ed and reqsk_put() decrements rsk_refcnt, but reqsk timer still has another one, and inet_csk_accept() does not clear req->sk for non-TFO sockets 4. sk is close()d 5. reqsk timer is executed again, and BPF touches req->sk Let's not use timer_pending() by passing the caller context to __inet_csk_reqsk_queue_drop(). Note that reqsk timer is pinned, so the issue does not happen in most use cases. [1] [0] BUG: KFENCE: use-after-free read in bpf_sk_storage_get_tracing+0x2e/0x1b0 Use-after-free read at 0x00000000a891fb3a (in kfence-#1): bpf_sk_storage_get_tracing+0x2e/0x1b0 bpf_prog_5ea3e95db6da0438_tcp_retransmit_synack+0x1d20/0x1dda bpf_trace_run2+0x4c/0xc0 tcp_rtx_synack+0xf9/0x100 reqsk_timer_handler+0xda/0x3d0 run_timer_softirq+0x292/0x8a0 irq_exit_rcu+0xf5/0x320 sysvec_apic_timer_interrupt+0x6d/0x80 asm_sysvec_apic_timer_interrupt+0x16/0x20 intel_idle_irq+0x5a/0xa0 cpuidle_enter_state+0x94/0x273 cpu_startup_entry+0x15e/0x260 start_secondary+0x8a/0x90 secondary_startup_64_no_verify+0xfa/0xfb kfence-#1: 0x00000000a72cc7b6-0x00000000d97616d9, size=2376, cache=TCPv6 allocated by task 0 on cpu 9 at 260507.901592s: sk_prot_alloc+0x35/0x140 sk_clone_lock+0x1f/0x3f0 inet_csk_clone_lock+0x15/0x160 tcp_create_openreq_child+0x1f/0x410 tcp_v6_syn_recv_sock+0x1da/0x700 tcp_check_req+0x1fb/0x510 tcp_v6_rcv+0x98b/0x1420 ipv6_list_rcv+0x2258/0x26e0 napi_complete_done+0x5b1/0x2990 mlx5e_napi_poll+0x2ae/0x8d0 net_rx_action+0x13e/0x590 irq_exit_rcu+0xf5/0x320 common_interrupt+0x80/0x90 asm_common_interrupt+0x22/0x40 cpuidle_enter_state+0xfb/0x273 cpu_startup_entry+0x15e/0x260 start_secondary+0x8a/0x90 secondary_startup_64_no_verify+0xfa/0xfb freed by task 0 on cpu 9 at 260507.927527s: rcu_core_si+0x4ff/0xf10 irq_exit_rcu+0xf5/0x320 sysvec_apic_timer_interrupt+0x6d/0x80 asm_sysvec_apic_timer_interrupt+0x16/0x20 cpuidle_enter_state+0xfb/0x273 cpu_startup_entry+0x15e/0x260 start_secondary+0x8a/0x90 secondary_startup_64_no_verify+0xfa/0xfb Fixes: `83fccfc394` ("inet: fix potential deadlock in reqsk_queue_unlink()") Reported-by: Martin KaFai Lau <martin.lau@kernel.org> Closes: https://lore.kernel.org/netdev/eb6684d0-ffd9-4bdc-9196-33f690c25824@linux.dev/ Link: https://lore.kernel.org/netdev/b55e2ca0-42f2-4b7c-b445-6ffd87ca74a0@linux.dev/ [1] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20241014223312.4254-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 17:47:30 -07:00
Rob Clark	77ad507dbb	drm/msm/a6xx+: Insert a fence wait before SMMU table update The CP_SMMU_TABLE_UPDATE _should_ be waiting for idle, but on some devices (x1-85, possibly others), it seems to pass that barrier while there are still things in the event completion FIFO waiting to be written back to memory. Work around that by adding a fence wait before context switch. The CP_EVENT_WRITE that writes the fence is the last write from a submit, so seeing this value hit memory is a reliable indication that it is safe to proceed with the context switch. v2: Only emit CP_WAIT_TIMESTAMP on a7xx, as it is not supported on a6xx. Conversely, I've not been able to reproduce this issue on a6xx, so hopefully it is limited to a7xx, or perhaps just certain a7xx devices. Fixes: `af66706acc` ("drm/msm/a6xx: Add skeleton A7xx support") Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/63 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-15 17:18:16 -07:00
Wang Hai	fed07d3eb8	net: bcmasp: fix potential memory leak in bcmasp_xmit() The bcmasp_xmit() returns NETDEV_TX_OK without freeing skb in case of mapping fails, add dev_kfree_skb() to fix it. Fixes: `490cb41200` ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20241014145901.48940-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 17:10:27 -07:00
Michael Ellerman	cf8989d20d	powerpc/powernv: Free name on error in opal_event_init() In opal_event_init() if request_irq() fails name is not freed, leading to a memory leak. The code only runs at boot time, there's no way for a user to trigger it, so there's no security impact. Fix the leak by freeing name in the error path. Reported-by: 2639161967 <2639161967@qq.com> Closes: https://lore.kernel.org/linuxppc-dev/87wmjp3wig.fsf@mail.lhotse Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20240920093520.67997-1-mpe@ellerman.id.au	2024-10-16 09:26:50 +11:00
Jessica Zhang	f87f3b80ab	drm/msm/dpu: don't always program merge_3d block Only program the merge_3d block for the video phys encoder when the 3d blend mode is not NONE Fixes: `3e79527a33` ("drm/msm/dpu: enable merge_3d support on sm8150/sm8250") Suggested-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/619095/ Link: https://lore.kernel.org/r/20241009-merge3d-fix-v1-1-0d0b6f5c244e@quicinc.com Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-15 15:00:27 -07:00
Jessica Zhang	40dad89cb8	drm/msm/dpu: Don't always set merge_3d pending flush Don't set the merge_3d pending flush bits if the mode_3d is BLEND_3D_NONE. Always flushing merge_3d can cause timeout issues when there are multiple commits with concurrent writeback enabled. This is because the video phys enc waits for the hw_ctl flush register to be completely cleared [1] in its wait_for_commit_done(), but the WB encoder always sets the merge_3d pending flush during each commit regardless of if the merge_3d is actually active. This means that the hw_ctl flush register will never be 0 when there are multiple CWB commits and the video phys enc will hit vblank timeout errors after the first CWB commit. [1] commit `fe9df3f50c` ("drm/msm/dpu: add real wait_for_commit_done()") Fixes: `3e79527a33` ("drm/msm/dpu: enable merge_3d support on sm8150/sm8250") Fixes: `d7d0e73f7d` ("drm/msm/dpu: introduce the dpu_encoder_phys_* for writeback") Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/619092/ Link: https://lore.kernel.org/r/20241009-mode3d-fix-v1-1-c0258354fadc@quicinc.com Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-15 14:59:20 -07:00
Fabrizio Castro	d038109ac1	irqchip/renesas-rzg2l: Fix missing put_device rzg2l_irqc_common_init() calls of_find_device_by_node(), but the corresponding put_device() call is missing. This also gets reported by make coccicheck. Make use of the cleanup interfaces from cleanup.h to call into __free_put_device(), which in turn calls into put_device when leaving function rzg2l_irqc_common_init() and variable "dev" goes out of scope. To prevent that the device is put on successful completion, assign NULL to "dev" to prevent __free_put_device() from calling into put_device() within the successful path. "make coccicheck" will still complain about missing put_device() calls, but those are false positives now. Fixes: `3fed09559c` ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver") Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241011172003.1242841-1-fabrizio.castro.jz@renesas.com	2024-10-15 23:54:35 +02:00
Sunil V L	a98a0f050c	irqchip/riscv-intc: Fix SMP=n boot with ACPI When CONFIG_SMP is disabled, the static array rintc_acpi_data with size NR_CPUS is not sufficient to hold all RINTC structures passed from the firmware. All RINTC structures are required to configure IMSIC/APLIC/PLIC properly irrespective of SMP in the OS. So, allocate dynamic memory based on the number of RINTC structures in MADT to fix this issue. Fixes: `f8619b66bd` ("irqchip/riscv-intc: Add ACPI support for AIA") Reported-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/all/20241014065739.656959-1-sunilvl@ventanamicro.com Closes: https://github.com/linux-riscv/linux-riscv/actions/runs/11280997511/job/31375229012	2024-10-15 23:14:25 +02:00
Arnd Bergmann	1b59d6c19c	Merge tag 'scmi-fixes-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm SCMI fixes for v6.12 Couple of fixes to address the issues found and reported on Broadcom STB platforms following the recent refactor of all the SCMI transports as standalone drivers. One of the issue is that the effective timeout value is much less than the intended value due to the way mailbox messages are queues in the mailbox framework. Since we block or serialise the shmem access anyway, there is no point in utilizing mailbox queues. The issue is fixed with exclusive lock on the channel when sending the message. The other issues is actually non-issue for upstream, but the workaround is just changing the link order of the transport drivers which enables Broadcom STB platforms to run both upstream and custom downstream kernel without any device tree changes. So pushing this to help them test upstream seamlessly as it has no practical or theoretical impact for others. There is also a fix to address possible double freeing of the name string in scmi_debugfs_common_cleanup() when devm_add_action_or_reset() fails. * tag 'scmi-fixes-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_scmi: Queue in scmi layer for mailbox implementation firmware: arm_scmi: Give SMC transport precedence over mailbox firmware: arm_scmi: Fix the double free in scmi_debugfs_common_setup() Link: https://lore.kernel.org/r/20241015185128.1000604-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-15 20:39:43 +00:00
Arnd Bergmann	aa56d75267	Merge tag 'ffa-fixes-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm FF-A fixes for v6.12 Couple of fixes to avoid string-fortify warnings in export_uuid() and memcpy() from the recently added functions to support FFA_MSG_SEND_DIRECT_REQ2 and FFA_MSG_SEND_DIRECT_RESP2. * tag 'ffa-fixes-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Avoid string-fortify warning caused by memcpy() firmware: arm_ffa: Avoid string-fortify warning in export_uuid() Link: https://lore.kernel.org/r/20241015185037.1000435-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-15 20:38:27 +00:00
Arnd Bergmann	6f54738166	Merge tag 'mvebu-fixes-6.12-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into arm/fixes mvebu fixes for 6.12 (part 1) Fix cp0 mdio pin numbers on SolidRun CN9130 SoM * tag 'mvebu-fixes-6.12-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: arm64: dts: marvell: cn9130-sr-som: fix cp0 mdio pin numbers Link: https://lore.kernel.org/r/87ldyud25o.fsf@BLaptop.bootlin.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-15 20:37:29 +00:00
Arnd Bergmann	76237ff95b	Merge tag 'reset-fixes-for-v6.12' of git://git.pengutronix.de/pza/linux into arm/fixes Reset controller fixes for v6.12 Fix a NULL pointer dereference in reset-starfive-jh71x0 and replace two accidental commas at line endings with semicolons in reset-npcm. * tag 'reset-fixes-for-v6.12' of git://git.pengutronix.de/pza/linux: reset: starfive: jh71x0: Fix accessing the empty member on JH7110 SoC reset: npcm: convert comma to semicolon Link: https://lore.kernel.org/r/20240930165733.1541936-1-p.zabel@pengutronix.de Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-15 20:36:54 +00:00
Wang Hai	c401ed1c70	net: systemport: fix potential memory leak in bcm_sysport_xmit() The bcm_sysport_xmit() returns NETDEV_TX_OK without freeing skb in case of dma_map_single() fails, add dev_kfree_skb() to fix it. Fixes: `80105befdb` ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Wang Hai <wanghai38@huawei.com> Link: https://patch.msgid.link/20241014145115.44977-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 12:53:52 -07:00
Linus Torvalds	2f87d0916c	Merge tag 'trace-ringbuffer-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer fixes from Steven Rostedt: - Fix ref counter of buffers assigned at boot up A tracing instance can be created from the kernel command line. If it maps to memory, it is considered permanent and should not be deleted, or bad things can happen. If it is not mapped to memory, then the user is fine to delete it via rmdir from the instances directory. But the ref counts assumed 0 was free to remove and greater than zero was not. But this was not the case. When an instance is created, it should have the reference of 1, and if it should not be removed, it must be greater than 1. The boot up code set normal instances with a ref count of 0, which could get removed if something accessed it and then released it. And memory mapped instances had a ref count of 1 which meant it could be deleted, and bad things happen. Keep normal instances ref count as 1, and set memory mapped instances ref count to 2. - Protect sub buffer size (order) updates from other modifications When a ring buffer is changing the size of its sub-buffers, no other operations should be performed on the ring buffer. That includes reading it. But the locking only grabbed the buffer->mutex that keeps some operations from touching the ring buffer. It also must hold the cpu_buffer->reader_lock as well when updates happen as other paths use that to do some operations on the ring buffer. * tag 'trace-ringbuffer-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Fix reader locking when changing the sub buffer order ring-buffer: Fix refcount setting of boot mapped buffers	2024-10-15 11:18:44 -07:00
Alexei Starovoitov	ee230090f6	Merge branch 'fix-truncation-bug-in-coerce_reg_to_size_sx-and-extend-selftests' Dimitar Kanaliev says: ==================== Fix truncation bug in coerce_reg_to_size_sx and extend selftests. This patch series addresses a truncation bug in the eBPF verifier function coerce_reg_to_size_sx(). The issue was caused by the incorrect ordering of assignments between 32-bit and 64-bit min/max values, leading to improper truncation when updating the register state. This issue has been reported previously by Zac Ecob[1] , but was not followed up on. The first patch fixes the assignment order in coerce_reg_to_size_sx() to ensure correct truncation. The subsequent patches add selftests for coerce_{reg,subreg}_to_size_sx. Changelog: v1 -> v2: - Moved selftests inside the conditional check for cpuv4 [1] (https://lore.kernel.org/bpf/h3qKLDEO6m9nhif0eAQX4fVrqdO0D_OPb0y5HfMK9jBePEKK33wQ3K-bqSVnr0hiZdFZtSJOsbNkcEQGpv_yJk61PAAiO8fUkgMRSO-lB50=@protonmail.com/) ==================== Link: https://lore.kernel.org/r/20241014121155.92887-1-dimitar.kanaliev@siteground.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-15 11:16:25 -07:00
Dimitar Kanaliev	35ccd576a2	selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx() Add a test for unsigned ranges after signed extension instruction. This case isn't currently covered by existing tests in verifier_movsx.c. Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Dimitar Kanaliev <dimitar.kanaliev@siteground.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241014121155.92887-4-dimitar.kanaliev@siteground.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-15 11:16:25 -07:00
Dimitar Kanaliev	61f506eacc	selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx() Add test that checks whether unsigned ranges deduced by the verifier for sign extension instruction is correct. Without previous patch that fixes truncation in coerce_reg_to_size_sx() this test fails. Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Dimitar Kanaliev <dimitar.kanaliev@siteground.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241014121155.92887-3-dimitar.kanaliev@siteground.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-15 11:16:25 -07:00
Dimitar Kanaliev	ae67b9fb8c	bpf: Fix truncation bug in coerce_reg_to_size_sx() coerce_reg_to_size_sx() updates the register state after a sign-extension operation. However, there's a bug in the assignment order of the unsigned min/max values, leading to incorrect truncation: 0: (85) call bpf_get_prandom_u32#7 ; R0_w=scalar() 1: (57) r0 &= 1 ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,var_off=(0x0; 0x1)) 2: (07) r0 += 254 ; R0_w=scalar(smin=umin=smin32=umin32=254,smax=umax=smax32=umax32=255,var_off=(0xfe; 0x1)) 3: (bf) r0 = (s8)r0 ; R0_w=scalar(smin=smin32=-2,smax=smax32=-1,umin=umin32=0xfffffffe,umax=0xffffffff,var_off=(0xfffffffffffffffe; 0x1)) In the current implementation, the unsigned 32-bit min/max values (u32_min_value and u32_max_value) are assigned directly from the 64-bit signed min/max values (s64_min and s64_max): reg->umin_value = reg->u32_min_value = s64_min; reg->umax_value = reg->u32_max_value = s64_max; Due to the chain assigmnent, this is equivalent to: reg->u32_min_value = s64_min; // Unintended truncation reg->umin_value = reg->u32_min_value; reg->u32_max_value = s64_max; // Unintended truncation reg->umax_value = reg->u32_max_value; Fixes: `1f9a1ea821` ("bpf: Support new sign-extension load insns") Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Reported-by: Zac Ecob <zacecob@protonmail.com> Signed-off-by: Dimitar Kanaliev <dimitar.kanaliev@siteground.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Reviewed-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://lore.kernel.org/r/20241014121155.92887-2-dimitar.kanaliev@siteground.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-15 11:16:24 -07:00
Linus Torvalds	bdc7276512	Merge tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefs Pull bcachefs fixes from Kent Overstreet: - New metadata version inode_has_child_snapshots This fixes bugs with handling of unlinked inodes + snapshots, in particular when an inode is reattached after taking a snapshot; deleted inodes now get correctly cleaned up across snapshots. - Disk accounting rewrite fixes - validation fixes for when a device has been removed - fix journal replay failing with "journal_reclaim_would_deadlock" - Some more small fixes for erasure coding + device removal - Assorted small syzbot fixes * tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefs: (27 commits) bcachefs: Fix sysfs warning in fstests generic/730,731 bcachefs: Handle race between stripe reuse, invalidate_stripe_to_dev bcachefs: Fix kasan splat in new_stripe_alloc_buckets() bcachefs: Add missing validation for bch_stripe.csum_granularity_bits bcachefs: Fix missing bounds checks in bch2_alloc_read() bcachefs: fix uaf in bch2_dio_write_done() bcachefs: Improve check_snapshot_exists() bcachefs: Fix bkey_nocow_lock() bcachefs: Fix accounting replay flags bcachefs: Fix invalid shift in member_to_text() bcachefs: Fix bch2_have_enough_devs() for BCH_SB_MEMBER_INVALID bcachefs: __wait_for_freeing_inode: Switch to wait_bit_queue_entry bcachefs: Check if stuck in journal_res_get() closures: Add closure_wait_event_timeout() bcachefs: Fix state lock involved deadlock bcachefs: Fix NULL pointer dereference in bch2_opt_to_text bcachefs: Release transaction before wake up bcachefs: add check for btree id against max in try read node bcachefs: Disk accounting device validation fixes bcachefs: bch2_inode_or_descendents_is_open() ...	2024-10-15 11:06:45 -07:00
Wang Hai	c186b7a7f2	net: ethernet: rtsn: fix potential memory leak in rtsn_start_xmit() The rtsn_start_xmit() returns NETDEV_TX_OK without freeing skb in case of skb->len being too long, add dev_kfree_skb_any() to fix it. Fixes: `b0d3969d2b` ("net: ethernet: rtsn: Add support for Renesas Ethernet-TSN") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241014144250.38802-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 10:59:29 -07:00
Wang Hai	99714e37e8	net: xilinx: axienet: fix potential memory leak in axienet_start_xmit() The axienet_start_xmit() returns NETDEV_TX_OK without freeing skb in case of dma_map_single() fails, add dev_kfree_skb_any() to fix it. Fixes: `71791dc8bd` ("net: axienet: Check for DMA mapping errors") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://patch.msgid.link/20241014143704.31938-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 10:59:23 -07:00
Jakub Kicinski	56f51dfdff	Merge branch 'mptcp-prevent-mpc-handshake-on-port-based-signal-endpoints' Matthieu Baerts says: ==================== mptcp: prevent MPC handshake on port-based signal endpoints MPTCP connection requests toward a listening socket created by the in-kernel PM for a port based signal endpoint will never be accepted, they need to be explicitly rejected. - Patch 1: Explicitly reject such requests. A fix for >= v5.12. - Patch 2: Cover this case in the MPTCP selftests to avoid regressions. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> v1: https://lore.kernel.org/20240908180620.822579-1-xiyou.wangcong@gmail.com Link: https://lore.kernel.org/a5289a0d-2557-40b8-9575-6f1a0bbf06e4@redhat.com ==================== Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-0-7faea8e6b6ae@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 10:57:04 -07:00
Paolo Abeni	5afca7e996	selftests: mptcp: join: test for prohibited MPC to port-based endp Explicitly verify that MPC connection attempts towards a port-based signal endpoint fail with a reset. Note that this new test is a bit different from the other ones, not using 'run_tests'. It is then needed to add the capture capability, and the picking the right port which have been extracted into three new helpers. The info about the capture can also be printed from a single point, which simplifies the exit paths in do_transfer(). The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: `1729cf186d` ("mptcp: create the listening socket for new port") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-2-7faea8e6b6ae@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 10:57:02 -07:00
Paolo Abeni	3d041393ea	mptcp: prevent MPC handshake on port-based signal endpoints Syzkaller reported a lockdep splat: ============================================ WARNING: possible recursive locking detected 6.11.0-rc6-syzkaller-00019-g67784a74e258 #0 Not tainted -------------------------------------------- syz-executor364/5113 is trying to acquire lock: ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328 but task is already holding lock: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(k-slock-AF_INET); lock(k-slock-AF_INET); * DEADLOCK * May be due to missing lock nesting notation 7 locks held by syz-executor364/5113: #0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline] #0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x153/0x1b10 net/mptcp/protocol.c:1806 #1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline] #1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg_fastopen+0x11f/0x530 net/mptcp/protocol.c:1727 #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline] #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline] #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x5f/0x1b80 net/ipv4/ip_output.c:470 #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline] #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline] #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1390 net/ipv4/ip_output.c:228 #4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: local_lock_acquire include/linux/local_lock_internal.h:29 [inline] #4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x33b/0x15b0 net/core/dev.c:6104 #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline] #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline] #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x230/0x5f0 net/ipv4/ip_input.c:232 #6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] #6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328 stack backtrace: CPU: 0 UID: 0 PID: 5113 Comm: syz-executor364 Not tainted 6.11.0-rc6-syzkaller-00019-g67784a74e258 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:93 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119 check_deadlock kernel/locking/lockdep.c:3061 [inline] validate_chain+0x15d3/0x5900 kernel/locking/lockdep.c:3855 __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5142 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328 mptcp_sk_clone_init+0x32/0x13c0 net/mptcp/protocol.c:3279 subflow_syn_recv_sock+0x931/0x1920 net/mptcp/subflow.c:874 tcp_check_req+0xfe4/0x1a20 net/ipv4/tcp_minisocks.c:853 tcp_v4_rcv+0x1c3e/0x37f0 net/ipv4/tcp_ipv4.c:2267 ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 __netif_receive_skb_one_core net/core/dev.c:5661 [inline] __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5775 process_backlog+0x662/0x15b0 net/core/dev.c:6108 __napi_poll+0xcb/0x490 net/core/dev.c:6772 napi_poll net/core/dev.c:6841 [inline] net_rx_action+0x89b/0x1240 net/core/dev.c:6963 handle_softirqs+0x2c4/0x970 kernel/softirq.c:554 do_softirq+0x11b/0x1e0 kernel/softirq.c:455 </IRQ> <TASK> __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382 local_bh_enable include/linux/bottom_half.h:33 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:908 [inline] __dev_queue_xmit+0x1763/0x3e90 net/core/dev.c:4450 dev_queue_xmit include/linux/netdevice.h:3105 [inline] neigh_hh_output include/net/neighbour.h:526 [inline] neigh_output include/net/neighbour.h:540 [inline] ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:235 ip_local_out net/ipv4/ip_output.c:129 [inline] __ip_queue_xmit+0x118c/0x1b80 net/ipv4/ip_output.c:535 __tcp_transmit_skb+0x2544/0x3b30 net/ipv4/tcp_output.c:1466 tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6542 [inline] tcp_rcv_state_process+0x2c32/0x4570 net/ipv4/tcp_input.c:6729 tcp_v4_do_rcv+0x77d/0xc70 net/ipv4/tcp_ipv4.c:1934 sk_backlog_rcv include/net/sock.h:1111 [inline] __release_sock+0x214/0x350 net/core/sock.c:3004 release_sock+0x61/0x1f0 net/core/sock.c:3558 mptcp_sendmsg_fastopen+0x1ad/0x530 net/mptcp/protocol.c:1733 mptcp_sendmsg+0x1884/0x1b10 net/mptcp/protocol.c:1812 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597 ___sys_sendmsg net/socket.c:2651 [inline] __sys_sendmmsg+0x3b2/0x740 net/socket.c:2737 __do_sys_sendmmsg net/socket.c:2766 [inline] __se_sys_sendmmsg net/socket.c:2763 [inline] __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f04fb13a6b9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 01 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffd651f42d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f04fb13a6b9 RDX: 0000000000000001 RSI: 0000000020000d00 RDI: 0000000000000004 RBP: 00007ffd651f4310 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000020000080 R11: 0000000000000246 R12: 00000000000f4240 R13: 00007f04fb187449 R14: 00007ffd651f42f4 R15: 00007ffd651f4300 </TASK> As noted by Cong Wang, the splat is false positive, but the code path leading to the report is an unexpected one: a client is attempting an MPC handshake towards the in-kernel listener created by the in-kernel PM for a port based signal endpoint. Such connection will be never accepted; many of them can make the listener queue full and preventing the creation of MPJ subflow via such listener - its intended role. Explicitly detect this scenario at initial-syn time and drop the incoming MPC request. Fixes: `1729cf186d` ("mptcp: create the listening socket for new port") Cc: stable@vger.kernel.org Reported-by: syzbot+f4aacdfef2c6a6529c3e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f4aacdfef2c6a6529c3e Cc: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-1-7faea8e6b6ae@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 10:57:02 -07:00
Li RongQing	82ac39ebd6	net/smc: Fix searching in list of known pnetids in smc_pnet_add_pnetid pnetid of pi (not newly allocated pe) should be compared Fixes: `e888a2e833` ("net/smc: introduce list of pnetids for Ethernet devices") Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Link: https://patch.msgid.link/20241014115321.33234-1-lirongqing@baidu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 10:56:31 -07:00
Oleksij Rempel	d0c3601f2c	net: macb: Avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY A boot delay was introduced by commit `79540d133e` ("net: macb: Fix handling of fixed-link node"). This delay was caused by the call to `mdiobus_register()` in cases where a fixed-link PHY was present. The MDIO bus registration triggered unnecessary PHY address scans, leading to a 20-second delay due to attempts to detect Clause 45 (C45) compatible PHYs, despite no MDIO bus being attached. The commit `79540d133e` ("net: macb: Fix handling of fixed-link node") was originally introduced to fix a regression caused by commit `7897b071ac` ("net: macb: convert to phylink"), which caused the driver to misinterpret fixed-link nodes as PHY nodes. This resulted in warnings like: mdio_bus f0028000.ethernet-ffffffff: fixed-link has invalid PHY address mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 0 ... mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 31 This patch reworks the logic to avoid registering and allocation of the MDIO bus when: - The device tree contains a fixed-link node. - There is no "mdio" child node in the device tree. If a child node named "mdio" exists, the MDIO bus will be registered to support PHYs attached to the MACB's MDIO bus. Otherwise, with only a fixed-link, the MDIO bus is skipped. Tested on a sama5d35 based system with a ksz8863 switch attached to macb0. Fixes: `79540d133e` ("net: macb: Fix handling of fixed-link node") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241013052916.3115142-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 10:40:47 -07:00
Wang Hai	cf57b5d7a2	net: ethernet: aeroflex: fix potential memory leak in greth_start_xmit_gbit() The greth_start_xmit_gbit() returns NETDEV_TX_OK without freeing skb in case of skb->len being too long, add dev_kfree_skb() to fix it. Fixes: `d4c41139df` ("net: Add Aeroflex Gaisler 10/100/1G Ethernet MAC driver") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Link: https://patch.msgid.link/20241012110434.49265-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 09:59:20 -07:00
Eric Dumazet	a1494d532e	netdevsim: use cond_resched() in nsim_dev_trap_report_work() I am still seeing many syzbot reports hinting that syzbot might fool nsim_dev_trap_report_work() with hundreds of ports [1] Lets use cond_resched(), and system_unbound_wq instead of implicit system_wq. [1] INFO: task syz-executor:20633 blocked for more than 143 seconds. Not tainted 6.12.0-rc2-syzkaller-00205-g1d227fcc7222 #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor state:D stack:25856 pid:20633 tgid:20633 ppid:1 flags:0x00004006 ... NMI backtrace for cpu 1 CPU: 1 UID: 0 PID: 16760 Comm: kworker/1:0 Not tainted 6.12.0-rc2-syzkaller-00205-g1d227fcc7222 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: events nsim_dev_trap_report_work RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x70 kernel/kcov.c:210 Code: 89 fb e8 23 00 00 00 48 8b 3d 04 fb 9c 0c 48 89 de 5b e9 c3 c7 5d 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <f3> 0f 1e fa 48 8b 04 24 65 48 8b 0c 25 c0 d7 03 00 65 8b 15 60 f0 RSP: 0018:ffffc90000a187e8 EFLAGS: 00000246 RAX: 0000000000000100 RBX: ffffc90000a188e0 RCX: ffff888027d3bc00 RDX: ffff888027d3bc00 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88804a2e6000 R08: ffffffff8a4bc495 R09: ffffffff89da3577 R10: 0000000000000004 R11: ffffffff8a4bc2b0 R12: dffffc0000000000 R13: ffff88806573b503 R14: dffffc0000000000 R15: ffff8880663cca00 FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc90a747f98 CR3: 000000000e734000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 000000000000002b DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382 spin_unlock_bh include/linux/spinlock.h:396 [inline] nsim_dev_trap_report drivers/net/netdevsim/dev.c:820 [inline] nsim_dev_trap_report_work+0x75d/0xaa0 drivers/net/netdevsim/dev.c:850 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Fixes: `ba5e127214` ("netdevsim: avoid potential loop in nsim_dev_trap_report_work()") Reported-by: syzbot+d383dc9579a76f56c251@syzkaller.appspotmail.com Reported-by: syzbot+c596faae21a68bf7afd0@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20241012094230.3893510-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 09:58:43 -07:00
Sabrina Dubroca	cf58aefb13	macsec: don't increment counters for an unrelated SA On RX, we shouldn't be incrementing the stats for an arbitrary SA in case the actual SA hasn't been set up. Those counters are intended to track packets for their respective AN when the SA isn't currently configured. Due to the way MACsec is implemented, we don't keep counters unless the SA is configured, so we can't track those packets, and those counters will remain at 0. The RXSC's stats keeps track of those packets without telling us which AN they belonged to. We could add counters for non-existent SAs, and then find a way to integrate them in the dump to userspace, but I don't think it's worth the effort. Fixes: `91ec9bd57f` ("macsec: Fix traffic counters/statistics") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/f5ac92aaa5b89343232615f4c03f9f95042c6aa0.1728657709.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-15 09:57:03 -07:00
Thierry Reding	eb0c062161	gpu: host1x: Set up device DMA parameters In order to store device DMA parameters, the DMA framework depends on the device's dma_parms field to point at a valid memory location. Add backing storage for this in struct host1x_memory_context and point to it. Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240916133320.368620-1-thierry.reding@gmail.com (cherry picked from commit `b4ad4ef374`) Signed-off-by: Thierry Reding <treding@nvidia.com>	2024-10-15 18:46:25 +02:00
Alex Deucher	cb07c8338f	drm/amdgpu/swsmu: Only force workload setup on init Needed to set the workload type at init time so that we can apply the navi3x margin optimization. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131 Fixes: `c50fe289ed` ("drm/amdgpu/swsmu: always force a state reprogram on init") Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `580ad7cbd4`) Cc: stable@vger.kernel.org	2024-10-15 11:53:27 -04:00
Ville Syrjälä	28127dba64	drm/radeon: Fix encoder->possible_clones Include the encoder itself in its possible_clones bitmask. In the past nothing validated that drivers were populating possible_clones correctly, but that changed in commit `74d2aacbe8` ("drm: Validate encoder->possible_clones"). Looks like radeon never got the memo and is still not following the rules 100% correctly. This results in some warnings during driver initialization: Bogus possible_clones: [ENCODER:46:TV-46] possible_clones=0x4 (full encoder mask=0x7) WARNING: CPU: 0 PID: 170 at drivers/gpu/drm/drm_mode_config.c:615 drm_mode_config_validate+0x113/0x39c ... Cc: Alex Deucher <alexander.deucher@amd.com> Cc: amd-gfx@lists.freedesktop.org Fixes: `74d2aacbe8` ("drm: Validate encoder->possible_clones") Reported-by: Erhard Furtner <erhard_f@mailbox.org> Closes: https://lore.kernel.org/dri-devel/20241009000321.418e4294@yea/ Tested-by: Erhard Furtner <erhard_f@mailbox.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `3b6e7d4064`) Cc: stable@vger.kernel.org	2024-10-15 11:53:07 -04:00
Alex Deucher	7a1613e47e	drm/amdgpu/smu13: always apply the powersave optimization It can avoid margin issues in some very demanding applications. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131 Fixes: `c50fe289ed` ("drm/amdgpu/swsmu: always force a state reprogram on init") Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `62f38b4cca`) Cc: stable@vger.kernel.org	2024-10-15 11:52:46 -04:00
Philip Yang	68d26c10ef	drm/amdkfd: Accounting pdd vram_usage for svm Process device data pdd->vram_usage is read by rocm-smi via sysfs, this is currently missing the svm_bo usage accounting, so "rocm-smi --showpids" per process VRAM usage report is incorrect. Add pdd->vram_usage accounting when svm_bo allocation and release, change to atomic64_t type because it is updated outside process mutex now. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `98c0b0efcc`)	2024-10-15 11:50:13 -04:00
Srinivasan Shanmugam	e7457532cb	drm/amd/amdgpu: Fix double unlock in amdgpu_mes_add_ring This patch addresses a double unlock issue in the amdgpu_mes_add_ring function. The mutex was being unlocked twice under certain error conditions, which could lead to undefined behavior. The fix ensures that the mutex is unlocked only once before jumping to the clean_up_memory label. The unlock operation is moved to just before the goto statement within the conditional block that checks the return value of amdgpu_ring_init. This prevents the second unlock attempt after the clean_up_memory label, which is no longer necessary as the mutex is already unlocked by this point in the code flow. This change resolves the potential double unlock and maintains the correct mutex handling throughout the function. Fixes below: Commit `d0c423b647` ("drm/amdgpu/mes: use ring for kernel queue submission"), leads to the following Smatch static checker warning: drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1240 amdgpu_mes_add_ring() warn: double unlock '&adev->mes.mutex_hidden' (orig line 1213) drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 1143 int amdgpu_mes_add_ring(struct amdgpu_device adev, int gang_id, 1144 int queue_type, int idx, 1145 struct amdgpu_mes_ctx_data ctx_data, 1146 struct amdgpu_ring *out) 1147 { 1148 struct amdgpu_ring ring; 1149 struct amdgpu_mes_gang gang; 1150 struct amdgpu_mes_queue_properties qprops = {0}; 1151 int r, queue_id, pasid; 1152 1153 / 1154 * Avoid taking any other locks under MES lock to avoid circular 1155 * lock dependencies. 1156 / 1157 amdgpu_mes_lock(&adev->mes); 1158 gang = idr_find(&adev->mes.gang_id_idr, gang_id); 1159 if (!gang) { 1160 DRM_ERROR("gang id %d doesn't exist\n", gang_id); 1161 amdgpu_mes_unlock(&adev->mes); 1162 return -EINVAL; 1163 } 1164 pasid = gang->process->pasid; 1165 1166 ring = kzalloc(sizeof(struct amdgpu_ring), GFP_KERNEL); 1167 if (!ring) { 1168 amdgpu_mes_unlock(&adev->mes); 1169 return -ENOMEM; 1170 } 1171 1172 ring->ring_obj = NULL; 1173 ring->use_doorbell = true; 1174 ring->is_mes_queue = true; 1175 ring->mes_ctx = ctx_data; 1176 ring->idx = idx; 1177 ring->no_scheduler = true; 1178 1179 if (queue_type == AMDGPU_RING_TYPE_COMPUTE) { 1180 int offset = offsetof(struct amdgpu_mes_ctx_meta_data, 1181 compute[ring->idx].mec_hpd); 1182 ring->eop_gpu_addr = 1183 amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset); 1184 } 1185 1186 switch (queue_type) { 1187 case AMDGPU_RING_TYPE_GFX: 1188 ring->funcs = adev->gfx.gfx_ring[0].funcs; 1189 ring->me = adev->gfx.gfx_ring[0].me; 1190 ring->pipe = adev->gfx.gfx_ring[0].pipe; 1191 break; 1192 case AMDGPU_RING_TYPE_COMPUTE: 1193 ring->funcs = adev->gfx.compute_ring[0].funcs; 1194 ring->me = adev->gfx.compute_ring[0].me; 1195 ring->pipe = adev->gfx.compute_ring[0].pipe; 1196 break; 1197 case AMDGPU_RING_TYPE_SDMA: 1198 ring->funcs = adev->sdma.instance[0].ring.funcs; 1199 break; 1200 default: 1201 BUG(); 1202 } 1203 1204 r = amdgpu_ring_init(adev, ring, 1024, NULL, 0, 1205 AMDGPU_RING_PRIO_DEFAULT, NULL); 1206 if (r) 1207 goto clean_up_memory; 1208 1209 amdgpu_mes_ring_to_queue_props(adev, ring, &qprops); 1210 1211 dma_fence_wait(gang->process->vm->last_update, false); 1212 dma_fence_wait(ctx_data->meta_data_va->last_pt_update, false); 1213 amdgpu_mes_unlock(&adev->mes); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1214 1215 r = amdgpu_mes_add_hw_queue(adev, gang_id, &qprops, &queue_id); 1216 if (r) 1217 goto clean_up_ring; ^^^^^^^^^^^^^^^^^^ 1218 1219 ring->hw_queue_id = queue_id; 1220 ring->doorbell_index = qprops.doorbell_off; 1221 1222 if (queue_type == AMDGPU_RING_TYPE_GFX) 1223 sprintf(ring->name, "gfx_%d.%d.%d", pasid, gang_id, queue_id); 1224 else if (queue_type == AMDGPU_RING_TYPE_COMPUTE) 1225 sprintf(ring->name, "compute_%d.%d.%d", pasid, gang_id, 1226 queue_id); 1227 else if (queue_type == AMDGPU_RING_TYPE_SDMA) 1228 sprintf(ring->name, "sdma_%d.%d.%d", pasid, gang_id, 1229 queue_id); 1230 else 1231 BUG(); 1232 1233 out = ring; 1234 return 0; 1235 1236 clean_up_ring: 1237 amdgpu_ring_fini(ring); 1238 clean_up_memory: 1239 kfree(ring); --> 1240 amdgpu_mes_unlock(&adev->mes); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1241 return r; 1242 } Fixes: `d0c423b647` ("drm/amdgpu/mes: use ring for kernel queue submission") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Jack Xiao <Jack.Xiao@amd.com> Reported by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `bfaf188360`)	2024-10-15 11:49:08 -04:00
Michael Chen	7760d7f93c	drm/amdgpu/mes: fix issue of writing to the same log buffer from 2 MES pipes With Unified MES enabled in gfx12, need separate event log buffer for the 2 MES pipes to avoid data overwrite. Signed-off-by: Michael Chen <michael.chen@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `144df260f3`) Cc: stable@vger.kernel.org # 6.11.x	2024-10-15 11:48:36 -04:00
Mohammed Anees	c0ec082f10	drm/amdgpu: prevent BO_HANDLES error from being overwritten Before this patch, if multiple BO_HANDLES chunks were submitted, the error -EINVAL would be correctly set but could be overwritten by the return value from amdgpu_cs_p1_bo_handles(). This patch ensures that if there are multiple BO_HANDLES, we stop. Fixes: `fec5f8e8c6` ("drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit") Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `40f2cd9882`) Cc: stable@vger.kernel.org	2024-10-15 11:48:05 -04:00
Alex Deucher	d2c72d96df	drm/amdgpu: enable enforce_isolation sysfs node on VFs It should be enabled on both bare metal and VFs. Fixes: `e189be9b2e` ("drm/amdgpu: Add enforce_isolation sysfs attribute") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Cc: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> (cherry picked from commit `dc8847b054`)	2024-10-15 11:47:41 -04:00
Keith Busch	1f021341ee	nvme-multipath: defer partition scanning We need to suppress the partition scan from occuring within the controller's scan_work context. If a path error occurs here, the IO will wait until a path becomes available or all paths are torn down, but that action also occurs within scan_work, so it would deadlock. Defer the partion scan to a different context that does not block scan_work. Reported-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-15 08:32:07 -07:00
Petr Pavlu	09661f75e7	ring-buffer: Fix reader locking when changing the sub buffer order The function ring_buffer_subbuf_order_set() updates each ring_buffer_per_cpu and installs new sub buffers that match the requested page order. This operation may be invoked concurrently with readers that rely on some of the modified data, such as the head bit (RB_PAGE_HEAD), or the ring_buffer_per_cpu.pages and reader_page pointers. However, no exclusive access is acquired by ring_buffer_subbuf_order_set(). Modifying the mentioned data while a reader also operates on them can then result in incorrect memory access and various crashes. Fix the problem by taking the reader_lock when updating a specific ring_buffer_per_cpu in ring_buffer_subbuf_order_set(). Link: https://lore.kernel.org/linux-trace-kernel/20240715145141.5528-1-petr.pavlu@suse.com/ Link: https://lore.kernel.org/linux-trace-kernel/20241010195849.2f77cc3f@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20241011112850.17212b25@gandalf.local.home/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241015112440.26987-1-petr.pavlu@suse.com Fixes: `8e7b58c27b` ("ring-buffer: Just update the subbuffers when changing their allocation order") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-10-15 11:18:51 -04:00
Jens Axboe	28aabffae6	io_uring/sqpoll: close race on waiting for sqring entries When an application uses SQPOLL, it must wait for the SQPOLL thread to consume SQE entries, if it fails to get an sqe when calling io_uring_get_sqe(). It can do so by calling io_uring_enter(2) with the flag value of IORING_ENTER_SQ_WAIT. In liburing, this is generally done with io_uring_sqring_wait(). There's a natural expectation that once this call returns, a new SQE entry can be retrieved, filled out, and submitted. However, the kernel uses the cached sq head to determine if the SQRING is full or not. If the SQPOLL thread is currently in the process of submitting SQE entries, it may have updated the cached sq head, but not yet committed it to the SQ ring. Hence the kernel may find that there are SQE entries ready to be consumed, and return successfully to the application. If the SQPOLL thread hasn't yet committed the SQ ring entries by the time the application returns to userspace and attempts to get a new SQE, it will fail getting a new SQE. Fix this by having io_sqring_full() always use the user visible SQ ring head entry, rather than the internally cached one. Cc: stable@vger.kernel.org # 5.10+ Link: https://github.com/axboe/liburing/discussions/1267 Reported-by: Benedek Thaler <thaler@thaler.hu> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-15 09:13:51 -06:00
Jon Hunter	c8347f915e	gpu: host1x: Fix boot regression for Tegra Commit `4c27ac45e6` ("gpu: host1x: Request syncpoint IRQs only during probe") caused a boot regression for the Tegra186 device. Following this update the function host1x_intr_init() now calls host1x_hw_intr_disable_all_syncpt_intrs() during probe. However, host1x_intr_init() is called before runtime power-management is enabled for Host1x and the function host1x_hw_intr_disable_all_syncpt_intrs() is accessing hardware registers. So if the Host1x hardware is not enabled prior to probing then the device will now hang on attempting to access the registers. So far this is only observed on Tegra186, but potentially could be seen on other devices. Fix this by moving the call to the function host1x_intr_init() in probe to after enabling the runtime power-management in the probe and update the failure path in probe as necessary. Fixes: `4c27ac45e6` ("gpu: host1x: Request syncpoint IRQs only during probe") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240925160504.60221-1-jonathanh@nvidia.com (cherry picked from commit `dc56f8428e`) Signed-off-by: Thierry Reding <treding@nvidia.com>	2024-10-15 15:49:06 +02:00
Gavin Shan	b079883841	firmware: arm_ffa: Avoid string-fortify warning caused by memcpy() Copying from a 144 byte structure arm_smccc_1_2_regs at an offset of 32 into an 112 byte struct ffa_send_direct_data2 causes a compile-time warning: \| In file included from drivers/firmware/arm_ffa/driver.c:25: \| In function 'fortify_memcpy_chk', \| inlined from 'ffa_msg_send_direct_req2' at drivers/firmware/arm_ffa/driver.c:504:3: \| include/linux/fortify-string.h:580:4: warning: call to '__read_overflow2_field' \| declared with 'warning' attribute: detected read beyond size of field \| (2nd parameter); maybe use struct_group()? [-Wattribute-warning] \| __read_overflow2_field(q_size_field, size); Fix it by not passing a plain buffer to memcpy() to avoid the overflow warning. Fixes: `aaef3bc981` ("firmware: arm_ffa: Add support for FFA_MSG_SEND_DIRECT_{REQ,RESP}2") Signed-off-by: Gavin Shan <gshan@redhat.com> Message-Id: <20241014004724.991353-1-gshan@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2024-10-15 13:50:10 +01:00
Zhang Rui	ffd95846c6	x86/apic: Always explicitly disarm TSC-deadline timer New processors have become pickier about the local APIC timer state before entering low power modes. These low power modes are used (for example) when you close your laptop lid and suspend. If you put your laptop in a bag and it is not in this low power mode, it is likely to get quite toasty while it quickly sucks the battery dry. The problem boils down to some CPUs' inability to power down until the CPU recognizes that the local APIC timer is shut down. The current kernel code works in one-shot and periodic modes but does not work for deadline mode. Deadline mode has been the supported and preferred mode on Intel CPUs for over a decade and uses an MSR to drive the timer instead of an APIC register. Disable the TSC Deadline timer in lapic_timer_shutdown() by writing to MSR_IA32_TSC_DEADLINE when in TSC-deadline mode. Also avoid writing to the initial-count register (APIC_TMICT) which is ignored in TSC-deadline mode. Note: The APIC_LVTT\|=APIC_LVT_MASKED operation should theoretically be enough to tell the hardware that the timer will not fire in any of the timer modes. But mitigating AMD erratum 411[1] also requires clearing out APIC_TMICT. Solely setting APIC_LVT_MASKED is also ineffective in practice on Intel Lunar Lake systems, which is the motivation for this change. 1. 411 Processor May Exit Message-Triggered C1E State Without an Interrupt if Local APIC Timer Reaches Zero - https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/revision-guides/41322_10h_Rev_Gd.pdf Fixes: `279f146143` ("x86: apic: Use tsc deadline for oneshot when available") Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Todd Brandt <todd.e.brandt@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241015061522.25288-1-rui.zhang%40intel.com	2024-10-15 05:45:18 -07:00
Colin Ian King	637c4f6fe4	octeontx2-af: Fix potential integer overflows on integer shifts The left shift int 32 bit integer constants 1 is evaluated using 32 bit arithmetic and then assigned to a 64 bit unsigned integer. In the case where the shift is 32 or more this can lead to an overflow. Avoid this by shifting using the BIT_ULL macro instead. Fixes: `019aba04f0` ("octeontx2-af: Modify SMQ flush sequence to drop packets") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/20241010154519.768785-1-colin.i.king@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-15 13:32:22 +02:00
Will Deacon	7aed6a2c51	kasan: Disable Software Tag-Based KASAN with GCC Syzbot reports a KASAN failure early during boot on arm64 when building with GCC 12.2.0 and using the Software Tag-Based KASAN mode: \| BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline] \| BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356 \| Write of size 4 at addr 03ff800086867e00 by task swapper/0 \| Pointer tag: [03], memory tag: [fe] Initial triage indicates that the report is a false positive and a thorough investigation of the crash by Mark Rutland revealed the root cause to be a bug in GCC: > When GCC is passed `-fsanitize=hwaddress` or > `-fsanitize=kernel-hwaddress` it ignores > `__attribute__((no_sanitize_address))`, and instruments functions > we require are not instrumented. > > [...] > > All versions [of GCC] I tried were broken, from 11.3.0 to 14.2.0 > inclusive. > > I think we have to disable KASAN_SW_TAGS with GCC until this is > fixed Disable Software Tag-Based KASAN when building with GCC by making CC_HAS_KASAN_SW_TAGS depend on !CC_IS_GCC. Cc: Andrey Konovalov <andreyknvl@gmail.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/000000000000f362e80620e27859@google.com Link: https://lore.kernel.org/r/ZvFGwKfoC4yVjN_X@J2N7QTR9R3 Link: https://bugzilla.kernel.org/show_bug.cgi?id=218854 Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20241014161100.18034-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2024-10-15 11:38:10 +01:00
Paritosh Dixit	1cff6ff302	net: stmmac: dwmac-tegra: Fix link bring-up sequence The Tegra MGBE driver sometimes fails to initialize, reporting the following error, and as a result, it is unable to acquire an IP address with DHCP: tegra-mgbe 6800000.ethernet: timeout waiting for link to become ready As per the recommendation from the Tegra hardware design team, fix this issue by: - clearing the PHY_RDY bit before setting the CDR_RESET bit and then setting PHY_RDY bit before clearing CDR_RESET bit. This ensures valid data is present at UPHY RX inputs before starting the CDR lock. - adding the required delays when bringing up the UPHY lane. Note we need to use delays here because there is no alternative, such as polling, for these cases. Using the usleep_range() instead of ndelay() as sleeping is preferred over busy wait loop. Without this change we would see link failures on boot sometimes as often as 1 in 5 boots. With this fix we have not observed any failures in over 1000 boots. Fixes: `d8ca113724` ("net: stmmac: tegra: Add MGBE support") Signed-off-by: Paritosh Dixit <paritoshd@nvidia.com> Link: https://patch.msgid.link/20241010142908.602712-1-paritoshd@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-15 12:35:17 +02:00
Christoph Hellwig	f6f91d290c	xfs: punch delalloc extents from the COW fork for COW writes When ->iomap_end is called on a short write to the COW fork it needs to punch stale delalloc data from the COW fork and not the data fork. Ensure that IOMAP_F_NEW is set for new COW fork allocations in xfs_buffered_write_iomap_begin, and then use the IOMAP_F_SHARED flag in xfs_buffered_write_delalloc_punch to decide which fork to punch. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:42 +02:00
Christoph Hellwig	7d6fe5c586	xfs: set IOMAP_F_SHARED for all COW fork allocations Change to always set xfs_buffered_write_iomap_begin for COW fork allocations even if they don't overlap existing data fork extents, which will allow the iomap_end callback to detect if it has to punch stale delalloc blocks from the COW fork instead of the data fork. It also means we sample the sequence counter for both the data and the COW fork when writing to the COW fork, which ensures we properly revalidate when only COW fork changes happens. This is essentially a revert of commit `72a048c105` ("xfs: only set IOMAP_F_SHARED when providing a srcmap to a write"). This is fine because the problem that the commit fixed has now been dealt with in iomap by only looking at the actual srcmap and not the fallback to the write iomap. Note that the direct I/O path was never changed and has always set IOMAP_F_SHARED for all COW fork allocations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:42 +02:00
Christoph Hellwig	c29440ff66	xfs: share more code in xfs_buffered_write_iomap_begin Introduce a local iomap_flags variable so that the code allocating new delalloc blocks in the data fork can fall through to the found_imap label and reuse the code to unlock and fill the iomap. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:42 +02:00
Christoph Hellwig	8fe3b21efa	xfs: support the COW fork in xfs_bmap_punch_delalloc_range xfs_buffered_write_iomap_begin can also create delallocate reservations that need cleaning up, prepare for that by adding support for the COW fork in xfs_bmap_punch_delalloc_range. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:42 +02:00
Christoph Hellwig	abd7d651ad	xfs: IOMAP_ZERO and IOMAP_UNSHARE already hold invalidate_lock All XFS callers of iomap_zero_range and iomap_file_unshare already hold invalidate_lock, so we can't take it again in iomap_file_buffered_write_punch_delalloc. Use the passed in flags argument to detect if we're called from a zero or unshare operation and don't take the lock again in this case. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:42 +02:00
Christoph Hellwig	acfbac7764	xfs: take XFS_MMAPLOCK_EXCL xfs_file_write_zero_eof xfs_file_write_zero_eof is the only caller of xfs_zero_range that does not take XFS_MMAPLOCK_EXCL (aka the invalidate lock). Currently that is actually the right thing, as an error in the iomap zeroing code will also take the invalidate_lock to clean up, but to fix that deadlock we need a consistent locking pattern first. The only extra thing that XFS_MMAPLOCK_EXCL will lock out are read pagefaults, which isn't really needed here, but also not actively harmful. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:42 +02:00
Christoph Hellwig	3c399374af	xfs: factor out a xfs_file_write_zero_eof helper Split a helper from xfs_file_write_checks that just deal with the post-EOF zeroing to keep the code readable. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:42 +02:00
Christoph Hellwig	b784951662	iomap: move locking out of iomap_write_delalloc_release XFS (which currently is the only user of iomap_write_delalloc_release) already holds invalidate_lock for most zeroing operations. To be able to avoid a deadlock it needs to stop taking the lock, but doing so in iomap would leak XFS locking details into iomap. To avoid this require the caller to hold invalidate_lock when calling iomap_write_delalloc_release instead of taking it there. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:42 +02:00
Christoph Hellwig	caf0ea451d	iomap: remove iomap_file_buffered_write_punch_delalloc Currently iomap_file_buffered_write_punch_delalloc can be called from XFS either with the invalidate lock held or not. To fix this while keeping the locking in the file system and not the iomap library code we'll need to life the locking up into the file system. To prepare for that, open code iomap_file_buffered_write_punch_delalloc in the only caller, and instead export iomap_write_delalloc_release. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:42 +02:00
Christoph Hellwig	c0adf8c3a9	iomap: factor out a iomap_last_written_block helper Split out a pice of logic from iomap_file_buffered_write_punch_delalloc that is useful for all iomap_end implementations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-15 11:37:41 +02:00
Oliver Neukum	b62f4c186c	net: usb: usbnet: fix race in probe failure The same bug as in the disconnect code path also exists in the case of a failure late during the probe process. The flag must also be set. Signed-off-by: Oliver Neukum <oneukum@suse.com> Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Link: https://patch.msgid.link/20241010131934.1499695-1-oneukum@suse.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-10-15 11:37:25 +02:00
Lu Baolu	6e02a277f1	iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices Previously, the domain_context_clear() function incorrectly called pci_for_each_dma_alias() to set up context entries for non-PCI devices. This could lead to kernel hangs or other unexpected behavior. Add a check to only call pci_for_each_dma_alias() for PCI devices. For non-PCI devices, domain_context_clear_one() is called directly. Reported-by: Todd Brandt <todd.e.brandt@intel.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219363 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219349 Fixes: `9a16ab9d64` ("iommu/vt-d: Make context clearing consistent with context mapping") Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20241014013744.102197-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-10-15 10:17:54 +02:00
Joerg Roedel	8e8a69bc77	Merge tag 'arm-smmu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into fixes Arm SMMU fixes for 6.12 - Clarify warning message when failing to disable the MMU-500 prefetcher - Fix undefined behaviour in calculation of L1 stream-table index when 32-bit StreamIDs are implemented - Replace a rogue comma with a semicolon	2024-10-15 10:16:22 +02:00
Bartosz Golaszewski	bb94f56b9c	fbdev: da8xx: remove the driver This driver is no longer used on any platform. It has been replaced by tilcdc on the two DaVinci boards we still support and can be removed. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Helge Deller <deller@gmx.de>	2024-10-15 10:08:23 +02:00
Christophe JAILLET	161e95b899	fbdev: Constify struct sbus_mmap_map 'struct sbus_mmap_map' are not modified in these drivers. Constifying this structure moves some data to a read-only section, so increases overall security. Update sbusfb_mmap_helper() accordingly. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 2452 536 16 3004 bbc drivers/video/fbdev/bw2.o After: ===== text data bss dec hex filename 2500 483 16 2999 bb7 drivers/video/fbdev/bw2.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Helge Deller <deller@gmx.de>	2024-10-15 10:07:32 +02:00
SurajSonawane2415	57e755d333	fbdev: nvidiafb: fix inconsistent indentation warning Fix the indentation to ensure consistent code style and improve readability, and to fix this warning: drivers/video/fbdev/nvidia/nv_hw.c:1512 NVLoadStateExt() warn: inconsistent indenting Signed-off-by: SurajSonawane2415 <surajsonawane0215@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>	2024-10-15 10:07:32 +02:00
Gonzalo Silvalde Blanco	447794e447	fbdev: sstfb: Make CONFIG_FB_DEVICE optional The sstfb driver currently depends on CONFIG_FB_DEVICE to create sysfs entries and access info->dev. This patch wraps the relevant code blocks with #ifdef CONFIG_FB_DEVICE, allowing the driver to be built and used even if CONFIG_FB_DEVICE is not selected. The sysfs setting only controls the VGA pass-through state and is not required for the display to work correctly. (See: http://vogonswiki.com/index.php/VGA_passthrough_cable for more information.) Added some fixes from Thomas Zimmermann. Signed-off-by: Gonzalo Silvalde Blanco <gonzalo.silvalde@gmail.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Helge Deller <deller@gmx.de>	2024-10-15 10:07:31 +02:00
Pierre-Louis Bossart	71dce222d5	ALSA/hda: intel-sdw-acpi: add support for sdw-manager-list property read The DisCo for SoundWire 2.0 spec adds support for a new sdw-manager-list property. Add it in backwards-compatible mode with 'sdw-master-count', which assumed that all links between 0..count-1 exist. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241001070611.63288-5-yung-chuan.liao@linux.intel.com	2024-10-15 07:54:16 +02:00
Pierre-Louis Bossart	8782ba9685	ALSA/hda: intel-sdw-acpi: simplify sdw-master-count property read For some reason we used an array of one u8 when the specification requires a u32. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241001070611.63288-4-yung-chuan.liao@linux.intel.com	2024-10-15 07:54:11 +02:00
Pierre-Louis Bossart	5b1b5631d8	ALSA/hda: intel-sdw-acpi: fetch fwnode once in sdw_intel_scan_controller() Optimize a bit by using an intermediate 'fwnode' variable. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241001070611.63288-3-yung-chuan.liao@linux.intel.com	2024-10-15 07:54:03 +02:00
Pierre-Louis Bossart	9d94c58316	ALSA/hda: intel-sdw-acpi: cleanup sdw_intel_scan_controller Remove unnecessary initialization and un-shadow return code. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241001070611.63288-2-yung-chuan.liao@linux.intel.com	2024-10-15 07:53:55 +02:00
Jean Delvare	eabb038101	[PATCH} hwmon: (jc42) Properly detect TSE2004-compliant devices again Commit `b3e992f69c` ("hwmon: (jc42) Strengthen detect function") attempted to make the detect function more robust for TSE2004-compliant devices by checking capability bits which, according to the JEDEC 21-C specification, should always be set. Unfortunately, not all real-world implementations fully adhere to this specification, so this change caused a regression. Stop testing bit 7 (EVSD) of the Capabilities register, as it was found to be 0 on one real-world device. Also stop testing bits 0 (EVENT) and 2 (RANGE) as vendor datasheets (Renesas TSE2004GB2B0, ST STTS2004) suggest that they may not always be set either. Signed-off-by: Jean Delvare <jdelvare@suse.de> Message-ID: <20241014141204.026f4641@endymion.delvare> Fixes: `b3e992f69c` ("hwmon: (jc42) Strengthen detect function") Message-ID: <20241014220426.0c8f4d9c@endymion.delvare> Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2024-10-14 19:14:08 -07:00
Kai Shen	25c12b459d	net/smc: Fix memory leak when using percpu refs This patch adds missing percpu_ref_exit when releasing percpu refs. When releasing percpu refs, percpu_ref_exit should be called. Otherwise, memory leak happens. Fixes: `79a22238b4` ("net/smc: Use percpu ref for wr tx reference") Signed-off-by: Kai Shen <KaiShen@linux.alibaba.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Link: https://patch.msgid.link/20241010115624.7769-1-KaiShen@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-14 17:32:05 -07:00
Jakub Kicinski	e3e4e5667d	Merge branch 'posix-clock-fix-missing-timespec64-check-for-ptp-clock' Jinjie Ruan says: ==================== posix-clock: Fix missing timespec64 check for PTP clock Check timespec64 in pc_clock_settime() for PTP clock as the man manual of clock_settime() said. ==================== Link: https://patch.msgid.link/20241009072302.1754567-1-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-14 17:22:46 -07:00
Jinjie Ruan	ea531dc66e	net: lan743x: Remove duplicate check Since timespec64_valid() has been checked in higher layer pc_clock_settime(), the duplicate check in lan743x_ptpci_settime64() can be removed. Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20241009072302.1754567-3-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-14 17:22:43 -07:00
Jinjie Ruan	d8794ac20a	posix-clock: Fix missing timespec64 check in pc_clock_settime() As Andrew pointed out, it will make sense that the PTP core checked timespec64 struct's tv_sec and tv_nsec range before calling ptp->info->settime64(). As the man manual of clock_settime() said, if tp.tv_sec is negative or tp.tv_nsec is outside the range [0..999,999,999], it should return EINVAL, which include dynamic clocks which handles PTP clock, and the condition is consistent with timespec64_valid(). As Thomas suggested, timespec64_valid() only check the timespec is valid, but not ensure that the time is in a valid range, so check it ahead using timespec64_valid_strict() in pc_clock_settime() and return -EINVAL if not valid. There are some drivers that use tp->tv_sec and tp->tv_nsec directly to write registers without validity checks and assume that the higher layer has checked it, which is dangerous and will benefit from this, such as hclge_ptp_settime(), igb_ptp_settime_i210(), _rcar_gen4_ptp_settime(), and some drivers can remove the checks of itself. Cc: stable@vger.kernel.org Fixes: `0606f422b4` ("posix clocks: Introduce dynamic clocks") Acked-by: Richard Cochran <richardcochran@gmail.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20241009072302.1754567-2-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-14 17:22:43 -07:00
Xiu Jianfeng	3cc4e13bb1	cgroup: Fix potential overflow issue when checking max_depth cgroup.max.depth is the maximum allowed descent depth below the current cgroup. If the actual descent depth is equal or larger, an attempt to create a new child cgroup will fail. However due to the cgroup->max_depth is of int type and having the default value INT_MAX, the condition 'level > cgroup->max_depth' will never be satisfied, and it will cause an overflow of the level after it reaches to INT_MAX. Fix it by starting the level from 0 and using '>=' instead. It's worth mentioning that this issue is unlikely to occur in reality, as it's impossible to have a depth of INT_MAX hierarchy, but should be be avoided logically. Fixes: `1a926e0bba` ("cgroup: implement hierarchy limits") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-14 13:39:25 -10:00
David Vernet	60e339be10	sched_ext: Remove unnecessary cpu_relax() As described in commit `b07996c7ab` ("sched_ext: Don't hold scx_tasks_lock for too long"), we're doing a cond_resched() every 32 calls to scx_task_iter_next() to avoid RCU and other stalls. That commit also added a cpu_relax() to the codepath where we drop and reacquire the lock, but as Waiman described in [0], cpu_relax() should only be necessary in busy loops to avoid pounding on a cacheline (or to allow a hypertwin to more fully utilize a core). Let's remove the unnecessary cpu_relax(). [0]: https://lore.kernel.org/all/35b3889b-904a-4d26-981f-c8aa1557a7c7@redhat.com/ Cc: Waiman Long <llong@redhat.com> Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-14 13:23:49 -10:00
Justin Chen	da1642bc97	firmware: arm_scmi: Queue in scmi layer for mailbox implementation send_message() does not block in the MBOX implementation. This is because the mailbox layer has its own queue. However, this confuses the per xfer timeouts as they all start their timeout ticks in parallel. Consider a case where the xfer timeout is 30ms and a SCMI transaction takes 25ms: \| 0ms: Message #0 is queued in mailbox layer and sent out, then sits \| at scmi_wait_for_message_response() with a timeout of 30ms \| 1ms: Message #1 is queued in mailbox layer but not sent out yet. \| Since send_message() doesn't block, it also sits at \| scmi_wait_for_message_response() with a timeout of 30ms \| ... \| 25ms: Message #0 is completed, txdone is called and message #1 is sent \| 31ms: Message #1 times out since the count started at 1ms. Even though \| it has only been inflight for 6ms. Fixes: `5c8a47a5a9` ("firmware: arm_scmi: Make scmi core independent of the transport type") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Message-Id: <20241014160717.1678953-1-justin.chen@broadcom.com> Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2024-10-14 21:36:46 +01:00
Douglas Anderson	e4a45582db	drm/msm: Allocate memory for disp snapshot with kvzalloc() With the "drm/msm: add a display mmu fault handler" series [1] we saw issues in the field where memory allocation was failing when allocating space for registers in msm_disp_state_dump_regs(). Specifically we were seeing an order 5 allocation fail. It's not surprising that order 5 allocations will sometimes fail after the system has been up and running for a while. There's no need here for contiguous memory. Change the allocation to kvzalloc() which should make it much less likely to fail. [1] https://lore.kernel.org/r/20240628214848.4075651-1-quic_abhinavk@quicinc.com/ Fixes: `98659487b8` ("drm/msm: add support to take dpu snapshot") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/619658/ Link: https://lore.kernel.org/r/20241014093605.2.I72441365ffe91f3dceb17db0a8ec976af8139590@changeid Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-14 13:16:17 -07:00
Douglas Anderson	293f532632	drm/msm: Avoid NULL dereference in msm_disp_state_print_regs() If the allocation in msm_disp_state_dump_regs() failed then `block->state` can be NULL. The msm_disp_state_print_regs() function _does_ have code to try to handle it with: if (reg) dump_addr = reg; ...but since "dump_addr" is initialized to NULL the above is actually a noop. The code then goes on to dereference `dump_addr`. Make the function print "Registers not stored" when it sees a NULL to solve this. Since we're touching the code, fix msm_disp_state_print_regs() not to pointlessly take a double-pointer and properly mark the pointer as `const`. Fixes: `98659487b8` ("drm/msm: add support to take dpu snapshot") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/619657/ Link: https://lore.kernel.org/r/20241014093605.1.Ia1217cecec9ef09eb3c6d125360cc6c8574b0e73@changeid Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-14 13:16:17 -07:00
Jonathan Marek	358b762400	drm/msm/dsi: fix 32-bit signed integer extension in pclk_rate calculation When (mode->clock * 1000) is larger than (1<<31), int to unsigned long conversion will sign extend the int to 64 bits and the pclk_rate value will be incorrect. Fix this by making the result of the multiplication unsigned. Note that above (1<<32) would still be broken and require more changes, but its unlikely anyone will need that anytime soon. Fixes: `c4d8cfe516` ("drm/msm/dsi: add implementation for helper functions") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/618434/ Link: https://lore.kernel.org/r/20241007050157.26855-2-jonathan@marek.ca Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-14 13:16:17 -07:00
Jonathan Marek	24436a540d	drm/msm/dsi: improve/fix dsc pclk calculation drm_mode_vrefresh() can introduce a large rounding error, avoid it. Fixes: `7c9e4a554d` ("drm/msm/dsi: Reduce pclk rate for compression") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/618432/ Link: https://lore.kernel.org/r/20241007050157.26855-1-jonathan@marek.ca Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-14 13:16:17 -07:00
Dmitry Baryshkov	f260ed880c	drm/msm/hdmi: drop pll_cmp_to_fdata from hdmi_phy_8998 The pll_cmp_to_fdata() was never used by the working code. Drop it to prevent warnings with W=1 and clang. Reported-by: Jani Nikula <jani.nikula@intel.com> Closes: https://lore.kernel.org/dri-devel/3553b1db35665e6ff08592e35eb438a574d1ad65.1725962479.git.jani.nikula@intel.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Fixes: `caedbf17c4` ("drm/msm: add msm8998 hdmi phy/pll support") Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/615348/ Link: https://lore.kernel.org/r/20240922-msm-drop-unused-func-v1-1-c5dc083415b8@linaro.org Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-14 13:16:16 -07:00
Dmitry Baryshkov	3a0851b442	drm/msm/dpu: check for overflow in _dpu_crtc_setup_lm_bounds() Make _dpu_crtc_setup_lm_bounds() check that CRTC width is not overflowing LM requirements. Rename the function accordingly. Fixes: `25fdd5933e` ("drm/msm: Add SDM845 DPU support") Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Tested-by: Abhinav Kumar <quic_abhinavk@quicinc.com> # sc7280 Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/612237/ Link: https://lore.kernel.org/r/20240903-dpu-mode-config-width-v6-3-617e1ecc4b7a@linaro.org Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-14 13:16:16 -07:00
Dmitry Baryshkov	3ae133b019	drm/msm/dpu: move CRTC resource assignment to dpu_encoder_virt_atomic_check Historically CRTC resources (LMs and CTLs) were assigned in dpu_crtc_atomic_begin(). The commit `9222cdd27e` ("drm/msm/dpu: move hw resource tracking to crtc state") simply moved resources to struct dpu_crtc_state, without changing the code sequence. Later on the commit `b107603b4a` ("drm/msm/dpu: map mixer/ctl hw blocks in encoder modeset") rearanged the code, but still kept the cstate->num_mixers assignment to happen during commit phase. This makes dpu_crtc_state inconsistent between consequent atomic_check() calls. Move CRTC resource assignment to happen at the end of dpu_encoder_virt_atomic_check(). Fixes: `b107603b4a` ("drm/msm/dpu: map mixer/ctl hw blocks in encoder modeset") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/612235/ Link: https://lore.kernel.org/r/20240903-dpu-mode-config-width-v6-2-617e1ecc4b7a@linaro.org Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-14 13:16:16 -07:00
Dmitry Baryshkov	bfecbc2cfb	drm/msm/dpu: make sure phys resources are properly initialized The commit `b954fa6baa` ("drm/msm/dpu: Refactor rm iterator") removed zero-init of the hw_ctl array, but didn't change the error condition, that checked for hw_ctl[i] being NULL. At the same time because of the early returns in case of an error dpu_encoder_phys might be left with the resources assigned in the previous state. Rework assigning of hw_pp / hw_ctl to the dpu_encoder_phys in order to make sure they are always set correctly. Fixes: `b954fa6baa` ("drm/msm/dpu: Refactor rm iterator") Suggested-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/612233/ Link: https://lore.kernel.org/r/20240903-dpu-mode-config-width-v6-1-617e1ecc4b7a@linaro.org Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>	2024-10-14 13:16:16 -07:00
Arnd Bergmann	629253b2f6	firmware: arm_ffa: Avoid string-fortify warning in export_uuid() Copying to a 16 byte structure into an 8-byte struct member causes a compile-time warning: \| In file included from drivers/firmware/arm_ffa/driver.c:25: \| In function 'fortify_memcpy_chk', \| inlined from 'export_uuid' at include/linux/uuid.h:88:2, \| inlined from 'ffa_msg_send_direct_req2' at drivers/firmware/arm_ffa/driver.c:488:2: \| include/linux/fortify-string.h:571:25: error: call to '__write_overflow_field' \| declared with attribute warning: detected write beyond size of field \| (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] \| __write_overflow_field(p_size_field, size); Use a union for the conversion instead and make sure the byte order is fixed in the process. Fixes: `aaef3bc981` ("firmware: arm_ffa: Add support for FFA_MSG_SEND_DIRECT_{REQ,RESP}2") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Message-Id: <20240909110938.247976-1-arnd@kernel.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2024-10-14 21:07:27 +01:00
Steven Rostedt	2cf9733891	ring-buffer: Fix refcount setting of boot mapped buffers A ring buffer which has its buffered mapped at boot up to fixed memory should not be freed. Other buffers can be. The ref counting setup was wrong for both. It made the not mapped buffers ref count have zero, and the boot mapped buffer a ref count of 1. But an normally allocated buffer should be 1, where it can be removed. Keep the ref count of a normal boot buffer with its setup ref count (do not decrement it), and increment the fixed memory boot mapped buffer's ref count. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241011165224.33dd2624@gandalf.local.home Fixes: `e645535a95` ("tracing: Add option to use memmapped memory for trace boot instance") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-10-14 14:30:59 -04:00
Linus Torvalds	eca631b8fe	Merge tag 'f2fs-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs fix from Jaegeuk Kim: "An urgent fix to resolve DIO read performance regression caused by 'f2fs: fix to avoid racing in between read and OPU dio write'" * tag 'f2fs-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: f2fs: allow parallel DIO reads	2024-10-14 11:19:19 -07:00
Linus Torvalds	63fa605041	Merge tag 'erofs-for-6.12-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: "The main one fixes a syzbot issue due to the invalid inode type out of file-backed mounts. The others are minor cleanups without actual logic changes. Summary: - Make sure only regular inodes can be used for file-backed mounts - Two minor codebase cleanups" * tag 'erofs-for-6.12-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: get rid of kaddr in `struct z_erofs_maprecorder` erofs: get rid of z_erofs_try_to_claim_pcluster() erofs: ensure regular inodes for file-backed mounts	2024-10-14 11:12:09 -07:00
Jai Luthra	d35f406429	dmaengine: ti: k3-udma: Set EOP for all TRs in cyclic BCDMA transfer When receiving data in cyclic mode from PDMA peripherals, where reload count is set to infinite, any TR in the set can potentially be the last one of the overall transfer. In such cases, the EOP flag needs to be set in each TR and PDMA's Static TR "Z" parameter should be set, matching the size of the TR. This is required for the teardown to function properly and cleanup the internal state memory. This only affects platforms using BCDMA and not those using UDMA-P, which could set EOP flag in the teardown TR automatically. Similarly when transmitting data in cyclic mode to PDMA peripherals, the EOP flag needs to be set to get the teardown completion signal correctly. Fixes: `0177947397` ("dmaengine: ti: k3-udma: Initial support for K3 BCDMA") Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Verdin AM62 Signed-off-by: Jai Luthra <j-luthra@ti.com> Signed-off-by: Jai Luthra <jai.luthra@linux.dev> Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Link: https://lore.kernel.org/r/20240930-z_cnt-v2-1-9d38aba149a2@linux.dev Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-14 23:41:05 +05:30
Wolfram Sang	6e9c5c8ef2	dmaengine: sh: rz-dmac: handle configs where one address is zero Configs like the ones coming from the MMC subsystem will have either 'src' or 'dst' zeroed, resulting in an unknown bus width. This will bail out on the RZ DMA driver because of the sanity check for a valid bus width. Reorder the code, so that the check will only be applied when the corresponding address is non-zero. Fixes: `5000d37042` ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Biju Das <biju.das.jz@bp.renesas.com> Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://lore.kernel.org/r/20241007110200.43166-6-wsa+renesas@sang-engineering.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-14 23:10:58 +05:30
Cong Yang	fcf38bc321	drm/panel: himax-hx83102: Adjust power and gamma to optimize brightness The current panel brightness is only 360 nit. Adjust the power and gamma to optimize the panel brightness. The brightness after adjustment is 390 nit. Fixes: `3179338750` ("drm/panel: himax-hx83102: Support for IVO t109nw41 MIPI-DSI panel") Signed-off-by: Cong Yang <yangcong5@huaqin.corp-partner.google.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241011020819.1254157-1-yangcong5@huaqin.corp-partner.google.com	2024-10-14 10:00:45 -07:00
Joey Gouly	f56d8d2389	Documentation/protection-keys: add AArch64 to documentation As POE support was recently added, update the documentation. Also note that kernel threads have a default protection key register value. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20241001133618.1547996-3-joey.gouly@arm.com [will: Adjusted wording based on feedback from Kevin] Signed-off-by: Will Deacon <will@kernel.org>	2024-10-14 17:27:48 +01:00
Joey Gouly	e3e8527133	arm64: set POR_EL0 for kernel threads Restrict kernel threads to only have RWX overlays for pkey 0. This matches what arch/x86 does, by defaulting to a restrictive PKRU. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Kevin Brodsky <Kevin.Brodsky@arm.com> Link: https://lore.kernel.org/r/20241001133618.1547996-2-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-14 17:22:47 +01:00
Jakub Kicinski	0b84db5d8f	MAINTAINERS: add Andrew Lunn as a co-maintainer of all networking drivers Andrew has been a pillar of the community for as long as I remember. Focusing on embedded networking, co-maintaining Ethernet PHYs and DSA code, but also actively reviewing MAC and integrated NIC drivers. Elevate Andrew to the status of co-maintainer of all netdev drivers. Acked-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20241011193303.2461769-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-14 07:49:41 -07:00
Ming Lei	c25c0c9035	blk-mq: setup queue ->tag_set before initializing hctx Commit `7b815817aa` ("blk-mq: add helper for checking if one CPU is mapped to specified hctx") needs to check queue mapping via tag set in hctx's cpuhp handler. However, q->tag_set may not be setup yet when the cpuhp handler is enabled, then kernel oops is triggered. Fix the issue by setup queue tag_set before initializing hctx. Cc: stable@vger.kernel.org Reported-and-tested-by: Rick Koch <mr.rickkoch@gmail.com> Closes: https://lore.kernel.org/linux-block/CANa58eeNDozLaBHKPLxSAhEy__FPfJT_F71W=sEQw49UCrC9PQ@mail.gmail.com Fixes: `7b815817aa` ("blk-mq: add helper for checking if one CPU is mapped to specified hctx") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241014005115.2699642-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-14 08:17:07 -06:00
Baojun Xu	1e9c708dc3	ALSA: hda/tas2781: Add new quirk for Lenovo, ASUS, Dell projects Add new vendor_id and subsystem_id in quirk for Lenovo, ASUS, and Dell projects. Signed-off-by: Baojun Xu <baojun.xu@ti.com> Link: https://patch.msgid.link/20241011074040.524-1-baojun.xu@ti.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-14 11:59:54 +02:00
Kent Overstreet	5e3b72324d	bcachefs: Fix sysfs warning in fstests generic/730,731 sysfs warns if we're removing a symlink from a directory that's no longer in sysfs; this is triggered by fstests generic/730, which simulates hot removal of a block device. This patch is however not a correct fix, since checking kobj->state_in_sysfs on a kobj owned by another subsystem is racy. A better fix would be to add the appropriate check to sysfs_remove_link() - and sysfs_create_link() as well. But kobject_add_internal()/kobject_del() do not as of today have locking that would support that. Note that the block/holder.c code appears to be subject to this race as well. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-14 05:43:01 -04:00
Andrei Simion	3692a4ccac	MAINTAINERS: Update maintainer list for MICROCHIP ASOC, SSC and MCP16502 drivers To help Claudiu and offload the work, add myself to the maintainer list for those drivers. Acked-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Andrei Simion <andrei.simion@microchip.com> Link: https://patch.msgid.link/20241014092830.46709-1-andrei.simion@microchip.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-14 10:31:56 +01:00
Peter Zijlstra	cd9626e9eb	sched/fair: Fix external p->on_rq users Sean noted that ever since commit `152e11f6df` ("sched/fair: Implement delayed dequeue") KVM's preemption notifiers have started mis-classifying preemption vs blocking. Notably p->on_rq is no longer sufficient to determine if a task is runnable or blocked -- the aforementioned commit introduces tasks that remain on the runqueue even through they will not run again, and should be considered blocked for many cases. Add the task_is_runnable() helper to classify things and audit all external users of the p->on_rq state. Also add a few comments. Fixes: `152e11f6df` ("sched/fair: Implement delayed dequeue") Reported-by: Sean Christopherson <seanjc@google.com> Tested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20241010091843.GK33184@noisy.programming.kicks-ass.net	2024-10-14 09:14:35 +02:00
Johannes Weiner	c650812419	sched/psi: Fix mistaken CPU pressure indication after corrupted task state bug Since sched_delayed tasks remain queued even after blocking, the load balancer can migrate them between runqueues while PSI considers them to be asleep. As a result, it misreads the migration requeue followed by a wakeup as a double queue: psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=. set=4 First, call psi_enqueue() after p->sched_class->enqueue_task(). A wakeup will clear p->se.sched_delayed while a migration will not, so psi can use that flag to tell them apart. Then teach psi to migrate any "sleep" state when delayed-dequeue tasks are being migrated. Delayed-dequeue tasks can be revived by ttwu_runnable(), which will call down with a new ENQUEUE_DELAYED. Instead of further complicating the wakeup conditional in enqueue_task(), identify migration contexts instead and default to wakeup handling for all other cases. It's not just the warning in dmesg, the task state corruption causes a permanent CPU pressure indication, which messes with workload/machine health monitoring. Debugged-by-and-original-fix-by: K Prateek Nayak <kprateek.nayak@amd.com> Fixes: `152e11f6df` ("sched/fair: Implement delayed dequeue") Closes: https://lore.kernel.org/lkml/20240830123458.3557-1-spasswolf@web.de/ Closes: https://lore.kernel.org/all/cd67fbcd-d659-4822-bb90-7e8fbb40a856@molgen.mpg.de/ Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lkml.kernel.org/r/20241010193712.GC181795@cmpxchg.org	2024-10-14 09:11:42 +02:00
Kent Overstreet	cb6055e66f	bcachefs: Handle race between stripe reuse, invalidate_stripe_to_dev When creating a new stripe, we may reuse an existing stripe that has some empty and some nonempty blocks. Generally, the existing stripe won't change underneath us - except for block sector counts, which we copy to the new key in ec_stripe_key_update. But the device removal path can now invalidate stripe pointers to a device, and that can race with stripe reuse. Change ec_stripe_key_update() to check for and resolve this inconsistency. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-13 22:03:03 -04:00
Kent Overstreet	b1e562265e	bcachefs: Fix kasan splat in new_stripe_alloc_buckets() Update for BCH_SB_MEMBER_INVALID. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-13 22:03:01 -04:00
Linus Torvalds	6485cf5ea2	Merge tag 'hid-for-linus-2024101301' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fix for memory corruption regression in amd_sfh driver (Basavaraj Natikar) - fix for mis-reporting of BTN_TOOL_PEN and BTN_TOOL_RUBBER for AES sensors tools in Wacom driver (Jason Gerecke) - fix for unitialized variable use in intel-ish-hid driver (SurajSonawane2415) - a few device-specific quirks / device ID additions * tag 'hid-for-linus-2024101301' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: wacom: Hardcode (non-inverted) AES pens as BTN_TOOL_PEN HID: amd_sfh: Switch to device-managed dmam_alloc_coherent() HID: multitouch: Add quirk for HONOR MagicBook Art 14 touchpad HID: multitouch: Add support for B2402FVA track point HID: plantronics: Workaround for an unexcepted opposite volume key hid: intel-ish-hid: Fix uninitialized variable 'rv' in ish_fw_xfer_direct_dma	2024-10-13 16:35:20 -07:00
Kent Overstreet	9f25dbe0bf	bcachefs: Add missing validation for bch_stripe.csum_granularity_bits Reported-by: syzbot+f8c98a50c323635be65d@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-13 17:55:33 -04:00
Kent Overstreet	a319aeaebb	bcachefs: Fix missing bounds checks in bch2_alloc_read() We were checking that the alloc key was for a valid device, but not a valid bucket. This is the upgrade path from versions prior to bcachefs being mainlined. Reported-by: syzbot+a1b59c8e1a3f022fd301@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-13 17:55:33 -04:00
Kent Overstreet	573ddcdc56	bcachefs: fix uaf in bch2_dio_write_done() Reported-by: syzbot+19ad84d5133871207377@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-13 17:55:33 -04:00
Alice Ryhl	8b8ca9c25f	cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERS The HAVE_CFI_ICALL_NORMALIZE_INTEGERS option has some tricky conditions when KASAN or GCOV are turned on, as in that case we need some clang and rustc fixes [1][2] to avoid boot failures. The intent with the current setup is that you should be able to override the check and turn on the option if your clang/rustc has the fix. However, this override does not work in practice. Thus, use the new RUSTC_LLVM_VERSION to correctly implement the check for whether the fix is available. Additionally, remove KASAN_HW_TAGS from the list of incompatible options. The CFI_ICALL_NORMALIZE_INTEGERS option is incompatible with KASAN because LLVM will emit some constructors when using KASAN that are assigned incorrect CFI tags. These constructors are emitted due to use of -fsanitize=kernel-address or -fsanitize=kernel-hwaddress that are respectively passed when KASAN_GENERIC or KASAN_SW_TAGS are enabled. However, the KASAN_HW_TAGS option relies on hardware support for MTE instead and does not pass either flag. (Note also that KASAN_HW_TAGS does not `select CONSTRUCTORS`.) Link: https://github.com/llvm/llvm-project/pull/104826 [1] Link: https://github.com/rust-lang/rust/pull/129373 [2] Fixes: `4c66f8307a` ("cfi: encode cfi normalized integers + kasan/gcov bug in Kconfig") Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20241010-icall-detect-vers-v1-2-8f114956aa88@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2024-10-13 22:23:13 +02:00
Gary Guo	af0121c2d3	kbuild: rust: add `CONFIG_RUSTC_LLVM_VERSION` Each version of Rust supports a range of LLVM versions. There are cases where we want to gate a config on the LLVM version instead of the Rust version. Normalized cfi integer tags are one example [1]. The invocation of rustc-version is being moved from init/Kconfig to scripts/Kconfig.include for consistency with cc-version. Link: https://lore.kernel.org/all/20240925-cfi-norm-kasan-fix-v1-1-0328985cdf33@google.com/ [1] Signed-off-by: Gary Guo <gary@garyguo.net> Link: https://lore.kernel.org/r/20241011114040.3900487-1-gary@garyguo.net [ Added missing `-llvm` to the Usage documentation. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2024-10-13 22:22:28 +02:00
Heiko Thiery	2471787c1f	misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for OTP device By using NVMEM_DEVID_AUTO we support more than 1 device and automatically enumerate. Fixes: `0969001569` ("misc: microchip: pci1xxxx: Add support to read and write into PCI1XXXX OTP via NVMEM sysfs") Cc: stable@vger.kernel.org Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com> Reviewed-by: Michael Walle <mwalle@kernel.org> Link: https://lore.kernel.org/r/20241007071120.9522-2-heiko.thiery@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-13 18:17:57 +02:00
Heiko Thiery	3c2d73de49	misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for EEPROM device By using NVMEM_DEVID_AUTO we support more than 1 device and automatically enumerate. Fixes: `9ab5465349` ("misc: microchip: pci1xxxx: Add support to read and write into PCI1XXXX EEPROM via NVMEM sysfs") Cc: stable@vger.kernel.org Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com> Reviewed-by: Michael Walle <mwalle@kernel.org> Link: https://lore.kernel.org/r/20241007071120.9522-1-heiko.thiery@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-13 18:17:57 +02:00
Takashi Iwai	02ac3a9ef3	parport: Proper fix for array out-of-bounds access The recent fix for array out-of-bounds accesses replaced sprintf() calls blindly with snprintf(). However, since snprintf() returns the would-be-printed size, not the actually output size, the length calculation can still go over the given limit. Use scnprintf() instead of snprintf(), which returns the actually output letters, for addressing the potential out-of-bounds access properly. Fixes: `ab11dac93d` ("dev/parport: fix the array out-of-bounds risk") Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20240920103318.19271-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-13 18:17:35 +02:00
Greg Kroah-Hartman	7528cb0f65	Merge tag 'iio-fixes-for-6.12a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus Jonathan writes: IIO: 1st set of fixes for the 6.12 cycle. Most of this pull request is the result of Javier Carrasco doing a careful audit for missing Kconfig dependencies that luck has meant the random builds have never hit. The rest is the usual mix of old bugs that have surfaced and some fallout from the recent merge window. adi,ad5686 - Fix binding duplication of compatible strings. bosch,bma400 - Fix an uninitialized variable in the event tap handling. bosch,bmi323 - Fix several issues in the register saving and restore on suspend/resume sensiron,spd500 - Fix missing CRC8 dependency ti,op3001 - Fix a missing full-scale range value (values above this point were all reported wrongly) vishay,veml6030 - Fix a segmentation fault due to some type confusion. - Fix wrong ambient light sensor resolution. * tag 'iio-fixes-for-6.12a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (34 commits) iio: frequency: admv4420: fix missing select REMAP_SPI in Kconfig iio: frequency: {admv4420,adrf6780}: format Kconfig entries iio: adc: ad4695: Add missing Kconfig select iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency() iioc: dac: ltc2664: Fix span variable usage in ltc2664_channel_config() iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig iio: frequency: adf4377: add missing select REMAP_SPI in Kconfig iio: resolver: ad2s1210: add missing select (TRIGGERED_)BUFFER in Kconfig iio: resolver: ad2s1210 add missing select REGMAP in Kconfig iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: pressure: bm1390: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: magnetometer: af8133j: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: light: bu27008: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: chemical: ens160: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: dac: ad5766: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: dac: ad3552r: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig ...	2024-10-13 17:23:47 +02:00
Pranjal Ramajor Asha Kanojiya	c5e8e93897	accel/qaic: Fix the for loop used to walk SG table Only for_each_sgtable_dma_sg() should be used to walk through a SG table to grab correct bus address and length pair after calling DMA MAP API on a SG table as DMA MAP APIs updates the SG table and for_each_sgtable_sg() walks through the original SG table. Fixes: `ff13be8303` ("accel/qaic: Add datapath") Fixes: `129776ac2e` ("accel/qaic: Add control path") Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241004193252.3888544-1-quic_jhugo@quicinc.com	2024-10-12 14:55:55 -06:00
Sergey Matsievskiy	93b8ddc545	pinctrl: ocelot: fix system hang on level based interrupts The current implementation only calls chained_irq_enter() and chained_irq_exit() if it detects pending interrupts. ``` for (i = 0; i < info->stride; i++) { uregmap_read(info->map, id_reg + 4 * i, &reg); if (!reg) continue; chained_irq_enter(parent_chip, desc); ``` However, in case of GPIO pin configured in level mode and the parent controller configured in edge mode, GPIO interrupt might be lowered by the hardware. In the result, if the interrupt is short enough, the parent interrupt is still pending while the GPIO interrupt is cleared; chained_irq_enter() never gets called and the system hangs trying to service the parent interrupt. Moving chained_irq_enter() and chained_irq_exit() outside the for loop ensures that they are called even when GPIO interrupt is lowered by the hardware. The similar code with chained_irq_enter() / chained_irq_exit() functions wrapping interrupt checking loop may be found in many other drivers: ``` grep -r -A 10 chained_irq_enter drivers/pinctrl ``` Cc: stable@vger.kernel.org Signed-off-by: Sergey Matsievskiy <matsievskiysv@gmail.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/20241012105743.12450-2-matsievskiysv@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2024-10-12 22:04:38 +02:00
Bartosz Golaszewski	1d59d474e1	PCI: Hold rescan lock while adding devices during host probe Since adding the PCI power control code, we may end up with a race between the pwrctl platform device rescanning the bus and host controller probe functions. The latter need to take the rescan lock when adding devices or we may end up in an undefined state having two incompletely added devices and hit the following crash when trying to remove the device over sysfs: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Internal error: Oops: 0000000096000004 [#1] SMP Call trace: __pi_strlen+0x14/0x150 kernfs_find_ns+0x80/0x13c kernfs_remove_by_name_ns+0x54/0xf0 sysfs_remove_bin_file+0x24/0x34 pci_remove_resource_files+0x3c/0x84 pci_remove_sysfs_dev_files+0x28/0x38 pci_stop_bus_device+0x8c/0xd8 pci_stop_bus_device+0x40/0xd8 pci_stop_and_remove_bus_device_locked+0x28/0x48 remove_store+0x70/0xb0 dev_attr_store+0x20/0x38 sysfs_kf_write+0x58/0x78 kernfs_fop_write_iter+0xe8/0x184 vfs_write+0x2dc/0x308 ksys_write+0x7c/0xec Fixes: `4565d2652a` ("PCI/pwrctl: Add PCI power control core code") Link: https://lore.kernel.org/r/20241003084342.27501-1-brgl@bgdev.pl Reported-by: Konrad Dybcio <konradybcio@kernel.org> Tested-by: Konrad Dybcio <konradybcio@kernel.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>	2024-10-12 09:31:44 -05:00
Krzysztof Kozlowski	b930d86478	ASoC: qcom: Select missing common Soundwire module code on SDM845 SDM845 sound card driver uses qcom_snd_sdw_startup() from the common Soundwire module, so select it to fix build failures: ERROR: modpost: "qcom_snd_sdw_startup" [sound/soc/qcom/snd-soc-sdm845.ko] undefined! Fixes: `d0e806b0cc` ("ASoC: qcom: sdm845: add missing soundwire runtime stream alloc") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241012100957.129103-1-krzysztof.kozlowski@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-12 11:23:51 +01:00
Kent Overstreet	c986dd7ecb	bcachefs: Improve check_snapshot_exists() Check if we have snapshot_trees or subvolumes that refer to the snapshot node being reconstructed, and use them. With this, the kill_btree_root test that blows away the snapshots btree now passes, and we're able to successfully reconstruct. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-12 05:02:48 -04:00
Kent Overstreet	9183c2b11e	bcachefs: Fix bkey_nocow_lock() This fixes an assertion pop in nocow_locking.c 00243 kernel BUG at fs/bcachefs/nocow_locking.c:41! 00243 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP 00243 Modules linked in: 00243 Hardware name: linux,dummy-virt (DT) 00243 pstate: 60001005 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--) 00244 pc : bch2_bucket_nocow_unlock (/home/testdashboard/linux-7/fs/bcachefs/nocow_locking.c:41) 00244 lr : bkey_nocow_lock (/home/testdashboard/linux-7/fs/bcachefs/data_update.c:79) 00244 sp : ffffff80c82373b0 00244 x29: ffffff80c82373b0 x28: ffffff80e08958c0 x27: ffffff80e0880000 00244 x26: ffffff80c8237a98 x25: 00000000000000a0 x24: ffffff80c8237ab0 00244 x23: 00000000000000c0 x22: 0000000000000008 x21: 0000000000000000 00244 x20: ffffff80c8237a98 x19: 0000000000000018 x18: 0000000000000000 00244 x17: 0000000000000000 x16: 000000000000003f x15: 0000000000000000 00244 x14: 0000000000000008 x13: 0000000000000018 x12: 0000000000000000 00244 x11: 0000000000000000 x10: ffffff80e0880000 x9 : ffffffc0803ac1a4 00244 x8 : 0000000000000018 x7 : ffffff80c8237a88 x6 : ffffff80c8237ab0 00244 x5 : ffffff80e08988d0 x4 : 00000000ffffffff x3 : 0000000000000000 00244 x2 : 0000000000000004 x1 : 0003000000000d1e x0 : ffffff80e08988c0 00244 Call trace: 00244 bch2_bucket_nocow_unlock (/home/testdashboard/linux-7/fs/bcachefs/nocow_locking.c:41) 00245 bch2_data_update_init (/home/testdashboard/linux-7/fs/bcachefs/data_update.c:627 (discriminator 1)) 00245 promote_alloc.isra.0 (/home/testdashboard/linux-7/fs/bcachefs/io_read.c:242 /home/testdashboard/linux-7/fs/bcachefs/io_read.c:304) 00245 __bch2_read_extent (/home/testdashboard/linux-7/fs/bcachefs/io_read.c:949) 00246 __bch2_read (/home/testdashboard/linux-7/fs/bcachefs/io_read.c:1215) 00246 bch2_direct_IO_read (/home/testdashboard/linux-7/fs/bcachefs/fs-io-direct.c:132) 00246 bch2_read_iter (/home/testdashboard/linux-7/fs/bcachefs/fs-io-direct.c:201) 00247 aio_read.constprop.0 (/home/testdashboard/linux-7/fs/aio.c:1602) 00247 io_submit_one.constprop.0 (/home/testdashboard/linux-7/fs/aio.c:2003 /home/testdashboard/linux-7/fs/aio.c:2052) 00248 __arm64_sys_io_submit (/home/testdashboard/linux-7/fs/aio.c:2111 /home/testdashboard/linux-7/fs/aio.c:2081 /home/testdashboard/linux-7/fs/aio.c:2081) 00248 invoke_syscall.constprop.0 (/home/testdashboard/linux-7/arch/arm64/include/asm/syscall.h:61 /home/testdashboard/linux-7/arch/arm64/kernel/syscall.c:54) 00248 ========= FAILED TIMEOUT tiering_variable_buckets_replicas in 1200s Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-12 05:01:52 -04:00
Kent Overstreet	672f75238e	bcachefs: Fix accounting replay flags BCH_TRANS_COMMIT_journal_reclaim without BCH_WATERMARK_reclaim means "return an error if low on journal space" - but accounting replay must succeed. Fixes https://github.com/koverstreet/bcachefs/issues/656 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-12 03:02:16 -04:00
Kent Overstreet	c1bd21bb65	bcachefs: Fix invalid shift in member_to_text() Reported-by: syzbot+064ce437a1ad63d3f6ef@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-12 03:02:16 -04:00
Kent Overstreet	7d84d9f449	bcachefs: Fix bch2_have_enough_devs() for BCH_SB_MEMBER_INVALID This fixes a kasan splat in the ec device removal tests. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-11 22:20:51 -04:00
Kalesh AP	dc5006cfcf	RDMA/bnxt_re: Fix the GID table length GID table length is reported by FW. The gid index which is passed to the driver during modify_qp/create_ah is restricted by the sgid_index field of struct ib_global_route. sgid_index is u8 and the max sgid possible is 256. Each GID entry in HW will have 2 GID entries in the kernel gid table. So we can support twice the gid table size reported by FW. Also, restrict the max GID to 256 also. Fixes: `847b97887e` ("RDMA/bnxt_re: Restrict the max_gids to 256") Link: https://patch.msgid.link/r/1728373302-19530-11-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:02 -03:00
Bhargava Chenna Marreddy	7988bdbbb8	RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages Avoid memory corruption while setting up Level-2 PBL pages for the non MR resources when num_pages > 256K. There will be a single PDE page address (contiguous pages in the case of > PAGE_SIZE), but, current logic assumes multiple pages, leading to invalid memory access after 256K PBL entries in the PDE. Fixes: `0c4dcd6028` ("RDMA/bnxt_re: Refactor hardware queue memory allocation") Link: https://patch.msgid.link/r/1728373302-19530-10-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:02 -03:00
Chandramohan Akula	2df411353d	RDMA/bnxt_re: Change the sequence of updating the CQ toggle value Currently the CQ toggle value in the shared page (read by the userlib) is updated as part of the cqn_handler. There is a potential race of application calling the CQ ARM doorbell immediately and using the old toggle value. Change the sequence of updating CQ toggle value to update in the bnxt_qplib_service_nq function immediately after reading the toggle value to be in sync with the HW updated value. Fixes: `e275919d96` ("RDMA/bnxt_re: Share a page to expose per CQ info with userspace") Link: https://patch.msgid.link/r/1728373302-19530-9-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00
Kalesh AP	a5e099e0c4	RDMA/bnxt_re: Fix an error path in bnxt_re_add_device In bnxt_re_add_device(), when register netdev notifier fails, driver is not unregistering the IB device in the error cleanup path. Also, removed the duplicate cleanup in error path of bnxt_re_probe. Fixes: `94a9dc6ac8` ("RDMA/bnxt_re: Group all operations under add_device and remove_device") Link: https://patch.msgid.link/r/1728373302-19530-8-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00
Selvin Xavier	8be3e5b0c9	RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop Driver waits indefinitely for the fifo occupancy to go below a threshold as soon as the pacing interrupt is received. This can cause soft lockup on one of the processors, if the rate of DB is very high. Add a loop count for FPGA and exit the __wait_for_fifo_occupancy_below_th if the loop is taking more time. Pacing will be continuing until the occupancy is below the threshold. This is ensured by the checks in bnxt_re_pacing_timer_exp and further scheduling the work for pacing based on the fifo occupancy. Fixes: `2ad4e6303a` ("RDMA/bnxt_re: Implement doorbell pacing algorithm") Link: https://patch.msgid.link/r/1728373302-19530-7-git-send-email-selvin.xavier@broadcom.com Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00
Kalesh AP	0ba9294da0	RDMA/bnxt_re: Fix a possible NULL pointer dereference There is a possibility of a NULL pointer dereference in the failure path of bnxt_re_add_device(). To address that, moved the update of "rdev->adev" to bnxt_re_dev_add(). Fixes: `dee3da3422` ("RDMA/bnxt_re: Change aux driver data to en_info to hold more information") Link: https://patch.msgid.link/r/1728373302-19530-6-git-send-email-selvin.xavier@broadcom.com Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-rdma/CAH-L+nMCwymKGqf5pd8-FZNhxEkDD=kb6AoCaE6fAVi7b3e5Qw@mail.gmail.com/T/#t Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00
Kalesh AP	98647df017	RDMA/bnxt_re: Return more meaningful error When the HWRM command fails, driver currently returns -EFAULT(Bad address). This does not look correct. Modified to return -EIO(I/O error). Fixes: `cc1ec769b8` ("RDMA/bnxt_re: Fixing the Control path command and response handling") Fixes: `65288a22dd` ("RDMA/bnxt_re: use shadow qd while posting non blocking rcfw command") Link: https://patch.msgid.link/r/1728373302-19530-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00
Kashyap Desai	87b4d8d28f	RDMA/bnxt_re: Fix incorrect dereference of srq in async event Currently driver is not getting correct srq. Dereference only if qplib has a valid srq. Fixes: `b02fd3f79e` ("RDMA/bnxt_re: Report async events and errors") Link: https://patch.msgid.link/r/1728373302-19530-4-git-send-email-selvin.xavier@broadcom.com Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00
Kalesh AP	a9e6e74439	RDMA/bnxt_re: Fix out of bound check Driver exports pacing stats only on GenP5 and P7 adapters. But while parsing the pacing stats, driver has a check for "rdev->dbr_pacing". This caused a trace when KASAN is enabled. BUG: KASAN: slab-out-of-bounds in bnxt_re_get_hw_stats+0x2b6a/0x2e00 [bnxt_re] Write of size 8 at addr ffff8885942a6340 by task modprobe/4809 Fixes: `8b6573ff34` ("bnxt_re: Update the debug counters for doorbell pacing") Link: https://patch.msgid.link/r/1728373302-19530-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00
Abhishek Mohapatra	ac6df53738	RDMA/bnxt_re: Fix the max CQ WQEs for older adapters Older adapters doesn't support the MAX CQ WQEs reported by older FW. So restrict the value reported to 1M always for older adapters. Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Link: https://patch.msgid.link/r/1728373302-19530-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Abhishek Mohapatra<abhishek.mohapatra@broadcom.com> Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:00 -03:00
Alessandro Zanni	174714f0e5	selftests: drivers: net: fix name not defined This fix solves this error, when calling kselftest with targets "drivers/net": File "tools/testing/selftests/net/lib/py/nsim.py", line 64, in __init__ if e.errno == errno.ENOSPC: NameError: name 'errno' is not defined The error was found by running tests manually with the command: make kselftest TARGETS="drivers/net" The module errno makes available standard error system symbols. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com> Link: https://patch.msgid.link/20241010183034.24739-1-alessandro.zanni87@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 16:01:00 -07:00
Alessandro Zanni	6ea8a1c28f	selftests: net/rds: add module not found This fix solves this error, when calling kselftest with targets "net/rds": The error was found by running tests manually with the command: make kselftest TARGETS="net/rds" The patch also specifies to import ip() function from the utils module. Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Link: https://patch.msgid.link/20241010194421.48198-1-alessandro.zanni87@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 15:59:53 -07:00
Wei Fang	1d7b2ce43d	net: enetc: add missing static descriptor and inline keyword Fix the build warnings when CONFIG_FSL_ENETC_MDIO is not enabled. The detailed warnings are shown as follows. include/linux/fsl/enetc_mdio.h:62:18: warning: no previous prototype for function 'enetc_hw_alloc' [-Wmissing-prototypes] 62 \| struct enetc_hw enetc_hw_alloc(struct device dev, void __iomem port_regs) \| ^ include/linux/fsl/enetc_mdio.h:62:1: note: declare 'static' if the function is not intended to be used outside of this translation unit 62 \| struct enetc_hw enetc_hw_alloc(struct device dev, void __iomem port_regs) \| ^ \| static 8 warnings generated. Fixes: `6517798dd3` ("enetc: Make MDIO accessors more generic and export to include/linux/fsl") Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410102136.jQHZOcS4-lkp@intel.com/ Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241011030103.392362-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 15:59:15 -07:00
Jakub Kicinski	0af8c8ae34	Merge branch 'net-enetc-fix-some-issues-of-xdp' Wei Fang says: ==================== net: enetc: fix some issues of XDP We found some bugs when testing the XDP function of enetc driver, and these bugs are easy to reproduce. This is not only causes XDP to not work, but also the network cannot be restored after exiting the XDP program. So the patch set is mainly to fix these bugs. For details, please see the commit message of each patch. v1: https://lore.kernel.org/bpf/20240919084104.661180-1-wei.fang@nxp.com/ v2: https://lore.kernel.org/netdev/20241008224806.2onzkt3gbslw5jxb@skbuf/ v3: https://lore.kernel.org/imx/20241009090327.146461-1-wei.fang@nxp.com/ ==================== Link: https://patch.msgid.link/20241010092056.298128-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 15:45:21 -07:00
Wei Fang	6b58fadd44	net: enetc: disable NAPI after all rings are disabled When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC on LS1028A, it was found that if the command was re-run multiple times, Rx could not receive the frames, and the result of xdp-bench showed that the rx rate was 0. root@ls1028ardb:~# ./xdp-bench tx eno0 Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc) Summary 2046 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s By observing the Rx PIR and CIR registers, CIR is always 0x7FF and PIR is always 0x7FE, which means that the Rx ring is full and can no longer accommodate other Rx frames. Therefore, the problem is caused by the Rx BD ring not being cleaned up. Further analysis of the code revealed that the Rx BD ring will only be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. Therefore, some debug logs were added to the driver and the current values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx BD ring was full. The logs are as follows. [ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140 [ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110 [ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110 From the results, the max value of xdp_tx_in_flight has reached 2140. However, the size of the Rx BD ring is only 2048. So xdp_tx_in_flight did not drop to 0 after enetc_stop() is called and the driver does not clear it. The root cause is that NAPI is disabled too aggressively, without having waited for the pending XDP_TX frames to be transmitted, and their buffers recycled, so that xdp_tx_in_flight cannot naturally drop to 0. Later, enetc_free_tx_ring() does free those stale, unsent XDP_TX packets, but it is not coded up to also reset xdp_tx_in_flight, hence the manifestation of the bug. One option would be to cover this extra condition in enetc_free_tx_ring(), but now that the ENETC_TX_DOWN exists, we have created a window at the beginning of enetc_stop() where NAPI can still be scheduled, but any concurrent enqueue will be blocked. Therefore, enetc_wait_bdrs() and enetc_disable_tx_bdrs() can be called with NAPI still scheduled, and it is guaranteed that this will not wait indefinitely, but instead give us an indication that the pending TX frames have orderly dropped to zero. Only then should we call napi_disable(). This way, enetc_free_tx_ring() becomes entirely redundant and can be dropped as part of subsequent cleanup. The change also refactors enetc_start() so that it looks like the mirror opposite procedure of enetc_stop(). Fixes: `ff58fda090` ("net: enetc: prioritize ability to go down over packet processing") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241010092056.298128-5-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 15:45:11 -07:00
Wei Fang	0a93f2ca4b	net: enetc: disable Tx BD rings after they are empty The Tx BD rings are disabled first in enetc_stop() and the driver waits for them to become empty. This operation is not safe while the ring is actively transmitting frames, and will cause the ring to not be empty and hardware exception. As described in the NETC block guide, software should only disable an active Tx ring after all pending ring entries have been consumed (i.e. when PI = CI). Disabling a transmit ring that is actively processing BDs risks a HW-SW race hazard whereby a hardware resource becomes assigned to work on one or more ring entries only to have those entries be removed due to the ring becoming disabled. When testing XDP_REDIRECT feautre, although all frames were blocked from being put into Tx rings during ring reconfiguration, the similar warning log was still encountered: fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #6 clear fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #7 clear The reason is that when there are still unsent frames in the Tx ring, disabling the Tx ring causes the remaining frames to be unable to be sent out. And the Tx ring cannot be restored, which means that even if the xdp program is uninstalled, the Tx frames cannot be sent out anymore. Therefore, correct the operation order in enect_start() and enect_stop(). Fixes: `ff58fda090` ("net: enetc: prioritize ability to go down over packet processing") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241010092056.298128-4-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 15:45:11 -07:00
Wei Fang	c728a95ccf	net: enetc: block concurrent XDP transmissions during ring reconfiguration When testing the XDP_REDIRECT function on the LS1028A platform, we found a very reproducible issue that the Tx frames can no longer be sent out even if XDP_REDIRECT is turned off. Specifically, if there is a lot of traffic on Rx direction, when XDP_REDIRECT is turned on, the console may display some warnings like "timeout for tx ring #6 clear", and all redirected frames will be dropped, the detailed log is as follows. root@ls1028ardb:~# ./xdp-bench redirect eno0 eno2 Redirecting from eno0 (ifindex 3; driver fsl_enetc) to eno2 (ifindex 4; driver fsl_enetc) [203.849809] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #5 clear [204.006051] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #6 clear [204.161944] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #7 clear eno0->eno2 1420505 rx/s 1420590 err,drop/s 0 xmit/s xmit eno0->eno2 0 xmit/s 1420590 drop/s 0 drv_err/s 15.71 bulk-avg eno0->eno2 1420484 rx/s 1420485 err,drop/s 0 xmit/s xmit eno0->eno2 0 xmit/s 1420485 drop/s 0 drv_err/s 15.71 bulk-avg By analyzing the XDP_REDIRECT implementation of enetc driver, the driver will reconfigure Tx and Rx BD rings when a bpf program is installed or uninstalled, but there is no mechanisms to block the redirected frames when enetc driver reconfigures rings. Similarly, XDP_TX verdicts on received frames can also lead to frames being enqueued in the Tx rings. Because XDP ignores the state set by the netif_tx_wake_queue() API, so introduce the ENETC_TX_DOWN flag to suppress transmission of XDP frames. Fixes: `c33bfaf91c` ("net: enetc: set up XDP program under enetc_reconfigure()") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241010092056.298128-3-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 15:45:11 -07:00
Wei Fang	412950d574	net: enetc: remove xdp_drops statistic from enetc_xdp_drop() The xdp_drops statistic indicates the number of XDP frames dropped in the Rx direction. However, enetc_xdp_drop() is also used in XDP_TX and XDP_REDIRECT actions. If frame loss occurs in these two actions, the frames loss count should not be included in xdp_drops, because there are already xdp_tx_drops and xdp_redirect_failures to count the frame loss of these two actions, so it's better to remove xdp_drops statistic from enetc_xdp_drop() and increase xdp_drops in XDP_DROP action. Fixes: `7ed2bc8007` ("net: enetc: add support for XDP_TX") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241010092056.298128-2-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 15:45:10 -07:00
Daniel Machon	8a6be4bd6f	net: sparx5: fix source port register when mirroring When port mirroring is added to a port, the bit position of the source port, needs to be written to the register ANA_AC_PROBE_PORT_CFG. This register is replicated for n_ports > 32, and therefore we need to derive the correct register from the port number. Before this patch, we wrongly calculate the register from portno / BITS_PER_BYTE, where the divisor ought to be 32, causing any port >=8 to be written to the wrong register. We fix this, by using do_div(), where the dividend is the register, the remainder is the bit position and the divisor is now 32. Fixes: `4e50d72b3b` ("net: sparx5: add port mirroring implementation") Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241009-mirroring-fix-v1-1-9ec962301989@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 15:42:51 -07:00
Xin Long	22600596b6	ipv4: give an IPv4 dev to blackhole_netdev After commit `8d7017fd62` ("blackhole_netdev: use blackhole_netdev to invalidate dst entries"), blackhole_netdev was introduced to invalidate dst cache entries on the TX path whenever the cache times out or is flushed. When two UDP sockets (sk1 and sk2) send messages to the same destination simultaneously, they are using the same dst cache. If the dst cache is invalidated on one path (sk2) while the other (sk1) is still transmitting, sk1 may try to use the invalid dst entry. CPU1 CPU2 udp_sendmsg(sk1) udp_sendmsg(sk2) udp_send_skb() ip_output() <--- dst timeout or flushed dst_dev_put() ip_finish_output2() ip_neigh_for_gw() This results in a scenario where ip_neigh_for_gw() returns -EINVAL because blackhole_dev lacks an in_dev, which is needed to initialize the neigh in arp_constructor(). This error is then propagated back to userspace, breaking the UDP application. The patch fixes this issue by assigning an in_dev to blackhole_dev for IPv4, similar to what was done for IPv6 in commit `e5f80fcf86` ("ipv6: give an IPv6 dev to blackhole_netdev"). This ensures that even when the dst entry is invalidated with blackhole_dev, it will not fail to create the neigh entry. As devinet_init() is called ealier than blackhole_netdev_init() in system booting, it can not assign the in_dev to blackhole_dev in devinet_init(). As Paolo suggested, add a separate late_initcall() in devinet.c to ensure inet_blackhole_dev_init() is called after blackhole_netdev_init(). Fixes: `8d7017fd62` ("blackhole_netdev: use blackhole_netdev to invalidate dst entries") Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/3000792d45ca44e16c785ebe2b092e610e5b3df1.1728499633.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-11 15:36:12 -07:00
Harshit Mogalapalli	3fd976afe9	pinctrl: nuvoton: fix a double free in ma35_pinctrl_dt_node_to_map_func() 'new_map' is allocated using devm_* which takes care of freeing the allocated data on device removal, call to .dt_free_map = pinconf_generic_dt_free_map double frees the map as pinconf_generic_dt_free_map() calls pinctrl_utils_free_map(). Fix this by using kcalloc() instead of auto-managed devm_kcalloc(). Cc: stable@vger.kernel.org Fixes: `f805e35631` ("pinctrl: nuvoton: Add ma35d1 pinctrl and GPIO driver") Reported-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/20241010205237.1245318-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2024-10-11 21:54:58 +02:00
John Allen	ee4d4e8d2c	x86/CPU/AMD: Only apply Zenbleed fix for Zen2 during late microcode load Commit `f69759be25` ("x86/CPU/AMD: Move Zenbleed check to the Zen2 init function") causes a bit in the DE_CFG MSR to get set erroneously after a microcode late load. The microcode late load path calls into amd_check_microcode() and subsequently zen2_zenbleed_check(). Since the above commit removes the cpu_has_amd_erratum() call from zen2_zenbleed_check(), this will cause all non-Zen2 CPUs to go through the function and set the bit in the DE_CFG MSR. Call into the Zenbleed fix path on Zen2 CPUs only. [ bp: Massage commit message, use cpu_feature_enabled(). ] Fixes: `f69759be25` ("x86/CPU/AMD: Move Zenbleed check to the Zen2 init function") Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240923164404.27227-1-john.allen@amd.com	2024-10-11 21:26:45 +02:00
Rafael J. Wysocki	940efc9fc8	Merge tag 'amd-pstate-v6.12-2024-10-10' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux Merge an amd-pstate fix for 6.12 from Mario Limonciello: "Fix an issue with changing amd-pstate modes at runtime on shared memory systems." * tag 'amd-pstate-v6.12-2024-10-10' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: cpufreq/amd-pstate: Fix amd_pstate mode switch on shared memory systems	2024-10-11 20:32:58 +02:00
Roi Martin	2ab5e243c2	btrfs: fix uninitialized pointer free on read_alloc_one_name() error The function read_alloc_one_name() does not initialize the name field of the passed fscrypt_str struct if kmalloc fails to allocate the corresponding buffer. Thus, it is not guaranteed that fscrypt_str.name is initialized when freeing it. This is a follow-up to the linked patch that fixes the remaining instances of the bug introduced by commit `e43eec81c5` ("btrfs: use struct qstr instead of name and namelen pairs"). Link: https://lore.kernel.org/linux-btrfs/20241009080833.1355894-1-jroi.martin@gmail.com/ Fixes: `e43eec81c5` ("btrfs: use struct qstr instead of name and namelen pairs") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Roi Martin <jroi.martin@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-11 19:55:04 +02:00
Christian Heusel	a0af4936e4	btrfs: send: cleanup unneeded return variable in changed_verity() As all changed_* functions need to return something, just return 0 directly here, as the verity status is passed via the context. Reported by LKP: fs/btrfs/send.c:6877:5-8: Unneeded variable: "ret". Return "0" on line 6883 Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202410092305.WbyqspH8-lkp@intel.com/ Signed-off-by: Christian Heusel <christian@heusel.eu> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-11 19:54:58 +02:00
Roi Martin	66691c6e2f	btrfs: fix uninitialized pointer free in add_inode_ref() The add_inode_ref() function does not initialize the "name" struct when it is declared. If any of the following calls to "read_one_inode() returns NULL, dir = read_one_inode(root, parent_objectid); if (!dir) { ret = -ENOENT; goto out; } inode = read_one_inode(root, inode_objectid); if (!inode) { ret = -EIO; goto out; } then "name.name" would be freed on "out" before being initialized. out: ... kfree(name.name); This issue was reported by Coverity with CID 1526744. Fixes: `e43eec81c5` ("btrfs: use struct qstr instead of name and namelen pairs") CC: stable@vger.kernel.org # 6.6+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Roi Martin <jroi.martin@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-11 19:54:52 +02:00
Breno Leitao	ee7ff15bf5	elevator: Remove argument from elevator_find_get Commit `e4eb37cc0f` ("block: Remove elevator required features") removed the usage of `struct request_queue` from elevator_find_get(), but didn't removed the argument. Remove the "struct request_queue *q" argument from elevator_find_get() given it is useless. Fixes: `e4eb37cc0f` ("block: Remove elevator required features") Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20241011155615.3361143-1-leitao@debian.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-11 11:11:09 -06:00
Breno Leitao	b4ff6e93bf	elevator: do not request_module if elevator exists Whenever an I/O elevator is changed, the system attempts to load a module for the new elevator. This occurs regardless of whether the elevator is already loaded or built directly into the kernel. This behavior introduces unnecessary overhead and potential issues. This makes the operation slower, and more error-prone. For instance, making the problem fixed by [1] visible for users that doesn't even rely on modules being available through modules. Do not try to load the ioscheduler if it is already visible. This change brings two main benefits: it improves the performance of elevator changes, and it reduces the likelihood of errors occurring during this process. [1] Commit `e3accac1a9` ("block: Fix elv_iosched_local_module handling of "none" scheduler") Fixes: `734e1a8603` ("block: Prevent deadlocks when switching elevators") Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20241011170122.3880087-1-leitao@debian.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-11 11:10:52 -06:00
Bart Van Assche	4d784c042d	RDMA/srpt: Make slab cache names unique Since commit `4c39529663` ("slab: Warn on duplicate cache names when DEBUG_VM=y"), slab complains about duplicate cache names. Hence this patch. The approach is as follows: - Maintain an xarray with the slab size as index and a reference count and a kmem_cache pointer as contents. Use srpt-${slab_size} as kmem cache name. - Use 512-byte alignment for all slabs instead of only for some of the slabs. - Increment the reference count instead of calling kmem_cache_create(). - Decrement the reference count instead of calling kmem_cache_destroy(). Fixes: `5dabcd0456` ("RDMA/srpt: Add support for immediate data") Link: https://patch.msgid.link/r/20241009210048.4122518-1-bvanassche@acm.org Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Closes: https://lore.kernel.org/linux-block/xpe6bea7rakpyoyfvspvin2dsozjmjtjktpph7rep3h25tv7fb@ooz4cu5z6bq6/ Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 14:07:33 -03:00
Alexander Zubkov	8cddfa535c	RDMA/irdma: Fix misspelling of "accept" There is "accept" misspelled as "accpet*" in the comments. Fix the spelling. Fixes: `146b9756f1` ("RDMA/irdma: Add connection manager") Link: https://patch.msgid.link/r/20241008161913.19965-1-green@qrator.net Signed-off-by: Alexander Zubkov <green@qrator.net> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 14:06:09 -03:00
Anumula Murali Mohan Reddy	c659b405b8	RDMA/cxgb4: Fix RDMA_CM_EVENT_UNREACHABLE error for iWARP ip_dev_find() always returns real net_device address, whether traffic is running on a vlan or real device, if traffic is over vlan, filling endpoint struture with real ndev and an attempt to send a connect request will results in RDMA_CM_EVENT_UNREACHABLE error. This patch fixes the issue by using vlan_dev_real_dev(). Fixes: `830662f6f0` ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address") Link: https://patch.msgid.link/r/20241007132311.70593-1-anumula@chelsio.com Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 13:55:53 -03:00
Showrya M N	4e1e3dd88a	RDMA/siw: Add sendpage_ok() check to disable MSG_SPLICE_PAGES While running ISER over SIW, the initiator machine encounters a warning from skb_splice_from_iter() indicating that a slab page is being used in send_page. To address this, it is better to add a sendpage_ok() check within the driver itself, and if it returns 0, then MSG_SPLICE_PAGES flag should be disabled before entering the network stack. A similar issue has been discussed for NVMe in this thread: https://lore.kernel.org/all/20240530142417.146696-1-ofir.gal@volumez.com/ WARNING: CPU: 0 PID: 5342 at net/core/skbuff.c:7140 skb_splice_from_iter+0x173/0x320 Call Trace: tcp_sendmsg_locked+0x368/0xe40 siw_tx_hdt+0x695/0xa40 [siw] siw_qp_sq_process+0x102/0xb00 [siw] siw_sq_resume+0x39/0x110 [siw] siw_run_sq+0x74/0x160 [siw] kthread+0xd2/0x100 ret_from_fork+0x34/0x40 ret_from_fork_asm+0x1a/0x30 Link: https://patch.msgid.link/r/20241007125835.89942-1-showrya@chelsio.com Signed-off-by: Showrya M N <showrya@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 13:55:53 -03:00
Filipe Manana	97420be7bd	btrfs: use sector numbers as keys for the dirty extents xarray We are using the logical address ("bytenr") of an extent as the key for qgroup records in the dirty extents xarray. This is a problem because the xarrays use "unsigned long" for keys/indices, meaning that on a 32 bits platform any extent starting at or beyond 4G is truncated, which is a too low limitation as virtually everyone is using storage with more than 4G of space. This means a "bytenr" of 4G gets truncated to 0, and so does 8G and 16G for example, resulting in incorrect qgroup accounting. Fix this by using sector numbers as keys instead, that is, using keys that match the logical address right shifted by fs_info->sectorsize_bits, which is what we do for the fs_info->buffer_radix that tracks extent buffers (radix trees also use an "unsigned long" type for keys). This also makes the index space more dense which helps optimize the xarray (as mentioned at Documentation/core-api/xarray.rst). Fixes: `3cce39a8ca` ("btrfs: qgroup: use xarray to track dirty extents in transaction") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-10-11 18:33:35 +02:00
Namjae Jeon	a77e0e02af	ksmbd: add support for supplementary groups Even though system user has a supplementary group, It gets NT_STATUS_ACCESS_DENIED when attempting to create file or directory. This patch add KSMBD_EVENT_LOGIN_REQUEST_EXT/RESPONSE_EXT netlink events to get supplementary groups list. The new netlink event doesn't break backward compatibility when using old ksmbd-tools. Co-developed-by: Atte Heikkilä <atteh.mailbox@gmail.com> Signed-off-by: Atte Heikkilä <atteh.mailbox@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-10-11 11:02:14 -05:00
Florian Fainelli	db8f0b8088	firmware: arm_scmi: Give SMC transport precedence over mailbox Broadcom STB platforms have for historical reasons included both "arm,scmi-smc" and "arm,scmi" in their SCMI Device Tree node compatible string, in that order. After the commit `b53515fa17` ("firmware: arm_scmi: Make MBOX transport a standalone driver") and with a kernel configuration that enables both the SMC and the mailbox transports, we would probe the mailbox transport, but fail to complete since we would not have a mailbox driver available. With each SCMI transport being a platform driver with its own set of compatible strings to match, rather than an unique platform driver entry point, we no longer match from most specific to least specific. There is also no simple way for the mailbox driver to return -ENODEV and let another platform driver attempt probing. This leads to a platform with no SCMI provider, therefore all drivers depending upon SCMI resources are put on deferred probe forever. By keeping the SMC transport objects linked first, we can let the platform driver match the compatible string and probe successfully with no adverse effects on platforms using the mailbox transport. This is just the workaround to the issue observed which doesn't have any impact on the other platforms. Fixes: `b53515fa17` ("firmware: arm_scmi: Make MBOX transport a standalone driver") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Message-Id: <20241007235413.507860-1-florian.fainelli@broadcom.com> Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2024-10-11 16:52:42 +01:00
Su Hui	39b13dce1a	firmware: arm_scmi: Fix the double free in scmi_debugfs_common_setup() Clang static checker(scan-build) throws below warning： \| drivers/firmware/arm_scmi/driver.c:line 2915, column 2 \| Attempt to free released memory. When devm_add_action_or_reset() fails, scmi_debugfs_common_cleanup() will run twice which causes double free of 'dbg->name'. Remove the redundant scmi_debugfs_common_cleanup() to fix this problem. Fixes: `c3d4aed763` ("firmware: arm_scmi: Populate a common SCMI debugfs root") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Message-Id: <20241011104001.1546476-1-suhui@nfschina.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2024-10-11 16:52:42 +01:00
Jaegeuk Kim	332fade75d	f2fs: allow parallel DIO reads This fixes a regression which prevents parallel DIO reads. Fixes: `0cac51185e` ("f2fs: fix to avoid racing in between read and OPU dio write") Reviewed-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2024-10-11 15:12:07 +00:00
Dr. David Alan Gilbert	1e3fc20000	drbd: Remove unused conn_lowest_minor conn_lowest_minor() last use was removed by 2011 commit `69a227731a` ("drbd: Pass a peer device to a number of fuctions") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20241010204426.277535-1-linux@treblig.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-11 07:11:44 -06:00
Marc Zyngier	df5fd75ee3	KVM: arm64: Don't eagerly teardown the vgic on init error As there is very little ordering in the KVM API, userspace can instanciate a half-baked GIC (missing its memory map, for example) at almost any time. This means that, with the right timing, a thread running vcpu-0 can enter the kernel without a GIC configured and get a GIC created behind its back by another thread. Amusingly, it will pick up that GIC and start messing with the data structures without the GIC having been fully initialised. Similarly, a thread running vcpu-1 can enter the kernel, and try to init the GIC that was previously created. Since this GIC isn't properly configured (no memory map), it fails to correctly initialise. And that's the point where we decide to teardown the GIC, freeing all its resources. Behind vcpu-0's back. Things stop pretty abruptly, with a variety of symptoms. Clearly, this isn't good, we should be a bit more careful about this. It is obvious that this guest is not viable, as it is missing some important part of its configuration. So instead of trying to tear bits of it down, let's just mark it as dead. It means that any further interaction from userspace will result in -EIO. The memory will be released on the "normal" path, when userspace gives up. Cc: stable@vger.kernel.org Reported-by: Alexander Potapenko <glider@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241009183603.3221824-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-11 13:40:25 +01:00
Mika Westerberg	e9e1b20fae	thunderbolt: Fix KASAN reported stack out-of-bounds read in tb_retimer_scan() KASAN reported following issue: BUG: KASAN: stack-out-of-bounds in tb_retimer_scan+0xffe/0x1550 [thunderbolt] Read of size 4 at addr ffff88810111fc1c by task kworker/u56:0/11 CPU: 0 UID: 0 PID: 11 Comm: kworker/u56:0 Tainted: G U 6.11.0+ #1387 Tainted: [U]=USER Workqueue: thunderbolt0 tb_handle_hotplug [thunderbolt] Call Trace: <TASK> dump_stack_lvl+0x6c/0x90 print_report+0xd1/0x630 kasan_report+0xdb/0x110 __asan_report_load4_noabort+0x14/0x20 tb_retimer_scan+0xffe/0x1550 [thunderbolt] tb_scan_port+0xa6f/0x2060 [thunderbolt] tb_handle_hotplug+0x17b1/0x3080 [thunderbolt] process_one_work+0x626/0x1100 worker_thread+0x6c8/0xfa0 kthread+0x2c8/0x3a0 ret_from_fork+0x3a/0x80 ret_from_fork_asm+0x1a/0x30 This happens because the loop variable still gets incremented by one so max becomes 3 instead of 2, and this makes the second loop read past the the array declared on the stack. Fix this by assigning to max directly in the loop body. Fixes: `ff6ab055e0` ("thunderbolt: Add receiver lane margining support for retimers") CC: stable@vger.kernel.org Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2024-10-11 14:23:15 +03:00
Shengjiu Wang	54c805c1eb	ASoC: fsl_esai: change dev_warn to dev_dbg in irq handler Irq handler need to be executed as fast as possible, so the log in irq handler is better to use dev_dbg which needs to be enabled when debugging. Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Reviewed-by: Iuliana Prodan <iuliana.prodan@nxp.com> Link: https://patch.msgid.link/1728622433-2873-1-git-send-email-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-11 12:06:49 +01:00
Lad Prabhakar	9b064d200a	ASoC: rsnd: Fix probe failure on HiHope boards due to endpoint parsing On the HiHope boards, we have a single port with a single endpoint defined as below: .... rsnd_port: port { rsnd_endpoint: endpoint { remote-endpoint = <&dw_hdmi0_snd_in>; dai-format = "i2s"; bitclock-master = <&rsnd_endpoint>; frame-master = <&rsnd_endpoint>; playback = <&ssi2>; }; }; .... With commit `547b02f74e` ("ASoC: rsnd: enable multi Component support for Audio Graph Card/Card2"), support for multiple ports was added. This caused probe failures on HiHope boards, as the endpoint could not be retrieved due to incorrect device node pointers being used. This patch fixes the issue by updating the `rsnd_dai_of_node()` and `rsnd_dai_probe()` functions to use the correct device node pointers based on the port names ('port' or 'ports'). It ensures that the endpoint is properly parsed for both single and multi-port configurations, restoring compatibility with HiHope boards. Fixes: `547b02f74e` ("ASoC: rsnd: enable multi Component support for Audio Graph Card/Card2") Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/20241010141432.716868-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-11 12:06:49 +01:00
Colin Ian King	ca2803fadf	ASoC: max98388: Fix missing increment of variable slot_found The variable slot_found is being initialized to zero and inside a for-loop is being checked if it's reached MAX_NUM_CH, however, this is currently impossible since slot_found is never changed. In a previous loop a similar coding pattern is used and slot_found is being incremented. It appears the increment of slot_found is missing from the loop, so fix the code by adding in the increment. Fixes: `6a8e1d46f0` ("ASoC: max98388: add amplifier driver") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://patch.msgid.link/20241010182032.776280-1-colin.i.king@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-11 12:06:48 +01:00
Darrick J. Wong	0fb823f1cf	xfs: fix integer overflow in xrep_bmap The variable declaration in this function predates the merge of the nrext64 (aka 64-bit extent counters) feature, which means that the variable declaration type is insufficient to avoid an integer overflow. Fix that by redeclaring the variable to be xfs_extnum_t. Coverity-id: 1630958 Fixes: `8f71bede8e` ("xfs: repair inode fork block mapping data structures") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-10-11 12:32:48 +02:00
Arnd Bergmann	b72cd67a03	Merge tag 'arm-soc/for-6.12/devicetree-fixes' of https://github.com/Broadcom/stblinux into arm/fixes This pull request contains Broadcom ARM-based SoCs Device Tree fixes for 6.12, please pull the following: - Florian fixed the HDMI gpio pin which is connected to GPIO pin 0, not 1 * tag 'arm-soc/for-6.12/devicetree-fixes' of https://github.com/Broadcom/stblinux: ARM: dts: bcm2837-rpi-cm3-io3: Fix HDMI hpd-gpio pin Link: https://lore.kernel.org/r/20241008220440.23182-1-florian.fainelli@broadcom.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-11 10:03:30 +00:00
Arnd Bergmann	dec17c8b36	Merge tag 'soc_fsl-6.12-3' of https://github.com/chleroy/linux into arm/fixes FSL SOC fixes for v6.12: - Fix a "cast to pointer from integer of different size" build error due to IS_ERROR_VALUE() used with something which is not a pointer. - Fix an unused data build warning. * tag 'soc_fsl-6.12-3' of https://github.com/chleroy/linux: soc: fsl: cpm1: qmc: Fix unused data compilation warning soc: fsl: cpm1: qmc: Do not use IS_ERR_VALUE() on error pointers Link: https://lore.kernel.org/r/c954bdb0-0c16-491a-8662-37e58f07208f@csgroup.eu Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-11 10:03:13 +00:00
Krzysztof Kozlowski	29ce0bca6d	Documentation/process: maintainer-soc: clarify submitting patches Patches for SoCs are expected to be picked up by SoC submaintainers. The main SoC maintainers should be addressed only in few cases. Rewrite the section about maintainer handling to document above expectation. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Will Deacon <will@kernel.org> Cc: Kevin Hilman <khilman@baylibre.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Conor Dooley <conor@kernel.org> Cc: Heiko Stübner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240925095635.30452-1-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-11 09:56:53 +00:00
Alexander Sverdlin	26d77ce574	dmaengine: cirrus: check that output may be truncated ep93xx_dma.c: In function 'ep93xx_dma_of_probe': ep93xx_dma.c:1409:74: warning: '%u' directive output may be truncated writing between 1 and 8 bytes into a region of size 2 [-Wformat-truncation=] snprintf(dma_clk_name, sizeof(dma_clk_name), "m2p%u", i); ^~ Fixes: `d7333f9d33` ("dmaengine: cirrus: use snprintf() to calm down gcc 13.3.0") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409172024.pU8U5beA-lkp@intel.com/ Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Link: https://lore.kernel.org/r/2bf9c37aad8f085839f9c63104f7275742f51945.camel@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-11 09:55:47 +00:00
Alexander Sverdlin	5b484feb7a	dmaengine: cirrus: ERR_CAST() ioremap error ep93xx_dma.c:1354:37: sparse: sparse: incorrect type in return expression (different address spaces) ep93xx_dma.c:1354:37: sparse: expected struct ep93xx_dma_engine * ep93xx_dma.c:1354:37: sparse: got void [noderef] __iomem *regs Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409202250.fPlN2Erd-lkp@intel.com/ Fixes: 4e8ad5ed845b ("dmaengine: cirrus: Convert to DT for Cirrus EP93xx") Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Link: https://lore.kernel.org/r/d4b542f1d678796fbf094ebcc77295af3617bca0.camel@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-11 09:55:37 +00:00
Konstantin Ryabitsev	32af1c8af4	MAINTAINERS: use the canonical soc mailing list address and mark it as L: The soc@kernel.org address started out as a mail alias, but at some point became a mailing list. Use the canonical name of the list and properly mark it as L: instead of M:. Signed-off-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-10-11 09:53:12 +00:00
Zhu Jun	fd5f14c126	ALSA: scarlett2: Add error check after retrieving PEQ filter values Add error check after retrieving PEQ filter values in scarlett2_update_filter_values that ensure function returns error if PEQ filter value retrieval fails. Fixes: `b64678eb4e` ("ALSA: scarlett2: Add DSP controls") Signed-off-by: Zhu Jun <zhujun2@cmss.chinamobile.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241009092305.8570-1-zhujun2@cmss.chinamobile.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-11 11:17:14 +02:00
Murad Masimov	c9bd4a82b4	ALSA: hda/cs8409: Fix possible NULL dereference If snd_hda_gen_add_kctl fails to allocate memory and returns NULL, then NULL pointer dereference will occur in the next line. Since dolphin_fixups function is a hda_fixup function which is not supposed to return any errors, add simple check before dereference, ignore the fail. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `20e5077241` ("ALSA: hda/cs8409: Add support for dolphin") Signed-off-by: Murad Masimov <m.masimov@maxima.ru> Link: https://patch.msgid.link/20241010221649.1305-1-m.masimov@maxima.ru Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-11 11:14:06 +02:00
Jason Gerecke	2934b12281	HID: wacom: Hardcode (non-inverted) AES pens as BTN_TOOL_PEN Unlike EMR tools which encode type information in their tool ID, tools for AES sensors are all "generic pens". It is inappropriate to make use of the wacom_intuos_get_tool_type function when dealing with these kinds of devices. Instead, we should only ever report BTN_TOOL_PEN or BTN_TOOL_RUBBER, as depending on the state of the Eraser and Invert bits. Reported-by: Daniel Jutz <daniel@djutz.com> Closes: https://lore.kernel.org/linux-input/3cd82004-c5b8-4f2a-9a3b-d88d855c65e4@heusel.eu/ Bisected-by: Christian Heusel <christian@heusel.eu> Fixes: `9c2913b962` ("HID: wacom: more appropriate tool type categorization") Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/1041 Link: https://github.com/linuxwacom/input-wacom/issues/440 Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> Cc: stable@vger.kernel.org Acked-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-10-11 11:12:19 +02:00
Peter Zijlstra	f5aaff7bfa	sched/core: Dequeue PSI signals for blocked tasks that are delayed psi_dequeue() in for blocked task expects psi_sched_switch() to clear the TSK_.*RUNNING PSI flags and set the TSK_IOWAIT flags however psi_sched_switch() uses "!task_on_rq_queued(prev)" to detect if the task is blocked or still runnable which is no longer true with DELAY_DEQUEUE since a blocking task can be left queued on the runqueue. This can lead to PSI splats similar to: psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=0 set=4 when the task is requeued since the TSK_RUNNING flag was not cleared when the task was blocked. Explicitly communicate that the task was blocked to psi_sched_switch() even if it was delayed and is still on the runqueue. [ prateek: Broke off the relevant part from [1], commit message ] Fixes: `152e11f6df` ("sched/fair: Implement delayed dequeue") Closes: https://lore.kernel.org/lkml/20240830123458.3557-1-spasswolf@web.de/ Closes: https://lore.kernel.org/all/cd67fbcd-d659-4822-bb90-7e8fbb40a856@molgen.mpg.de/ Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Not-yet-signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/lkml/20241004123506.GR18071@noisy.programming.kicks-ass.net/ [1]	2024-10-11 10:49:33 +02:00
Peter Zijlstra	98442f0ccd	sched: Fix delayed_dequeue vs switched_from_fair() Commit `2e0199df25` ("sched/fair: Prepare exit/cleanup paths for delayed_dequeue") and its follow up fixes try to deal with a rather unfortunate situation where is task is enqueued in a new class, even though it shouldn't have been. Mostly because the existing ->switched_to/from() hooks are in the wrong place for this case. This all led to Paul being able to trigger failures at something like once per 10k CPU hours of RCU torture. For now, do the ugly thing and move the code to the right place by ignoring the switch hooks. Note: Clean up the whole sched_class::switch*_{to,from}() thing. Fixes: `2e0199df25` ("sched/fair: Prepare exit/cleanup paths for delayed_dequeue") Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241003185037.GA5594@noisy.programming.kicks-ass.net	2024-10-11 10:49:32 +02:00
Waiman Long	73ab05aa46	sched/core: Disable page allocation in task_tick_mm_cid() With KASAN and PREEMPT_RT enabled, calling task_work_add() in task_tick_mm_cid() may cause the following splat. [ 63.696416] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 [ 63.696416] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 610, name: modprobe [ 63.696416] preempt_count: 10001, expected: 0 [ 63.696416] RCU nest depth: 1, expected: 1 This problem is caused by the following call trace. sched_tick() [ acquire rq->__lock ] -> task_tick_mm_cid() -> task_work_add() -> __kasan_record_aux_stack() -> kasan_save_stack() -> stack_depot_save_flags() -> alloc_pages_mpol_noprof() -> __alloc_pages_noprof() -> get_page_from_freelist() -> rmqueue() -> rmqueue_pcplist() -> __rmqueue_pcplist() -> rmqueue_bulk() -> rt_spin_lock() The rq lock is a raw_spinlock_t. We can't sleep while holding it. IOW, we can't call alloc_pages() in stack_depot_save_flags(). The task_tick_mm_cid() function with its task_work_add() call was introduced by commit `223baf9d17` ("sched: Fix performance regression introduced by mm_cid") in v6.4 kernel. Fortunately, there is a kasan_record_aux_stack_noalloc() variant that calls stack_depot_save_flags() while not allowing it to allocate new pages. To allow task_tick_mm_cid() to use task_work without page allocation, a new TWAF_NO_ALLOC flag is added to enable calling kasan_record_aux_stack_noalloc() instead of kasan_record_aux_stack() if set. The task_tick_mm_cid() function is modified to add this new flag. The possible downside is the missing stack trace in a KASAN report due to new page allocation required when task_work_add_noallloc() is called which should be rare. Fixes: `223baf9d17` ("sched: Fix performance regression introduced by mm_cid") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241010014432.194742-1-longman@redhat.com	2024-10-11 10:49:32 +02:00
Phil Auld	d16b7eb6f5	sched/deadline: Use hrtick_enabled_dl() before start_hrtick_dl() The deadline server code moved one of the start_hrtick_dl() calls but dropped the dl specific hrtick_enabled check. This causes hrticks to get armed even when sched_feat(HRTICK_DL) is false. Fix it. Fixes: `63ba8422f8` ("sched/deadline: Introduce deadline servers") Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lore.kernel.org/r/20241004123729.460668-1-pauld@redhat.com	2024-10-11 10:49:32 +02:00
Justin Chen	1e48fd0574	phy: usb: disable COMMONONN for dual mode The COMMONONN bit suspends the phy when the port is put into a suspend state. However when the phy is shared between host and device in dual mode, this no longer works cleanly as there is no synchronization between the two. Fixes: `5095d045a9` ("phy: usb: Turn off phy when port is in suspend") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20241010185344.859865-1-justin.chen@broadcom.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-11 13:56:25 +05:30
Petr Vaganov	6889cd2a93	xfrm: fix one more kernel-infoleak in algo dumping During fuzz testing, the following issue was discovered: BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x598/0x2a30 _copy_to_iter+0x598/0x2a30 __skb_datagram_iter+0x168/0x1060 skb_copy_datagram_iter+0x5b/0x220 netlink_recvmsg+0x362/0x1700 sock_recvmsg+0x2dc/0x390 __sys_recvfrom+0x381/0x6d0 __x64_sys_recvfrom+0x130/0x200 x64_sys_call+0x32c8/0x3cc0 do_syscall_64+0xd8/0x1c0 entry_SYSCALL_64_after_hwframe+0x79/0x81 Uninit was stored to memory at: copy_to_user_state_extra+0xcc1/0x1e00 dump_one_state+0x28c/0x5f0 xfrm_state_walk+0x548/0x11e0 xfrm_dump_sa+0x1e0/0x840 netlink_dump+0x943/0x1c40 __netlink_dump_start+0x746/0xdb0 xfrm_user_rcv_msg+0x429/0xc00 netlink_rcv_skb+0x613/0x780 xfrm_netlink_rcv+0x77/0xc0 netlink_unicast+0xe90/0x1280 netlink_sendmsg+0x126d/0x1490 __sock_sendmsg+0x332/0x3d0 ____sys_sendmsg+0x863/0xc30 ___sys_sendmsg+0x285/0x3e0 __x64_sys_sendmsg+0x2d6/0x560 x64_sys_call+0x1316/0x3cc0 do_syscall_64+0xd8/0x1c0 entry_SYSCALL_64_after_hwframe+0x79/0x81 Uninit was created at: __kmalloc+0x571/0xd30 attach_auth+0x106/0x3e0 xfrm_add_sa+0x2aa0/0x4230 xfrm_user_rcv_msg+0x832/0xc00 netlink_rcv_skb+0x613/0x780 xfrm_netlink_rcv+0x77/0xc0 netlink_unicast+0xe90/0x1280 netlink_sendmsg+0x126d/0x1490 __sock_sendmsg+0x332/0x3d0 ____sys_sendmsg+0x863/0xc30 ___sys_sendmsg+0x285/0x3e0 __x64_sys_sendmsg+0x2d6/0x560 x64_sys_call+0x1316/0x3cc0 do_syscall_64+0xd8/0x1c0 entry_SYSCALL_64_after_hwframe+0x79/0x81 Bytes 328-379 of 732 are uninitialized Memory access of size 732 starts at ffff88800e18e000 Data copied to user address 00007ff30f48aff0 CPU: 2 PID: 18167 Comm: syz-executor.0 Not tainted 6.8.11 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Fixes copying of xfrm algorithms where some random data of the structure fields can end up in userspace. Padding in structures may be filled with random (possibly sensitve) data and should never be given directly to user-space. A similar issue was resolved in the commit `8222d5910d` ("xfrm: Zero padding when dumping algos and encap") Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: `c7a5899eb2` ("xfrm: redact SA secret with lockdown confidentiality") Cc: stable@vger.kernel.org Co-developed-by: Boris Tonofa <b.tonofa@ideco.ru> Signed-off-by: Boris Tonofa <b.tonofa@ideco.ru> Signed-off-by: Petr Vaganov <p.vaganov@ideco.ru> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2024-10-11 09:00:03 +02:00
Johan Hovold	be847a3a8d	serial: qcom-geni: rename suspend functions Drop the unnecessary "_sys" infix from the suspend PM ops. Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20241009145110.16847-10-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:39:24 +02:00
Johan Hovold	4cf4b344c1	serial: qcom-geni: drop unused receive parameter Serial drivers should not be dropping characters themselves, but at least drop the unused 'drop' parameter from the receive handler for now. Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241009145110.16847-9-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:39:23 +02:00
Johan Hovold	8173d74ac1	serial: qcom-geni: drop flip buffer WARN() Drop the unnecessary WARN() in case the TTY buffers are ever full in favour of a rate limited dev_err() which doesn't kill the machine when panic_on_warn is set. Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241009145110.16847-8-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:39:23 +02:00
Johan Hovold	c657243ae1	serial: qcom-geni: fix rx cancel dma status bit Cancelling an rx command is signalled using bit 14 of the rx DMA status register and not bit 11. This bit is currently unused, but this error becomes apparent, for example, when tracing the status register when closing the port. Fixes: `eddac5af06` ("soc: qcom: Add GENI based QUP Wrapper driver") Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241009145110.16847-7-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:39:23 +02:00
Johan Hovold	fa103d2599	serial: qcom-geni: fix receiver enable The receiver is supposed to be enabled in the startup() callback and not in set_termios() which is called also during console setup. This specifically avoids accepting input before the port has been opened (and interrupts enabled), something which can also break the GENI firmware (cancel fails and after abort, the "stale" counter handling appears to be broken so that later input is not processed until twelve chars have been received). There also does not appear to be any need to keep the receiver disabled while updating the port settings. Since commit `6f3c3cafb1` ("serial: qcom-geni: disable interrupts during console writes") the calls to manipulate the secondary interrupts, which were done without holding the port lock, can also lead to the receiver being left disabled when set_termios() races with the console code (e.g. when init opens the tty during boot). This can manifest itself as a serial getty not accepting input. The calls to stop and start rx in set_termios() can similarly race with DMA completion and, for example, cause the DMA buffer to be unmapped twice or the mapping to be leaked. Fix this by only enabling the receiver during startup and while holding the port lock to avoid racing with the console code. Fixes: `6f3c3cafb1` ("serial: qcom-geni: disable interrupts during console writes") Fixes: `2aaa43c707` ("tty: serial: qcom-geni-serial: add support for serial engine DMA") Fixes: `c4f528795d` ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP") Cc: stable@vger.kernel.org # 6.3 Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20241009145110.16847-6-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:39:23 +02:00
Johan Hovold	23ee4a2566	serial: qcom-geni: fix dma rx cancellation Make sure to wait for the DMA transfer to complete when cancelling the rx command on stop_rx(). This specifically prevents the DMA completion interrupt from firing after rx has been restarted, something which can lead to an IOMMU fault and hosed rx when the interrupt handler unmaps the DMA buffer for the new command: qcom_geni_serial 988000.serial: serial engine reports 0 RX bytes in! arm-smmu 15000000.iommu: FSR = 00000402 [Format=2 TF], SID=0x563 arm-smmu 15000000.iommu: FSYNR0 = 00210013 [S1CBNDX=33 WNR PLVL=3] Bluetooth: hci0: command 0xfc00 tx timeout Bluetooth: hci0: Reading QCA version information failed (-110) Also add the missing state machine reset which is needed in case cancellation fails. Fixes: `2aaa43c707` ("tty: serial: qcom-geni-serial: add support for serial engine DMA") Cc: stable@vger.kernel.org # 6.3 Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241009145110.16847-5-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:39:23 +02:00
Johan Hovold	23f5f5debc	serial: qcom-geni: fix shutdown race A commit adding back the stopping of tx on port shutdown failed to add back the locking which had also been removed by commit `e83766334f` ("tty: serial: qcom_geni_serial: No need to stop tx/rx on UART shutdown"). Holding the port lock is needed to serialise against the console code, which may update the interrupt enable register and access the port state. Fixes: `d8aca2f968` ("tty: serial: qcom-geni-serial: stop operations in progress at shutdown") Fixes: `947cc4ecc0` ("serial: qcom-geni: fix soft lockup on sw flow control and suspend") Cc: stable@vger.kernel.org # 6.3 Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20241009145110.16847-4-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:39:23 +02:00
Johan Hovold	19df76662a	serial: qcom-geni: revert broken hibernation support This reverts commit `35781d8356`. Hibernation is not supported on Qualcomm platforms with mainline kernels yet a broken vendor implementation for the GENI serial driver made it upstream. This is effectively dead code that cannot be tested and should just be removed, but if these paths were ever hit for an open non-console port they would crash the machine as the driver would fail to enable clocks during restore() (i.e. all ports would have to be closed by drivers and user space before hibernating the system to avoid this as a comment in the code hinted at). The broken implementation also added a random call to enable the receiver in the port setup code where it does not belong and which enables the receiver prematurely for console ports. Fixes: `35781d8356` ("tty: serial: qcom-geni-serial: Add support for Hibernation feature") Cc: stable@vger.kernel.org # 6.2 Cc: Aniket Randive <quic_arandive@quicinc.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241009145110.16847-3-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:39:23 +02:00
Johan Hovold	4bef7c6f29	serial: qcom-geni: fix polled console initialisation The polled console (KGDB/KDB) implementation must not call port setup unconditionally as the port may already be in use by the console or a getty. Only make sure that the receiver is enabled, but do not enable any device interrupts. Fixes: `d8851a96ba` ("tty: serial: qcom-geni-serial: Add a poll_init() function") Cc: stable@vger.kernel.org # 6.4 Cc: Douglas Anderson <dianders@chromium.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20241009145110.16847-2-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:39:23 +02:00
Marek Vasut	40d7903386	serial: imx: Update mctrl old_status on RTSD interrupt When sending data using DMA at high baudrate (4 Mbdps in local test case) to a device with small RX buffer which keeps asserting RTS after every received byte, it is possible that the iMX UART driver would not recognize the falling edge of RTS input signal and get stuck, unable to transmit any more data. This condition happens when the following sequence of events occur: - imx_uart_mctrl_check() is called at some point and takes a snapshot of UART control signal status into sport->old_status using imx_uart_get_hwmctrl(). The RTSS/TIOCM_CTS bit is of interest here (). - DMA transfer occurs, the remote device asserts RTS signal after each byte. The i.MX UART driver recognizes each such RTS signal change, raises an interrupt with USR1 register RTSD bit set, which leads to invocation of __imx_uart_rtsint(), which calls uart_handle_cts_change(). - If the RTS signal is deasserted, uart_handle_cts_change() clears port->hw_stopped and unblocks the port for further data transfers. - If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped and blocks the port for further data transfers. This may occur as the last interrupt of a transfer, which means port->hw_stopped remains set and the port remains blocked (). - Any further data transfer attempts will trigger imx_uart_mctrl_check(), which will read current status of UART control signals by calling imx_uart_get_hwmctrl() (*) and compare it with sport->old_status . - If current status differs from sport->old_status for RTS signal, uart_handle_cts_change() is called and possibly unblocks the port by clearing port->hw_stopped . - If current status does not differ from sport->old_status for RTS signal, no action occurs. This may occur in case prior snapshot () was taken before any transfer so the RTS is deasserted, current snapshot (*) was taken after a transfer and therefore RTS is deasserted again, which means current status and sport->old_status are identical. In case () triggered when RTS got asserted, and made port->hw_stopped set, the port->hw_stopped will remain set because no change on RTS line is recognized by this driver and uart_handle_cts_change() is not called from here to unblock the port->hw_stopped. Update sport->old_status in __imx_uart_rtsint() accordingly to make imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR and TIOCM_RI bits in sport->old_status do not suffer from this problem. Fixes: `ceca629e0b` ("[ARM] 2971/1: i.MX uart handle rts irq") Cc: stable <stable@kernel.org> Reviewed-by: Esben Haabendal <esben@geanix.com> Signed-off-by: Marek Vasut <marex@denx.de> Link: https://lore.kernel.org/r/20241002184133.19427-1-marex@denx.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:28:45 +02:00
Longlong Xia	9462f4ca56	tty: n_gsm: Fix use-after-free in gsm_cleanup_mux BUG: KASAN: slab-use-after-free in gsm_cleanup_mux+0x77b/0x7b0 drivers/tty/n_gsm.c:3160 [n_gsm] Read of size 8 at addr ffff88815fe99c00 by task poc/3379 CPU: 0 UID: 0 PID: 3379 Comm: poc Not tainted 6.11.0+ #56 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 Call Trace: <TASK> gsm_cleanup_mux+0x77b/0x7b0 drivers/tty/n_gsm.c:3160 [n_gsm] __pfx_gsm_cleanup_mux+0x10/0x10 drivers/tty/n_gsm.c:3124 [n_gsm] __pfx_sched_clock_cpu+0x10/0x10 kernel/sched/clock.c:389 update_load_avg+0x1c1/0x27b0 kernel/sched/fair.c:4500 __pfx_min_vruntime_cb_rotate+0x10/0x10 kernel/sched/fair.c:846 __rb_insert_augmented+0x492/0xbf0 lib/rbtree.c:161 gsmld_ioctl+0x395/0x1450 drivers/tty/n_gsm.c:3408 [n_gsm] _raw_spin_lock_irqsave+0x92/0xf0 arch/x86/include/asm/atomic.h:107 __pfx_gsmld_ioctl+0x10/0x10 drivers/tty/n_gsm.c:3822 [n_gsm] ktime_get+0x5e/0x140 kernel/time/timekeeping.c:195 ldsem_down_read+0x94/0x4e0 arch/x86/include/asm/atomic64_64.h:79 __pfx_ldsem_down_read+0x10/0x10 drivers/tty/tty_ldsem.c:338 __pfx_do_vfs_ioctl+0x10/0x10 fs/ioctl.c:805 tty_ioctl+0x643/0x1100 drivers/tty/tty_io.c:2818 Allocated by task 65: gsm_data_alloc.constprop.0+0x27/0x190 drivers/tty/n_gsm.c:926 [n_gsm] gsm_send+0x2c/0x580 drivers/tty/n_gsm.c:819 [n_gsm] gsm1_receive+0x547/0xad0 drivers/tty/n_gsm.c:3038 [n_gsm] gsmld_receive_buf+0x176/0x280 drivers/tty/n_gsm.c:3609 [n_gsm] tty_ldisc_receive_buf+0x101/0x1e0 drivers/tty/tty_buffer.c:391 tty_port_default_receive_buf+0x61/0xa0 drivers/tty/tty_port.c:39 flush_to_ldisc+0x1b0/0x750 drivers/tty/tty_buffer.c:445 process_scheduled_works+0x2b0/0x10d0 kernel/workqueue.c:3229 worker_thread+0x3dc/0x950 kernel/workqueue.c:3391 kthread+0x2a3/0x370 kernel/kthread.c:389 ret_from_fork+0x2d/0x70 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:257 Freed by task 3367: kfree+0x126/0x420 mm/slub.c:4580 gsm_cleanup_mux+0x36c/0x7b0 drivers/tty/n_gsm.c:3160 [n_gsm] gsmld_ioctl+0x395/0x1450 drivers/tty/n_gsm.c:3408 [n_gsm] tty_ioctl+0x643/0x1100 drivers/tty/tty_io.c:2818 [Analysis] gsm_msg on the tx_ctrl_list or tx_data_list of gsm_mux can be freed by multi threads through ioctl,which leads to the occurrence of uaf. Protect it by gsm tx lock. Signed-off-by: Longlong Xia <xialonglong@kylinos.cn> Cc: stable <stable@kernel.org> Suggested-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20240926130213.531959-1-xialonglong@kylinos.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:28:17 +02:00
Jeongjun Park	f956052e00	vt: prevent kernel-infoleak in con_font_get() font.data may not initialize all memory spaces depending on the implementation of vc->vc_sw->con_font_get. This may cause info-leak, so to prevent this, it is safest to modify it to initialize the allocated memory space to 0, and it generally does not affect the overall performance of the system. Cc: stable@vger.kernel.org Reported-by: syzbot+955da2d57931604ee691@syzkaller.appspotmail.com Fixes: `05e2600cb0` ("VT: Bump font size limitation to 64x128 pixels") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Link: https://lore.kernel.org/r/20241010174619.59662-1-aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-10-11 08:27:36 +02:00
Gao Xiang	ae54567eaa	erofs: get rid of kaddr in `struct z_erofs_maprecorder` `kaddr` becomes useless after switching to metabuf. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241010235830.1535616-1-hsiangkao@linux.alibaba.com	2024-10-11 13:36:58 +08:00
Gao Xiang	2402082e53	erofs: get rid of z_erofs_try_to_claim_pcluster() Just fold it into the caller for simplicity. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241010090420.405871-1-hsiangkao@linux.alibaba.com	2024-10-11 13:36:58 +08:00
Gao Xiang	416a8b2c02	erofs: ensure regular inodes for file-backed mounts Only regular inodes are allowed for file-backed mounts, not directories (as seen in the original syzbot case) or special inodes. Also ensure that .read_folio() is implemented on the underlying fs for the primary device. Fixes: `fb17675026` ("erofs: add file-backed mount support") Reported-by: syzbot+001306cd9c92ce0df23f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/00000000000011bdde0622498ee3@google.com Tested-by: syzbot+001306cd9c92ce0df23f@syzkaller.appspotmail.com Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240917130803.32418-1-hsiangkao@linux.alibaba.com	2024-10-11 13:36:41 +08:00
Thorsten Blum	f07fd958a4	drm/vmwgfx: Remove unnecessary NULL checks before kvfree() Since kvfree() already checks if its argument is NULL, an additional check before calling kvfree() is unnecessary and can be removed. Remove both and the following Coccinelle/coccicheck warnings reported by ifnullfree.cocci: WARNING: NULL check before some freeing functions is not needed WARNING: NULL check before some freeing functions is not needed Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007115131.1811-3-thorsten.blum@linux.dev	2024-10-10 23:01:45 -04:00
Tyrone Wu	b836cbdf3b	selftests/bpf: Assert link info uprobe_multi count & path_size if unset Add assertions in `bpf_link_info.uprobe_multi` test to verify that `count` and `path_size` fields are correctly populated when the fields are unset. This tests a previous bug where the `path_size` field was not populated when `path` and `path_size` were unset. Signed-off-by: Tyrone Wu <wudevelops@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241011000803.681190-2-wudevelops@gmail.com	2024-10-10 19:11:28 -07:00
Tyrone Wu	ad6b5b6ea9	bpf: Fix unpopulated path_size when uprobe_multi fields unset Previously when retrieving `bpf_link_info.uprobe_multi` with `path` and `path_size` fields unset, the `path_size` field is not populated (remains 0). This behavior was inconsistent with how other input/output string buffer fields work, as the field should be populated in cases when: - both buffer and length are set (currently works as expected) - both buffer and length are unset (not working as expected) This patch now fills the `path_size` field when `path` and `path_size` are unset. Fixes: `e56fdbfb06` ("bpf: Add link_info support for uprobe multi link") Signed-off-by: Tyrone Wu <wudevelops@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241011000803.681190-1-wudevelops@gmail.com	2024-10-10 19:11:28 -07:00
Tony Ambardar	fd526e121c	selftests/bpf: Fix cross-compiling urandom_read Linking of urandom_read and liburandom_read.so prefers LLVM's 'ld.lld' but falls back to using 'ld' if unsupported. However, this fallback discards any existing makefile macro for LD and can break cross-compilation. Fix by changing the fallback to use the target linker $(LD), passed via '-fuse-ld=' using an absolute path rather than a linker "flavour". Fixes: `08c79c9cd6` ("selftests/bpf: Don't force lld on non-x86 architectures") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241009040720.635260-1-tony.ambardar@gmail.com	2024-10-10 19:08:14 -07:00
Stephen Boyd	23dbbe8889	Merge tag 'samsung-clk-fixes-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into clk-fixes Pull a Samsung clk driver fix from Krzysztof Kozlowski: Add missing sentinel in of_device_id table so the code iterating over it will not go over the size of an array. * tag 'samsung-clk-fixes-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: clk: samsung: Fix out-of-bound access of of_match_node()	2024-10-10 14:47:45 -07:00
Tejun Heo	b07996c7ab	sched_ext: Don't hold scx_tasks_lock for too long While enabling and disabling a BPF scheduler, every task is iterated a couple times by walking scx_tasks. Except for one, all iterations keep holding scx_tasks_lock. On multi-socket systems under heavy rq lock contention and high number of threads, this can can lead to RCU and other stalls. The following is triggered on a 2 x AMD EPYC 7642 system (192 logical CPUs) running `stress-ng --workload 150 --workload-threads 10` with >400k idle threads and RCU stall period reduced to 5s: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: rcu: 91-...!: (10 ticks this GP) idle=0754/1/0x4000000000000000 softirq=18204/18206 fqs=17 rcu: 186-...!: (17 ticks this GP) idle=ec54/1/0x4000000000000000 softirq=25863/25866 fqs=17 rcu: (detected by 80, t=10042 jiffies, g=89305, q=33 ncpus=192) Sending NMI from CPU 80 to CPUs 91: NMI backtrace for cpu 91 CPU: 91 UID: 0 PID: 284038 Comm: sched_ext_ops_h Kdump: loaded Not tainted 6.12.0-rc2-work-g6bf5681f7ee2-dirty #471 Hardware name: Supermicro Super Server/H11DSi, BIOS 2.8 12/14/2023 Sched_ext: simple (disabling+all) RIP: 0010:queued_spin_lock_slowpath+0x17b/0x2f0 Code: 02 c0 10 03 00 83 79 08 00 75 08 f3 90 83 79 08 00 74 f8 48 8b 11 48 85 d2 74 09 0f 0d 0a eb 0a 31 d2 eb 06 31 d2 eb 02 f3 90 <8b> 07 66 85 c0 75 f7 39 d8 75 0d be 01 00 00 00 89 d8 f0 0f b1 37 RSP: 0018:ffffc9000fadfcb8 EFLAGS: 00000002 RAX: 0000000001700001 RBX: 0000000001700000 RCX: ffff88bfcaaf10c0 RDX: 0000000000000000 RSI: 0000000000000101 RDI: ffff88bfca8f0080 RBP: 0000000001700000 R08: 0000000000000090 R09: ffffffffffffffff R10: ffff88a74761b268 R11: 0000000000000000 R12: ffff88a6b6765460 R13: ffffc9000fadfd60 R14: ffff88bfca8f0080 R15: ffff88bfcaac0000 FS: 0000000000000000(0000) GS:ffff88bfcaac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5c55f526a0 CR3: 0000000afd474000 CR4: 0000000000350eb0 Call Trace: <NMI> </NMI> <TASK> do_raw_spin_lock+0x9c/0xb0 task_rq_lock+0x50/0x190 scx_task_iter_next_locked+0x157/0x170 scx_ops_disable_workfn+0x2c2/0xbf0 kthread_worker_fn+0x108/0x2a0 kthread+0xeb/0x110 ret_from_fork+0x36/0x40 ret_from_fork_asm+0x1a/0x30 </TASK> Sending NMI from CPU 80 to CPUs 186: NMI backtrace for cpu 186 CPU: 186 UID: 0 PID: 51248 Comm: fish Kdump: loaded Not tainted 6.12.0-rc2-work-g6bf5681f7ee2-dirty #471 scx_task_iter can safely drop locks while iterating. Make scx_task_iter_next() drop scx_tasks_lock every 32 iterations to avoid stalls. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Vernet <void@manifault.com>	2024-10-10 11:41:44 -10:00
Tejun Heo	967da57832	sched_ext: Move scx_tasks_lock handling into scx_task_iter helpers Iterating with scx_task_iter involves scx_tasks_lock and optionally the rq lock of the task being iterated. Both locks can be released during iteration and the iteration can be continued after re-grabbing scx_tasks_lock. Currently, all lock handling is pushed to the caller which is a bit cumbersome and makes it difficult to add lock-aware behaviors. Make the scx_task_iter helpers handle scx_tasks_lock. - scx_task_iter_init/scx_taks_iter_exit() now grabs and releases scx_task_lock, respectively. Renamed to scx_task_iter_start/scx_task_iter_stop() to more clearly indicate that there are non-trivial side-effects. - Add __ prefix to scx_task_iter_rq_unlock() to indicate that the function is internal. - Add scx_task_iter_unlock/relock(). The former drops both rq lock (if held) and scx_tasks_lock and the latter re-locks only scx_tasks_lock. This doesn't cause behavior changes and will be used to implement stall avoidance. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Vernet <void@manifault.com>	2024-10-10 11:41:44 -10:00
Tejun Heo	aebe7ae4cb	sched_ext: bypass mode shouldn't depend on ops.select_cpu() Bypass mode was depending on ops.select_cpu() which can't be trusted as with the rest of the BPF scheduler. Always enable and use scx_select_cpu_dfl() in bypass mode. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Vernet <void@manifault.com>	2024-10-10 11:41:44 -10:00
Tejun Heo	cc3e1caca9	sched_ext: Move scx_buildin_idle_enabled check to scx_bpf_select_cpu_dfl() Move the sanity check from the inner function scx_select_cpu_dfl() to the exported kfunc scx_bpf_select_cpu_dfl(). This doesn't cause behavior differences and will allow using scx_select_cpu_dfl() in bypass mode regardless of scx_builtin_idle_enabled. Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-10 11:41:44 -10:00
Tejun Heo	3fdb9ebcec	sched_ext: Start schedulers with consistent p->scx.slice values The disable path caps p->scx.slice to SCX_SLICE_DFL. As the field is already being ignored at this stage during disable, the only effect this has is that when the next BPF scheduler is loaded, it won't see unreasonable left-over slices. Ultimately, this shouldn't matter but it's better to start in a known state. Drop p->scx.slice capping from the disable path and instead reset it to SCX_SLICE_DFL in the enable path. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Vernet <void@manifault.com>	2024-10-10 11:41:44 -10:00
Tejun Heo	54baa7ac0c	Revert "sched_ext: Use shorter slice while bypassing" This reverts commit `6f34d8d382`. Slice length is ignored while bypassing and tasks are switched on every tick and thus the patch does not make any difference. The perceived difference was from test noise. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Vernet <void@manifault.com>	2024-10-10 11:41:44 -10:00
Alice Ryhl	e72a076c62	kbuild: fix issues with rustc-option Fix a few different compiler errors that cause rustc-option to give wrong results. If KBUILD_RUSTFLAGS or the flags being tested contain any -Z flags, then the error below is generated. The RUSTC_BOOTSTRAP environment variable is added to fix this error. error: the option `Z` is only accepted on the nightly compiler help: consider switching to a nightly toolchain: `rustup default nightly` note: selecting a toolchain with `+toolchain` arguments require a rustup proxy; see <https://rust-lang.github.io/rustup/concepts/index.html> note: for more information about Rust's stability policy, see <https://doc.rust-lang.org/book/appendix-07-nightly-rust.html#unstable-features> error: 1 nightly option were parsed Note that RUSTC_BOOTSTRAP is also defined in the top-level Makefile, but Make-exported variables are unfortunately not inherited. That said, this is changing as of commit 98da874c4303 ("[SV 10593] Export variables to $(shell ...) commands"), which is part of Make 4.4. The probe may also fail with the error message below. To fix it, the /dev/null argument is replaced with a file containing the crate attribute #![no_core]. The #![no_core] attribute ensures that rustc does not look for the standard library. It's not possible to instead supply a standard library (i.e. `core`) to rustc, as we need `rustc-option` before the Rust standard library is compiled. error[E0463]: can't find crate for `std` \| = note: the `aarch64-unknown-none` target may not be installed = help: consider downloading the target with `rustup target add aarch64-unknown-none` = help: consider building the standard library from source with `cargo build -Zbuild-std` The -o and --out-dir parameters are altered to fix this warning: warning: ignoring --out-dir flag due to -o flag The --sysroot flag is provided as we would otherwise require it to be present in KBUILD_RUSTFLAGS. The --emit=obj flag is used to write the resulting object file to /dev/null instead of writing it to a file in $(TMPOUT). I verified that the Kconfig version of rustc-option doesn't have the same issues. Fixes: `c42297438a` ("kbuild: rust: Define probing macros for rustc") Co-developed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20241009-rustc-option-bootstrap-v3-1-5fa0d520efba@google.com [ Reworded as discussed in the list. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2024-10-10 22:34:41 +02:00
Harshit Mogalapalli	4575962aee	pinctrl: sophgo: fix double free in cv1800_pctrl_dt_node_to_map() 'map' is allocated using devm_* which takes care of freeing the allocated data, but in error paths there is a call to pinctrl_utils_free_map() which also does kfree(map) which leads to a double free. Use kcalloc() instead of devm_kcalloc() as freeing is manually handled. Fixes: `a29d8e93e7` ("pinctrl: sophgo: add support for CV1800B SoC") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/20241010111830.3474719-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2024-10-10 21:12:59 +02:00
Nikolay Kuratov	26498b8d54	drm/vmwgfx: Handle surface check failure correctly Currently if condition (!bo and !vmw_kms_srf_ok()) was met we go to err_out with ret == 0. err_out dereferences vfb if ret == 0, but in our case vfb is still NULL. Fix this by assigning sensible error to ret. Found by Linux Verification Center (linuxtesting.org) with SVACE Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru> Cc: stable@vger.kernel.org Fixes: `810b3e1683` ("drm/vmwgfx: Support topology greater than texture size") Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241002122429.1981822-1-kniv@yandex-team.ru	2024-10-10 14:35:24 -04:00
Zack Rusin	512a9721ca	drm/vmwgfx: Cleanup kms setup without 3d Do not validate format equality for the non 3d cases to allow xrgb to argb copies and make sure the dx binding flags are only used on dx compatible surfaces. Fixes basic 2d kms setup on configurations without 3d. There's little practical benefit to it because kms framebuffer coherence is disabled on configurations without 3d but with those changes the code actually makes sense. v2: Remove the now unused format variable Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Fixes: `d6667f0ddf` ("drm/vmwgfx: Fix handling of dumb buffers") Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.9+ Cc: Maaz Mombasawala <maaz.mombasawala@broadcom.com> Cc: Martin Krastev <martin.krastev@broadcom.com> Reviewed-by: Martin Krastev <martin.krastev@broadcom.com> Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240827043905.472825-1-zack.rusin@broadcom.com	2024-10-10 14:31:54 -04:00
Ian Forbes	4809a017a2	drm/vmwgfx: Handle possible ENOMEM in vmw_stdu_connector_atomic_check Handle unlikely ENOMEN condition and other errors in vmw_stdu_connector_atomic_check. Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: `75c3e8a26a` ("drm/vmwgfx: Trigger a modeset when the screen moves") Reviewed-by: Zack Rusin <zack.rusin@broadcom.com> Reviewed-by: Martin Krastev <martin.krastev@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809183756.27283-1-ian.forbes@broadcom.com	2024-10-10 14:27:51 -04:00
Javier Carrasco	6b8e9dbfae	iio: frequency: admv4420: fix missing select REMAP_SPI in Kconfig This driver makes use of regmap_spi, but does not select the required module. Add the missing 'select REGMAP_SPI'. Fixes: `b59c041559` ("iio: frequency: admv4420.c: Add support for ADMV4420") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241007-ad2s1210-select-v2-2-7345d228040f@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-10 19:27:25 +01:00
Javier Carrasco	5c9644a683	iio: frequency: {admv4420,adrf6780}: format Kconfig entries Format the entries of these drivers in the Kconfig, where spaces instead of tabs were used. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241007-ad2s1210-select-v2-1-7345d228040f@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-10 19:27:25 +01:00
Ian Forbes	28a5dfd4f6	drm/vmwgfx: Limit display layout ioctl array size to VMWGFX_NUM_DISPLAY_UNITS Currently the array size is only limited by the largest kmalloc size which is incorrect. This change will also return a more specific error message than ENOMEM to userspace. Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Reviewed-by: Zack Rusin <zack.rusin@broadcom.com> Reviewed-by: Martin Krastev <martin.krastev@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240808200634.1074083-1-ian.forbes@broadcom.com	2024-10-10 14:27:01 -04:00
David Lechner	66cf4455f3	iio: adc: ad4695: Add missing Kconfig select Add select IIO_BUFFER and select IIO_TRIGGERED_BUFFER to the Kconfig for the ad4695 driver. Fixes: `6cc7e4bf2e` ("iio: adc: ad4695: implement triggered buffer") Signed-off-by: David Lechner <dlechner@baylibre.com> Reviewed-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241009-iio-adc-ad4695-fix-kconfig-v1-1-e2a4dfde8d55@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-10 19:10:27 +01:00
Javier Carrasco	4c4834fd86	iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Fixes: `2a86487786` ("iio: adc: ti-ads8688: add trigger and buffer support") Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: Sean Nyekjaer <sean@geanix.com> Link: https://patch.msgid.link/20241003-iio-select-v1-4-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-10 19:00:21 +01:00
Christophe JAILLET	3a29b84cf7	iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency() If hid_sensor_set_report_latency() fails, the error code should be returned instead of a value likely to be interpreted as 'success'. Fixes: `138bc7969c` ("iio: hid-sensor-hub: Implement batch mode") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/c50640665f091a04086e5092cf50f73f2055107a.1727980825.git.christophe.jaillet@wanadoo.fr Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-10 18:50:30 +01:00
Alexei Starovoitov	3f2ac59c0d	Merge branch 'fix-caching-of-btf-for-kfuncs-in-the-verifier' Toke Høiland-Jørgensen says: ==================== Fix caching of BTF for kfuncs in the verifier When playing around with defining kfuncs in some custom modules, we noticed that if a BPF program calls two functions with the same signature in two different modules, the function from the wrong module may sometimes end up being called. Whether this happens depends on the order of the calls in the BPF program, which turns out to be due to the use of sort() inside __find_kfunc_desc_btf() in the verifier code. This series contains a fix for the issue (first patch), and a selftest to trigger it (last patch). The middle commit is a small refactor to expose the module loading helper functions in testing_helpers.c. See the individual patch descriptions for more details. Changes in v2: - Drop patch that refactors module building in selftests (Alexei) - Get rid of expect_val function argument in selftest (Jiri) - Collect ACKs - Link to v1: https://lore.kernel.org/r/20241008-fix-kfunc-btf-caching-for-modules-v1-0-dfefd9aa4318@redhat.com ==================== Link: https://lore.kernel.org/r/20241010-fix-kfunc-btf-caching-for-modules-v2-0-745af6c1af98@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-10 10:44:28 -07:00
Simon Sundberg	f91b256644	selftests/bpf: Add test for kfunc module order Add a test case for kfuncs from multiple external modules, checking that the correct kfuncs are called regardless of which order they're called in. Specifically, check that calling the kfuncs in an order different from the one the modules' BTF are loaded in works. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20241010-fix-kfunc-btf-caching-for-modules-v2-3-745af6c1af98@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-10 10:44:03 -07:00
Simon Sundberg	4192bb294f	selftests/bpf: Provide a generic [un]load_module helper Generalize the previous [un]load_bpf_testmod() helpers (in testing_helpers.c) to the more generic [un]load_module(), which can load an arbitrary kernel module by name. This allows future selftests to more easily load custom kernel modules other than bpf_testmod.ko. Refactor [un]load_bpf_testmod() to wrap this new helper. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20241010-fix-kfunc-btf-caching-for-modules-v2-2-745af6c1af98@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-10 10:44:03 -07:00
Toke Høiland-Jørgensen	6cb86a0fde	bpf: fix kfunc btf caching for modules The verifier contains a cache for looking up module BTF objects when calling kfuncs defined in modules. This cache uses a 'struct bpf_kfunc_btf_tab', which contains a sorted list of BTF objects that were already seen in the current verifier run, and the BTF objects are looked up by the offset stored in the relocated call instruction using bsearch(). The first time a given offset is seen, the module BTF is loaded from the file descriptor passed in by libbpf, and stored into the cache. However, there's a bug in the code storing the new entry: it stores a pointer to the new cache entry, then calls sort() to keep the cache sorted for the next lookup using bsearch(), and then returns the entry that was just stored through the stored pointer. However, because sort() modifies the list of entries in place by value, the stored pointer may no longer point to the right entry, in which case the wrong BTF object will be returned. The end result of this is an intermittent bug where, if a BPF program calls two functions with the same signature in two different modules, the function from the wrong module may sometimes end up being called. Whether this happens depends on the order of the calls in the BPF program (as that affects whether sort() reorders the array of BTF objects), making it especially hard to track down. Simon, credited as reporter below, spent significant effort analysing and creating a reproducer for this issue. The reproducer is added as a selftest in a subsequent patch. The fix is straight forward: simply don't use the stored pointer after calling sort(). Since we already have an on-stack pointer to the BTF object itself at the point where the function return, just use that, and populate it from the cache entry in the branch where the lookup succeeds. Fixes: `2357672c54` ("bpf: Introduce BPF support for kernel module function calls") Reported-by: Simon Sundberg <simon.sundberg@kau.se> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20241010-fix-kfunc-btf-caching-for-modules-v2-1-745af6c1af98@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-10 10:44:03 -07:00
Honglei Wang	c425180d88	sched_ext: use correct function name in pick_task_scx() warning message pick_next_task_scx() was turned into pick_task_scx() since commit `753e2836d1` ("sched_ext: Unify regular and core-sched pick task paths"). Update the outdated message. Signed-off-by: Honglei Wang <jameshongleiwang@126.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-10 06:35:28 -10:00
Christian Heusel	182fff3a2a	ASoC: amd: yc: Add quirk for ASUS Vivobook S15 M3502RA As reported the builtin microphone doesn't work on the ASUS Vivobook model S15 OLED M3502RA. Therefore add a quirk for it to make it work. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219345 Signed-off-by: Christian Heusel <christian@heusel.eu> Link: https://patch.msgid.link/20241010-bugzilla-219345-asus-vivobook-v1-1-3bb24834e2c3@heusel.eu Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-10 17:03:03 +01:00
Julian Vetter	ad6639f143	sound: Make CONFIG_SND depend on INDIRECT_IOMEM instead of UML When building for the UM arch and neither INDIRECT_IOMEM=y, nor HAS_IOMEM=y is selected, it will fall back to the implementations from asm-generic/io.h for IO memcpy. But these fall-back functions just do a memcpy. So, instead of depending on UML, add dependency on 'HAS_IOMEM \|\| INDIRECT_IOMEM'. Reviewed-by: Yann Sionneau <ysionneau@kalrayinc.com> Signed-off-by: Julian Vetter <jvetter@kalrayinc.com> Link: https://patch.msgid.link/20241010124601.700528-1-jvetter@kalrayinc.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-10 16:11:18 +02:00
Michael Mueller	cad4b3d4ab	KVM: s390: Change virtual to physical address access in diag 0x258 handler The parameters for the diag 0x258 are real addresses, not virtual, but KVM was using them as virtual addresses. This only happened to work, since the Linux kernel as a guest used to have a 1:1 mapping for physical vs virtual addresses. Fix KVM so that it correctly uses the addresses as real addresses. Cc: stable@vger.kernel.org Fixes: `8ae04b8f50` ("KVM: s390: Guest's memory access functions get access registers") Suggested-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Signed-off-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20240917151904.74314-3-nrb@linux.ibm.com Acked-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2024-10-10 15:31:55 +02:00
Nico Boehr	e8061f0618	KVM: s390: gaccess: Check if guest address is in memslot Previously, access_guest_page() did not check whether the given guest address is inside of a memslot. This is not a problem, since kvm_write_guest_page/kvm_read_guest_page return -EFAULT in this case. However, -EFAULT is also returned when copy_to/from_user fails. When emulating a guest instruction, the address being outside a memslot usually means that an addressing exception should be injected into the guest. Failure in copy_to/from_user however indicates that something is wrong in userspace and hence should be handled there. To be able to distinguish these two cases, return PGM_ADDRESSING in access_guest_page() when the guest address is outside guest memory. In access_guest_real(), populate vcpu->arch.pgm.code such that kvm_s390_inject_prog_cond() can be used in the caller for injecting into the guest (if applicable). Since this adds a new return value to access_guest_page(), we need to make sure that other callers are not confused by the new positive return value. There are the following users of access_guest_page(): - access_guest_with_key() does the checking itself (in guest_range_to_gpas()), so this case should never happen. Even if, the handling is set up properly. - access_guest_real() just passes the return code to its callers, which are: - read_guest_real() - see below - write_guest_real() - see below There are the following users of read_guest_real(): - ar_translation() in gaccess.c which already returns PGM_* - setup_apcb10(), setup_apcb00(), setup_apcb11() in vsie.c which always return -EFAULT on read_guest_read() nonzero return - no change - shadow_crycb(), handle_stfle() always present this as validity, this could be handled better but doesn't change current behaviour - no change There are the following users of write_guest_real(): - kvm_s390_store_status_unloaded() always returns -EFAULT on write_guest_real() failure. Fixes: `2293897805` ("KVM: s390: add architecture compliant guest access functions") Cc: stable@vger.kernel.org Signed-off-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20240917151904.74314-2-nrb@linux.ibm.com Acked-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2024-10-10 15:31:55 +02:00
Harald Freudenberger	78f636e82b	s390/ap: Fix CCA crypto card behavior within protected execution environment A crypto card comes in 3 flavors: accelerator, CCA co-processor or EP11 co-processor. Within a protected execution environment only the accelerator and EP11 co-processor is supported. However, it is possible to set up a KVM guest with a CCA card and run it as a protected execution guest. There is nothing at the host side which prevents this. Within such a guest, a CCA card is shown as "illicit" and you can't do anything with such a crypto card. Regardless of the unsupported CCA card within a protected execution guest there are a couple of user space applications which unconditional try to run crypto requests to the zcrypt device driver. There was a bug within the AP bus code which allowed such a request to be forwarded to a CCA card where it is finally rejected and the driver reacts with -ENODEV but also triggers an AP bus scan. Together with a retry loop this caused some kind of "hang" of the KVM guest. On startup it caused timeouts and finally led the KVM guest startup fail. Fix that by closing the gap and make sure a CCA card is not usable within a protected execution environment. Another behavior within an protected execution environment with CCA cards was that the se_bind and se_associate AP queue sysfs attributes where shown. The implementation unconditional always added these attributes. Fix that by checking if the card mode is supported within a protected execution environment and only if valid, add the attribute group. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2024-10-10 15:31:55 +02:00
Niklas Schnelle	3cd03ea57e	s390/pci: Handle PCI error codes other than 0x3a The Linux implementation of PCI error recovery for s390 was based on the understanding that firmware error recovery is a two step process with an optional initial error event to indicate the cause of the error if known followed by either error event 0x3A (Success) or 0x3B (Failure) to indicate whether firmware was able to recover. While this has been the case in testing and the error cases seen in the wild it turns out this is not correct. Instead firmware only generates 0x3A for some error and service scenarios and expects the OS to perform recovery for all PCI events codes except for those indicating permanent error (0x3B, 0x40) and those indicating errors on the function measurement block (0x2A, 0x2B, 0x2C). Align Linux behavior with these expectations. Fixes: `4cdf2f4e24` ("s390/pci: implement minimal PCI error recovery") Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2024-10-10 15:31:55 +02:00
Markus Grabner	122fe6e915	ALSA: line6: update contact information The Line6 driver source code files contain an outdated email address of the original author. This patch updates the contact information. Signed-off-by: Markus Grabner <line6@grabner-graz.at> Link: https://patch.msgid.link/20241009194251.15662-1-line6@grabner-graz.at Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-10 14:02:57 +02:00
Karol Kosik	57c14b983f	ALSA: usb-audio: Fix NULL pointer deref in snd_usb_power_domain_set() Commit adding support for multiple control interfaces expanded struct snd_usb_power_domain with pointer to control interface for proper control message routing but missed one initialization point of this structure, which has left new field with NULL value. Standard mandates that each device has at least one control interface and code responsible for power domain does not check for NULL values when querying for control interface. This caused some USB devices to crash the kernel. Fixes: `6aa8700150` ("ALSA: usb-audio: Support multiple control interfaces") Signed-off-by: Karol Kosik <k.kosik@outlook.com> Link: https://patch.msgid.link/AS8P190MB1285B563C6B5394DB274813FEC782@AS8P190MB1285.EURP190.PROD.OUTLOOK.COM Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-10 13:58:44 +02:00
Alain Volmat	b5a468199b	spi: stm32: fix missing device mode capability in stm32mp25 The STM32MP25 SOC has capability to behave in device mode however missing .has_device_mode within its stm32mp25_spi_cfg structure leads to not being able to enable the device mode. Signed-off-by: Alain Volmat <alain.volmat@foss.st.com> Link: https://patch.msgid.link/20241009-spi-mp25-device-fix-v1-1-8e5ca7db7838@foss.st.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-10 12:15:18 +01:00
Amadeusz Sławiński	9eb2142a2a	ASoC: topology: Bump minimal topology ABI version When v4 topology support was removed, minimal topology ABI version should have been bumped. Fixes: `fe4a074542` ("ASoC: Drop soc-topology ABI v4 support") Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Link: https://patch.msgid.link/20241009081230.304918-1-amadeuszx.slawinski@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-10 12:15:03 +01:00
Zhu Jun	251ce34a44	ASoC: codecs: Fix error handling in aw_dev_get_dsp_status function Added proper error handling for register value check that return -EPERM when register value does not meet expected condition Signed-off-by: Zhu Jun <zhujun2@cmss.chinamobile.com> Link: https://patch.msgid.link/20241009073938.7472-1-zhujun2@cmss.chinamobile.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-10 12:15:01 +01:00
Alexey Klimov	d0e806b0cc	ASoC: qcom: sdm845: add missing soundwire runtime stream alloc During the migration of Soundwire runtime stream allocation from the Qualcomm Soundwire controller to SoC's soundcard drivers the sdm845 soundcard was forgotten. At this point any playback attempt or audio daemon startup, for instance on sdm845-db845c (Qualcomm RB3 board), will result in stream pointer NULL dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101ecf000 [0000000000000020] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP Modules linked in: ... CPU: 5 UID: 0 PID: 1198 Comm: aplay Not tainted 6.12.0-rc2-qcomlt-arm64-00059-g9d78f315a362-dirty #18 Hardware name: Thundercomm Dragonboard 845c (DT) pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : sdw_stream_add_slave+0x44/0x380 [soundwire_bus] lr : sdw_stream_add_slave+0x44/0x380 [soundwire_bus] sp : ffff80008a2035c0 x29: ffff80008a2035c0 x28: ffff80008a203978 x27: 0000000000000000 x26: 00000000000000c0 x25: 0000000000000000 x24: ffff1676025f4800 x23: ffff167600ff1cb8 x22: ffff167600ff1c98 x21: 0000000000000003 x20: ffff167607316000 x19: ffff167604e64e80 x18: 0000000000000000 x17: 0000000000000000 x16: ffffcec265074160 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff167600ff1cec x5 : ffffcec22cfa2010 x4 : 0000000000000000 x3 : 0000000000000003 x2 : ffff167613f836c0 x1 : 0000000000000000 x0 : ffff16761feb60b8 Call trace: sdw_stream_add_slave+0x44/0x380 [soundwire_bus] wsa881x_hw_params+0x68/0x80 [snd_soc_wsa881x] snd_soc_dai_hw_params+0x3c/0xa4 __soc_pcm_hw_params+0x230/0x660 dpcm_be_dai_hw_params+0x1d0/0x3f8 dpcm_fe_dai_hw_params+0x98/0x268 snd_pcm_hw_params+0x124/0x460 snd_pcm_common_ioctl+0x998/0x16e8 snd_pcm_ioctl+0x34/0x58 __arm64_sys_ioctl+0xac/0xf8 invoke_syscall+0x48/0x104 el0_svc_common.constprop.0+0x40/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x34/0xe0 el0t_64_sync_handler+0x120/0x12c el0t_64_sync+0x190/0x194 Code: aa0403fb f9418400 9100e000 9400102f (f8420f22) ---[ end trace 0000000000000000 ]--- 0000000000006108 <sdw_stream_add_slave>: 6108: d503233f paciasp 610c: a9b97bfd stp x29, x30, [sp, #-112]! 6110: 910003fd mov x29, sp 6114: a90153f3 stp x19, x20, [sp, #16] 6118: a9025bf5 stp x21, x22, [sp, #32] 611c: aa0103f6 mov x22, x1 6120: 2a0303f5 mov w21, w3 6124: a90363f7 stp x23, x24, [sp, #48] 6128: aa0003f8 mov x24, x0 612c: aa0203f7 mov x23, x2 6130: a9046bf9 stp x25, x26, [sp, #64] 6134: aa0403f9 mov x25, x4 <-- x4 copied to x25 6138: a90573fb stp x27, x28, [sp, #80] 613c: aa0403fb mov x27, x4 6140: f9418400 ldr x0, [x0, #776] 6144: 9100e000 add x0, x0, #0x38 6148: 94000000 bl 0 <mutex_lock> 614c: f8420f22 ldr x2, [x25, #32]! <-- offset 0x44 ^^^ This is 0x6108 + offset 0x44 from the beginning of sdw_stream_add_slave() where data abort happens. wsa881x_hw_params() is called with stream = NULL and passes it further in register x4 (5th argument) to sdw_stream_add_slave() without any checks. Value from x4 is copied to x25 and finally it aborts on trying to load a value from address in x25 plus offset 32 (in dec) which corresponds to master_list member in struct sdw_stream_runtime: struct sdw_stream_runtime { const char * name; /* 0 8 / struct sdw_stream_params params; / 8 12 / enum sdw_stream_state state; / 20 4 / enum sdw_stream_type type; / 24 4 / / XXX 4 bytes hole, try to pack / here-> struct list_head master_list; / 32 16 / int m_rt_count; / 48 4 / / size: 56, cachelines: 1, members: 6 / / sum members: 48, holes: 1, sum holes: 4 / / padding: 4 / / last cacheline: 56 bytes */ Fix this by adding required calls to qcom_snd_sdw_startup() and sdw_release_stream() to startup and shutdown routines which restores the previous correct behaviour when ->set_stream() method is called to set a valid stream runtime pointer on playback startup. Reproduced and then fix was tested on db845c RB3 board. Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: stable@vger.kernel.org Fixes: `15c7fab0e0` ("ASoC: qcom: Move Soundwire runtime stream alloc to soundcards") Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Tested-by: Steev Klimaszewski <steev@kali.org> # Lenovo Yoga C630 Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://patch.msgid.link/20241009213922.999355-1-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-10 12:15:00 +01:00
Binbin Zhou	a6134e7b4d	ASoC: loongson: Fix component check failed on FDT systems Add missing snd_soc_dai_link.platforms assignment to avoid soc_dai_link_sanity_check() failure. Fixes: `d24028606e` ("ASoC: loongson: Add Loongson ASoC Sound Card Support") Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Link: https://patch.msgid.link/6645888f2f9e8a1d8d799109f867d0f97fd78c58.1728459624.git.zhoubinbin@loongson.cn Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-10 12:14:59 +01:00
Aleksa Sarai	f92f0a1b05	openat2: explicitly return -E2BIG for (usize > PAGE_SIZE) While we do currently return -EFAULT in this case, it seems prudent to follow the behaviour of other syscalls like clone3. It seems quite unlikely that anyone depends on this error code being EFAULT, but we can always revert this if it turns out to be an issue. Cc: stable@vger.kernel.org # v5.6+ Fixes: `fddb5d430a` ("open: introduce openat2(2) syscall") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/r/20241010-extensible-structs-check_fields-v3-3-d2833dfe6edd@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-10 12:09:03 +02:00
Herbert Xu	e845d2399a	crypto: marvell/cesa - Disable hash algorithms Disable cesa hash algorithms by lowering the priority because they appear to be broken when invoked in parallel. This allows them to still be tested for debugging purposes. Reported-by: Klaus Kudielka <klaus.kudielka@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2024-10-10 17:03:35 +08:00
Herbert Xu	6318fbe26e	crypto: testmgr - Hide ENOENT errors better The previous patch removed the ENOENT warning at the point of allocation, but the overall self-test warning is still there. Fix all of them by returning zero as the test result. This is safe because if the algorithm has gone away, then it cannot be marked as tested. Fixes: `4eded6d14f` ("crypto: testmgr - Hide ENOENT errors") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2024-10-10 17:03:35 +08:00
Herbert Xu	b81e286ba1	crypto: api - Fix liveliness check in crypto_alg_tested As algorithm testing is carried out without holding the main crypto lock, it is always possible for the algorithm to go away during the test. So before crypto_alg_tested updates the status of the tested alg, it checks whether it's still on the list of all algorithms. This is inaccurate because it may be off the main list but still on the list of algorithms to be removed. Updating the algorithm status is safe per se as the larval still holds a reference to it. However, killing spawns of other algorithms that are of lower priority is clearly a deficiency as it adds unnecessary churn. Fix the test by checking whether the algorithm is dead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2024-10-10 17:03:35 +08:00
Johannes Wikner	c62fa117c3	x86/bugs: Do not use UNTRAIN_RET with IBPB on entry Since X86_FEATURE_ENTRY_IBPB will invalidate all harmful predictions with IBPB, no software-based untraining of returns is needed anymore. Currently, this change affects retbleed and SRSO mitigations so if either of the mitigations is doing IBPB and the other one does the software sequence, the latter is not needed anymore. [ bp: Massage commit message. ] Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Johannes Wikner <kwikner@ethz.ch> Cc: <stable@kernel.org>	2024-10-10 10:38:21 +02:00
Johannes Wikner	0fad287864	x86/bugs: Skip RSB fill at VMEXIT entry_ibpb() is designed to follow Intel's IBPB specification regardless of CPU. This includes invalidating RSB entries. Hence, if IBPB on VMEXIT has been selected, entry_ibpb() as part of the RET untraining in the VMEXIT path will take care of all BTB and RSB clearing so there's no need to explicitly fill the RSB anymore. [ bp: Massage commit message. ] Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Johannes Wikner <kwikner@ethz.ch> Cc: <stable@kernel.org>	2024-10-10 10:35:53 +02:00
Johannes Wikner	50e4b3b940	x86/entry: Have entry_ibpb() invalidate return predictions entry_ibpb() should invalidate all indirect predictions, including return target predictions. Not all IBPB implementations do this, in which case the fallback is RSB filling. Prevent SRSO-style hijacks of return predictions following IBPB, as the return target predictor can be corrupted before the IBPB completes. [ bp: Massage. ] Signed-off-by: Johannes Wikner <kwikner@ethz.ch> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org>	2024-10-10 10:35:27 +02:00
Johannes Wikner	3ea87dfa31	x86/cpufeatures: Add a IBPB_NO_RET BUG flag Set this flag if the CPU has an IBPB implementation that does not invalidate return target predictions. Zen generations < 4 do not flush the RSB when executing an IBPB and this bug flag denotes that. [ bp: Massage. ] Signed-off-by: Johannes Wikner <kwikner@ethz.ch> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org>	2024-10-10 10:34:29 +02:00
Jim Mattson	ff898623af	x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET AMD's initial implementation of IBPB did not clear the return address predictor. Beginning with Zen4, AMD's IBPB does clear the return address predictor. This behavior is enumerated by CPUID.80000008H:EBX.IBPB_RET[30]. Define X86_FEATURE_AMD_IBPB_RET for use in KVM_GET_SUPPORTED_CPUID, when determining cross-vendor capabilities. Suggested-by: Venkatesh Srinivas <venkateshs@chromium.org> Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org>	2024-10-10 10:34:14 +02:00
Namjae Jeon	7aa8804c0b	ksmbd: fix user-after-free from session log off There is racy issue between smb2 session log off and smb2 session setup. It will cause user-after-free from session log off. This add session_lock when setting SMB2_SESSION_EXPIRED and referece count to session struct not to free session while it is being used. Cc: stable@vger.kernel.org # v5.15+ Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-25282 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-10-09 21:23:17 -05:00
Tony Ambardar	60f802e2d6	selftests/bpf: Fix error compiling cgroup_ancestor.c with musl libc Existing code calls connect() with a 'struct sockaddr_in6 ' argument where a 'struct sockaddr ' argument is declared, yielding compile errors when building for mips64el/musl-libc: In file included from cgroup_ancestor.c:3: cgroup_ancestor.c: In function 'send_datagram': cgroup_ancestor.c:38:38: error: passing argument 2 of 'connect' from incompatible pointer type [-Werror=incompatible-pointer-types] 38 \| if (!ASSERT_OK(connect(sock, &addr, sizeof(addr)), "connect")) { \| ^~~~~ \| \| \| struct sockaddr_in6 * ./test_progs.h:343:29: note: in definition of macro 'ASSERT_OK' 343 \| long long ___res = (res); \ \| ^~~ In file included from .../netinet/in.h:10, from .../arpa/inet.h:9, from ./test_progs.h:17: .../sys/socket.h:386:19: note: expected 'const struct sockaddr ' but argument is of type 'struct sockaddr_in6 ' 386 \| int connect (int, const struct sockaddr *, socklen_t); \| ^~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors This only compiles because of a glibc extension allowing declaration of the argument as a "transparent union" which includes both types above. Explicitly cast the argument to allow compiling for both musl and glibc. Cc: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Fixes: `f957c230e1` ("selftests/bpf: convert test_skb_cgroup_id_user to test_progs") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Reviewed-by: Alexis Lothoré <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241008231232.634047-1-tony.ambardar@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-09 18:29:07 -07:00
Pu Lehui	30a59cc797	riscv, bpf: Fix possible infinite tailcall when CONFIG_CFI_CLANG is enabled When CONFIG_CFI_CLANG is enabled, the number of prologue instructions skipped by tailcall needs to include the kcfi instruction, otherwise the TCC will be initialized every tailcall is called, which may result in infinite tailcalls. Fixes: `e63985ecd2` ("bpf, riscv64/cfi: Support kCFI + BPF on riscv64") Signed-off-by: Pu Lehui <pulehui@huawei.com> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20241008124544.171161-1-pulehui@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-09 18:23:06 -07:00
Tyrone Wu	4538a38f65	selftests/bpf: fix perf_event link info name_len assertion Fix `name_len` field assertions in `bpf_link_info.perf_event` for kprobe/uprobe/tracepoint to validate correct name size instead of 0. Fixes: `23cf7aa539` ("selftests/bpf: Add selftest for fill_link_info") Signed-off-by: Tyrone Wu <wudevelops@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20241008164312.46269-2-wudevelops@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-09 18:17:16 -07:00
Tyrone Wu	4deecdd29c	bpf: fix unpopulated name_len field in perf_event link info Previously when retrieving `bpf_link_info.perf_event` for kprobe/uprobe/tracepoint, the `name_len` field was not populated by the kernel, leaving it to reflect the value initially set by the user. This behavior was inconsistent with how other input/output string buffer fields function (e.g. `raw_tracepoint.tp_name_len`). This patch fills `name_len` with the actual size of the string name. Fixes: `1b715e1b0e` ("bpf: Support ->fill_link_info for perf_event") Signed-off-by: Tyrone Wu <wudevelops@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20241008164312.46269-1-wudevelops@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-09 18:17:16 -07:00
Rik van Riel	434247637c	bpf: use kvzmalloc to allocate BPF verifier environment The kzmalloc call in bpf_check can fail when memory is very fragmented, which in turn can lead to an OOM kill. Use kvzmalloc to fall back to vmalloc when memory is too fragmented to allocate an order 3 sized bpf verifier environment. Admittedly this is not a very common case, and only happens on systems where memory has already been squeezed close to the limit, but this does not seem like much of a hot path, and it's a simple enough fix. Signed-off-by: Rik van Riel <riel@surriel.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Link: https://lore.kernel.org/r/20241008170735.16766766@imladris.surriel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-09 18:13:05 -07:00
Alexei Starovoitov	830b8e4942	Merge branch 'check-the-remaining-info_cnt-before-repeating-btf-fields' Hou Tao says: ==================== Check the remaining info_cnt before repeating btf fields From: Hou Tao <houtao1@huawei.com> Hi, The patch set adds the missed check again info_cnt when flattening the array of nested struct. The problem was spotted when developing dynptr key support for hash map. Patch #1 adds the missed check and patch #2 adds three success test cases and one failure test case for the problem. Comments are always welcome. Change Log: v2: * patch #1: check info_cnt in btf_repeat_fields() * patch #2: use a hard-coded number instead of BTF_FIELDS_MAX, because BTF_FIELDS_MAX is not always available in vmlinux.h (e.g., for llvm 17/18) v1: https://lore.kernel.org/bpf/20240911110557.2759801-1-houtao@huaweicloud.com/T/#t ==================== Link: https://lore.kernel.org/r/20241008071114.3718177-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-09 16:32:47 -07:00
Hou Tao	c456f08040	selftests/bpf: Add more test case for field flattening Add three success test cases to test the flattening of array of nested struct. For these three tests, the number of special fields in map is BTF_FIELDS_MAX, but the array is defined in structs with different nested level. Add one failure test case for the flattening as well. In the test case, the number of special fields in map is BTF_FIELDS_MAX + 1. It will make btf_parse_fields() in map_create() return -E2BIG, the creation of map will succeed, but the load of program will fail because the btf_record is invalid for the map. Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241008071114.3718177-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-09 16:32:47 -07:00
Hou Tao	797d73ee23	bpf: Check the remaining info_cnt before repeating btf fields When trying to repeat the btf fields for array of nested struct, it doesn't check the remaining info_cnt. The following splat will be reported when the value of ret * nelems is greater than BTF_FIELDS_MAX: ------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in ../kernel/bpf/btf.c:3951:49 index 11 is out of range for type 'btf_field_info [11]' CPU: 6 UID: 0 PID: 411 Comm: test_progs ...... 6.11.0-rc4+ #1 Tainted: [O]=OOT_MODULE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ... Call Trace: <TASK> dump_stack_lvl+0x57/0x70 dump_stack+0x10/0x20 ubsan_epilogue+0x9/0x40 __ubsan_handle_out_of_bounds+0x6f/0x80 ? kallsyms_lookup_name+0x48/0xb0 btf_parse_fields+0x992/0xce0 map_create+0x591/0x770 __sys_bpf+0x229/0x2410 __x64_sys_bpf+0x1f/0x30 x64_sys_call+0x199/0x9f0 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fea56f2cc5d ...... </TASK> ---[ end trace ]--- Fix it by checking the remaining info_cnt in btf_repeat_fields() before repeating the btf fields. Fixes: `64e8ee8148` ("bpf: look into the types of the fields of a struct type recursively.") Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241008071114.3718177-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-09 16:32:46 -07:00
Yao Zi	ad1081a0da	clk: rockchip: fix finding of maximum clock ID If an ID of a branch's child is greater than current maximum, we should set new maximum to the child's ID, instead of its parent's. Fixes: `2dc66a5ab2` ("clk: rockchip: rk3588: fix CLK_NR_CLKS usage") Signed-off-by: Yao Zi <ziyao@disroot.org> Link: https://lore.kernel.org/r/20240912133204.29089-2-ziyao@disroot.org Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2024-10-09 16:06:51 -07:00
Masahiro Yamada	b55da84759	kbuild: refactor cc-option-yn, cc-disable-warning, rust-option-yn macros cc-option-yn and cc-disable-warning duplicate the compile command seen a few lines above. These can be defined based on cc-option. I also refactored rustc-option-yn in the same way, although there are currently no users of it. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20241009102821.2675718-1-masahiroy@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2024-10-10 00:50:58 +02:00
Greg Joyce	0ce96a6708	nvme: disable CC.CRIME (NVME_CC_CRIME) Disable NVME_CC_CRIME so that CSTS.RDY indicates that the media is ready and able to handle commands without returning NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY. Signed-off-by: Greg Joyce <gjoyce@linux.ibm.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Tested-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-09 14:45:19 -07:00
Kent Overstreet	3b80552e70	bcachefs: __wait_for_freeing_inode: Switch to wait_bit_queue_entry inode_bit_waitqueue() is changing - this update clears the way for sched changes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:58:18 -04:00
Kent Overstreet	a7e2dd58fb	bcachefs: Check if stuck in journal_res_get() Like how we already do when the allocator seems to be stuck, check if we're waiting too long for a journal reservation and print some debug info. This is specifically to track down https://github.com/koverstreet/bcachefs/issues/656 which is showing up in userspace where we don't have sysfs/debugfs to get the journal debug info. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:57:59 -04:00
Kent Overstreet	04b670de28	closures: Add closure_wait_event_timeout() Add a closure version of wait_event_timeout(), with the same semantics. The closure version is useful because unlike wait_event(), it allows blocking code to run in the conditional expression. Cc: Coly Li <colyli@suse.de> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:57:57 -04:00
Alan Huang	9205d24cf7	bcachefs: Fix state lock involved deadlock We increased write ref, if the fs went to RO, that would lead to a deadlock, it actually happens: 00171 ========= TEST generic/279 00171 00172 bcachefs (vdb): starting version 1.12: rebalance_work_acct_fix opts=nocow 00172 bcachefs (vdb): recovering from clean shutdown, journal seq 35 00172 bcachefs (vdb): accounting_read... done 00172 bcachefs (vdb): alloc_read... done 00172 bcachefs (vdb): stripes_read... done 00172 bcachefs (vdb): snapshots_read... done 00172 bcachefs (vdb): journal_replay... done 00172 bcachefs (vdb): resume_logged_ops... done 00172 bcachefs (vdb): going read-write 00172 bcachefs (vdb): done starting filesystem 00172 FSTYP -- bcachefs 00172 PLATFORM -- Linux/aarch64 farm3-kvm 6.11.0-rc1-ktest-g3e290a0b8e34 #7030 SMP Tue Oct 8 14:15:12 UTC 2024 00172 MKFS_OPTIONS -- --nocow /dev/vdc 00172 MOUNT_OPTIONS -- /dev/vdc /mnt/scratch 00172 00172 bcachefs (vdc): starting version 1.12: rebalance_work_acct_fix opts=nocow 00172 bcachefs (vdc): initializing new filesystem 00172 bcachefs (vdc): going read-write 00172 bcachefs (vdc): marking superblocks 00172 bcachefs (vdc): initializing freespace 00172 bcachefs (vdc): done initializing freespace 00172 bcachefs (vdc): reading snapshots table 00172 bcachefs (vdc): reading snapshots done 00172 bcachefs (vdc): done starting filesystem 00173 bcachefs (vdc): shutting down 00173 bcachefs (vdc): going read-only 00173 bcachefs (vdc): finished waiting for writes to stop 00173 bcachefs (vdc): flushing journal and stopping allocators, journal seq 4 00173 bcachefs (vdc): flushing journal and stopping allocators complete, journal seq 6 00173 bcachefs (vdc): shutdown complete, journal seq 7 00173 bcachefs (vdc): marking filesystem clean 00173 bcachefs (vdc): shutdown complete 00173 bcachefs (vdb): shutting down 00173 bcachefs (vdb): going read-only 00361 INFO: task umount:6180 blocked for more than 122 seconds. 00361 Not tainted 6.11.0-rc1-ktest-g3e290a0b8e34 #7030 00361 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 00361 task:umount state:D stack:0 pid:6180 tgid:6180 ppid:6176 flags:0x00000004 00361 Call trace: 00362 __switch_to (arch/arm64/kernel/process.c:556) 00362 __schedule (kernel/sched/core.c:5191 kernel/sched/core.c:6529) 00363 schedule (include/asm-generic/bitops/generic-non-atomic.h:128 include/linux/thread_info.h:192 include/linux/sched.h:2084 kernel/sched/core.c:6608 kernel/sched/core.c:6621) 00365 bch2_fs_read_only (fs/bcachefs/super.c:346 (discriminator 41)) 00367 __bch2_fs_stop (fs/bcachefs/super.c:620) 00368 bch2_put_super (fs/bcachefs/fs.c:1942) 00369 generic_shutdown_super (include/linux/list.h:373 (discriminator 2) fs/super.c:650 (discriminator 2)) 00371 bch2_kill_sb (fs/bcachefs/fs.c:2170) 00372 deactivate_locked_super (fs/super.c:434 fs/super.c:475) 00373 deactivate_super (fs/super.c:508) 00374 cleanup_mnt (fs/namespace.c:250 fs/namespace.c:1374) 00376 __cleanup_mnt (fs/namespace.c:1381) 00376 task_work_run (include/linux/sched.h:2024 kernel/task_work.c:224) 00377 do_notify_resume (include/linux/resume_user_mode.h:50 arch/arm64/kernel/entry-common.c:151) 00377 el0_svc (arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/entry-common.c:171 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713) 00377 el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731) 00378 el0t_64_sync (arch/arm64/kernel/entry.S:598) 00378 INFO: task tee:6182 blocked for more than 122 seconds. 00378 Not tainted 6.11.0-rc1-ktest-g3e290a0b8e34 #7030 00378 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 00378 task:tee state:D stack:0 pid:6182 tgid:6182 ppid:533 flags:0x00000004 00378 Call trace: 00378 __switch_to (arch/arm64/kernel/process.c:556) 00378 __schedule (kernel/sched/core.c:5191 kernel/sched/core.c:6529) 00378 schedule (include/asm-generic/bitops/generic-non-atomic.h:128 include/linux/thread_info.h:192 include/linux/sched.h:2084 kernel/sched/core.c:6608 kernel/sched/core.c:6621) 00378 schedule_preempt_disabled (kernel/sched/core.c:6680) 00379 rwsem_down_read_slowpath (kernel/locking/rwsem.c:1073 (discriminator 1)) 00379 down_read (kernel/locking/rwsem.c:1529) 00381 bch2_gc_gens (fs/bcachefs/sb-members.h:77 fs/bcachefs/sb-members.h:88 fs/bcachefs/sb-members.h:128 fs/bcachefs/btree_gc.c:1240) 00383 bch2_fs_store_inner (fs/bcachefs/sysfs.c:473) 00385 bch2_fs_internal_store (fs/bcachefs/sysfs.c:417 fs/bcachefs/sysfs.c:580 fs/bcachefs/sysfs.c:576) 00386 sysfs_kf_write (fs/sysfs/file.c:137) 00387 kernfs_fop_write_iter (fs/kernfs/file.c:334) 00389 vfs_write (fs/read_write.c:497 fs/read_write.c:590) 00390 ksys_write (fs/read_write.c:643) 00391 __arm64_sys_write (fs/read_write.c:652) 00391 invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54) 00392 do_el0_svc (include/linux/thread_info.h:127 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2)) 00392 el0_svc (arch/arm64/include/asm/irqflags.h:55 arch/arm64/include/asm/irqflags.h:76 arch/arm64/kernel/entry-common.c:165 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713) 00392 el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731) 00392 el0t_64_sync (arch/arm64/kernel/entry.S:598) Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:42:54 -04:00
Mohammed Anees	a30f32222d	bcachefs: Fix NULL pointer dereference in bch2_opt_to_text This patch adds a bounds check to the bch2_opt_to_text function to prevent NULL pointer dereferences when accessing the opt->choices array. This ensures that the index used is within valid bounds before dereferencing. The new version enhances the readability. Reported-and-tested-by: syzbot+37186860aa7812b331d5@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=37186860aa7812b331d5 Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:42:53 -04:00
Alan Huang	a154154148	bcachefs: Release transaction before wake up We will get this if we wake up first: Kernel panic - not syncing: btree_node_write_done leaked btree_trans since there are still transactions waiting for cycle detectors after BTREE_NODE_write_in_flight is cleared. Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:42:53 -04:00
Piotr Zalewski	0151d10a48	bcachefs: add check for btree id against max in try read node Add check for read node's btree_id against BTREE_ID_NR_MAX in try_read_btree_node to prevent triggering EBUG_ON condition in bch2_btree_id_root[1]. [1] https://syzkaller.appspot.com/bug?extid=cf7b2215b5d70600ec00 Reported-by: syzbot+cf7b2215b5d70600ec00@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=cf7b2215b5d70600ec00 Fixes: `4409b8081d` ("bcachefs: Repair pass for scanning for btree nodes") Signed-off-by: Piotr Zalewski <pZ010001011111@proton.me> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:42:53 -04:00
Kent Overstreet	19773ec997	bcachefs: Disk accounting device validation fixes - Fix failure to validate that accounting replicas entries point to valid devices: this wasn't a real bug since they'd be cleaned up by GC, but is still something we should know about - Fix failure to validate that dev_data_type entries point to valid devices: this does fix a real bug, since bch2_accounting_read() would then try to copy the counters to that device and pop an inconsistent error when the device didn't exist - Remove accounting entries that are zeroed or invalid: if we're not validating them we need to get rid of them: they might not exist in the superblock, so we need the to trigger the superblock mark path when they're readded. This fixes the replication.ktest rereplicate test, which was failing with "superblock not marked for replicas..." Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:42:53 -04:00
Kent Overstreet	9d86178782	bcachefs: bch2_inode_or_descendents_is_open() fsck can now correctly check if inodes in interior snapshot nodes are open/in use. - Tweak the vfs inode rhashtable so that the subvolume ID isn't hashed, meaning inums in different subvolumes will hash to the same slot. Note that this is a hack, and will cause problems if anyone ever has the same file in many different snapshots open all at the same time. - Then check if any of those subvolumes is a descendent of the snapshot ID being checked Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:42:53 -04:00
Kent Overstreet	84878e8245	bcachefs: Kill bch2_propagate_key_to_snapshot_leaves() Dead code now. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:42:53 -04:00
Kent Overstreet	9b23fdbd5d	bcachefs: bcachefs_metadata_version_inode_has_child_snapshots There's an inherent race in taking a snapshot while an unlinked file is open, and then reattaching it in the child snapshot. In the interior snapshot node the file will appear unlinked, as though it should be deleted - it's not referenced by anything in that snapshot - but we can't delete it, because the file data is referenced by the child snapshot. This was being handled incorrectly with propagate_key_to_snapshot_leaves() - but that doesn't resolve the fundamental inconsistency of "this file looks like it should be deleted according to normal rules, but - ". To fix this, we need to fix the rule for when an inode is deleted. The previous rule, ignoring snapshots (there was no well-defined rule for with snapshots) was: Unlinked, non open files are deleted, either at recovery time or during online fsck The new rule is: Unlinked, non open files, that do not exist in child snapshots, are deleted. To make this work transactionally, we add a new inode flag, BCH_INODE_has_child_snapshot; it overrides BCH_INODE_unlinked when considering whether to delete an inode, or put it on the deleted list. For transactional consistency, clearing it handled by the inode trigger: when deleting an inode we check if there are parent inodes which can now have the BCH_INODE_has_child_snapshot flag cleared. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-09 16:42:51 -04:00
Björn Töpel	7941b83bce	selftests: sched_ext: Add sched_ext as proper selftest target The sched_ext selftests is missing proper cross-compilation support, a proper target entry, and out-of-tree build support. When building the kselftest suite, e.g.: make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- \ TARGETS=sched_ext SKIP_TARGETS="" O=/output/foo \ -C tools/testing/selftests install or: make ARCH=arm64 LLVM=1 TARGETS=sched_ext SKIP_TARGETS="" \ O=/output/foo -C tools/testing/selftests install The expectation is that the sched_ext is included, cross-built, the correct toolchain is picked up, and placed into /output/foo. In contrast to the BPF selftests, the sched_ext suite does not use bpftool at test run-time, so it is sufficient to build bpftool for the build host only. Add ARCH, CROSS_COMPILE, OUTPUT, and TARGETS support to the sched_ext selftest. Also, remove some variables that were unused by the Makefile. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: David Vernet <void@manifault.com> Tested-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-09 06:42:51 -10:00
Mark Rutland	13f8f1e05f	arm64: probes: Fix uprobes for big-endian kernels The arm64 uprobes code is broken for big-endian kernels as it doesn't convert the in-memory instruction encoding (which is always little-endian) into the kernel's native endianness before analyzing and simulating instructions. This may result in a few distinct problems: * The kernel may may erroneously reject probing an instruction which can safely be probed. * The kernel may erroneously erroneously permit stepping an instruction out-of-line when that instruction cannot be stepped out-of-line safely. * The kernel may erroneously simulate instruction incorrectly dur to interpretting the byte-swapped encoding. The endianness mismatch isn't caught by the compiler or sparse because: * The arch_uprobe::{insn,ixol} fields are encoded as arrays of u8, so the compiler and sparse have no idea these contain a little-endian 32-bit value. The core uprobes code populates these with a memcpy() which similarly does not handle endianness. * While the uprobe_opcode_t type is an alias for __le32, both arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() cast from u8[] to the similarly-named probe_opcode_t, which is an alias for u32. Hence there is no endianness conversion warning. Fix this by changing the arch_uprobe::{insn,ixol} fields to __le32 and adding the appropriate __le32_to_cpu() conversions prior to consuming the instruction encoding. The core uprobes copies these fields as opaque ranges of bytes, and so is unaffected by this change. At the same time, remove MAX_UINSN_BYTES and consistently use AARCH64_INSN_SIZE for clarity. Tested with the following: \| #include <stdio.h> \| #include <stdbool.h> \| \| #define noinline __attribute__((noinline)) \| \| static noinline void adrp_self(void) \| { \| void addr; \| \| asm volatile( \| " adrp %x0, adrp_self\n" \| " add %x0, %x0, :lo12:adrp_self\n" \| : "=r" (addr)); \| } \| \| \| int main(int argc, char argv) \| { \| void ptr = adrp_self(); \| bool equal = (ptr == adrp_self); \| \| printf("adrp_self => %p\n" \| "adrp_self() => %p\n" \| "%s\n", \| adrp_self, ptr, equal ? "EQUAL" : "NOT EQUAL"); \| \| return 0; \| } .... where the adrp_self() function was compiled to: \| 00000000004007e0 <adrp_self>: \| 4007e0: 90000000 adrp x0, 400000 <__ehdr_start> \| 4007e4: 911f8000 add x0, x0, #0x7e0 \| 4007e8: d65f03c0 ret Before this patch, the ADRP is not recognized, and is assumed to be steppable, resulting in corruption of the result: \| # ./adrp-self \| adrp_self => 0x4007e0 \| adrp_self() => 0x4007e0 \| EQUAL \| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events \| # echo 1 > /sys/kernel/tracing/events/uprobes/enable \| # ./adrp-self \| adrp_self => 0x4007e0 \| adrp_self() => 0xffffffffff7e0 \| NOT EQUAL After this patch, the ADRP is correctly recognized and simulated: \| # ./adrp-self \| adrp_self => 0x4007e0 \| adrp_self() => 0x4007e0 \| EQUAL \| # \| # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events \| # echo 1 > /sys/kernel/tracing/events/uprobes/enable \| # ./adrp-self \| adrp_self => 0x4007e0 \| adrp_self() => 0x4007e0 \| EQUAL Fixes: `9842ceae9f` ("arm64: Add uprobe support") Cc: stable@vger.kernel.org Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241008155851.801546-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-09 16:56:53 +01:00
Mark Rutland	50f813e576	arm64: probes: Fix simulate_ldr*_literal() The simulate_ldr_literal() code always loads a 64-bit quantity, and when simulating a 32-bit load into a 'W' register, it discards the most significant 32 bits. For big-endian kernels this means that the relevant bits are discarded, and the value returned is the the subsequent 32 bits in memory (i.e. the value at addr + 4). Additionally, simulate_ldr_literal() and simulate_ldrsw_literal() use a plain C load, which the compiler may tear or elide (e.g. if the target is the zero register). Today this doesn't happen to matter, but it may matter in future if trampoline code uses a LDR (literal) or LDRSW (literal). Update simulate_ldr_literal() and simulate_ldrsw_literal() to use an appropriately-sized READ_ONCE() to perform the access, which avoids these problems. Fixes: `39a67d49ba` ("arm64: kprobes instruction simulation support") Cc: stable@vger.kernel.org Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241008155851.801546-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-09 16:56:53 +01:00
Mark Rutland	acc450aa07	arm64: probes: Remove broken LDR (literal) uprobe support The simulate_ldr_literal() and simulate_ldrsw_literal() functions are unsafe to use for uprobes. Both functions were originally written for use with kprobes, and access memory with plain C accesses. When uprobes was added, these were reused unmodified even though they cannot safely access user memory. There are three key problems: 1) The plain C accesses do not have corresponding extable entries, and thus if they encounter a fault the kernel will treat these as unintentional accesses to user memory, resulting in a BUG() which will kill the kernel thread, and likely lead to further issues (e.g. lockup or panic()). 2) The plain C accesses are subject to HW PAN and SW PAN, and so when either is in use, any attempt to simulate an access to user memory will fault. Thus neither simulate_ldr_literal() nor simulate_ldrsw_literal() can do anything useful when simulating a user instruction on any system with HW PAN or SW PAN. 3) The plain C accesses are privileged, as they run in kernel context, and in practice can access a small range of kernel virtual addresses. The instructions they simulate have a range of +/-1MiB, and since the simulated instructions must itself be a user instructions in the TTBR0 address range, these can address the final 1MiB of the TTBR1 acddress range by wrapping downwards from an address in the first 1MiB of the TTBR0 address range. In contemporary kernels the last 8MiB of TTBR1 address range is reserved, and accesses to this will always fault, meaning this is no worse than (1). Historically, it was theoretically possible for the linear map or vmemmap to spill into the final 8MiB of the TTBR1 address range, but in practice this is extremely unlikely to occur as this would require either: * Having enough physical memory to fill the entire linear map all the way to the final 1MiB of the TTBR1 address range. * Getting unlucky with KASLR randomization of the linear map such that the populated region happens to overlap with the last 1MiB of the TTBR address range. ... and in either case if we were to spill into the final page there would be larger problems as the final page would alias with error pointers. Practically speaking, (1) and (2) are the big issues. Given there have been no reports of problems since the broken code was introduced, it appears that no-one is relying on probing these instructions with uprobes. Avoid these issues by not allowing uprobes on LDR (literal) and LDRSW (literal), limiting the use of simulate_ldr_literal() and simulate_ldrsw_literal() to kprobes. Attempts to place uprobes on LDR (literal) and LDRSW (literal) will be rejected as arm_probe_decode_insn() will return INSN_REJECTED. In future we can consider introducing working uprobes support for these instructions, but this will require more significant work. Fixes: `9842ceae9f` ("arm64: Add uprobe support") Cc: stable@vger.kernel.org Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241008155851.801546-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-09 16:56:53 +01:00
Basavaraj Natikar	c56f9ecb7f	HID: amd_sfh: Switch to device-managed dmam_alloc_coherent() Using the device-managed version allows to simplify clean-up in probe() error path. Additionally, this device-managed ensures proper cleanup, which helps to resolve memory errors, page faults, btrfs going read-only, and btrfs disk corruption. Fixes: `4b2c53d93a` ("SFH:Transport Driver to add support of AMD Sensor Fusion Hub (SFH)") Tested-by: Chris Hixon <linux-kernel-bugs@hixontech.com> Tested-by: Richard <hobbes1069@gmail.com> Tested-by: Skyler <skpu@pm.me> Reported-by: Chris Hixon <linux-kernel-bugs@hixontech.com> Closes: https://lore.kernel.org/all/3b129b1f-8636-456a-80b4-0f6cce0eef63@hixontech.com/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=219331 Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-10-09 17:18:57 +02:00
Vasiliy Kovalev	9988844c45	ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2 There is a problem with simultaneous audio output to headphones and speakers, and when headphones are turned off, the speakers also turn off and do not turn them on. However, it was found that if you boot linux immediately after windows, there are no such problems. When comparing alsa-info, the only difference is the different configuration of Node 0x1d: working conf. (windows): Pin-ctls: 0x80: HP not working (linux): Pin-ctls: 0xc0: OUT HP This patch disable the AC_PINCTL_OUT_EN bit of Node 0x1d and fixes the described problem. Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241009134248.662175-1-kovalev@altlinux.org Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-09 16:56:22 +02:00
Pawan Gupta	e4d2102018	x86/bugs: Use code segment selector for VERW operand Robert Gill reported below #GP in 32-bit mode when dosemu software was executing vm86() system call: general protection fault: 0000 [#1] PREEMPT SMP CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1 Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010 EIP: restore_all_switch_stack+0xbe/0xcf EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000 ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046 CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0 Call Trace: show_regs+0x70/0x78 die_addr+0x29/0x70 exc_general_protection+0x13c/0x348 exc_bounds+0x98/0x98 handle_exception+0x14d/0x14d exc_bounds+0x98/0x98 restore_all_switch_stack+0xbe/0xcf exc_bounds+0x98/0x98 restore_all_switch_stack+0xbe/0xcf This only happens in 32-bit mode when VERW based mitigations like MDS/RFDS are enabled. This is because segment registers with an arbitrary user value can result in #GP when executing VERW. Intel SDM vol. 2C documents the following behavior for VERW instruction: #GP(0) - If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. CLEAR_CPU_BUFFERS macro executes VERW instruction before returning to user space. Use %cs selector to reference VERW operand. This ensures VERW will not #GP for an arbitrary user %ds. [ mingo: Fixed the SOB chain. ] Fixes: `a0e2dab44d` ("x86/entry_32: Add VERW just before userspace transition") Reported-by: Robert Gill <rtgill82@gmail.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com Cc: stable@vger.kernel.org # 5.10+ Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707 Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.info/ Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Suggested-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2024-10-09 09:42:30 +02:00
Pawan Gupta	48a2440d0f	x86/entry_32: Clear CPU buffers after register restore in NMI return CPU buffers are currently cleared after call to exc_nmi, but before register state is restored. This may be okay for MDS mitigation but not for RDFS. Because RDFS mitigation requires CPU buffers to be cleared when registers don't have any sensitive data. Move CLEAR_CPU_BUFFERS after RESTORE_ALL_NMI. Fixes: `a0e2dab44d` ("x86/entry_32: Add VERW just before userspace transition") Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20240925-fix-dosemu-vm86-v7-2-1de0daca2d42%40linux.intel.com	2024-10-08 15:16:28 -07:00
Pawan Gupta	2e2e5143d4	x86/entry_32: Do not clobber user EFLAGS.ZF Opportunistic SYSEXIT executes VERW to clear CPU buffers after user EFLAGS are restored. This can clobber user EFLAGS.ZF. Move CLEAR_CPU_BUFFERS before the user EFLAGS are restored. This ensures that the user EFLAGS.ZF is not clobbered. Closes: https://lore.kernel.org/lkml/yVXwe8gvgmPADpRB6lXlicS2fcHoV5OHHxyuFbB_MEleRPD7-KhGe5VtORejtPe-KCkT8Uhcg5d7-IBw4Ojb4H7z5LQxoZylSmJ8KNL3A8o=@protonmail.com/ Fixes: `a0e2dab44d` ("x86/entry_32: Add VERW just before userspace transition") Reported-by: Jari Ruusu <jariruusu@protonmail.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20240925-fix-dosemu-vm86-v7-1-1de0daca2d42%40linux.intel.com	2024-10-08 15:16:28 -07:00
Florian Klink	dc7785e472	ARM: dts: bcm2837-rpi-cm3-io3: Fix HDMI hpd-gpio pin HDMI_HPD_N_1V8 is connected to GPIO pin 0, not 1. This fixes HDMI hotplug/output detection. See https://datasheets.raspberrypi.com/cm/cm3-schematics.pdf Signed-off-by: Florian Klink <flokli@flokli.de> Reviewed-by: Stefan Wahren <wahrenst@gmx.net> Link: https://lore.kernel.org/r/20240715230311.685641-1-flokli@flokli.de Reviewed-by: Stefan Wahren <wahrenst@gmx.net> Fixes: `a54fe8a6cf` ("ARM: dts: add Raspberry Pi Compute Module 3 and IO board") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>	2024-10-08 15:01:51 -07:00
Tokunori Ikegami	9c7072df53	nvme: delete unnecessary fallthru comment Signed-off-by: Tokunori Ikegami <ikegami.t@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-08 13:50:44 -07:00
Guixin Liu	40f0e5dc2f	nvmet-rdma: use sbitmap to replace rsp free list We can use sbitmap to manage all the nvmet_rdma_rsp instead of using free lists and spinlock, and we can use an additional tag to determine whether the nvmet_rdma_rsp is extra allocated. In addition, performance has improved: 1. testing environment is local rxe rdma devie and mem-based backstore device. 2. fio command, test the average 5 times: fio -filename=/dev/nvme0n1 --ioengine=libaio -direct=1 -size=1G -name=1 -thread -runtime=60 -time_based -rw=read -numjobs=16 -iodepth=128 -bs=4k -group_reporting 3. Before: 241k IOPS, After: 256k IOPS, an increase of about 5%. Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Jens Axboe <axboe@kernel.dk>	2024-10-08 13:45:36 -07:00
Thomas Weißschuh	b24d7f0da6	bpf, lsm: Remove bpf_lsm_key_free hook The key_free LSM hook has been removed. Remove the corresponding BPF hook. Avoid warnings during the build: BTFIDS vmlinux WARN: resolve_btfids: unresolved symbol bpf_lsm_key_free Fixes: `5f8d28f6d7` ("lsm: infrastructure management of the key security blob") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20241005-lsm-key_free-v1-1-42ea801dbd63@weissschuh.net	2024-10-08 12:52:40 -07:00
Felix Fietkau	52009b4193	wifi: mac80211: skip non-uploaded keys in ieee80211_iter_keys Sync iterator conditions with ieee80211_iter_keys_rcu. Fixes: `830af02f24` ("mac80211: allow driver to iterate keys") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://patch.msgid.link/20241006153630.87885-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-08 21:24:20 +02:00
Gustavo A. R. Silva	57be3d3562	wifi: radiotap: Avoid -Wflex-array-member-not-at-end warnings -Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. So, in order to avoid ending up with a flexible-array member in the middle of multiple other structs, we use the `__struct_group()` helper to create a new tagged `struct ieee80211_radiotap_header_fixed`. This structure groups together all the members of the flexible `struct ieee80211_radiotap_header` except the flexible array. As a result, the array is effectively separated from the rest of the members without modifying the memory layout of the flexible structure. We then change the type of the middle struct members currently causing trouble from `struct ieee80211_radiotap_header` to `struct ieee80211_radiotap_header_fixed`. We also want to ensure that in case new members need to be added to the flexible structure, they are always included within the newly created tagged struct. For this, we use `static_assert()`. This ensures that the memory layout for both the flexible structure and the new tagged struct is the same after any changes. This approach avoids having to implement `struct ieee80211_radiotap_header_fixed` as a completely separate structure, thus preventing having to maintain two independent but basically identical structures, closing the door to potential bugs in the future. So, with these changes, fix the following warnings: drivers/net/wireless/ath/wil6210/txrx.c:309:50: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/wireless/intel/ipw2x00/ipw2100.c:2521:50: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/wireless/intel/ipw2x00/ipw2200.h:1146:42: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/wireless/intel/ipw2x00/libipw.h:595:36: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/wireless/marvell/libertas/radiotap.h:34:42: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/wireless/marvell/libertas/radiotap.h:5:42: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/wireless/microchip/wilc1000/mon.c:10:42: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/wireless/microchip/wilc1000/mon.c:15:42: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/wireless/virtual/mac80211_hwsim.c:758:42: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/wireless/virtual/mac80211_hwsim.c:767:42: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://patch.msgid.link/ZwBMtBZKcrzwU7l4@kspp Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-08 21:24:20 +02:00
Felix Fietkau	393b6bc174	wifi: mac80211: do not pass a stopped vif to the driver in .get_txpower Avoid potentially crashing in the driver because of uninitialized private data Fixes: `5b3dc42b1b` ("mac80211: add support for driver tx power reporting") Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://patch.msgid.link/20241002095630.22431-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-08 21:24:19 +02:00
Remi Pommarel	4cc6f3e5e5	wifi: mac80211: Convert color collision detection to wiphy work Call to ieee80211_color_collision_detection_work() needs wiphy lock to be held (see lockdep assert in cfg80211_bss_color_notify()). Not locking wiphy causes the following lockdep error: WARNING: CPU: 2 PID: 42 at net/wireless/nl80211.c:19505 cfg80211_bss_color_notify+0x1a4/0x25c Modules linked in: CPU: 2 PID: 42 Comm: kworker/u8:3 Tainted: G W 6.4.0-02327-g36c6cb260481 #1048 Hardware name: Workqueue: phy1 ieee80211_color_collision_detection_work pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : cfg80211_bss_color_notify+0x1a4/0x25c lr : cfg80211_bss_color_notify+0x1a0/0x25c sp : ffff000002947d00 x29: ffff000002947d00 x28: ffff800008e1a000 x27: ffff000002bd4705 x26: ffff00000d034000 x25: ffff80000903cf40 x24: 0000000000000000 x23: ffff00000cb70720 x22: 0000000000800000 x21: ffff800008dfb008 x20: 000000000000008d x19: ffff00000d035fa8 x18: 0000000000000010 x17: 0000000000000001 x16: 000003564b1ce96a x15: 000d69696d057970 x14: 000000000000003b x13: 0000000000000001 x12: 0000000000040000 x11: 0000000000000001 x10: ffff80000978f9c0 x9 : ffff0000028d3174 x8 : ffff800008e30000 x7 : 0000000000000000 x6 : 0000000000000028 x5 : 000000000002f498 x4 : ffff00000d034a80 x3 : 0000000000800000 x2 : ffff800016143000 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: cfg80211_bss_color_notify+0x1a4/0x25c ieee80211_color_collision_detection_work+0x20/0x118 process_one_work+0x294/0x554 worker_thread+0x70/0x440 kthread+0xf4/0xf8 ret_from_fork+0x10/0x20 irq event stamp: 77372 hardirqs last enabled at (77371): [<ffff800008a346fc>] _raw_spin_unlock_irq+0x2c/0x4c hardirqs last disabled at (77372): [<ffff800008a28754>] el1_dbg+0x20/0x48 softirqs last enabled at (77350): [<ffff8000089e120c>] batadv_send_outstanding_bcast_packet+0xb8/0x120 softirqs last disabled at (77348): [<ffff8000089e11d4>] batadv_send_outstanding_bcast_packet+0x80/0x120 The wiphy lock cannot be taken directly from color collision detection delayed work (ieee80211_color_collision_detection_work()) because this work is cancel_delayed_work_sync() under this wiphy lock causing a potential deadlock( see [0] for details). To fix that ieee80211_color_collision_detection_work() could be converted to a wiphy work and cancel_delayed_work_sync() can be simply replaced by wiphy_delayed_work_cancel() serving the same purpose under wiphy lock. This could potentially fix [1]. [0]: https://lore.kernel.org/linux-wireless/D4A40Q44OAY2.W3SIF6UEPBUN@freebox.fr/ [1]: https://lore.kernel.org/lkml/000000000000612f290618eee3e5@google.com/ Reported-by: Nicolas Escande <nescande@freebox.fr> Signed-off-by: Remi Pommarel <repk@triplefau.lt> Link: https://patch.msgid.link/20240924192805.13859-3-repk@triplefau.lt Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-08 21:24:19 +02:00
Remi Pommarel	68d0021fe7	wifi: cfg80211: Add wiphy_delayed_work_pending() Add wiphy_delayed_work_pending() to check if any delayed work timer is pending, that can be used to be sure that wiphy_delayed_work_queue() won't postpone an already pending delayed work. Signed-off-by: Remi Pommarel <repk@triplefau.lt> Link: https://patch.msgid.link/20240924192805.13859-2-repk@triplefau.lt [fix return value kernel-doc] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-08 21:24:00 +02:00
Chenming Huang	e1a9ae3a73	wifi: cfg80211: Do not create BSS entries for unsupported channels Currently, in cfg80211_parse_ml_elem_sta_data(), when RNR element indicates a BSS that operates in a channel that current regulatory domain doesn't support, a NULL value is returned by ieee80211_get_channel_khz() and assigned to this BSS entry's channel field. Later in cfg80211_inform_single_bss_data(), the reported BSS entry's channel will be wrongly overridden by transmitted BSS's. This could result in connection failure that when wpa_supplicant tries to select this reported BSS entry while it actually resides in an unsupported channel. Since this channel is not supported, it is reasonable to skip such entries instead of reporting wrong information. Signed-off-by: Chenming Huang <quic_chenhuan@quicinc.com> Link: https://patch.msgid.link/20240923021644.12885-1-quic_chenhuan@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-08 21:15:51 +02:00
Ben Greear	8dd0498983	wifi: mac80211: Fix setting txpower with emulate_chanctx Propagate hw conf into the driver when txpower changes and driver is emulating channel contexts. Signed-off-by: Ben Greear <greearb@candelatech.com> Link: https://patch.msgid.link/20240924011325.1509103-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-08 21:15:19 +02:00
Geert Uytterhoeven	b3e046c314	mac80211: MAC80211_MESSAGE_TRACING should depend on TRACING When tracing is disabled, there is no point in asking the user about enabling tracing of all mac80211 debug messages. Fixes: `3fae027316` ("mac80211: trace debug messages") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://patch.msgid.link/85bbe38ce0df13350f45714e2dc288cc70947a19.1727179690.git.geert@linux-m68k.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-10-08 21:14:57 +02:00
Nathan Chancellor	d5fd042bf4	x86/resctrl: Annotate get_mem_config() functions as __init After a recent LLVM change [1] that deduces __cold on functions that only call cold code (such as __init functions), there is a section mismatch warning from __get_mem_config_intel(), which got moved to .text.unlikely. as a result of that optimization: WARNING: modpost: vmlinux: section mismatch in reference: \ __get_mem_config_intel+0x77 (section: .text.unlikely.) -> thread_throttle_mode_init (section: .init.text) Mark __get_mem_config_intel() as __init as well since it is only called from __init code, which clears up the warning. While __rdt_get_mem_config_amd() does not exhibit a warning because it does not call any __init code, it is a similar function that is only called from __init code like __get_mem_config_intel(), so mark it __init as well to keep the code symmetrical. CONFIG_SECTION_MISMATCH_WARN_ONLY=n would turn this into a fatal error. Fixes: `05b93417ce` ("x86/intel_rdt/mba: Add primary support for Memory Bandwidth Allocation (MBA)") Fixes: `4d05bf71f1` ("x86/resctrl: Introduce AMD QOS feature") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Cc: <stable@kernel.org> Link: `6b11573b8c` [1] Link: https://lore.kernel.org/r/20240917-x86-restctrl-get_mem_config_intel-init-v3-1-10d521256284@kernel.org	2024-10-08 21:05:10 +02:00
Ville Syrjälä	07c90acb07	wifi: iwlegacy: Clear stale interrupts before resuming device iwl4965 fails upon resume from hibernation on my laptop. The reason seems to be a stale interrupt which isn't being cleared out before interrupts are enabled. We end up with a race beween the resume trying to bring things back up, and the restart work (queued form the interrupt handler) trying to bring things down. Eventually the whole thing blows up. Fix the problem by clearing out any stale interrupts before interrupts get enabled during resume. Here's a debug log of the indicent: [ 12.042589] ieee80211 phy0: il_isr ISR inta 0x00000080, enabled 0xaa00008b, fh 0x00000000 [ 12.042625] ieee80211 phy0: il4965_irq_tasklet inta 0x00000080, enabled 0x00000000, fh 0x00000000 [ 12.042651] iwl4965 0000:10:00.0: RF_KILL bit toggled to enable radio. [ 12.042653] iwl4965 0000:10:00.0: On demand firmware reload [ 12.042690] ieee80211 phy0: il4965_irq_tasklet End inta 0x00000000, enabled 0xaa00008b, fh 0x00000000, flags 0x00000282 [ 12.052207] ieee80211 phy0: il4965_mac_start enter [ 12.052212] ieee80211 phy0: il_prep_station Add STA to driver ID 31: ff:ff:ff:ff:ff:ff [ 12.052244] ieee80211 phy0: il4965_set_hw_ready hardware ready [ 12.052324] ieee80211 phy0: il_apm_init Init card's basic functions [ 12.052348] ieee80211 phy0: il_apm_init L1 Enabled; Disabling L0S [ 12.055727] ieee80211 phy0: il4965_load_bsm Begin load bsm [ 12.056140] ieee80211 phy0: il4965_verify_bsm Begin verify bsm [ 12.058642] ieee80211 phy0: il4965_verify_bsm BSM bootstrap uCode image OK [ 12.058721] ieee80211 phy0: il4965_load_bsm BSM write complete, poll 1 iterations [ 12.058734] ieee80211 phy0: __il4965_up iwl4965 is coming up [ 12.058737] ieee80211 phy0: il4965_mac_start Start UP work done. [ 12.058757] ieee80211 phy0: __il4965_down iwl4965 is going down [ 12.058761] ieee80211 phy0: il_scan_cancel_timeout Scan cancel timeout [ 12.058762] ieee80211 phy0: il_do_scan_abort Not performing scan to abort [ 12.058765] ieee80211 phy0: il_clear_ucode_stations Clearing ucode stations in driver [ 12.058767] ieee80211 phy0: il_clear_ucode_stations No active stations found to be cleared [ 12.058819] ieee80211 phy0: _il_apm_stop Stop card, put in low power state [ 12.058827] ieee80211 phy0: _il_apm_stop_master stop master [ 12.058864] ieee80211 phy0: il4965_clear_free_frames 0 frames on pre-allocated heap on clear. [ 12.058869] ieee80211 phy0: Hardware restart was requested [ 16.132299] iwl4965 0000:10:00.0: START_ALIVE timeout after 4000ms. [ 16.132303] ------------[ cut here ]------------ [ 16.132304] Hardware became unavailable upon resume. This could be a software issue prior to suspend or a hardware issue. [ 16.132338] WARNING: CPU: 0 PID: 181 at net/mac80211/util.c:1826 ieee80211_reconfig+0x8f/0x14b0 [mac80211] [ 16.132390] Modules linked in: ctr ccm sch_fq_codel xt_tcpudp xt_multiport xt_state iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc joydev mousedev btusb btrtl btintel btbcm bluetooth ecdh_generic ecc iTCO_wdt i2c_dev iwl4965 iwlegacy coretemp snd_hda_codec_analog pcspkr psmouse mac80211 snd_hda_codec_generic libarc4 sdhci_pci cqhci sha256_generic sdhci libsha256 firewire_ohci snd_hda_intel snd_intel_dspcfg mmc_core snd_hda_codec snd_hwdep firewire_core led_class iosf_mbi snd_hda_core uhci_hcd lpc_ich crc_itu_t cfg80211 ehci_pci ehci_hcd snd_pcm usbcore mfd_core rfkill snd_timer snd usb_common soundcore video parport_pc parport intel_agp wmi intel_gtt backlight e1000e agpgart evdev [ 16.132456] CPU: 0 UID: 0 PID: 181 Comm: kworker/u8:6 Not tainted 6.11.0-cl+ #143 [ 16.132460] Hardware name: Hewlett-Packard HP Compaq 6910p/30BE, BIOS 68MCU Ver. F.19 07/06/2010 [ 16.132463] Workqueue: async async_run_entry_fn [ 16.132469] RIP: 0010:ieee80211_reconfig+0x8f/0x14b0 [mac80211] [ 16.132501] Code: da 02 00 00 c6 83 ad 05 00 00 00 48 89 df e8 98 1b fc ff 85 c0 41 89 c7 0f 84 e9 02 00 00 48 c7 c7 a0 e6 48 a0 e8 d1 77 c4 e0 <0f> 0b eb 2d 84 c0 0f 85 8b 01 00 00 c6 87 ad 05 00 00 00 e8 69 1b [ 16.132504] RSP: 0018:ffffc9000029fcf0 EFLAGS: 00010282 [ 16.132507] RAX: 0000000000000000 RBX: ffff8880072008e0 RCX: 0000000000000001 [ 16.132509] RDX: ffffffff81f21a18 RSI: 0000000000000086 RDI: 0000000000000001 [ 16.132510] RBP: ffff8880072003c0 R08: 0000000000000000 R09: 0000000000000003 [ 16.132512] R10: 0000000000000000 R11: ffff88807e5b0000 R12: 0000000000000001 [ 16.132514] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ffffff92 [ 16.132515] FS: 0000000000000000(0000) GS:ffff88807c200000(0000) knlGS:0000000000000000 [ 16.132517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.132519] CR2: 000055dd43786c08 CR3: 000000000978f000 CR4: 00000000000006f0 [ 16.132521] Call Trace: [ 16.132525] <TASK> [ 16.132526] ? __warn+0x77/0x120 [ 16.132532] ? ieee80211_reconfig+0x8f/0x14b0 [mac80211] [ 16.132564] ? report_bug+0x15c/0x190 [ 16.132568] ? handle_bug+0x36/0x70 [ 16.132571] ? exc_invalid_op+0x13/0x60 [ 16.132573] ? asm_exc_invalid_op+0x16/0x20 [ 16.132579] ? ieee80211_reconfig+0x8f/0x14b0 [mac80211] [ 16.132611] ? snd_hdac_bus_init_cmd_io+0x24/0x200 [snd_hda_core] [ 16.132617] ? pick_eevdf+0x133/0x1c0 [ 16.132622] ? check_preempt_wakeup_fair+0x70/0x90 [ 16.132626] ? wakeup_preempt+0x4a/0x60 [ 16.132628] ? ttwu_do_activate.isra.0+0x5a/0x190 [ 16.132632] wiphy_resume+0x79/0x1a0 [cfg80211] [ 16.132675] ? wiphy_suspend+0x2a0/0x2a0 [cfg80211] [ 16.132697] dpm_run_callback+0x75/0x1b0 [ 16.132703] device_resume+0x97/0x200 [ 16.132707] async_resume+0x14/0x20 [ 16.132711] async_run_entry_fn+0x1b/0xa0 [ 16.132714] process_one_work+0x13d/0x350 [ 16.132718] worker_thread+0x2be/0x3d0 [ 16.132722] ? cancel_delayed_work_sync+0x70/0x70 [ 16.132725] kthread+0xc0/0xf0 [ 16.132729] ? kthread_park+0x80/0x80 [ 16.132732] ret_from_fork+0x28/0x40 [ 16.132735] ? kthread_park+0x80/0x80 [ 16.132738] ret_from_fork_asm+0x11/0x20 [ 16.132741] </TASK> [ 16.132742] ---[ end trace 0000000000000000 ]--- [ 16.132930] ------------[ cut here ]------------ [ 16.132932] WARNING: CPU: 0 PID: 181 at net/mac80211/driver-ops.c:41 drv_stop+0xe7/0xf0 [mac80211] [ 16.132957] Modules linked in: ctr ccm sch_fq_codel xt_tcpudp xt_multiport xt_state iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc joydev mousedev btusb btrtl btintel btbcm bluetooth ecdh_generic ecc iTCO_wdt i2c_dev iwl4965 iwlegacy coretemp snd_hda_codec_analog pcspkr psmouse mac80211 snd_hda_codec_generic libarc4 sdhci_pci cqhci sha256_generic sdhci libsha256 firewire_ohci snd_hda_intel snd_intel_dspcfg mmc_core snd_hda_codec snd_hwdep firewire_core led_class iosf_mbi snd_hda_core uhci_hcd lpc_ich crc_itu_t cfg80211 ehci_pci ehci_hcd snd_pcm usbcore mfd_core rfkill snd_timer snd usb_common soundcore video parport_pc parport intel_agp wmi intel_gtt backlight e1000e agpgart evdev [ 16.133014] CPU: 0 UID: 0 PID: 181 Comm: kworker/u8:6 Tainted: G W 6.11.0-cl+ #143 [ 16.133018] Tainted: [W]=WARN [ 16.133019] Hardware name: Hewlett-Packard HP Compaq 6910p/30BE, BIOS 68MCU Ver. F.19 07/06/2010 [ 16.133021] Workqueue: async async_run_entry_fn [ 16.133025] RIP: 0010:drv_stop+0xe7/0xf0 [mac80211] [ 16.133048] Code: 48 85 c0 74 0e 48 8b 78 08 89 ea 48 89 de e8 e0 87 04 00 65 ff 0d d1 de c4 5f 0f 85 42 ff ff ff e8 be 52 c2 e0 e9 38 ff ff ff <0f> 0b 5b 5d c3 0f 1f 40 00 41 54 49 89 fc 55 53 48 89 f3 2e 2e 2e [ 16.133050] RSP: 0018:ffffc9000029fc50 EFLAGS: 00010246 [ 16.133053] RAX: 0000000000000000 RBX: ffff8880072008e0 RCX: ffff88800377f6c0 [ 16.133054] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8880072008e0 [ 16.133056] RBP: 0000000000000000 R08: ffffffff81f238d8 R09: 0000000000000000 [ 16.133058] R10: ffff8880080520f0 R11: 0000000000000000 R12: ffff888008051c60 [ 16.133060] R13: ffff8880072008e0 R14: 0000000000000000 R15: ffff8880072011d8 [ 16.133061] FS: 0000000000000000(0000) GS:ffff88807c200000(0000) knlGS:0000000000000000 [ 16.133063] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.133065] CR2: 000055dd43786c08 CR3: 000000000978f000 CR4: 00000000000006f0 [ 16.133067] Call Trace: [ 16.133069] <TASK> [ 16.133070] ? __warn+0x77/0x120 [ 16.133075] ? drv_stop+0xe7/0xf0 [mac80211] [ 16.133098] ? report_bug+0x15c/0x190 [ 16.133100] ? handle_bug+0x36/0x70 [ 16.133103] ? exc_invalid_op+0x13/0x60 [ 16.133105] ? asm_exc_invalid_op+0x16/0x20 [ 16.133109] ? drv_stop+0xe7/0xf0 [mac80211] [ 16.133132] ieee80211_do_stop+0x55a/0x810 [mac80211] [ 16.133161] ? fq_codel_reset+0xa5/0xc0 [sch_fq_codel] [ 16.133164] ieee80211_stop+0x4f/0x180 [mac80211] [ 16.133192] __dev_close_many+0xa2/0x120 [ 16.133195] dev_close_many+0x90/0x150 [ 16.133198] dev_close+0x5d/0x80 [ 16.133200] cfg80211_shutdown_all_interfaces+0x40/0xe0 [cfg80211] [ 16.133223] wiphy_resume+0xb2/0x1a0 [cfg80211] [ 16.133247] ? wiphy_suspend+0x2a0/0x2a0 [cfg80211] [ 16.133269] dpm_run_callback+0x75/0x1b0 [ 16.133273] device_resume+0x97/0x200 [ 16.133277] async_resume+0x14/0x20 [ 16.133280] async_run_entry_fn+0x1b/0xa0 [ 16.133283] process_one_work+0x13d/0x350 [ 16.133287] worker_thread+0x2be/0x3d0 [ 16.133290] ? cancel_delayed_work_sync+0x70/0x70 [ 16.133294] kthread+0xc0/0xf0 [ 16.133296] ? kthread_park+0x80/0x80 [ 16.133299] ret_from_fork+0x28/0x40 [ 16.133302] ? kthread_park+0x80/0x80 [ 16.133304] ret_from_fork_asm+0x11/0x20 [ 16.133307] </TASK> [ 16.133308] ---[ end trace 0000000000000000 ]--- [ 16.133335] ieee80211 phy0: PM: dpm_run_callback(): wiphy_resume [cfg80211] returns -110 [ 16.133360] ieee80211 phy0: PM: failed to restore async: error -110 Cc: stable@vger.kernel.org Cc: Stanislaw Gruszka <stf_xl@wp.pl> Cc: Kalle Valo <kvalo@kernel.org> Cc: linux-wireless@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Stanislaw Gruszka <stf_xl@wp.pl> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241001200745.8276-1-ville.syrjala@linux.intel.com	2024-10-08 21:51:24 +03:00
Chen Ridong	117932eea9	cgroup/bpf: use a dedicated workqueue for cgroup bpf destruction A hung_task problem shown below was found: INFO: task kworker/0:0:8 blocked for more than 327 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Workqueue: events cgroup_bpf_release Call Trace: <TASK> __schedule+0x5a2/0x2050 ? find_held_lock+0x33/0x100 ? wq_worker_sleeping+0x9e/0xe0 schedule+0x9f/0x180 schedule_preempt_disabled+0x25/0x50 __mutex_lock+0x512/0x740 ? cgroup_bpf_release+0x1e/0x4d0 ? cgroup_bpf_release+0xcf/0x4d0 ? process_scheduled_works+0x161/0x8a0 ? cgroup_bpf_release+0x1e/0x4d0 ? mutex_lock_nested+0x2b/0x40 ? __pfx_delay_tsc+0x10/0x10 mutex_lock_nested+0x2b/0x40 cgroup_bpf_release+0xcf/0x4d0 ? process_scheduled_works+0x161/0x8a0 ? trace_event_raw_event_workqueue_execute_start+0x64/0xd0 ? process_scheduled_works+0x161/0x8a0 process_scheduled_works+0x23a/0x8a0 worker_thread+0x231/0x5b0 ? __pfx_worker_thread+0x10/0x10 kthread+0x14d/0x1c0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x59/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> This issue can be reproduced by the following pressuse test: 1. A large number of cpuset cgroups are deleted. 2. Set cpu on and off repeatly. 3. Set watchdog_thresh repeatly. The scripts can be obtained at LINK mentioned above the signature. The reason for this issue is cgroup_mutex and cpu_hotplug_lock are acquired in different tasks, which may lead to deadlock. It can lead to a deadlock through the following steps: 1. A large number of cpusets are deleted asynchronously, which puts a large number of cgroup_bpf_release works into system_wq. The max_active of system_wq is WQ_DFL_ACTIVE(256). Consequently, all active works are cgroup_bpf_release works, and many cgroup_bpf_release works will be put into inactive queue. As illustrated in the diagram, there are 256 (in the acvtive queue) + n (in the inactive queue) works. 2. Setting watchdog_thresh will hold cpu_hotplug_lock.read and put smp_call_on_cpu work into system_wq. However step 1 has already filled system_wq, 'sscs.work' is put into inactive queue. 'sscs.work' has to wait until the works that were put into the inacvtive queue earlier have executed (n cgroup_bpf_release), so it will be blocked for a while. 3. Cpu offline requires cpu_hotplug_lock.write, which is blocked by step 2. 4. Cpusets that were deleted at step 1 put cgroup_release works into cgroup_destroy_wq. They are competing to get cgroup_mutex all the time. When cgroup_metux is acqured by work at css_killed_work_fn, it will call cpuset_css_offline, which needs to acqure cpu_hotplug_lock.read. However, cpuset_css_offline will be blocked for step 3. 5. At this moment, there are 256 works in active queue that are cgroup_bpf_release, they are attempting to acquire cgroup_mutex, and as a result, all of them are blocked. Consequently, sscs.work can not be executed. Ultimately, this situation leads to four processes being blocked, forming a deadlock. system_wq(step1) WatchDog(step2) cpu offline(step3) cgroup_destroy_wq(step4) ... 2000+ cgroups deleted asyn 256 actives + n inactives __lockup_detector_reconfigure P(cpu_hotplug_lock.read) put sscs.work into system_wq 256 + n + 1(sscs.work) sscs.work wait to be executed warting sscs.work finish percpu_down_write P(cpu_hotplug_lock.write) ...blocking... css_killed_work_fn P(cgroup_mutex) cpuset_css_offline P(cpu_hotplug_lock.read) ...blocking... 256 cgroup_bpf_release mutex_lock(&cgroup_mutex); ..blocking... To fix the problem, place cgroup_bpf_release works on a dedicated workqueue which can break the loop and solve the problem. System wqs are for misc things which shouldn't create a large number of concurrent work items. If something is going to generate >WQ_DFL_ACTIVE(256) concurrent work items, it should use its own dedicated workqueue. Fixes: `4bfc0bb2c6` ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself") Cc: stable@vger.kernel.org # v5.3+ Link: https://lore.kernel.org/cgroups/e90c32d2-2a85-4f28-9154-09c7d320cb60@huawei.com/T/#t Tested-by: Vishal Chourasia <vishalc@linux.ibm.com> Signed-off-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-10-08 08:43:22 -10:00
Chen Ni	7de7d35429	iommu/arm-smmu-v3: Convert comma to semicolon Replace comma between expressions with semicolons. Using a ',' in place of a ';' can have unintended side effects. Although that is not the case here, it is seems best to use ';' unless ',' is intended. Found by inspection. No functional change intended. Compile tested only. Fixes: `e3b1be2e73` ("iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20240923021557.3432068-1-nichen@iscas.ac.cn Signed-off-by: Will Deacon <will@kernel.org>	2024-10-08 18:38:34 +01:00
Daniel Mentz	f63237f54c	iommu/arm-smmu-v3: Fix last_sid_idx calculation for sid_bits==32 The function arm_smmu_init_strtab_2lvl uses the expression ((1 << smmu->sid_bits) - 1) to calculate the largest StreamID value. However, this fails for the maximum allowed value of SMMU_IDR1.SIDSIZE which is 32. The C standard states: "If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined." With smmu->sid_bits being 32, the prerequisites for undefined behavior are met. We observed that the value of (1 << 32) is 1 and not 0 as we initially expected. Similar bit shift operations in arm_smmu_init_strtab_linear seem to not be affected, because it appears to be unlikely for an SMMU to have SMMU_IDR1.SIDSIZE set to 32 but then not support 2-level Stream tables This issue was found by Ryan Huang <tzukui@google.com> on our team. Fixes: `ce410410f1` ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()") Signed-off-by: Daniel Mentz <danielmentz@google.com> Link: https://lore.kernel.org/r/20241002015357.1766934-1-danielmentz@google.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-08 18:37:48 +01:00
Robin Murphy	0dfe314cdd	iommu/arm-smmu: Clarify MMU-500 CPRE workaround CPRE workarounds are implicated in at least 5 MMU-500 errata, some of which remain unfixed. The comment and warning message have proven to be unhelpfully misleading about this scope, so reword them to get the point across with less risk of going out of date or confusing users. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/dfa82171b5248ad7cf1f25592101a6eec36b8c9a.1728400877.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2024-10-08 18:35:41 +01:00
Nam Cao	6b1e0651e9	irqchip/sifive-plic: Unmask interrupt in plic_irq_enable() It is possible that an interrupt is disabled and masked at the same time. When the interrupt is enabled again by enable_irq(), only plic_irq_enable() is called, not plic_irq_unmask(). The interrupt remains masked and never raises. An example where interrupt is both disabled and masked is when handle_fasteoi_irq() is the handler, and IRQS_ONESHOT is set. The interrupt handler: 1. Mask the interrupt 2. Handle the interrupt 3. Check if interrupt is still enabled, and unmask it (see cond_unmask_eoi_irq()) If another task disables the interrupt in the middle of the above steps, the interrupt will not get unmasked, and will remain masked when it is enabled in the future. The problem is occasionally observed when PREEMPT_RT is enabled, because PREEMPT_RT adds the IRQS_ONESHOT flag. But PREEMPT_RT only makes the problem more likely to appear, the bug has been around since commit `a1706a1c50` ("irqchip/sifive-plic: Separate the enable and mask operations"). Fix it by unmasking interrupt in plic_irq_enable(). Fixes: `a1706a1c50` ("irqchip/sifive-plic: Separate the enable and mask operations") Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241003084152.2422969-1-namcao@linutronix.de	2024-10-08 17:49:21 +02:00
Marc Zyngier	1442ee0011	irqchip/gic-v4: Don't allow a VMOVP on a dying VPE Kunkun Jiang reported that there is a small window of opportunity for userspace to force a change of affinity for a VPE while the VPE has already been unmapped, but the corresponding doorbell interrupt still visible in /proc/irq/. Plug the race by checking the value of vmapp_count, which tracks whether the VPE is mapped ot not, and returning an error in this case. This involves making vmapp_count common to both GICv4.1 and its v4.0 ancestor. Fixes: `64edfaa9a2` ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP") Reported-by: Kunkun Jiang <jiangkunkun@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com Link: https://lore.kernel.org/all/20241002204959.2051709-1-maz@kernel.org	2024-10-08 17:44:27 +02:00
Martin Kletzander	2b5648416e	x86/resctrl: Avoid overflow in MB settings in bw_validate() The resctrl schemata file supports specifying memory bandwidth associated with the Memory Bandwidth Allocation (MBA) feature via a percentage (this is the default) or bandwidth in MiBps (when resctrl is mounted with the "mba_MBps" option). The allowed range for the bandwidth percentage is from /sys/fs/resctrl/info/MB/min_bandwidth to 100, using a granularity of /sys/fs/resctrl/info/MB/bandwidth_gran. The supported range for the MiBps bandwidth is 0 to U32_MAX. There are two issues with parsing of MiBps memory bandwidth: * The user provided MiBps is mistakenly rounded up to the granularity that is unique to percentage input. * The user provided MiBps is parsed using unsigned long (thus accepting values up to ULONG_MAX), and then assigned to u32 that could result in overflow. Do not round up the MiBps value and parse user provided bandwidth as the u32 it is intended to be. Use the appropriate kstrtou32() that can detect out of range values. Fixes: `8205a078ba` ("x86/intel_rdt/mba_sc: Add schemata support") Fixes: `6ce1560d35` ("x86/resctrl: Switch over to the resctrl mbps_val list") Co-developed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Martin Kletzander <nert.pinx@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com>	2024-10-08 16:17:38 +02:00
Benjamin Bara	3fe9f5882c	ASoC: dapm: avoid container_of() to get component The current implementation does not work for widgets of DAPMs without component, as snd_soc_dapm_to_component() requires it. If the widget is directly owned by the card, e.g. as it is the case for the tegra implementation, the call leads to UB. Therefore directly access the component of the widget's DAPM to be able to check if a component is available. Fixes: `f82eb06a40` ("ASoC: tegra: machine: Handle component name prefix") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com> Link: https://patch.msgid.link/20241008-tegra-dapm-v2-1-5e999cb5f0e7@skidata.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-08 13:52:23 +01:00
Anumula Murali Mohan Reddy	5069d7e202	RDMA/core: Fix ENODEV error for iWARP test over vlan If traffic is over vlan, cma_validate_port() fails to match vlan net_device ifindex with bound_if_index and results in ENODEV error. It is because rdma_copy_src_l2_addr() always assigns bound_if_index with real net_device ifindex. This patch fixes the issue by assigning bound_if_index with vlan net_device index if traffic is over vlan. Fixes: `f8ef1be816` ("RDMA/cma: Avoid GID lookups on iWARP devices") Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Link: https://patch.msgid.link/20241008114334.146702-1-anumula@chelsio.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-08 15:07:41 +03:00
Mark Brown	d4a89e5aee	KVM: arm64: Expose S1PIE to guests Prior to commit `70ed723829` ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1") we just exposed the santised view of ID_AA64MMFR3_EL1 to guests, meaning that they saw both TCRX and S1PIE if present on the host machine. That commit added VMM control over the contents of the register and exposed S1POE but removed S1PIE, meaning that the extension is no longer visible to guests. Reenable support for S1PIE with VMM control. Fixes: `70ed723829` ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20241005-kvm-arm64-fix-s1pie-v1-1-5901f02de749@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-08 12:59:23 +01:00
Christian Brauner	dad1b6c805	Merge patch series "fsdax/xfs: unshare range fixes for 6.12" Darrick J. Wong <djwong@kernel.org> says: This patchset fixes multiple data corruption bugs in the fallocate unshare range implementation for fsdax. * patches from https://lore.kernel.org/r/172796813251.1131942.12184885574609980777.stgit@frogsfrogsfrogs: fsdax: dax_unshare_iter needs to copy entire blocks fsdax: remove zeroing code from dax_unshare_iter iomap: share iomap_unshare_iter predicate code with fsdax xfs: don't allocate COW extents when unsharing a hole Link: https://lore.kernel.org/r/172796813251.1131942.12184885574609980777.stgit@frogsfrogsfrogs Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-08 12:27:14 +02:00
Kai Vehmanen	9814c1447f	ASoC: SOF: Intel: hda-loader: do not wait for HDaudio IOC Commit `9ee3f0d8c9` ("ASOC: SOF: Intel: hda-loader: only wait for HDaudio IOC for IPC4 devices") removed DMA wait for IPC3 case. Proceed and remove the wait for IPC4 devices as well. There is no dependency to IPC version in the load logic and checking the firmware status is a sufficient check in case of errors. The removed code also had a bug in that -ETIMEDOUT is returned without stopping the DMA transfer. Cc: stable@vger.kernel.org Link: https://github.com/thesofproject/linux/issues/5135 Fixes: `9ee3f0d8c9` ("ASOC: SOF: Intel: hda-loader: only wait for HDaudio IOC for IPC4 devices") Suggested-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://patch.msgid.link/20241008060710.15409-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-08 10:49:14 +01:00
Venkata Prasad Potturu	494ddacd4a	ASoC: SOF: amd: Fix for ACP SRAM addr for acp7.0 platform Incorrect SRAM base addr for acp7.0 platform results firmware boot failure. Add condition check to support SRAM addr for various platforms. Fixes: `145d7e5ae8` ("ASoC: SOF: amd: add option to use sram for data bin loading") Signed-off-by: Venkata Prasad Potturu <venkataprasad.potturu@amd.com> Link: https://patch.msgid.link/20241008091347.594378-2-venkataprasad.potturu@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-08 10:49:13 +01:00
Venkata Prasad Potturu	0a5c40393b	ASoC: SOF: amd: Add error log for DSP firmware validation failure Add dev_err to print ACP_SHA_DSP_FW_QUALIFIER and ACP_SHA_PSP_ACK register values for PSP firmware validation failure case. Signed-off-by: Venkata Prasad Potturu <venkataprasad.potturu@amd.com> Link: https://patch.msgid.link/20241008091347.594378-1-venkataprasad.potturu@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-08 10:49:12 +01:00
Amadeusz Sławiński	0dbb186c35	ASoC: Intel: avs: Update stream status in a separate thread Function snd_pcm_period_elapsed() is part of sequence servicing HDAudio stream IRQs. It's called under Global Interrupt Enable (GIE) disabled - no HDAudio interrupts will be raised. At the same time, the function may end up calling __snd_pcm_xrun() or snd_pcm_drain_done(). On the avs-driver side, this translates to IPCs and as GIE is disabled, these will never complete successfully. Improve system stability by scheduling stream-IRQ handling in a separate thread. Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://patch.msgid.link/20241008083758.756578-1-amadeuszx.slawinski@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-08 10:49:11 +01:00
Oliver Upton	79cc6cdb93	KVM: arm64: nv: Clarify safety of allowing TLBI unmaps to reschedule There's been a decent amount of attention around unmaps of nested MMUs, and TLBI handling is no exception to this. Add a comment clarifying why it is safe to reschedule during a TLBI unmap, even without a reference on the MMU in progress. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241007233028.2236133-5-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-08 10:40:27 +01:00
Oliver Upton	c268f204f7	KVM: arm64: nv: Punt stage-2 recycling to a vCPU request Currently, when a nested MMU is repurposed for some other MMU context, KVM unmaps everything during vcpu_load() while holding the MMU lock for write. This is quite a performance bottleneck for large nested VMs, as all vCPU scheduling will spin until the unmap completes. Start punting the MMU cleanup to a vCPU request, where it is then possible to periodically release the MMU lock and CPU in the presence of contention. Ensure that no vCPU winds up using a stale MMU by tracking the pending unmap on the S2 MMU itself and requesting an unmap on every vCPU that finds it. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241007233028.2236133-4-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-08 10:40:27 +01:00
Oliver Upton	3c164eb946	KVM: arm64: nv: Do not block when unmapping stage-2 if disallowed Right now the nested code allows unmap operations on a shadow stage-2 to block unconditionally. This is wrong in a couple places, such as a non-blocking MMU notifier or on the back of a sched_in() notifier as part of shadow MMU recycling. Carry through whether or not blocking is allowed to kvm_pgtable_stage2_unmap(). This 'fixes' an issue where stage-2 MMU reclaim would precipitate a stack overflow from a pile of kvm_sched_in() callbacks, all trying to recycle a stage-2 MMU. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241007233028.2236133-3-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-08 10:40:27 +01:00
Oliver Upton	6ded46b5a4	KVM: arm64: nv: Keep reference on stage-2 MMU when scheduled out If a vCPU is scheduling out and not in WFI emulation, it is highly likely it will get scheduled again soon and reuse the MMU it had before. Dropping the MMU at vcpu_put() can have some unfortunate consequences, as the MMU could get reclaimed and used in a different context, forcing another 'cold start' on an otherwise active MMU. Avoid that altogether by keeping a reference on the MMU if the vCPU is scheduling out, ensuring that another vCPU cannot reclaim it while the current vCPU is away. Since there are more MMUs than vCPUs, this does not affect the guarantee that an unused MMU is available at any time. Furthermore, this makes the vcpu->arch.hw_mmu ~stable in preemptible code, at least for where it matters in the stage-2 abort path. Yes, the MMU can change across WFI emulation, but there isn't even a use case where this would matter. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241007233028.2236133-2-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-08 10:40:27 +01:00
Oliver Upton	ae8f8b3761	KVM: arm64: Unregister redistributor for failed vCPU creation Alex reports that syzkaller has managed to trigger a use-after-free when tearing down a VM: BUG: KASAN: slab-use-after-free in kvm_put_kvm+0x300/0xe68 virt/kvm/kvm_main.c:5769 Read of size 8 at addr ffffff801c6890d0 by task syz.3.2219/10758 CPU: 3 UID: 0 PID: 10758 Comm: syz.3.2219 Not tainted 6.11.0-rc6-dirty #64 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x17c/0x1a8 arch/arm64/kernel/stacktrace.c:317 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324 __dump_stack lib/dump_stack.c:93 [inline] dump_stack_lvl+0x94/0xc0 lib/dump_stack.c:119 print_report+0x144/0x7a4 mm/kasan/report.c:377 kasan_report+0xcc/0x128 mm/kasan/report.c:601 __asan_report_load8_noabort+0x20/0x2c mm/kasan/report_generic.c:381 kvm_put_kvm+0x300/0xe68 virt/kvm/kvm_main.c:5769 kvm_vm_release+0x4c/0x60 virt/kvm/kvm_main.c:1409 __fput+0x198/0x71c fs/file_table.c:422 ____fput+0x20/0x30 fs/file_table.c:450 task_work_run+0x1cc/0x23c kernel/task_work.c:228 do_notify_resume+0x144/0x1a0 include/linux/resume_user_mode.h:50 el0_svc+0x64/0x68 arch/arm64/kernel/entry-common.c:169 el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598 Upon closer inspection, it appears that we do not properly tear down the MMIO registration for a vCPU that fails creation late in the game, e.g. a vCPU w/ the same ID already exists in the VM. It is important to consider the context of commit that introduced this bug by moving the unregistration out of __kvm_vgic_vcpu_destroy(). That change correctly sought to avoid an srcu v. config_lock inversion by breaking up the vCPU teardown into two parts, one guarded by the config_lock. Fix the use-after-free while avoiding lock inversion by adding a special-cased unregistration to __kvm_vgic_vcpu_destroy(). This is safe because failed vCPUs are torn down outside of the config_lock. Cc: stable@vger.kernel.org Fixes: `f616506754` ("KVM: arm64: vgic: Don't hold config_lock while unregistering redistributors") Reported-by: Alexander Potapenko <glider@google.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241007223909.2157336-1-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-08 10:40:27 +01:00
Marc Zyngier	9b7c3dd596	Merge branch kvm-arm64/idregs-6.12 into kvmarm/fixes * kvm-arm64/idregs-6.12: : . : Make some fields of ID_AA64DFR0_EL1 and ID_AA64PFR1_EL1 : writable from userspace, so that a VMM can influence the : set of guest-visible features. : : - for ID_AA64DFR0_EL1: DoubleLock, WRPs, PMUVer and DebugVer : are writable (courtesy of Shameer Kolothum) : : - for ID_AA64PFR1_EL1: BT, SSBS, CVS2_frac are writable : (courtesy of Shaoqin Huang) : . KVM: selftests: aarch64: Add writable test for ID_AA64PFR1_EL1 KVM: arm64: Allow userspace to change ID_AA64PFR1_EL1 KVM: arm64: Use kvm_has_feat() to check if FEAT_SSBS is advertised to the guest KVM: arm64: Disable fields that KVM doesn't know how to handle in ID_AA64PFR1_EL1 KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-08 10:40:04 +01:00
Jonathan Corbet	368196e501	netfs: fix documentation build error Commit `86b374d061` ("netfs: Remove fs/netfs/io.c") did what it said on the tin, but failed to remove the reference to fs/netfs/io.c from the documentation, leading to this docs build error: WARNING: kernel-doc './scripts/kernel-doc -rst -enable-lineno -sphinx-version 7.3.7 ./fs/netfs/io.c' failed with return code 1 Remove the offending kernel-doc line, making the docs build process a little happier. Fixes: `86b374d061` ("netfs: Remove fs/netfs/io.c") Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/874j5nlu86.fsf@trenco.lwn.net Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-08 10:39:38 +02:00
christoph.plattner	557f6e4ab5	ALSA: hda: Sound support for HP Spectre x360 16 inch model 2024 Included solution with ALC287/CS35L41 did not cover full function, 14 inch code blocked. Forcing output for treble/bass speaker to connection 0x02, setting pin configs for LEDs and re-powering amp and calling fixups for cs35l41, mute and gpio leds was a working combination to reach correct behaviour. Signed-off-by: christoph.plattner <christoph.plattner@gmx.at> Link: https://patch.msgid.link/20241005173509.1196001-1-christoph.plattner@gmx.at Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-10-08 10:07:32 +02:00
WangYuli	7a5ab80711	HID: multitouch: Add quirk for HONOR MagicBook Art 14 touchpad The behavior of HONOR MagicBook Art 14 touchpad is not consistent after reboots, as sometimes it reports itself as a touchpad, and sometimes as a mouse. Similarly to GLO-GXXX it is possible to call MT_QUIRK_FORCE_GET_FEATURE as a workaround to force set feature in mt_set_input_mode() for such special touchpad device. [jkosina@suse.com: reword changelog a little bit] Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/1040 Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: WangYuli <wangyuli@uniontech.com> Reviewed-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-10-08 09:15:25 +02:00
Qianqiang Liu	6ff57a2ea7	RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event nlmsg_put() may return a NULL pointer assigned to nlh, which will later be dereferenced in nlmsg_end(). Fixes: `9cbed5aab5` ("RDMA/nldev: Add support for RDMA monitoring") Link: https://patch.msgid.link/r/Zva71Yf3F94uxi5A@iZbp1asjb3cy8ks0srf007Z Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-08 10:12:42 +03:00
Selvin Xavier	8e65abacbc	RDMA/bnxt_re: Fix the max WQEs used in Static WQE mode max_sw_wqe used for static wqe mode should be same as the max_wqe. Calculate the max_sw_wqe only for the variable WQE mode. Fixes: `de1d364c38` ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters") Link: https://patch.msgid.link/r/1726715161-18941-7-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-08 10:12:36 +03:00
Kalesh AP	c5c1ae73b7	RDMA/bnxt_re: Add a check for memory allocation __alloc_pbl() can return error when memory allocation fails. Driver is not checking the status on one of the instances. Fixes: `0c4dcd6028` ("RDMA/bnxt_re: Refactor hardware queue memory allocation") Link: https://patch.msgid.link/r/1726715161-18941-4-git-send-email-selvin.xavier@broadcom.com Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-08 10:12:29 +03:00
Saravanan Vajravel	9ab20f76ae	RDMA/bnxt_re: Fix incorrect AVID type in WQE structure Driver uses internal data structure to construct WQE frame. It used avid type as u16 which can accommodate up to 64K AVs. When outstanding AVID crosses 64K, driver truncates AVID and hence it uses incorrect AVID to WR. This leads to WR failure due to invalid AV ID and QP is moved to error state with reason set to 19 (INVALID AVID). When RDMA CM path is used, this issue hits QP1 and it is moved to error state Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Link: https://patch.msgid.link/r/1726715161-18941-3-git-send-email-selvin.xavier@broadcom.com Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-08 10:12:20 +03:00
Kalesh AP	3fc5410f22	RDMA/bnxt_re: Fix a possible memory leak In bnxt_re_setup_chip_ctx() when bnxt_qplib_map_db_bar() fails driver is not freeing the memory allocated for "rdev->chip_ctx". Fixes: `0ac20faf5d` ("RDMA/bnxt_re: Reorg the bar mapping") Link: https://patch.msgid.link/r/1726715161-18941-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-08 10:11:50 +03:00
Stefan Blum	1a5cbb526e	HID: multitouch: Add support for B2402FVA track point By default the track point does not work on the Asus Expertbook B2402FVA. From libinput record i got the ID of the track point device: evdev: # Name: ASUE1201:00 04F3:32AE # ID: bus 0x18 vendor 0x4f3 product 0x32ae version 0x100 I found that the track point is functional, when i set the MT_CLS_WIN_8_FORCE_MULTI_INPUT_NSMU class for the reported device. Signed-off-by: Stefan Blum <stefan.blum@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-10-08 09:10:59 +02:00
Wade Wang	87b6962090	HID: plantronics: Workaround for an unexcepted opposite volume key Some Plantronics headset as the below send an unexcept opposite volume key's HID report for each volume key press after 200ms, like unecepted Volume Up Key following Volume Down key pressed by user. This patch adds a quirk to hid-plantronics for these devices, which will ignore the second unexcepted opposite volume key if it happens within 220ms from the last one that was handled. Plantronics EncorePro 500 Series (047f:431e) Plantronics Blackwire_3325 Series (047f:430c) The patch was tested on the mentioned model, it shouldn't affect other models, however, this quirk might be needed for them too. Auto-repeat (when a key is held pressed) is not affected per test result. Cc: stable@vger.kernel.org Signed-off-by: Wade Wang <wade.wang@hp.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-10-08 08:45:17 +02:00
Jiri Olsa	45126b155e	bpf: Fix memory leak in bpf_core_apply We need to free specs properly. Fixes: `3d2786d65a` ("bpf: correctly handle malformed BPF_CORE_TYPE_ID_LOCAL relos") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20241007160958.607434-1-jolsa@kernel.org	2024-10-07 20:28:24 -07:00
SurajSonawane2415	b402328a24	block: Fix elevator_get_default() checking for NULL q->tag_set elevator_get_default() and elv_support_iosched() both check for whether or not q->tag_set is non-NULL, however it's not possible for them to be NULL. This messes up some static checkers, as the checking of tag_set isn't consistent. Remove the checks, which both simplifies the logic and avoids checker errors. Signed-off-by: SurajSonawane2415 <surajsonawane0215@gmail.com> Link: https://lore.kernel.org/r/20241007111416.13814-1-surajsonawane0215@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-10-07 14:24:36 -06:00
Richard Gong	f8bc84b609	x86/amd_nb: Add new PCI ID for AMD family 1Ah model 20h Add new PCI ID for Device 18h and Function 4. Signed-off-by: Richard Gong <richard.gong@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20240913162903.649519-1-richard.gong@amd.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>	2024-10-07 21:04:28 +02:00
Timo Grautstueck	ab8851431b	lib/Kconfig.debug: fix grammar in RUST_BUILD_ASSERT_ALLOW Just a grammar fix in lib/Kconfig.debug, under the config option RUST_BUILD_ASSERT_ALLOW. Reported-by: Miguel Ojeda <ojeda@kernel.org> Closes: https://github.com/Rust-for-Linux/linux/issues/1006 Fixes: `ecaa6ddff2` ("rust: add `build_error` crate") Signed-off-by: Timo Grautstueck <timo.grautstueck@web.de> Link: https://lore.kernel.org/r/20241006140244.5509-1-timo.grautstueck@web.de Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2024-10-07 19:13:03 +02:00
Dhananjay Ugwekar	c10e50a469	cpufreq/amd-pstate: Fix amd_pstate mode switch on shared memory systems While switching the driver mode between active and passive, Collaborative Processor Performance Control (CPPC) is disabled in amd_pstate_unregister_driver(). But, it is not enabled back while registering the new driver (passive or active). This leads to the new driver mode not working correctly, so enable it back in amd_pstate_register_driver(). Fixes: `3ca7bc818d` ("cpufreq: amd-pstate: Add guided mode control support via sysfs") Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241004122303.94283-1-Dhananjay.Ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>	2024-10-07 11:32:05 -05:00
Miquel Raynal	8380dbf1b9	ASoC: dt-bindings: davinci-mcasp: Fix interrupt properties Combinations of "tx" alone, "rx" alone and "tx", "rx" together are supposedly valid (see link below), which is not the case today as "rx" alone is not accepted by the current binding. Let's rework the two interrupt properties to expose all correct possibilities. Cc: Péter Ujfalusi <peter.ujfalusi@gmail.com> Link: https://lore.kernel.org/linux-sound/20241003102552.2c11840e@xps-13/T/#m277fce1d49c50d94e071f7890aed472fa2c64052 Fixes: `8be90641a0` ("ASoC: dt-bindings: davinci-mcasp: convert McASP bindings to yaml schema") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241003083611.461894-1-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-07 15:35:35 +01:00
Zichen Xie	49da1463c9	ASoC: qcom: Fix NULL Dereference in asoc_qcom_lpass_cpu_platform_probe() A devm_kzalloc() in asoc_qcom_lpass_cpu_platform_probe() could possibly return NULL pointer. NULL Pointer Dereference may be triggerred without addtional check. Add a NULL check for the returned pointer. Fixes: `b5022a36d2` ("ASoC: qcom: lpass: Use regmap_field for i2sctl and dmactl registers") Cc: stable@vger.kernel.org Signed-off-by: Zichen Xie <zichenxie0106@gmail.com> Link: https://patch.msgid.link/20241006205737.8829-1-zichenxie0106@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-10-07 15:35:34 +01:00
Darrick J. Wong	50793801fc	fsdax: dax_unshare_iter needs to copy entire blocks The code that copies data from srcmap to iomap in dax_unshare_iter is very very broken, which bfoster's recent fsx changes have exposed. If the pos and len passed to dax_file_unshare are not aligned to an fsblock boundary, the iter pos and length in the _iter function will reflect this unalignment. dax_iomap_direct_access always returns a pointer to the start of the kmapped fsdax page, even if its pos argument is in the middle of that page. This is catastrophic for data integrity when iter->pos is not aligned to a page, because daddr/saddr do not point to the same byte in the file as iter->pos. Hence we corrupt user data by copying it to the wrong place. If iter->pos + iomap_length() in the _iter function not aligned to a page, then we fail to copy a full block, and only partially populate the destination block. This is catastrophic for data confidentiality because we expose stale pmem contents. Fix both of these issues by aligning copy_pos/copy_len to a page boundary (remember, this is fsdax so 1 fsblock == 1 base page) so that we always copy full blocks. We're not done yet -- there's no call to invalidate_inode_pages2_range, so programs that have the file range mmap'd will continue accessing the old memory mapping after the file metadata updates have completed. Be careful with the return value -- if the unshare succeeds, we still need to return the number of bytes that the iomap iter thinks we're operating on. Cc: ruansy.fnst@fujitsu.com Fixes: `d984648e42` ("fsdax,xfs: port unshare to fsdax") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/172796813328.1131942.16777025316348797355.stgit@frogsfrogsfrogs Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-07 13:51:47 +02:00
Darrick J. Wong	95472274b6	fsdax: remove zeroing code from dax_unshare_iter Remove the code in dax_unshare_iter that zeroes the destination memory because it's not necessary. If srcmap is unwritten, we don't have to do anything because that unwritten extent came from the regular file mapping, and unwritten extents cannot be shared. The same applies to holes. Furthermore, zeroing to unshare a mapping is just plain wrong because unsharing means copy on write, and we should be copying data. This is effectively a revert of commit `13dd4e0462` ("fsdax: unshare: zero destination if srcmap is HOLE or UNWRITTEN") Cc: ruansy.fnst@fujitsu.com Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/172796813311.1131942.16033376284752798632.stgit@frogsfrogsfrogs Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-07 13:51:47 +02:00
Darrick J. Wong	6ef6a0e821	iomap: share iomap_unshare_iter predicate code with fsdax The predicate code that iomap_unshare_iter uses to decide if it's really needs to unshare a file range mapping should be shared with the fsdax version, because right now they're opencoded and inconsistent. Note that we simplify the predicate logic a bit -- we no longer allow unsharing of inline data mappings, but there aren't any filesystems that allow shared inline data currently. This is a fix in the sense that it should have been ported to fsdax. Fixes: `b53fdb215d` ("iomap: improve shared block detection in iomap_unshare_iter") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/172796813294.1131942.15762084021076932620.stgit@frogsfrogsfrogs Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-07 13:51:47 +02:00
Darrick J. Wong	b8c4076db5	xfs: don't allocate COW extents when unsharing a hole It doesn't make sense to allocate a COW extent when unsharing a hole because holes cannot be shared. Fixes: `1f1397b721` ("xfs: don't allocate into the data fork for an unshare request") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/172796813277.1131942.5486112889531210260.stgit@frogsfrogsfrogs Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-07 13:51:46 +02:00
David Howells	796a404964	netfs: In readahead, put the folio refs as soon extracted netfslib currently defers dropping the ref on the folios it obtains during readahead to after it has started I/O on the basis that we can do it whilst we wait for the I/O to complete, but this runs the risk of the I/O collection racing with this in future. Furthermore, Matthew Wilcox strongly suggests that the refs should be dropped immediately, as readahead_folio() does (netfslib is using __readahead_batch() which doesn't drop the refs). Fixes: `ee4cdf7ba8` ("netfs: Speed up buffered reading") Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/3771538.1728052438@warthog.procyon.org.uk cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-07 13:48:22 +02:00
Josua Mayer	841dd5b122	arm64: dts: marvell: cn9130-sr-som: fix cp0 mdio pin numbers SolidRun CN9130 SoM actually uses CP_MPP[0:1] for mdio. CP_MPP[40] provides reference clock for dsa switch and ethernet phy on Clearfog Pro, wheras MPP[41] controls efuse programming voltage "VHV". Update the cp0 mdio pinctrl node to specify mpp0, mpp1. Fixes: `1c510c7d82` ("arm64: dts: add description for solidrun cn9130 som and clearfog boards") Cc: stable@vger.kernel.org # 6.11.x Signed-off-by: Josua Mayer <josua@solid-run.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/stable/20241002-cn9130-som-mdio-v1-1-0942be4dc550%40solid-run.com Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>	2024-10-07 10:05:35 +02:00
Sabrina Dubroca	3f0ab59e65	xfrm: validate new SA's prefixlen using SA family when sel.family is unset This expands the validation introduced in commit `07bf790895` ("xfrm: Validate address prefix lengths in the xfrm selector.") syzbot created an SA with usersa.sel.family = AF_UNSPEC usersa.sel.prefixlen_s = 128 usersa.family = AF_INET Because of the AF_UNSPEC selector, verify_newsa_info doesn't put limits on prefixlen_{s,d}. But then copy_from_user_state sets x->sel.family to usersa.family (AF_INET). Do the same conversion in verify_newsa_info before validating prefixlen_{s,d}, since that's how prefixlen is going to be used later on. Reported-by: syzbot+cc39f136925517aed571@syzkaller.appspotmail.com Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2024-10-07 09:09:08 +02:00
Bartosz Wawrzyniak	2d0f973b5f	phy: cadence: Sierra: Fix offset of DEQ open eye algorithm control register Fix the value of SIERRA_DEQ_OPENEYE_CTRL_PREG and add a definition for SIERRA_DEQ_TAU_EPIOFFSET_MODE_PREG. This fixes the SGMII single link register configuration. Fixes: `7a5ad9b4b9` ("phy: cadence: Sierra: Update single link PCIe register configuration") Signed-off-by: Bartosz Wawrzyniak <bwawrzyn@cisco.com> Link: https://lore.kernel.org/r/20241003123405.1101157-1-bwawrzyn@cisco.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-07 11:55:34 +05:30
Sam Edwards	cb4c7df596	phy: usb: Fix missing elements in BCM4908 USB init array The Broadcom USB PHY driver contains a lookup table (`reg_bits_map_tables`) to resolve register bitmaps unique to certain versions of the USB PHY as found in various Broadcom chip families. A recent commit (see 'fixes' tag) introduced two new elements to each chip family in this table -- except for one: BCM4908. This resulted in the xHCI controller not being initialized correctly, causing a panic on boot. The next patch will update this table to use designated initializers in order to prevent this from happening again. For now, just add back the missing array elements to resolve the regression. Fixes: `4536fe9640` ("phy: usb: suppress OC condition for 7439b2") Signed-off-by: Sam Edwards <CFSworks@gmail.com> Reviewed-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20241004034131.1363813-2-CFSworks@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-10-07 11:43:46 +05:30
Mohammed Anees	ccf9af8b0d	iioc: dac: ltc2664: Fix span variable usage in ltc2664_channel_config() In the current implementation of the ltc2664_channel_config() function, a variable named span is declared and initialized to 0, intended to capture the return value of the ltc2664_set_span() function. However, the output of ltc2664_set_span() is directly assigned to chan->span, leaving span unchanged. As a result, when the function later checks if (span < 0), this condition will never trigger an error since span remains 0, this flaw leads to ineffective error handling. Resolve this issue by using the ret variable to get the return value and later assign it if successful and remove unused span variable. Fixes: `4cc2fc445d` ("iio: dac: ltc2664: Add driver for LTC2664 and LTC2672") Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com> Link: https://patch.msgid.link/20241005200435.25061-1-pvmohammedanees2003@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:31:46 +01:00
Javier Carrasco	27b6aa68a6	iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig This driver makes use of regmap_mmio, but does not select the required module. Add the missing 'select REGMAP_MMIO'. Fixes: `4d4b30526e` ("iio: dac: add support for stm32 DAC") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-ad2s1210-select-v1-8-4019453f8c33@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:31:36 +01:00
Javier Carrasco	252ff06a4c	iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig This driver makes use of regmap_spi, but does not select the required module. Add the missing 'select REGMAP_SPI'. Fixes: `8316cebd1e` ("iio: dac: add support for ltc1660") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-ad2s1210-select-v1-7-4019453f8c33@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:31:25 +01:00
Javier Carrasco	bcdab6f74c	iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig This driver makes use of regmap_spi, but does not select the required module. Add the missing 'select REGMAP_SPI'. Fixes: `cbbb819837` ("iio: dac: ad5770r: Add AD5770R support") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-ad2s1210-select-v1-6-4019453f8c33@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:31:14 +01:00
Javier Carrasco	b7983033a1	iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig This driver makes use of regmap_spi, but does not select the required module. Add the missing 'select REGMAP_SPI'. Fixes: `28b4c30bfa` ("iio: amplifiers: ada4250: add support for ADA4250") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-ad2s1210-select-v1-5-4019453f8c33@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:31:03 +01:00
Javier Carrasco	c64643ed4e	iio: frequency: adf4377: add missing select REMAP_SPI in Kconfig This driver makes use of regmap_spi, but does not select the required module. Add the missing 'select REGMAP_SPI'. Fixes: `eda549e2e5` ("iio: frequency: adf4377: add support for ADF4377") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-ad2s1210-select-v1-3-4019453f8c33@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:30:53 +01:00
Javier Carrasco	2caa67b625	iio: resolver: ad2s1210: add missing select (TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `128b9389db` ("staging: iio: resolver: ad2s1210: add triggered buffer support") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20241003-ad2s1210-select-v1-2-4019453f8c33@gmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:30:38 +01:00
Javier Carrasco	17a9936018	iio: resolver: ad2s1210 add missing select REGMAP in Kconfig This driver makes use of regmap, but does not select the required module. Add the missing 'select REGMAP'. Fixes: `b3689e1441` ("staging: iio: resolver: ad2s1210: use regmap for config registers") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20241003-ad2s1210-select-v1-1-4019453f8c33@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:30:27 +01:00
Javier Carrasco	75461a0b15	iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `16b0526153` ("mb1232.c: add distance iio sensor with i2c") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-13-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:30:15 +01:00
Javier Carrasco	3f7b25f6ad	iio: pressure: bm1390: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Note the original driver patch had wrong part number hence the odd fixes entry. Fixes: `81ca5979b6` ("iio: pressure: Support ROHM BU1390") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Acked-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-12-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:30:02 +01:00
Javier Carrasco	fbb913895e	iio: magnetometer: af8133j: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `1d8f4b0462` ("iio: magnetometer: add a driver for Voltafield AF8133J magnetometer") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: Andrey Skvortsov <andrej.skvortzov@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-11-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:29:46 +01:00
Javier Carrasco	aa99ef68ef	iio: light: bu27008: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `41ff93d14f` ("iio: light: ROHM BU27008 color sensor") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Acked-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-10-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:29:32 +01:00
Javier Carrasco	3fd8bbf939	iio: chemical: ens160: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `0fc26596b4` ("iio: chemical: ens160: add triggered buffer support") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Acked-by: Gustavo Silva <gustavograzs@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-9-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:29:13 +01:00
Javier Carrasco	62ec3df342	iio: dac: ad5766: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `885b9790c2` ("drivers:iio:dac:ad5766.c: Add trigger buffer") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-8-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:28:55 +01:00
Javier Carrasco	5bede94867	iio: dac: ad3552r: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `8f2b54824b` ("drivers:iio:dac: Add AD3552R driver support") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-7-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:28:43 +01:00
Javier Carrasco	a985576af8	iio: adc: ti-lmp92064: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `6c7bc1d27b` ("iio: adc: ti-lmp92064: add buffering support") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-6-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:28:31 +01:00
Javier Carrasco	f3fe8c52c5	iio: adc: ti-lmp92064: add missing select REGMAP_SPI in Kconfig This driver makes use of regmap_spi, but does not select the required module. Add the missing 'select REGMAP_SPI'. Fixes: `6271989426` ("iio: adc: add ADC driver for the TI LMP92064 controller") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-5-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:28:19 +01:00
Javier Carrasco	eb143d05de	iio: adc: ti-ads124s08: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `e717f8c6df` ("iio: adc: Add the TI ads124s08 ADC code") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-3-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:28:05 +01:00
Javier Carrasco	f4dc96f051	iio: adc: ad7944: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `d1efcf8871` ("iio: adc: ad7944: add driver for AD7944/AD7985/AD7986") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20241003-iio-select-v1-2-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:27:52 +01:00
Javier Carrasco	96666f05d1	iio: accel: kx022a: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig This driver makes use of triggered buffers, but does not select the required modules. Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'. Fixes: `7c1d1677b3` ("iio: accel: Support Kionix/ROHM KX022A accelerometer") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Acked-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://patch.msgid.link/20241003-iio-select-v1-1-67c0385197cd@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-10-06 16:27:25 +01:00
Kent Overstreet	cba31b7eee	bcachefs: Delete vestigal check_inode() checks BCH_INODE_i_size_dirty dates from before we had logged operations for truncate (as well as finsert) - it hasn't been needed since before bcachefs was mainlined. BCH_INODE_i_sectors_dirty hasn't been needed since we started always updating i_sectors transactionally - it's been unused for even longer. BCH_INODE_backptr_untrusted also hasn't been used since prior to mainlining; when unlinking a hardling, we zero out the backpointer fields if they're for the dirent being removed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-06 03:03:45 -04:00
Kent Overstreet	12f286085b	bcachefs: btree_iter_peek_upto() now handles BTREE_ITER_all_snapshots end_pos now compares against snapshot ID when required Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-06 03:03:45 -04:00
Kent Overstreet	38864eccf7	bcachefs: reattach_inode() now correctly handles interior snapshot nodes When we find an unreachable inode, we now reattach it in the oldest version that needs to be reattached (thus avoiding redundant work reattaching every single version), and we now fix up inode -> dirent backpointers in newer versions as needed - or white out the reattaching dirent in newer versions, if the newer version isn't supposed to be reattached. This results in the second verify fsck now passing cleanly after repairing on a user-provided filesystem image with thousands of different snapshots. Reported-by: Christopher Snowhill <chris@kode54.net> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-06 03:03:45 -04:00
Kent Overstreet	bade9711e0	bcachefs: Split out check_unreachable_inodes() pass With inode backpointers, we can write a very simple check_unreachable_inodes() pass that only looks for non-unlinked inodes that are missing backpointers, and reattaches them. This simplifies check_directory_structure() so that it's now only checking for directory structure loops, Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-06 03:03:45 -04:00
Kent Overstreet	bf4baaa087	bcachefs: Fix lockdep splat in bch2_accounting_read We can't take sb_lock while holding mark_lock, so split out replicas_entry_validate() and replicas_entry_sb_validate() - replicas_entry_validate() now uses the normal online device interface. 00039 ========= TEST set_option 00039 00039 WATCHDOG 30 00040 bcachefs (vdb): starting version 1.12: rebalance_work_acct_fix opts=errors=panic 00040 bcachefs (vdb): initializing new filesystem 00040 bcachefs (vdb): going read-write 00040 bcachefs (vdb): marking superblocks 00040 bcachefs (vdb): initializing freespace 00040 bcachefs (vdb): done initializing freespace 00040 bcachefs (vdb): reading snapshots table 00040 bcachefs (vdb): reading snapshots done 00040 bcachefs (vdb): done starting filesystem 00040 zstd 00041 bcachefs (vdb): shutting down 00041 bcachefs (vdb): going read-only 00041 bcachefs (vdb): finished waiting for writes to stop 00041 bcachefs (vdb): flushing journal and stopping allocators, journal seq 3 00041 bcachefs (vdb): flushing journal and stopping allocators complete, journal seq 11 00041 bcachefs (vdb): shutdown complete, journal seq 12 00041 bcachefs (vdb): marking filesystem clean 00041 bcachefs (vdb): shutdown complete 00041 Setting option on offline fs 00041 bch2_write_super(): fatal error : attempting to write superblock that wasn't version downgraded (1.12: (unknown version) > 1.10: disk_accounting_v3) 00041 fatal error - emergency read only 00041 bch2_write_super(): fatal error : attempting to write superblock that wasn't version downgraded (1.12: (unknown version) > 1.10: disk_accounting_v3) 00042 bcachefs (vdb): starting version 1.12: rebalance_work_acct_fix opts=errors=panic,compression=zstd 00042 bcachefs (vdb): recovering from clean shutdown, journal seq 12 00042 bcachefs (vdb): accounting_read... 00042 00042 ====================================================== 00042 WARNING: possible circular locking dependency detected 00042 6.12.0-rc1-ktest-g805e938a8502 #6807 Not tainted 00042 ------------------------------------------------------ 00042 mount.bcachefs/665 is trying to acquire lock: 00045 ffffff80cc280908 (&c->sb_lock){+.+.}-{3:3}, at: bch2_replicas_entry_validate (fs/bcachefs/replicas.c:102) 00045 00045 but task is already holding lock: 00048 ffffff80cc284870 (&c->mark_lock){++++}-{0:0}, at: bch2_accounting_read (fs/bcachefs/disk_accounting.c:670 (discriminator 1)) 00048 00048 which lock already depends on the new lock. 00048 00048 00048 the existing dependency chain (in reverse order) is: 00048 00048 -> #1 (&c->mark_lock){++++}-{0:0}: 00049 percpu_down_write (kernel/locking/percpu-rwsem.c:232) 00052 bch2_sb_replicas_to_cpu_replicas (fs/bcachefs/replicas.c:583) 00055 bch2_sb_to_fs (fs/bcachefs/super-io.c:614) 00057 bch2_fs_open (fs/bcachefs/super.c:828 fs/bcachefs/super.c:2050) 00060 bch2_fs_get_tree (fs/bcachefs/fs.c:2067) 00062 vfs_get_tree (fs/super.c:1801) 00064 path_mount (fs/namespace.c:3507 fs/namespace.c:3834) 00066 __arm64_sys_mount (fs/namespace.c:3847 fs/namespace.c:4055 fs/namespace.c:4032 fs/namespace.c:4032) 00067 invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54) 00068 do_el0_svc (include/linux/thread_info.h:127 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2)) 00069 el0_svc (arch/arm64/include/asm/irqflags.h:82 arch/arm64/include/asm/irqflags.h:123 arch/arm64/include/asm/irqflags.h:136 arch/arm64/kernel/entry-common.c:165 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713) 00069 ========= FAILED TIMEOUT set_option in 30s Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-10-06 03:03:45 -04:00
SurajSonawane2415	d41bff05a6	hid: intel-ish-hid: Fix uninitialized variable 'rv' in ish_fw_xfer_direct_dma Fix the uninitialized symbol 'rv' in the function ish_fw_xfer_direct_dma to resolve the following warning from the smatch tool: drivers/hid/intel-ish-hid/ishtp-fw-loader.c:714 ish_fw_xfer_direct_dma() error: uninitialized symbol 'rv'. Initialize 'rv' to 0 to prevent undefined behavior from uninitialized access. Cc: stable@vger.kernel.org Fixes: `91b228107d` ("HID: intel-ish-hid: ISH firmware loader client driver") Signed-off-by: SurajSonawane2415 <surajsonawane0215@gmail.com> Link: https://patch.msgid.link/20241004075944.44932-1-surajsonawane0215@gmail.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2024-10-04 10:33:11 +02:00
Hannes Reinecke	782373ba27	nvme: tcp: avoid race between queue_lock lock and destroy Commit `76d54bf20c` ("nvme-tcp: don't access released socket during error recovery") added a mutex_lock() call for the queue->queue_lock in nvme_tcp_get_address(). However, the mutex_lock() races with mutex_destroy() in nvme_tcp_free_queue(), and causes the WARN below. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 3 PID: 34077 at kernel/locking/mutex.c:587 __mutex_lock+0xcf0/0x1220 Modules linked in: nvmet_tcp nvmet nvme_tcp nvme_fabrics iw_cm ib_cm ib_core pktcdvd nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr sunrpc ppdev 9pnet_virtio 9pnet pcspkr netfs parport_pc parport e1000 i2c_piix4 i2c_smbus loop fuse nfnetlink zram bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper xfs drm sym53c8xx floppy nvme scsi_transport_spi nvme_core nvme_auth serio_raw ata_generic pata_acpi dm_multipath qemu_fw_cfg [last unloaded: ib_uverbs] CPU: 3 UID: 0 PID: 34077 Comm: udisksd Not tainted 6.11.0-rc7 #319 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:__mutex_lock+0xcf0/0x1220 Code: 08 84 d2 0f 85 c8 04 00 00 8b 15 ef b6 c8 01 85 d2 0f 85 78 f4 ff ff 48 c7 c6 20 93 ee af 48 c7 c7 60 91 ee af e8 f0 a7 6d fd <0f> 0b e9 5e f4 ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 RSP: 0018:ffff88811305f760 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff88812c652058 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: ffff88811305f8b0 R08: 0000000000000001 R09: ffffed1075c36341 R10: ffff8883ae1b1a0b R11: 0000000000010498 R12: 0000000000000000 R13: 0000000000000000 R14: dffffc0000000000 R15: ffff88812c652058 FS: 00007f9713ae4980(0000) GS:ffff8883ae180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fcd78483c7c CR3: 0000000122c38000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __warn.cold+0x5b/0x1af ? __mutex_lock+0xcf0/0x1220 ? report_bug+0x1ec/0x390 ? handle_bug+0x3c/0x80 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? __mutex_lock+0xcf0/0x1220 ? nvme_tcp_get_address+0xc2/0x1e0 [nvme_tcp] ? __pfx___mutex_lock+0x10/0x10 ? __lock_acquire+0xd6a/0x59e0 ? nvme_tcp_get_address+0xc2/0x1e0 [nvme_tcp] nvme_tcp_get_address+0xc2/0x1e0 [nvme_tcp] ? __pfx_nvme_tcp_get_address+0x10/0x10 [nvme_tcp] nvme_sysfs_show_address+0x81/0xc0 [nvme_core] dev_attr_show+0x42/0x80 ? __asan_memset+0x1f/0x40 sysfs_kf_seq_show+0x1f0/0x370 seq_read_iter+0x2cb/0x1130 ? rw_verify_area+0x3b1/0x590 ? __mutex_lock+0x433/0x1220 vfs_read+0x6a6/0xa20 ? lockdep_hardirqs_on+0x78/0x100 ? __pfx_vfs_read+0x10/0x10 ksys_read+0xf7/0x1d0 ? __pfx_ksys_read+0x10/0x10 ? __x64_sys_openat+0x105/0x1d0 do_syscall_64+0x93/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? __pfx_ksys_read+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? do_syscall_64+0x9f/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f9713f55cfa Code: 55 48 89 e5 48 83 ec 20 48 89 55 e8 48 89 75 f0 89 7d f8 e8 e8 74 f8 ff 48 8b 55 e8 48 8b 75 f0 41 89 c0 8b 7d f8 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 2e 44 89 c7 48 89 45 f8 e8 42 75 f8 ff 48 8b RSP: 002b:00007ffd7f512e70 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 000055c38f316859 RCX: 00007f9713f55cfa RDX: 0000000000000fff RSI: 00007ffd7f512eb0 RDI: 0000000000000011 RBP: 00007ffd7f512e90 R08: 0000000000000000 R09: 00000000ffffffff R10: 0000000000000000 R11: 0000000000000246 R12: 000055c38f317148 R13: 0000000000000000 R14: 00007f96f4004f30 R15: 000055c3b6b623c0 </TASK> The WARN is observed when the blktests test case nvme/014 is repeated with tcp transport. It is rare, and 200 times repeat is required to recreate in some test environments. To avoid the WARN, check the NVME_TCP_Q_LIVE flag before locking queue->queue_lock. The flag is cleared long time before the lock gets destroyed. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-03 12:14:35 -07:00
Linus Walleij	c9560baef0	Merge tag 'intel-pinctrl-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/intel into fixes intel-pinctrl for v6.12-2 Fixes a few issues with Intel pin control platform driver: * fix missing reference counter drop of fwnode on error path * replace comma by semicolon to follow the kernel style * add Panther Lake to the list of supported devices The following is an automated git shortlog grouped by driver: intel: - platform: Add Panther Lake to the list of supported - platform: use semicolon instead of comma in ncommunities assignment - platform: fix error path in device_for_each_child_node() Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2024-10-03 16:03:27 +02:00
Andy Shevchenko	3775625709	pinctrl: intel: platform: Add Panther Lake to the list of supported Intel Panther Lake is supported by the generic platform driver, so add it to the list of supported in Kconfig. Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>	2024-10-03 13:14:31 +03:00
Herve Codina	1117b916f5	soc: fsl: cpm1: qmc: Fix unused data compilation warning In some configuration, compilation raises warnings related to unused data. Indeed, depending on configuration, those data can be unused. mark those data as __maybe_unused to avoid compilation warnings. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409071707.ou2KFNKO-lkp@intel.com/ Fixes: `eb680d5630` ("soc: fsl: cpm1: qmc: Add support for QUICC Engine (QE) implementation") Signed-off-by: Herve Codina <herve.codina@bootlin.com> Link: https://lore.kernel.org/r/20240909121129.57067-1-herve.codina@bootlin.com Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>	2024-10-02 23:29:38 +02:00
Geert Uytterhoeven	122019f051	soc: fsl: cpm1: qmc: Do not use IS_ERR_VALUE() on error pointers ppc64_book3e_allmodconfig: drivers/soc/fsl/qe/qmc.c: In function ‘qmc_qe_init_resources’: include/linux/err.h:28:49: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 28 \| #define IS_ERR_VALUE(x) unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO) \| ^ include/linux/compiler.h:77:45: note: in definition of macro ‘unlikely’ 77 \| # define unlikely(x) __builtin_expect(!!(x), 0) \| ^ drivers/soc/fsl/qe/qmc.c:1764:13: note: in expansion of macro ‘IS_ERR_VALUE’ 1764 \| if (IS_ERR_VALUE(info)) { \| ^~~~~~~~~~~~ IS_ERR_VALUE() is only meant for pointers. Fix this by checking for a negative error value instead, which matches the documented behavior of devm_qe_muram_alloc() aka devm_cpm_muram_alloc(). While at it, remove the unneeded print in case of a memory allocation failure, and propagate the returned error code. Fixes: `eb680d5630` ("soc: fsl: cpm1: qmc: Add support for QUICC Engine (QE) implementation") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Herve Codina <herve.codina@bootlin.com> Acked-by: Herve Codina <herve.codina@bootlin.com> Link: https://lore.kernel.org/r/8b113596b2c8cdda6655346232cc603efdeb935a.1727708905.git.geert+renesas@glider.be Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>	2024-10-02 23:28:46 +02:00
Martin KaFai Lau	bcd28cfd04	Merge branch 'bpf: devmap: provide rxq after redirect' Florian Kauer says: ==================== rxq contains a pointer to the device from where the redirect happened. Currently, the BPF program that was executed after a redirect via BPF_MAP_TYPE_DEVMAP* does not have it set. Add bugfix and related selftest. --- Changes in v4: - return -> goto out_close, thanks Toke - Link to v3: https://lore.kernel.org/r/20240909-devel-koalo-fix-ingress-ifindex-v3-0-66218191ecca@linutronix.de Changes in v3: - initialize skel to NULL, thanks Stanislav - Link to v2: https://lore.kernel.org/r/20240906-devel-koalo-fix-ingress-ifindex-v2-0-4caa12c644b4@linutronix.de Changes in v2: - changed fixes tag - added selftest - Link to v1: https://lore.kernel.org/r/20240905-devel-koalo-fix-ingress-ifindex-v1-1-d12a0d74c29c@linutronix.de ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-10-02 13:51:49 -07:00
Florian Kauer	49ebeb0c15	bpf: selftests: send packet to devmap redirect XDP The current xdp_devmap_attach test attaches a program that redirects to another program via devmap. It is, however, never executed, so do that to catch any bugs that might occur during execution. Also, execute the same for a veth pair so that we also cover the non-generic path. Warning: Running this without the bugfix in this series will likely crash your system. Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20240911-devel-koalo-fix-ingress-ifindex-v4-2-5c643ae10258@linutronix.de Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-10-02 13:51:43 -07:00
Florian Kauer	ca9984c5f0	bpf: devmap: provide rxq after redirect rxq contains a pointer to the device from where the redirect happened. Currently, the BPF program that was executed after a redirect via BPF_MAP_TYPE_DEVMAP* does not have it set. This is particularly bad since accessing ingress_ifindex, e.g. SEC("xdp") int prog(struct xdp_md pkt) { return bpf_redirect_map(&dev_redirect_map, 0, 0); } SEC("xdp/devmap") int prog_after_redirect(struct xdp_md pkt) { bpf_printk("ifindex %i", pkt->ingress_ifindex); return XDP_PASS; } depends on access to rxq, so a NULL pointer gets dereferenced: <1>[ 574.475170] BUG: kernel NULL pointer dereference, address: 0000000000000000 <1>[ 574.475188] #PF: supervisor read access in kernel mode <1>[ 574.475194] #PF: error_code(0x0000) - not-present page <6>[ 574.475199] PGD 0 P4D 0 <4>[ 574.475207] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI <4>[ 574.475217] CPU: 4 UID: 0 PID: 217 Comm: kworker/4:1 Not tainted 6.11.0-rc5-reduced-00859-g780801200300 #23 <4>[ 574.475226] Hardware name: Intel(R) Client Systems NUC13ANHi7/NUC13ANBi7, BIOS ANRPL357.0026.2023.0314.1458 03/14/2023 <4>[ 574.475231] Workqueue: mld mld_ifc_work <4>[ 574.475247] RIP: 0010:bpf_prog_5e13354d9cf5018a_prog_after_redirect+0x17/0x3c <4>[ 574.475257] Code: cc cc cc cc cc cc cc 80 00 00 00 cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 66 90 55 48 89 e5 f3 0f 1e fa 48 8b 57 20 <48> 8b 52 00 8b 92 e0 00 00 00 48 bf f8 a6 d5 c4 5d a0 ff ff be 0b <4>[ 574.475263] RSP: 0018:ffffa62440280c98 EFLAGS: 00010206 <4>[ 574.475269] RAX: ffffa62440280cd8 RBX: 0000000000000001 RCX: 0000000000000000 <4>[ 574.475274] RDX: 0000000000000000 RSI: ffffa62440549048 RDI: ffffa62440280ce0 <4>[ 574.475278] RBP: ffffa62440280c98 R08: 0000000000000002 R09: 0000000000000001 <4>[ 574.475281] R10: ffffa05dc8b98000 R11: ffffa05f577fca40 R12: ffffa05dcab24000 <4>[ 574.475285] R13: ffffa62440280ce0 R14: ffffa62440549048 R15: ffffa62440549000 <4>[ 574.475289] FS: 0000000000000000(0000) GS:ffffa05f4f700000(0000) knlGS:0000000000000000 <4>[ 574.475294] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 574.475298] CR2: 0000000000000000 CR3: 000000025522e000 CR4: 0000000000f50ef0 <4>[ 574.475303] PKRU: 55555554 <4>[ 574.475306] Call Trace: <4>[ 574.475313] <IRQ> <4>[ 574.475318] ? __die+0x23/0x70 <4>[ 574.475329] ? page_fault_oops+0x180/0x4c0 <4>[ 574.475339] ? skb_pp_cow_data+0x34c/0x490 <4>[ 574.475346] ? kmem_cache_free+0x257/0x280 <4>[ 574.475357] ? exc_page_fault+0x67/0x150 <4>[ 574.475368] ? asm_exc_page_fault+0x26/0x30 <4>[ 574.475381] ? bpf_prog_5e13354d9cf5018a_prog_after_redirect+0x17/0x3c <4>[ 574.475386] bq_xmit_all+0x158/0x420 <4>[ 574.475397] __dev_flush+0x30/0x90 <4>[ 574.475407] veth_poll+0x216/0x250 [veth] <4>[ 574.475421] __napi_poll+0x28/0x1c0 <4>[ 574.475430] net_rx_action+0x32d/0x3a0 <4>[ 574.475441] handle_softirqs+0xcb/0x2c0 <4>[ 574.475451] do_softirq+0x40/0x60 <4>[ 574.475458] </IRQ> <4>[ 574.475461] <TASK> <4>[ 574.475464] __local_bh_enable_ip+0x66/0x70 <4>[ 574.475471] __dev_queue_xmit+0x268/0xe40 <4>[ 574.475480] ? selinux_ip_postroute+0x213/0x420 <4>[ 574.475491] ? alloc_skb_with_frags+0x4a/0x1d0 <4>[ 574.475502] ip6_finish_output2+0x2be/0x640 <4>[ 574.475512] ? nf_hook_slow+0x42/0xf0 <4>[ 574.475521] ip6_finish_output+0x194/0x300 <4>[ 574.475529] ? __pfx_ip6_finish_output+0x10/0x10 <4>[ 574.475538] mld_sendpack+0x17c/0x240 <4>[ 574.475548] mld_ifc_work+0x192/0x410 <4>[ 574.475557] process_one_work+0x15d/0x380 <4>[ 574.475566] worker_thread+0x29d/0x3a0 <4>[ 574.475573] ? __pfx_worker_thread+0x10/0x10 <4>[ 574.475580] ? __pfx_worker_thread+0x10/0x10 <4>[ 574.475587] kthread+0xcd/0x100 <4>[ 574.475597] ? __pfx_kthread+0x10/0x10 <4>[ 574.475606] ret_from_fork+0x31/0x50 <4>[ 574.475615] ? __pfx_kthread+0x10/0x10 <4>[ 574.475623] ret_from_fork_asm+0x1a/0x30 <4>[ 574.475635] </TASK> <4>[ 574.475637] Modules linked in: veth br_netfilter bridge stp llc iwlmvm x86_pkg_temp_thermal iwlwifi efivarfs nvme nvme_core <4>[ 574.475662] CR2: 0000000000000000 <4>[ 574.475668] ---[ end trace 0000000000000000 ]--- Therefore, provide it to the program by setting rxq properly. Fixes: `cb261b594b` ("bpf: Run devmap xdp_prog on flush instead of bulk enqueue") Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20240911-devel-koalo-fix-ingress-ifindex-v4-1-5c643ae10258@linutronix.de Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-10-02 13:48:26 -07:00
Rosen Penev	393c554093	pinctrl: aw9523: add missing mutex_destroy Otherwise the mutex remains after a failed kzalloc. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://lore.kernel.org/20241001212724.309320-1-rosenp@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2024-10-02 15:48:38 +02:00
Charlie Jenkins	6eabf65604	irqchip/sifive-plic: Return error code on failure Set error to -ENOMEM if kcalloc() fails or if irq_domain_add_linear() fails inside of plic_probe() instead of returning 0. Fixes: `4d936f10ff` ("irqchip/sifive-plic: Probe plic driver early for Allwinner D1 platform") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20240903-correct_error_codes_sifive_plic-v1-1-d929b79663a2@rivosinc.com Closes: https://lore.kernel.org/r/202409031122.yBh8HrxA-lkp@intel.com/	2024-10-02 15:15:33 +02:00
Andrew Jones	4a1361e9a5	irqchip/riscv-imsic: Fix output text of base address The "per-CPU IDs ... at base ..." info log is outputting a physical address, not a PPN. Fixes: `027e125acd` ("irqchip/riscv-imsic: Add device MSI domain support for platform devices") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/all/20240909085610.46625-2-ajones@ventanamicro.com	2024-10-02 15:12:18 +02:00
Sergey Matsievskiy	7f1f78b903	irqchip/ocelot: Comment sticky register clearing code Add comment to the sticky register clearing code. Signed-off-by: Sergey Matsievskiy <matsievskiysv@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240925184416.54204-3-matsievskiysv@gmail.com	2024-10-02 15:11:07 +02:00
Sergey Matsievskiy	9e9c4666ab	irqchip/ocelot: Fix trigger register address Controllers, supported by this driver, have two sets of registers: * (main) interrupt registers control peripheral interrupt sources. * device interrupt registers configure per-device (network interface) interrupts and act as an extra stage before the main interrupt registers. In the driver unmask code, device trigger registers are used in the mask calculation of the main interrupt sticky register, mixing two kinds of registers. Use the main interrupt trigger register instead. Signed-off-by: Sergey Matsievskiy <matsievskiysv@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240925184416.54204-2-matsievskiysv@gmail.com	2024-10-02 15:11:07 +02:00
Lukas Bulwahn	5fd7e1ee09	irqchip: Remove obsolete config ARM_GIC_V3_ITS_PCI Commit `b5712bf89b` ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]") moves the functionality of irq-gic-v3-its-pci-msi.c into irq-gic-v3-its-msi-parent.c, and drops the former file. With that, the config option ARM_GIC_V3_ITS_PCI is obsolete, but the definition of that config was not removed in the commit above. Remove this obsolete config ARM_GIC_V3_ITS_PCI. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240926125502.363364-1-lukas.bulwahn@redhat.com	2024-10-02 15:08:12 +02:00
Chen Yu	d4ac164bde	sched/eevdf: Fix wakeup-preempt by checking cfs_rq->nr_running Commit `85e511df3c` ("sched/eevdf: Allow shorter slices to wakeup-preempt") introduced a mechanism that a wakee with shorter slice could preempt the current running task. It also lower the bar for the current task to be preempted, by checking the rq->nr_running instead of cfs_rq->nr_running when the current task has ran out of time slice. But there is a scenario that is problematic. Say, if there is 1 cfs task and 1 rt task, before `85e511df3c`, update_deadline() will not trigger a reschedule, and after `85e511df3c`, since rq->nr_running is 2 and resched is true, a resched_curr() would happen. Some workloads (like the hackbench reported by lkp) do not like over-scheduling. We can see that the preemption rate has been increased by 2.2%: 1.654e+08 +2.2% 1.69e+08 hackbench.time.involuntary_context_switches Restore its previous check criterion. Fixes: `85e511df3c` ("sched/eevdf: Allow shorter slices to wakeup-preempt") Closes: https://lore.kernel.org/oe-lkp/202409231416.9403c2e9-oliver.sang@intel.com Reported-by: kernel test robot <oliver.sang@intel.com> Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Honglei Wang <jameshongleiwang@126.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20240925085440.358138-1-yu.c.chen@intel.com	2024-10-02 11:27:54 +02:00
Mike Galbraith	9b5ce1a37e	sched: Fix sched_delayed vs cfs_bandwidth Meeting an unfinished DELAY_DEQUEUE treated entity in unthrottle_cfs_rq() leads to a couple terminal scenarios. Finish it first, so ENQUEUE_WAKEUP can proceed as it would have sans DELAY_DEQUEUE treatment. Fixes: `152e11f6df` ("sched/fair: Implement delayed dequeue") Reported-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/7515d2e64c989b9e3b828a9e21bcd959b99df06a.camel@gmx.de	2024-10-02 11:27:54 +02:00
Daniel Borkmann	3ed6be6891	bpf: Sync uapi bpf.h header to tools directory There is a delta between kernel UAPI bpf.h and tools UAPI bpf.h, thus sync them again. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2024-10-01 21:44:10 +02:00
Toke Høiland-Jørgensen	09d88791c7	bpf: Make sure internal and UAPI bpf_redirect flags don't overlap The bpf_redirect_info is shared between the SKB and XDP redirect paths, and the two paths use the same numeric flag values in the ri->flags field (specifically, BPF_F_BROADCAST == BPF_F_NEXTHOP). This means that if skb bpf_redirect_neigh() is used with a non-NULL params argument and, subsequently, an XDP redirect is performed using the same bpf_redirect_info struct, the XDP path will get confused and end up crashing, which syzbot managed to trigger. With the stack-allocated bpf_redirect_info, the structure is no longer shared between the SKB and XDP paths, so the crash doesn't happen anymore. However, different code paths using identically-numbered flag values in the same struct field still seems like a bit of a mess, so this patch cleans that up by moving the flag definitions together and redefining the three flags in BPF_F_REDIRECT_INTERNAL to not overlap with the flags used for XDP. It also adds a BUILD_BUG_ON() check to make sure the overlap is not re-introduced by mistake. Fixes: `e624d4ed4a` ("xdp: Extend xdp_redirect_map with broadcast support") Reported-by: syzbot+cca39e6e84a367a7e6f6@syzkaller.appspotmail.com Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://syzkaller.appspot.com/bug?extid=cca39e6e84a367a7e6f6 Link: https://lore.kernel.org/bpf/20240920125625.59465-1-toke@redhat.com	2024-10-01 21:40:12 +02:00
Nilay Shroff	e38dad438f	nvmet-passthru: clear EUID/NGUID/UUID while using loop target When nvme passthru is configured using loop target, the clear_ids attribute is, by default, set to true. This attribute would ensure that EUID/NGUID/UUID is cleared for the loop passthru target. The newer NVMe disk supporting the NVMe spec 1.3 or higher, typically, implements the support for "Namespace Identification Descriptor list" command. This command when issued from host returns EUID/NGUID/UUID assigned to the inquired namespace. Not clearing these values, while using nvme passthru using loop target, would result in NVMe host driver rejecting the namespace. This check was implemented in the commit `2079f41ec6` ("nvme: check that EUI/GUID/UUID are globally unique"). The fix implemented in this commit ensure that when host issues ns-id descriptor list command, the EUID/NGUID/UUID are cleared by passthru target. In fact, the function nvmet_passthru_override_id_descs() which clears those unique ids already exits, so we just need to ensure that ns-id descriptor list command falls through the corretc code path. And while we're at it, we also combines the three passthru admin command cases together which shares the same code. Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-10-01 11:08:40 -07:00
Eduard Zingerman	a41b3828ec	selftests/bpf: Verify that sync_linked_regs preserves subreg_def This test was added because of a bug in verifier.c:sync_linked_regs(), upon range propagation it destroyed subreg_def marks for registers. The test is written in a way to return an upper half of a register that is affected by range propagation and must have it's subreg_def preserved. This gives a return value of 0 and leads to undefined return value if subreg_def mark is not preserved. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240924210844.1758441-2-eddyz87@gmail.com	2024-10-01 17:19:04 +02:00
Eduard Zingerman	e9bd9c498c	bpf: sync_linked_regs() must preserve subreg_def Range propagation must not affect subreg_def marks, otherwise the following example is rewritten by verifier incorrectly when BPF_F_TEST_RND_HI32 flag is set: 0: call bpf_ktime_get_ns call bpf_ktime_get_ns 1: r0 &= 0x7fffffff after verifier r0 &= 0x7fffffff 2: w1 = w0 rewrites w1 = w0 3: if w0 < 10 goto +0 --------------> r11 = 0x2f5674a6 (r) 4: r1 >>= 32 r11 <<= 32 (r) 5: r0 = r1 r1 \|= r11 (r) 6: exit; if w0 < 0xa goto pc+0 r1 >>= 32 r0 = r1 exit (or zero extension of w1 at (2) is missing for architectures that require zero extension for upper register half). The following happens w/o this patch: - r0 is marked as not a subreg at (0); - w1 is marked as subreg at (2); - w1 subreg_def is overridden at (3) by copy_register_state(); - w1 is read at (5) but mark_insn_zext() does not mark (2) for zero extension, because w1 subreg_def is not set; - because of BPF_F_TEST_RND_HI32 flag verifier inserts random value for hi32 bits of (2) (marked (r)); - this random value is read at (5). Fixes: `75748837b7` ("bpf: Propagate scalar ranges through register assignments.") Reported-by: Lonial Con <kongln9170@gmail.com> Signed-off-by: Lonial Con <kongln9170@gmail.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://lore.kernel.org/bpf/7e2aa30a62d740db182c170fdd8f81c596df280d.camel@gmail.com Link: https://lore.kernel.org/bpf/20240924210844.1758441-1-eddyz87@gmail.com	2024-10-01 17:18:52 +02:00
Ma Ke	b0f0e3f055	pinctrl: stm32: check devm_kasprintf() returned value devm_kasprintf() can return a NULL pointer on failure but this returned value is not checked. Fix this lack and check the returned value. Found by code review. Cc: stable@vger.kernel.org Fixes: `32c170ff15` ("pinctrl: stm32: set default gpio line names using pin names") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Link: https://lore.kernel.org/20240906100326.624445-1-make24@iscas.ac.cn Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2024-10-01 14:03:34 +02:00
Ma Ke	665a58fe66	pinctrl: apple: check devm_kasprintf() returned value devm_kasprintf() can return a NULL pointer on failure but this returned value is not checked. Fix this lack and check the returned value. Found by code review. Cc: stable@vger.kernel.org Fixes: `a0f160ffcb` ("pinctrl: add pinctrl/GPIO driver for Apple SoCs") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Reviewed-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/20240905020917.356534-1-make24@iscas.ac.cn Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2024-10-01 14:03:34 +02:00
Changhuang Liang	2cf5966366	reset: starfive: jh71x0: Fix accessing the empty member on JH7110 SoC data->asserted will be NULL on JH7110 SoC since commit `82327b127d` ("reset: starfive: Add StarFive JH7110 reset driver") was added. Add the judgment condition to avoid errors when calling reset_control_status on JH7110 SoC. Fixes: `82327b127d` ("reset: starfive: Add StarFive JH7110 reset driver") Signed-off-by: Changhuang Liang <changhuang.liang@starfivetech.com> Acked-by: Hal Feng <hal.feng@starfivetech.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Link: https://lore.kernel.org/r/20240925112442.1732416-1-changhuang.liang@starfivetech.com Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>	2024-09-30 14:24:37 +02:00
Yan Zhen	e7b71bf181	reset: npcm: convert comma to semicolon Replace a comma between expression statements by a semicolon. Signed-off-by: Yan Zhen <yanzhen@vivo.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Link: https://lore.kernel.org/r/20240909061258.2246292-1-yanzhen@vivo.com Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>	2024-09-30 14:17:13 +02:00
Andy Shevchenko	21dcd49fb4	Merge patch series "pinctrl: intel: platform: fix error path in device_for_each_child_node()" Javier Carrasco <javier.carrasco.cruz@gmail.com> says: This series fixes an error path where the reference of a child node is not decremented upon early return. When at it, a trivial comma/semicolon substitution I found by chance has been added to improve code clarity. Link: https://lore.kernel.org/r/20240926-intel-pinctrl-platform-scoped-v1-0-5ee4c936eea3@gmail.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>	2024-09-30 14:35:38 +03:00
Javier Carrasco	d594de8956	pinctrl: intel: platform: use semicolon instead of comma in ncommunities assignment Substitute the comma with a semicolon in the `ncommunities` assignment for better readability and consistency with common C coding style. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>	2024-09-30 14:35:23 +03:00
Javier Carrasco	16a6d2e685	pinctrl: intel: platform: fix error path in device_for_each_child_node() The device_for_each_child_node() loop requires calls to fwnode_handle_put() upon early returns to decrement the refcount of the child node and avoid leaking memory if that error path is triggered. There is one early returns within that loop in intel_platform_pinctrl_prepare_community(), but fwnode_handle_put() is missing. Instead of adding the missing call, the scoped version of the loop can be used to simplify the code and avoid mistakes in the future if new early returns are added, as the child node is only used for parsing, and it is never assigned. Cc: stable@vger.kernel.org Fixes: `c5860e4a27` ("pinctrl: intel: Add a generic Intel pin control platform driver") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>	2024-09-30 14:35:23 +03:00
Jinjie Ruan	a03c246d4e	clk: samsung: Fix out-of-bound access of of_match_node() Currently, there is no terminator entry for exynosautov920_cmu_of_match, hence facing below KASAN warning, BUG: KASAN: global-out-of-bounds in of_match_node+0x120/0x13c Read of size 1 at addr ffffffe31cc9e628 by task swapper/0/1 CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0+ #334 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x94/0xec show_stack+0x18/0x24 dump_stack_lvl+0x90/0xd0 print_report+0x1f4/0x5b4 kasan_report+0xc8/0x110 __asan_report_load1_noabort+0x20/0x2c of_match_node+0x120/0x13c of_match_device+0x70/0xb4 platform_match+0xa0/0x25c __device_attach_driver+0x7c/0x2d4 bus_for_each_drv+0x100/0x188 __device_attach+0x174/0x364 device_initial_probe+0x14/0x20 bus_probe_device+0x128/0x158 device_add+0xb3c/0x10fc of_device_add+0xdc/0x150 of_platform_device_create_pdata+0x120/0x20c of_platform_bus_create+0x2bc/0x620 of_platform_populate+0x58/0x108 of_platform_default_populate_init+0x100/0x120 do_one_initcall+0x110/0x788 kernel_init_freeable+0x44c/0x61c kernel_init+0x24/0x1e4 ret_from_fork+0x10/0x20 The buggy address belongs to the variable: exynosautov920_cmu_of_match+0xc8/0x2c80 Add a dummy terminator entry at the end to assist of_match_node() in traversing up to the terminator entry without accessing an out-of-boundary index. Fixes: `485e13fe2f` ("clk: samsung: add top clock support for ExynosAuto v920 SoC") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240927102104.3268790-1-ruanjinjie@huawei.com [krzk: drop trailing comma] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>	2024-09-30 13:10:11 +02:00
Jonathan Cameron	d6bf6983b3	iio: pressure: sdp500: Add missing select CRC8 Fix: sh4-linux-ld: drivers/iio/pressure/sdp500.o: in function `sdp500_probe': >> drivers/iio/pressure/sdp500.c:130:(.text+0xe8): undefined reference to `crc8_populate_msb' sh4-linux-ld: drivers/iio/pressure/sdp500.o: in function `sdp500_read_raw': >> drivers/iio/pressure/sdp500.c:74:(.text+0x200): undefined reference to `crc8' by adding missing select. Reviewed-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409271341.0dhpXk7G-lkp@intel.com/ Link: https://patch.msgid.link/20240929172105.1819259-1-jic23@kernel.org Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-09-30 09:55:21 +01:00
Javier Carrasco	c9e9746f27	iio: light: veml6030: fix ALS sensor resolution The driver still uses the sensor resolution provided in the datasheet until Rev. 1.6, 28-Apr-2022, which was updated with Rev 1.7, 28-Nov-2023. The original ambient light resolution has been updated from 0.0036 lx/ct to 0.0042 lx/ct, which is the value that can be found in the current device datasheet. Update the default resolution for IT = 100 ms and GAIN = 1/8 from the original 4608 mlux/cnt to the current value from the "Resolution and maximum detection range" table (Application Note 84367, page 5), 5376 mlux/cnt. Cc: <stable@vger.kernel.org> Fixes: `7b779f573c` ("iio: light: add driver for veml6030 ambient light sensor") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20240923-veml6035-v2-1-58c72a0df31c@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-09-30 09:20:54 +01:00
Dan Carpenter	50161b2768	iio: bmi323: fix reversed if statement in bmi323_core_runtime_resume() This reversed if statement means that the function just returns success without writing to the registers. Fixes: `16531118ba` ("iio: bmi323: peripheral in lowest power state on suspend") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/689a2122-6e2f-4b0c-9a1c-39a98621c6c1@stanley.mountain Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-09-30 09:20:54 +01:00
Dan Carpenter	506a1ac4c4	iio: bmi323: fix copy and paste bugs in suspend resume This code is using bmi323_reg_savestate[] and ->reg_settings[] instead of bmi323_ext_reg_savestate[] and ->ext_reg_settings[]. This was discovered by Smatch: drivers/iio/imu/bmi323/bmi323_core.c:2202 bmi323_core_runtime_suspend() error: buffer overflow 'bmi323_reg_savestate' 9 <= 11 Fixes: `16531118ba` ("iio: bmi323: peripheral in lowest power state on suspend") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/7175b8ec-85cf-4fbf-a4e1-c4c43c3b665c@stanley.mountain Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-09-30 09:20:54 +01:00
Nathan Chancellor	cd8247cd41	iio: bmi323: Drop CONFIG_PM guards around runtime functions When building with clang and CONFIG_PM disabled (such as with s390), it warns: drivers/iio/imu/bmi323/bmi323_core.c:121:27: warning: variable 'bmi323_reg_savestate' is not needed and will not be emitted [-Wunneeded-internal-declaration] 121 \| static const unsigned int bmi323_reg_savestate[] = { \| ^~~~~~~~~~~~~~~~~~~~ drivers/iio/imu/bmi323/bmi323_core.c:133:27: warning: variable 'bmi323_ext_reg_savestate' is not needed and will not be emitted [-Wunneeded-internal-declaration] 133 \| static const unsigned int bmi323_ext_reg_savestate[] = { \| ^~~~~~~~~~~~~~~~~~~~~~~~ These arrays have no references outside of sizeof(), which will be evaluated at compile time. To avoid these warnings, remove the CONFIG_PM ifdef guard and use the RUNTIME_PM_OPS macro to ensure these functions always appear used to the compiler, which allows the references to the arrays to be visible as well. This results in no difference in runtime behavior because bmi323_core_pm_ops is only used when CONFIG_PM is set with the pm_ptr() macro. Fixes: `b09999ee1e` ("iio: bmi323: suspend and resume triggering on relevant pm operations") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20240910-iio-bmi323-remove-config_pm-guards-v1-1-0552249207af@kernel.org Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-09-30 09:20:54 +01:00
Rob Herring (Arm)	9de32f48c5	dt-bindings: iio: dac: adi,ad56xx: Fix duplicate compatible strings adi,ad5686.yaml and adi,ad5696.yaml duplicate all the I2C device compatible strings with the exception of "adi,ad5337r". Since adi,ad5686.yaml references spi-peripheral-props.yaml, drop the I2C devices from it making it only SPI devices. Update the titles to make the distinction clear. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20240910234440.1045098-1-robh@kernel.org Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-09-30 09:20:53 +01:00
Emil Gedenryd	530688e39c	iio: light: opt3001: add missing full-scale range value The opt3001 driver uses predetermined full-scale range values to determine what exponent to use for event trigger threshold values. The problem is that one of the values specified in the datasheet is missing from the implementation. This causes larger values to be scaled down to an incorrect exponent, effectively reducing the maximum settable threshold value by a factor of 2. Add missing full-scale range array value. Fixes: `94a9b7b180` ("iio: light: add support for TI's opt3001 light sensor") Signed-off-by: Emil Gedenryd <emil.gedenryd@axis.com> Cc: <Stable@vger.kernel.org> Link: https://patch.msgid.link/20240913-add_opt3002-v2-1-69e04f840360@axis.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-09-30 09:20:53 +01:00
Javier Carrasco	c7c44e5775	iio: light: veml6030: fix IIO device retrieval from embedded device The dev pointer that is received as an argument in the in_illuminance_period_available_show function references the device embedded in the IIO device, not in the i2c client. dev_to_iio_dev() must be used to accessthe right data. The current implementation leads to a segmentation fault on every attempt to read the attribute because indio_dev gets a NULL assignment. This bug has been present since the first appearance of the driver, apparently since the last version (V6) before getting applied. A constant attribute was used until then, and the last modifications might have not been tested again. Cc: stable@vger.kernel.org Fixes: `7b779f573c` ("iio: light: add driver for veml6030 ambient light sensor") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20240913-veml6035-v1-3-0b09c0c90418@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-09-30 09:20:53 +01:00
Mikhail Lobanov	db9795a43d	iio: accel: bma400: Fix uninitialized variable field_value in tap event handling. In the current implementation, the local variable field_value is used without prior initialization, which may lead to reading uninitialized memory. Specifically, in the macro set_mask_bits, the initial (potentially uninitialized) value of the buffer is copied into old__, and a mask is applied to calculate new__. A similar issue was resolved in commit `6ee2a7058f` ("iio: accel: bma400: Fix smatch warning based on use of unintialized value."). Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `961db2da15` ("iio: accel: bma400: Add support for single and double tap events") Signed-off-by: Mikhail Lobanov <m.lobanov@rosalinux.ru> Link: https://patch.msgid.link/20240910083624.27224-1-m.lobanov@rosalinux.ru Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-09-30 09:20:53 +01:00
Keith Busch	8be007c8e0	block: fix blk_rq_map_integrity_sg kernel-doc Fix the documentation to match the new function signature. Fixes: `76c313f658` ("blk-integrity: improved sg segment mapping") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240922141800.3622319-1-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-09-27 04:19:53 -06:00
Wander Lairson Costa	8b62645b09	bpf: Use raw_spinlock_t in ringbuf The function __bpf_ringbuf_reserve is invoked from a tracepoint, which disables preemption. Using spinlock_t in this context can lead to a "sleep in atomic" warning in the RT variant. This issue is illustrated in the example below: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 556208, name: test_progs preempt_count: 1, expected: 0 RCU nest depth: 1, expected: 1 INFO: lockdep is turned off. Preemption disabled at: [<ffffd33a5c88ea44>] migrate_enable+0xc0/0x39c CPU: 7 PID: 556208 Comm: test_progs Tainted: G Hardware name: Qualcomm SA8775P Ride (DT) Call trace: dump_backtrace+0xac/0x130 show_stack+0x1c/0x30 dump_stack_lvl+0xac/0xe8 dump_stack+0x18/0x30 __might_resched+0x3bc/0x4fc rt_spin_lock+0x8c/0x1a4 __bpf_ringbuf_reserve+0xc4/0x254 bpf_ringbuf_reserve_dynptr+0x5c/0xdc bpf_prog_ac3d15160d62622a_test_read_write+0x104/0x238 trace_call_bpf+0x238/0x774 perf_call_bpf_enter.isra.0+0x104/0x194 perf_syscall_enter+0x2f8/0x510 trace_sys_enter+0x39c/0x564 syscall_trace_enter+0x220/0x3c0 do_el0_svc+0x138/0x1dc el0_svc+0x54/0x130 el0t_64_sync_handler+0x134/0x150 el0t_64_sync+0x17c/0x180 Switch the spinlock to raw_spinlock_t to avoid this error. Fixes: `457f44363a` ("bpf: Implement BPF ring buffer and verifier support for it") Reported-by: Brian Grech <bgrech@redhat.com> Signed-off-by: Wander Lairson Costa <wander.lairson@gmail.com> Signed-off-by: Wander Lairson Costa <wander@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20240920190700.617253-1-wander@redhat.com	2024-09-25 11:55:55 +02:00
Florian Westphal	645546a05b	xfrm: policy: remove last remnants of pernet inexact list xfrm_net still contained the no-longer-used inexact policy list heads, remove them. Fixes: `a54ad727f7` ("xfrm: policy: remove remaining use of inexact list") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2024-09-24 09:58:16 +02:00
Eyal Birger	b846972103	xfrm: respect ip protocols rules criteria when performing dst lookups The series in the "fixes" tag added the ability to consider L4 attributes in routing rules. The dst lookup on the outer packet of encapsulated traffic in the xfrm code was not adapted to this change, thus routing behavior that relies on L4 information is not respected. Pass the ip protocol information when performing dst lookups. Fixes: `a25724b05a` ("Merge branch 'fib_rules-support-sport-dport-and-proto-match'") Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Tested-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2024-09-23 07:02:07 +02:00
Eyal Birger	e509996b16	xfrm: extract dst lookup parameters into a struct Preparation for adding more fields to dst lookup functions without changing their signatures. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2024-09-23 07:02:07 +02:00
Pedro Falcato	79efebae4a	9p: Avoid creating multiple slab caches with the same name In the spirit of [1], avoid creating multiple slab caches with the same name. Instead, add the dev_name into the mix. [1]: https://lore.kernel.org/all/20240807090746.2146479-1-pedro.falcato@gmail.com/ Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com> Reported-by: syzbot+3c5d43e97993e1fa612b@syzkaller.appspotmail.com Message-ID: <20240807094725.2193423-1-pedro.falcato@gmail.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-09-23 05:51:27 +09:00
David Howells	1325e4a91a	9p: Enable multipage folios Enable support for multipage folios on the 9P filesystem. This is all handled through netfslib and is already enabled on AFS and CIFS also. Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Jeff Layton <jlayton@kernel.org> cc: Matthew Wilcox <willy@infradead.org> cc: v9fs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Message-ID: <20240620173137.610345-7-dhowells@redhat.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-09-23 05:51:27 +09:00
Dominique Martinet	38d222b316	9p: v9fs_fid_find: also lookup by inode if not found dentry It's possible for v9fs_fid_find "find by dentry" branch to not turn up anything despite having an entry set (because e.g. uid doesn't match), in which case the calling code will generally make an extra lookup to the server. In this case we might have had better luck looking by inode, so fall back to look up by inode if we have one and the lookup by dentry failed. Message-Id: <20240523210024.1214386-1-asmadeus@codewreck.org> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-09-23 05:51:27 +09:00
Ben Hutchings	d4cdc46ca1	wifi: iwlegacy: Fix "field-spanning write" warning in il_enqueue_hcmd() iwlegacy uses command buffers with a payload size of 320 bytes (default) or 4092 bytes (huge). The struct il_device_cmd type describes the default buffers and there is no separate type describing the huge buffers. The il_enqueue_hcmd() function works with both default and huge buffers, and has a memcpy() to the buffer payload. The size of this copy may exceed 320 bytes when using a huge buffer, which now results in a run-time warning: memcpy: detected field-spanning write (size 1014) of single field "&out_cmd->cmd.payload" at drivers/net/wireless/intel/iwlegacy/common.c:3170 (size 320) To fix this: - Define a new struct type for huge buffers, with a correctly sized payload field - When using a huge buffer in il_enqueue_hcmd(), cast the command buffer pointer to that type when looking up the payload field Reported-by: Martin-Éric Racine <martin-eric.racine@iki.fi> References: https://bugs.debian.org/1062421 References: https://bugzilla.kernel.org/show_bug.cgi?id=219124 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Fixes: `54d9469bc5` ("fortify: Add run-time WARN for cross-field memcpy()") Tested-by: Martin-Éric Racine <martin-eric.racine@iki.fi> Tested-by: Brandon Nielsen <nielsenb@jetfuse.net> Acked-by: Stanislaw Gruszka <stf_xl@wp.pl> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/ZuIhQRi/791vlUhE@decadent.org.uk	2024-09-19 11:45:46 +03:00
Felix Fietkau	34b6954810	wifi: mt76: do not increase mcu skb refcount if retry is not supported If mcu_skb_prepare_msg is not implemented, incrementing skb refcount does not work for mcu message retry. In some cases (e.g. on SDIO), shared skbs can trigger a BUG_ON, crashing the system. Fix this by only incrementing refcount if retry is actually supported. Fixes: `3688c18b65` ("wifi: mt76: mt7915: retry mcu messages") Closes: https://lore.kernel.org/r/d907b13a-f8be-4cb8-a0bb-560a21278041@notapiano/ Reported-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> #KernelCI Tested-by: Alper Nebi Yasak <alpernebiyasak@gmail.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20240917110942.22077-1-nbd@nbd.name	2024-09-18 16:34:00 +03:00
Ping-Ke Shih	5575058ba9	wifi: rtw89: coex: add debug message of link counts on 2/5GHz bands for wl_info v7 The counts will be used by MLO, and it is ongoing to add the code, so add debug message in todo part to avoid warnings reported by clang: coex.c:6323:23: warning: variable 'cnt_2g' set but not used [-Wunused-but-set-variable] 6323 \| u8 i, mode, cnt = 0, cnt_2g = 0, cnt_5g = 0, ... \| ^ coex.c:6323:35: warning: variable 'cnt_5g' set but not used [-Wunused-but-set-variable] 6323 \| u8 i, mode, cnt = 0, cnt_2g = 0, cnt_5g = 0, ... \| ^ Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20240912021626.10494-1-pkshih@realtek.com	2024-09-18 16:33:17 +03:00
Shaoqin Huang	dc9b5d7e0b	KVM: selftests: aarch64: Add writable test for ID_AA64PFR1_EL1 Add writable test for the ID_AA64PFR1_EL1 register. Signed-off-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20240723072004.1470688-5-shahuang@redhat.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-08-25 17:48:44 +01:00
Shaoqin Huang	78c4446b5f	KVM: arm64: Allow userspace to change ID_AA64PFR1_EL1 Allow userspace to change the guest-visible value of the register with different way of handling: - Since the RAS and MPAM is not writable in the ID_AA64PFR0_EL1 register, RAS_frac and MPAM_frac are also not writable in the ID_AA64PFR1_EL1 register. - The MTE is controlled by a separate UAPI (KVM_CAP_ARM_MTE) with an internal flag (KVM_ARCH_FLAG_MTE_ENABLED). So it's not writable. - For those fields which KVM doesn't know how to handle, they are not exposed to the guest (being disabled in the register read accessor), those fields value will always be 0. Those fields don't have a known behavior now, so don't advertise them to the userspace. Thus still not writable. Those fields include SME, RNDR_trap, NMI, GCS, THE, DF2, PFAR, MTE_frac, MTEX. - The BT, SSBS, CSV2_frac don't introduce any new registers which KVM doesn't know how to handle, they can be written without ill effect. So let them writable. Besides, we don't do the crosscheck in KVM about the CSV2_frac even if it depends on the value of CSV2, it should be made sure by the VMM instead of KVM. Signed-off-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20240723072004.1470688-4-shahuang@redhat.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-08-25 17:48:44 +01:00
Shaoqin Huang	e8d164974c	KVM: arm64: Use kvm_has_feat() to check if FEAT_SSBS is advertised to the guest Currently KVM use cpus_have_final_cap() to check if FEAT_SSBS is advertised to the guest. But if FEAT_SSBS is writable and isn't advertised to the guest, this is wrong. Update it to use kvm_has_feat() to check if FEAT_SSBS is advertised to the guest, thus the KVM can do the right thing if FEAT_SSBS isn't advertised to the guest. Signed-off-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20240723072004.1470688-3-shahuang@redhat.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-08-25 17:48:44 +01:00
Shaoqin Huang	ffe68b2d19	KVM: arm64: Disable fields that KVM doesn't know how to handle in ID_AA64PFR1_EL1 For some of the fields in the ID_AA64PFR1_EL1 register, KVM doesn't know how to handle them right now. So explicitly disable them in the register accessor, then those fields value will be masked to 0 even if on the hardware the field value is 1. This is safe because from a UAPI point of view that read_sanitised_ftr_reg() doesn't yet return a nonzero value for any of those fields. This will benifit the migration if the host and VM have different values when restoring a VM. Those fields include RNDR_trap, NMI, MTE_frac, GCS, THE, MTEX, DF2, PFAR. Signed-off-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20240723072004.1470688-2-shahuang@redhat.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-08-25 17:48:43 +01:00
Shameer Kolothum	980c41f554	KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace KVM exposes the OS double lock feature bit to Guests but returns RAZ/WI on Guest OSDLR_EL1 access. This breaks Guest migration between systems where this feature differ. Add support to make this feature writable from userspace by setting the mask bit. While at it, set the mask bits for the exposed WRPs(Number of Watchpoints) as well. Also update the selftest to cover these fields. However we still can't make BRPs and CTX_CMPs fields writable, because as per ARM ARM DDI 0487K.a, section D2.8.3 Breakpoint types and linking of breakpoints, highest numbered breakpoints(BRPs) must be context aware breakpoints(CTX_CMPs). KVM does not trap + emulate the breakpoint registers, and as such cannot support a layout that misaligns with the underlying hardware. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Link: https://lore.kernel.org/r/20240816132819.34316-1-shameerali.kolothum.thodi@huawei.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-08-22 18:05:37 +01:00

1060 changed files with 12411 additions and 7637 deletions

12

.mailmap

View File

@@ -73,6 +73,8 @@ Andrey Ryabinin <ryabinin.a.a@gmail.com> <aryabinin@virtuozzo.com>
 Andrzej Hajda <andrzej.hajda@intel.com> <a.hajda@samsung.com>
 André Almeida <andrealmeid@igalia.com> <andrealmeid@collabora.com>
 Andy Adamson <andros@citi.umich.edu>
 Andy Chiu <andybnac@gmail.com> <andy.chiu@sifive.com>
 Andy Chiu <andybnac@gmail.com> <taochiu@synology.com>
 Andy Shevchenko <andy@kernel.org> <andy@smile.org.ua>
 Andy Shevchenko <andy@kernel.org> <ext-andriy.shevchenko@nokia.com>
 Anilkumar Kolli <quic_akolli@quicinc.com> <akolli@codeaurora.org>
@@ -197,7 +199,8 @@ Elliot Berman <quic_eberman@quicinc.com> <eberman@codeaurora.org>
 Enric Balletbo i Serra <eballetbo@kernel.org> <enric.balletbo@collabora.com>
 Enric Balletbo i Serra <eballetbo@kernel.org> <eballetbo@iseebcn.com>
 Erik Kaneda <erik.kaneda@intel.com> <erik.schmauss@intel.com>
 Eugen Hristev <eugen.hristev@collabora.com> <eugen.hristev@microchip.com>
 Eugen Hristev <eugen.hristev@linaro.org> <eugen.hristev@microchip.com>
 Eugen Hristev <eugen.hristev@linaro.org> <eugen.hristev@collabora.com>
 Evgeniy Polyakov <johnpol@2ka.mipt.ru>
 Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> <ezequiel@collabora.com>
 Faith Ekstrand <faith.ekstrand@collabora.com> <jason@jlekstrand.net>
@@ -280,7 +283,7 @@ Jan Glauber <jan.glauber@gmail.com> <jglauber@cavium.com>
 Jan Kuliga <jtkuliga.kdev@gmail.com> <jankul@alatek.krakow.pl>
 Jarkko Sakkinen <jarkko@kernel.org> <jarkko.sakkinen@linux.intel.com>
 Jarkko Sakkinen <jarkko@kernel.org> <jarkko@profian.com>
 Jarkko Sakkinen <jarkko@kernel.org> <jarkko.sakkinen@tuni.fi>
 Jarkko Sakkinen <jarkko@kernel.org> <jarkko.sakkinen@parity.io>
 Jason Gunthorpe <jgg@ziepe.ca> <jgg@mellanox.com>
 Jason Gunthorpe <jgg@ziepe.ca> <jgg@nvidia.com>
 Jason Gunthorpe <jgg@ziepe.ca> <jgunthorpe@obsidianresearch.com>
@@ -304,6 +307,11 @@ Jens Axboe <axboe@kernel.dk> <axboe@fb.com>
 Jens Axboe <axboe@kernel.dk> <axboe@meta.com>
 Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
 Jernej Skrabec <jernej.skrabec@gmail.com> <jernej.skrabec@siol.net>
 Jesper Dangaard Brouer <hawk@kernel.org> <brouer@redhat.com>
 Jesper Dangaard Brouer <hawk@kernel.org> <hawk@comx.dk>
 Jesper Dangaard Brouer <hawk@kernel.org> <jbrouer@redhat.com>
 Jesper Dangaard Brouer <hawk@kernel.org> <jdb@comx.dk>
 Jesper Dangaard Brouer <hawk@kernel.org> <netoptimizer@brouer.com>
 Jessica Zhang <quic_jesszhan@quicinc.com> <jesszhan@codeaurora.org>
 Jilai Wang <quic_jilaiw@quicinc.com> <jilaiw@codeaurora.org>
 Jiri Kosina <jikos@kernel.org> <jikos@jikos.cz>

									
										7

Documentation/admin-guide/LSM/ipe.rst
									
												View File
												
				@@ -223,7 +223,10 @@ are signed through the PKCS#7 message format to enforce some level of

				authorization of the policies (prohibiting an attacker from gaining

				unconstrained root, and deploying an "allow all" policy). These

				policies must be signed by a certificate that chains to the

				``SYSTEM_TRUSTED_KEYRING``. With openssl, the policy can be signed by::

				``SYSTEM_TRUSTED_KEYRING``, or to the secondary and/or platform keyrings if

				``CONFIG_IPE_POLICY_SIG_SECONDARY_KEYRING`` and/or

				``CONFIG_IPE_POLICY_SIG_PLATFORM_KEYRING`` are enabled, respectively.

				With openssl, the policy can be signed by::

				   openssl smime -sign \

				      -in "$MY_POLICY" \

				@@ -266,7 +269,7 @@ in the kernel. This file is write-only and accepts a PKCS#7 signed

				policy. Two checks will always be performed on this policy: First, the

				``policy_names`` must match with the updated version and the existing

				version. Second the updated policy must have a policy version greater than

				or equal to the currently-running version. This is to prevent rollback attacks.

				the currently-running version. This is to prevent rollback attacks.

				The ``delete`` file is used to remove a policy that is no longer needed.

				This file is write-only and accepts a value of ``1`` to delete the policy.

									
										20

Documentation/admin-guide/pm/cpufreq.rst
									
												View File
												
				@@ -425,8 +425,8 @@ This governor exposes only one tunable:

				``rate_limit_us``

					Minimum time (in microseconds) that has to pass between two consecutive

					runs of governor computations (default: 1000 times the scaling driver's

					transition latency).

					runs of governor computations (default: 1.5 times the scaling driver's

					transition latency or the maximum 2ms).

					The purpose of this tunable is to reduce the scheduler context overhead

					of the governor which might be excessive without it.

				@@ -474,17 +474,17 @@ This governor exposes the following tunables:

					This is how often the governor's worker routine should run, in

					microseconds.

					Typically, it is set to values of the order of 10000 (10 ms).  Its

					default value is equal to the value of ``cpuinfo_transition_latency``

					for each policy this governor is attached to (but since the unit here

					is greater by 1000, this means that the time represented by

					``sampling_rate`` is 1000 times greater than the transition latency by

					default).

					Typically, it is set to values of the order of 2000 (2 ms).  Its

					default value is to add a 50% breathing room

					to ``cpuinfo_transition_latency`` on each policy this governor is

					attached to. The minimum is typically the length of two scheduler

					ticks.

					If this tunable is per-policy, the following shell command sets the time

					represented by it to be 750 times as high as the transition latency::

					represented by it to be 1.5 times as high as the transition latency

					(the default)::

					# echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate

					# echo `$(($(cat cpuinfo_transition_latency) * 3 / 2)) > ondemand/sampling_rate

				``up_threshold``

					If the estimated CPU load is above this value (in percent), the governor

									
										38

Documentation/core-api/protection-keys.rst
									
												View File
												
				@@ -12,7 +12,10 @@ Pkeys Userspace (PKU) is a feature which can be found on:

				        * Intel server CPUs, Skylake and later

				        * Intel client CPUs, Tiger Lake (11th Gen Core) and later

				        * Future AMD CPUs

				        * arm64 CPUs implementing the Permission Overlay Extension (FEAT_S1POE)

				x86_64

				======

				Pkeys work by dedicating 4 previously Reserved bits in each page table entry to

				a "protection key", giving 16 possible keys.

				@@ -28,6 +31,22 @@ register.  The feature is only available in 64-bit mode, even though there is

				theoretically space in the PAE PTEs.  These permissions are enforced on data

				access only and have no effect on instruction fetches.

				arm64

				=====

				Pkeys use 3 bits in each page table entry, to encode a "protection key index",

				giving 8 possible keys.

				Protections for each key are defined with a per-CPU user-writable system

				register (POR_EL0).  This is a 64-bit register encoding read, write and execute

				overlay permissions for each protection key index.

				Being a CPU register, POR_EL0 is inherently thread-local, potentially giving

				each thread a different set of protections from every other thread.

				Unlike x86_64, the protection key permissions also apply to instruction

				fetches.

				Syscalls

				========

				@@ -38,11 +57,10 @@ There are 3 system calls which directly interact with pkeys::

					int pkey_mprotect(unsigned long start, size_t len,

							  unsigned long prot, int pkey);

				Before a pkey can be used, it must first be allocated with

				pkey_alloc().  An application calls the WRPKRU instruction

				directly in order to change access permissions to memory covered

				with a key.  In this example WRPKRU is wrapped by a C function

				called pkey_set().

				Before a pkey can be used, it must first be allocated with pkey_alloc().  An

				application writes to the architecture specific CPU register directly in order

				to change access permissions to memory covered with a key.  In this example

				this is wrapped by a C function called pkey_set().

				::

					int real_prot = PROT_READ|PROT_WRITE;

				@@ -64,9 +82,9 @@ is no longer in use::

					munmap(ptr, PAGE_SIZE);

					pkey_free(pkey);

				.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.

				          An example implementation can be found in

				          tools/testing/selftests/x86/protection_keys.c.

				.. note:: pkey_set() is a wrapper around writing to the CPU register.

				          Example implementations can be found in

				          tools/testing/selftests/mm/pkey-{arm64,powerpc,x86}.h

				Behavior

				========

				@@ -96,3 +114,7 @@ with a read()::

				The kernel will send a SIGSEGV in both cases, but si_code will be set

				to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when

				the plain mprotect() permissions are violated.

				Note that kernel accesses from a kthread (such as io_uring) will use a default

				value for the protection key register and so will not be consistent with

				userspace's value of the register or mprotect().

									
										24

Documentation/devicetree/bindings/display/mediatek/mediatek,dpi.yaml
									
												View File
												
				@@ -63,6 +63,16 @@ properties:

				      - const: sleep

				  power-domains:

				    description: |

				      The MediaTek DPI module is typically associated with one of the

				      following multimedia power domains:

				        POWER_DOMAIN_DISPLAY

				        POWER_DOMAIN_VDOSYS

				        POWER_DOMAIN_MM

				      The specific power domain used varies depending on the SoC design.

				      It is recommended to explicitly add the appropriate power domain

				      property to the DPI node in the device tree.

				    maxItems: 1

				  port:

				@@ -79,20 +89,6 @@ required:

				  - clock-names

				  - port

				allOf:

				  - if:

				      not:

				        properties:

				          compatible:

				            contains:

				              enum:

				                - mediatek,mt6795-dpi

				                - mediatek,mt8173-dpi

				                - mediatek,mt8186-dpi

				    then:

				      properties:

				        power-domains: false

				additionalProperties: false

				examples:

									
										19

Documentation/devicetree/bindings/display/mediatek/mediatek,split.yaml
									
												View File
												
				@@ -38,6 +38,7 @@ properties:

				    description: A phandle and PM domain specifier as defined by bindings of

				      the power controller specified by phandle. See

				      Documentation/devicetree/bindings/power/power-domain.yaml for details.

				    maxItems: 1

				  mediatek,gce-client-reg:

				    description:

				@@ -57,6 +58,9 @@ properties:

				  clocks:

				    items:

				      - description: SPLIT Clock

				      - description: Used for interfacing with the HDMI RX signal source.

				      - description: Paired with receiving HDMI RX metadata.

				    minItems: 1

				required:

				  - compatible

				@@ -72,9 +76,24 @@ allOf:

				            const: mediatek,mt8195-mdp3-split

				    then:

				      properties:

				        clocks:

				          minItems: 3

				      required:

				        - mediatek,gce-client-reg

				  - if:

				      properties:

				        compatible:

				          contains:

				            const: mediatek,mt8173-disp-split

				    then:

				      properties:

				        clocks:

				          maxItems: 1

				additionalProperties: false

				examples:

									
										21

Documentation/devicetree/bindings/iio/adc/adi,ad7380.yaml
									
												View File
												
				@@ -67,6 +67,10 @@ properties:

				      A 2.5V to 3.3V supply for the external reference voltage. When omitted,

				      the internal 2.5V reference is used.

				  refin-supply:

				    description:

				      A 2.5V to 3.3V supply for external reference voltage, for ad7380-4 only.

				  aina-supply:

				    description:

				      The common mode voltage supply for the AINA- pin on pseudo-differential

				@@ -135,6 +139,23 @@ allOf:

				        ainc-supply: false

				        aind-supply: false

				  # ad7380-4 uses refin-supply as external reference.

				  # All other chips from ad738x family use refio as optional external reference.

				  # When refio-supply is omitted, internal reference is used.

				  - if:

				      properties:

				        compatible:

				          enum:

				            - adi,ad7380-4

				    then:

				      properties:

				        refio-supply: false

				      required:

				        - refin-supply

				    else:

				      properties:

				        refin-supply: false

				examples:

				  - |

				    #include <dt-bindings/interrupt-controller/irq.h>

									
										53

Documentation/devicetree/bindings/iio/dac/adi,ad5686.yaml
									
												View File
												
				@@ -4,7 +4,7 @@

				$id: http://devicetree.org/schemas/iio/dac/adi,ad5686.yaml#

				$schema: http://devicetree.org/meta-schemas/core.yaml#

				title: Analog Devices AD5360 and similar DACs

				title: Analog Devices AD5360 and similar SPI DACs

				maintainers:

				  - Michael Hennerich <michael.hennerich@analog.com>

				@@ -12,41 +12,22 @@ maintainers:

				properties:

				  compatible:

				    oneOf:

				      - description: SPI devices

				        enum:

				          - adi,ad5310r

				          - adi,ad5672r

				          - adi,ad5674r

				          - adi,ad5676

				          - adi,ad5676r

				          - adi,ad5679r

				          - adi,ad5681r

				          - adi,ad5682r

				          - adi,ad5683

				          - adi,ad5683r

				          - adi,ad5684

				          - adi,ad5684r

				          - adi,ad5685r

				          - adi,ad5686

				          - adi,ad5686r

				      - description: I2C devices

				        enum:

				          - adi,ad5311r

				          - adi,ad5337r

				          - adi,ad5338r

				          - adi,ad5671r

				          - adi,ad5675r

				          - adi,ad5691r

				          - adi,ad5692r

				          - adi,ad5693

				          - adi,ad5693r

				          - adi,ad5694

				          - adi,ad5694r

				          - adi,ad5695r

				          - adi,ad5696

				          - adi,ad5696r

				    enum:

				      - adi,ad5310r

				      - adi,ad5672r

				      - adi,ad5674r

				      - adi,ad5676

				      - adi,ad5676r

				      - adi,ad5679r

				      - adi,ad5681r

				      - adi,ad5682r

				      - adi,ad5683

				      - adi,ad5683r

				      - adi,ad5684

				      - adi,ad5684r

				      - adi,ad5685r

				      - adi,ad5686

				      - adi,ad5686r

				  reg:

				    maxItems: 1

									
										3

Documentation/devicetree/bindings/iio/dac/adi,ad5696.yaml
									
												View File
												
				@@ -4,7 +4,7 @@

				$id: http://devicetree.org/schemas/iio/dac/adi,ad5696.yaml#

				$schema: http://devicetree.org/meta-schemas/core.yaml#

				title: Analog Devices AD5696 and similar multi-channel DACs

				title: Analog Devices AD5696 and similar I2C multi-channel DACs

				maintainers:

				  - Michael Auchter <michael.auchter@ni.com>

				@@ -16,6 +16,7 @@ properties:

				  compatible:

				    enum:

				      - adi,ad5311r

				      - adi,ad5337r

				      - adi,ad5338r

				      - adi,ad5671r

				      - adi,ad5675r

									
										1

Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml
									
												View File
												
				@@ -26,6 +26,7 @@ properties:

				      - brcm,asp-v2.1-mdio

				      - brcm,asp-v2.2-mdio

				      - brcm,unimac-mdio

				      - brcm,bcm6846-mdio

				  reg:

				    minItems: 1

									
										5

Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml
									
												View File
												
				@@ -154,8 +154,6 @@ allOf:

				              - qcom,sm8550-qmp-gen4x2-pcie-phy

				              - qcom,sm8650-qmp-gen3x2-pcie-phy

				              - qcom,sm8650-qmp-gen4x2-pcie-phy

				              - qcom,x1e80100-qmp-gen3x2-pcie-phy

				              - qcom,x1e80100-qmp-gen4x2-pcie-phy

				    then:

				      properties:

				        clocks:

				@@ -171,6 +169,8 @@ allOf:

				              - qcom,sc8280xp-qmp-gen3x1-pcie-phy

				              - qcom,sc8280xp-qmp-gen3x2-pcie-phy

				              - qcom,sc8280xp-qmp-gen3x4-pcie-phy

				              - qcom,x1e80100-qmp-gen3x2-pcie-phy

				              - qcom,x1e80100-qmp-gen4x2-pcie-phy

				              - qcom,x1e80100-qmp-gen4x4-pcie-phy

				    then:

				      properties:

				@@ -201,6 +201,7 @@ allOf:

				              - qcom,sm8550-qmp-gen4x2-pcie-phy

				              - qcom,sm8650-qmp-gen4x2-pcie-phy

				              - qcom,x1e80100-qmp-gen4x2-pcie-phy

				              - qcom,x1e80100-qmp-gen4x4-pcie-phy

				    then:

				      properties:

				        resets:

									
										18

Documentation/devicetree/bindings/sound/davinci-mcasp-audio.yaml
									
												View File
												
				@@ -102,21 +102,21 @@ properties:

				    default: 2

				  interrupts:

				    oneOf:

				      - minItems: 1

				        items:

				          - description: TX interrupt

				          - description: RX interrupt

				      - items:

				          - description: common/combined interrupt

				    minItems: 1

				    maxItems: 2

				  interrupt-names:

				    oneOf:

				      - minItems: 1

				      - description: TX interrupt

				        const: tx

				      - description: RX interrupt

				        const: rx

				      - description: TX and RX interrupts

				        items:

				          - const: tx

				          - const: rx

				      - const: common

				      - description: Common/combined interrupt

				        const: common

				  fck_parent:

				    $ref: /schemas/types.yaml#/definitions/string

									
										4

Documentation/devicetree/bindings/sound/rockchip,rk3308-codec.yaml
									
												View File
												
				@@ -48,6 +48,10 @@ properties:

				      - const: mclk_rx

				      - const: hclk

				  port:

				    $ref: audio-graph-port.yaml#

				    unevaluatedProperties: false

				  resets:

				    maxItems: 1

									
										2

Documentation/filesystems/caching/cachefiles.rst
									
												View File
												
				@@ -115,7 +115,7 @@ set up cache ready for use.  The following script commands are available:

					This mask can also be set through sysfs, eg::

						echo 5 >/sys/modules/cachefiles/parameters/debug

						echo 5 > /sys/module/cachefiles/parameters/debug

				Starting the Cache

									
										2

Documentation/filesystems/iomap/operations.rst
									
												View File
												
				@@ -208,7 +208,7 @@ The filesystem must arrange to `cancel

				such `reservations

				<https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/>`_

				because writeback will not consume the reservation.

				The ``iomap_file_buffered_write_punch_delalloc`` can be called from a

				The ``iomap_write_delalloc_release`` can be called from a

				``->iomap_end`` function to find all the clean areas of the folios

				caching a fresh (``IOMAP_F_NEW``) delalloc mapping.

				It takes the ``invalidate_lock``.

									
										1

Documentation/filesystems/netfs_library.rst
									
												View File
												
				@@ -592,4 +592,3 @@ API Function Reference

				.. kernel-doc:: include/linux/netfs.h

				.. kernel-doc:: fs/netfs/buffered_read.c

				.. kernel-doc:: fs/netfs/io.c

									
										13

Documentation/iio/ad7380.rst
									
												View File
												
				@@ -41,13 +41,22 @@ supports only 1 SDO line.

				Reference voltage

				-----------------

				2 possible reference voltage sources are supported:

				ad7380-4

				~~~~~~~~

				ad7380-4 supports only an external reference voltage (2.5V to 3.3V). It must be

				declared in the device tree as ``refin-supply``.

				All other devices from ad738x family

				~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

				All other devices from ad738x support 2 possible reference voltage sources:

				- Internal reference (2.5V)

				- External reference (2.5V to 3.3V)

				The source is determined by the device tree. If ``refio-supply`` is present,

				then the external reference is used, else the internal reference is used.

				then it is used as external reference, else the internal reference is used.

				Oversampling and resolution boost

				---------------------------------

									
										38

Documentation/mm/damon/maintainer-profile.rst
									
												View File
												
				@@ -7,26 +7,26 @@ The DAMON subsystem covers the files that are listed in 'DATA ACCESS MONITOR'

				section of 'MAINTAINERS' file.

				The mailing lists for the subsystem are damon@lists.linux.dev and

				linux-mm@kvack.org.  Patches should be made against the mm-unstable `tree

				<https://git.kernel.org/akpm/mm/h/mm-unstable>` whenever possible and posted to

				the mailing lists.

				linux-mm@kvack.org.  Patches should be made against the `mm-unstable tree

				<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ whenever possible and posted

				to the mailing lists.

				SCM Trees

				---------

				There are multiple Linux trees for DAMON development.  Patches under

				development or testing are queued in `damon/next

				<https://git.kernel.org/sj/h/damon/next>` by the DAMON maintainer.

				<https://git.kernel.org/sj/h/damon/next>`_ by the DAMON maintainer.

				Sufficiently reviewed patches will be queued in `mm-unstable

				<https://git.kernel.org/akpm/mm/h/mm-unstable>` by the memory management

				<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ by the memory management

				subsystem maintainer.  After more sufficient tests, the patches will be queued

				in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>` , and finally

				in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>`_, and finally

				pull-requested to the mainline by the memory management subsystem maintainer.

				Note again the patches for mm-unstable `tree

				<https://git.kernel.org/akpm/mm/h/mm-unstable>` are queued by the memory

				Note again the patches for `mm-unstable tree

				<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ are queued by the memory

				management subsystem maintainer.  If the patches requires some patches in

				damon/next `tree <https://git.kernel.org/sj/h/damon/next>` which not yet merged

				`damon/next tree <https://git.kernel.org/sj/h/damon/next>`_ which not yet merged

				in mm-unstable, please make sure the requirement is clearly specified.

				Submit checklist addendum

				@@ -37,25 +37,25 @@ When making DAMON changes, you should do below.

				- Build changes related outputs including kernel and documents.

				- Ensure the builds introduce no new errors or warnings.

				- Run and ensure no new failures for DAMON `selftests

				  <https://github.com/awslabs/damon-tests/blob/master/corr/run.sh#L49>` and

				  <https://github.com/damonitor/damon-tests/blob/master/corr/run.sh#L49>`_ and

				  `kunittests

				  <https://github.com/awslabs/damon-tests/blob/master/corr/tests/kunit.sh>`.

				  <https://github.com/damonitor/damon-tests/blob/master/corr/tests/kunit.sh>`_.

				Further doing below and putting the results will be helpful.

				- Run `damon-tests/corr

				  <https://github.com/awslabs/damon-tests/tree/master/corr>` for normal

				  <https://github.com/damonitor/damon-tests/tree/master/corr>`_ for normal

				  changes.

				- Run `damon-tests/perf

				  <https://github.com/awslabs/damon-tests/tree/master/perf>` for performance

				  <https://github.com/damonitor/damon-tests/tree/master/perf>`_ for performance

				  changes.

				Key cycle dates

				---------------

				Patches can be sent anytime.  Key cycle dates of the `mm-unstable

				<https://git.kernel.org/akpm/mm/h/mm-unstable>` and `mm-stable

				<https://git.kernel.org/akpm/mm/h/mm-stable>` trees depend on the memory

				<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and `mm-stable

				<https://git.kernel.org/akpm/mm/h/mm-stable>`_ trees depend on the memory

				management subsystem maintainer.

				Review cadence

				@@ -72,13 +72,13 @@ Mailing tool

				Like many other Linux kernel subsystems, DAMON uses the mailing lists

				(damon@lists.linux.dev and linux-mm@kvack.org) as the major communication

				channel.  There is a simple tool called `HacKerMaiL

				<https://github.com/damonitor/hackermail>` (``hkml``), which is for people who

				<https://github.com/damonitor/hackermail>`_ (``hkml``), which is for people who

				are not very familiar with the mailing lists based communication.  The tool

				could be particularly helpful for DAMON community members since it is developed

				and maintained by DAMON maintainer.  The tool is also officially announced to

				support DAMON and general Linux kernel development workflow.

				In other words, `hkml <https://github.com/damonitor/hackermail>` is a mailing

				In other words, `hkml <https://github.com/damonitor/hackermail>`_ is a mailing

				tool for DAMON community, which DAMON maintainer is committed to support.

				Please feel free to try and report issues or feature requests for the tool to

				the maintainer.

				@@ -98,8 +98,8 @@ slots, and attendees should reserve one of those at least 24 hours before the

				time slot, by reaching out to the maintainer.

				Schedules and available reservation time slots are available at the Google `doc

				<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`.

				<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`_.

				There is also a public Google `calendar

				<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`

				<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_

				that has the events.  Anyone can subscribe it.  DAMON maintainer will also

				provide periodic reminder to the mailing list (damon@lists.linux.dev).

									
										5

Documentation/networking/packet_mmap.rst
									
												View File
												
				@@ -16,7 +16,7 @@ ii) transmit network traffic, or any other that needs raw

				Howto can be found at:

				    https://sites.google.com/site/packetmmap/

				    https://web.archive.org/web/20220404160947/https://sites.google.com/site/packetmmap/

				Please send your comments to

				    - Ulisses Alonso Camaró <uaca@i.hate.spam.alumni.uv.es>

				@@ -166,7 +166,8 @@ As capture, each frame contains two parts::

				    /* bind socket to eth0 */

				    bind(this->socket, (struct sockaddr *)&my_addr, sizeof(struct sockaddr_ll));

				 A complete tutorial is available at: https://sites.google.com/site/packetmmap/

				 A complete tutorial is available at:

				 https://web.archive.org/web/20220404160947/https://sites.google.com/site/packetmmap/

				By default, the user should put data at::

									
										42

Documentation/process/maintainer-soc.rst
									
												View File
												
				@@ -30,10 +30,13 @@ tree as a dedicated branch covering multiple subsystems.

				The main SoC tree is housed on git.kernel.org:

				  https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git/

				Maintainers

				-----------

				Clearly this is quite a wide range of topics, which no one person, or even

				small group of people are capable of maintaining.  Instead, the SoC subsystem

				is comprised of many submaintainers, each taking care of individual platforms

				and driver subdirectories.

				is comprised of many submaintainers (platform maintainers), each taking care of

				individual platforms and driver subdirectories.

				In this regard, "platform" usually refers to a series of SoCs from a given

				vendor, for example, Nvidia's series of Tegra SoCs.  Many submaintainers operate

				on a vendor level, responsible for multiple product lines.  For several reasons,

				@@ -43,14 +46,43 @@ MAINTAINERS file.

				Most of these submaintainers have their own trees where they stage patches,

				sending pull requests to the main SoC tree.  These trees are usually, but not

				always, listed in MAINTAINERS.  The main SoC maintainers can be reached via the

				alias soc@kernel.org if there is no platform-specific maintainer, or if they

				are unresponsive.

				always, listed in MAINTAINERS.

				What the SoC tree is not, however, is a location for architecture-specific code

				changes.  Each architecture has its own maintainers that are responsible for

				architectural details, CPU errata and the like.

				Submitting Patches for Given SoC

				~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

				All typical platform related patches should be sent via SoC submaintainers

				(platform-specific maintainers).  This includes also changes to per-platform or

				shared defconfigs (scripts/get_maintainer.pl might not provide correct

				addresses in such case).

				Submitting Patches to the Main SoC Maintainers

				~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

				The main SoC maintainers can be reached via the alias soc@kernel.org only in

				following cases:

				1. There are no platform-specific maintainers.

				2. Platform-specific maintainers are unresponsive.

				3. Introducing a completely new SoC platform.  Such new SoC work should be sent

				   first to common mailing lists, pointed out by scripts/get_maintainer.pl, for

				   community review.  After positive community review, work should be sent to

				   soc@kernel.org in one patchset containing new arch/foo/Kconfig entry, DTS

				   files, MAINTAINERS file entry and optionally initial drivers with their

				   Devicetree bindings.  The MAINTAINERS file entry should list new

				   platform-specific maintainers, who are going to be responsible for handling

				   patches for the platform from now on.

				Note that the soc@kernel.org is usually not the place to discuss the patches,

				thus work sent to this address should be already considered as acceptable by

				the community.

				Information for (new) Submaintainers

				------------------------------------

									
										2

Documentation/rust/arch-support.rst
									
												View File
												
				@@ -17,7 +17,7 @@ Architecture   Level of support  Constraints

				=============  ================  ==============================================

				``arm64``      Maintained        Little Endian only.

				``loongarch``  Maintained        \-

				``riscv``      Maintained        ``riscv64`` only.

				``riscv``      Maintained        ``riscv64`` and LLVM/Clang only.

				``um``         Maintained        \-

				``x86``        Maintained        ``x86_64`` only.

				=============  ================  ==============================================

									
										261

Documentation/userspace-api/mseal.rst
									
												View File
												
				@@ -23,177 +23,166 @@ applications can additionally seal security critical data at runtime.

				A similar feature already exists in the XNU kernel with the

				VM_FLAGS_PERMANENT flag [1] and on OpenBSD with the mimmutable syscall [2].

				User API

				========

				mseal()

				-----------

				The mseal() syscall has the following signature:

				SYSCALL

				=======

				mseal syscall signature

				-----------------------

				   ``int mseal(void \* addr, size_t len, unsigned long flags)``

				``int mseal(void addr, size_t len, unsigned long flags)``

				   **addr**/**len**: virtual memory address range.

				      The address range set by **addr**/**len** must meet:

				         - The start address must be in an allocated VMA.

				         - The start address must be page aligned.

				         - The end address (**addr** + **len**) must be in an allocated VMA.

				         - no gap (unallocated memory) between start and end address.

				**addr/len**: virtual memory address range.

				      The ``len`` will be paged aligned implicitly by the kernel.

				The address range set by ``addr``/``len`` must meet:

				   - The start address must be in an allocated VMA.

				   - The start address must be page aligned.

				   - The end address (``addr`` + ``len``) must be in an allocated VMA.

				   - no gap (unallocated memory) between start and end address.

				   **flags**: reserved for future use.

				The ``len`` will be paged aligned implicitly by the kernel.

				   **Return values**:

				      - **0**: Success.

				      - **-EINVAL**:

				         * Invalid input ``flags``.

				         * The start address (``addr``) is not page aligned.

				         * Address range (``addr`` + ``len``) overflow.

				      - **-ENOMEM**:

				         * The start address (``addr``) is not allocated.

				         * The end address (``addr`` + ``len``) is not allocated.

				         * A gap (unallocated memory) between start and end address.

				      - **-EPERM**:

				         * sealing is supported only on 64-bit CPUs, 32-bit is not supported.

				**flags**: reserved for future use.

				   **Note about error return**:

				      - For above error cases, users can expect the given memory range is

				        unmodified, i.e. no partial update.

				      - There might be other internal errors/cases not listed here, e.g.

				        error during merging/splitting VMAs, or the process reaching the maximum

				        number of supported VMAs. In those cases, partial updates to the given

				        memory range could happen. However, those cases should be rare.

				**return values**:

				   **Architecture support**:

				      mseal only works on 64-bit CPUs, not 32-bit CPUs.

				- ``0``: Success.

				   **Idempotent**:

				      users can call mseal multiple times. mseal on an already sealed memory

				      is a no-action (not error).

				- ``-EINVAL``:

				    - Invalid input ``flags``.

				    - The start address (``addr``) is not page aligned.

				    - Address range (``addr`` + ``len``) overflow.

				   **no munseal**

				      Once mapping is sealed, it can't be unsealed. The kernel should never

				      have munseal, this is consistent with other sealing feature, e.g.

				      F_SEAL_SEAL for file.

				- ``-ENOMEM``:

				    - The start address (``addr``) is not allocated.

				    - The end address (``addr`` + ``len``) is not allocated.

				    - A gap (unallocated memory) between start and end address.

				Blocked mm syscall for sealed mapping

				-------------------------------------

				   It might be important to note: **once the mapping is sealed, it will

				   stay in the process's memory until the process terminates**.

				- ``-EPERM``:

				    - sealing is supported only on 64-bit CPUs, 32-bit is not supported.

				   Example::

				- For above error cases, users can expect the given memory range is

				  unmodified, i.e. no partial update.

				         *ptr = mmap(0, 4096, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);

				         rc = mseal(ptr, 4096, 0);

				         /* munmap will fail */

				         rc = munmap(ptr, 4096);

				         assert(rc < 0);

				- There might be other internal errors/cases not listed here, e.g.

				  error during merging/splitting VMAs, or the process reaching the max

				  number of supported VMAs. In those cases, partial updates to the given

				  memory range could happen. However, those cases should be rare.

				   Blocked mm syscall:

				      - munmap

				      - mmap

				      - mremap

				      - mprotect and pkey_mprotect

				      - some destructive madvise behaviors: MADV_DONTNEED, MADV_FREE,

				        MADV_DONTNEED_LOCKED, MADV_FREE, MADV_DONTFORK, MADV_WIPEONFORK

				**Blocked operations after sealing**:

				    Unmapping, moving to another location, and shrinking the size,

				    via munmap() and mremap(), can leave an empty space, therefore

				    can be replaced with a VMA with a new set of attributes.

				   The first set of syscalls to block is munmap, mremap, mmap. They can

				   either leave an empty space in the address space, therefore allowing

				   replacement with a new mapping with new set of attributes, or can

				   overwrite the existing mapping with another mapping.

				    Moving or expanding a different VMA into the current location,

				    via mremap().

				   mprotect and pkey_mprotect are blocked because they changes the

				   protection bits (RWX) of the mapping.

				    Modifying a VMA via mmap(MAP_FIXED).

				   Certain destructive madvise behaviors, specifically MADV_DONTNEED,

				   MADV_FREE, MADV_DONTNEED_LOCKED, and MADV_WIPEONFORK, can introduce

				   risks when applied to anonymous memory by threads lacking write

				   permissions. Consequently, these operations are prohibited under such

				   conditions. The aforementioned behaviors have the potential to modify

				   region contents by discarding pages, effectively performing a memset(0)

				   operation on the anonymous memory.

				    Size expansion, via mremap(), does not appear to pose any

				    specific risks to sealed VMAs. It is included anyway because

				    the use case is unclear. In any case, users can rely on

				    merging to expand a sealed VMA.

				   Kernel will return -EPERM for blocked syscalls.

				    mprotect() and pkey_mprotect().

				   When blocked syscall return -EPERM due to sealing, the memory regions may

				   or may not be changed, depends on the syscall being blocked:

				    Some destructive madvice() behaviors (e.g. MADV_DONTNEED)

				    for anonymous memory, when users don't have write permission to the

				    memory. Those behaviors can alter region contents by discarding pages,

				    effectively a memset(0) for anonymous memory.

				      - munmap: munmap is atomic. If one of VMAs in the given range is

				        sealed, none of VMAs are updated.

				      - mprotect, pkey_mprotect, madvise: partial update might happen, e.g.

				        when mprotect over multiple VMAs, mprotect might update the beginning

				        VMAs before reaching the sealed VMA and return -EPERM.

				      - mmap and mremap: undefined behavior.

				    Kernel will return -EPERM for blocked operations.

				    For blocked operations, one can expect the given address is unmodified,

				    i.e. no partial update. Note, this is different from existing mm

				    system call behaviors, where partial updates are made till an error is

				    found and returned to userspace. To give an example:

				    Assume following code sequence:

				    - ptr = mmap(null, 8192, PROT_NONE);

				    - munmap(ptr + 4096, 4096);

				    - ret1 = mprotect(ptr, 8192, PROT_READ);

				    - mseal(ptr, 4096);

				    - ret2 = mprotect(ptr, 8192, PROT_NONE);

				    ret1 will be -ENOMEM, the page from ptr is updated to PROT_READ.

				    ret2 will be -EPERM, the page remains to be PROT_READ.

				**Note**:

				- mseal() only works on 64-bit CPUs, not 32-bit CPU.

				- users can call mseal() multiple times, mseal() on an already sealed memory

				  is a no-action (not error).

				- munseal() is not supported.

				Use cases:

				==========

				Use cases

				=========

				- glibc:

				  The dynamic linker, during loading ELF executables, can apply sealing to

				  non-writable memory segments.

				  mapping segments.

				- Chrome browser: protect some security sensitive data-structures.

				- Chrome browser: protect some security sensitive data structures.

				Notes on which memory to seal:

				==============================

				It might be important to note that sealing changes the lifetime of a mapping,

				i.e. the sealed mapping won’t be unmapped till the process terminates or the

				exec system call is invoked. Applications can apply sealing to any virtual

				memory region from userspace, but it is crucial to thoroughly analyze the

				mapping's lifetime prior to apply the sealing.

				When not to use mseal

				=====================

				Applications can apply sealing to any virtual memory region from userspace,

				but it is *crucial to thoroughly analyze the mapping's lifetime* prior to

				apply the sealing. This is because the sealed mapping *won’t be unmapped*

				until the process terminates or the exec system call is invoked.

				For example:

				   - aio/shm

				     aio/shm can call mmap and  munmap on behalf of userspace, e.g.

				     ksys_shmdt() in shm.c. The lifetimes of those mapping are not tied to

				     the lifetime of the process. If those memories are sealed from userspace,

				     then munmap will fail, causing leaks in VMA address space during the

				     lifetime of the process.

				- aio/shm

				   - ptr allocated by malloc (heap)

				     Don't use mseal on the memory ptr return from malloc().

				     malloc() is implemented by allocator, e.g. by glibc. Heap manager might

				     allocate a ptr from brk or mapping created by mmap.

				     If an app calls mseal on a ptr returned from malloc(), this can affect

				     the heap manager's ability to manage the mappings; the outcome is

				     non-deterministic.

				  aio/shm can call mmap()/munmap() on behalf of userspace, e.g. ksys_shmdt() in

				  shm.c. The lifetime of those mapping are not tied to the lifetime of the

				  process. If those memories are sealed from userspace, then munmap() will fail,

				  causing leaks in VMA address space during the lifetime of the process.

				     Example::

				- Brk (heap)

				        ptr = malloc(size);

				        /* don't call mseal on ptr return from malloc. */

				        mseal(ptr, size);

				        /* free will success, allocator can't shrink heap lower than ptr */

				        free(ptr);

				  Currently, userspace applications can seal parts of the heap by calling

				  malloc() and mseal().

				  let's assume following calls from user space:

				mseal doesn't block

				===================

				In a nutshell, mseal blocks certain mm syscall from modifying some of VMA's

				attributes, such as protection bits (RWX). Sealed mappings doesn't mean the

				memory is immutable.

				  - ptr = malloc(size);

				  - mprotect(ptr, size, RO);

				  - mseal(ptr, size);

				  - free(ptr);

				  Technically, before mseal() is added, the user can change the protection of

				  the heap by calling mprotect(RO). As long as the user changes the protection

				  back to RW before free(), the memory range can be reused.

				  Adding mseal() into the picture, however, the heap is then sealed partially,

				  the user can still free it, but the memory remains to be RO. If the address

				  is re-used by the heap manager for another malloc, the process might crash

				  soon after. Therefore, it is important not to apply sealing to any memory

				  that might get recycled.

				  Furthermore, even if the application never calls the free() for the ptr,

				  the heap manager may invoke the brk system call to shrink the size of the

				  heap. In the kernel, the brk-shrink will call munmap(). Consequently,

				  depending on the location of the ptr, the outcome of brk-shrink is

				  nondeterministic.

				Additional notes:

				=================

				As Jann Horn pointed out in [3], there are still a few ways to write

				to RO memory, which is, in a way, by design. Those cases are not covered

				by mseal(). If applications want to block such cases, sandbox tools (such as

				seccomp, LSM, etc) might be considered.

				to RO memory, which is, in a way, by design. And those could be blocked

				by different security measures.

				Those cases are:

				- Write to read-only memory through /proc/self/mem interface.

				- Write to read-only memory through ptrace (such as PTRACE_POKETEXT).

				- userfaultfd.

				   - Write to read-only memory through /proc/self/mem interface (FOLL_FORCE).

				   - Write to read-only memory through ptrace (such as PTRACE_POKETEXT).

				   - userfaultfd.

				The idea that inspired this patch comes from Stephen Röttger’s work in V8

				CFI [4]. Chrome browser in ChromeOS will be the first user of this API.

				Reference:

				==========

				[1] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b9f69dbd3c8c3fd30a/osfmk/mach/vm_statistics.h#L274

				[2] https://man.openbsd.org/mimmutable.2

				[3] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426FkcgnfUGLvA@mail.gmail.com

				[4] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXgeaRHo/edit#heading=h.bvaojj9fu6hc

				Reference

				=========

				- [1] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b9f69dbd3c8c3fd30a/osfmk/mach/vm_statistics.h#L274

				- [2] https://man.openbsd.org/mimmutable.2

				- [3] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426FkcgnfUGLvA@mail.gmail.com

				- [4] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXgeaRHo/edit#heading=h.bvaojj9fu6hc

									
										16

Documentation/virt/kvm/api.rst
									
												View File
												
				@@ -8098,13 +8098,15 @@ KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS By default, KVM emulates MONITOR/MWAIT (if

				                                    KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT is

				                                    disabled.

				KVM_X86_QUIRK_SLOT_ZAP_ALL          By default, KVM invalidates all SPTEs in

				                                    fast way for memslot deletion when VM type

				                                    is KVM_X86_DEFAULT_VM.

				                                    When this quirk is disabled or when VM type

				                                    is other than KVM_X86_DEFAULT_VM, KVM zaps

				                                    only leaf SPTEs that are within the range of

				                                    the memslot being deleted.

				KVM_X86_QUIRK_SLOT_ZAP_ALL          By default, for KVM_X86_DEFAULT_VM VMs, KVM

				                                    invalidates all SPTEs in all memslots and

				                                    address spaces when a memslot is deleted or

				                                    moved.  When this quirk is disabled (or the

				                                    VM type isn't KVM_X86_DEFAULT_VM), KVM only

				                                    ensures the backing memory of the deleted

				                                    or moved memslot isn't reachable, i.e KVM

				                                    _may_ invalidate only SPTEs related to the

				                                    memslot.

				=================================== ============================================

				7.32 KVM_CAP_MAX_VCPU_ID

									
										2

Documentation/virt/kvm/locking.rst
									
												View File
												
				@@ -136,7 +136,7 @@ For direct sp, we can easily avoid it since the spte of direct sp is fixed

				to gfn.  For indirect sp, we disabled fast page fault for simplicity.

				A solution for indirect sp could be to pin the gfn, for example via

				kvm_vcpu_gfn_to_pfn_atomic, before the cmpxchg.  After the pinning:

				gfn_to_pfn_memslot_atomic, before the cmpxchg.  After the pinning:

				- We have held the refcount of pfn; that means the pfn can not be freed and

				  be reused for another gfn.

183

MAINTAINERS

View File

@@ -258,12 +258,6 @@ L:	linux-acenic@sunsite.dk
 S:	Maintained
 F:	drivers/net/ethernet/alteon/acenic*
 ACER ASPIRE 1 EMBEDDED CONTROLLER DRIVER
 M:	Nikita Travkin <nikita@trvn.ru>
 S:	Maintained
 F:	Documentation/devicetree/bindings/platform/acer,aspire1-ec.yaml
 F:	drivers/platform/arm64/acer-aspire1-ec.c
 ACER ASPIRE ONE TEMPERATURE AND FAN DRIVER
 M:	Peter Kaestle <peter@piie.net>
 L:	platform-driver-x86@vger.kernel.org
@@ -888,7 +882,6 @@ F:	drivers/staging/media/sunxi/cedrus/
 ALPHA PORT
 M:	Richard Henderson <richard.henderson@linaro.org>
 M:	Ivan Kokshaysky <ink@jurassic.park.msu.ru>
 M:	Matt Turner <mattst88@gmail.com>
 L:	linux-alpha@vger.kernel.org
 S:	Odd Fixes
@@ -1761,8 +1754,8 @@ F:	include/uapi/linux/if_arcnet.h
 ARM AND ARM64 SoC SUB-ARCHITECTURES (COMMON PARTS)
 M:	Arnd Bergmann <arnd@arndb.de>
 M:	Olof Johansson <olof@lixom.net>
 M:	soc@kernel.org
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 L:	soc@lists.linux.dev
 S:	Maintained
 P:	Documentation/process/maintainer-soc.rst
 C:	irc://irc.libera.chat/armlinux
@@ -2263,12 +2256,6 @@ L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Maintained
 F:	arch/arm/mach-ep93xx/ts72xx.c
 ARM/CIRRUS LOGIC CLPS711X ARM ARCHITECTURE
 M:	Alexander Shiyan <shc_work@mail.ru>
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Odd Fixes
 N:	clps711x
 ARM/CIRRUS LOGIC EP93XX ARM ARCHITECTURE
 M:	Hartley Sweeten <hsweeten@visionengravers.com>
 M:	Alexander Sverdlin <alexander.sverdlin@gmail.com>
@@ -3815,14 +3802,6 @@ F:	drivers/video/backlight/
 F:	include/linux/backlight.h
 F:	include/linux/pwm_backlight.h
 BAIKAL-T1 PVT HARDWARE MONITOR DRIVER
 M:	Serge Semin <fancer.lancer@gmail.com>
 L:	linux-hwmon@vger.kernel.org
 S:	Supported
 F:	Documentation/devicetree/bindings/hwmon/baikal,bt1-pvt.yaml
 F:	Documentation/hwmon/bt1-pvt.rst
 F:	drivers/hwmon/bt1-pvt.[ch]
 BARCO P50 GPIO DRIVER
 M:	Santosh Kumar Yadav <santoshkumar.yadav@barco.com>
 M:	Peter Korsgaard <peter.korsgaard@barco.com>
@@ -6476,7 +6455,6 @@ F:	drivers/mtd/nand/raw/denali*
 DESIGNWARE EDMA CORE IP DRIVER
 M:	Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
 R:	Serge Semin <fancer.lancer@gmail.com>
 L:	dmaengine@vger.kernel.org
 S:	Maintained
 F:	drivers/dma/dw-edma/
@@ -9745,6 +9723,7 @@ F:	include/dt-bindings/gpio/
 F:	include/linux/gpio.h
 F:	include/linux/gpio/
 F:	include/linux/of_gpio.h
 K:	(devm_)?gpio_(request|free|direction|get|set)
 GPIO UAPI
 M:	Bartosz Golaszewski <brgl@bgdev.pl>
@@ -9759,14 +9738,6 @@ F:	drivers/gpio/gpiolib-cdev.c
 F:	include/uapi/linux/gpio.h
 F:	tools/gpio/
 GRE DEMULTIPLEXER DRIVER
 M:	Dmitry Kozlov <xeb@mail.ru>
 L:	netdev@vger.kernel.org
 S:	Maintained
 F:	include/net/gre.h
 F:	net/ipv4/gre_demux.c
 F:	net/ipv4/gre_offload.c
 GRETH 10/100/1G Ethernet MAC device driver
 M:	Andreas Larsson <andreas@gaisler.com>
 L:	netdev@vger.kernel.org
@@ -11283,10 +11254,10 @@ F:	security/integrity/
 F:	security/integrity/ima/
 INTEGRITY POLICY ENFORCEMENT (IPE)
 M:	Fan Wu <wufan@linux.microsoft.com>
 M:	Fan Wu <wufan@kernel.org>
 L:	linux-security-module@vger.kernel.org
 S:	Supported
 T:	git https://github.com/microsoft/ipe.git
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe.git
 F:	Documentation/admin-guide/LSM/ipe.rst
 F:	Documentation/security/ipe.rst
 F:	scripts/ipe/
@@ -11604,6 +11575,16 @@ F:	drivers/crypto/intel/keembay/keembay-ocs-hcu-core.c
 F:	drivers/crypto/intel/keembay/ocs-hcu.c
 F:	drivers/crypto/intel/keembay/ocs-hcu.h
 INTEL LA JOLLA COVE ADAPTER (LJCA) USB I/O EXPANDER DRIVERS
 M:	Wentong Wu <wentong.wu@intel.com>
 M:	Sakari Ailus <sakari.ailus@linux.intel.com>
 S:	Maintained
 F:	drivers/gpio/gpio-ljca.c
 F:	drivers/i2c/busses/i2c-ljca.c
 F:	drivers/spi/spi-ljca.c
 F:	drivers/usb/misc/usb-ljca.c
 F:	include/linux/usb/ljca.h
 INTEL MANAGEMENT ENGINE (mei)
 M:	Tomas Winkler <tomas.winkler@intel.com>
 L:	linux-kernel@vger.kernel.org
@@ -12242,6 +12223,7 @@ R:	Dmitry Vyukov <dvyukov@google.com>
 R:	Vincenzo Frascino <vincenzo.frascino@arm.com>
 L:	kasan-dev@googlegroups.com
 S:	Maintained
 B:	https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
 F:	Documentation/dev-tools/kasan.rst
 F:	arch/*/include/asm/*kasan.h
 F:	arch/*/mm/kasan_init*
@@ -12265,6 +12247,7 @@ R:	Dmitry Vyukov <dvyukov@google.com>
 R:	Andrey Konovalov <andreyknvl@gmail.com>
 L:	kasan-dev@googlegroups.com
 S:	Maintained
 B:	https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
 F:	Documentation/dev-tools/kcov.rst
 F:	include/linux/kcov.h
 F:	include/uapi/linux/kcov.h
@@ -12947,12 +12930,6 @@ S:	Maintained
 F:	drivers/ata/pata_arasan_cf.c
 F:	include/linux/pata_arasan_cf_data.h
 LIBATA PATA DRIVERS
 R:	Sergey Shtylyov <s.shtylyov@omp.ru>
 L:	linux-ide@vger.kernel.org
 F:	drivers/ata/ata_*.c
 F:	drivers/ata/pata_*.c
 LIBATA PATA FARADAY FTIDE010 AND GEMINI SATA BRIDGE DRIVERS
 M:	Linus Walleij <linus.walleij@linaro.org>
 L:	linux-ide@vger.kernel.org
@@ -12969,14 +12946,6 @@ F:	drivers/ata/ahci_platform.c
 F:	drivers/ata/libahci_platform.c
 F:	include/linux/ahci_platform.h
 LIBATA SATA AHCI SYNOPSYS DWC CONTROLLER DRIVER
 M:	Serge Semin <fancer.lancer@gmail.com>
 L:	linux-ide@vger.kernel.org
 S:	Maintained
 F:	Documentation/devicetree/bindings/ata/baikal,bt1-ahci.yaml
 F:	Documentation/devicetree/bindings/ata/snps,dwc-ahci.yaml
 F:	drivers/ata/ahci_dwc.c
 LIBATA SATA PROMISE TX2/TX4 CONTROLLER DRIVER
 M:	Mikael Pettersson <mikpelinux@gmail.com>
 L:	linux-ide@vger.kernel.org
@@ -14173,8 +14142,7 @@ T:	git git://linuxtv.org/media_tree.git
 F:	drivers/media/platform/nxp/imx-pxp.[ch]
 MEDIA DRIVERS FOR ASCOT2E
 M:	Sergey Kozlov <serjk@netup.ru>
 M:	Abylay Ospan <aospan@netup.ru>
 M:	Abylay Ospan <aospan@amazon.com>
 L:	linux-media@vger.kernel.org
 S:	Supported
 W:	https://linuxtv.org
@@ -14191,8 +14159,7 @@ T:	git git://linuxtv.org/media_tree.git
 F:	drivers/media/dvb-frontends/cxd2099*
 MEDIA DRIVERS FOR CXD2841ER
 M:	Sergey Kozlov <serjk@netup.ru>
 M:	Abylay Ospan <aospan@netup.ru>
 M:	Abylay Ospan <aospan@amazon.com>
 L:	linux-media@vger.kernel.org
 S:	Supported
 W:	https://linuxtv.org
@@ -14245,7 +14212,7 @@ F:	drivers/media/platform/nxp/imx7-media-csi.c
 F:	drivers/media/platform/nxp/imx8mq-mipi-csi2.c
 MEDIA DRIVERS FOR HELENE
 M:	Abylay Ospan <aospan@netup.ru>
 M:	Abylay Ospan <aospan@amazon.com>
 L:	linux-media@vger.kernel.org
 S:	Supported
 W:	https://linuxtv.org
@@ -14254,8 +14221,7 @@ T:	git git://linuxtv.org/media_tree.git
 F:	drivers/media/dvb-frontends/helene*
 MEDIA DRIVERS FOR HORUS3A
 M:	Sergey Kozlov <serjk@netup.ru>
 M:	Abylay Ospan <aospan@netup.ru>
 M:	Abylay Ospan <aospan@amazon.com>
 L:	linux-media@vger.kernel.org
 S:	Supported
 W:	https://linuxtv.org
@@ -14264,8 +14230,7 @@ T:	git git://linuxtv.org/media_tree.git
 F:	drivers/media/dvb-frontends/horus3a*
 MEDIA DRIVERS FOR LNBH25
 M:	Sergey Kozlov <serjk@netup.ru>
 M:	Abylay Ospan <aospan@netup.ru>
 M:	Abylay Ospan <aospan@amazon.com>
 L:	linux-media@vger.kernel.org
 S:	Supported
 W:	https://linuxtv.org
@@ -14281,8 +14246,7 @@ T:	git git://linuxtv.org/media_tree.git
 F:	drivers/media/dvb-frontends/mxl5xx*
 MEDIA DRIVERS FOR NETUP PCI UNIVERSAL DVB devices
 M:	Sergey Kozlov <serjk@netup.ru>
 M:	Abylay Ospan <aospan@netup.ru>
 M:	Abylay Ospan <aospan@amazon.com>
 L:	linux-media@vger.kernel.org
 S:	Supported
 W:	https://linuxtv.org
@@ -14907,9 +14871,10 @@ N:	include/linux/page[-_]*
 MEMORY MAPPING
 M:	Andrew Morton <akpm@linux-foundation.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
 M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
 R:	Vlastimil Babka <vbabka@suse.cz>
 R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
 R:	Jann Horn <jannh@google.com>
 L:	linux-mm@kvack.org
 S:	Maintained
 W:	http://www.linux-mm.org
@@ -14932,13 +14897,6 @@ F:	drivers/mtd/
 F:	include/linux/mtd/
 F:	include/uapi/mtd/
 MEMSENSING MICROSYSTEMS MSA311 DRIVER
 M:	Dmitry Rokosov <ddrokosov@sberdevices.ru>
 L:	linux-iio@vger.kernel.org
 S:	Maintained
 F:	Documentation/devicetree/bindings/iio/accel/memsensing,msa311.yaml
 F:	drivers/iio/accel/msa311.c
 MEN A21 WATCHDOG DRIVER
 M:	Johannes Thumshirn <morbidrsa@gmail.com>
 L:	linux-watchdog@vger.kernel.org
@@ -15083,6 +15041,7 @@ F:	drivers/spi/spi-at91-usart.c
 MICROCHIP AUDIO ASOC DRIVERS
 M:	Claudiu Beznea <claudiu.beznea@tuxon.dev>
 M:	Andrei Simion <andrei.simion@microchip.com>
 L:	linux-sound@vger.kernel.org
 S:	Supported
 F:	Documentation/devicetree/bindings/sound/atmel*
@@ -15191,6 +15150,7 @@ F:	include/video/atmel_lcdc.h
 MICROCHIP MCP16502 PMIC DRIVER
 M:	Claudiu Beznea <claudiu.beznea@tuxon.dev>
 M:	Andrei Simion <andrei.simion@microchip.com>
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Supported
 F:	Documentation/devicetree/bindings/regulator/microchip,mcp16502.yaml
@@ -15272,7 +15232,6 @@ F:	drivers/tty/serial/8250/8250_pci1xxxx.c
 MICROCHIP POLARFIRE FPGA DRIVERS
 M:	Conor Dooley <conor.dooley@microchip.com>
 R:	Vladimir Georgiev <v.georgiev@metrotek.ru>
 L:	linux-fpga@vger.kernel.org
 S:	Supported
 F:	Documentation/devicetree/bindings/fpga/microchip,mpf-spi-fpga-mgr.yaml
@@ -15322,6 +15281,7 @@ F:	drivers/spi/spi-atmel.*
 MICROCHIP SSC DRIVER
 M:	Claudiu Beznea <claudiu.beznea@tuxon.dev>
 M:	Andrei Simion <andrei.simion@microchip.com>
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Supported
 F:	Documentation/devicetree/bindings/misc/atmel-ssc.txt
@@ -15527,17 +15487,6 @@ F:	arch/mips/
 F:	drivers/platform/mips/
 F:	include/dt-bindings/mips/
 MIPS BAIKAL-T1 PLATFORM
 M:	Serge Semin <fancer.lancer@gmail.com>
 L:	linux-mips@vger.kernel.org
 S:	Supported
 F:	Documentation/devicetree/bindings/bus/baikal,bt1-*.yaml
 F:	Documentation/devicetree/bindings/clock/baikal,bt1-*.yaml
 F:	drivers/bus/bt1-*.c
 F:	drivers/clk/baikal-t1/
 F:	drivers/memory/bt1-l2-ctl.c
 F:	drivers/mtd/maps/physmap-bt1-rom.[ch]
 MIPS BOSTON DEVELOPMENT BOARD
 M:	Paul Burton <paulburton@kernel.org>
 L:	linux-mips@vger.kernel.org
@@ -15550,7 +15499,6 @@ F:	include/dt-bindings/clock/boston-clock.h
 MIPS CORE DRIVERS
 M:	Thomas Bogendoerfer <tsbogend@alpha.franken.de>
 M:	Serge Semin <fancer.lancer@gmail.com>
 L:	linux-mips@vger.kernel.org
 S:	Supported
 F:	drivers/bus/mips_cdmm.c
@@ -16086,6 +16034,7 @@ F:	include/uapi/linux/net_dropmon.h
 F:	net/core/drop_monitor.c
 NETWORKING DRIVERS
 M:	Andrew Lunn <andrew+netdev@lunn.ch>
 M:	"David S. Miller" <davem@davemloft.net>
 M:	Eric Dumazet <edumazet@google.com>
 M:	Jakub Kicinski <kuba@kernel.org>
@@ -16151,6 +16100,7 @@ M:	"David S. Miller" <davem@davemloft.net>
 M:	Eric Dumazet <edumazet@google.com>
 M:	Jakub Kicinski <kuba@kernel.org>
 M:	Paolo Abeni <pabeni@redhat.com>
 R:	Simon Horman <horms@kernel.org>
 L:	netdev@vger.kernel.org
 S:	Maintained
 P:	Documentation/process/maintainer-netdev.rst
@@ -16193,6 +16143,7 @@ F:	include/uapi/linux/rtnetlink.h
 F:	lib/net_utils.c
 F:	lib/random32.c
 F:	net/
 F:	samples/pktgen/
 F:	tools/net/
 F:	tools/testing/selftests/net/
 X:	Documentation/networking/mac80211-injection.rst
@@ -16517,12 +16468,6 @@ F:	include/linux/ntb.h
 F:	include/linux/ntb_transport.h
 F:	tools/testing/selftests/ntb/
 NTB IDT DRIVER
 M:	Serge Semin <fancer.lancer@gmail.com>
 L:	ntb@lists.linux.dev
 S:	Supported
 F:	drivers/ntb/hw/idt/
 NTB INTEL DRIVER
 M:	Dave Jiang <dave.jiang@intel.com>
 L:	ntb@lists.linux.dev
@@ -18543,13 +18488,6 @@ F:	drivers/pps/
 F:	include/linux/pps*.h
 F:	include/uapi/linux/pps.h
 PPTP DRIVER
 M:	Dmitry Kozlov <xeb@mail.ru>
 L:	netdev@vger.kernel.org
 S:	Maintained
 W:	http://sourceforge.net/projects/accel-pptp
 F:	drivers/net/ppp/pptp.c
 PRESSURE STALL INFORMATION (PSI)
 M:	Johannes Weiner <hannes@cmpxchg.org>
 M:	Suren Baghdasaryan <surenb@google.com>
@@ -19523,6 +19461,14 @@ S:	Maintained
 F:	Documentation/tools/rtla/
 F:	tools/tracing/rtla/
 Real-time Linux (PREEMPT_RT)
 M:	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
 M:	Clark Williams <clrkwllms@kernel.org>
 M:	Steven Rostedt <rostedt@goodmis.org>
 L:	linux-rt-devel@lists.linux.dev
 S:	Supported
 K:	PREEMPT_RT
 REALTEK AUDIO CODECS
 M:	Oder Chiou <oder_chiou@realtek.com>
 S:	Maintained
@@ -19632,15 +19578,6 @@ S:	Supported
 F:	Documentation/devicetree/bindings/i2c/renesas,iic-emev2.yaml
 F:	drivers/i2c/busses/i2c-emev2.c
 RENESAS ETHERNET AVB DRIVER
 R:	Sergey Shtylyov <s.shtylyov@omp.ru>
 L:	netdev@vger.kernel.org
 L:	linux-renesas-soc@vger.kernel.org
 F:	Documentation/devicetree/bindings/net/renesas,etheravb.yaml
 F:	drivers/net/ethernet/renesas/Kconfig
 F:	drivers/net/ethernet/renesas/Makefile
 F:	drivers/net/ethernet/renesas/ravb*
 RENESAS ETHERNET SWITCH DRIVER
 R:	Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
 L:	netdev@vger.kernel.org
@@ -19690,14 +19627,6 @@ F:	Documentation/devicetree/bindings/i2c/renesas,rmobile-iic.yaml
 F:	drivers/i2c/busses/i2c-rcar.c
 F:	drivers/i2c/busses/i2c-sh_mobile.c
 RENESAS R-CAR SATA DRIVER
 R:	Sergey Shtylyov <s.shtylyov@omp.ru>
 L:	linux-ide@vger.kernel.org
 L:	linux-renesas-soc@vger.kernel.org
 S:	Supported
 F:	Documentation/devicetree/bindings/ata/renesas,rcar-sata.yaml
 F:	drivers/ata/sata_rcar.c
 RENESAS R-CAR THERMAL DRIVERS
 M:	Niklas Söderlund <niklas.soderlund@ragnatech.se>
 L:	linux-renesas-soc@vger.kernel.org
@@ -19773,16 +19702,6 @@ S:	Supported
 F:	Documentation/devicetree/bindings/i2c/renesas,rzv2m.yaml
 F:	drivers/i2c/busses/i2c-rzv2m.c
 RENESAS SUPERH ETHERNET DRIVER
 R:	Sergey Shtylyov <s.shtylyov@omp.ru>
 L:	netdev@vger.kernel.org
 L:	linux-renesas-soc@vger.kernel.org
 F:	Documentation/devicetree/bindings/net/renesas,ether.yaml
 F:	drivers/net/ethernet/renesas/Kconfig
 F:	drivers/net/ethernet/renesas/Makefile
 F:	drivers/net/ethernet/renesas/sh_eth*
 F:	include/linux/sh_eth.h
 RENESAS USB PHY DRIVER
 M:	Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
 L:	linux-renesas-soc@vger.kernel.org
@@ -21777,8 +21696,8 @@ F:	drivers/accessibility/speakup/
 SPEAR PLATFORM/CLOCK/PINCTRL SUPPORT
 M:	Viresh Kumar <vireshk@kernel.org>
 M:	Shiraz Hashim <shiraz.linux.kernel@gmail.com>
 M:	soc@kernel.org
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 L:	soc@lists.linux.dev
 S:	Maintained
 W:	http://www.st.com/spear
 F:	arch/arm/boot/dts/st/spear*
@@ -22436,19 +22355,11 @@ F:	drivers/tty/serial/8250/8250_lpss.c
 SYNOPSYS DESIGNWARE APB GPIO DRIVER
 M:	Hoan Tran <hoan@os.amperecomputing.com>
 M:	Serge Semin <fancer.lancer@gmail.com>
 L:	linux-gpio@vger.kernel.org
 S:	Maintained
 F:	Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
 F:	drivers/gpio/gpio-dwapb.c
 SYNOPSYS DESIGNWARE APB SSI DRIVER
 M:	Serge Semin <fancer.lancer@gmail.com>
 L:	linux-spi@vger.kernel.org
 S:	Supported
 F:	Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
 F:	drivers/spi/spi-dw*
 SYNOPSYS DESIGNWARE AXI DMAC DRIVER
 M:	Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
 S:	Maintained
@@ -23292,7 +23203,7 @@ F:	Documentation/devicetree/bindings/iio/adc/ti,lmp92064.yaml
 F:	drivers/iio/adc/ti-lmp92064.c
 TI PCM3060 ASoC CODEC DRIVER
 M:	Kirill Marinushkin <kmarinushkin@birdec.com>
 M:	Kirill Marinushkin <k.marinushkin@gmail.com>
 L:	linux-sound@vger.kernel.org
 S:	Maintained
 F:	Documentation/devicetree/bindings/sound/pcm3060.txt
@@ -23758,12 +23669,6 @@ L:	linux-input@vger.kernel.org
 S:	Maintained
 F:	drivers/hid/hid-udraw-ps3.c
 UFS FILESYSTEM
 M:	Evgeniy Dushistov <dushistov@mail.ru>
 S:	Maintained
 F:	Documentation/admin-guide/ufs.rst
 F:	fs/ufs/
 UHID USERSPACE HID IO DRIVER
 M:	David Rheinsberg <david@readahead.eu>
 L:	linux-input@vger.kernel.org
@@ -24066,6 +23971,7 @@ USB RAW GADGET DRIVER
 R:	Andrey Konovalov <andreyknvl@gmail.com>
 L:	linux-usb@vger.kernel.org
 S:	Maintained
 B:	https://github.com/xairy/raw-gadget/issues
 F:	Documentation/usb/raw-gadget.rst
 F:	drivers/usb/gadget/legacy/raw_gadget.c
 F:	include/uapi/linux/usb/raw_gadget.h
@@ -24737,9 +24643,10 @@ F:	tools/testing/vsock/
 VMA
 M:	Andrew Morton <akpm@linux-foundation.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
 M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
 R:	Vlastimil Babka <vbabka@suse.cz>
 R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
 R:	Jann Horn <jannh@google.com>
 L:	linux-mm@kvack.org
 S:	Maintained
 W:	https://www.linux-mm.org

									
										2

Makefile
									
												View File
												
				@@ -2,7 +2,7 @@

				VERSION = 6

				PATCHLEVEL = 12

				SUBLEVEL = 0

				EXTRAVERSION = -rc3

				EXTRAVERSION = -rc6

				NAME = Baby Opossum Posse

				# *DOCUMENTATION*

26

arch/Kconfig

View File

@@ -838,7 +838,7 @@ config CFI_CLANG
 config CFI_ICALL_NORMALIZE_INTEGERS
 	bool "Normalize CFI tags for integers"
 	depends on CFI_CLANG
 	depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS
 	depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG
 	help
 	  This option normalizes the CFI tags for integer types so that all
 	  integer types of the same size and signedness receive the same CFI
@@ -851,21 +851,19 @@ config CFI_ICALL_NORMALIZE_INTEGERS
 	  This option is necessary for using CFI with Rust. If unsure, say N.
 config HAVE_CFI_ICALL_NORMALIZE_INTEGERS
 	def_bool !GCOV_KERNEL && !KASAN
 	depends on CFI_CLANG
 config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG
 	def_bool y
 	depends on $(cc-option,-fsanitize=kcfi -fsanitize-cfi-icall-experimental-normalize-integers)
 	help
 	  Is CFI_ICALL_NORMALIZE_INTEGERS supported with the set of compilers
 	  currently in use?
 	# With GCOV/KASAN we need this fix: https://github.com/llvm/llvm-project/pull/104826
 	depends on CLANG_VERSION >= 190103 || (!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS)
 	  This option defaults to false if GCOV or KASAN is enabled, as there is
 	  an LLVM bug that makes normalized integers tags incompatible with
 	  KASAN and GCOV. Kconfig currently does not have the infrastructure to
 	  detect whether your rustc compiler contains the fix for this bug, so
 	  it is assumed that it doesn't. If your compiler has the fix, you can
 	  explicitly enable this option in your config file. The Kconfig logic
 	  needed to detect this will be added in a future kernel release.
 config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC
 	def_bool y
 	depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG
 	depends on RUSTC_VERSION >= 107900
 	# With GCOV/KASAN we need this fix: https://github.com/rust-lang/rust/pull/129373
 	depends on (RUSTC_LLVM_VERSION >= 190103 && RUSTC_VERSION >= 108200) || \
 		(!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS)
 config CFI_PERMISSIVE
 	bool "Use CFI in permissive mode"

2

arch/arm/boot/dts/broadcom/bcm2837-rpi-cm3-io3.dts

View File

@@ -77,7 +77,7 @@
 };
 &hdmi {
 	hpd-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
 	hpd-gpios = <&expgpio 0 GPIO_ACTIVE_LOW>;
 	power-domains = <&power RPI_POWER_DOMAIN_HDMI>;
 	status = "okay";
 };

2

arch/arm64/boot/dts/marvell/cn9130-sr-som.dtsi

View File

@@ -136,7 +136,7 @@
 		};
 		cp0_mdio_pins: cp0-mdio-pins {
 			marvell,pins = "mpp40", "mpp41";
 			marvell,pins = "mpp0", "mpp1";
 			marvell,function = "ge";
 		};

									
										1

arch/arm64/include/asm/kvm_asm.h
									
												View File
												
				@@ -178,6 +178,7 @@ struct kvm_nvhe_init_params {

					unsigned long hcr_el2;

					unsigned long vttbr;

					unsigned long vtcr;

					unsigned long tmp;

				};

				/*

									
										7

arch/arm64/include/asm/kvm_host.h
									
												View File
												
				@@ -51,6 +51,7 @@

				#define KVM_REQ_RELOAD_PMU	KVM_ARCH_REQ(5)

				#define KVM_REQ_SUSPEND		KVM_ARCH_REQ(6)

				#define KVM_REQ_RESYNC_PMU_EL0	KVM_ARCH_REQ(7)

				#define KVM_REQ_NESTED_S2_UNMAP	KVM_ARCH_REQ(8)

				#define KVM_DIRTY_LOG_MANUAL_CAPS   (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \

								     KVM_DIRTY_LOG_INITIALLY_SET)

				@@ -211,6 +212,12 @@ struct kvm_s2_mmu {

					 */

					bool	nested_stage2_enabled;

					/*

					 * true when this MMU needs to be unmapped before being used for a new

					 * purpose.

					 */

					bool	pending_unmap;

					/*

					 *  0: Nobody is currently using this, check vttbr for validity

					 * >0: Somebody is actively using this.

									
										3

arch/arm64/include/asm/kvm_mmu.h
									
												View File
												
				@@ -166,7 +166,8 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,

				int create_hyp_stack(phys_addr_t phys_addr, unsigned long *haddr);

				void __init free_hyp_pgds(void);

				void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size);

				void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start,

							    u64 size, bool may_block);

				void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end);

				void kvm_stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end);

									
										4

arch/arm64/include/asm/kvm_nested.h
									
												View File
												
				@@ -78,6 +78,8 @@ extern void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid,

				extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu);

				extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu);

				extern void check_nested_vcpu_requests(struct kvm_vcpu *vcpu);

				struct kvm_s2_trans {

					phys_addr_t output;

					unsigned long block_size;

				@@ -124,7 +126,7 @@ extern int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu,

								    struct kvm_s2_trans *trans);

				extern int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2);

				extern void kvm_nested_s2_wp(struct kvm *kvm);

				extern void kvm_nested_s2_unmap(struct kvm *kvm);

				extern void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block);

				extern void kvm_nested_s2_flush(struct kvm *kvm);

				unsigned long compute_tlb_inval_range(struct kvm_s2_mmu *mmu, u64 val);

									
										8

arch/arm64/include/asm/uprobes.h
									
												View File
												
				@@ -10,11 +10,9 @@

				#include <asm/insn.h>

				#include <asm/probes.h>

				#define MAX_UINSN_BYTES		AARCH64_INSN_SIZE

				#define UPROBE_SWBP_INSN	cpu_to_le32(BRK64_OPCODE_UPROBES)

				#define UPROBE_SWBP_INSN_SIZE	AARCH64_INSN_SIZE

				#define UPROBE_XOL_SLOT_BYTES	MAX_UINSN_BYTES

				#define UPROBE_XOL_SLOT_BYTES	AARCH64_INSN_SIZE

				typedef __le32 uprobe_opcode_t;

				@@ -23,8 +21,8 @@ struct arch_uprobe_task {

				struct arch_uprobe {

					union {

						u8 insn[MAX_UINSN_BYTES];

						u8 ixol[MAX_UINSN_BYTES];

						__le32 insn;

						__le32 ixol;

					};

					struct arch_probe_insn api;

					bool simulate;

									
										1

arch/arm64/kernel/asm-offsets.c
									
												View File
												
				@@ -146,6 +146,7 @@ int main(void)

				  DEFINE(NVHE_INIT_HCR_EL2,	offsetof(struct kvm_nvhe_init_params, hcr_el2));

				  DEFINE(NVHE_INIT_VTTBR,	offsetof(struct kvm_nvhe_init_params, vttbr));

				  DEFINE(NVHE_INIT_VTCR,	offsetof(struct kvm_nvhe_init_params, vtcr));

				  DEFINE(NVHE_INIT_TMP,		offsetof(struct kvm_nvhe_init_params, tmp));

				#endif

				#ifdef CONFIG_CPU_PM

				  DEFINE(CPU_CTX_SP,		offsetof(struct cpu_suspend_ctx, sp));

									
										16

arch/arm64/kernel/probes/decode-insn.c
									
												View File
												
				@@ -99,10 +99,6 @@ arm_probe_decode_insn(probe_opcode_t insn, struct arch_probe_insn *api)

					    aarch64_insn_is_blr(insn) ||

					    aarch64_insn_is_ret(insn)) {

						api->handler = simulate_br_blr_ret;

					} else if (aarch64_insn_is_ldr_lit(insn)) {

						api->handler = simulate_ldr_literal;

					} else if (aarch64_insn_is_ldrsw_lit(insn)) {

						api->handler = simulate_ldrsw_literal;

					} else {

						/*

						 * Instruction cannot be stepped out-of-line and we don't

				@@ -140,6 +136,17 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)

					probe_opcode_t insn = le32_to_cpu(*addr);

					probe_opcode_t *scan_end = NULL;

					unsigned long size = 0, offset = 0;

					struct arch_probe_insn *api = &asi->api;

					if (aarch64_insn_is_ldr_lit(insn)) {

						api->handler = simulate_ldr_literal;

						decoded = INSN_GOOD_NO_SLOT;

					} else if (aarch64_insn_is_ldrsw_lit(insn)) {

						api->handler = simulate_ldrsw_literal;

						decoded = INSN_GOOD_NO_SLOT;

					} else {

						decoded = arm_probe_decode_insn(insn, &asi->api);

					}

					/*

					 * If there's a symbol defined in front of and near enough to

				@@ -157,7 +164,6 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)

						else

							scan_end = addr - MAX_ATOMIC_CONTEXT_SIZE;

					}

					decoded = arm_probe_decode_insn(insn, &asi->api);

					if (decoded != INSN_REJECTED && scan_end)

						if (is_probed_address_atomic(addr - 1, scan_end))

									
										18

arch/arm64/kernel/probes/simulate-insn.c
									
												View File
												
				@@ -171,17 +171,15 @@ simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs)

				void __kprobes

				simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs)

				{

					u64 *load_addr;

					unsigned long load_addr;

					int xn = opcode & 0x1f;

					int disp;

					disp = ldr_displacement(opcode);

					load_addr = (u64 *) (addr + disp);

					load_addr = addr + ldr_displacement(opcode);

					if (opcode & (1 << 30))	/* x0-x30 */

						set_x_reg(regs, xn, *load_addr);

						set_x_reg(regs, xn, READ_ONCE(*(u64 *)load_addr));

					else			/* w0-w30 */

						set_w_reg(regs, xn, *load_addr);

						set_w_reg(regs, xn, READ_ONCE(*(u32 *)load_addr));

					instruction_pointer_set(regs, instruction_pointer(regs) + 4);

				}

				@@ -189,14 +187,12 @@ simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs)

				void __kprobes

				simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs)

				{

					s32 *load_addr;

					unsigned long load_addr;

					int xn = opcode & 0x1f;

					int disp;

					disp = ldr_displacement(opcode);

					load_addr = (s32 *) (addr + disp);

					load_addr = addr + ldr_displacement(opcode);

					set_x_reg(regs, xn, *load_addr);

					set_x_reg(regs, xn, READ_ONCE(*(s32 *)load_addr));

					instruction_pointer_set(regs, instruction_pointer(regs) + 4);

				}

									
										4

arch/arm64/kernel/probes/uprobes.c
									
												View File
												
				@@ -42,7 +42,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,

					else if (!IS_ALIGNED(addr, AARCH64_INSN_SIZE))

						return -EINVAL;

					insn = *(probe_opcode_t *)(&auprobe->insn[0]);

					insn = le32_to_cpu(auprobe->insn);

					switch (arm_probe_decode_insn(insn, &auprobe->api)) {

					case INSN_REJECTED:

				@@ -108,7 +108,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)

					if (!auprobe->simulate)

						return false;

					insn = *(probe_opcode_t *)(&auprobe->insn[0]);

					insn = le32_to_cpu(auprobe->insn);

					addr = instruction_pointer(regs);

					if (auprobe->api.handler)

									
										3

arch/arm64/kernel/process.c
									
												View File
												
				@@ -412,6 +412,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)

						p->thread.cpu_context.x19 = (unsigned long)args->fn;

						p->thread.cpu_context.x20 = (unsigned long)args->fn_arg;

						if (system_supports_poe())

							p->thread.por_el0 = POR_EL0_INIT;

					}

					p->thread.cpu_context.pc = (unsigned long)ret_from_fork;

					p->thread.cpu_context.sp = (unsigned long)childregs;

									
										92

arch/arm64/kernel/signal.c
									
												View File
												
				@@ -19,6 +19,7 @@

				#include <linux/ratelimit.h>

				#include <linux/rseq.h>

				#include <linux/syscalls.h>

				#include <linux/pkeys.h>

				#include <asm/daifflags.h>

				#include <asm/debug-monitors.h>

				@@ -66,10 +67,63 @@ struct rt_sigframe_user_layout {

					unsigned long end_offset;

				};

				/*

				 * Holds any EL0-controlled state that influences unprivileged memory accesses.

				 * This includes both accesses done in userspace and uaccess done in the kernel.

				 *

				 * This state needs to be carefully managed to ensure that it doesn't cause

				 * uaccess to fail when setting up the signal frame, and the signal handler

				 * itself also expects a well-defined state when entered.

				 */

				struct user_access_state {

					u64 por_el0;

				};

				#define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)

				#define TERMINATOR_SIZE round_up(sizeof(struct _aarch64_ctx), 16)

				#define EXTRA_CONTEXT_SIZE round_up(sizeof(struct extra_context), 16)

				/*

				 * Save the user access state into ua_state and reset it to disable any

				 * restrictions.

				 */

				static void save_reset_user_access_state(struct user_access_state *ua_state)

				{

					if (system_supports_poe()) {

						u64 por_enable_all = 0;

						for (int pkey = 0; pkey < arch_max_pkey(); pkey++)

							por_enable_all |= POE_RXW << (pkey * POR_BITS_PER_PKEY);

						ua_state->por_el0 = read_sysreg_s(SYS_POR_EL0);

						write_sysreg_s(por_enable_all, SYS_POR_EL0);

						/* Ensure that any subsequent uaccess observes the updated value */

						isb();

					}

				}

				/*

				 * Set the user access state for invoking the signal handler.

				 *

				 * No uaccess should be done after that function is called.

				 */

				static void set_handler_user_access_state(void)

				{

					if (system_supports_poe())

						write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);

				}

				/*

				 * Restore the user access state to the values saved in ua_state.

				 *

				 * No uaccess should be done after that function is called.

				 */

				static void restore_user_access_state(const struct user_access_state *ua_state)

				{

					if (system_supports_poe())

						write_sysreg_s(ua_state->por_el0, SYS_POR_EL0);

				}

				static void init_user_layout(struct rt_sigframe_user_layout *user)

				{

					const size_t reserved_size =

				@@ -261,18 +315,20 @@ static int restore_fpmr_context(struct user_ctxs *user)

					return err;

				}

				static int preserve_poe_context(struct poe_context __user *ctx)

				static int preserve_poe_context(struct poe_context __user *ctx,

								const struct user_access_state *ua_state)

				{

					int err = 0;

					__put_user_error(POE_MAGIC, &ctx->head.magic, err);

					__put_user_error(sizeof(*ctx), &ctx->head.size, err);

					__put_user_error(read_sysreg_s(SYS_POR_EL0), &ctx->por_el0, err);

					__put_user_error(ua_state->por_el0, &ctx->por_el0, err);

					return err;

				}

				static int restore_poe_context(struct user_ctxs *user)

				static int restore_poe_context(struct user_ctxs *user,

							       struct user_access_state *ua_state)

				{

					u64 por_el0;

					int err = 0;

				@@ -282,7 +338,7 @@ static int restore_poe_context(struct user_ctxs *user)

					__get_user_error(por_el0, &(user->poe->por_el0), err);

					if (!err)

						write_sysreg_s(por_el0, SYS_POR_EL0);

						ua_state->por_el0 = por_el0;

					return err;

				}

				@@ -850,7 +906,8 @@ invalid:

				}

				static int restore_sigframe(struct pt_regs *regs,

							    struct rt_sigframe __user *sf)

							    struct rt_sigframe __user *sf,

							    struct user_access_state *ua_state)

				{

					sigset_t set;

					int i, err;

				@@ -899,7 +956,7 @@ static int restore_sigframe(struct pt_regs *regs,

						err = restore_zt_context(&user);

					if (err == 0 && system_supports_poe() && user.poe)

						err = restore_poe_context(&user);

						err = restore_poe_context(&user, ua_state);

					return err;

				}

				@@ -908,6 +965,7 @@ SYSCALL_DEFINE0(rt_sigreturn)

				{

					struct pt_regs *regs = current_pt_regs();

					struct rt_sigframe __user *frame;

					struct user_access_state ua_state;

					/* Always make any pending restarted system calls return -EINTR */

					current->restart_block.fn = do_no_restart_syscall;

				@@ -924,12 +982,14 @@ SYSCALL_DEFINE0(rt_sigreturn)

					if (!access_ok(frame, sizeof (*frame)))

						goto badframe;

					if (restore_sigframe(regs, frame))

					if (restore_sigframe(regs, frame, &ua_state))

						goto badframe;

					if (restore_altstack(&frame->uc.uc_stack))

						goto badframe;

					restore_user_access_state(&ua_state);

					return regs->regs[0];

				badframe:

				@@ -1035,7 +1095,8 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,

				}

				static int setup_sigframe(struct rt_sigframe_user_layout *user,

							  struct pt_regs *regs, sigset_t *set)

							  struct pt_regs *regs, sigset_t *set,

							  const struct user_access_state *ua_state)

				{

					int i, err = 0;

					struct rt_sigframe __user *sf = user->sigframe;

				@@ -1097,10 +1158,9 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,

						struct poe_context __user *poe_ctx =

							apply_user_offset(user, user->poe_offset);

						err |= preserve_poe_context(poe_ctx);

						err |= preserve_poe_context(poe_ctx, ua_state);

					}

					/* ZA state if present */

					if (system_supports_sme() && err == 0 && user->za_offset) {

						struct za_context __user *za_ctx =

				@@ -1237,9 +1297,6 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,

						sme_smstop();

					}

					if (system_supports_poe())

						write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);

					if (ka->sa.sa_flags & SA_RESTORER)

						sigtramp = ka->sa.sa_restorer;

					else

				@@ -1253,6 +1310,7 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,

				{

					struct rt_sigframe_user_layout user;

					struct rt_sigframe __user *frame;

					struct user_access_state ua_state;

					int err = 0;

					fpsimd_signal_preserve_current_state();

				@@ -1260,13 +1318,14 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,

					if (get_sigframe(&user, ksig, regs))

						return 1;

					save_reset_user_access_state(&ua_state);

					frame = user.sigframe;

					__put_user_error(0, &frame->uc.uc_flags, err);

					__put_user_error(NULL, &frame->uc.uc_link, err);

					err |= __save_altstack(&frame->uc.uc_stack, regs->sp);

					err |= setup_sigframe(&user, regs, set);

					err |= setup_sigframe(&user, regs, set, &ua_state);

					if (err == 0) {

						setup_return(regs, &ksig->ka, &user, usig);

						if (ksig->ka.sa.sa_flags & SA_SIGINFO) {

				@@ -1276,6 +1335,11 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,

						}

					}

					if (err == 0)

						set_handler_user_access_state();

					else

						restore_user_access_state(&ua_state);

					return err;

				}

									
										5

arch/arm64/kvm/arm.c
									
												View File
												
				@@ -997,6 +997,9 @@ static int kvm_vcpu_suspend(struct kvm_vcpu *vcpu)

				static int check_vcpu_requests(struct kvm_vcpu *vcpu)

				{

					if (kvm_request_pending(vcpu)) {

						if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu))

							return -EIO;

						if (kvm_check_request(KVM_REQ_SLEEP, vcpu))

							kvm_vcpu_sleep(vcpu);

				@@ -1031,6 +1034,8 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)

						if (kvm_dirty_ring_check_request(vcpu))

							return 0;

						check_nested_vcpu_requests(vcpu);

					}

					return 1;

									
										52

arch/arm64/kvm/hyp/nvhe/hyp-init.S
									
												View File
												
				@@ -24,28 +24,25 @@

					.align	11

				SYM_CODE_START(__kvm_hyp_init)

					ventry	__invalid		// Synchronous EL2t

					ventry	__invalid		// IRQ EL2t

					ventry	__invalid		// FIQ EL2t

					ventry	__invalid		// Error EL2t

					ventry	.			// Synchronous EL2t

					ventry	.			// IRQ EL2t

					ventry	.			// FIQ EL2t

					ventry	.			// Error EL2t

					ventry	__invalid		// Synchronous EL2h

					ventry	__invalid		// IRQ EL2h

					ventry	__invalid		// FIQ EL2h

					ventry	__invalid		// Error EL2h

					ventry	.			// Synchronous EL2h

					ventry	.			// IRQ EL2h

					ventry	.			// FIQ EL2h

					ventry	.			// Error EL2h

					ventry	__do_hyp_init		// Synchronous 64-bit EL1

					ventry	__invalid		// IRQ 64-bit EL1

					ventry	__invalid		// FIQ 64-bit EL1

					ventry	__invalid		// Error 64-bit EL1

					ventry	.			// IRQ 64-bit EL1

					ventry	.			// FIQ 64-bit EL1

					ventry	.			// Error 64-bit EL1

					ventry	__invalid		// Synchronous 32-bit EL1

					ventry	__invalid		// IRQ 32-bit EL1

					ventry	__invalid		// FIQ 32-bit EL1

					ventry	__invalid		// Error 32-bit EL1

				__invalid:

					b	.

					ventry	.			// Synchronous 32-bit EL1

					ventry	.			// IRQ 32-bit EL1

					ventry	.			// FIQ 32-bit EL1

					ventry	.			// Error 32-bit EL1

					/*

					 * Only uses x0..x3 so as to not clobber callee-saved SMCCC registers.

				@@ -76,6 +73,13 @@ __do_hyp_init:

					eret

				SYM_CODE_END(__kvm_hyp_init)

				SYM_CODE_START_LOCAL(__kvm_init_el2_state)

					/* Initialize EL2 CPU state to sane values. */

					init_el2_state				// Clobbers x0..x2

					finalise_el2_state

					ret

				SYM_CODE_END(__kvm_init_el2_state)

				/*

				 * Initialize the hypervisor in EL2.

				 *

				@@ -102,9 +106,12 @@ SYM_CODE_START_LOCAL(___kvm_hyp_init)

					// TPIDR_EL2 is used to preserve x0 across the macro maze...

					isb

					msr	tpidr_el2, x0

					init_el2_state

					finalise_el2_state

					str	lr, [x0, #NVHE_INIT_TMP]

					bl	__kvm_init_el2_state

					mrs	x0, tpidr_el2

					ldr	lr, [x0, #NVHE_INIT_TMP]

				1:

					ldr	x1, [x0, #NVHE_INIT_TPIDR_EL2]

				@@ -199,9 +206,8 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu)

				2:	msr	SPsel, #1			// We want to use SP_EL{1,2}

					/* Initialize EL2 CPU state to sane values. */

					init_el2_state				// Clobbers x0..x2

					finalise_el2_state

					bl	__kvm_init_el2_state

					__init_el2_nvhe_prepare_eret

					/* Enable MMU, set vectors and stack. */

									
										12

arch/arm64/kvm/hypercalls.c
									
												View File
												
				@@ -317,7 +317,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu)

								 * to the guest, and hide SSBS so that the

								 * guest stays protected.

								 */

								if (cpus_have_final_cap(ARM64_SSBS))

								if (kvm_has_feat(vcpu->kvm, ID_AA64PFR1_EL1, SSBS, IMP))

									break;

								fallthrough;

							case SPECTRE_UNAFFECTED:

				@@ -428,7 +428,7 @@ int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)

				 * Convert the workaround level into an easy-to-compare number, where higher

				 * values mean better protection.

				 */

				static int get_kernel_wa_level(u64 regid)

				static int get_kernel_wa_level(struct kvm_vcpu *vcpu, u64 regid)

				{

					switch (regid) {

					case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:

				@@ -449,7 +449,7 @@ static int get_kernel_wa_level(u64 regid)

							 * don't have any FW mitigation if SSBS is there at

							 * all times.

							 */

							if (cpus_have_final_cap(ARM64_SSBS))

							if (kvm_has_feat(vcpu->kvm, ID_AA64PFR1_EL1, SSBS, IMP))

								return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;

							fallthrough;

						case SPECTRE_UNAFFECTED:

				@@ -486,7 +486,7 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)

					case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:

					case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:

					case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:

						val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;

						val = get_kernel_wa_level(vcpu, reg->id) & KVM_REG_FEATURE_LEVEL_MASK;

						break;

					case KVM_REG_ARM_STD_BMAP:

						val = READ_ONCE(smccc_feat->std_bmap);

				@@ -588,7 +588,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)

						if (val & ~KVM_REG_FEATURE_LEVEL_MASK)

							return -EINVAL;

						if (get_kernel_wa_level(reg->id) < val)

						if (get_kernel_wa_level(vcpu, reg->id) < val)

							return -EINVAL;

						return 0;

				@@ -624,7 +624,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)

						 * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the

						 * other way around.

						 */

						if (get_kernel_wa_level(reg->id) < wa_level)

						if (get_kernel_wa_level(vcpu, reg->id) < wa_level)

							return -EINVAL;

						return 0;

									
										15

arch/arm64/kvm/mmu.c
									
												View File
												
				@@ -328,9 +328,10 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64

								   may_block));

				}

				void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size)

				void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start,

							    u64 size, bool may_block)

				{

					__unmap_stage2_range(mmu, start, size, true);

					__unmap_stage2_range(mmu, start, size, may_block);

				}

				void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end)

				@@ -1015,7 +1016,7 @@ static void stage2_unmap_memslot(struct kvm *kvm,

						if (!(vma->vm_flags & VM_PFNMAP)) {

							gpa_t gpa = addr + (vm_start - memslot->userspace_addr);

							kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, vm_end - vm_start);

							kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, vm_end - vm_start, true);

						}

						hva = vm_end;

					} while (hva < reg_end);

				@@ -1042,7 +1043,7 @@ void stage2_unmap_vm(struct kvm *kvm)

					kvm_for_each_memslot(memslot, bkt, slots)

						stage2_unmap_memslot(kvm, memslot);

					kvm_nested_s2_unmap(kvm);

					kvm_nested_s2_unmap(kvm, true);

					write_unlock(&kvm->mmu_lock);

					mmap_read_unlock(current->mm);

				@@ -1912,7 +1913,7 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)

							     (range->end - range->start) << PAGE_SHIFT,

							     range->may_block);

					kvm_nested_s2_unmap(kvm);

					kvm_nested_s2_unmap(kvm, range->may_block);

					return false;

				}

				@@ -2179,8 +2180,8 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,

					phys_addr_t size = slot->npages << PAGE_SHIFT;

					write_lock(&kvm->mmu_lock);

					kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size);

					kvm_nested_s2_unmap(kvm);

					kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size, true);

					kvm_nested_s2_unmap(kvm, true);

					write_unlock(&kvm->mmu_lock);

				}

									
										53

arch/arm64/kvm/nested.c
									
												View File
												
				@@ -632,9 +632,9 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)

					/* Set the scene for the next search */

					kvm->arch.nested_mmus_next = (i + 1) % kvm->arch.nested_mmus_size;

					/* Clear the old state */

					/* Make sure we don't forget to do the laundry */

					if (kvm_s2_mmu_valid(s2_mmu))

						kvm_stage2_unmap_range(s2_mmu, 0, kvm_phys_size(s2_mmu));

						s2_mmu->pending_unmap = true;

					/*

					 * The virtual VMID (modulo CnP) will be used as a key when matching

				@@ -650,6 +650,16 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)

				out:

					atomic_inc(&s2_mmu->refcnt);

					/*

					 * Set the vCPU request to perform an unmap, even if the pending unmap

					 * originates from another vCPU. This guarantees that the MMU has been

					 * completely unmapped before any vCPU actually uses it, and allows

					 * multiple vCPUs to lend a hand with completing the unmap.

					 */

					if (s2_mmu->pending_unmap)

						kvm_make_request(KVM_REQ_NESTED_S2_UNMAP, vcpu);

					return s2_mmu;

				}

				@@ -663,6 +673,13 @@ void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu)

				void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu)

				{

					/*

					 * The vCPU kept its reference on the MMU after the last put, keep

					 * rolling with it.

					 */

					if (vcpu->arch.hw_mmu)

						return;

					if (is_hyp_ctxt(vcpu)) {

						vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;

					} else {

				@@ -674,10 +691,18 @@ void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu)

				void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu)

				{

					if (kvm_is_nested_s2_mmu(vcpu->kvm, vcpu->arch.hw_mmu)) {

					/*

					 * Keep a reference on the associated stage-2 MMU if the vCPU is

					 * scheduling out and not in WFI emulation, suggesting it is likely to

					 * reuse the MMU sometime soon.

					 */

					if (vcpu->scheduled_out && !vcpu_get_flag(vcpu, IN_WFI))

						return;

					if (kvm_is_nested_s2_mmu(vcpu->kvm, vcpu->arch.hw_mmu))

						atomic_dec(&vcpu->arch.hw_mmu->refcnt);

						vcpu->arch.hw_mmu = NULL;

					}

					vcpu->arch.hw_mmu = NULL;

				}

				/*

				@@ -730,7 +755,7 @@ void kvm_nested_s2_wp(struct kvm *kvm)

					}

				}

				void kvm_nested_s2_unmap(struct kvm *kvm)

				void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block)

				{

					int i;

				@@ -740,7 +765,7 @@ void kvm_nested_s2_unmap(struct kvm *kvm)

						struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];

						if (kvm_s2_mmu_valid(mmu))

							kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu));

							kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), may_block);

					}

				}

				@@ -1184,3 +1209,17 @@ int kvm_init_nv_sysregs(struct kvm *kvm)

					return 0;

				}

				void check_nested_vcpu_requests(struct kvm_vcpu *vcpu)

				{

					if (kvm_check_request(KVM_REQ_NESTED_S2_UNMAP, vcpu)) {

						struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu;

						write_lock(&vcpu->kvm->mmu_lock);

						if (mmu->pending_unmap) {

							kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), true);

							mmu->pending_unmap = false;

						}

						write_unlock(&vcpu->kvm->mmu_lock);

					}

				}

									
										77

arch/arm64/kvm/sys_regs.c
									
												View File
												
				@@ -1527,6 +1527,14 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,

							val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE);

						val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SME);

						val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_RNDR_trap);

						val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_NMI);

						val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE_frac);

						val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_GCS);

						val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_THE);

						val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTEX);

						val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_DF2);

						val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_PFAR);

						break;

					case SYS_ID_AA64PFR2_EL1:

						/* We only expose FPMR */

				@@ -1550,7 +1558,8 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,

						val &= ~ID_AA64MMFR2_EL1_CCIDX_MASK;

						break;

					case SYS_ID_AA64MMFR3_EL1:

						val &= ID_AA64MMFR3_EL1_TCRX | ID_AA64MMFR3_EL1_S1POE;

						val &= ID_AA64MMFR3_EL1_TCRX | ID_AA64MMFR3_EL1_S1POE |

							ID_AA64MMFR3_EL1_S1PIE;

						break;

					case SYS_ID_MMFR4_EL1:

						val &= ~ARM64_FEATURE_MASK(ID_MMFR4_EL1_CCIDX);

				@@ -1985,7 +1994,7 @@ static u64 reset_clidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)

					 * one cache line.

					 */

					if (kvm_has_mte(vcpu->kvm))

						clidr |= 2 << CLIDR_TTYPE_SHIFT(loc);

						clidr |= 2ULL << CLIDR_TTYPE_SHIFT(loc);

					__vcpu_sys_reg(vcpu, r->reg) = clidr;

				@@ -2376,7 +2385,19 @@ static const struct sys_reg_desc sys_reg_descs[] = {

						   ID_AA64PFR0_EL1_RAS |

						   ID_AA64PFR0_EL1_AdvSIMD |

						   ID_AA64PFR0_EL1_FP), },

					ID_SANITISED(ID_AA64PFR1_EL1),

					ID_WRITABLE(ID_AA64PFR1_EL1, ~(ID_AA64PFR1_EL1_PFAR |

								       ID_AA64PFR1_EL1_DF2 |

								       ID_AA64PFR1_EL1_MTEX |

								       ID_AA64PFR1_EL1_THE |

								       ID_AA64PFR1_EL1_GCS |

								       ID_AA64PFR1_EL1_MTE_frac |

								       ID_AA64PFR1_EL1_NMI |

								       ID_AA64PFR1_EL1_RNDR_trap |

								       ID_AA64PFR1_EL1_SME |

								       ID_AA64PFR1_EL1_RES0 |

								       ID_AA64PFR1_EL1_MPAM_frac |

								       ID_AA64PFR1_EL1_RAS_frac |

								       ID_AA64PFR1_EL1_MTE)),

					ID_WRITABLE(ID_AA64PFR2_EL1, ID_AA64PFR2_EL1_FPMR),

					ID_UNALLOCATED(4,3),

					ID_WRITABLE(ID_AA64ZFR0_EL1, ~ID_AA64ZFR0_EL1_RES0),

				@@ -2390,7 +2411,21 @@ static const struct sys_reg_desc sys_reg_descs[] = {

					  .get_user = get_id_reg,

					  .set_user = set_id_aa64dfr0_el1,

					  .reset = read_sanitised_id_aa64dfr0_el1,

					  .val = ID_AA64DFR0_EL1_PMUVer_MASK |

					/*

					 * Prior to FEAT_Debugv8.9, the architecture defines context-aware

					 * breakpoints (CTX_CMPs) as the highest numbered breakpoints (BRPs).

					 * KVM does not trap + emulate the breakpoint registers, and as such

					 * cannot support a layout that misaligns with the underlying hardware.

					 * While it may be possible to describe a subset that aligns with

					 * hardware, just prevent changes to BRPs and CTX_CMPs altogether for

					 * simplicity.

					 *

					 * See DDI0487K.a, section D2.8.3 Breakpoint types and linking

					 * of breakpoints for more details.

					 */

					  .val = ID_AA64DFR0_EL1_DoubleLock_MASK |

						 ID_AA64DFR0_EL1_WRPs_MASK |

						 ID_AA64DFR0_EL1_PMUVer_MASK |

						 ID_AA64DFR0_EL1_DebugVer_MASK, },

					ID_SANITISED(ID_AA64DFR1_EL1),

					ID_UNALLOCATED(5,2),

				@@ -2433,6 +2468,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {

									ID_AA64MMFR2_EL1_NV |

									ID_AA64MMFR2_EL1_CCIDX)),

					ID_WRITABLE(ID_AA64MMFR3_EL1, (ID_AA64MMFR3_EL1_TCRX	|

								       ID_AA64MMFR3_EL1_S1PIE   |

								       ID_AA64MMFR3_EL1_S1POE)),

					ID_SANITISED(ID_AA64MMFR4_EL1),

					ID_UNALLOCATED(7,5),

				@@ -2903,7 +2939,7 @@ static bool handle_alle1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p,

					 * Drop all shadow S2s, resulting in S1/S2 TLBIs for each of the

					 * corresponding VMIDs.

					 */

					kvm_nested_s2_unmap(vcpu->kvm);

					kvm_nested_s2_unmap(vcpu->kvm, true);

					write_unlock(&vcpu->kvm->mmu_lock);

				@@ -2955,7 +2991,30 @@ union tlbi_info {

				static void s2_mmu_unmap_range(struct kvm_s2_mmu *mmu,

							       const union tlbi_info *info)

				{

					kvm_stage2_unmap_range(mmu, info->range.start, info->range.size);

					/*

					 * The unmap operation is allowed to drop the MMU lock and block, which

					 * means that @mmu could be used for a different context than the one

					 * currently being invalidated.

					 *

					 * This behavior is still safe, as:

					 *

					 *  1) The vCPU(s) that recycled the MMU are responsible for invalidating

					 *     the entire MMU before reusing it, which still honors the intent

					 *     of a TLBI.

					 *

					 *  2) Until the guest TLBI instruction is 'retired' (i.e. increment PC

					 *     and ERET to the guest), other vCPUs are allowed to use stale

					 *     translations.

					 *

					 *  3) Accidentally unmapping an unrelated MMU context is nonfatal, and

					 *     at worst may cause more aborts for shadow stage-2 fills.

					 *

					 * Dropping the MMU lock also implies that shadow stage-2 fills could

					 * happen behind the back of the TLBI. This is still safe, though, as

					 * the L1 needs to put its stage-2 in a consistent state before doing

					 * the TLBI.

					 */

					kvm_stage2_unmap_range(mmu, info->range.start, info->range.size, true);

				}

				static bool handle_vmalls12e1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p,

				@@ -3050,7 +3109,11 @@ static void s2_mmu_unmap_ipa(struct kvm_s2_mmu *mmu,

					max_size = compute_tlb_inval_range(mmu, info->ipa.addr);

					base_addr &= ~(max_size - 1);

					kvm_stage2_unmap_range(mmu, base_addr, max_size);

					/*

					 * See comment in s2_mmu_unmap_range() for why this is allowed to

					 * reschedule.

					 */

					kvm_stage2_unmap_range(mmu, base_addr, max_size, true);

				}

				static bool handle_ipas2e1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p,

									
										41

arch/arm64/kvm/vgic/vgic-init.c
									
												View File
												
				@@ -417,8 +417,28 @@ static void __kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)

					kfree(vgic_cpu->private_irqs);

					vgic_cpu->private_irqs = NULL;

					if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)

					if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {

						/*

						 * If this vCPU is being destroyed because of a failed creation

						 * then unregister the redistributor to avoid leaving behind a

						 * dangling pointer to the vCPU struct.

						 *

						 * vCPUs that have been successfully created (i.e. added to

						 * kvm->vcpu_array) get unregistered in kvm_vgic_destroy(), as

						 * this function gets called while holding kvm->arch.config_lock

						 * in the VM teardown path and would otherwise introduce a lock

						 * inversion w.r.t. kvm->srcu.

						 *

						 * vCPUs that failed creation are torn down outside of the

						 * kvm->arch.config_lock and do not get unregistered in

						 * kvm_vgic_destroy(), meaning it is both safe and necessary to

						 * do so here.

						 */

						if (kvm_get_vcpu_by_id(vcpu->kvm, vcpu->vcpu_id) != vcpu)

							vgic_unregister_redist_iodev(vcpu);

						vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;

					}

				}

				void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)

				@@ -524,22 +544,31 @@ int kvm_vgic_map_resources(struct kvm *kvm)

					if (ret)

						goto out;

					dist->ready = true;

					dist_base = dist->vgic_dist_base;

					mutex_unlock(&kvm->arch.config_lock);

					ret = vgic_register_dist_iodev(kvm, dist_base, type);

					if (ret)

					if (ret) {

						kvm_err("Unable to register VGIC dist MMIO regions\n");

						goto out_slots;

					}

					/*

					 * kvm_io_bus_register_dev() guarantees all readers see the new MMIO

					 * registration before returning through synchronize_srcu(), which also

					 * implies a full memory barrier. As such, marking the distributor as

					 * 'ready' here is guaranteed to be ordered after all vCPUs having seen

					 * a completely configured distributor.

					 */

					dist->ready = true;

					goto out_slots;

				out:

					mutex_unlock(&kvm->arch.config_lock);

				out_slots:

					mutex_unlock(&kvm->slots_lock);

					if (ret)

						kvm_vgic_destroy(kvm);

						kvm_vm_dead(kvm);

					mutex_unlock(&kvm->slots_lock);

					return ret;

				}

									
										7

arch/arm64/kvm/vgic/vgic-kvm-device.c
									
												View File
												
				@@ -236,7 +236,12 @@ static int vgic_set_common_attr(struct kvm_device *dev,

						mutex_lock(&dev->kvm->arch.config_lock);

						if (vgic_ready(dev->kvm) || dev->kvm->arch.vgic.nr_spis)

						/*

						 * Either userspace has already configured NR_IRQS or

						 * the vgic has already been initialized and vgic_init()

						 * supplied a default amount of SPIs.

						 */

						if (dev->kvm->arch.vgic.nr_spis)

							ret = -EBUSY;

						else

							dev->kvm->arch.vgic.nr_spis =

									
										12

arch/arm64/net/bpf_jit_comp.c
									
												View File
												
				@@ -2220,7 +2220,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,

					emit(A64_STR64I(A64_R(20), A64_SP, regs_off + 8), ctx);

					if (flags & BPF_TRAMP_F_CALL_ORIG) {

						emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);

						/* for the first pass, assume the worst case */

						if (!ctx->image)

							ctx->idx += 4;

						else

							emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);

						emit_call((const u64)__bpf_tramp_enter, ctx);

					}

				@@ -2264,7 +2268,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,

					if (flags & BPF_TRAMP_F_CALL_ORIG) {

						im->ip_epilogue = ctx->ro_image + ctx->idx;

						emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);

						/* for the first pass, assume the worst case */

						if (!ctx->image)

							ctx->idx += 4;

						else

							emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);

						emit_call((const u64)__bpf_tramp_exit, ctx);

					}

									
										4

arch/loongarch/include/asm/bootinfo.h
									
												View File
												
				@@ -26,6 +26,10 @@ struct loongson_board_info {

				#define NR_WORDS DIV_ROUND_UP(NR_CPUS, BITS_PER_LONG)

				/*

				 * The "core" of cores_per_node and cores_per_package stands for a

				 * logical core, which means in a SMT system it stands for a thread.

				 */

				struct loongson_system_configuration {

					int nr_cpus;

					int nr_nodes;

									
										2

arch/loongarch/include/asm/kasan.h
									
												View File
												
				@@ -16,7 +16,7 @@

				#define XRANGE_SHIFT (48)

				/* Valid address length */

				#define XRANGE_SHADOW_SHIFT	(PGDIR_SHIFT + PAGE_SHIFT - 3)

				#define XRANGE_SHADOW_SHIFT	min(cpu_vabits, VA_BITS)

				/* Used for taking out the valid address */

				#define XRANGE_SHADOW_MASK	GENMASK_ULL(XRANGE_SHADOW_SHIFT - 1, 0)

				/* One segment whole address space size */

									
										2

arch/loongarch/include/asm/loongarch.h
									
												View File
												
				@@ -250,7 +250,7 @@

				#define  CSR_ESTAT_IS_WIDTH		15

				#define  CSR_ESTAT_IS			(_ULCAST_(0x7fff) << CSR_ESTAT_IS_SHIFT)

				#define LOONGARCH_CSR_ERA		0x6	/* ERA */

				#define LOONGARCH_CSR_ERA		0x6	/* Exception return address */

				#define LOONGARCH_CSR_BADV		0x7	/* Bad virtual address */

									
										11

arch/loongarch/include/asm/pgalloc.h
									
												View File
												
				@@ -10,6 +10,7 @@

				#define __HAVE_ARCH_PMD_ALLOC_ONE

				#define __HAVE_ARCH_PUD_ALLOC_ONE

				#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL

				#include <asm-generic/pgalloc.h>

				static inline void pmd_populate_kernel(struct mm_struct *mm,

				@@ -44,6 +45,16 @@ extern void pagetable_init(void);

				extern pgd_t *pgd_alloc(struct mm_struct *mm);

				static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)

				{

					pte_t *pte = __pte_alloc_one_kernel(mm);

					if (pte)

						kernel_pte_init(pte);

					return pte;

				}

				#define __pte_free_tlb(tlb, pte, address)			\

				do {								\

					pagetable_pte_dtor(page_ptdesc(pte));			\

									
										35

arch/loongarch/include/asm/pgtable.h
									
												View File
												
				@@ -269,6 +269,7 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pm

				extern void pgd_init(void *addr);

				extern void pud_init(void *addr);

				extern void pmd_init(void *addr);

				extern void kernel_pte_init(void *addr);

				/*

				 * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that

				@@ -325,39 +326,17 @@ static inline void set_pte(pte_t *ptep, pte_t pteval)

				{

					WRITE_ONCE(*ptep, pteval);

					if (pte_val(pteval) & _PAGE_GLOBAL) {

						pte_t *buddy = ptep_buddy(ptep);

						/*

						 * Make sure the buddy is global too (if it's !none,

						 * it better already be global)

						 */

						if (pte_none(ptep_get(buddy))) {

				#ifdef CONFIG_SMP

							/*

							 * For SMP, multiple CPUs can race, so we need

							 * to do this atomically.

							 */

							__asm__ __volatile__(

							__AMOR "$zero, %[global], %[buddy] \n"

							: [buddy] "+ZB" (buddy->pte)

							: [global] "r" (_PAGE_GLOBAL)

							: "memory");

							DBAR(0b11000); /* o_wrw = 0b11000 */

				#else /* !CONFIG_SMP */

							WRITE_ONCE(*buddy, __pte(pte_val(ptep_get(buddy)) | _PAGE_GLOBAL));

				#endif /* CONFIG_SMP */

						}

					}

					if (pte_val(pteval) & _PAGE_GLOBAL)

						DBAR(0b11000); /* o_wrw = 0b11000 */

				#endif

				}

				static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)

				{

					/* Preserve global status for the pair */

					if (pte_val(ptep_get(ptep_buddy(ptep))) & _PAGE_GLOBAL)

						set_pte(ptep, __pte(_PAGE_GLOBAL));

					else

						set_pte(ptep, __pte(0));

					pte_t pte = ptep_get(ptep);

					pte_val(pte) &= _PAGE_GLOBAL;

					set_pte(ptep, pte);

				}

				#define PGD_T_LOG2	(__builtin_ffs(sizeof(pgd_t)) - 1)

									
										14

arch/loongarch/kernel/process.c
									
												View File
												
				@@ -293,13 +293,15 @@ unsigned long stack_top(void)

				{

					unsigned long top = TASK_SIZE & PAGE_MASK;

					/* Space for the VDSO & data page */

					top -= PAGE_ALIGN(current->thread.vdso->size);

					top -= VVAR_SIZE;

					if (current->thread.vdso) {

						/* Space for the VDSO & data page */

						top -= PAGE_ALIGN(current->thread.vdso->size);

						top -= VVAR_SIZE;

					/* Space to randomize the VDSO base */

					if (current->flags & PF_RANDOMIZE)

						top -= VDSO_RANDOMIZE_SIZE;

						/* Space to randomize the VDSO base */

						if (current->flags & PF_RANDOMIZE)

							top -= VDSO_RANDOMIZE_SIZE;

					}

					return top;

				}

									
										3

arch/loongarch/kernel/setup.c
									
												View File
												
				@@ -55,6 +55,7 @@

				#define SMBIOS_FREQHIGH_OFFSET		0x17

				#define SMBIOS_FREQLOW_MASK		0xFF

				#define SMBIOS_CORE_PACKAGE_OFFSET	0x23

				#define SMBIOS_THREAD_PACKAGE_OFFSET	0x25

				#define LOONGSON_EFI_ENABLE		(1 << 3)

				unsigned long fw_arg0, fw_arg1, fw_arg2;

				@@ -125,7 +126,7 @@ static void __init parse_cpu_table(const struct dmi_header *dm)

					cpu_clock_freq = freq_temp * 1000000;

					loongson_sysconf.cpuname = (void *)dmi_string_parse(dm, dmi_data[16]);

					loongson_sysconf.cores_per_package = *(dmi_data + SMBIOS_CORE_PACKAGE_OFFSET);

					loongson_sysconf.cores_per_package = *(dmi_data + SMBIOS_THREAD_PACKAGE_OFFSET);

					pr_info("CpuClock = %llu\n", cpu_clock_freq);

				}

									
										5

arch/loongarch/kernel/traps.c
									
												View File
												
				@@ -555,6 +555,9 @@ asmlinkage void noinstr do_ale(struct pt_regs *regs)

				#else

					unsigned int *pc;

					if (regs->csr_prmd & CSR_PRMD_PIE)

						local_irq_enable();

					perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, regs->csr_badvaddr);

					/*

				@@ -579,6 +582,8 @@ sigbus:

					die_if_kernel("Kernel ale access", regs);

					force_sig_fault(SIGBUS, BUS_ADRALN, (void __user *)regs->csr_badvaddr);

				out:

					if (regs->csr_prmd & CSR_PRMD_PIE)

						local_irq_disable();

				#endif

					irqentry_exit(regs, state);

				}

									
										8

arch/loongarch/kernel/vdso.c
									
												View File
												
				@@ -34,7 +34,6 @@ static union {

					struct loongarch_vdso_data vdata;

				} loongarch_vdso_data __page_aligned_data;

				static struct page *vdso_pages[] = { NULL };

				struct vdso_data *vdso_data = generic_vdso_data.data;

				struct vdso_pcpu_data *vdso_pdata = loongarch_vdso_data.vdata.pdata;

				struct vdso_rng_data *vdso_rng_data = &loongarch_vdso_data.vdata.rng_data;

				@@ -85,10 +84,8 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,

				struct loongarch_vdso_info vdso_info = {

					.vdso = vdso_start,

					.size = PAGE_SIZE,

					.code_mapping = {

						.name = "[vdso]",

						.pages = vdso_pages,

						.mremap = vdso_mremap,

					},

					.data_mapping = {

				@@ -103,11 +100,14 @@ static int __init init_vdso(void)

					unsigned long i, cpu, pfn;

					BUG_ON(!PAGE_ALIGNED(vdso_info.vdso));

					BUG_ON(!PAGE_ALIGNED(vdso_info.size));

					for_each_possible_cpu(cpu)

						vdso_pdata[cpu].node = cpu_to_node(cpu);

					vdso_info.size = PAGE_ALIGN(vdso_end - vdso_start);

					vdso_info.code_mapping.pages =

						kcalloc(vdso_info.size / PAGE_SIZE, sizeof(struct page *), GFP_KERNEL);

					pfn = __phys_to_pfn(__pa_symbol(vdso_info.vdso));

					for (i = 0; i < vdso_info.size / PAGE_SIZE; i++)

						vdso_info.code_mapping.pages[i] = pfn_to_page(pfn + i);

									
										7

arch/loongarch/kvm/timer.c
									
												View File
												
				@@ -161,10 +161,11 @@ static void _kvm_save_timer(struct kvm_vcpu *vcpu)

					if (kvm_vcpu_is_blocking(vcpu)) {

						/*

						 * HRTIMER_MODE_PINNED is suggested since vcpu may run in

						 * the same physical cpu in next time

						 * HRTIMER_MODE_PINNED_HARD is suggested since vcpu may run in

						 * the same physical cpu in next time, and the timer should run

						 * in hardirq context even in the PREEMPT_RT case.

						 */

						hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED);

						hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED_HARD);

					}

				}

									
										2

arch/loongarch/kvm/vcpu.c
									
												View File
												
				@@ -1457,7 +1457,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)

					vcpu->arch.vpid = 0;

					vcpu->arch.flush_gpa = INVALID_GPA;

					hrtimer_init(&vcpu->arch.swtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED);

					hrtimer_init(&vcpu->arch.swtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED_HARD);

					vcpu->arch.swtimer.function = kvm_swtimer_wakeup;

					vcpu->arch.handle_exit = kvm_handle_exit;

									
										2

arch/loongarch/mm/init.c
									
												View File
												
				@@ -201,7 +201,9 @@ pte_t * __init populate_kernel_pte(unsigned long addr)

						pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);

						if (!pte)

							panic("%s: Failed to allocate memory\n", __func__);

						pmd_populate_kernel(&init_mm, pmd, pte);

						kernel_pte_init(pte);

					}

					return pte_offset_kernel(pmd, addr);

									
										20

arch/loongarch/mm/pgtable.c
									
												View File
												
				@@ -116,6 +116,26 @@ void pud_init(void *addr)

				EXPORT_SYMBOL_GPL(pud_init);

				#endif

				void kernel_pte_init(void *addr)

				{

					unsigned long *p, *end;

					p = (unsigned long *)addr;

					end = p + PTRS_PER_PTE;

					do {

						p[0] = _PAGE_GLOBAL;

						p[1] = _PAGE_GLOBAL;

						p[2] = _PAGE_GLOBAL;

						p[3] = _PAGE_GLOBAL;

						p[4] = _PAGE_GLOBAL;

						p += 8;

						p[-3] = _PAGE_GLOBAL;

						p[-2] = _PAGE_GLOBAL;

						p[-1] = _PAGE_GLOBAL;

					} while (p != end);

				}

				pmd_t mk_pmd(struct page *page, pgprot_t prot)

				{

					pmd_t pmd;

									
										1

arch/mips/kernel/cmpxchg.c
									
												View File
												
				@@ -102,3 +102,4 @@ unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,

							return old;

					}

				}

				EXPORT_SYMBOL(__cmpxchg_small);

									
										1

arch/powerpc/platforms/powernv/opal-irqchip.c
									
												View File
												
				@@ -282,6 +282,7 @@ int __init opal_event_init(void)

								 name, NULL);

						if (rc) {

							pr_warn("Error %d requesting OPAL irq %d\n", rc, (int)r->start);

							kfree(name);

							continue;

						}

					}

2

arch/riscv/Kconfig

View File

@@ -177,7 +177,7 @@ config RISCV
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RETHOOK if !XIP_KERNEL
 	select HAVE_RSEQ
 	select HAVE_RUST if RUSTC_SUPPORTS_RISCV
 	select HAVE_RUST if RUSTC_SUPPORTS_RISCV && CC_IS_CLANG
 	select HAVE_SAMPLE_FTRACE_DIRECT
 	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	select HAVE_STACKPROTECTOR

									
										6

arch/riscv/errata/Makefile
									
												View File
												
				@@ -2,6 +2,12 @@ ifdef CONFIG_RELOCATABLE

				KBUILD_CFLAGS += -fno-pie

				endif

				ifdef CONFIG_RISCV_ALTERNATIVE_EARLY

				ifdef CONFIG_FORTIFY_SOURCE

				KBUILD_CFLAGS += -D__NO_FORTIFY

				endif

				endif

				obj-$(CONFIG_ERRATA_ANDES) += andes/

				obj-$(CONFIG_ERRATA_SIFIVE) += sifive/

				obj-$(CONFIG_ERRATA_THEAD) += thead/

									
										5

arch/riscv/kernel/Makefile
									
												View File
												
				@@ -36,6 +36,11 @@ KASAN_SANITIZE_alternative.o := n

				KASAN_SANITIZE_cpufeature.o := n

				KASAN_SANITIZE_sbi_ecall.o := n

				endif

				ifdef CONFIG_FORTIFY_SOURCE

				CFLAGS_alternative.o += -D__NO_FORTIFY

				CFLAGS_cpufeature.o += -D__NO_FORTIFY

				CFLAGS_sbi_ecall.o += -D__NO_FORTIFY

				endif

				endif

				extra-y += vmlinux.lds

									
										4

arch/riscv/kernel/acpi.c
									
												View File
												
				@@ -210,7 +210,7 @@ void __init __iomem *__acpi_map_table(unsigned long phys, unsigned long size)

					if (!size)

						return NULL;

					return early_ioremap(phys, size);

					return early_memremap(phys, size);

				}

				void __init __acpi_unmap_table(void __iomem *map, unsigned long size)

				@@ -218,7 +218,7 @@ void __init __acpi_unmap_table(void __iomem *map, unsigned long size)

					if (!map || !size)

						return;

					early_iounmap(map, size);

					early_memunmap(map, size);

				}

				void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size)

									
										2

arch/riscv/kernel/asm-offsets.c
									
												View File
												
				@@ -4,8 +4,6 @@

				 * Copyright (C) 2017 SiFive

				 */

				#define GENERATING_ASM_OFFSETS

				#include <linux/kbuild.h>

				#include <linux/mm.h>

				#include <linux/sched.h>

									
										7

arch/riscv/kernel/cacheinfo.c
									
												View File
												
				@@ -80,8 +80,7 @@ int populate_cache_leaves(unsigned int cpu)

				{

					struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);

					struct cacheinfo *this_leaf = this_cpu_ci->info_list;

					struct device_node *np = of_cpu_device_node_get(cpu);

					struct device_node *prev = NULL;

					struct device_node *np, *prev;

					int levels = 1, level = 1;

					if (!acpi_disabled) {

				@@ -105,6 +104,10 @@ int populate_cache_leaves(unsigned int cpu)

						return 0;

					}

					np = of_cpu_device_node_get(cpu);

					if (!np)

						return -ENOENT;

					if (of_property_read_bool(np, "cache-size"))

						ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);

					if (of_property_read_bool(np, "i-cache-size"))

									
										2

arch/riscv/kernel/cpu-hotplug.c
									
												View File
												
				@@ -58,7 +58,7 @@ void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)

					if (cpu_ops->cpu_is_stopped)

						ret = cpu_ops->cpu_is_stopped(cpu);

					if (ret)

						pr_warn("CPU%d may not have stopped: %d\n", cpu, ret);

						pr_warn("CPU%u may not have stopped: %d\n", cpu, ret);

				}

				/*

									
										2

arch/riscv/kernel/efi-header.S
									
												View File
												
				@@ -64,7 +64,7 @@ extra_header_fields:

					.long	efi_header_end - _start			// SizeOfHeaders

					.long	0					// CheckSum

					.short	IMAGE_SUBSYSTEM_EFI_APPLICATION		// Subsystem

					.short	0					// DllCharacteristics

					.short	IMAGE_DLL_CHARACTERISTICS_NX_COMPAT	// DllCharacteristics

					.quad	0					// SizeOfStackReserve

					.quad	0					// SizeOfStackCommit

					.quad	0					// SizeOfHeapReserve

									
										6

arch/riscv/kernel/pi/Makefile
									
												View File
												
				@@ -16,8 +16,12 @@ KBUILD_CFLAGS	:= $(filter-out $(CC_FLAGS_LTO), $(KBUILD_CFLAGS))

				KBUILD_CFLAGS	+= -mcmodel=medany

				CFLAGS_cmdline_early.o += -D__NO_FORTIFY

				CFLAGS_lib-fdt_ro.o += -D__NO_FORTIFY

				CFLAGS_fdt_early.o += -D__NO_FORTIFY

				# lib/string.c already defines __NO_FORTIFY

				CFLAGS_ctype.o += -D__NO_FORTIFY

				CFLAGS_lib-fdt.o += -D__NO_FORTIFY

				CFLAGS_lib-fdt_ro.o += -D__NO_FORTIFY

				CFLAGS_archrandom_early.o += -D__NO_FORTIFY

				$(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_ \

							       --remove-section=.note.gnu.property \

									
										2

arch/riscv/kernel/traps_misaligned.c
									
												View File
												
				@@ -136,8 +136,6 @@

				#define REG_PTR(insn, pos, regs)	\

					(ulong *)((ulong)(regs) + REG_OFFSET(insn, pos))

				#define GET_RM(insn)			(((insn) >> 12) & 7)

				#define GET_RS1(insn, regs)		(*REG_PTR(insn, SH_RS1, regs))

				#define GET_RS2(insn, regs)		(*REG_PTR(insn, SH_RS2, regs))

				#define GET_RS1S(insn, regs)		(*REG_PTR(RVC_RS1S(insn), 0, regs))

									
										1

arch/riscv/kernel/vdso/Makefile
									
												View File
												
				@@ -18,6 +18,7 @@ obj-vdso = $(patsubst %, %.o, $(vdso-syms)) note.o

				ccflags-y := -fno-stack-protector

				ccflags-y += -DDISABLE_BRANCH_PROFILING

				ccflags-y += -fno-builtin

				ifneq ($(c-gettimeofday-y),)

				  CFLAGS_vgettimeofday.o += -fPIC -include $(c-gettimeofday-y)

									
										8

arch/riscv/kvm/aia_imsic.c
									
												View File
												
				@@ -55,7 +55,7 @@ struct imsic {

					/* IMSIC SW-file */

					struct imsic_mrif *swfile;

					phys_addr_t swfile_pa;

					spinlock_t swfile_extirq_lock;

					raw_spinlock_t swfile_extirq_lock;

				};

				#define imsic_vs_csr_read(__c)			\

				@@ -622,7 +622,7 @@ static void imsic_swfile_extirq_update(struct kvm_vcpu *vcpu)

					 * interruptions between reading topei and updating pending status.

					 */

					spin_lock_irqsave(&imsic->swfile_extirq_lock, flags);

					raw_spin_lock_irqsave(&imsic->swfile_extirq_lock, flags);

					if (imsic_mrif_atomic_read(mrif, &mrif->eidelivery) &&

					    imsic_mrif_topei(mrif, imsic->nr_eix, imsic->nr_msis))

				@@ -630,7 +630,7 @@ static void imsic_swfile_extirq_update(struct kvm_vcpu *vcpu)

					else

						kvm_riscv_vcpu_unset_interrupt(vcpu, IRQ_VS_EXT);

					spin_unlock_irqrestore(&imsic->swfile_extirq_lock, flags);

					raw_spin_unlock_irqrestore(&imsic->swfile_extirq_lock, flags);

				}

				static void imsic_swfile_read(struct kvm_vcpu *vcpu, bool clear,

				@@ -1051,7 +1051,7 @@ int kvm_riscv_vcpu_aia_imsic_init(struct kvm_vcpu *vcpu)

					}

					imsic->swfile = page_to_virt(swfile_page);

					imsic->swfile_pa = page_to_phys(swfile_page);

					spin_lock_init(&imsic->swfile_extirq_lock);

					raw_spin_lock_init(&imsic->swfile_extirq_lock);

					/* Setup IO device */

					kvm_iodevice_init(&imsic->iodev, &imsic_iodoev_ops);

									
										8

arch/riscv/net/bpf_jit_comp64.c
									
												View File
												
				@@ -18,6 +18,7 @@

				#define RV_MAX_REG_ARGS 8

				#define RV_FENTRY_NINSNS 2

				#define RV_FENTRY_NBYTES (RV_FENTRY_NINSNS * 4)

				#define RV_KCFI_NINSNS (IS_ENABLED(CONFIG_CFI_CLANG) ? 1 : 0)

				/* imm that allows emit_imm to emit max count insns */

				#define RV_MAX_COUNT_IMM 0x7FFF7FF7FF7FF7FF

				@@ -271,7 +272,8 @@ static void __build_epilogue(bool is_tail_call, struct rv_jit_context *ctx)

					if (!is_tail_call)

						emit_addiw(RV_REG_A0, RV_REG_A5, 0, ctx);

					emit_jalr(RV_REG_ZERO, is_tail_call ? RV_REG_T3 : RV_REG_RA,

						  is_tail_call ? (RV_FENTRY_NINSNS + 1) * 4 : 0, /* skip reserved nops and TCC init */

						  /* kcfi, fentry and TCC init insns will be skipped on tailcall */

						  is_tail_call ? (RV_KCFI_NINSNS + RV_FENTRY_NINSNS + 1) * 4 : 0,

						  ctx);

				}

				@@ -548,8 +550,8 @@ static void emit_atomic(u8 rd, u8 rs, s16 off, s32 imm, bool is64,

						     rv_lr_w(r0, 0, rd, 0, 0), ctx);

						jmp_offset = ninsns_rvoff(8);

						emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx);

						emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 0) :

						     rv_sc_w(RV_REG_T3, rs, rd, 0, 0), ctx);

						emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) :

						     rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx);

						jmp_offset = ninsns_rvoff(-6);

						emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx);

						emit(rv_fence(0x3, 0x3), ctx);

13

arch/s390/configs/debug_defconfig

View File

@@ -50,7 +50,6 @@ CONFIG_NUMA=y
 CONFIG_HZ_100=y
 CONFIG_CERT_STORE=y
 CONFIG_EXPOLINE=y
 # CONFIG_EXPOLINE_EXTERN is not set
 CONFIG_EXPOLINE_AUTO=y
 CONFIG_CHSC_SCH=y
 CONFIG_VFIO_CCW=m
@@ -95,6 +94,7 @@ CONFIG_BINFMT_MISC=m
 CONFIG_ZSWAP=y
 CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y
 CONFIG_ZSMALLOC_STAT=y
 CONFIG_SLAB_BUCKETS=y
 CONFIG_SLUB_STATS=y
 # CONFIG_COMPAT_BRK is not set
 CONFIG_MEMORY_HOTPLUG=y
@@ -426,6 +426,13 @@ CONFIG_DEVTMPFS_SAFE=y
 # CONFIG_FW_LOADER is not set
 CONFIG_CONNECTOR=y
 CONFIG_ZRAM=y
 CONFIG_ZRAM_BACKEND_LZ4=y
 CONFIG_ZRAM_BACKEND_LZ4HC=y
 CONFIG_ZRAM_BACKEND_ZSTD=y
 CONFIG_ZRAM_BACKEND_DEFLATE=y
 CONFIG_ZRAM_BACKEND_842=y
 CONFIG_ZRAM_BACKEND_LZO=y
 CONFIG_ZRAM_DEF_COMP_DEFLATE=y
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_DRBD=m
 CONFIG_BLK_DEV_NBD=m
@@ -486,6 +493,7 @@ CONFIG_DM_UEVENT=y
 CONFIG_DM_FLAKEY=m
 CONFIG_DM_VERITY=m
 CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
 CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_PLATFORM_KEYRING=y
 CONFIG_DM_SWITCH=m
 CONFIG_DM_INTEGRITY=m
 CONFIG_DM_VDO=m
@@ -535,6 +543,7 @@ CONFIG_NLMON=m
 CONFIG_MLX4_EN=m
 CONFIG_MLX5_CORE=m
 CONFIG_MLX5_CORE_EN=y
 # CONFIG_NET_VENDOR_META is not set
 # CONFIG_NET_VENDOR_MICREL is not set
 # CONFIG_NET_VENDOR_MICROCHIP is not set
 # CONFIG_NET_VENDOR_MICROSEMI is not set
@@ -695,6 +704,7 @@ CONFIG_NFSD=m
 CONFIG_NFSD_V3_ACL=y
 CONFIG_NFSD_V4=y
 CONFIG_NFSD_V4_SECURITY_LABEL=y
 # CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set
 CONFIG_CIFS=m
 CONFIG_CIFS_UPCALL=y
 CONFIG_CIFS_XATTR=y
@@ -740,7 +750,6 @@ CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_SM2=m
 CONFIG_CRYPTO_CURVE25519=m
 CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m

14

arch/s390/configs/defconfig

View File

@@ -48,7 +48,6 @@ CONFIG_NUMA=y
 CONFIG_HZ_100=y
 CONFIG_CERT_STORE=y
 CONFIG_EXPOLINE=y
 # CONFIG_EXPOLINE_EXTERN is not set
 CONFIG_EXPOLINE_AUTO=y
 CONFIG_CHSC_SCH=y
 CONFIG_VFIO_CCW=m
@@ -89,6 +88,7 @@ CONFIG_BINFMT_MISC=m
 CONFIG_ZSWAP=y
 CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y
 CONFIG_ZSMALLOC_STAT=y
 CONFIG_SLAB_BUCKETS=y
 # CONFIG_COMPAT_BRK is not set
 CONFIG_MEMORY_HOTPLUG=y
 CONFIG_MEMORY_HOTREMOVE=y
@@ -416,6 +416,13 @@ CONFIG_DEVTMPFS_SAFE=y
 # CONFIG_FW_LOADER is not set
 CONFIG_CONNECTOR=y
 CONFIG_ZRAM=y
 CONFIG_ZRAM_BACKEND_LZ4=y
 CONFIG_ZRAM_BACKEND_LZ4HC=y
 CONFIG_ZRAM_BACKEND_ZSTD=y
 CONFIG_ZRAM_BACKEND_DEFLATE=y
 CONFIG_ZRAM_BACKEND_842=y
 CONFIG_ZRAM_BACKEND_LZO=y
 CONFIG_ZRAM_DEF_COMP_DEFLATE=y
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_DRBD=m
 CONFIG_BLK_DEV_NBD=m
@@ -476,6 +483,7 @@ CONFIG_DM_UEVENT=y
 CONFIG_DM_FLAKEY=m
 CONFIG_DM_VERITY=m
 CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
 CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_PLATFORM_KEYRING=y
 CONFIG_DM_SWITCH=m
 CONFIG_DM_INTEGRITY=m
 CONFIG_DM_VDO=m
@@ -525,6 +533,7 @@ CONFIG_NLMON=m
 CONFIG_MLX4_EN=m
 CONFIG_MLX5_CORE=m
 CONFIG_MLX5_CORE_EN=y
 # CONFIG_NET_VENDOR_META is not set
 # CONFIG_NET_VENDOR_MICREL is not set
 # CONFIG_NET_VENDOR_MICROCHIP is not set
 # CONFIG_NET_VENDOR_MICROSEMI is not set
@@ -682,6 +691,7 @@ CONFIG_NFSD=m
 CONFIG_NFSD_V3_ACL=y
 CONFIG_NFSD_V4=y
 CONFIG_NFSD_V4_SECURITY_LABEL=y
 # CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set
 CONFIG_CIFS=m
 CONFIG_CIFS_UPCALL=y
 CONFIG_CIFS_XATTR=y
@@ -726,7 +736,6 @@ CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_SM2=m
 CONFIG_CRYPTO_CURVE25519=m
 CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
@@ -767,6 +776,7 @@ CONFIG_CRYPTO_LZ4=m
 CONFIG_CRYPTO_LZ4HC=m
 CONFIG_CRYPTO_ZSTD=m
 CONFIG_CRYPTO_ANSI_CPRNG=m
 CONFIG_CRYPTO_JITTERENTROPY_OSR=1
 CONFIG_CRYPTO_USER_API_HASH=m
 CONFIG_CRYPTO_USER_API_SKCIPHER=m
 CONFIG_CRYPTO_USER_API_RNG=m

1

arch/s390/configs/zfcpdump_defconfig

View File

@@ -49,6 +49,7 @@ CONFIG_ZFCP=y
 # CONFIG_HVC_IUCV is not set
 # CONFIG_HW_RANDOM_S390 is not set
 # CONFIG_HMC_DRV is not set
 # CONFIG_S390_UV_UAPI is not set
 # CONFIG_S390_TAPE is not set
 # CONFIG_VMCP is not set
 # CONFIG_MONWRITER is not set

									
										1

arch/s390/include/asm/perf_event.h
									
												View File
												
				@@ -49,6 +49,7 @@ struct perf_sf_sde_regs {

				};

				#define perf_arch_fetch_caller_regs(regs, __ip) do {			\

					(regs)->psw.mask = 0;						\

					(regs)->psw.addr = (__ip);					\

					(regs)->gprs[15] = (unsigned long)__builtin_frame_address(0) -	\

						offsetof(struct stack_frame, back_chain);		\

									
										2

arch/s390/kvm/diag.c
									
												View File
												
				@@ -77,7 +77,7 @@ static int __diag_page_ref_service(struct kvm_vcpu *vcpu)

					vcpu->stat.instruction_diagnose_258++;

					if (vcpu->run->s.regs.gprs[rx] & 7)

						return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);

					rc = read_guest(vcpu, vcpu->run->s.regs.gprs[rx], rx, &parm, sizeof(parm));

					rc = read_guest_real(vcpu, vcpu->run->s.regs.gprs[rx], &parm, sizeof(parm));

					if (rc)

						return kvm_s390_inject_prog_cond(vcpu, rc);

					if (parm.parm_version != 2 || parm.parm_len < 5 || parm.code != 0x258)

									
										4

arch/s390/kvm/gaccess.c
									
												View File
												
				@@ -828,6 +828,8 @@ static int access_guest_page(struct kvm *kvm, enum gacc_mode mode, gpa_t gpa,

					const gfn_t gfn = gpa_to_gfn(gpa);

					int rc;

					if (!gfn_to_memslot(kvm, gfn))

						return PGM_ADDRESSING;

					if (mode == GACC_STORE)

						rc = kvm_write_guest_page(kvm, gfn, data, offset, len);

					else

				@@ -985,6 +987,8 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,

						gra += fragment_len;

						data += fragment_len;

					}

					if (rc > 0)

						vcpu->arch.pgm.code = rc;

					return rc;

				}

									
										14

arch/s390/kvm/gaccess.h
									
												View File
												
				@@ -405,11 +405,12 @@ int read_guest_abs(struct kvm_vcpu *vcpu, unsigned long gpa, void *data,

				 * @len: number of bytes to copy

				 *

				 * Copy @len bytes from @data (kernel space) to @gra (guest real address).

				 * It is up to the caller to ensure that the entire guest memory range is

				 * valid memory before calling this function.

				 * Guest low address and key protection are not checked.

				 *

				 * Returns zero on success or -EFAULT on error.

				 * Returns zero on success, -EFAULT when copying from @data failed, or

				 * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info

				 * is also stored to allow injecting into the guest (if applicable) using

				 * kvm_s390_inject_prog_cond().

				 *

				 * If an error occurs data may have been copied partially to guest memory.

				 */

				@@ -428,11 +429,12 @@ int write_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,

				 * @len: number of bytes to copy

				 *

				 * Copy @len bytes from @gra (guest real address) to @data (kernel space).

				 * It is up to the caller to ensure that the entire guest memory range is

				 * valid memory before calling this function.

				 * Guest key protection is not checked.

				 *

				 * Returns zero on success or -EFAULT on error.

				 * Returns zero on success, -EFAULT when copying to @data failed, or

				 * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info

				 * is also stored to allow injecting into the guest (if applicable) using

				 * kvm_s390_inject_prog_cond().

				 *

				 * If an error occurs data may have been copied partially to kernel space.

				 */

									
										17

arch/s390/pci/pci_event.c
									
												View File
												
				@@ -280,18 +280,19 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)

						goto no_pdev;

					switch (ccdf->pec) {

					case 0x003a: /* Service Action or Error Recovery Successful */

					case 0x002a: /* Error event concerns FMB */

					case 0x002b:

					case 0x002c:

						break;

					case 0x0040: /* Service Action or Error Recovery Failed */

					case 0x003b:

						zpci_event_io_failure(pdev, pci_channel_io_perm_failure);

						break;

					default: /* PCI function left in the error state attempt to recover */

						ers_res = zpci_event_attempt_error_recovery(pdev);

						if (ers_res != PCI_ERS_RESULT_RECOVERED)

							zpci_event_io_failure(pdev, pci_channel_io_perm_failure);

						break;

					default:

						/*

						 * Mark as frozen not permanently failed because the device

						 * could be subsequently recovered by the platform.

						 */

						zpci_event_io_failure(pdev, pci_channel_io_frozen);

						break;

					}

					pci_dev_put(pdev);

				no_pdev:

1

arch/x86/Kconfig

View File

@@ -2257,6 +2257,7 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING
 config ADDRESS_MASKING
 	bool "Linear Address Masking support"
 	depends on X86_64
 	depends on COMPILE_TEST || !CPU_MITIGATIONS # wait for LASS
 	help
 	  Linear Address Masking (LAM) modifies the checking that is applied
 	  to 64-bit linear addresses, allowing software to use of the

									
										5

arch/x86/entry/entry.S
									
												View File
												
				@@ -9,6 +9,8 @@

				#include <asm/unwind_hints.h>

				#include <asm/segment.h>

				#include <asm/cache.h>

				#include <asm/cpufeatures.h>

				#include <asm/nospec-branch.h>

				#include "calling.h"

				@@ -19,6 +21,9 @@ SYM_FUNC_START(entry_ibpb)

					movl	$PRED_CMD_IBPB, %eax

					xorl	%edx, %edx

					wrmsr

					/* Make sure IBPB clears return stack preductions too. */

					FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_BUG_IBPB_NO_RET

					RET

				SYM_FUNC_END(entry_ibpb)

				/* For KVM */

									
										6

arch/x86/entry/entry_32.S
									
												View File
												
				@@ -871,6 +871,8 @@ SYM_FUNC_START(entry_SYSENTER_32)

					/* Now ready to switch the cr3 */

					SWITCH_TO_USER_CR3 scratch_reg=%eax

					/* Clobbers ZF */

					CLEAR_CPU_BUFFERS

					/*

					 * Restore all flags except IF. (We restore IF separately because

				@@ -881,7 +883,6 @@ SYM_FUNC_START(entry_SYSENTER_32)

					BUG_IF_WRONG_CR3 no_user_check=1

					popfl

					popl	%eax

					CLEAR_CPU_BUFFERS

					/*

					 * Return back to the vDSO, which will pop ecx and edx.

				@@ -1144,7 +1145,6 @@ SYM_CODE_START(asm_exc_nmi)

					/* Not on SYSENTER stack. */

					call	exc_nmi

					CLEAR_CPU_BUFFERS

					jmp	.Lnmi_return

				.Lnmi_from_sysenter_stack:

				@@ -1165,6 +1165,7 @@ SYM_CODE_START(asm_exc_nmi)

					CHECK_AND_APPLY_ESPFIX

					RESTORE_ALL_NMI cr3_reg=%edi pop=4

					CLEAR_CPU_BUFFERS

					jmp	.Lirq_return

				#ifdef CONFIG_X86_ESPFIX32

				@@ -1206,6 +1207,7 @@ SYM_CODE_START(asm_exc_nmi)

					 *  1 - orig_ax

					 */

					lss	(1+5+6)*4(%esp), %esp			# back to espfix stack

					CLEAR_CPU_BUFFERS

					jmp	.Lirq_return

				#endif

				SYM_CODE_END(asm_exc_nmi)

									
										5

arch/x86/include/asm/amd_nb.h
									
												View File
												
				@@ -116,7 +116,10 @@ static inline bool amd_gart_present(void)

				#define amd_nb_num(x)		0

				#define amd_nb_has_feature(x)	false

				#define node_to_amd_nb(x)	NULL

				static inline struct amd_northbridge *node_to_amd_nb(int node)

				{

					return NULL;

				}

				#define amd_gart_present(x)	false

				#endif

									
										4

arch/x86/include/asm/cpufeatures.h
									
												View File
												
				@@ -215,7 +215,7 @@

				#define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE	( 7*32+23) /* Disable Speculative Store Bypass. */

				#define X86_FEATURE_LS_CFG_SSBD		( 7*32+24)  /* AMD SSBD implementation via LS_CFG MSR */

				#define X86_FEATURE_IBRS		( 7*32+25) /* "ibrs" Indirect Branch Restricted Speculation */

				#define X86_FEATURE_IBPB		( 7*32+26) /* "ibpb" Indirect Branch Prediction Barrier */

				#define X86_FEATURE_IBPB		( 7*32+26) /* "ibpb" Indirect Branch Prediction Barrier without a guaranteed RSB flush */

				#define X86_FEATURE_STIBP		( 7*32+27) /* "stibp" Single Thread Indirect Branch Predictors */

				#define X86_FEATURE_ZEN			( 7*32+28) /* Generic flag for all Zen and newer */

				#define X86_FEATURE_L1TF_PTEINV		( 7*32+29) /* L1TF workaround PTE inversion */

				@@ -348,6 +348,7 @@

				#define X86_FEATURE_CPPC		(13*32+27) /* "cppc" Collaborative Processor Performance Control */

				#define X86_FEATURE_AMD_PSFD            (13*32+28) /* Predictive Store Forwarding Disable */

				#define X86_FEATURE_BTC_NO		(13*32+29) /* Not vulnerable to Branch Type Confusion */

				#define X86_FEATURE_AMD_IBPB_RET	(13*32+30) /* IBPB clears return address predictor */

				#define X86_FEATURE_BRS			(13*32+31) /* "brs" Branch Sampling available */

				/* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */

				@@ -523,4 +524,5 @@

				#define X86_BUG_DIV0			X86_BUG(1*32 + 1) /* "div0" AMD DIV0 speculation bug */

				#define X86_BUG_RFDS			X86_BUG(1*32 + 2) /* "rfds" CPU is vulnerable to Register File Data Sampling */

				#define X86_BUG_BHI			X86_BUG(1*32 + 3) /* "bhi" CPU is affected by Branch History Injection */

				#define X86_BUG_IBPB_NO_RET	   	X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */

				#endif /* _ASM_X86_CPUFEATURES_H */

									
										11

arch/x86/include/asm/nospec-branch.h
									
												View File
												
				@@ -323,7 +323,16 @@

				 * Note: Only the memory operand variant of VERW clears the CPU buffers.

				 */

				.macro CLEAR_CPU_BUFFERS

					ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF

				#ifdef CONFIG_X86_64

					ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF

				#else

					/*

					 * In 32bit mode, the memory operand must be a %cs reference. The data

					 * segments may not be usable (vm86 mode), and the stack segment may not

					 * be flat (ESPFIX32).

					 */

					ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF

				#endif

				.endm

				#ifdef CONFIG_X86_64

									
										4

arch/x86/include/asm/runtime-const.h
									
												View File
												
				@@ -6,7 +6,7 @@

					typeof(sym) __ret;					\

					asm_inline("mov %1,%0\n1:\n"				\

						".pushsection runtime_ptr_" #sym ",\"a\"\n\t"	\

						".long 1b - %c2 - .\n\t"			\

						".long 1b - %c2 - .\n"				\

						".popsection"					\

						:"=r" (__ret)					\

						:"i" ((unsigned long)0x0123456789abcdefull),	\

				@@ -20,7 +20,7 @@

					typeof(0u+(val)) __ret = (val);				\

					asm_inline("shrl $12,%k0\n1:\n"				\

						".pushsection runtime_shift_" #sym ",\"a\"\n\t"	\

						".long 1b - 1 - .\n\t"				\

						".long 1b - 1 - .\n"				\

						".popsection"					\

						:"+r" (__ret));					\

					__ret; })

									
										43

arch/x86/include/asm/uaccess_64.h
									
												View File
												
				@@ -12,6 +12,13 @@

				#include <asm/cpufeatures.h>

				#include <asm/page.h>

				#include <asm/percpu.h>

				#include <asm/runtime-const.h>

				/*

				 * Virtual variable: there's no actual backing store for this,

				 * it can purely be used as 'runtime_const_ptr(USER_PTR_MAX)'

				 */

				extern unsigned long USER_PTR_MAX;

				#ifdef CONFIG_ADDRESS_MASKING

				/*

				@@ -46,19 +53,24 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,

				#endif

				/*

				 * The virtual address space space is logically divided into a kernel

				 * half and a user half.  When cast to a signed type, user pointers

				 * are positive and kernel pointers are negative.

				 */

				#define valid_user_address(x) ((__force long)(x) >= 0)

				#define valid_user_address(x) \

					((__force unsigned long)(x) <= runtime_const_ptr(USER_PTR_MAX))

				/*

				 * Masking the user address is an alternative to a conditional

				 * user_access_begin that can avoid the fencing. This only works

				 * for dense accesses starting at the address.

				 */

				#define mask_user_address(x) ((typeof(x))((long)(x)|((long)(x)>>63)))

				static inline void __user *mask_user_address(const void __user *ptr)

				{

					unsigned long mask;

					asm("cmp %1,%0\n\t"

					    "sbb %0,%0"

						:"=r" (mask)

						:"r" (ptr),

						 "0" (runtime_const_ptr(USER_PTR_MAX)));

					return (__force void __user *)(mask | (__force unsigned long)ptr);

				}

				#define masked_user_access_begin(x) ({				\

					__auto_type __masked_ptr = (x);				\

					__masked_ptr = mask_user_address(__masked_ptr);		\

				@@ -69,23 +81,16 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,

				 * arbitrary values in those bits rather then masking them off.

				 *

				 * Enforce two rules:

				 * 1. 'ptr' must be in the user half of the address space

				 * 1. 'ptr' must be in the user part of the address space

				 * 2. 'ptr+size' must not overflow into kernel addresses

				 *

				 * Note that addresses around the sign change are not valid addresses,

				 * and will GP-fault even with LAM enabled if the sign bit is set (see

				 * "CR3.LAM_SUP" that can narrow the canonicality check if we ever

				 * enable it, but not remove it entirely).

				 *

				 * So the "overflow into kernel addresses" does not imply some sudden

				 * exact boundary at the sign bit, and we can allow a lot of slop on the

				 * size check.

				 * Note that we always have at least one guard page between the

				 * max user address and the non-canonical gap, allowing us to

				 * ignore small sizes entirely.

				 *

				 * In fact, we could probably remove the size check entirely, since

				 * any kernel accesses will be in increasing address order starting

				 * at 'ptr', and even if the end might be in kernel space, we'll

				 * hit the GP faults for non-canonical accesses before we ever get

				 * there.

				 * at 'ptr'.

				 *

				 * That's a separate optimization, for now just handle the small

				 * constant case.

									
										2

arch/x86/kernel/amd_nb.c
									
												View File
												
				@@ -44,6 +44,7 @@

				#define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4	0x14f4

				#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4	0x12fc

				#define PCI_DEVICE_ID_AMD_1AH_M00H_DF_F4	0x12c4

				#define PCI_DEVICE_ID_AMD_1AH_M20H_DF_F4	0x16fc

				#define PCI_DEVICE_ID_AMD_1AH_M60H_DF_F4	0x124c

				#define PCI_DEVICE_ID_AMD_1AH_M70H_DF_F4	0x12bc

				#define PCI_DEVICE_ID_AMD_MI200_DF_F4		0x14d4

				@@ -127,6 +128,7 @@ static const struct pci_device_id amd_nb_link_ids[] = {

					{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F4) },

					{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F4) },

					{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M00H_DF_F4) },

					{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_DF_F4) },

					{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_DF_F4) },

					{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M70H_DF_F4) },

					{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI200_DF_F4) },

									
										14

arch/x86/kernel/apic/apic.c
									
												View File
												
				@@ -440,7 +440,19 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)

					v = apic_read(APIC_LVTT);

					v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);

					apic_write(APIC_LVTT, v);

					apic_write(APIC_TMICT, 0);

					/*

					 * Setting APIC_LVT_MASKED (above) should be enough to tell

					 * the hardware that this timer will never fire. But AMD

					 * erratum 411 and some Intel CPU behavior circa 2024 say

					 * otherwise.  Time for belt and suspenders programming: mask

					 * the timer _and_ zero the counter registers:

					 */

					if (v & APIC_LVT_TIMER_TSCDEADLINE)

						wrmsrl(MSR_IA32_TSC_DEADLINE, 0);

					else

						apic_write(APIC_TMICT, 0);

					return 0;

				}

									
										3

arch/x86/kernel/cpu/amd.c
									
												View File
												
				@@ -1202,5 +1202,6 @@ void amd_check_microcode(void)

					if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)

						return;

					on_each_cpu(zenbleed_check_cpu, NULL, 1);

					if (cpu_feature_enabled(X86_FEATURE_ZEN2))

						on_each_cpu(zenbleed_check_cpu, NULL, 1);

				}

									
										32

arch/x86/kernel/cpu/bugs.c
									
												View File
												
				@@ -1115,8 +1115,25 @@ do_cmd_auto:

					case RETBLEED_MITIGATION_IBPB:

						setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);

						/*

						 * IBPB on entry already obviates the need for

						 * software-based untraining so clear those in case some

						 * other mitigation like SRSO has selected them.

						 */

						setup_clear_cpu_cap(X86_FEATURE_UNRET);

						setup_clear_cpu_cap(X86_FEATURE_RETHUNK);

						setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);

						mitigate_smt = true;

						/*

						 * There is no need for RSB filling: entry_ibpb() ensures

						 * all predictions, including the RSB, are invalidated,

						 * regardless of IBPB implementation.

						 */

						setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT);

						break;

					case RETBLEED_MITIGATION_STUFF:

				@@ -2627,6 +2644,14 @@ static void __init srso_select_mitigation(void)

							if (has_microcode) {

								setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);

								srso_mitigation = SRSO_MITIGATION_IBPB;

								/*

								 * IBPB on entry already obviates the need for

								 * software-based untraining so clear those in case some

								 * other mitigation like Retbleed has selected them.

								 */

								setup_clear_cpu_cap(X86_FEATURE_UNRET);

								setup_clear_cpu_cap(X86_FEATURE_RETHUNK);

							}

						} else {

							pr_err("WARNING: kernel not compiled with MITIGATION_IBPB_ENTRY.\n");

				@@ -2638,6 +2663,13 @@ static void __init srso_select_mitigation(void)

							if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {

								setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);

								srso_mitigation = SRSO_MITIGATION_IBPB_ON_VMEXIT;

								/*

								 * There is no need for RSB filling: entry_ibpb() ensures

								 * all predictions, including the RSB, are invalidated,

								 * regardless of IBPB implementation.

								 */

								setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT);

							}

						} else {

							pr_err("WARNING: kernel not compiled with MITIGATION_SRSO.\n");

									
										13

arch/x86/kernel/cpu/common.c
									
												View File
												
				@@ -69,6 +69,7 @@

				#include <asm/sev.h>

				#include <asm/tdx.h>

				#include <asm/posted_intr.h>

				#include <asm/runtime-const.h>

				#include "cpu.h"

				@@ -1443,6 +1444,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)

					     boot_cpu_has(X86_FEATURE_HYPERVISOR)))

						setup_force_cpu_bug(X86_BUG_BHI);

					if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET))

						setup_force_cpu_bug(X86_BUG_IBPB_NO_RET);

					if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))

						return;

				@@ -2386,6 +2390,15 @@ void __init arch_cpu_finalize_init(void)

					alternative_instructions();

					if (IS_ENABLED(CONFIG_X86_64)) {

						unsigned long USER_PTR_MAX = TASK_SIZE_MAX-1;

						/*

						 * Enable this when LAM is gated on LASS support

						if (cpu_feature_enabled(X86_FEATURE_LAM))

							USER_PTR_MAX = (1ul << 63) - PAGE_SIZE - 1;

						 */

						runtime_const_init(ptr, USER_PTR_MAX);

						/*

						 * Make sure the first 2MB area is not mapped by huge pages

						 * There are typically fixed size MTRRs in there and overlapping

									
										53

arch/x86/kernel/cpu/microcode/amd.c
									
												View File
												
				@@ -584,7 +584,7 @@ void __init load_ucode_amd_bsp(struct early_load_data *ed, unsigned int cpuid_1_

						native_rdmsr(MSR_AMD64_PATCH_LEVEL, ed->new_rev, dummy);

				}

				static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t size);

				static enum ucode_state _load_microcode_amd(u8 family, const u8 *data, size_t size);

				static int __init save_microcode_in_initrd(void)

				{

				@@ -605,7 +605,7 @@ static int __init save_microcode_in_initrd(void)

					if (!desc.mc)

						return -EINVAL;

					ret = load_microcode_amd(x86_family(cpuid_1_eax), desc.data, desc.size);

					ret = _load_microcode_amd(x86_family(cpuid_1_eax), desc.data, desc.size);

					if (ret > UCODE_UPDATED)

						return -EINVAL;

				@@ -613,16 +613,19 @@ static int __init save_microcode_in_initrd(void)

				}

				early_initcall(save_microcode_in_initrd);

				static inline bool patch_cpus_equivalent(struct ucode_patch *p, struct ucode_patch *n)

				static inline bool patch_cpus_equivalent(struct ucode_patch *p,

									 struct ucode_patch *n,

									 bool ignore_stepping)

				{

					/* Zen and newer hardcode the f/m/s in the patch ID */

				        if (x86_family(bsp_cpuid_1_eax) >= 0x17) {

						union cpuid_1_eax p_cid = ucode_rev_to_cpuid(p->patch_id);

						union cpuid_1_eax n_cid = ucode_rev_to_cpuid(n->patch_id);

						/* Zap stepping */

						p_cid.stepping = 0;

						n_cid.stepping = 0;

						if (ignore_stepping) {

							p_cid.stepping = 0;

							n_cid.stepping = 0;

						}

						return p_cid.full == n_cid.full;

					} else {

				@@ -644,13 +647,13 @@ static struct ucode_patch *cache_find_patch(struct ucode_cpu_info *uci, u16 equi

					WARN_ON_ONCE(!n.patch_id);

					list_for_each_entry(p, &microcode_cache, plist)

						if (patch_cpus_equivalent(p, &n))

						if (patch_cpus_equivalent(p, &n, false))

							return p;

					return NULL;

				}

				static inline bool patch_newer(struct ucode_patch *p, struct ucode_patch *n)

				static inline int patch_newer(struct ucode_patch *p, struct ucode_patch *n)

				{

					/* Zen and newer hardcode the f/m/s in the patch ID */

				        if (x86_family(bsp_cpuid_1_eax) >= 0x17) {

				@@ -659,6 +662,9 @@ static inline bool patch_newer(struct ucode_patch *p, struct ucode_patch *n)

						zp.ucode_rev = p->patch_id;

						zn.ucode_rev = n->patch_id;

						if (zn.stepping != zp.stepping)

							return -1;

						return zn.rev > zp.rev;

					} else {

						return n->patch_id > p->patch_id;

				@@ -668,10 +674,14 @@ static inline bool patch_newer(struct ucode_patch *p, struct ucode_patch *n)

				static void update_cache(struct ucode_patch *new_patch)

				{

					struct ucode_patch *p;

					int ret;

					list_for_each_entry(p, &microcode_cache, plist) {

						if (patch_cpus_equivalent(p, new_patch)) {

							if (!patch_newer(p, new_patch)) {

						if (patch_cpus_equivalent(p, new_patch, true)) {

							ret = patch_newer(p, new_patch);

							if (ret < 0)

								continue;

							else if (!ret) {

								/* we already have the latest patch */

								kfree(new_patch->data);

								kfree(new_patch);

				@@ -944,6 +954,20 @@ static enum ucode_state __load_microcode_amd(u8 family, const u8 *data,

					return UCODE_OK;

				}

				static enum ucode_state _load_microcode_amd(u8 family, const u8 *data, size_t size)

				{

					enum ucode_state ret;

					/* free old equiv table */

					free_equiv_cpu_table();

					ret = __load_microcode_amd(family, data, size);

					if (ret != UCODE_OK)

						cleanup();

					return ret;

				}

				static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t size)

				{

					struct cpuinfo_x86 *c;

				@@ -951,14 +975,9 @@ static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t siz

					struct ucode_patch *p;

					enum ucode_state ret;

					/* free old equiv table */

					free_equiv_cpu_table();

					ret = __load_microcode_amd(family, data, size);

					if (ret != UCODE_OK) {

						cleanup();

					ret = _load_microcode_amd(family, data, size);

					if (ret != UCODE_OK)

						return ret;

					}

					for_each_node(nid) {

						cpu = cpumask_first(cpumask_of_node(nid));

									
										4

arch/x86/kernel/cpu/resctrl/core.c
									
												View File
												
				@@ -207,7 +207,7 @@ static inline bool rdt_get_mb_table(struct rdt_resource *r)

					return false;

				}

				static bool __get_mem_config_intel(struct rdt_resource *r)

				static __init bool __get_mem_config_intel(struct rdt_resource *r)

				{

					struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

					union cpuid_0x10_3_eax eax;

				@@ -241,7 +241,7 @@ static bool __get_mem_config_intel(struct rdt_resource *r)

					return true;

				}

				static bool __rdt_get_mem_config_amd(struct rdt_resource *r)

				static __init bool __rdt_get_mem_config_amd(struct rdt_resource *r)

				{

					struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

					u32 eax, ebx, ecx, edx, subleaf;

Compare commits

1239 Commits v6.12-rc3 ... v6.12-rc6

12 .mailmap Unescape Escape View File

7 Documentation/admin-guide/LSM/ipe.rst Unescape Escape View File

20 Documentation/admin-guide/pm/cpufreq.rst Unescape Escape View File

38 Documentation/core-api/protection-keys.rst Unescape Escape View File

24 Documentation/devicetree/bindings/display/mediatek/mediatek,dpi.yaml Unescape Escape View File

19 Documentation/devicetree/bindings/display/mediatek/mediatek,split.yaml Unescape Escape View File

21 Documentation/devicetree/bindings/iio/adc/adi,ad7380.yaml Unescape Escape View File

53 Documentation/devicetree/bindings/iio/dac/adi,ad5686.yaml Unescape Escape View File

3 Documentation/devicetree/bindings/iio/dac/adi,ad5696.yaml Unescape Escape View File

1 Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml Unescape Escape View File

5 Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml Unescape Escape View File

18 Documentation/devicetree/bindings/sound/davinci-mcasp-audio.yaml Unescape Escape View File

4 Documentation/devicetree/bindings/sound/rockchip,rk3308-codec.yaml Unescape Escape View File

2 Documentation/filesystems/caching/cachefiles.rst Unescape Escape View File

2 Documentation/filesystems/iomap/operations.rst Unescape Escape View File

1 Documentation/filesystems/netfs_library.rst Unescape Escape View File

13 Documentation/iio/ad7380.rst Unescape Escape View File

38 Documentation/mm/damon/maintainer-profile.rst Unescape Escape View File

5 Documentation/networking/packet_mmap.rst Unescape Escape View File

42 Documentation/process/maintainer-soc.rst Unescape Escape View File

2 Documentation/rust/arch-support.rst Unescape Escape View File

261 Documentation/userspace-api/mseal.rst Unescape Escape View File

16 Documentation/virt/kvm/api.rst Unescape Escape View File

2 Documentation/virt/kvm/locking.rst Unescape Escape View File

183 MAINTAINERS Unescape Escape View File

2 Makefile Unescape Escape View File

26 arch/Kconfig Unescape Escape View File

2 arch/arm/boot/dts/broadcom/bcm2837-rpi-cm3-io3.dts Unescape Escape View File

2 arch/arm64/boot/dts/marvell/cn9130-sr-som.dtsi Unescape Escape View File

1 arch/arm64/include/asm/kvm_asm.h Unescape Escape View File

7 arch/arm64/include/asm/kvm_host.h Unescape Escape View File

3 arch/arm64/include/asm/kvm_mmu.h Unescape Escape View File

4 arch/arm64/include/asm/kvm_nested.h Unescape Escape View File

8 arch/arm64/include/asm/uprobes.h Unescape Escape View File

1 arch/arm64/kernel/asm-offsets.c Unescape Escape View File

16 arch/arm64/kernel/probes/decode-insn.c Unescape Escape View File

18 arch/arm64/kernel/probes/simulate-insn.c Unescape Escape View File

4 arch/arm64/kernel/probes/uprobes.c Unescape Escape View File

3 arch/arm64/kernel/process.c Unescape Escape View File

92 arch/arm64/kernel/signal.c Unescape Escape View File

5 arch/arm64/kvm/arm.c Unescape Escape View File

52 arch/arm64/kvm/hyp/nvhe/hyp-init.S Unescape Escape View File

12 arch/arm64/kvm/hypercalls.c Unescape Escape View File

15 arch/arm64/kvm/mmu.c Unescape Escape View File

53 arch/arm64/kvm/nested.c Unescape Escape View File

77 arch/arm64/kvm/sys_regs.c Unescape Escape View File

41 arch/arm64/kvm/vgic/vgic-init.c Unescape Escape View File

7 arch/arm64/kvm/vgic/vgic-kvm-device.c Unescape Escape View File

12 arch/arm64/net/bpf_jit_comp.c Unescape Escape View File

4 arch/loongarch/include/asm/bootinfo.h Unescape Escape View File

2 arch/loongarch/include/asm/kasan.h Unescape Escape View File

2 arch/loongarch/include/asm/loongarch.h Unescape Escape View File

11 arch/loongarch/include/asm/pgalloc.h Unescape Escape View File

35 arch/loongarch/include/asm/pgtable.h Unescape Escape View File

14 arch/loongarch/kernel/process.c Unescape Escape View File

3 arch/loongarch/kernel/setup.c Unescape Escape View File

5 arch/loongarch/kernel/traps.c Unescape Escape View File

8 arch/loongarch/kernel/vdso.c Unescape Escape View File

7 arch/loongarch/kvm/timer.c Unescape Escape View File

2 arch/loongarch/kvm/vcpu.c Unescape Escape View File

2 arch/loongarch/mm/init.c Unescape Escape View File

20 arch/loongarch/mm/pgtable.c Unescape Escape View File

1 arch/mips/kernel/cmpxchg.c Unescape Escape View File

1 arch/powerpc/platforms/powernv/opal-irqchip.c Unescape Escape View File

2 arch/riscv/Kconfig Unescape Escape View File

6 arch/riscv/errata/Makefile Unescape Escape View File

5 arch/riscv/kernel/Makefile Unescape Escape View File

4 arch/riscv/kernel/acpi.c Unescape Escape View File

2 arch/riscv/kernel/asm-offsets.c Unescape Escape View File

7 arch/riscv/kernel/cacheinfo.c Unescape Escape View File

2 arch/riscv/kernel/cpu-hotplug.c Unescape Escape View File

2 arch/riscv/kernel/efi-header.S Unescape Escape View File

6 arch/riscv/kernel/pi/Makefile Unescape Escape View File

2 arch/riscv/kernel/traps_misaligned.c Unescape Escape View File

1 arch/riscv/kernel/vdso/Makefile Unescape Escape View File

8 arch/riscv/kvm/aia_imsic.c Unescape Escape View File

8 arch/riscv/net/bpf_jit_comp64.c Unescape Escape View File

13 arch/s390/configs/debug_defconfig Unescape Escape View File

1239 Commits

v6.12-rc3 ... v6.12-rc6

12

.mailmap

View File

7

Documentation/admin-guide/LSM/ipe.rst

View File

20

Documentation/admin-guide/pm/cpufreq.rst

View File

38

Documentation/core-api/protection-keys.rst

View File

24

Documentation/devicetree/bindings/display/mediatek/mediatek,dpi.yaml

View File

19

Documentation/devicetree/bindings/display/mediatek/mediatek,split.yaml

View File

21

Documentation/devicetree/bindings/iio/adc/adi,ad7380.yaml

View File

53

Documentation/devicetree/bindings/iio/dac/adi,ad5686.yaml

View File

3

Documentation/devicetree/bindings/iio/dac/adi,ad5696.yaml

View File

1

Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml

View File

5

Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-pcie-phy.yaml

View File

18

Documentation/devicetree/bindings/sound/davinci-mcasp-audio.yaml

View File

4

Documentation/devicetree/bindings/sound/rockchip,rk3308-codec.yaml

View File

2

Documentation/filesystems/caching/cachefiles.rst

View File

2

Documentation/filesystems/iomap/operations.rst

View File

1

Documentation/filesystems/netfs_library.rst

View File

13

Documentation/iio/ad7380.rst

View File

38

Documentation/mm/damon/maintainer-profile.rst

View File

5

Documentation/networking/packet_mmap.rst

View File

42

Documentation/process/maintainer-soc.rst

View File

2

Documentation/rust/arch-support.rst

View File

261

Documentation/userspace-api/mseal.rst

View File

16

Documentation/virt/kvm/api.rst

View File

2

Documentation/virt/kvm/locking.rst

View File

183

MAINTAINERS

View File

2

Makefile

View File

26

arch/Kconfig

View File

2

arch/arm/boot/dts/broadcom/bcm2837-rpi-cm3-io3.dts

View File

2

arch/arm64/boot/dts/marvell/cn9130-sr-som.dtsi

View File

1

arch/arm64/include/asm/kvm_asm.h

View File

7

arch/arm64/include/asm/kvm_host.h

View File

3

arch/arm64/include/asm/kvm_mmu.h

View File

4

arch/arm64/include/asm/kvm_nested.h

View File

8

arch/arm64/include/asm/uprobes.h

View File

1

arch/arm64/kernel/asm-offsets.c

View File

16

arch/arm64/kernel/probes/decode-insn.c

View File

18

arch/arm64/kernel/probes/simulate-insn.c

View File

4

arch/arm64/kernel/probes/uprobes.c

View File

3

arch/arm64/kernel/process.c

View File

92

arch/arm64/kernel/signal.c

View File

5

arch/arm64/kvm/arm.c

View File

52

arch/arm64/kvm/hyp/nvhe/hyp-init.S

View File

12

arch/arm64/kvm/hypercalls.c

View File

15

arch/arm64/kvm/mmu.c

View File

53

arch/arm64/kvm/nested.c

View File

77

arch/arm64/kvm/sys_regs.c

View File

41

arch/arm64/kvm/vgic/vgic-init.c

View File

7

arch/arm64/kvm/vgic/vgic-kvm-device.c

View File

12

arch/arm64/net/bpf_jit_comp.c

View File

4

arch/loongarch/include/asm/bootinfo.h

View File

2

arch/loongarch/include/asm/kasan.h

View File

2

arch/loongarch/include/asm/loongarch.h

View File

11

arch/loongarch/include/asm/pgalloc.h

View File

35

arch/loongarch/include/asm/pgtable.h

View File

14

arch/loongarch/kernel/process.c

View File

3

arch/loongarch/kernel/setup.c

View File

5

arch/loongarch/kernel/traps.c

View File

8

arch/loongarch/kernel/vdso.c

View File

7

arch/loongarch/kvm/timer.c

View File

2

arch/loongarch/kvm/vcpu.c

View File

2

arch/loongarch/mm/init.c

View File

20

arch/loongarch/mm/pgtable.c

View File

1

arch/mips/kernel/cmpxchg.c

View File

1

arch/powerpc/platforms/powernv/opal-irqchip.c

View File

2

arch/riscv/Kconfig

View File

6

arch/riscv/errata/Makefile

View File

5

arch/riscv/kernel/Makefile

View File

4

arch/riscv/kernel/acpi.c

View File

2

arch/riscv/kernel/asm-offsets.c

View File

7

arch/riscv/kernel/cacheinfo.c

View File

2

arch/riscv/kernel/cpu-hotplug.c

View File

2

arch/riscv/kernel/efi-header.S

View File

6

arch/riscv/kernel/pi/Makefile

View File

2

arch/riscv/kernel/traps_misaligned.c

View File

1

arch/riscv/kernel/vdso/Makefile

View File

8

arch/riscv/kvm/aia_imsic.c

View File

8

arch/riscv/net/bpf_jit_comp64.c

View File

13

arch/s390/configs/debug_defconfig

View File

14

arch/s390/configs/defconfig

View File