Pull USB / Thunderbolt driver fixes from Greg KH:
"Here are a small set of USB and Thunderbolt driver fixes for reported
problems and a documentation update, for 6.3-rc4.
Included in here are:
- documentation update for uvc gadget driver
- small thunderbolt driver fixes
- cdns3 driver fixes
- dwc3 driver fixes
- dwc2 driver fixes
- chipidea driver fixes
- typec driver fixes
- onboard_usb_hub device id updates
- quirk updates
All of these have been in linux-next with no reported problems"
* tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits)
usb: dwc2: fix a race, don't power off/on phy for dual-role mode
usb: dwc2: fix a devres leak in hw_enable upon suspend resume
usb: chipidea: core: fix possible concurrent when switch role
usb: chipdea: core: fix return -EINVAL if request role is the same with current role
thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
thunderbolt: Disable interrupt auto clear for rings
thunderbolt: Use const qualifier for `ring_interrupt_index`
usb: gadget: Use correct endianness of the wLength field for WebUSB
uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2
usb: cdnsp: changes PCI Device ID to fix conflict with CNDS3 driver
usb: cdns3: Fix issue with using incorrect PCI device function
usb: cdnsp: Fixes issue with redundant Status Stage
MAINTAINERS: make me a reviewer of USB/IP
thunderbolt: Use scale field when allocating USB3 bandwidth
thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
thunderbolt: Call tb_check_quirks() after initializing adapters
thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
thunderbolt: Fix memory leak in margining
usb: dwc2: drd: fix inconsistent mode if role-switch-default-mode="host"
docs: usb: Add documentation for the UVC Gadget
...
Pull scheduler fix from Borislav Petkov:
- Fix a corner case where vruntime of a task is not being sanitized
* tag 'sched_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Sanitize vruntime of entity being migrated
Pull perf fix from Borislav Petkov:
- Properly clear perf event status tracking in the AMD perf event
overflow handler
* tag 'perf_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/core: Always clear status for idx
Pull core fixes from Borislav Petkov:
- Do the delayed RCU wakeup for kthreads in the proper order so that
former doesn't get ignored
- A noinstr warning fix
* tag 'core_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up
entry: Fix noinstr warning in __enter_from_user_mode()
Pull x86 fixes from Borislav Petkov:
- Add a AMX ptrace self test
- Prevent a false-positive warning when retrieving the (invalid)
address of dynamic FPU features in their init state which are not
saved in init_fpstate at all
- Randomize per-CPU entry areas only when KASLR is enabled
* tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/x86/amx: Add a ptrace test
x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
x86/mm: Do not shuffle CPU entry areas without KASLR
Pull cifs client fixes from Steve French:
"Twelve cifs/smb3 client fixes (most also for stable)
- forced umount fix
- fix for two perf regressions
- reconnect fixes
- small debugging improvements
- multichannel fixes"
* tag 'smb3-client-fixes-6.3-rc3' of git://git.samba.org/sfrench/cifs-2.6:
smb3: fix unusable share after force unmount failure
cifs: fix dentry lookups in directory handle cache
smb3: lower default deferred close timeout to address perf regression
cifs: fix missing unload_nls() in smb2_reconnect()
cifs: avoid race conditions with parallel reconnects
cifs: append path to open_enter trace event
cifs: print session id while listing open files
cifs: dump pending mids for all channels in DebugData
cifs: empty interface list when server doesn't support query interfaces
cifs: do not poll server interfaces too regularly
cifs: lock chan_lock outside match_session
cifs: check only tcon status on tcon related functions
Pull nfsd fix from Chuck Lever:
- Fix a crash when using NFS with krb5p
* tag 'nfsd-6.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
SUNRPC: Fix a crash in gss_krb5_checksum()
Pull yet more xfs bug fixes from Darrick Wong:
"The first bugfix addresses a longstanding problem where we use the
wrong file mapping cursors when trying to compute the speculative
preallocation quantity. This has been causing sporadic crashes when
alwayscow mode is engaged.
The other two fixes correct minor problems in more recent changes.
- Fix the new allocator tracepoints because git am mismerged the
changes such that the trace_XXX got rebased to be in function YYY
instead of XXX
- Ensure that the perag AGFL_RESET state is consistent with whatever
we've just read off the disk
- Fix a bug where we used the wrong iext cursor during a write begin"
* tag 'xfs-6.3-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix mismerged tracepoints
xfs: clear incore AGFL_RESET state if it's not needed
xfs: pass the correct cursor to xfs_iomap_prealloc_size
Pull xfs percpu counter fixes from Darrick Wong:
"We discovered a filesystem summary counter corruption problem that was
traced to cpu hot-remove racing with the call to percpu_counter_sum
that sets the free block count in the superblock when writing it to
disk. The root cause is that percpu_counter_sum doesn't cull from
dying cpus and hence misses those counter values if the cpu shutdown
hooks have not yet run to merge the values.
I'm hoping this is a fairly painless fix to the problem, since the
dying cpu mask should generally be empty. It's been in for-next for a
week without any complaints from the bots.
- Fix a race in the percpu counters summation code where the
summation failed to add in the values for any CPUs that were dying
but not yet dead. This fixes some minor discrepancies and incorrect
assertions when running generic/650"
* tag 'xfs-6.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
pcpcntr: remove percpu_counter_sum_all()
fork: remove use of percpu_counter_sum_all
pcpcntrs: fix dying cpu summation race
cpumask: introduce for_each_cpu_or
Pull xfs fixes from Darrick Wong:
"This batch started with some debugging enhancements to the new
allocator refactoring that we put in 6.3-rc1 to assist developers in
rebasing their dev branches.
As for more serious code changes -- there's a bug fix to make the
lockless allocator scan the whole filesystem before resorting to the
locking allocator. We're also adding a selftest for the venerable
directory/xattr hash function to make sure that it produces consistent
results so that we can address any fallout as soon as possible.
- Add a few debugging assertions so that people (me) trying to port
code to the new allocator functions don't mess up the caller
requirements
- Relax some overly cautious lock ordering enforcement in the new
allocator code, which means that file allocations will locklessly
scan for the best space they can get before backing off to the
traditional lock-and-really-get-it behavior
- Add tracepoints to make it easier to trace the xfs allocator
behavior
- Actually test the dir/xattr hash algorithm to make sure it produces
consistent results across all the platforms XFS supports"
* tag 'xfs-6.3-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: test dir/attr hash when loading module
xfs: add tracepoints for each of the externally visible allocators
xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags
xfs: try to idiot-proof the allocators
Pull hwmon fixes from Guenter Roeck:
- it87: Fix voltage scaling for chips with 10.9mV ADCs
- xgene: Fix ioremap and memremap leak
- peci/cputemp: Fix miscalculated DTS temperature for SKX
- hwmon core: fix potential sensor registration failure with thermal
subsystem if of_node is missing
* tag 'hwmon-for-v6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon (it87): Fix voltage scaling for chips with 10.9mV ADCs
hwmon: (xgene) Fix ioremap and memremap leak
hwmon: fix potential sensor registration fail if of_node is missing
hwmon: (peci/cputemp) Fix miscalculated DTS for SKX
Pull misc fixes from Andrew Morton:
"21 hotfixes, 8 of which are cc:stable. 11 are for MM, the remainder
are for other subsystems"
* tag 'mm-hotfixes-stable-2023-03-24-17-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
mm: mmap: remove newline at the end of the trace
mailmap: add entries for Richard Leitner
kcsan: avoid passing -g for test
kfence: avoid passing -g for test
mm: kfence: fix using kfence_metadata without initialization in show_object()
lib: dhry: fix unstable smp_processor_id(_) usage
mailmap: add entry for Enric Balletbo i Serra
mailmap: map Sai Prakash Ranjan's old address to his current one
mailmap: map Rajendra Nayak's old address to his current one
Revert "kasan: drop skip_kasan_poison variable in free_pages_prepare"
mailmap: add entry for Tobias Klauser
kasan, powerpc: don't rename memintrinsics if compiler adds prefixes
mm/ksm: fix race with VMA iteration and mm_struct teardown
kselftest: vm: fix unused variable warning
mm: fix error handling for map_deny_write_exec
mm: deduplicate error handling for map_deny_write_exec
checksyscalls: ignore fstat to silence build warning on LoongArch
nilfs2: fix kernel-infoleak in nilfs_ioctl_wrap_copy()
test_maple_tree: add more testing for mas_empty_area()
maple_tree: fix mas_skip_node() end slot detection
...
Pull ksmbd server fixes from Steve French:
- return less confusing messages on unsupported dialects
(STATUS_NOT_SUPPORTED instead of I/O error)
- fix for overly frequent inactive session termination
- fix refcount leak
- fix bounds check problems found by static checkers
- fix to advertise named stream support correctly
- Fix AES256 signing bug when connected to from MacOS
* tag '6.3-rc3-ksmbd-smb3-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: return unsupported error on smb1 mount
ksmbd: return STATUS_NOT_SUPPORTED on unsupported smb2.0 dialect
ksmbd: don't terminate inactive sessions after a few seconds
ksmbd: fix possible refcount leak in smb2_open()
ksmbd: add low bound validation to FSCTL_QUERY_ALLOCATED_RANGES
ksmbd: add low bound validation to FSCTL_SET_ZERO_DATA
ksmbd: set FILE_NAMED_STREAMS attribute in FS_ATTRIBUTE_INFORMATION
ksmbd: fix wrong signingkey creation when encryption is AES256
Pull ARM SoC fixes from Arnd Bergmann:
"As usual, most of the bug fixes address issues in the devicetree
files, and out of these, most are for the Qualcomm and NXP platforms,
including:
- A missing 'reserved-memory' property on LG G Watch R that is needed
to prevent clashing with firmware
- Annotations for cache coherency on multiple machines
- Corrections for pinctrl, regulator, clock, iommu and power domain
properties for i.MX and Qualcomm to correctly reflect the hardware
settings
- Firmware file names on multiple machines SA8540P Ride board
- An incompatible change to the qcom vadc driver requires adding
individual labels
- Fix EQoS PHY reset GPIO by dropping the deprecated/wrong property
and switch to the new bindings.
- A fix for PCI bus address translation Tegra194 and Tegra234.
There are also a couple of device driver fixes, addressing:
- A race condition in the amdtee driver
- A performance regression in the Qualcomm 'llcc' driver
- An unitialized variable use NXP i.MX 'weim' driver
- Error handling issues in Qualcomm 'rmtfs', and 'scm' drivers and
the Arm scmi firmware driver"
* tag 'arm-fixes-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (48 commits)
arm64: dts: qcom: sc8280xp-x13s: mark bob regulator as always-on
arm64: dts: qcom: sc8280xp-x13s: mark s12b regulator as always-on
arm64: dts: qcom: sc8280xp-x13s: mark s10b regulator as always-on
arm64: dts: qcom: sc8280xp-x13s: mark s11b regulator as always-on
arm64: dts: imx93: add missing #address-cells and #size-cells to i2c nodes
bus: imx-weim: fix branch condition evaluates to a garbage value
arm64: dts: imx8mn: specify #sound-dai-cells for SAI nodes
ARM: dts: imx6sl: tolino-shine2hd: fix usbotg1 pinctrl
ARM: dts: imx6sll: e60k02: fix usbotg1 pinctrl
ARM: dts: imx6sll: e70k02: fix usbotg1 pinctrl
arm64: dts: imx93: Fix eqos properties
arm64: dts: imx8mp: Fix LCDIF2 node clock order
arm64: dts: imx8mm-nitrogen-r2: fix WM8960 clock name
arm64: dts: imx8dxl-evk: Fix eqos phy reset gpio
firmware: qcom: scm: fix bogus irq error at probe
arm64: dts: qcom: sm8550: Mark UFS controller as cache coherent
arm64: dts: qcom: sa8540p-ride: correct name of remoteproc_nsp0 firmware
arm64: dts: qcom: sm8450: Mark UFS controller as cache coherent
arm64: dts: qcom: sm8350: Mark UFS controller as cache coherent
arm64: dts: qcom: sm8550: fix LPASS pinctrl slew base address
...
Pull power supply fixes from Sebastian Reichel:
- rk817: Fix compiler warning
- cros_usbpd-charger: Fix excessive error printing
- axp288_fuel_gauge: handle platform_get_irq error
- bq24190 and da9150: Fix race condition in remove path
* tag 'for-v6.3-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
power: supply: da9150: Fix use after free bug in da9150_charger_remove due to race condition
power: supply: bq24190: Fix use after free bug in bq24190_remove due to race condition
power: supply: axp288_fuel_gauge: Added check for negative values
power: supply: cros_usbpd: reclassify "default case!" as debug
power: supply: rk817: Fix unsigned comparison with less than zero
Pull drm fixes from Daniel Vetter:
- usual pile of fixes for amdgpu & i915
- probe error handling fixes for meson, lt8912b bridge
- the host1x patch from Arnd
- panel-orientation fix for Lenovo Book X90F
* tag 'drm-fixes-2023-03-24' of git://anongit.freedesktop.org/drm/drm: (23 commits)
gpu: host1x: fix uninitialized variable use
drm/amd/display: Set dcn32 caps.seamless_odm
drm/amd/display: fix wrong index used in dccg32_set_dpstreamclk
drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi
drm/amd/display: remove outdated 8bpc comments
drm/amdgpu/gfx: set cg flags to enter/exit safe mode
drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
drm/amdgpu: add mes resume when do gfx post soft reset
drm/amdgpu: skip ASIC reset for APUs when go to S4
drm/amdgpu: reposition the gpu reset checking for reuse
drm/bridge: lt8912b: return EPROBE_DEFER if bridge is not found
drm/meson: fix missing component unbind on bind errors
drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F
Revert "drm/i915/hwmon: Enable PL1 power limit"
drm/i915: Update vblank timestamping stuff on seamless M/N change
drm/i915: Fix format for perf_limit_reasons
drm/i915/gt: perform uc late init after probe error injection
drm/i915/active: Fix missing debug object activation
drm/i915/guc: Fix missing ecodes
drm/i915/mtl: Disable MC6 for MTL A step
...
Pull device mapper fixes from Mike Snitzer:
- Fix DM thin to work as a swap device by using 'limit_swap_bios' DM
target flag (initially added to allow swap to dm-crypt) to throttle
the amount of outstanding swap bios.
- Fix DM crypt soft lockup warnings by calling cond_resched() from the
cpu intensive loop in dmcrypt_write().
- Fix DM crypt to not access an uninitialized tasklet. This fix allows
for consistent handling of IO completion, by _not_ needlessly punting
to a workqueue when tasklets are not needed.
- Fix DM core's alloc_dev() initialization for DM stats to check for
and propagate alloc_percpu() failure.
* tag 'for-6.3/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm stats: check for and propagate alloc_percpu failure
dm crypt: avoid accessing uninitialized tasklet
dm crypt: add cond_resched() to dmcrypt_write()
dm thin: fix deadlock when swapping to thin device
Pull block fixes from Jens Axboe:
- NVMe pull request via Christoph:
- Send Identify with CNS 06h only to I/O controllers (Martin
George)
- Fix nvme_tcp_term_pdu to match spec (Caleb Sander)
- Pass in issue_flags for uring_cmd, so the end_io handlers don't need
to assume what the right context is (me)
- Fix for ublk, marking it as LIVE before adding it to avoid races on
the initial IO (Ming)
* tag 'block-6.3-2023-03-24' of git://git.kernel.dk/linux:
nvme-tcp: fix nvme_tcp_term_pdu to match spec
nvme: send Identify with CNS 06h only to I/O controllers
block/io_uring: pass in issue_flags for uring_cmd task_work handling
block: ublk_drv: mark device as LIVE before adding disk
Pull io_uring fixes from Jens Axboe:
- Fix an issue with repeated -ECONNREFUSED on a socket (me)
- Fix a NULL pointer deference due to a stale lookup cache for
allocating direct descriptors (Savino)
* tag 'io_uring-6.3-2023-03-24' of git://git.kernel.dk/linux:
io_uring/rsrc: fix null-ptr-deref in io_file_bitmap_get()
io_uring/net: avoid sending -ECONNABORTED on repeated connection requests
Pull thermal control fixes from Rafael Wysocki:
"These address two recent regressions related to thermal control.
Specifics:
- Restore the thermal core behavior regarding zero-temperature trip
points to avoid a driver regression (Ido Schimmel)
- Fix a recent regression in the ACPI processor driver preventing it
from changing the number of CPU cooling device states exposed via
sysfs after the given CPU cooling device has been registered
(Rafael Wysocki)"
* tag 'thermal-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: core: Restore behavior regarding invalid trip points
ACPI: processor: thermal: Update CPU cooling devices on cpufreq policy changes
thermal: core: Introduce thermal_cooling_device_update()
thermal: core: Introduce thermal_cooling_device_present()
ACPI: processor: Reorder acpi_processor_driver_init()
Pull ACPI fixes from Rafael Wysocki:
"These add new ACPI IRQ override and backlight detection quirks.
Specifics:
- Add backlight=native DMI quirk for Acer Aspire 3830TG to the ACPI
backlight driver (Hans de Goede)
- Add an ACPI IRQ override quirk for Medion S17413 (Aymeric Wibo)"
* tag 'acpi-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: resource: Add Medion S17413 to IRQ override quirk
ACPI: video: Add backlight=native DMI quirk for Acer Aspire 3830TG
At some point in between sending this patch to the list and merging it
into for-next, the tracepoints got all mixed up because I've
over-reliant on automated tools not sucking. The end result is that the
tracepoints are all wrong, so fix them.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
If user does forced unmount ("umount -f") while files are still open
on the share (as was seen in a Kubernetes example running on SMB3.1.1
mount) then we were marking the share as "TID_EXITING" in umount_begin()
which caused all subsequent operations (except write) to fail ... but
unfortunately when umount_begin() is called we do not know yet that
there are open files or active references on the share that would prevent
unmount from succeeding. Kubernetes had example when they were doing
umount -f when files were open which caused the share to become
unusable until the files were closed (and the umount retried).
Fix this so that TID_EXITING is not set until we are about to send
the tree disconnect (not at the beginning of forced umounts in
umount_begin) so that if "umount -f" fails (due to open files or
references) the mount is still usable.
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Get rid of any prefix paths in @path before lookup_positive_unlocked()
as it will call ->lookup() which already adds those prefix paths
through build_path_from_dentry().
This has caused a performance regression when mounting shares with a
prefix path where readdir(2) would end up retrying several times to
open bad directory names that contained duplicate prefix paths.
Fix this by skipping any prefix paths in @path before calling
lookup_positive_unlocked().
Fixes: e4029e0726 ("cifs: find and use the dentry for cached non-root directories also")
Cc: stable@vger.kernel.org # 6.1+
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Performance tests with large number of threads noted that the change
of the default closetimeo (deferred close timeout between when
close is done by application and when client has to send the close
to the server), to 5 seconds from 1 second, significantly degraded
perf in some cases like this (in the filebench example reported,
the stats show close requests on the wire taking twice as long,
and 50% regression in filebench perf). This is stil configurable
via mount parm closetimeo, but to be safe, decrease default back
to its previous value of 1 second.
Reported-by: Yin Fengwei <fengwei.yin@intel.com>
Reported-by: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/lkml/997614df-10d4-af53-9571-edec36b0e2f3@intel.com/
Fixes: 5efdd9122e ("smb3: allow deferred close timeout to be configurable")
Cc: stable@vger.kernel.org # 6.0+
Tested-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Make sure to unload_nls() @nls_codepage if we no longer need it.
Fixes: bc962159e8 ("cifs: avoid race conditions with parallel reconnects")
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull slab fix from Vlastimil Babka:
"A single build fix for a corner case configuration that is apparently
possible to achieve on some arches, from Geert"
* tag 'slab-fix-for-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab: Fix undefined init_cache_node_node() for NUMA and !SMP
Pull EFI fixes from Ard Biesheuvel:
- Set the NX compat flag for arm64 and zboot, to ensure compatibility
with EFI firmware that complies with tightening requirements imposed
across the ecosystem.
- Improve identification of Ampere Altra systems based on SMBIOS data.
- Fix some issues related to the EFI framebuffer that were introduced
as a result from some refactoring related to zboot and the merge with
sysfb.
- Makefile tweak to avoid rebuilding vmlinuz unnecessarily.
- Fix efi_random_alloc() return value on out of memory condition.
* tag 'efi-fixes-for-v6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/libstub: randomalloc: Return EFI_OUT_OF_RESOURCES on failure
efi/libstub: Use relocated version of kernel's struct screen_info
efi/libstub: zboot: Add compressed image to make targets
efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L
efi: sysfb_efi: Fix DMI quirks not working for simpledrm
efi/libstub: smbios: Drop unused 'recsize' parameter
arm64: efi: Use SMBIOS processor version to key off Ampere quirk
efi/libstub: smbios: Use length member instead of record struct size
efi: earlycon: Reprobe after parsing config tables
arm64: efi: Set NX compat flag in PE/COFF header
efi/libstub: arm64: Remap relocated image with strict permissions
efi/libstub: zboot: Mark zboot EFI application as NX compatible
Qualcomm driver fixes for v6.3
Support for the secure world interrupting the SCM driver drive the wait
queue mechanism was recently introduced, but most platforms doesn't have
this mechanism and an error should not be printed in the log.
The rmtfs_mem driver recently gained support for assigning the region to
multiple VMIDs, but accidentally removed the support for running without
assignment. A couple of changes are introducd to correct this.
The SC8280XP LLCC slice configuration is wrong, reslting in incorrect
configuration of the hardware. The table is corrected, based on the
datasheet.
* tag 'qcom-driver-fixes-for-6.3' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
firmware: qcom: scm: fix bogus irq error at probe
soc: qcom: rmtfs: handle optional qcom,vmid correctly
soc: qcom: rmtfs: fix error handling reading qcom,vmid
soc: qcom: llcc: Fix slice configuration values for SC8280XP
Link: https://lore.kernel.org/r/20230323142505.1086072-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Qualcomm ARM64 Devicetree fixes for v6.3
This correct SIM card selection on the two newly introduced
MSM8916-based USB modems.
The firmware-name for the first CDSP is corrected on the SA8540P Ride
board.
The PCIe controller in SC7280 is marked cache-coherent, which resolves
seen data corruption issues.
Labels are added to the vadc channel nodes on SC8280XP, as the Linux
driver was updated to not include the unit address when generating
device names and collisions thereby prevented registration of the
channels. Audio clocks and routing is corrected and a few regulators are
marked always-on for the Lenovo Thinkpad X13s, as their clients are not
fully described at this point.
SPI5 was accidentally enabled by default on SM6115, and is disabled
again.
CDSP on SM6375 is provided its power-domains, to appropriately vote for
during power up for the DSP.
The iommu mask for the PCIe controllers in SM8150 is updated, to match
what the hypervisor expects.
Th Venus firmware path is corrected on Xiaomi Mi Pad 5 Pro.
The UFS controller is marked cache coherent on SM8350 and SM8450.
The clocks for the second WSA macro on SM8450 is corrected, and given
its own clocks.
The bias-pull-up value for I2C pins are corrected on SM8550, to trigger
the selection of the strong pull. CPU compatibles and the base address
of the LPASS TLMM block are corrected.
* tag 'qcom-arm64-fixes-for-6.3' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (23 commits)
arm64: dts: qcom: sc8280xp-x13s: mark bob regulator as always-on
arm64: dts: qcom: sc8280xp-x13s: mark s12b regulator as always-on
arm64: dts: qcom: sc8280xp-x13s: mark s10b regulator as always-on
arm64: dts: qcom: sc8280xp-x13s: mark s11b regulator as always-on
arm64: dts: qcom: sm8550: Mark UFS controller as cache coherent
arm64: dts: qcom: sa8540p-ride: correct name of remoteproc_nsp0 firmware
arm64: dts: qcom: sm8450: Mark UFS controller as cache coherent
arm64: dts: qcom: sm8350: Mark UFS controller as cache coherent
arm64: dts: qcom: sm8550: fix LPASS pinctrl slew base address
arm64: dts: qcom: sc8280xp-x13s: fix va dmic dai links and routing
arm64: dts: qcom: sc8280xp-x13s: fix dmic sample rate
arm64: dts: qcom: sc8280xp: fix lpass tx macro clocks
arm64: dts: qcom: sc8280xp: fix rx frame shapping info
arm64: dts: qcom: sm8450: correct WSA2 assigned clocks
arm64: dts: qcom: sc7280: Mark PCIe controller as cache coherent
arm64: dts: qcom: msm8916-ufi: Fix sim card selection pinctrl
arm64: dts: qcom: sm8250-xiaomi-elish: Correct venus firmware path
arm64: dts: qcom: sm8550: Use correct CPU compatibles
arm64: dts: qcom: sm8550: Add bias pull up value to tlmm i2c data clk states
arm64: dts: qcom: sm6375: Add missing power-domain-named to CDSP
...
Link: https://lore.kernel.org/r/20230323141642.1085684-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull RISC-V fixes from Palmer Dabbelt:
- A fix to match the CSR ASID masking rules when passing ASIDs to
firmware
- Force GCC to use ISA 2.2, to avoid a host of compatibily issues
between toolchains
* tag 'riscv-for-linus-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Handle zicsr/zifencei issues between clang and binutils
riscv: mm: Fix incorrect ASID argument when flushing TLB
Pull xen fixes from Juergen Gross:
- fix build warning
- avoid concurrent accesses to the Xen PV console ring page
* tag 'for-linus-6.3-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/PVH: avoid 32-bit build warning when obtaining VGA console info
hvc/xen: prevent concurrent accesses to the shared ring
Pull chrome platform fix from Tzung-Bi Shih:
"Fix a kernel data leak vulnerability"
* tag 'tag-chrome-platform-fixes-for-v6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
platform/chrome: cros_ec_chardev: fix kernel data leak from ioctl
Pull i2c fixes from Wolfram Sang:
"A set of regular driver fixes"
* tag 'i2c-for-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: xgene-slimpro: Fix out-of-bounds bug in xgene_slimpro_i2c_xfer()
i2c: hisi: Only use the completion interrupt to finish the transfer
i2c: hisi: Avoid redundant interrupts
i2c: mxs: ensure that DMA buffers are safe for DMA
i2c: imx-lpi2c: check only for enabled interrupt flags
i2c: imx-lpi2c: clean rx/tx buffers upon new message
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, wifi and bluetooth.
Current release - regressions:
- wifi: mt76: mt7915: add back 160MHz channel width support for
MT7915
- libbpf: revert poisoning of strlcpy, it broke uClibc-ng
Current release - new code bugs:
- bpf: improve the coverage of the "allow reads from uninit stack"
feature to fix verification complexity problems
- eth: am65-cpts: reset PPS genf adj settings on enable
Previous releases - regressions:
- wifi: mac80211: serialize ieee80211_handle_wake_tx_queue()
- wifi: mt76: do not run mt76_unregister_device() on unregistered hw,
fix null-deref
- Bluetooth: btqcomsmd: fix command timeout after setting BD address
- eth: igb: revert rtnl_lock() that causes a deadlock
- dsa: mscc: ocelot: fix device specific statistics
Previous releases - always broken:
- xsk: add missing overflow check in xdp_umem_reg()
- wifi: mac80211:
- fix QoS on mesh interfaces
- fix mesh path discovery based on unicast packets
- Bluetooth:
- ISO: fix timestamped HCI ISO data packet parsing
- remove "Power-on" check from Mesh feature
- usbnet: more fixes to drivers trusting packet length
- wifi: iwlwifi: mvm: fix mvmtxq->stopped handling
- Bluetooth: btintel: iterate only bluetooth device ACPI entries
- eth: iavf: fix inverted Rx hash condition leading to disabled hash
- eth: igc: fix the validation logic for taprio's gate list
- dsa: tag_brcm: legacy: fix daisy-chained switches
Misc:
- bpf: adjust insufficient default bpf_jit_limit to account for
growth of BPF use over the last 5 years
- xdp: bpf_xdp_metadata() use EOPNOTSUPP as unique errno indicating
no driver support"
* tag 'net-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
Bluetooth: HCI: Fix global-out-of-bounds
Bluetooth: mgmt: Fix MGMT add advmon with RSSI command
Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work
Bluetooth: L2CAP: Fix responding with wrong PDU type
Bluetooth: btqcomsmd: Fix command timeout after setting BD address
Bluetooth: btinel: Check ACPI handle for NULL before accessing
net: mdio: thunder: Add missing fwnode_handle_put()
net: dsa: mt7530: move setting ssc_delta to PHY_INTERFACE_MODE_TRGMII case
net: dsa: mt7530: move lowering TRGMII driving to mt7530_setup()
net: dsa: mt7530: move enabling disabling core clock to mt7530_pll_setup()
net: asix: fix modprobe "sysfs: cannot create duplicate filename"
gve: Cache link_speed value from device
tools: ynl: Fix genlmsg header encoding formats
net: enetc: fix aggregate RMON counters not showing the ranges
Bluetooth: Remove "Power-on" check from Mesh feature
Bluetooth: Fix race condition in hci_cmd_sync_clear
Bluetooth: btintel: Iterate only bluetooth device ACPI entries
Bluetooth: ISO: fix timestamped HCI ISO data packet parsing
Bluetooth: btusb: Remove detection of ISO packets over bulk
Bluetooth: hci_core: Detect if an ACL packet is in fact an ISO packet
...
Prior to commit 7ac2ff8bb3, when we loaded the incore perag structure
with information from the AGF header, we would set or clear the
pagf_agfl_reset field based on whether or not the AGFL list was
misaligned within the block. IOWs, it's an incore state bit that's
supposed to cache something in the ondisk metadata. Therefore, the code
still needs to support clearing the incore bit if (somehow) the AGFL
were to correct itself.
It turns out that xfs_repair does exactly this -- phase 4 loads the AGF
to scan the rmapbt for corrupt records, which can set NEEDS_AGFL_RESET.
The scan unsets AGF_INIT but doesn't unset NEEDS_AGFL_RESET. Phase 5
totally rewrites the AGFL and fixes the alignment problem, didn't clear
NEEDS_AGFL_RESET historically, and reloads the perag state to fix the
freelist. This results in the AGFL being reset based on stale data,
which then causes the new AGFL blocks to be leaked. A subsequent
xfs_repair -n then complains about the leaks.
One could argue that phase 5 ought to clear this bit directly when it
reloads the perag AGF data after rewriting the AGFL, but libxfs used to
handle this for us, so it should go back to doing that.
Found by fuzzing flfirst = ones in xfs/352.
Fixes: 7ac2ff8bb3 ("xfs: perags need atomic operational state")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
In xfs_buffered_write_iomap_begin, @icur is the iext cursor for the data
fork and @ccur is the cursor for the cow fork. Pass in whichever cursor
corresponds to allocfork, because otherwise the xfs_iext_prev_extent
call can use the data fork cursor to walk off the end of the cow fork
structure. Best case it returns the wrong results, worst case it does
this:
stack segment: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 3141909 Comm: fsstress Tainted: G W 6.3.0-rc2-xfsx #6.3.0-rc2 7bf5cc2e98997627cae5c930d890aba3aeec65dd
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20171121_152543-x86-ol7-builder-01.us.oracle.com-4.el7.1 04/01/2014
RIP: 0010:xfs_iext_prev+0x71/0x150 [xfs]
RSP: 0018:ffffc90002233aa8 EFLAGS: 00010297
RAX: 000000000000000f RBX: 000000000000000e RCX: 000000000000000c
RDX: 0000000000000002 RSI: 000000000000000e RDI: ffff8883d0019ba0
RBP: 989642409af8a7a7 R08: ffffea0000000001 R09: 0000000000000002
R10: 0000000000000000 R11: 000000000000000c R12: ffffc90002233b00
R13: ffff8883d0019ba0 R14: 989642409af8a6bf R15: 000ffffffffe0000
FS: 00007fdf8115f740(0000) GS:ffff88843fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdf8115e000 CR3: 0000000357256000 CR4: 00000000003506e0
Call Trace:
<TASK>
xfs_iomap_prealloc_size.constprop.0.isra.0+0x1a6/0x410 [xfs 619a268fb2406d68bd34e007a816b27e70abc22c]
xfs_buffered_write_iomap_begin+0xa87/0xc60 [xfs 619a268fb2406d68bd34e007a816b27e70abc22c]
iomap_iter+0x132/0x2f0
iomap_file_buffered_write+0x92/0x330
xfs_file_buffered_write+0xb1/0x330 [xfs 619a268fb2406d68bd34e007a816b27e70abc22c]
vfs_write+0x2eb/0x410
ksys_write+0x65/0xe0
do_syscall_64+0x2b/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Found by xfs/538 in alwayscow mode, but this doesn't seem particular to
that test.
Fixes: 590b16516e ("xfs: refactor xfs_iomap_prealloc_size")
Actually-Fixes: 66ae56a53f ("xfs: introduce an always_cow mode")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Pull btrfs fixes from David Sterba:
"A few more fixes, the zoned accounting fix is spread across a few
patches, preparatory and the actual fixes:
- zoned mode:
- fix accounting of unusable zone space
- fix zone activation condition for DUP profile
- preparatory patches
- improved error handling of missing chunks
- fix compiler warning"
* tag 'for-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zoned: drop space_info->active_total_bytes
btrfs: zoned: count fresh BG region as zone unusable
btrfs: use temporary variable for space_info in btrfs_update_block_group
btrfs: rename BTRFS_FS_NO_OVERCOMMIT to BTRFS_FS_ACTIVE_ZONE_TRACKING
btrfs: zoned: fix btrfs_can_activate_zone() to support DUP profile
btrfs: fix compiler warning on SPARC/PA-RISC handling fscrypt_setup_filename
btrfs: handle missing chunk mapping more gracefully
Pull SCSI fixes from James Bottomley:
"Four small fixes, three in drivers.
The core fix adds a UFS device to an existing quirk to avoid a huge
delay on boot"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_dh_alua: Fix memleak for 'qdata' in alua_activate()
scsi: qla2xxx: Synchronize the IOCB count to be in order
scsi: qla2xxx: Perform lockless command completion in abort path
scsi: core: Add BLIST_SKIP_VPD_PAGES for SKhynix H28U74301AMR
When multiple processes/channels do reconnects in parallel
we used to return success immediately
negotiate/session-setup/tree-connect, causing race conditions
between processes that enter the function in parallel.
This caused several errors related to session not found to
show up during parallel reconnects.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
We do not dump the file path for smb3_open_enter ftrace
calls, which is a severe handicap while debugging
using ftrace evens. This change adds that info.
Unfortunately, we're not updating the path in open params
in many places; which I had to do as a part of this change.
SMB2_open gets path in utf16 format, but it's easier of
path is supplied as char pointer in oparms.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd disconnect connection when mounting with vers=smb1.
ksmbd should send smb1 negotiate response to client for correct
unsupported error return. This patch add needed SMB1 macros and fill
NegProt part of the response for smb1 negotiate response.
Cc: stable@vger.kernel.org
Reported-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Nathan reported that when building with GNU as and a version of clang that
defaults to DWARF5, the assembler will complain with:
Error: non-constant .uleb128 is not supported
This is because `-g` defaults to the compiler debug info default. If the
assembler does not support some of the directives used, the above errors
occur. To fix, remove the explicit passing of `-g`.
All the test wants is that stack traces print valid function names, and
debug info is not required for that. (I currently cannot recall why I
added the explicit `-g`.)
Link: https://lkml.kernel.org/r/20230316224705.709984-2-elver@google.com
Fixes: 1fe84fd4a4 ("kcsan: Add test suite")
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Nathan reported that when building with GNU as and a version of clang that
defaults to DWARF5:
$ make -skj"$(nproc)" ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- \
LLVM=1 LLVM_IAS=0 O=build \
mrproper allmodconfig mm/kfence/kfence_test.o
/tmp/kfence_test-08a0a0.s: Assembler messages:
/tmp/kfence_test-08a0a0.s:14627: Error: non-constant .uleb128 is not supported
/tmp/kfence_test-08a0a0.s:14628: Error: non-constant .uleb128 is not supported
/tmp/kfence_test-08a0a0.s:14632: Error: non-constant .uleb128 is not supported
/tmp/kfence_test-08a0a0.s:14633: Error: non-constant .uleb128 is not supported
/tmp/kfence_test-08a0a0.s:14639: Error: non-constant .uleb128 is not supported
...
This is because `-g` defaults to the compiler debug info default. If the
assembler does not support some of the directives used, the above errors
occur. To fix, remove the explicit passing of `-g`.
All the test wants is that stack traces print valid function names, and
debug info is not required for that. (I currently cannot recall why I
added the explicit `-g`.)
Link: https://lkml.kernel.org/r/20230316224705.709984-1-elver@google.com
Fixes: bc8fbc5f30 ("kfence: add test suite")
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This reverts commit 487a32ec24.
should_skip_kasan_poison() reads the PG_skip_kasan_poison flag from
page->flags. However, this line of code in free_pages_prepare():
page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
clears most of page->flags, including PG_skip_kasan_poison, before calling
should_skip_kasan_poison(), which meant that it would never return true as
a result of the page flag being set. Therefore, fix the code to call
should_skip_kasan_poison() before clearing the flags, as we were doing
before the reverted patch.
This fixes a measurable performance regression introduced in the reverted
commit, where munmap() takes longer than intended if HW tags KASAN is
supported and enabled at runtime. Without this patch, we see a
single-digit percentage performance regression in a particular
mmap()-heavy benchmark when enabling HW tags KASAN, and with the patch,
there is no statistically significant performance impact when enabling HW
tags KASAN.
Link: https://lkml.kernel.org/r/20230310042914.3805818-2-pcc@google.com
Fixes: 487a32ec24 ("kasan: drop skip_kasan_poison variable in free_pages_prepare")
Link: https://linux-review.googlesource.com/id/Ic4f13affeebd20548758438bb9ed9ca40e312b79
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org> [6.1]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The ioctl helper function nilfs_ioctl_wrap_copy(), which exchanges a
metadata array to/from user space, may copy uninitialized buffer regions
to user space memory for read-only ioctl commands NILFS_IOCTL_GET_SUINFO
and NILFS_IOCTL_GET_CPINFO.
This can occur when the element size of the user space metadata given by
the v_size member of the argument nilfs_argv structure is larger than the
size of the metadata element (nilfs_suinfo structure or nilfs_cpinfo
structure) on the file system side.
KMSAN-enabled kernels detect this issue as follows:
BUG: KMSAN: kernel-infoleak in instrument_copy_to_user
include/linux/instrumented.h:121 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_user+0xc0/0x100 lib/usercopy.c:33
instrument_copy_to_user include/linux/instrumented.h:121 [inline]
_copy_to_user+0xc0/0x100 lib/usercopy.c:33
copy_to_user include/linux/uaccess.h:169 [inline]
nilfs_ioctl_wrap_copy+0x6fa/0xc10 fs/nilfs2/ioctl.c:99
nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline]
nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290
nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343
__do_compat_sys_ioctl fs/ioctl.c:968 [inline]
__se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910
__ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
entry_SYSENTER_compat_after_hwframe+0x70/0x82
Uninit was created at:
__alloc_pages+0x9f6/0xe90 mm/page_alloc.c:5572
alloc_pages+0xab0/0xd80 mm/mempolicy.c:2287
__get_free_pages+0x34/0xc0 mm/page_alloc.c:5599
nilfs_ioctl_wrap_copy+0x223/0xc10 fs/nilfs2/ioctl.c:74
nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline]
nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290
nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343
__do_compat_sys_ioctl fs/ioctl.c:968 [inline]
__se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910
__ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
entry_SYSENTER_compat_after_hwframe+0x70/0x82
Bytes 16-127 of 3968 are uninitialized
...
This eliminates the leak issue by initializing the page allocated as
buffer using get_zeroed_page().
Link: https://lkml.kernel.org/r/20230307085548.6290-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+132fdd2f1e1805fdc591@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/000000000000a5bd2d05f63f04ae@google.com
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Fix mas_skip_node() for mas_empty_area()", v2.
mas_empty_area() was incorrectly returning an error when there was room.
The issue was tracked down to mas_skip_node() using the incorrect
end-of-slot count. Instead of using the nodes hard limit, the limit of
data should be used.
mas_skip_node() was also setting the min and max to that of the child
node, which was unnecessary. Within these limits being set, there was
also a bug that corrupted the maple state's max if the offset was set to
the maximum node pivot. The bug was without consequence unless there was
a sufficient gap in the next child node which would cause an error to be
returned.
This patch set fixes these errors by removing the limit setting from
mas_skip_node() and uses the mas_data_end() for slot limits, and adds
tests for all failures discovered.
This patch (of 2):
mas_skip_node() is used to move the maple state to the node with a higher
limit. It does this by walking up the tree and increasing the slot count.
Since slot count may not be able to be increased, it may need to walk up
multiple times to find room to walk right to a higher limit node. The
limit of slots that was being used was the node limit and not the last
location of data in the node. This would cause the maple state to be
shifted outside actual data and enter an error state, thus returning
-EBUSY.
The result of the incorrect error state means that mas_awalk() would
return an error instead of finding the allocation space.
The fix is to use mas_data_end() in mas_skip_node() to detect the nodes
data end point and continue walking the tree up until it is safe to move
to a node with a higher limit.
The walk up the tree also sets the maple state limits so remove the buggy
code from mas_skip_node(). Setting the limits had the unfortunate side
effect of triggering another bug if the parent node was full and the there
was no suitable gap in the second last child, but room in the next child.
mas_skip_node() may also be passed a maple state in an error state from
mas_anode_descend() when no allocations are available. Return on such an
error state immediately.
Link: https://lkml.kernel.org/r/20230307180247.2220303-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20230307180247.2220303-2-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Snild Dolkow <snild@sony.com>
Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
Tested-by: Snild Dolkow <snild@sony.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Gao Xiang has reported that the page allocator complains about high order
__GFP_NOFAIL request coming from the vmalloc core:
__alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549
alloc_pages+0x1aa/0x270 mm/mempolicy.c:2286
vm_area_alloc_pages mm/vmalloc.c:2989 [inline]
__vmalloc_area_node mm/vmalloc.c:3057 [inline]
__vmalloc_node_range+0x978/0x13c0 mm/vmalloc.c:3227
kvmalloc_node+0x156/0x1a0 mm/util.c:606
kvmalloc include/linux/slab.h:737 [inline]
kvmalloc_array include/linux/slab.h:755 [inline]
kvcalloc include/linux/slab.h:760 [inline]
it seems that I have completely missed high order allocation backing
vmalloc areas case when implementing __GFP_NOFAIL support. This means
that [k]vmalloc at al. can allocate higher order allocations with
__GFP_NOFAIL which can trigger OOM killer for non-costly orders easily or
cause a lot of reclaim/compaction activity if those requests cannot be
satisfied.
Fix the issue by falling back to zero order allocations for __GFP_NOFAIL
requests if the high order request fails.
Link: https://lkml.kernel.org/r/ZAXynvdNqcI0f6Us@dhcp22.suse.cz
Fixes: 9376130c39 ("mm/vmalloc: add support for __GFP_NOFAIL")
Reported-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lkml.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Daniel Borkmann says:
====================
pull-request: bpf 2023-03-23
We've added 8 non-merge commits during the last 13 day(s) which contain
a total of 21 files changed, 238 insertions(+), 161 deletions(-).
The main changes are:
1) Fix verification issues in some BPF programs due to their stack usage
patterns, from Eduard Zingerman.
2) Fix to add missing overflow checks in xdp_umem_reg and return an error
in such case, from Kal Conley.
3) Fix and undo poisoning of strlcpy in libbpf given it broke builds for
libcs which provided the former like uClibc-ng, from Jesus Sanchez-Palencia.
4) Fix insufficient bpf_jit_limit default to avoid users running into hard
to debug seccomp BPF errors, from Daniel Borkmann.
5) Fix driver return code when they don't support a bpf_xdp_metadata kfunc
to make it unambiguous from other errors, from Jesper Dangaard Brouer.
6) Two BPF selftest fixes to address compilation errors from recent changes
in kernel structures, from Alexei Starovoitov.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
xdp: bpf_xdp_metadata use EOPNOTSUPP for no driver support
bpf: Adjust insufficient default bpf_jit_limit
xsk: Add missing overflow check in xdp_umem_reg
selftests/bpf: Fix progs/test_deny_namespace.c issues.
selftests/bpf: Fix progs/find_vma_fail1.c build error.
libbpf: Revert poisoning of strlcpy
selftests/bpf: Tests for uninitialized stack reads
bpf: Allow reads from uninit stack
====================
Link: https://lore.kernel.org/r/20230323225221.6082-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fix MGMT add advmon with RSSI command
- L2CAP: Fix responding with wrong PDU type
- Fix race condition in hci_cmd_sync_clear
- ISO: Fix timestamped HCI ISO data packet parsing
- HCI: Fix global-out-of-bounds
- hci_sync: Resume adv with no RPA when active scan
* tag 'for-net-2023-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: HCI: Fix global-out-of-bounds
Bluetooth: mgmt: Fix MGMT add advmon with RSSI command
Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work
Bluetooth: L2CAP: Fix responding with wrong PDU type
Bluetooth: btqcomsmd: Fix command timeout after setting BD address
Bluetooth: btinel: Check ACPI handle for NULL before accessing
Bluetooth: Remove "Power-on" check from Mesh feature
Bluetooth: Fix race condition in hci_cmd_sync_clear
Bluetooth: btintel: Iterate only bluetooth device ACPI entries
Bluetooth: ISO: fix timestamped HCI ISO data packet parsing
Bluetooth: btusb: Remove detection of ISO packets over bulk
Bluetooth: hci_core: Detect if an ACL packet is in fact an ISO packet
Bluetooth: hci_sync: Resume adv with no RPA when active scan
====================
Link: https://lore.kernel.org/r/20230323202335.3380841-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kalle Valo says:
====================
wireless fixes for v6.3
Third set of fixes for v6.3. mt76 has two kernel crash fixes and
adding back 160 MHz channel support for mt7915. mac80211 has fixes for
a race in transmit path and two mesh related fixes. iwlwifi also has
fixes for races.
* tag 'wireless-2023-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: mac80211: fix mesh path discovery based on unicast packets
wifi: mac80211: fix qos on mesh interfaces
wifi: iwlwifi: mvm: protect TXQ list manipulation
wifi: iwlwifi: mvm: fix mvmtxq->stopped handling
wifi: mac80211: Serialize ieee80211_handle_wake_tx_queue()
wifi: mwifiex: mark OF related data as maybe unused
wifi: mt76: connac: do not check WED status for non-mmio devices
wifi: mt76: mt7915: add back 160MHz channel width support for MT7915
wifi: mt76: do not run mt76_unregister_device() on unregistered hw
====================
Link: https://lore.kernel.org/r/20230323110332.C4FE4C433D2@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull gfs2 fix from Andreas Gruenbacher:
- Reinstate commit 970343cd49 ("GFS2: free disk inode which is
deleted by remote node -V2") as reverting that commit could cause
gfs2_put_super() to hang.
* tag 'gfs2-v6.3-rc3-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
Reinstate "GFS2: free disk inode which is deleted by remote node -V2"
The MGMT command: MGMT_OP_ADD_ADV_PATTERNS_MONITOR_RSSI uses variable
length argument. This causes host not able to register advmon with rssi.
This patch has been locally tested by adding monitor with rssi via
btmgmt on a kernel 6.1 machine.
Reviewed-by: Archie Pusaka <apusaka@chromium.org>
Fixes: b338d91703 ("Bluetooth: Implement support for Mesh")
Signed-off-by: Howard Chung <howardchung@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
In btsdio_probe, &data->work was bound with btsdio_work.In
btsdio_send_frame, it was started by schedule_work.
If we call btsdio_remove with an unfinished job, there may
be a race condition and cause UAF bug on hdev.
Fixes: ddbaf13e36 ("[Bluetooth] Add generic driver for Bluetooth SDIO devices")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
L2CAP_ECRED_CONN_REQ shall be responded with L2CAP_ECRED_CONN_RSP not
L2CAP_LE_CONN_RSP:
L2CAP LE EATT Server - Reject - run
Listening for connections
New client connection with handle 0x002a
Sending L2CAP Request from client
Client received response code 0x15
Unexpected L2CAP response code (expected 0x18)
L2CAP LE EATT Server - Reject - test failed
> ACL Data RX: Handle 42 flags 0x02 dlen 26
LE L2CAP: Enhanced Credit Connection Request (0x17) ident 1 len 18
PSM: 39 (0x0027)
MTU: 64
MPS: 64
Credits: 5
Source CID: 65
Source CID: 66
Source CID: 67
Source CID: 68
Source CID: 69
< ACL Data TX: Handle 42 flags 0x00 dlen 16
LE L2CAP: LE Connection Response (0x15) ident 1 len 8
invalid size
00 00 00 00 00 00 06 00
L2CAP LE EATT Server - Reject - run
Listening for connections
New client connection with handle 0x002a
Sending L2CAP Request from client
Client received response code 0x18
L2CAP LE EATT Server - Reject - test passed
Fixes: 15f02b9105 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
On most devices using the btqcomsmd driver (e.g. the DragonBoard 410c
and other devices based on the Qualcomm MSM8916/MSM8909/... SoCs)
the Bluetooth firmware seems to become unresponsive for a while after
setting the BD address. On recent kernel versions (at least 5.17+)
this often causes timeouts for subsequent commands, e.g. the HCI reset
sent by the Bluetooth core during initialization:
Bluetooth: hci0: Opcode 0x c03 failed: -110
Unfortunately this behavior does not seem to be documented anywhere.
Experimentation suggests that the minimum necessary delay to avoid
the problem is ~150us. However, to be sure add a sleep for > 1ms
in case it is a bit longer on other firmware versions.
Older kernel versions are likely also affected, although perhaps with
slightly different errors or less probability. Side effects can easily
hide the issue in most cases, e.g. unrelated incoming interrupts that
cause the necessary delay.
Fixes: 1511cc750c ("Bluetooth: Introduce Qualcomm WCNSS SMD based HCI driver")
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
There are two related issues that appear in certain combinations with
clang and GNU binutils.
The first occurs when a version of clang that supports zicsr or zifencei
via '-march=' [1] (i.e, >= 17.x) is used in combination with a version
of GNU binutils that do not recognize zicsr and zifencei in the
'-march=' value (i.e., < 2.36):
riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicsr2p0_zifencei2p0: Invalid or unknown z ISA extension: 'zifencei'
riscv64-linux-gnu-ld: failed to merge target specific data of file fs/efivarfs/file.o
riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicsr2p0_zifencei2p0: Invalid or unknown z ISA extension: 'zifencei'
riscv64-linux-gnu-ld: failed to merge target specific data of file fs/efivarfs/super.o
The second occurs when a version of clang that does not support zicsr or
zifencei via '-march=' (i.e., <= 16.x) is used in combination with a
version of GNU as that defaults to a newer ISA base spec, which requires
specifying zicsr and zifencei in the '-march=' value explicitly (i.e, >=
2.38):
../arch/riscv/kernel/kexec_relocate.S: Assembler messages:
../arch/riscv/kernel/kexec_relocate.S:147: Error: unrecognized opcode `fence.i', extension `zifencei' required
clang-12: error: assembler command failed with exit code 1 (use -v to see invocation)
This is the same issue addressed by commit 6df2a016c0 ("riscv: fix
build with binutils 2.38") (see [2] for additional information) but
older versions of clang miss out on it because the cc-option check
fails:
clang-12: error: invalid arch name 'rv64imac_zicsr_zifencei', unsupported standard user-level extension 'zicsr'
clang-12: error: invalid arch name 'rv64imac_zicsr_zifencei', unsupported standard user-level extension 'zicsr'
To resolve the first issue, only attempt to add zicsr and zifencei to
the march string when using the GNU assembler 2.38 or newer, which is
when the default ISA spec was updated, requiring these extensions to be
specified explicitly. LLVM implements an older version of the base
specification for all currently released versions, so these instructions
are available as part of the 'i' extension. If LLVM's implementation is
updated in the future, a CONFIG_AS_IS_LLVM condition can be added to
CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI.
To resolve the second issue, use version 2.2 of the base ISA spec when
using an older version of clang that does not support zicsr or zifencei
via '-march=', as that is the spec version most compatible with the one
clang/LLVM implements and avoids the need to specify zicsr and zifencei
explicitly due to still being a part of 'i'.
[1]: 22e199e6af
[2]: https://lore.kernel.org/ZAxT7T9Xy1Fo3d5W@aurel32.net/
Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1808
Co-developed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230313-riscv-zicsr-zifencei-fiasco-v1-1-dd1b7840a551@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.3
- send Identify with CNS 06h only to I/O controllers (Martin George)
- fix nvme_tcp_term_pdu to match spec (Caleb Sander)"
* tag 'nvme-6.3-2023-03-23' of git://git.infradead.org/nvme:
nvme-tcp: fix nvme_tcp_term_pdu to match spec
nvme: send Identify with CNS 06h only to I/O controllers
It turns out that reverting commit 970343cd49 ("GFS2: free disk inode
which is deleted by remote node -V2") causes a regression related to
evicting inodes that were unlinked on a different cluster node.
We could also have simply added a call to d_mark_dontcache() to function
gfs2_try_evict(), but the original pre-revert code is better tested and
proven.
This reverts commit 445cb1277e.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
When in dual role mode (dr_mode == USB_DR_MODE_OTG), platform probe
successively basically calls:
- dwc2_gadget_init()
- dwc2_hcd_init()
- dwc2_lowlevel_hw_disable() since recent change [1]
- usb_add_gadget_udc()
The PHYs (and so the clocks it may provide) shouldn't be disabled for all
SoCs, in OTG mode, as the HCD part has been initialized.
On STM32 this creates some weird race condition upon boot, when:
- initially attached as a device, to a HOST
- and there is a gadget script invoked to setup the device part.
Below issue becomes systematic, as long as the gadget script isn't
started by userland: the hardware PHYs (and so the clocks provided by the
PHYs) remains disabled.
It ends up in having an endless interrupt storm, before the watchdog
resets the platform.
[ 16.924163] dwc2 49000000.usb-otg: EPs: 9, dedicated fifos, 952 entries in SPRAM
[ 16.962704] dwc2 49000000.usb-otg: DWC OTG Controller
[ 16.966488] dwc2 49000000.usb-otg: new USB bus registered, assigned bus number 2
[ 16.974051] dwc2 49000000.usb-otg: irq 77, io mem 0x49000000
[ 17.032170] hub 2-0:1.0: USB hub found
[ 17.042299] hub 2-0:1.0: 1 port detected
[ 17.175408] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
[ 17.181741] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
[ 17.189303] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
...
The host part is also not functional, until the gadget part is configured.
The HW may only be disabled for peripheral mode (original init), e.g.
dr_mode == USB_DR_MODE_PERIPHERAL, until the gadget driver initializes.
But when in USB_DR_MODE_OTG, the HW should remain enabled, as the HCD part
is able to run, while the gadget part isn't necessarily configured.
I don't fully get the of purpose the original change, that claims disabling
the hardware is missing. It creates conditions on SOCs using the PHY
initialization to be completely non working in OTG mode. Original
change [1] should be reworked to be platform specific.
[1] https://lore.kernel.org/r/20221206-dwc2-gadget-dual-role-v1-2-36515e1092cd@theobroma-systems.com
Fixes: ade23d7b7e ("usb: dwc2: power on/off phy for peripheral mode in dual-role mode")
Cc: stable <stable@kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Reviewed-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Tested-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Link: https://lore.kernel.org/r/20230315144433.3095859-1-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Each time the platform goes to low power, PM suspend / resume routines
call: __dwc2_lowlevel_hw_enable -> devm_add_action_or_reset().
This adds a new devres each time.
This may also happen at runtime, as dwc2_lowlevel_hw_enable() can be
called from udc_start().
This can be seen with tracing:
- echo 1 > /sys/kernel/debug/tracing/events/dev/devres_log/enable
- go to low power
- cat /sys/kernel/debug/tracing/trace
A new "ADD" entry is found upon each low power cycle:
... devres_log: 49000000.usb-otg ADD 82a13bba devm_action_release (8 bytes)
... devres_log: 49000000.usb-otg ADD 49889daf devm_action_release (8 bytes)
...
A second issue is addressed here:
- regulator_bulk_enable() is called upon each PM cycle (suspend/resume).
- regulator_bulk_disable() never gets called.
So the reference count for these regulators constantly increase, by one
upon each low power cycle, due to missing regulator_bulk_disable() call
in __dwc2_lowlevel_hw_disable().
The original fix that introduced the devm_add_action_or_reset() call,
fixed an issue during probe, that happens due to other errors in
dwc2_driver_probe() -> dwc2_core_reset(). Then the probe fails without
disabling regulators, when dr_mode == USB_DR_MODE_PERIPHERAL.
Rather fix the error path: disable all the low level hardware in the
error path, by using the "hsotg->ll_hw_enabled" flag. Checking dr_mode
has been introduced to avoid a dual call to dwc2_lowlevel_hw_disable().
"ll_hw_enabled" should achieve the same (and is used currently in the
remove() routine).
Fixes: 54c1960605 ("usb: dwc2: Always disable regulators on driver teardown")
Fixes: 33a06f1300 ("usb: dwc2: Fix error path in gadget registration")
Cc: stable <stable@kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/20230316084127.126084-1-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull zonefs fixes from Damien Le Moal:
- Silence a false positive smatch warning about an uninitialized
variable
- Fix an error message to provide more useful information about invalid
zone append write results
* tag 'zonefs-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: Fix error message in zonefs_file_dio_append()
zonefs: Prevent uninitialized symbol 'size' warning
In the output of /proc/fs/cifs/open_files, we only print
the tree id for the tcon of each open file. It becomes
difficult to know which tcon these files belong to with
just the tree id.
This change dumps ses id in addition to all other data today.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Currently, we only dump the pending mid information only
on the primary channel in /proc/fs/cifs/DebugData.
If multichannel is active, we do not print the pending MID
list on secondary channels.
This change will dump the pending mids for all the channels
based on server->conn_id.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
When querying server interfaces returns -EOPNOTSUPP,
clear the list of interfaces. Assumption is that multichannel
would be disabled too.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
We have the server interface list hanging off the tcon
structure today for reasons unknown. So each tcon which is
connected to a file server can query them separately,
which is really unnecessary. To avoid this, in the query
function, we will check the time of last update of the
interface list, and avoid querying the server if it is
within a certain range.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
The logic in efi_random_alloc() will iterate over the memory map twice,
once to count the number of candidate slots, and another time to locate
the chosen slot after randomization.
If there is insufficient memory to do the allocation, the second loop
will run to completion without actually having located a slot, but we
currently return EFI_SUCCESS in this case, as we fail to initialize
status to the appropriate error value of EFI_OUT_OF_RESOURCES.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
[Why]
The commit c76e483cd9 ("drm/amd/display: Don't restrict bpc to 8 bpc")
removes the historical 8bpc dependency and sets max_bpc to 16.
[How]
The comment that states "8bpc for non-edp" needs to be removed as well.
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sriov needs to enter/exit safe mode in update umd p state
add the cg flag to let it enter or exit while needed
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
For engines not supporting soft reset, i.e. VCN, there will be a failed
ib test before mode 1 reset during asic reset. The fences in this case
are never signaled and next time when we try to free the sa_bo, kernel
will hang.
[How]
During pre_asic_reset, driver will clear job fences and afterwards the
fences' refcount will be reduced to 1. For drm_sched_jobs it will be
released in job_free_cb, and for non-sched jobs like ib_test, it's meant
to be released in sa_bo_free but only when the fences are signaled. So
we have to force signal the non_sched bad job's fence during
pre_asic_reset or the clear is not complete.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
when gfx do soft reset, mes will also do reset, if mes is not
resumed when do recover from soft reset, mes is unable to respond
in later sequence
[how]
resume mes when do gfx post soft reset
Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For GC IP v11.0.4/11, PSP TMR need to be reserved
for ASIC mode2 reset. But for S4, when psp suspend,
it will destroy the TMR that fails the ASIC reset.
[ 96.006101] amdgpu 0000:62:00.0: amdgpu: MODE2 reset
[ 100.409717] amdgpu 0000:62:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000011 SMN_C2PMSG_82:0x00000002
[ 100.411593] amdgpu 0000:62:00.0: amdgpu: Mode2 reset failed!
[ 100.412470] amdgpu 0000:62:00.0: PM: pci_pm_freeze(): amdgpu_pmops_freeze+0x0/0x50 [amdgpu] returns -62
[ 100.414020] amdgpu 0000:62:00.0: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xd0 returns -62
[ 100.415311] amdgpu 0000:62:00.0: PM: pci_pm_freeze+0x0/0xd0 returned -62 after 4623202 usecs
[ 100.416608] amdgpu 0000:62:00.0: PM: failed to freeze async: error -62
We can skip the reset on APUs, assuming we can resume them
properly. Verified on some GFX11, GFX10 and old GFX9 APUs.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
In some cases, we expose the kernel's struct screen_info to the EFI stub
directly, so it gets populated before even entering the kernel. This
means the early console is available as soon as the early param parsing
happens, which is nice. It also means we need two different ways to pass
this information, as this trick only works if the EFI stub is baked into
the core kernel image, which is not always the case.
Huacai reports that the preparatory refactoring that was needed to
implement this alternative method for zboot resulted in a non-functional
efifb earlycon for other cases as well, due to the reordering of the
kernel image relocation with the population of the screen_info struct,
and the latter now takes place after copying the image to its new
location, which means we copy the old, uninitialized state.
So let's ensure that the same-image version of alloc_screen_info()
produces the correct screen_info pointer, by taking the displacement of
the loaded image into account.
Reported-by: Huacai Chen <chenhuacai@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://lore.kernel.org/linux-efi/20230310021749.921041-1-chenhuacai@loongson.cn/
Fixes: 42c8ea3dca ("efi: libstub: Factor out EFI stub entrypoint into separate file")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
In device_for_each_child_node(), we should add fwnode_handle_put()
when break out of the iteration device_for_each_child_node()
as it will automatically increase and decrease the refcounter.
Fixes: 379d7ac7ca ("phy: mdio-thunder: Add driver for Cavium Thunder SoC MDIO buses.")
Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
mlx5 fixes 2023-03-21
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2023-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: E-Switch, Fix an Oops in error handling code
net/mlx5: Read the TC mapping of all priorities on ETS query
net/mlx5e: Overcome slow response for first macsec ASO WQE
net/mlx5e: Initialize link speed to zero
net/mlx5: Fix steering rules cleanup
net/mlx5e: Block entering switchdev mode with ns inconsistency
net/mlx5e: Set uplink rep as NETNS_LOCAL
====================
Link: https://lore.kernel.org/r/20230321211135.47711-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-21 (ice)
This series contains updates to ice driver only.
Piotr sets first_desc field for proper handling of Flow Director
packets.
Michal moves error checking for VF earlier in function to properly return
error before other checks/reporting; he also corrects VSI filter removal to
be done during VSI removal and not rebuild.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: remove filters only if VSI is deleted
ice: check if VF exists before mode check
ice: fix rx buffers handling for flow director packets
====================
Link: https://lore.kernel.org/r/20230321183641.2849726-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-21 (iavf, i40e)
This series contains updates to iavf and i40e drivers.
Stefan Assmann adds check, and return, if driver has already gone
through remove to prevent hang for iavf.
Radoslaw adds zero initialization to ensure Flow Director packets are
populated with correct values for i40e.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
i40e: fix flow director packet filter programming
iavf: fix hang on reboot with ice
====================
Link: https://lore.kernel.org/r/20230321183548.2849671-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move lowering the TRGMII Tx clock driving to mt7530_setup(), after setting
the core clock, as seen on the U-Boot MediaTek ethernet driver.
Move the code which looks like it lowers the TRGMII Rx clock driving to
after the TRGMII Tx clock driving is lowered. This is run after lowering
the Tx clock driving on the U-Boot MediaTek ethernet driver as well.
This way, the switch should consume less power regardless of port 6 being
used.
Update the comment explaining mt7530_pad_clk_setup().
Tested rgmii and trgmii modes of port 6 and rgmii mode of port 5 on MCM
MT7530 on MT7621AT Unielec U7621-06 and standalone MT7530 on MT7623NI
Bananapi BPI-R2.
Fixes: b8f126a8d5 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Link: 29a48bf9cc/drivers/net/mtk_eth.c (L682)
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20230320190520.124513-2-arinc.unal@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Split the code that enables and disables TRGMII clocks and core clock.
Move enabling and disabling core clock to mt7530_pll_setup() as it's
supposed to be run there.
Add 20 ms delay before enabling the core clock as seen on the U-Boot
MediaTek ethernet driver.
Change the comment for enabling and disabling TRGMII clocks as the code
seems to affect both TXC and RXC.
Tested rgmii and trgmii modes of port 6 and rgmii mode of port 5 on MCM
MT7530 on MT7621AT Unielec U7621-06 and standalone MT7530 on MT7623NI
Bananapi BPI-R2.
Fixes: b8f126a8d5 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Link: 29a48bf9cc/drivers/net/mtk_eth.c (L589)
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20230320190520.124513-1-arinc.unal@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
"modprobe asix ; rmmod asix ; modprobe asix" fails with:
sysfs: cannot create duplicate filename \
'/devices/virtual/mdio_bus/usb-003:004'
Issue was originally reported by Anton Lundin on 2022-06-22 (link below).
Chrome OS team hit the same issue in Feb, 2023 when trying to find
work arounds for other issues with AX88172 devices.
The use of devm_mdiobus_register() with usbnet devices results in the
MDIO data being associated with the USB device. When the asix driver
is unloaded, the USB device continues to exist and the corresponding
"mdiobus_unregister()" is NOT called until the USB device is unplugged
or unauthorized. So the next "modprobe asix" will fail because the MDIO
phy sysfs attributes still exist.
The 'easy' (from a design PoV) fix is to use the non-devm variants of
mdiobus_* functions and explicitly manage this use in the asix_bind
and asix_unbind function calls. I've not explored trying to fix usbnet
initialization so devm_* stuff will work.
Fixes: e532a096be ("net: usb: asix: ax88772: add phylib support")
Reported-by: Anton Lundin <glance@acc.umu.se>
Link: https://lore.kernel.org/netdev/20220623063649.GD23685@pengutronix.de/T/
Tested-by: Eizan Miyamoto <eizan@chromium.org>
Signed-off-by: Grant Grundler <grundler@chromium.org>
Link: https://lore.kernel.org/r/20230321170539.732147-1-grundler@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The link speed is never changed for the uptime of a VM, and the current
implementation sends an admin queue command for each call. Admin queue
command invocations have nontrivial overhead (e.g., VM exits), which can
be disruptive to users if triggered frequently. Our telemetry data shows
that there are VMs that make frequent calls to this admin queue command.
Caching the result of the original admin queue command would eliminate
the need to send multiple admin queue commands on subsequent calls to
retrieve link speed.
Fixes: 7e074d5a76 ("gve: Enable Link Speed Reporting in the driver.")
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230321172332.91678-1-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Coverity had rightly indicated a possible deadlock
due to chan_lock being done inside match_session.
All callers of match_* functions should pick up the
necessary locks and call them.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Fixes: 724244cdb3 ("cifs: protect session channel fields with chan_lock")
Signed-off-by: Steve French <stfrench@microsoft.com>
When running "ethtool -S eno0 --groups rmon" without an explicit "--src
emac|pmac" argument, the kernel will not report
rx-rmon-etherStatsPkts64to64Octets, rx-rmon-etherStatsPkts65to127Octets,
etc. This is because on ETHTOOL_MAC_STATS_SRC_AGGREGATE, we do not
populate the "ranges" argument.
ocelot_port_get_rmon_stats() does things differently and things work
there. I had forgotten to make sure that the code is structured the same
way in both drivers, so do that now.
Fixes: cf52bd238b ("net: enetc: add support for MAC Merge statistics counters")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230321232831.1200905-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The Bluetooth mesh experimental feature enable was requiring the
controller to be powered off in order for the Enable to work. Mesh is
supposed to be enablable regardless of the controller state, and created
an unintended requirement that the mesh daemon be started before the
classic bluetoothd daemon.
Fixes: af6bcc1921 ("Bluetooth: Add experimental wrapper for MGMT based mesh")
Signed-off-by: Brian Gix <brian.gix@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
There is a potential race condition in hci_cmd_sync_work and
hci_cmd_sync_clear, and could lead to use-after-free. For instance,
hci_cmd_sync_work is added to the 'req_workqueue' after cancel_work_sync
The entry of 'cmd_sync_work_list' may be freed in hci_cmd_sync_clear, and
causing kernel panic when it is used in 'hci_cmd_sync_work'.
Here's the call trace:
dump_stack_lvl+0x49/0x63
print_report.cold+0x5e/0x5d3
? hci_cmd_sync_work+0x282/0x320
kasan_report+0xaa/0x120
? hci_cmd_sync_work+0x282/0x320
__asan_report_load8_noabort+0x14/0x20
hci_cmd_sync_work+0x282/0x320
process_one_work+0x77b/0x11c0
? _raw_spin_lock_irq+0x8e/0xf0
worker_thread+0x544/0x1180
? poll_idle+0x1e0/0x1e0
kthread+0x285/0x320
? process_one_work+0x11c0/0x11c0
? kthread_complete_and_exit+0x30/0x30
ret_from_fork+0x22/0x30
</TASK>
Allocated by task 266:
kasan_save_stack+0x26/0x50
__kasan_kmalloc+0xae/0xe0
kmem_cache_alloc_trace+0x191/0x350
hci_cmd_sync_queue+0x97/0x2b0
hci_update_passive_scan+0x176/0x1d0
le_conn_complete_evt+0x1b5/0x1a00
hci_le_conn_complete_evt+0x234/0x340
hci_le_meta_evt+0x231/0x4e0
hci_event_packet+0x4c5/0xf00
hci_rx_work+0x37d/0x880
process_one_work+0x77b/0x11c0
worker_thread+0x544/0x1180
kthread+0x285/0x320
ret_from_fork+0x22/0x30
Freed by task 269:
kasan_save_stack+0x26/0x50
kasan_set_track+0x25/0x40
kasan_set_free_info+0x24/0x40
____kasan_slab_free+0x176/0x1c0
__kasan_slab_free+0x12/0x20
slab_free_freelist_hook+0x95/0x1a0
kfree+0xba/0x2f0
hci_cmd_sync_clear+0x14c/0x210
hci_unregister_dev+0xff/0x440
vhci_release+0x7b/0xf0
__fput+0x1f3/0x970
____fput+0xe/0x20
task_work_run+0xd4/0x160
do_exit+0x8b0/0x22a0
do_group_exit+0xba/0x2a0
get_signal+0x1e4a/0x25b0
arch_do_signal_or_restart+0x93/0x1f80
exit_to_user_mode_prepare+0xf5/0x1a0
syscall_exit_to_user_mode+0x26/0x50
ret_from_fork+0x15/0x30
Fixes: 6a98e3836f ("Bluetooth: Add helper for serialized HCI command execution")
Cc: stable@vger.kernel.org
Signed-off-by: Min Li <lm0963hack@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Current flow interates over entire ACPI table entries looking for
Bluetooth Per Platform Antenna Gain(PPAG) entry. This patch iterates
over ACPI entries relvant to Bluetooth device only.
Fixes: c585a92b2f ("Bluetooth: btintel: Set Per Platform Antenna Gain(PPAG)")
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Use correct HCI ISO data packet header struct when the packet has
timestamp. The timestamp, when present, goes before the other fields
(Core v5.3 4E 5.4.5), so the structs are not compatible.
Fixes: ccf74f2390 ("Bluetooth: Add BTPROTO_ISO socket type")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
This removes the code introduced by
14202eff21 as hci_recv_frame is now able
to detect ACL packets that are in fact ISO packets.
Fixes: 14202eff21 ("Bluetooth: btusb: Detect if an ACL packet is in fact an ISO packet")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Because some transports don't have a dedicated type for ISO packets
(see 14202eff21) they may use ACL type
when in fact they are ISO packets.
In the past this was left for the driver to detect such thing but it
creates a problem when using the likes of btproxy when used by a VM as
the host would not be aware of the connection the guest is doing it
won't be able to detect such behavior, so this make bt_recv_frame
detect when it happens as it is the common interface to all drivers
including guest VMs.
Fixes: 14202eff21 ("Bluetooth: btusb: Detect if an ACL packet is in fact an ISO packet")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
The address resolution should be disabled during the active scan,
so all the advertisements can reach the host. The advertising
has to be paused before disabling the address resolution,
because the advertising will prevent any changes to the resolving
list and the address resolution status. Skipping this will cause
the hci error and the discovery failure.
According to the bluetooth specification:
"7.8.44 LE Set Address Resolution Enable command
This command shall not be used when:
- Advertising (other than periodic advertising) is enabled,
- Scanning is enabled, or
- an HCI_LE_Create_Connection, HCI_LE_Extended_Create_Connection, or
HCI_LE_Periodic_Advertising_Create_Sync command is outstanding."
If the host is using RPA, the controller needs to generate RPA for
the advertising, so the advertising must remain paused during the
active scan.
If the host is not using RPA, the advertising can be resumed after
disabling the address resolution.
Fixes: 9afc675ede ("Bluetooth: hci_sync: allow advertise when scan without RPA")
Signed-off-by: Zhengping Jiang <jiangzp@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Steve reported that inactive sessions are terminated after a few
seconds. ksmbd terminate when receiving -EAGAIN error from
kernel_recvmsg(). -EAGAIN means there is no data available in timeout.
So ksmbd should keep connection with unlimited retries instead of
terminating inactive sessions.
Cc: stable@vger.kernel.org
Reported-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reference count of acls will leak when memory allocation fails. Fix this
by adding the missing posix_acl_release().
Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Smatch static checker warning:
fs/ksmbd/vfs.c:1040 ksmbd_vfs_fqar_lseek() warn: no lower bound on 'length'
fs/ksmbd/vfs.c:1041 ksmbd_vfs_fqar_lseek() warn: no lower bound on 'start'
Fix unexpected result that could caused from negative start and length.
Fixes: f441584858 ("cifsd: add file operations")
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Smatch static checker warning:
fs/ksmbd/smb2pdu.c:7759 smb2_ioctl()
warn: no lower bound on 'off'
Fix unexpected result that could caused from negative off and bfz.
Fixes: b5e5f9dfc9 ("ksmbd: check invalid FileOffset and BeyondFinalZero in FSCTL_ZERO_DATA")
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
If vfs objects = streams_xattr in ksmbd.conf FILE_NAMED_STREAMS should
be set to Attributes in FS_ATTRIBUTE_INFORMATION. MacOS client show
"Format: SMB (Unknown)" on faked NTFS and no streams support.
Cc: stable@vger.kernel.org
Reported-by: Miao Lihua <441884205@qq.com>
Tested-by: Miao Lihua <441884205@qq.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
MacOS and Win11 support AES256 encrytion and it is included in the cipher
array of encryption context. Especially on macOS, The most preferred
cipher is AES256. Connecting to ksmbd fails on newer MacOS clients that
support AES256 encryption. MacOS send disconnect request after receiving
final session setup response from ksmbd. Because final session setup is
signed with signing key was generated incorrectly.
For signging key, 'L' value should be initialized to 128 if key size is
16bytes.
Cc: stable@vger.kernel.org
Reported-by: Miao Lihua <441884205@qq.com>
Tested-by: Miao Lihua <441884205@qq.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull ARM fix from Russell King:
"Just one fix for now to eliminate a KASAN false positive"
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9290/1: uaccess: Fix KASAN false-positives
Pull bootconfig fixes from Masami Hiramatsu:
- Fix bootconfig test script to test increased maximum number (8192)
node correctly
- Change the console message if there is no bootconfig data and the
kernel is compiled with CONFIG_BOOT_CONFIG_FORCE=y
* tag 'bootconfig-fixes-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
bootconfig: Change message if no bootconfig with CONFIG_BOOT_CONFIG_FORCE=y
bootconfig: Fix testcase to increase max node
Anna says:
> KASAN reports [...] a slab-out-of-bounds in gss_krb5_checksum(),
> and it can cause my client to panic when running cthon basic
> tests with krb5p.
> Running faddr2line gives me:
>
> gss_krb5_checksum+0x4b6/0x630:
> ahash_request_free at
> /home/anna/Programs/linux-nfs.git/./include/crypto/hash.h:619
> (inlined by) gss_krb5_checksum at
> /home/anna/Programs/linux-nfs.git/net/sunrpc/auth_gss/gss_krb5_crypto.c:358
My diagnosis is that the memcpy() at the end of gss_krb5_checksum()
reads past the end of the buffer containing the checksum data
because the callers have ignored gss_krb5_checksum()'s API contract:
* Caller provides the truncation length of the output token (h) in
* cksumout.len.
Instead they provide the fixed length of the hmac buffer. This
length happens to be larger than the value returned by
crypto_ahash_digestsize().
Change these errant callers to work like krb5_etm_{en,de}crypt().
As a defensive measure, bound the length of the byte copy at the
end of gss_krb5_checksum().
Kunit sez:
Testing complete. Ran 68 tests: passed: 68
Elapsed time: 81.680s total, 5.875s configuring, 75.610s building, 0.103s running
Reported-by: Anna Schumaker <schumaker.anna@gmail.com>
Fixes: 8270dbfceb ("SUNRPC: Obscure Kerberos integrity keys")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Mika writes:
thunderbolt: Fixes for v6.3-rc4
This includes following fixes and quirks for v6.3-rc:
- Quirk to disable CL-states on AMD USB4 host routers
- Fix memory leak in lane margining
- Correct the retimer access flows
- Quirk to limit USB3 bandwidth on certain Intel USB4 host routers
- Fix usage of scale field when allocting USB3 bandwidth
- Fix interrupt "auto clear" on non-Intel USB4 host routers.
There are also two commits that are not fixes themselves but are needed
for the USB3 bandwidth quirk and for the interrupt auto clear fix to
work.
All these have been in linux-next with no reported issues.
* tag 'thunderbolt-for-v6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
thunderbolt: Disable interrupt auto clear for rings
thunderbolt: Use const qualifier for `ring_interrupt_index`
thunderbolt: Use scale field when allocating USB3 bandwidth
thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
thunderbolt: Call tb_check_quirks() after initializing adapters
thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
thunderbolt: Fix memory leak in margining
thunderbolt: Add quirk to disable CLx
Commit 7c3d5c20dc ("thermal/core: Add a generic thermal_zone_get_trip()
function") stopped marking trip points with a zero temperature as
disabled, behavior that was originally introduced in commit 81ad4276b5
("Thermal: Ignore invalid trip points").
When using the mlxsw driver we see that when such trip points are not
disabled, the thermal subsystem repeatedly tries to set the state of the
associated cooling devices to the maximum state.
Address this by restoring the original behavior and mark trip points
with a zero temperature as disabled.
Fixes: 7c3d5c20dc ("thermal/core: Add a generic thermal_zone_get_trip() function")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
__copy_xstate_to_uabi_buf() copies either from the tasks XSAVE buffer
or from init_fpstate into the ptrace buffer. Dynamic features, like
XTILEDATA, have an all zeroes init state and are not saved in
init_fpstate, which means the corresponding bit is not set in the
xfeatures bitmap of the init_fpstate header.
But __copy_xstate_to_uabi_buf() retrieves addresses for both the tasks
xstate and init_fpstate unconditionally via __raw_xsave_addr().
So if the tasks XSAVE buffer has a dynamic feature set, then the
address retrieval for init_fpstate triggers the warning in
__raw_xsave_addr() which checks the feature bit in the init_fpstate
header.
Remove the address retrieval from init_fpstate for extended features.
They have an all zeroes init state so init_fpstate has zeros for them.
Then zeroing the user buffer for the init state is the same as copying
them from init_fpstate.
Fixes: 2308ee57d9 ("x86/fpu/amx: Enable the AMX feature in 64-bit mode")
Reported-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/kvm/20230221163655.920289-2-mizhang@google.com/
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/all/20230227210504.18520-2-chang.seok.bae%40intel.com
Cc: stable@vger.kernel.org
The commit 97e3d26b5e ("x86/mm: Randomize per-cpu entry area") fixed
an omission of KASLR on CPU entry areas. It doesn't take into account
KASLR switches though, which may result in unintended non-determinism
when a user wants to avoid it (e.g. debugging, benchmarking).
Generate only a single combination of CPU entry areas offsets -- the
linear array that existed prior randomization when KASLR is turned off.
Since we have 3f148f3318 ("x86/kasan: Map shadow for percpu pages on
demand") and followups, we can use the more relaxed guard
kasrl_enabled() (in contrast to kaslr_memory_enabled()).
Fixes: 97e3d26b5e ("x86/mm: Randomize per-cpu entry area")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20230306193144.24605-1-mkoutny%40suse.com
When fixed files are unregistered, file_alloc_end and alloc_hint
are not cleared. This can later cause a NULL pointer dereference in
io_file_bitmap_get() if auto index selection is enabled via
IORING_FILE_INDEX_ALLOC:
[ 6.519129] BUG: kernel NULL pointer dereference, address: 0000000000000000
[...]
[ 6.541468] RIP: 0010:_find_next_zero_bit+0x1a/0x70
[...]
[ 6.560906] Call Trace:
[ 6.561322] <TASK>
[ 6.561672] io_file_bitmap_get+0x38/0x60
[ 6.562281] io_fixed_fd_install+0x63/0xb0
[ 6.562851] ? __pfx_io_socket+0x10/0x10
[ 6.563396] io_socket+0x93/0xf0
[ 6.563855] ? __pfx_io_socket+0x10/0x10
[ 6.564411] io_issue_sqe+0x5b/0x3d0
[ 6.564914] io_submit_sqes+0x1de/0x650
[ 6.565452] __do_sys_io_uring_enter+0x4fc/0xb20
[ 6.566083] ? __do_sys_io_uring_register+0x11e/0xd80
[ 6.566779] do_syscall_64+0x3c/0x90
[ 6.567247] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[...]
To fix the issue, set file alloc range and alloc_hint to zero after
file tables are freed.
Cc: stable@vger.kernel.org
Fixes: 4278a0deb1 ("io_uring: defer alloc_hint update to io_file_bitmap_set()")
Signed-off-by: Savino Dicanosa <sd7.dev@pm.me>
[axboe: add explicit bitmap == NULL check as well]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When driver doesn't implement a bpf_xdp_metadata kfunc the fallback
implementation returns EOPNOTSUPP, which indicate device driver doesn't
implement this kfunc.
Currently many drivers also return EOPNOTSUPP when the hint isn't
available, which is ambiguous from an API point of view. Instead
change drivers to return ENODATA in these cases.
There can be natural cases why a driver doesn't provide any hardware
info for a specific hint, even on a frame to frame basis (e.g. PTP).
Lets keep these cases as separate return codes.
When describing the return values, adjust the function kernel-doc layout
to get proper rendering for the return values.
Fixes: ab46182d0d ("net/mlx4_en: Support RX XDP metadata")
Fixes: bc8d405b1b ("net/mlx5e: Support RX XDP metadata")
Fixes: 306531f024 ("veth: Support RX XDP metadata")
Fixes: 3d76a4d3d4 ("bpf: XDP metadata RX kfuncs")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/167940675120.2718408.8176058626864184420.stgit@firesoul
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The hvc machinery registers both a console and a tty device based on
the hv ops provided by the specific implementation. Those two
interfaces however have different locks, and there's no single locks
that's shared between the tty and the console implementations, hence
the driver needs to protect itself against concurrent accesses.
Otherwise concurrent calls using the split interfaces are likely to
corrupt the ring indexes, leaving the console unusable.
Introduce a lock to xencons_info to serialize accesses to the shared
ring. This is only required when using the shared memory console,
concurrent accesses to the hypercall based console implementation are
not an issue.
Note the conditional logic in domU_read_console() is slightly modified
so the notify_daemon() call can be done outside of the locked region:
it's an hypercall and there's no need for it to be done with the lock
held.
Fixes: b536b4b962 ('xen: use the hvc console infrastructure for Xen console')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20221130150919.13935-1-roger.pau@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Introduce a core thermal API function, thermal_cooling_device_update(),
for updating the max_state value for a cooling device and rearranging
its statistics in sysfs after a possible change of its ->get_max_state()
callback return value.
That callback is now invoked only once, during cooling device
registration, to populate the max_state field in the cooling device
object, so if its return value changes, it needs to be invoked again
and the new return value needs to be stored as max_state. Moreover,
the statistics presented in sysfs need to be rearranged in general,
because there may not be enough room in them to store data for all
of the possible states (in the case when max_state grows).
The new function takes care of that (and some other minor things
related to it), but some extra locking and lockdep annotations are
added in several places too to protect against crashes in the cases
when the statistics are not present or when a stale max_state value
might be used by sysfs attributes.
Note that the actual user of the new function will be added separately.
Link: https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Introduce a helper function, thermal_cooling_device_present(), for
checking if the given cooling device is in the list of registered
cooling devices to avoid some code duplication in a subsequent
patch.
No expected functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
The cpufreq policy notifier in the ACPI processor driver may as
well be registered before the driver itself, which causes
acpi_processor_cpufreq_init to be true (unless the notifier
registration fails, which is unlikely at that point) when the
ACPI CPU thermal cooling devices are registered, so the
processor_get_max_state() result does not change while
acpi_processor_driver_init() is running.
Change the ordering in acpi_processor_driver_init() accordingly
to prevent the max_state value from remaining 0 permanently for all
ACPI CPU cooling devices due to setting acpi_processor_cpufreq_init
too late. [Note that processor_get_max_state() may still return
different values at different times after this change, depending on
the cpufreq driver registration time, but that issue needs to be
addressed separately.]
Fixes: a365105c685c("thermal: sysfs: Reuse cdev->max_state")
Reported-by: Wang, Quanxian <quanxian.wang@intel.com>
Link: https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
If a packet has reached its intended destination, it was bumped to the code
that accepts it, without first checking if a mesh_path needs to be created
based on the discovered source.
Fix this by moving the destination address check further down.
Cc: stable@vger.kernel.org
Fixes: 986e43b19a ("wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20230314095956.62085-3-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some recent upstream debugging uncovered the fact that in
iwlwifi, the TXQ list manipulation is racy.
Introduce a new state bit for when the TXQ is completely
ready and can be used without locking, and if that's not
set yet acquire the lock to check everything correctly.
Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Tested-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This could race if the queue is redirected while full, then
the flushing internally would start it while it's not yet
usable again. Fix it by using two state bits instead of just
one.
Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Tested-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
sh/migor_defconfig:
mm/slab.c: In function ‘slab_memory_callback’:
mm/slab.c:1127:23: error: implicit declaration of function ‘init_cache_node_node’; did you mean ‘drain_cache_node_node’? [-Werror=implicit-function-declaration]
1127 | ret = init_cache_node_node(nid);
| ^~~~~~~~~~~~~~~~~~~~
| drain_cache_node_node
The #ifdef condition protecting the definition of init_cache_node_node()
no longer matches the conditions protecting the (multiple) users.
Fix this by syncing the conditions.
Fixes: 76af6a054d ("mm/migrate: add CPU hotplug to demotion #ifdef")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/b5bdea22-ed2f-3187-6efe-0c72330270a4@infradead.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
The FEI field of C2HTermReq/H2CTermReq is 4 bytes but not 4-byte-aligned
in the NVMe/TCP specification (it is located at offset 10 in the PDU).
Split it into two 16-bit integers in struct nvme_tcp_term_pdu
so no padding is inserted. There should also be 10 reserved bytes after.
There are currently no users of this type.
Fixes: fc221d0544 ("nvme-tcp: Add protocol header")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Caleb Sander <csander@purestorage.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Identify CNS 06h (I/O Command Set Specific Identify Controller data
structure) is supported only on i/o controllers.
But nvme_init_non_mdts_limits() currently invokes this on all
controllers. Correct this by ensuring this is sent to I/O
controllers only.
Signed-off-by: Martin George <marting@netapp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Vladimir Oltean says:
====================
Fix trainwreck with Ocelot switch statistics counters
While testing the patch set for preemptible traffic classes with some
controlled traffic and measuring counter deltas:
https://lore.kernel.org/netdev/20230220122343.1156614-1-vladimir.oltean@nxp.com/
I noticed that in the output of "ethtool -S swp0 --groups eth-mac
eth-phy eth-ctrl rmon -- --src emac | grep -v ': 0'", the TX counters
were off. Quickly I realized that their values were permutated by 1
compared to their names, and that for example
tx-rmon-etherStatsPkts64to64Octets was incrementing when
tx-rmon-etherStatsPkts65to127Octets should have.
Initially I suspected something having to do with the bulk reading
logic, and indeed I found a bug there (fixed as 1/3), but that was not
the source of the problems. Instead it revealed other problems.
While dumping the regions created by the driver on my switch, I figured
out that it sees a discontinuity which shouldn't have existed between
reg 0x278 and reg 0x280.
Discontinuity between last reg 0x0 and new reg 0x0, creating new region
Discontinuity between last reg 0x108 and new reg 0x200, creating new region
Discontinuity between last reg 0x278 and new reg 0x280, creating new region
Discontinuity between last reg 0x2b0 and new reg 0x400, creating new region
region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 31 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 13 contiguous counters starting with SYS:STAT:CNT[0x0a0]
region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]
That is where TX_MM_HOLD should have been, and that was the bug, since
it was missing. After adding it, the regions look like this and the
off-by-one issue is resolved:
Discontinuity between last reg 0x000000 and new reg 0x000000, creating new region
Discontinuity between last reg 0x000108 and new reg 0x000200, creating new region
Discontinuity between last reg 0x0002b0 and new reg 0x000400, creating new region
region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 45 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]
However, as I am thinking out loud, it should have not reported the
other counters as off by one even when skipping TX_MM_HOLD... after all,
on Ocelot/Seville, there are more counters which need to be skipped.
Which is when I investigated and noticed the bug solved in 2/3.
I've validated that both on native VSC9959 (which uses
ocelot_mm_stats_layout) as well as by faking the other switches by
making VSC9959 use the plain ocelot_stats_layout.
To summarize: on all Ocelot switches, the TX counters and drop counters
are completely broken. The RX counters are mostly fine.
With this occasion, I have collected more cleanup patches in this area,
which I'm going to submit after the net -> net-next merge.
====================
Link: https://lore.kernel.org/r/20230321010325.897817-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The lack of a definition for this counter is what initially prompted me
to investigate a problem which really manifested itself as the previous
change, "net: mscc: ocelot: fix transfer from region->buf to ocelot->stats".
When TX_MM_HOLD is defined in enum ocelot_stat but not in struct
ocelot_stat_layout ocelot_mm_stats_layout, this creates a hole, which
due to the aforementioned bug, makes all counters following TX_MM_HOLD
be recorded off by one compared to their correct position. So for
example, a non-zero TX_PMAC_OCTETS would be reported as TX_MERGE_FRAGMENTS,
TX_PMAC_UNICAST would be reported as TX_PMAC_OCTETS, TX_PMAC_64 would be
reported as TX_PMAC_PAUSE, etc etc. This is because the size of the hole
(1) is much smaller than the size of the region, so the phenomenon where
the stats are off-by-one, rather than lost, prevails.
However, the phenomenon where stats are lost can be seen too, for
example with DROP_LOCAL, which is at the beginning of its own region
(offset 0x000400 vs the previous 0x0002b0 constitutes a discontinuity).
This is also reported as off by one and saved to TX_PMAC_1527_MAX, but
that counter is not reported to the unstructured "ethtool -S", as
opposed to DROP_LOCAL which is (as "drop_local").
Fixes: ab3f97a961 ("net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To understand the problem, we need some definitions.
The driver is aware of multiple counters (enum ocelot_stat), yet not all
switches supported by the driver implement all counters. There are 2
statistics layouts: ocelot_stats_layout and ocelot_mm_stats_layout, the
latter having 36 counters more than the former.
ocelot->stats[] is not a compact array, i.e. there are elements within
it which are not going to be populated for ocelot_stats_layout. On the
other hand, ocelot->stats[] is easily indexable, for example "tx_octets"
for port 3 can be found at ocelot->stats[3 * OCELOT_NUM_STATS +
OCELOT_STAT_TX_OCTETS], and that is why we keep it sparse.
Regions, as created by ocelot_prepare_stats_regions(), are compact
(every element from region->buf will correspond to a counter that is
present in this switch's layout) but are not easily indexable.
Let's define holes as the ranges of values of enum ocelot_stat for which
ocelot_stats_layout doesn't have a "reg" defined. For example, there is
a hole between OCELOT_STAT_RX_GREEN_PRIO_7 and OCELOT_STAT_TX_OCTETS
which is of 23 elements that are only present on ocelot_mm_stats_layout,
and as such, they are also present in enum ocelot_stat. Let's define the
left extremity of the hole - the last enum ocelot_stat still defined -
as A (in this case OCELOT_STAT_RX_GREEN_PRIO_7) and the right extremity -
the first enum ocelot_stat that is defined after a series of undefined
ones - as B (in this case OCELOT_STAT_TX_OCTETS).
There is a bug in the procedure which transfers stats from region->buf[]
to ocelot->stats[].
For each hole in the ocelot_stats_layout, the logic transfers the stats
starting with enum ocelot_stat B to ocelot->stats[] index A + 1. So all
stats after a hole are saved to a position which is off by B - A + 1
elements.
This causes 2 kinds of issues:
(a) counters which shouldn't increment increment
(b) counters which should increment don't
Holes in the ocelot_stat_layout automatically imply the end of a region
and the beginning of a new one; however the reverse is not necessarily
true. For example, for ocelot_mm_stat_layout, there could be multiple
regions (which indicate discontinuities in register addresses) while
there is no hole (which indicates discontinuities in enum ocelot_stat
values).
In the example above, the stats from the second region->buf[] are not
transferred to ocelot->stats starting with index
"port * OCELOT_NUM_STATS + OCELOT_STAT_TX_OCTETS" as they should, but
rather, starting with element
"port * OCELOT_NUM_STATS + OCELOT_STAT_RX_GREEN_PRIO_7 + 1".
That stats[] array element is not reported to user space for switches
that use ocelot_stat_layout, and that is how issue (b) occurs.
However, if the length of the second region is larger than the hole,
then some stats will start to be transferred to the ocelot->stats[]
indices which *are* reported to user space, but those indices contain
wrong values (corresponding to unexpected counters). This is how issue
(a) occurs.
The procedure, as it was introduced in commit d87b1c08f3 ("net: mscc:
ocelot: use bulk reads for stats"), was not buggy, because there were no
holes in the struct ocelot_stat_layout instances at that time. The
problem is that when those holes were introduced, the function was not
updated to take them into consideration.
To update the procedure, we need to know, for each region, which enum
ocelot_stat corresponds to its region->base. We have no way of deducing
that based on the contents of struct ocelot_stats_region, so we need to
add this information.
Fixes: ab3f97a961 ("net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The blamed commit changed struct ocelot_stat_layout :: "u32 offset" to
"u32 reg".
However, "u32 reg" is not quite a register address, but an enum
ocelot_reg, which in itself encodes an enum ocelot_target target in the
upper bits, and an index into the ocelot->map[target][] array in the
lower bits.
So, whereas the previous code comparison between stats_layout[i].offset
and last + 1 was correct (because those "offsets" at the time were
32-bit relative addresses), the new code, comparing layout[i].reg to
last + 4 is not correct, because the "reg" here is an enum/index, not an
actual register address.
What we want to compare are indeed register addresses, but to do that,
we need to actually go through the same motions as
__ocelot_bulk_read_ix() itself.
With this bug, all statistics counters are deemed by
ocelot_prepare_stats_regions() as constituting their own region.
(Truncated) log on VSC9959 (Felix) below (prints added by me):
Before:
region of 1 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x001]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x002]
...
region of 1 contiguous counters starting with SYS:STAT:CNT[0x041]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x042]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x081]
...
region of 1 contiguous counters starting with SYS:STAT:CNT[0x0ac]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x100]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x101]
...
region of 1 contiguous counters starting with SYS:STAT:CNT[0x111]
After:
region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 45 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]
Since commit d87b1c08f3 ("net: mscc: ocelot: use bulk reads for
stats") intended bulking as a performance improvement, and since now,
with trivial-sized regions, performance is even worse than without
bulking at all, this could easily qualify as a performance regression.
Fixes: d4c3676507 ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There are memory leaks reported by kmemleak:
unreferenced object 0xffff888106500800 (size 128):
comm "modprobe", pid 1017, jiffies 4297787785 (age 67.152s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000970ce626>] __kmem_cache_alloc_node+0x20c/0x380
[<00000000fb5f78d9>] kmalloc_trace+0x2f/0xb0
[<000000000e947e2a>] idt77252_init_one+0x2847/0x3c90 [idt77252]
[<000000006efb048e>] local_pci_probe+0xeb/0x1a0
...
unreferenced object 0xffff888106500b00 (size 128):
comm "modprobe", pid 1017, jiffies 4297787785 (age 67.152s)
hex dump (first 32 bytes):
00 20 3d 01 80 88 ff ff 00 20 3d 01 80 88 ff ff . =...... =.....
f0 23 3d 01 80 88 ff ff 00 20 3d 01 00 00 00 00 .#=...... =.....
backtrace:
[<00000000970ce626>] __kmem_cache_alloc_node+0x20c/0x380
[<00000000fb5f78d9>] kmalloc_trace+0x2f/0xb0
[<00000000f451c5be>] alloc_scq.constprop.0+0x4a/0x400 [idt77252]
[<00000000e6313849>] idt77252_init_one+0x28cf/0x3c90 [idt77252]
The root cause is traced to the vc_maps which alloced in open_card_oam()
are not freed in close_card_oam(). The vc_maps are used to record
open connections, so when close a vc_map in close_card_oam(), the memory
should be freed. Moreover, the ubr0 is not closed when close a idt77252
device, leading to the memory leak of vc_map and scq_info.
Fix them by adding kfree in close_card_oam() and implementing new
close_card_ubr0() to close ubr0.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Francois Romieu <romieu@fr.zoreil.com>
Link: https://lore.kernel.org/r/20230320143318.2644630-1-lizetao1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When BCM63xx internal switches are connected to switches with a 4-byte
Broadcom tag, it does not identify the packet as VLAN tagged, so it adds one
based on its PVID (which is likely 0).
Right now, the packet is received by the BCM63xx internal switch and the 6-byte
tag is properly processed. The next step would to decode the corresponding
4-byte tag. However, the internal switch adds an invalid VLAN tag after the
6-byte tag and the 4-byte tag handling fails.
In order to fix this we need to remove the invalid VLAN tag after the 6-byte
tag before passing it to the 4-byte tag decoding.
Fixes: 964dbf186e ("net: dsa: tag_brcm: add support for legacy tags")
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20230319095540.239064-1-noltari@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull VFIO fix from Alex Williamson:
- Fix dirty size reporting for pre-copy in mlx5 variant driver (Yishai
Hadas)
* tag 'vfio-v6.3-rc4' of https://github.com/awilliam/linux-vfio:
vfio/mlx5: Fix the report of dirty_bytes upon pre-copy
Pull nfsd fixes from Chuck Lever:
- Fix a crash during NFS READs from certain client implementations
- Address a minor kbuild regression in v6.3
* tag 'nfsd-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: don't replace page in rq_pages if it's a continuation of last page
NFS & NFSD: Update GSS dependencies
The error handling dereferences "vport". There is nothing we can do if
it is an error pointer except returning the error code.
Fixes: 133dcfc577 ("net/mlx5: E-Switch, Alloc and free unique metadata for match")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
When ETS configurations are queried by the user to get the mapping
assignment between packet priority and traffic class, only priorities up
to maximum TCs are queried from QTCT register in FW to retrieve their
assigned TC, leaving the rest of the priorities mapped to the default
TC #0 which might be misleading.
Fix by querying the TC mapping of all priorities on each ETS query,
regardless of the maximum number of TCs configured in FW.
Fixes: 820c2c5e77 ("net/mlx5e: Read ETS settings directly from firmware")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
First ASO WQE poll causes a cache miss in hardware hence the resut is
delayed. It causes to the situation where such WQE is polled earlier
than it is needed.
Add logic to retry ASO CQ polling operation.
Fixes: 739cfa3451 ("net/mlx5: Make ASO poll CQ usable in atomic context")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
mlx5e_port_max_linkspeed does not guarantee value assignment for speed.
Avoid cases where link_speed might be used uninitialized. In case
mlx5e_port_max_linkspeed fails, a default link speed of 50000 will be
used for the calculations.
Fixes: 3f6d08d196 ("net/mlx5e: Add RSS support for hairpin")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
vport's mc, uc and multicast rules are not deleted in teardown path when
EEH happens. Since the vport's promisc settings(uc, mc and all) in
firmware are reset after EEH, mlx5 driver will try to delete the above
rules in the initialization path. This cause kernel crash because these
software rules are no longer valid.
Fix by nullifying these rules right after delete to avoid accessing any dangling
pointers.
Call Trace:
__list_del_entry_valid+0xcc/0x100 (unreliable)
tree_put_node+0xf4/0x1b0 [mlx5_core]
tree_remove_node+0x30/0x70 [mlx5_core]
mlx5_del_flow_rules+0x14c/0x1f0 [mlx5_core]
esw_apply_vport_rx_mode+0x10c/0x200 [mlx5_core]
esw_update_vport_rx_mode+0xb4/0x180 [mlx5_core]
esw_vport_change_handle_locked+0x1ec/0x230 [mlx5_core]
esw_enable_vport+0x130/0x260 [mlx5_core]
mlx5_eswitch_enable_sriov+0x2a0/0x2f0 [mlx5_core]
mlx5_device_enable_sriov+0x74/0x440 [mlx5_core]
mlx5_load_one+0x114c/0x1550 [mlx5_core]
mlx5_pci_resume+0x68/0xf0 [mlx5_core]
eeh_report_resume+0x1a4/0x230
eeh_pe_dev_traverse+0x98/0x170
eeh_handle_normal_event+0x3e4/0x640
eeh_handle_event+0x4c/0x370
eeh_event_handler+0x14c/0x210
kthread+0x168/0x1b0
ret_from_kernel_thread+0x5c/0x84
Fixes: a35f71f27a ("net/mlx5: E-Switch, Implement promiscuous rx modes vf request handling")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Upon entering switchdev mode, VF/SF representors are spawned in the
devlink instance's net namespace, whereas the PF net device transforms
into the uplink representor, remaining in the net namespace the PF net
device was in. Therefore, if a PF net device's namespace is different from
its parent devlink net namespace, entering switchdev mode can create an
illegal situation where all representors sharing the same core device
are NOT in the same net namespace.
To avoid this issue, block entering switchdev mode for devices whose child
netdev net namespace has diverged from the parent devlink's.
Fixes: 7768d1971d ("net/mlx5: E-Switch, Add control for encapsulation")
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Previously, NETNS_LOCAL was not set for uplink representors, inconsistent
with VF representors, and allowed the uplink representor to be moved
between net namespaces and separated from the VF representors it shares
the core device with. Such usage would break the isolation model of
namespaces, as devices in different namespaces would have access to
shared memory.
To solve this issue, set NETNS_LOCAL for uplink representors if eswitch is
in switchdev mode.
Fixes: 7a9fb35e8c ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode")
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
We've seen recent AWS EKS (Kubernetes) user reports like the following:
After upgrading EKS nodes from v20230203 to v20230217 on our 1.24 EKS
clusters after a few days a number of the nodes have containers stuck
in ContainerCreating state or liveness/readiness probes reporting the
following error:
Readiness probe errored: rpc error: code = Unknown desc = failed to
exec in container: failed to start exec "4a11039f730203ffc003b7[...]":
OCI runtime exec failed: exec failed: unable to start container process:
unable to init seccomp: error loading seccomp filter into kernel:
error loading seccomp filter: errno 524: unknown
However, we had not been seeing this issue on previous AMIs and it only
started to occur on v20230217 (following the upgrade from kernel 5.4 to
5.10) with no other changes to the underlying cluster or workloads.
We tried the suggestions from that issue (sysctl net.core.bpf_jit_limit=452534528)
which helped to immediately allow containers to be created and probes to
execute but after approximately a day the issue returned and the value
returned by cat /proc/vmallocinfo | grep bpf_jit | awk '{s+=$2} END {print s}'
was steadily increasing.
I tested bpf tree to observe bpf_jit_charge_modmem, bpf_jit_uncharge_modmem
their sizes passed in as well as bpf_jit_current under tcpdump BPF filter,
seccomp BPF and native (e)BPF programs, and the behavior all looks sane
and expected, that is nothing "leaking" from an upstream perspective.
The bpf_jit_limit knob was originally added in order to avoid a situation
where unprivileged applications loading BPF programs (e.g. seccomp BPF
policies) consuming all the module memory space via BPF JIT such that loading
of kernel modules would be prevented. The default limit was defined back in
2018 and while good enough back then, we are generally seeing far more BPF
consumers today.
Adjust the limit for the BPF JIT pool from originally 1/4 to now 1/2 of the
module memory space to better reflect today's needs and avoid more users
running into potentially hard to debug issues.
Fixes: fdadd04931 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
Reported-by: Stephen Haynes <sh@synk.net>
Reported-by: Lefteris Alexakis <lefteris.alexakis@kpn.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://github.com/awslabs/amazon-eks-ami/issues/1179
Link: https://github.com/awslabs/amazon-eks-ami/issues/1219
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230320143725.8394-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull keyrings fixes from David Howells:
- Fix request_key() so that it doesn't cache a looked up key on the
current thread if that thread is a kernel thread.
The cache is cleared during notify_resume - but that doesn't happen
in kernel threads. This is causing cifs DNS keys to be
un-invalidateable.
- Fix a wrapper check in verify_pefile() to not round up the length.
- Change asymmetric_keys code to log errors to make it easier for users
to work out why failures occurred.
* tag 'keys-fixes-20230321' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
asymmetric_keys: log on fatal failures in PE/pkcs7
verify_pefile: relax wrapper length check
keys: Do not cache key in task struct if key is requested from kernel thread
When a system with E810 with existing VFs gets rebooted the following
hang may be observed.
Pid 1 is hung in iavf_remove(), part of a network driver:
PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow"
#0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
#1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
#2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
#3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
#4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
#5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
#6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
#7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
#8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
#9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
#10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
#11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
#12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7
RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90
R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005
R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000
ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b
During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.
Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().
Fixes: 974578017f ("iavf: Add waiting so the port is initialized in remove")
Fixes: a8417330f8 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <mcornea@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Filters shouldn't be removed in VSI rebuild path. Removing them on PF
VSI results in no rule for PF MAC after changing for example queues
amount.
Remove all filters only in the VSI remove flow. As unload should also
cause the filter to be removed introduce, a new function ice_stop_eth().
It will unroll ice_start_eth(), so remove filters and close VSI.
Fixes: 6624e780a5 ("ice: split ice_vsi_setup into smaller functions")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The key which gets cached in task structure from a kernel thread does not
get invalidated even after expiry. Due to which, a new key request from
kernel thread will be served with the cached key if it's present in task
struct irrespective of the key validity. The change is to not cache key in
task_struct when key requested from kernel thread so that kernel thread
gets a valid key on every key request.
The problem has been seen with the cifs module doing DNS lookups from a
kernel thread and the results getting pinned by being attached to that
kernel thread's cache - and thus not something that can be easily got rid
of. The cache would ordinarily be cleared by notify-resume, but kernel
threads don't do that.
This isn't seen with AFS because AFS is doing request_key() within the
kernel half of a user thread - which will do notify-resume.
Fixes: 7743c48e54 ("keys: Cache result of request_key*() temporarily in task_struct")
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Steve French <smfrench@gmail.com>
cc: keyrings@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/CAGypqWw951d=zYRbdgNR4snUDvJhWL=q3=WOyh7HhSJupjz2vA@mail.gmail.com/
Since commit 6c40624930 ("bootconfig: Increase max nodes of bootconfig
from 1024 to 8192 for DCC support") increased the max number of bootconfig
node to 8192, the bootconfig testcase of the max number of nodes fails.
To fix this issue, we can not simply increase the number in the test script
because the test bootconfig file becomes too big (>32KB). To fix that, we
can use a combination of three alphabets (26^3 = 17576). But with that,
we can not express the 8193 (just one exceed from the limitation) because
it also exceeds the max size of bootconfig. So, the first 26 nodes will just
use one alphabet.
With this fix, test-bootconfig.sh passes all tests.
Link: https://lore.kernel.org/all/167888844790.791176.670805252426835131.stgit@devnote2/
Reported-by: Heinz Wiesinger <pprkut@slackware.com>
Link: https://lore.kernel.org/all/2463802.XAFRqVoOGU@amaterasu.liwjatan.org
Fixes: 6c40624930 ("bootconfig: Increase max nodes of bootconfig from 1024 to 8192 for DCC support")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Smatch reports:
drivers/hwmon/xgene-hwmon.c:757 xgene_hwmon_probe() warn:
'ctx->pcc_comm_addr' from ioremap() not released on line: 757.
This is because in drivers/hwmon/xgene-hwmon.c:701 xgene_hwmon_probe(),
ioremap and memremap is not released, which may cause a leak.
To fix this, ioremap and memremap is modified to devm_ioremap and
devm_memremap.
Signed-off-by: Tianyi Jing <jingfelix@hust.edu.cn>
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn>
Link: https://lore.kernel.org/r/20230318143851.2191625-1-jingfelix@hust.edu.cn
[groeck: Fixed formatting and subject]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Avoid needlessly rebuilding the compressed image by adding the file
'vmlinuz' to the 'targets' Kbuild make variable.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
RCU sometimes needs to perform a delayed wake up for specific kthreads
handling offloaded callbacks (RCU_NOCB). These wakeups are performed
by timers and upon entry to idle (also to guest and to user on nohz_full).
However the delayed wake-up on kernel exit is actually performed after
the thread flags are fetched towards the fast path check for work to
do on exit to user. As a result, and if there is no other pending work
to do upon that kernel exit, the current task will resume to userspace
with TIF_RESCHED set and the pending wake up ignored.
Fix this with fetching the thread flags _after_ the delayed RCU-nocb
kthread wake-up.
Fixes: 47b8ff194c ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230315194349.10798-3-joel@joelfernandes.org
The variable 'status' (which contains the unhandled overflow bits) is
not being properly masked in some cases, displaying the following
warning:
WARNING: CPU: 156 PID: 475601 at arch/x86/events/amd/core.c:972 amd_pmu_v2_handle_irq+0x216/0x270
This seems to be happening because the loop is being continued before
the status bit being unset, in case x86_perf_event_set_period()
returns 0. This is also causing an inconsistency because the "handled"
counter is incremented, but the status bit is not cleaned.
Move the bit cleaning together above, together when the "handled"
counter is incremented.
Fixes: 7685665c39 ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230321113338.1669660-1-leitao@debian.org
Commit 829c1651e9 ("sched/fair: sanitize vruntime of entity being placed")
fixes an overflowing bug, but ignore a case that se->exec_start is reset
after a migration.
For fixing this case, we delay the reset of se->exec_start after
placing the entity which se->exec_start to detect long sleeping task.
In order to take into account a possible divergence between the clock_task
of 2 rqs, we increase the threshold to around 104 days.
Fixes: 829c1651e9 ("sched/fair: sanitize vruntime of entity being placed")
Originally-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Qiao <zhangqiao22@huawei.com>
Link: https://lore.kernel.org/r/20230317160810.107988-1-vincent.guittot@linaro.org
__enter_from_user_mode() is triggering noinstr warnings with
CONFIG_DEBUG_PREEMPT due to its call of preempt_count_add() via
ct_state().
The preemption disable isn't needed as interrupts are already disabled.
And the context_tracking_enabled() check in ct_state() also isn't needed
as that's already being done by the CT_WARN_ON().
Just use __ct_state() instead.
Fixes the following warnings:
vmlinux.o: warning: objtool: enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: syscall_enter_from_user_mode+0xf9: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0xc7: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
Fixes: 171476775d ("context_tracking: Convert state to atomic_t")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/d8955fa6d68dc955dda19baf13ae014ae27926f5.1677369694.git.jpoimboe@kernel.org
Like the Windows Lenovo Yoga Book X91F/L the Android Lenovo Yoga Book
X90F/L has a portrait 1200x1920 screen used in landscape mode,
add a quirk for this.
When the quirk for the X91F/L was initially added it was written to
also apply to the X90F/L but this does not work because the Android
version of the Yoga Book uses completely different DMI strings.
Also adjust the X91F/L quirk to reflect that it only applies to
the X91F/L models.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230301095218.28457-1-hdegoede@redhat.com
Since io_uring does nonblocking connect requests, if we do two repeated
ones without having a listener, the second will get -ECONNABORTED rather
than the expected -ECONNREFUSED. Treat -ECONNABORTED like a normal retry
condition if we're nonblocking, if we haven't already seen it.
Cc: stable@vger.kernel.org
Fixes: 3fb1bd6881 ("io_uring/net: handle -EINPROGRESS correct for IORING_OP_CONNECT")
Link: https://github.com/axboe/liburing/issues/828
Reported-by: Hui, Chunyang <sanqian.hcy@antgroup.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_uring_cmd_done() currently assumes that the uring_lock is held
when invoked, and while it generally is, this is not guaranteed.
Pass in the issue_flags associated with it, so that we have
IO_URING_F_UNLOCKED available to be able to lock the CQ ring
appropriately when completing events.
Cc: stable@vger.kernel.org
Fixes: ee692a21e9 ("fs,io_uring: add infrastructure for uring-cmd")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull fsverity fixes from Eric Biggers:
"Fix two significant performance issues with fsverity"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
fsverity: Remove WQ_UNBOUND from fsverity read workqueue
Pull fscrypt fix from Eric Biggers:
"Fix a bug where when a filesystem was being unmounted, the fscrypt
keyring was destroyed before inodes have been released by the Landlock
LSM.
This bug was found by syzbot"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
fscrypt: check for NULL keyring in fscrypt_put_master_key_activeref()
fscrypt: improve fscrypt_destroy_keyring() documentation
fscrypt: destroy keyring after security_sb_delete()
Since the expected write location in a sequential file is always at the
end of the file (append write), when an invalid write append location is
detected in zonefs_file_dio_append(), print the invalid written location
instead of the expected write location.
Fixes: a608da3bd7 ("zonefs: Detect append writes at invalid locations")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
The error handling for platform_get_irq() failing no longer works after
a recent change, clang now points this out with a warning:
drivers/gpu/host1x/dev.c:520:6: error: variable 'syncpt_irq' is uninitialized when used here [-Werror,-Wuninitialized]
if (syncpt_irq < 0)
^~~~~~~~~~
Fix this by removing the variable and checking the correct error status.
Fixes: 625d4ffb43 ("gpu: host1x: Rewrite syncpoint interrupt handling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Acer Aspire 3830TG predates Windows 8, so it defaults to using
acpi_video# for backlight control, but this is non functional on
this model.
Add a DMI quirk to use the native backlight interface which does
work properly.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull NFS client fixes from Anna Schumaker:
- Fix /proc/PID/io read_bytes accounting
- Fix setting NLM file_lock start and end during decoding testargs
- Fix timing for setting access cache timestamps
* tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS: Correct timing for assigning access cache timestamp
lockd: set file_lock start and end when decoding nlm4 testargs
NFS: Fix /proc/PID/io read_bytes for buffered reads
cppcheck reports
drivers/thunderbolt/nhi.c:74:7: style: Local variable 'bit' shadows outer variable [shadowVariable]
int bit;
^
drivers/thunderbolt/nhi.c:66:6: note: Shadowed declaration
int bit = ring_interrupt_index(ring) & 31;
^
drivers/thunderbolt/nhi.c:74:7: note: Shadow variable
int bit;
^
For readablity rename the outer to interrupt_bit and the innner
to auto_clear_bit.
Fixes: 468c49f447 ("thunderbolt: Disable interrupt auto clear for ring")
Cc: stable@vger.kernel.org
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Geoff Levand says:
====================
net/ps3_gelic_net: DMA related fixes
v9: Make rx_skb_size local to gelic_descr_prepare_rx.
v8: Add more cpu_to_be32 calls.
v7: Remove all cleanups, sync to spider net.
v6: Reworked and cleaned up patches.
v5: Some additional patch cleanups.
v4: More patch cleanups.
v3: Cleaned up patches as requested.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The current Gelic Etherenet driver was checking the return value of its
dma_map_single call, and not using the dma_mapping_error() routine.
Fixes runtime problems like these:
DMA-API: ps3_gelic_driver sb_05: device driver failed to check map error
WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:1027 .check_unmap+0x888/0x8dc
Fixes: 02c1889166 ("ps3: gigabit ethernet driver for PS3, take3")
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Gelic Ethernet device needs to have the RX sk_buffs aligned to
GELIC_NET_RXBUF_ALIGN, and also the length of the RX sk_buffs must
be a multiple of GELIC_NET_RXBUF_ALIGN.
The current Gelic Ethernet driver was not allocating sk_buffs large
enough to allow for this alignment.
Also, correct the maximum and minimum MTU sizes, and add a new
preprocessor macro for the maximum frame size, GELIC_NET_MAX_FRAME.
Fixes various randomly occurring runtime network errors.
Fixes: 02c1889166 ("ps3: gigabit ethernet driver for PS3, take3")
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
clang with W=1 reports
drivers/net/usb/plusb.c:65:1: error:
unused function 'pl_clear_QuickLink_features' [-Werror,-Wunused-function]
pl_clear_QuickLink_features(struct usbnet *dev, int val)
^
This static function is not used, so remove it.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packet length retrieved from descriptor may be larger than
the actual socket buffer length. In such case the cloned
skb passed up the network stack will leak kernel memory contents.
Additionally prevent integer underflow when size is less than
ETH_FCS_LEN.
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In emac_probe, &adpt->work_thread is bound with
emac_work_thread. Then it will be started by timeout
handler emac_tx_timeout or a IRQ handler emac_isr.
If we remove the driver which will call emac_remove
to make cleanup, there may be a unfinished work.
The possible sequence is as follows:
Fix it by finishing the work before cleanup in the emac_remove
and disable timeout response.
CPU0 CPU1
|emac_work_thread
emac_remove |
free_netdev |
kfree(netdev); |
|emac_reinit_locked
|emac_mac_down
|//use netdev
Fixes: b9b17debc6 ("net: emac: emac gigabit ethernet controller driver")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We collect the software statistics counters for RX bytes (reported to
/proc/net/dev and to ethtool -S $dev | grep 'rx_bytes: ") at a time when
skb->len has already been adjusted by the eth_type_trans() ->
skb_pull_inline(skb, ETH_HLEN) call to exclude the L2 header.
This means that when connecting 2 DSA interfaces back to back and
sending 1 packet with length 100, the sending interface will report
tx_bytes as incrementing by 100, and the receiving interface will report
rx_bytes as incrementing by 86.
Since accounting for that in scripts is quirky and is something that
would be DSA-specific behavior (requiring users to know that they are
running on a DSA interface in the first place), the proposal is that we
treat it as a bug and fix it.
This design bug has always existed in DSA, according to my analysis:
commit 91da11f870 ("net: Distributed Switch Architecture protocol
support") also updates skb->dev->stats.rx_bytes += skb->len after the
eth_type_trans() call. Technically, prior to Florian's commit
a86d8becc3 ("net: dsa: Factor bottom tag receive functions"), each and
every vendor-specific tagging protocol driver open-coded the same bug,
until the buggy code was consolidated into something resembling what can
be seen now. So each and every driver should have its own Fixes: tag,
because of their different histories until the convergence point.
I'm not going to do that, for the sake of simplicity, but just blame the
oldest appearance of buggy code.
There are 2 ways to fix the problem. One is the obvious way, and the
other is how I ended up doing it. Obvious would have been to move
dev_sw_netstats_rx_add() one line above eth_type_trans(), and below
skb_push(skb, ETH_HLEN). But DSA processing is not as simple as that.
We count the bytes after removing everything DSA-related from the
packet, to emulate what the packet's length was, on the wire, when the
user port received it.
When eth_type_trans() executes, dsa_untag_bridge_pvid() has not run yet,
so in case the switch driver requests this behavior - commit
412a1526d0 ("net: dsa: untag the bridge pvid from rx skbs") has the
details - the obvious variant of the fix wouldn't have worked, because
the positioning there would have also counted the not-yet-stripped VLAN
header length, something which is absent from the packet as seen on the
wire (there it may be untagged, whereas software will see it as
PVID-tagged).
Fixes: f613ed665b ("net: dsa: Add support for 64-bit statistics")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we change the M/N values seamlessly during a fastset we should
also update the vblank timestamping stuff to make sure the vblank
timestamp corrections/guesstimations come out exact.
Note that only crtc_clock and framedur_ns can actually end up
changing here during fastsets. Everything else we touch can
only change during full modesets.
Technically we should try to do this exactly at the start of
vblank, but that would require some kind of double buffering
scheme. Let's skip that for now and just update things right
after the commit has been submitted to the hardware. This
means the information will be properly up to date when the
vblank irq handler goes to work. Only if someone ends up
querying some vblanky stuff in between the commit and start
of vblank may we see a slight discrepancy.
Also this same problem really exists for the DRRS downclocking
stuff. But as that is supposed to be more or less transparent
to the user, and it only drops to low gear after a long delay
(1 sec currently) we probably don't have to worry about it.
Any time something is actively submitting updates DRRS will
remain in high gear and so the timestamping constants will
match the hardware state.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com>
Fixes: e6f29923c0 ("drm/i915: Allow M/N change during fastset on bdw+")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230310235828.17439-1-ville.syrjala@linux.intel.com
(cherry picked from commit 8cb1f95cca)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The Wa_14017073508 require to send Media Busy/Idle mailbox while
accessing Media tile. As of now it is getting handled while __gt_unpark,
__gt_park. But there are various corner cases where forcewakes are taken
without __gt_unpark i.e. without sending Busy Mailbox especially during
register reads. Forcewakes are taken without busy mailbox leads to
GPU HANG. So bringing mailbox calls under forcewake calls are no feasible
option as forcewake calls are atomic and mailbox calls are blocking.
The issue already fixed in B step so disabling MC6 on A step and
reverting previous commit which handles Wa_14017073508
Fixes: 8f70f1ec58 ("drm/i915/mtl: Add Wa_14017073508 for SAMedia")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230310061339.2495416-2-badal.nilawar@intel.com
(cherry picked from commit 038a24835a)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
When interrupt auto clear is programmed, any read to the interrupt
status register will clear all interrupts. If two interrupts have
come in before one can be serviced then this will cause lost interrupts.
On AMD USB4 routers this has manifested in odd problems particularly
with long strings of control tranfers such as reading the DROM via bit
banging.
Instead of clearing interrupts automatically, clear the bit corresponding
to the given ring's interrupt in the ISR.
Fixes: 7a1808f82a ("thunderbolt: Handle ring interrupt by reading interrupt status register")
Cc: Sanju Mehta <Sanju.Mehta@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Anson Tsao <anson.tsao@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
percpu_counter_sum_all() is now redundant as the race condition it
was invented to handle is now dealt with by percpu_counter_sum()
directly and all users of percpu_counter_sum_all() have been
removed.
Remove it.
This effectively reverts the changes made in f689054aac
("percpu_counter: add percpu_counter_sum_all interface") except for
the cpumask iteration that fixes percpu_counter_sum() made earlier
in this series.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
This effectively reverts the change made in commit f689054aac
("percpu_counter: add percpu_counter_sum_all interface") as the
race condition percpu_counter_sum_all() was invented to avoid is
now handled directly in percpu_counter_sum() and nobody needs to
care about summing racing with cpu unplug anymore.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
In commit f689054aac ("percpu_counter: add percpu_counter_sum_all
interface") a race condition between a cpu dying and
percpu_counter_sum() iterating online CPUs was identified. The
solution was to iterate all possible CPUs for summation via
percpu_counter_sum_all().
We recently had a percpu_counter_sum() call in XFS trip over this
same race condition and it fired a debug assert because the
filesystem was unmounting and the counter *should* be zero just
before we destroy it. That was reported here:
https://lore.kernel.org/linux-kernel/20230314090649.326642-1-yebin@huaweicloud.com/
likely as a result of running generic/648 which exercises
filesystems in the presence of CPU online/offline events.
The solution to use percpu_counter_sum_all() is an awful one. We
use percpu counters and percpu_counter_sum() for accurate and
reliable threshold detection for space management, so a summation
race condition during these operations can result in overcommit of
available space and that may result in filesystem shutdowns.
As percpu_counter_sum_all() iterates all possible CPUs rather than
just those online or even those present, the mask can include CPUs
that aren't even installed in the machine, or in the case of
machines that can hot-plug CPU capable nodes, even have physical
sockets present in the machine.
Fundamentally, this race condition is caused by the CPU being
offlined being removed from the cpu_online_mask before the notifier
that cleans up per-cpu state is run. Hence percpu_counter_sum() will
not sum the count for a cpu currently being taken offline,
regardless of whether the notifier has run or not. This is
the root cause of the bug.
The percpu counter notifier iterates all the registered counters,
locks the counter and moves the percpu count to the global sum.
This is serialised against other operations that move the percpu
counter to the global sum as well as percpu_counter_sum() operations
that sum the percpu counts while holding the counter lock.
Hence the notifier is safe to run concurrently with sum operations,
and the only thing we actually need to care about is that
percpu_counter_sum() iterates dying CPUs. That's trivial to do,
and when there are no CPUs dying, it has no addition overhead except
for a cpumask_or() operation.
This change makes percpu_counter_sum() always do the right thing in
the presence of CPU hot unplug events and makes
percpu_counter_sum_all() unnecessary. This, in turn, means that
filesystems like XFS, ext4, and btrfs don't have to work out when
they should use percpu_counter_sum() vs percpu_counter_sum_all() in
their space accounting algorithms
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Equivalent of for_each_cpu_and, except it ORs the two masks together
so it iterates all the CPUs present in either mask.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Back in the 6.2-rc1 days, Eric Whitney reported a fstests regression in
ext4 against generic/454. The cause of this test failure was the
unfortunate combination of setting an xattr name containing UTF8 encoded
emoji, an xattr hash function that accepted a char pointer with no
explicit signedness, signed type extension of those chars to an int, and
the 6.2 build tools maintainers deciding to mandate -funsigned-char
across the board. As a result, the ondisk extended attribute structure
written out by 6.1 and 6.2 were not the same.
This discrepancy, in fact, had been noticeable if a filesystem with such
an xattr were moved between any two architectures that don't employ the
same signedness of a raw "char" declaration. The only reason anyone
noticed is that x86 gcc defaults to signed, and no such -funsigned-char
update was made to e2fsprogs, so e2fsck immediately started reporting
data corruption.
After a day and a half of discussing how to handle this use case (xattrs
with bit 7 set anywhere in the name) without breaking existing users,
Linus merged his own patch and didn't tell the maintainer. None of the
ext4 developers realized this until AUTOSEL announced that the commit
had been backported to stable.
In the end, this problem could have been detected much earlier if there
had been any useful tests of hash function(s) in use inside ext4 to make
sure that they always produce the same outputs given the same inputs.
The XFS dirent/xattr name hash takes a uint8_t*, so I don't think it's
vulnerable to this problem. However, let's avoid all this drama by
adding our own self test to check that the da hash produces the same
outputs for a static pile of inputs on various platforms. This enables
us to fix any breakage that may result in a controlled fashion. The
buffer and test data are identical to the patches submitted to xfsprogs.
Link: https://lore.kernel.org/linux-ext4/Y8bpkm3jA3bDm3eL@debian-BULLSEYE-live-builder-AMD64/
Link: https://lore.kernel.org/linux-xfs/ZBUKCRR7xvIqPrpX@destitution/T/#md38272cc684e2c0d61494435ccbb91f022e8dee4
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
There are now five separate space allocator interfaces exposed to the
rest of XFS for five different strategies to find space. Add
tracepoints for each of them so that I can tell from a trace dump
exactly which ones got called and what happened underneath them. Add a
sixth so it's more obvious if an allocation actually happened.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Callers of xfs_alloc_vextent_iterate_ags that pass in the TRYLOCK flag
want us to perform a non-blocking scan of the AGs for free space. There
are no ordering constraints for non-blocking AGF lock acquisition, so
the scan can freely start over at AG 0 even when minimum_agno > 0.
This manifests fairly reliably on xfs/294 on 6.3-rc2 with the parent
pointer patchset applied and the realtime volume enabled. I observed
the following sequence as part of an xfs_dir_createname call:
0. Fragment the free space, then allocate nearly all the free space in
all AGs except AG 0.
1. Create a directory in AG 2 and let it grow for a while.
2. Try to allocate 2 blocks to expand the dirent part of a directory.
The space will be allocated out of AG 0, but the allocation will not
be contiguous. This (I think) activates the LOWMODE allocator.
3. The bmapi call decides to convert from extents to bmbt format and
tries to allocate 1 block. This allocation request calls
xfs_alloc_vextent_start_ag with the inode number, which starts the
scan at AG 2. We ignore AG 0 (with all its free space) and instead
scrape AG 2 and 3 for more space. We find one block, but this now
kicks t_highest_agno to 3.
4. The createname call decides it needs to split the dabtree. It tries
to allocate even more space with xfs_alloc_vextent_start_ag, but now
we're constrained to AG 3, and we don't find the space. The
createname returns ENOSPC and the filesystem shuts down.
This change fixes the problem by making the trylock scan wrap around to
AG 0 if it doesn't like the AGs that it finds. Since the current
transaction itself holds AGF 0, the trylock of AGF 0 will succeed, and
we take space from the AG that has plenty.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
The cooling levels array is supposed to prevent the system fans from
being configured below a 20% duty cycle as otherwise some of them get
stuck at 0 RPM.
Due to an off-by-one error, the last element in the array was not
initialized, causing it to be set to zero, which in turn lead to fans
being configured with a 0% duty cycle in maximum cooling state.
Since commit 332fdf951d ("mlxsw: thermal: Fix out-of-bounds memory
accesses") the contents of the array are static. Therefore, instead of
fixing the initialization of the array, simply remove it and adjust
thermal_cooling_device_ops::set_cur_state() so that the configured duty
cycle is never set below 20%.
Before:
# cat /sys/class/thermal/thermal_zone0/cdev0/type
mlxsw_fan
# echo 10 > /sys/class/thermal/thermal_zone0/cdev0/cur_state
# cat /sys/class/hwmon/hwmon0/name
mlxsw
# cat /sys/class/hwmon/hwmon0/pwm1
0
After:
# cat /sys/class/thermal/thermal_zone0/cdev0/type
mlxsw_fan
# echo 10 > /sys/class/thermal/thermal_zone0/cdev0/cur_state
# cat /sys/class/hwmon/hwmon0/name
mlxsw
# cat /sys/class/hwmon/hwmon0/pwm1
255
This bug was uncovered when the thermal subsystem repeatedly tried to
configure the cooling devices to their maximum state due to another
issue [1]. This resulted in the fans being stuck at 0 RPM, which
eventually lead to the system undergoing thermal shutdown.
[1] https://lore.kernel.org/netdev/ZA3CFNhU4AbtsP4G@shredder/
Fixes: a421ce088a ("mlxsw: core: Extend cooling device with cooling levels")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew reports that the SFF modules on one of the ZII platforms do not
indicate link up due to the SFP code believing that LOS indicating that
there is no signal being received from the remote end, but in fact the
LOS signal is showing that there is signal.
What makes SFF modules different from SFPs is they typically have an
inverted LOS, which uncovered this issue. When we read the hardware
state, we mask it with state_hw_mask so we ignore anything we're not
interested in. However, we don't re-read when state_hw_mask changes,
leading to sfp->state being stale.
Arrange for a software poll of the module state after we have parsed
the EEPROM in sfp_sm_mod_probe() and updated state_*_mask. This will
generate any necessary events for signal changes for the state
machine as well as updating sfp->state.
Reported-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 8475c4b70b ("net: sfp: re-implement soft state polling setup")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently DMA address width is either read from a RO device register
or force set from the platform data. This breaks DMA when the host DMA
address width is <=32it but the device is >32bit.
Right now the driver may decide to use a 2nd DMA descriptor for
another buffer (happens in case of TSO xmit) assuming that 32bit
addressing is used due to platform configuration but the device will
still use both descriptor addresses as one address.
This can be observed with the Intel EHL platform driver that sets
32bit for addr64 but the MAC reports 40bit. The TX queue gets stuck in
case of TCP with iptables NAT configuration on TSO packets.
The logic should be like this: Whatever we do on the host side (memory
allocation GFP flags) should happen with the host DMA width, whenever
we decide how to set addresses on the device registers we must use the
device DMA address width.
This patch renames the platform address width field from addr64 (term
used in device datasheet) to host_addr and uses this value exclusively
for host side operations while all chip operations consider the device
DMA width as read from the device register.
Fixes: 7cfc4486e7 ("stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing")
Signed-off-by: Jochen Henneberg <jh@henneberg-systemdesign.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli says:
====================
ACPI/DT mdiobus module owner fixes
This patch series fixes wrong mdiobus module ownership for MDIO buses
registered from DT or ACPI.
Thanks Maxime for providing the first patch and making me see that ACPI
also had the same issue.
Changes in v2:
- fixed missing kdoc in the first patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Bus ownership is wrong when using acpi_mdiobus_register() to register an
mdio bus. That function is not inline, so when it calls
mdiobus_register() the wrong THIS_MODULE value is captured.
CC: Maxime Bizon <mbizon@freebox.fr>
Fixes: 803ca24d2f ("net: mdio: Add ACPI support code for mdio")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bus ownership is wrong when using of_mdiobus_register() to register an mdio
bus. That function is not inline, so when it calls mdiobus_register() the wrong
THIS_MODULE value is captured.
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Fixes: 90eff9096c ("net: phy: Allow splitting MDIO bus/device support from PHYs")
[florian: fix kdoc, added Fixes tag]
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the phy_disconnect() -> phy_stop() path, we will be forcibly setting
the PHY state machine to PHY_HALTED. This invalidates the old_state !=
phydev->state condition in phy_state_machine() such that we will neither
display the state change for debugging, nor will we invoke the
link_change_notify() callback.
Factor the code by introducing phy_process_state_change(), and ensure
that we process the state change from phy_stop() as well.
Fixes: 5c5f626bca ("net: phy: improve handling link_change_notify callback")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-16 (igb, igbvf, igc)
This series contains updates to igb, igbvf, and igc drivers.
Lin Ma removes rtnl_lock() when disabling SRIOV on remove which was
causing deadlock on igb.
Akihiko Odaki delays enabling of SRIOV on igb to prevent early messages
that could get ignored and clears MAC address when PF returns nack on
reset; indicating no MAC address was assigned for igbvf.
Gaosheng Cui frees IRQs in error path for igbvf.
Akashi Takahiro fixes logic on checking TAPRIO gate support for igc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In xirc2ps_probe, the local->tx_timeout_task was bounded
with xirc2ps_tx_timeout_task. When timeout occurs,
it will call xirc_tx_timeout->schedule_work to start the
work.
When we call xirc2ps_detach to remove the driver, there
may be a sequence as follows:
Stop responding to timeout tasks and complete scheduled
tasks before cleanup in xirc2ps_detach, which will fix
the problem.
CPU0 CPU1
|xirc2ps_tx_timeout_task
xirc2ps_detach |
free_netdev |
kfree(dev); |
|
| do_reset
| //use dev
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have to make sure that the info returned by the helper is valid
before using it.
Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.
Fixes: f990c82c38 ("qed*: Add support for ndo_set_vf_trust")
Fixes: 733def6a04 ("qed*: IOV link control")
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is a bug for fscrypt_put_master_key_activeref() to see a NULL
keyring. But it used to be possible due to the bug, now fixed, where
fscrypt_destroy_keyring() was called before security_sb_delete(). To be
consistent with how fscrypt_destroy_keyring() uses WARN_ON for the same
issue, WARN and leak the fscrypt_master_key if the keyring is NULL
instead of dereferencing the NULL pointer.
This is a robustness improvement, not a fix.
Link: https://lore.kernel.org/r/20230313221231.272498-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Another Lenovo convertable which reports a landscape resolution of
1920x1200 with a pitch of (1920 * 4) bytes, while the actual framebuffer
has a resolution of 1200x1920 with a pitch of (1200 * 4) bytes.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Commit 8633ef82f1 ("drivers/firmware: consolidate EFI framebuffer setup
for all arches") moved the sysfb_apply_efi_quirks() call in sysfb_init()
from before the [sysfb_]parse_mode() call to after it.
But sysfb_apply_efi_quirks() modifies the global screen_info struct which
[sysfb_]parse_mode() parses, so doing it later is too late.
This has broken all DMI based quirks for correcting wrong firmware efifb
settings when simpledrm is used.
To fix this move the sysfb_apply_efi_quirks() call back to its old place
and split the new setup of the efifb_fwnode (which requires
the platform_device) into its own function and call that at
the place of the moved sysfb_apply_efi_quirks(pd) calls.
Fixes: 8633ef82f1 ("drivers/firmware: consolidate EFI framebuffer setup for all arches")
Cc: stable@vger.kernel.org
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
We no longer use the recsize argument for locating the string table in
an SMBIOS record, so we can drop it from the internal API.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Instead of using the SMBIOS type 1 record 'family' field, which is often
modified by OEMs, use the type 4 'processor ID' and 'processor version'
fields, which are set to a small set of probe-able values on all known
Ampere EFI systems in the field.
Fixes: 550b33cfd4 ("arm64: efi: Force the use of ...")
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
The type 1 SMBIOS record happens to always be the same size, but there
are other record types which have been augmented over time, and so we
should really use the length field in the header to decide where the
string table starts.
Fixes: 550b33cfd4 ("arm64: efi: Force the use of ...")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-16 (iavf)
This series contains updates to iavf driver only.
Alex fixes incorrect check against Rx hash feature and corrects payload
value for IPv6 UDP packet.
Ahmed removes bookkeeping of VLAN 0 filter as it always exists and can
cause a false max filter error message.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
iavf: do not track VLAN 0 filters
iavf: fix non-tunneled IPv6 UDP packet type and hashing
iavf: fix inverted Rx hash condition leading to disabled hash
====================
Link: https://lore.kernel.org/r/20230316155316.1554931-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The splice read calls nfsd_splice_actor to put the pages containing file
data into the svc_rqst->rq_pages array. It's possible however to get a
splice result that only has a partial page at the end, if (e.g.) the
filesystem hands back a short read that doesn't cover the whole page.
nfsd_splice_actor will plop the partial page into its rq_pages array and
return. Then later, when nfsd_splice_actor is called again, the
remainder of the page may end up being filled out. At this point,
nfsd_splice_actor will put the page into the array _again_ corrupting
the reply. If this is done enough times, rq_next_page will overrun the
array and corrupt the trailing fields -- the rq_respages and
rq_next_page pointers themselves.
If we've already added the page to the array in the last pass, don't add
it to the array a second time when dealing with a splice continuation.
This was originally handled properly in nfsd_splice_actor, but commit
91e23b1c39 ("NFSD: Clean up nfsd_splice_actor()") removed the check
for it.
Fixes: 91e23b1c39 ("NFSD: Clean up nfsd_splice_actor()")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Dario Lesca <d.lesca@solinos.it>
Tested-by: David Critch <dcritch@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2150630
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
We had a couple of checks for session in cifs_tree_connect
and cifs_mark_open_files_invalid, which were unnecessary.
And that was done with ses_lock. Changed that to tc_lock too.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Arm SCMI fixes for v6.3
Few fixes addressing issues around validation of device tree SCMI node,
allowing raw SCMI access even on systems which fail to probe the normal
driver stack, duplicate header inclusion, clean up return statement using
literal values instead of variable to simplify and use of devm_bitmap_zalloc
instead of devm_kcalloc to improve semantic.
* tag 'scmi-fixes-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_scmi: Use the bitmap API to allocate bitmaps
firmware: arm_scmi: Fix device node validation for mailbox transport
firmware: arm_scmi: Fix raw coexistence mode behaviour on failure path
firmware: arm_scmi: Remove duplicate include header inclusion
firmware: arm_scmi: Return a literal instead of a variable
firmware: arm_scmi: Clean up a return statement in scmi_probe
Link: https://lore.kernel.org/r/20230315193557.1709241-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
i.MX fixes for 6.3:
- A couple of i.MX93 fixes from Alexander Stein to correct EQoS Ethernet
properties.
- Correct clock-names of FlexSPI device in imx8-ss-lsio DT.
- Fix EQoS PHY reset GPIO by dropping the deprecated/wrong property and
switch to the new bindings.
- Fix an issue with imx-weim bus driver that branch condition evaluates
to a garbage value.
- Correct WM8960 clock name for imx8mm-nitrogen-r2 board.
- Fix LCDIF2 clocks for i.MX8MP DT.
- Add missing #sound-dai-cells properties to SAI nodes for i.MX8MN DT.
- Revert LS1028A DT changes of getting MAC addresses from VPD, as the
dependency on NVMEM device is not in place.
- A series from Peng Fan to add missing pinctrl property for i.MX6SL
based devices.
* tag 'imx-fixes-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
arm64: dts: imx93: add missing #address-cells and #size-cells to i2c nodes
bus: imx-weim: fix branch condition evaluates to a garbage value
arm64: dts: imx8mn: specify #sound-dai-cells for SAI nodes
ARM: dts: imx6sl: tolino-shine2hd: fix usbotg1 pinctrl
ARM: dts: imx6sll: e60k02: fix usbotg1 pinctrl
ARM: dts: imx6sll: e70k02: fix usbotg1 pinctrl
arm64: dts: imx93: Fix eqos properties
arm64: dts: imx8mp: Fix LCDIF2 node clock order
arm64: dts: imx8mm-nitrogen-r2: fix WM8960 clock name
arm64: dts: imx8dxl-evk: Fix eqos phy reset gpio
Revert "arm64: dts: ls1028a: sl28: get MAC addresses from VPD"
arm64: dts: freescale: imx8-ss-lsio: Fix flexspi clock order
Link: https://lore.kernel.org/r/20230315132814.GF143566@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
arm64: tegra: Device tree fixes for v6.3-rc1
This contains a fix for the CBB bus' ranges property on Tegra194 and
Tegra234 that restores proper translation of PCI addresses.
* tag 'tegra-for-6.3-arm64-dt-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
arm64: tegra: Bump CBB ranges property on Tegra194 and Tegra234
Link: https://lore.kernel.org/r/20230302094213.3874449-1-thierry.reding@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
While adding and removing the controller, the following call trace was
observed:
WARNING: CPU: 3 PID: 623596 at kernel/dma/mapping.c:532 dma_free_attrs+0x33/0x50
CPU: 3 PID: 623596 Comm: sh Kdump: loaded Not tainted 5.14.0-96.el9.x86_64 #1
RIP: 0010:dma_free_attrs+0x33/0x50
Call Trace:
qla2x00_async_sns_sp_done+0x107/0x1b0 [qla2xxx]
qla2x00_abort_srb+0x8e/0x250 [qla2xxx]
? ql_dbg+0x70/0x100 [qla2xxx]
__qla2x00_abort_all_cmds+0x108/0x190 [qla2xxx]
qla2x00_abort_all_cmds+0x24/0x70 [qla2xxx]
qla2x00_abort_isp_cleanup+0x305/0x3e0 [qla2xxx]
qla2x00_remove_one+0x364/0x400 [qla2xxx]
pci_device_remove+0x36/0xa0
__device_release_driver+0x17a/0x230
device_release_driver+0x24/0x30
pci_stop_bus_device+0x68/0x90
pci_stop_and_remove_bus_device_locked+0x16/0x30
remove_store+0x75/0x90
kernfs_fop_write_iter+0x11c/0x1b0
new_sync_write+0x11f/0x1b0
vfs_write+0x1eb/0x280
ksys_write+0x5f/0xe0
do_syscall_64+0x5c/0x80
? do_user_addr_fault+0x1d8/0x680
? do_syscall_64+0x69/0x80
? exc_page_fault+0x62/0x140
? asm_exc_page_fault+0x8/0x30
entry_SYSCALL_64_after_hwframe+0x44/0xae
The command was completed in the abort path during driver unload with a
lock held, causing the warning in abort path. Hence complete the command
without any lock held.
Reported-by: Lin Li <lilin@redhat.com>
Tested-by: Lin Li <lilin@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230313043711.13500-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiaomi Poco F1 (qcom/sdm845-xiaomi-beryllium*.dts) comes with a SKhynix
H28U74301AMR UFS. The sd_read_cpr() operation leads to a 120 second
timeout, making the device bootup very slow:
[ 121.457736] sd 0:0:0:1: [sdb] tag#23 timing out command, waited 120s
Setting the BLIST_SKIP_VPD_PAGES allows the device to skip the failing
sd_read_cpr operation and boot normally.
Signed-off-by: Joel Selvaraj <joelselvaraj.oss@gmail.com>
Link: https://lore.kernel.org/r/20230313041402.39330-1-joelselvaraj.oss@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The data->block[0] variable comes from user and is a number between
0-255. Without proper check, the variable may be very large to cause
an out-of-bounds when performing memcpy in slimpro_i2c_blkwr.
Fix this bug by checking the value of writelen.
Fixes: f6505fbabc ("i2c: add SLIMpro I2C device driver on APM X-Gene platform")
Signed-off-by: Wei Chen <harperchen1110@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
The controller will always generate a completion interrupt when the
transfer is finished normally or not. Currently we use either error or
completion interrupt to finish, this may result the completion
interrupt unhandled and corrupt the next transfer, especially at low
speed mode. Since on error case, the error interrupt will come first
then is the completion interrupt. So only use the completion interrupt
to finish the whole transfer process.
Fixes: d62fbdb99a ("i2c: add support for HiSilicon I2C controller")
Reported-by: Sheng Feng <fengsheng5@huawei.com>
Signed-off-by: Sheng Feng <fengsheng5@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
After issuing all the messages we can disable the TX_EMPTY interrupts
to avoid handling redundant interrupts. For doing a sinlge bus
detection (i2cdetect -y -r 0) we can reduce ~97% interrupts (before
~12000 after ~400).
Signed-off-by: Sheng Feng <fengsheng5@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
We found that after commit 9c46929e79
("ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems"), the
PCF85063 RTC driver stopped working on i.MX28 due to regmap_bulk_read()
reading bogus data into a stack buffer. This is caused by the i2c-mxs
driver using DMA transfers even for messages without the I2C_M_DMA_SAFE
flag, and the aforementioned commit enabling vmapped stacks.
As the MXS I2C controller requires DMA for reads of >4 bytes, DMA can't be
disabled, so the issue is fixed by using i2c_get_dma_safe_msg_buf() to
create a bounce buffer when needed.
Fixes: 9c46929e79 ("ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems")
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
When reading from I2C, the Tx watermark is set to 0. Unfortunately the
TDF (transmit data flag) is enabled when Tx FIFO entries is equal or less
than watermark. So it is set in every case, hence the reset default of 1.
This results in the MSR_RDF _and_ MSR_TDF flags to be set thus trying
to send Tx data on a read message.
Mask the IRQ status to filter for wanted flags only.
Fixes: a55fa9d0e4 ("i2c: imx-lpi2c: add low power i2c bus driver")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Tested-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Check alloc_precpu()'s return value and return an error from
dm_stats_init() if it fails. Update alloc_dev() to fail if
dm_stats_init() does.
Otherwise, a NULL pointer dereference will occur in dm_stats_cleanup()
even if dm-stats isn't being actively used.
Fixes: fd2ed4d252 ("dm: add statistics support")
Cc: stable@vger.kernel.org
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
In porting his development branch to 6.3-rc1, yours truly has
repeatedly screwed up the args->pag being fed to the xfs_alloc_vextent*
functions. Add some debugging assertions to test the preconditions
required of the callers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
The check introduced in the commit a5fd39464a ("igc: Lift TAPRIO schedule
restriction") can detect a false positive error in some corner case.
For instance,
tc qdisc replace ... taprio num_tc 4
...
sched-entry S 0x01 100000 # slot#1
sched-entry S 0x03 100000 # slot#2
sched-entry S 0x04 100000 # slot#3
sched-entry S 0x08 200000 # slot#4
flags 0x02 # hardware offload
Here the queue#0 (the first queue) is on at the slot#1 and #2,
and off at the slot#3 and #4. Under the current logic, when the slot#4
is examined, validate_schedule() returns *false* since the enablement
count for the queue#0 is two and it is already off at the previous slot
(i.e. #3). But this definition is truely correct.
Let's fix the logic to enforce a strict validation for consecutively-opened
slots.
Fixes: a5fd39464a ("igc: Lift TAPRIO schedule restriction")
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
vf reset nack actually represents the reset operation itself is
performed but no address is assigned. Therefore, e1000_reset_hw_vf
should fill the "perm_addr" with the zero address and return success on
such an occasion. This prevents its callers in netdev.c from saying PF
still resetting, and instead allows them to correctly report that no
address is assigned.
Fixes: 6ddbc4cf1f ("igb: Indicate failure on vf reset for empty mac address")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Enabling SR-IOV causes the virtual functions to make requests to the
PF via the mailbox. Notably, E1000_VF_RESET request will happen during
the initialization of the VF. However, unless the reinit is done, the
VMMB interrupt, which delivers mailbox interrupt from VF to PF will be
kept masked and such requests will be silently ignored.
Enable SR-IOV at the very end of the procedure to configure the device
for SR-IOV so that the PF is configured properly for SR-IOV when a VF is
activated.
Fixes: fa44f2f185 ("igb: Enable SR-IOV configuration via PCI sysfs interface")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When an interface with the maximum number of VLAN filters is brought up,
a spurious error is logged:
[257.483082] 8021q: adding VLAN 0 to HW filter on device enp0s3
[257.483094] iavf 0000:00:03.0 enp0s3: Max allowed VLAN filters 8. Remove existing VLANs or disable filtering via Ethtool if supported.
The VF driver complains that it cannot add the VLAN 0 filter.
On the other hand, the PF driver always adds VLAN 0 filter on VF
initialization. The VF does not need to ask the PF for that filter at
all.
Fix the error by not tracking VLAN 0 filters altogether. With that, the
check added by commit 0e710a3ffd ("iavf: Fix VF driver counting VLAN 0
filters") in iavf_virtchnl.c is useless and might be confusing if left as
it suggests that we track VLAN 0.
Fixes: 0e710a3ffd ("iavf: Fix VF driver counting VLAN 0 filters")
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently, IAVF's decode_rx_desc_ptype() correctly reports payload type
of L4 for IPv4 UDP packets and IPv{4,6} TCP, but only L3 for IPv6 UDP.
Originally, i40e, ice and iavf were affected.
Commit 73df8c9e3e ("i40e: Correct UDP packet header for non_tunnel-ipv6")
fixed that in i40e, then
commit 638a0c8c88 ("ice: fix incorrect payload indicator on PTYPE")
fixed that for ice.
IPv6 UDP is L4 obviously. Fix it and make iavf report correct L4 hash
type for such packets, so that the stack won't calculate it on CPU when
needs it.
Fixes: 206812b5fc ("i40e/i40evf: i40e implementation for skb_set_hash")
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Condition, which checks whether the netdev has hashing enabled is
inverted. Basically, the tagged commit effectively disabled passing flow
hash from descriptor to skb, unless user *disables* it via Ethtool.
Commit a876c3ba59 ("i40e/i40evf: properly report Rx packet hash")
fixed this problem, but only for i40e.
Invert the condition now in iavf and unblock passing hash to skbs again.
Fixes: 857942fd1a ("i40e: Fix Rx hash reported to the stack by our driver")
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing
performance problems and is hindering adoption of fsverity. It was
intended to solve a race condition where unverified pages might be left
in the pagecache. But actually it doesn't solve it fully.
Since the incomplete solution for this race condition has too much
performance impact for it to be worth it, let's remove it for now.
Fixes: 3fda4c617e ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl")
Cc: stable@vger.kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
The space_info->active_total_bytes is no longer necessary as we now
count the region of newly allocated block group as zone_unusable. Drop
its usage.
Fixes: 6a921de589 ("btrfs: zoned: introduce space_info->active_total_bytes")
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The naming of space_info->active_total_bytes is misleading. It counts
not only active block groups but also full ones which are previously
active but now inactive. That confusion results in a bug not counting
the full BGs into active_total_bytes on mount time.
For a background, there are three kinds of block groups in terms of
activation.
1. Block groups never activated
2. Block groups currently active
3. Block groups previously active and currently inactive (due to fully
written or zone finish)
What we really wanted to exclude from "total_bytes" is the total size of
BGs #1. They seem empty and allocatable but since they are not activated,
we cannot rely on them to do the space reservation.
And, since BGs #1 never get activated, they should have no "used",
"reserved" and "pinned" bytes.
OTOH, BGs #3 can be counted in the "total", since they are already full
we cannot allocate from them anyway. For them, "total_bytes == used +
reserved + pinned + zone_unusable" should hold.
Tracking #2 and #3 as "active_total_bytes" (current implementation) is
confusing. And, tracking #1 and subtract that properly from "total_bytes"
every time you need space reservation is cumbersome.
Instead, we can count the whole region of a newly allocated block group as
zone_unusable. Then, once that block group is activated, release
[0 .. zone_capacity] from the zone_unusable counters. With this, we can
eliminate the confusing ->active_total_bytes and the code will be common
among regular and the zoned mode. Also, no additional counter is needed
with this approach.
Fixes: 6a921de589 ("btrfs: zoned: introduce space_info->active_total_bytes")
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We do
cache->space_info->counter += num_bytes;
everywhere in here. This is makes the lines longer than they need to
be, and will be especially noticeable when we add the active tracking in,
so add a temp variable for the space_info so this is cleaner.
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This flag only gets set when we're doing active zone tracking, and we're
going to need to use this flag for things related to this behavior.
Rename the flag to represent what it actually means for the file system
so it can be used in other ways and still make sense.
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs_can_activate_zone() returns true if at least one device has one zone
available for activation. This is OK for the single profile, but not OK for
DUP profile. We need two zones to create a DUP block group. Fix it by
properly handling the case with the profile flags.
Fixes: 265f7237dd ("btrfs: zoned: allow DUP on meta-data block groups")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 1ec49744ba ("btrfs: turn on -Wmaybe-uninitialized") exposed
that on SPARC and PA-RISC, gcc is unaware that fscrypt_setup_filename()
only returns negative error values or 0. This ultimately results in a
maybe-uninitialized warning in btrfs_lookup_dentry().
Change to only return negative error values or 0 from
fscrypt_setup_filename() at the relevant call site, and assert that no
positive error codes are returned (which would have wider implications
involving other users).
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/all/481b19b5-83a0-4793-b4fd-194ad7b978c3@roeck-us.net/
Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
During my scrub rework, I did a stupid thing like this:
bio->bi_iter.bi_sector = stripe->logical;
btrfs_submit_bio(fs_info, bio, stripe->mirror_num);
Above bi_sector assignment is using logical address directly, which
lacks ">> SECTOR_SHIFT".
This results a read on a range which has no chunk mapping.
This results the following crash:
BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536
assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387
Sure this is all my fault, but this shows a possible problem in real
world, that some bit flip in file extents/tree block can point to
unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT
is not configured, cause invalid pointer access.
[PROBLEMS]
In the above call chain, we just don't handle the possible error from
btrfs_get_chunk_map() inside __btrfs_map_block().
[FIX]
The fix is straightforward, replace the ASSERT() with proper error
handling (callers handle errors already).
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The driver can be compile tested with !CONFIG_OF making certain data
unused:
drivers/net/wireless/marvell/mwifiex/sdio.c:498:34: error: ‘mwifiex_sdio_of_match_table’ defined but not used [-Werror=unused-const-variable=]
drivers/net/wireless/marvell/mwifiex/pcie.c:175:34: error: ‘mwifiex_pcie_of_match_table’ defined but not used [-Werror=unused-const-variable=]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230312132523.352182-1-krzysztof.kozlowski@linaro.org
WQ_UNBOUND causes significant scheduler latency on ARM64/Android. This
is problematic for latency sensitive workloads, like I/O
post-processing.
Removing WQ_UNBOUND gives a 96% reduction in fsverity workqueue related
scheduler latency and improves app cold startup times by ~30ms.
WQ_UNBOUND was also removed from the dm-verity workqueue for the same
reason [1].
This code was tested by running Android app startup benchmarks and
measuring how long the fsverity workqueue spent in the runnable state.
Before
Total workqueue scheduler latency: 553800us
After
Total workqueue scheduler latency: 18962us
[1]: https://lore.kernel.org/all/20230202012348.885402-1-nhuck@google.com/
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Fixes: 8a1d0f9cac ("fs-verity: add data verification hooks for ->readpages()")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230310193325.620493-1-nhuck@google.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
When the user's login time is newer than the cache's timestamp,
the original entry in the RB-tree will be replaced by a new entry.
Currently, the timestamp is only set if the entry is not found in
the RB-tree, which can cause the timestamp to be undefined when
the entry exists. This may result in a significant increase in
ACCESS operations if the timestamp is set to zero.
Signed-off-by: Chengen Du <chengen.du@canonical.com>
Fixes: 0eb43812c0 ("NFS: Clear the file access cache upon login”)
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Commit 6930bcbfb6 dropped the setting of the file_lock range when
decoding a nlm_lock off the wire. This causes the client side grant
callback to miss matching blocks and reject the lock, only to rerequest
it 30s later.
Add a helper function to set the file_lock range from the start and end
values that the protocol uses, and have the nlm_lock decoder call that to
set up the file_lock args properly.
Fixes: 6930bcbfb6 ("lockd: detect and reject lock arguments that overflow")
Reported-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Amir Goldstein <amir73il@gmail.com>
Cc: stable@vger.kernel.org #6.0
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Prior to commit 8786fde842 ("Convert NFS from readpages to
readahead"), nfs_readpages() used the old mm interface read_cache_pages()
which called task_io_account_read() for each NFS page read. After
this commit, nfs_readpages() is converted to nfs_readahead(), which
now uses the new mm interface readahead_page(). The new interface
requires callers to call task_io_account_read() themselves.
In addition, to nfs_readahead() task_io_account_read() should also
be called from nfs_read_folio().
Fixes: 8786fde842 ("Convert NFS from readpages to readahead")
Link: https://lore.kernel.org/linux-nfs/CAPt2mGNEYUk5u8V4abe=5MM5msZqmvzCVrtCP4Qw1n=gCHCnww@mail.gmail.com/
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Add them to the SoC .dtsi, so that not every board has to specify them.
Fixes: 1225396fef ("arm64: dts: imx93: add lpi2c nodes")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
If bus type is other than imx50_weim_devtype and have no child devices,
variable 'ret' in function weim_parse_dt() will not be initialized, but
will be used as branch condition and return value. Fix this by
initializing 'ret' with 0.
This was discovered with help of clang-analyzer, but the situation is
quite possible in real life.
Fixes: 52c47b6341 ("bus: imx-weim: improve error handling upon child probe-failure")
Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Cc: stable@vger.kernel.org
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
usb@2184000: 'pinctrl-0' is a dependency of 'pinctrl-names'
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Fixes: c100ea86e6 ("ARM: dts: add Netronix E60K02 board common file")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
usb@2184000: 'pinctrl-0' is a dependency of 'pinctrl-names'
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Fixes: 3bb3fd8565 ("ARM: dts: add Netronix E70K02 board common file")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
'macirq' is supposed to be listed first. Also only 'snps,clk-csr' is
listed in the bindings while 'clk_csr' is only supported for legacy
reasons. See commit 83936ea8d8 ("net: stmmac: add a parse for new
property 'snps,clk-csr'")
Fixes: 1f4263ea6a ("arm64: dts: imx93: add eqos support")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The 'axi' clock are the bus APB clock, the 'disp_axi' clock are the
pixel data AXI clock. The naming is confusing. Fix the clock order.
Fixes: 94e6197dad ("arm64: dts: imx8mp: Add LCDIF2 & LDB nodes")
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The WM8960 Linux driver expects the clock to be named "mclk". Otherwise
the clock will be ignored and not prepared/enabled by the driver.
Fixes: 40ba2eda0a ("arm64: dts: imx8mm-nitrogen-r2: add audio")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The deprecated property is named snps,reset-gpio, but this devicetree
used snps,reset-gpios instead which results in the reset not being used
and the following make dtbs_check error:
./arch/arm64/boot/dts/freescale/imx8dxl-evk.dtb: ethernet@5b050000: 'snps,reset-gpio' is a dependency of 'snps,reset-delays-us'
From schema: ./Documentation/devicetree/bindings/net/snps,dwmac.yaml
Use the preferred method of defining the reset gpio in the phy node
itself. Note that this drops the 10 us pre-delay, but prior this wasn't
used at all and a pre-delay doesn't make much sense in this context so
it should be fine.
Fixes: 8dd495d123 ("arm64: dts: freescale: add support for i.MX8DXL EVK board")
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Commit 732ea9db9d ("efi: libstub: Move screen_info handling to common
code") reorganized the earlycon handling so that all architectures pass
the screen_info data via a EFI config table instead of populating struct
screen_info directly, as the latter is only possible when the EFI stub
is baked into the kernel (and not into the decompressor).
However, this means that struct screen_info may not have been populated
yet by the time the earlycon probe takes place, and this results in a
non-functional early console.
So let's probe again right after parsing the config tables and
populating struct screen_info. Note that this means that earlycon output
starts a bit later than before, and so it may fail to capture issues
that occur while doing the early EFI initialization.
Fixes: 732ea9db9d ("efi: libstub: Move screen_info handling to common code")
Reported-by: Shawn Guo <shawn.guo@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Merge commit bf9bec4cb3 ("Merge branch 'bpf: Allow reads from uninit stack'")
from bpf-next to bpf tree to address verification issues in some programs
due to stack usage.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When tunneling aggregated USB3 (20 Gb/s) the bandwidth values that are
programmed to the ADP_USB3_CS_2 go higher than 4096 and that does not
fit anymore to the 12-bit field. Fix this by scaling the value using
the scale field accordingly.
Fixes: 3b1d8d577c ("thunderbolt: Implement USB3 bandwidth negotiation routines")
Cc: stable@vger.kernel.org
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Current Intel USB4 host routers have hardware limitation that the USB3
bandwidth cannot go higher than 16376 Mb/s. Work this around by adding a
new quirk that limits the bandwidth for the affected host routers.
Cc: stable@vger.kernel.org
Signed-off-by: Gil Fine <gil.fine@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
In order to apply quirks based on certain adapter types move call to
tb_check_quirks() happen after the adapters are initialized. This should
not affect the existing quirks.
Cc: stable@vger.kernel.org
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
According to USB4 retimer specification, the process of firmware update
sequence requires issuing a SET_INBOUND_SBTX port operation that later
shall be followed by UNSET_INBOUND_SBTX port operation. This last step
is not currently issued by the driver but it is necessary to make sure
the retimers are put back to passthrough mode even during enumeration.
If this step is missing the link may not come up properly after
soft-reboot for example.
For this reason issue UNSET_INBOUND_SBTX after SET_INBOUND_SBTX for
enumeration and also when the NVM upgrade is run.
Reported-by: Christian Schaubschläger <christian.schaubschlaeger@gmx.at>
Link: https://lore.kernel.org/linux-usb/b556f5ed-5ee8-9990-9910-afd60db93310@gmx.at/
Cc: stable@vger.kernel.org
Signed-off-by: Gil Fine <gil.fine@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Memory for the usb4->margining needs to be relased for the upstream port
of the router as well, even though the debugfs directory gets released
with the router device removal. Fix this.
Fixes: d0f1e0c2a6 ("thunderbolt: Add support for receiver lane margining")
Cc: stable@vger.kernel.org
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
In da9150_charger_probe, &charger->otg_work is bound with
da9150_charger_otg_work. da9150_charger_otg_ncb may be
called to start the work.
If we remove the module which will call da9150_charger_remove
to make cleanup, there may be a unfinished work. The possible
sequence is as follows:
Fix it by canceling the work before cleanup in the da9150_charger_remove
CPU0 CPUc1
|da9150_charger_otg_work
da9150_charger_remove |
power_supply_unregister |
device_unregister |
power_supply_dev_release|
kfree(psy) |
|
| power_supply_changed(charger->usb);
| //use
Fixes: c1a281e34d ("power: Add support for DA9150 Charger")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
The following build error can be seen:
progs/test_deny_namespace.c:22:19: error: call to undeclared function 'BIT_LL'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
__u64 cap_mask = BIT_LL(CAP_SYS_ADMIN);
The struct kernel_cap_struct no longer exists in the kernel as well.
Adjust bpf prog to fix both issues.
Fixes: f122a08b19 ("capability: just use a 'u64' instead of a 'u32[2]' array")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The commit 11e456cae9 ("selftests/bpf: Fix compilation errors: Assign a value to a constant")
fixed the issue cleanly in bpf-next.
This is an alternative fix in bpf tree to avoid merge conflict between bpf and bpf-next.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This reverts commit 6d0c4b11e743("libbpf: Poison strlcpy()").
It added the pragma poison directive to libbpf_internal.h to protect
against accidental usage of strlcpy but ended up breaking the build for
toolchains based on libcs which provide the strlcpy() declaration from
string.h (e.g. uClibc-ng). The include order which causes the issue is:
string.h,
from Iibbpf_common.h:12,
from libbpf.h:20,
from libbpf_internal.h:26,
from strset.c:9:
Fixes: 6d0c4b11e7 ("libbpf: Poison strlcpy()")
Signed-off-by: Jesus Sanchez-Palencia <jesussanp@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230309004836.2808610-1-jesussanp@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Geert reports that:
> On v6.2, "make ARCH=m68k defconfig" gives you
> CONFIG_RPCSEC_GSS_KRB5=m
> On v6.3, it became builtin, due to dropping the dependencies on
> the individual crypto modules.
>
> $ grep -E "CRYPTO_(MD5|DES|CBC|CTS|ECB|HMAC|SHA1|AES)" .config
> CONFIG_CRYPTO_AES=y
> CONFIG_CRYPTO_AES_TI=m
> CONFIG_CRYPTO_DES=m
> CONFIG_CRYPTO_CBC=m
> CONFIG_CRYPTO_CTS=m
> CONFIG_CRYPTO_ECB=m
> CONFIG_CRYPTO_HMAC=m
> CONFIG_CRYPTO_MD5=m
> CONFIG_CRYPTO_SHA1=m
This behavior is triggered by the "default y" in the definition of
RPCSEC_GSS.
The "default y" was added in 2010 by commit df486a2590 ("NFS: Fix
the selection of security flavours in Kconfig"). However,
svc_gss_principal was removed in 2012 by commit 03a4e1f6dd
("nfsd4: move principal name into svc_cred"), so the 2010 fix is
no longer necessary. We can safely change the NFS_V4 and NFSD_V4
dependencies back to RPCSEC_GSS_KRB5 to get the nicer v6.2
behavior back.
Selecting KRB5 symbolically represents the true requirement here:
that all spec-compliant NFSv4 implementations must have Kerberos
available to use.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: dfe9a12345 ("SUNRPC: Enable rpcsec_gss_krb5.ko to be built without CRYPTO_DES")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The PE/COFF header has a NX compat flag which informs the firmware that
the application does not rely on memory regions being mapped with both
executable and writable permissions at the same time.
This is typically used by the firmware to decide whether it can set the
NX attribute on all allocations it returns, but going forward, it may be
used to enforce a policy that only permits applications with the NX flag
set to be loaded to begin wiht in some configurations, e.g., when Secure
Boot is in effect.
Even though the arm64 version of the EFI stub may relocate the kernel
before executing it, it always did so after disabling the MMU, and so we
were always in line with what the NX compat flag conveys, we just never
bothered to set it.
So let's set the flag now.
Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
After relocating the executable image, use the EFI memory attributes
protocol to remap the code and data regions with the appropriate
permissions.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Now that the zboot loader will invoke the EFI memory attributes protocol
to remap the decompressed code and rodata as read-only/executable, we
can set the PE/COFF header flag that indicates to the firmware that the
application does not rely on writable memory being executable at the
same time.
Cc: <stable@vger.kernel.org> # v6.2+
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
In bq24190_probe, &bdi->input_current_limit_work is bound
with bq24190_input_current_limit_work. When external power
changed, it will call bq24190_charger_external_power_changed
to start the work.
If we remove the module which will call bq24190_remove to make
cleanup, there may be a unfinished work. The possible
sequence is as follows:
CPU0 CPUc1
|bq24190_input_current_limit_work
bq24190_remove |
power_supply_unregister |
device_unregister |
power_supply_dev_release|
kfree(psy) |
|
| power_supply_get_property_from_supplier
| //use
Fix it by finishing the work before cleanup in the bq24190_remove
Fixes: 9777467257 ("power_supply: Initialize changed_work before calling device_add")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Variable 'pirq', which may receive negative value
in platform_get_irq().
Used as an index in a function regmap_irq_get_virq().
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
This doesn't need to be printed every second as an error:
...
<3>[17438.628385] cros-usbpd-charger cros-usbpd-charger.3.auto: Port 1: default case!
<3>[17439.634176] cros-usbpd-charger cros-usbpd-charger.3.auto: Port 1: default case!
<3>[17440.640298] cros-usbpd-charger cros-usbpd-charger.3.auto: Port 1: default case!
...
Reduce priority from ERROR to DEBUG.
Signed-off-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
The tmp is defined as u32 type, which results in invalid processing of
tmp<0 in function rk817_read_or_set_full_charge_on_boot(). Therefore,
drop the comparison.
drivers/power/supply/rk817_charger.c:828 rk817_read_or_set_full_charge_on_boot() warn: unsigned 'tmp' is never less than zero.
drivers/power/supply/rk817_charger.c:788 rk817_read_or_set_full_charge_on_boot() warn: unsigned 'tmp' is never less than zero.
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3444
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Tested-by: Chris Morgan <macromorgan@hotmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
The second LPASS pin controller IO address is supposed to be the MCC
range which contains the slew rate registers. The Linux driver then
accesses slew rate register with hard-coded offset (0xa000). However
the DTS contained the address of slew rate register as the second IO
address, thus any reads were effectively pass the memory space and lead
to "Internal error: synchronous external aborts" when applying pin
configuration.
Fixes: 6de7f9c343 ("arm64: dts: qcom: sm8550: add GPR and LPASS pin controller")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230302154724.856062-1-krzysztof.kozlowski@linaro.org
VA dmics 0, 1, 2 micbias on X13s are connected to WCD MICBIAS1, WCD MICBIAS1
and WCD MICBIAS3 respectively. Reflect this in dt to get dmics working.
Also fix dmics to go via VA Macro instead of TX macro to fix device switching.
Fixes: 8c1ea87e80b4 ("arm64: dts: qcom: sc8280xp-x13s: Add soundcard support")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230302115741.7726-5-srinivas.kandagatla@linaro.org
When neither "no_read_workqueue" nor "no_write_workqueue" are enabled,
tasklet_trylock() in crypt_dec_pending() may still return false due to
an uninitialized state, and dm-crypt will unnecessarily do io completion
in io_queue workqueue instead of current context.
Fix this by adding an 'in_tasklet' flag to dm_crypt_io struct and
initialize it to false in crypt_io_init(). Set this flag to true in
kcryptd_queue_crypt() before calling tasklet_schedule(). If set
crypt_dec_pending() will punt io completion to a workqueue.
This also nicely avoids the tasklet_trylock/unlock hack when tasklets
aren't in use.
Fixes: 8e14f61015 ("dm crypt: do not call bio_endio() from the dm-crypt tasklet")
Cc: stable@vger.kernel.org
Reported-by: Hou Tao <houtao1@huawei.com>
Suggested-by: Ignat Korchagin <ignat@cloudflare.com>
Reviewed-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Some boards might use USB-A female connector for USB ports, however,
the port could be connected to a dual-mode USB controller, making it
also behaves as a peripheral device if male-to-male cable is connected.
In this case, the dts looks like this:
&usb0 {
status = "okay";
dr_mode = "otg";
usb-role-switch;
role-switch-default-mode = "host";
};
After boot, dwc2_ovr_init() sets GOTGCTL to GOTGCTL_AVALOVAL and call
dwc2_force_mode() with parameter host=false, which causes inconsistent
mode - The hardware is in peripheral mode while the kernel status is
in host mode.
What we can do now is to call dwc2_drd_role_sw_set() to switch to
device mode, and everything should work just fine now, even switching
back to none(default) mode afterwards.
Fixes: e14acb8769 ("usb: dwc2: drd: add role-switch-default-node support")
Cc: stable <stable@kernel.org>
Signed-off-by: Ziyang Huang <hzyitc@outlook.com>
Tested-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Reviewed-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/SG2PR01MB204837BF68EDB0E343D2A375C9A59@SG2PR01MB2048.apcprd01.prod.exchangelabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The kernel will dump in the below cases:
sysfs: cannot create duplicate filename
'/devices/virtual/usb_power_delivery/pd1/source-capabilities'
1. After soft reset has completed, an Explicit Contract negotiation occurs.
The sink device will receive source capabilitys again. This will cause
a duplicate source-capabilities file be created.
2. Power swap twice on a device that is initailly sink role.
This will unregister existing capabilities when above cases occurs.
Fixes: 8203d26905 ("usb: typec: tcpm: Register USB Power Delivery Capabilities")
cc: <stable@vger.kernel.org>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230215054951.238394-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the unbind callback for f_uac1 and f_uac2, a call to snd_card_free()
via g_audio_cleanup() will disconnect the card and then wait for all
resources to be released, which happens when the refcount falls to zero.
Since userspace can keep the refcount incremented by not closing the
relevant file descriptor, the call to unbind may block indefinitely.
This can cause a deadlock during reboot, as evidenced by the following
blocked task observed on my machine:
task:reboot state:D stack:0 pid:2827 ppid:569 flags:0x0000000c
Call trace:
__switch_to+0xc8/0x140
__schedule+0x2f0/0x7c0
schedule+0x60/0xd0
schedule_timeout+0x180/0x1d4
wait_for_completion+0x78/0x180
snd_card_free+0x90/0xa0
g_audio_cleanup+0x2c/0x64
afunc_unbind+0x28/0x60
...
kernel_restart+0x4c/0xac
__do_sys_reboot+0xcc/0x1ec
__arm64_sys_reboot+0x28/0x30
invoke_syscall+0x4c/0x110
...
The issue can also be observed by opening the card with arecord and
then stopping the process through the shell before unbinding:
# arecord -D hw:UAC2Gadget -f S32_LE -c 2 -r 48000 /dev/null
Recording WAVE '/dev/null' : Signed 32 bit Little Endian, Rate 48000 Hz, Stereo
^Z[1]+ Stopped arecord -D hw:UAC2Gadget -f S32_LE -c 2 -r 48000 /dev/null
# echo gadget.0 > /sys/bus/gadget/drivers/configfs-gadget/unbind
(observe that the unbind command never finishes)
Fix the problem by using snd_card_free_when_closed() instead, which will
still disconnect the card as desired, but defer the task of freeing the
resources to the core once userspace closes its file descriptor.
Fixes: 132fcb4608 ("usb: gadget: Add Audio Class 2.0 Driver")
Cc: stable@vger.kernel.org
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Ruslan Bilovol <ruslan.bilovol@gmail.com>
Reviewed-by: John Keeping <john@metanate.com>
Link: https://lore.kernel.org/r/20230302163648.3349669-1-alvin@pqrs.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Previously, there was a 100uS delay inserted after issuing an end transfer
command for specific controller revisions. This was due to the fact that
there was a GUCTL2 bit field which enabled synchronous completion of the
end transfer command once the CMDACT bit was cleared in the DEPCMD
register. Since this bit does not exist for all controller revisions and
the current implementation heavily relies on utizling the EndTransfer
command completion interrupt, add the delay back in for uses where the
interrupt on completion bit is not set, and increase the duration to 1ms
for the controller to complete the command.
An issue was seen where the USB request buffer was unmapped while the DWC3
controller was still accessing the TRB. However, it was confirmed that the
end transfer command was successfully submitted. (no end transfer timeout)
In situations, such as dwc3_gadget_soft_disconnect() and
__dwc3_gadget_ep_disable(), the dwc3_remove_request() is utilized, which
will issue the end transfer command, and follow up with
dwc3_gadget_giveback(). At least for the USB ep disable path, it is
required for any pending and started requests to be completed and returned
to the function driver in the same context of the disable call. Without
the GUCTL2 bit, it is not ensured that the end transfer is completed before
the buffers are unmapped.
Fixes: cf2f8b63f7 ("usb: dwc3: gadget: Remove END_TRANSFER delay")
Cc: stable <stable@kernel.org>
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20230306200557.29387-1-quic_wcheng@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 130a96d698 ("usb: typec: ucsi: acpi: Increase command
completion timeout value") increased the timeout from 5 seconds
to 60 seconds due to issues related to alternate mode discovery.
After the alternate mode discovery switch to polled mode
the timeout was reduced, but instead of being set back to
5 seconds it was reduced to 1 second.
This is causing problems when using a Lenovo ThinkPad X1 yoga gen7
connected over Type-C to a LG 27UL850-W (charging DP over Type-C).
When the monitor is already connected at boot the following error
is logged: "PPM init failed (-110)", /sys/class/typec is empty and
on unplugging the NULL pointer deref fixed earlier in this series
happens.
When the monitor is connected after boot the following error
is logged instead: "GET_CONNECTOR_STATUS failed (-110)".
Setting the timeout back to 5 seconds fixes both cases.
Fixes: e08065069f ("usb: typec: ucsi: acpi: Reduce the command completion timeout")
Cc: stable@vger.kernel.org
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230308154244.722337-4-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ucsi_init() which runs from a workqueue sets ucsi->connector and
on an error will clear it again.
ucsi->connector gets dereferenced by ucsi_resume(), this checks for
ucsi->connector being NULL in case ucsi_init() has not finished yet;
or in case ucsi_init() has failed.
ucsi_init() setting ucsi->connector and then clearing it again on
an error creates a race where the check in ucsi_resume() may pass,
only to have ucsi->connector free-ed underneath it when ucsi_init()
hits an error.
Fix this race by making ucsi_init() store the connector array in
a local variable and only assign it to ucsi->connector on success.
Fixes: bdc62f2bae ("usb: typec: ucsi: Simplified registration and I/O API")
Cc: stable@vger.kernel.org
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230308154244.722337-3-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When mailboxes are used as a transport it is possible to setup the SCMI
transport layer, depending on the underlying channels configuration, to use
one or two mailboxes, associated, respectively, to one or two, distinct,
shared memory areas: any other combination should be treated as invalid.
Add more strict checking of SCMI mailbox transport device node descriptors.
Fixes: 5c8a47a5a9 ("firmware: arm_scmi: Make scmi core independent of the transport type")
Cc: <stable@vger.kernel.org> # 4.19
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20230307162324.891866-1-cristian.marussi@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The previous commit mistakenly introduced sim_ctrl_default as pinctrl,
this is incorrect, the interface for sim card selection varies between
different devices and should not be placed in the dtsi.
This commit selects external SIM card slot for ufi001c as default.
uf896 selects the correct SIM card slot automatically, thus does not need
this pinctrl node.
Fixes: faf6943146 ("arm64: dts: qcom: msm8916-thwc: Add initial device trees")
Signed-off-by: Yang Xiwen <forbidden405@foxmail.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/tencent_7036BCA256055D05F8C49D86DF7F0E2D1A05@qq.com
For uniquely identifying the vadc channels, label property has to be used.
The initial commit adding vadc support assumed that the driver will use the
unit address along with the node name to identify the channels. But this
assumption is now broken by,
commit 701c875ade ("iio: adc: qcom-spmi-adc5: Fix the channel name") that
stripped unit address from channel names. This results in probe failure of
the vadc driver:
[ 8.380370] iio iio:device0: tried to double register : in_temp_pmic-die-temp_input
[ 8.380383] qcom-spmi-adc5 c440000.spmi:pmic@0:adc@3100: Failed to register sysfs interfaces
[ 8.380386] qcom-spmi-adc5: probe of c440000.spmi:pmic@0:adc@3100 failed with error -16
Hence, let's get rid of the assumption about drivers and rely on label
property to uniquely identify the channels.
The labels are derived from the schematics for each PMIC. For internal adc
channels such as die and xo, the PMIC names are used as a prefix.
Fixes: 7c01513474 ("arm64: dts: qcom: sc8280xp-x13s: Add PM8280_{1/2} ADC_TM5 channels")
Fixes: 9d41cd1739 ("arm64: dts: qcom: sc8280xp-x13s: Add PMR735A VADC channel")
Fixes: 3375151a71 ("arm64: dts: qcom: sc8280xp-x13s: Add PM8280_{1/2} VADC channels")
Fixes: 9a6b3042c5 ("arm64: dts: qcom: sc8280xp-x13s: Add PMK8280 VADC channels")
Reported-by: Steev Klimaszewski <steev@kali.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230211052415.14581-1-manivannan.sadhasivam@linaro.org
This is an already known issue that dm-thin volume cannot be used as
swap, otherwise a deadlock may happen when dm-thin internal memory
demand triggers swap I/O on the dm-thin volume itself.
But thanks to commit a666e5c05e ("dm: fix deadlock when swapping to
encrypted device"), the limit_swap_bios target flag can also be used
for dm-thin to avoid the recursive I/O when it is used as swap.
Fix is to simply set ti->limit_swap_bios to true in both pool_ctr()
and thin_ctr().
In my test, I create a dm-thin volume /dev/vg/swap and use it as swap
device. Then I run fio on another dm-thin volume /dev/vg/main and use
large --blocksize to trigger swap I/O onto /dev/vg/swap.
The following fio command line is used in my test,
fio --name recursive-swap-io --lockmem 1 --iodepth 128 \
--ioengine libaio --filename /dev/vg/main --rw randrw \
--blocksize 1M --numjobs 32 --time_based --runtime=12h
Without this fix, the whole system can be locked up within 15 seconds.
With this fix, there is no any deadlock or hung task observed after
2 hours of running fio.
Furthermore, if blocksize is changed from 1M to 128M, after around 30
seconds fio has no visible I/O, and the out-of-memory killer message
shows up in kernel message. After around 20 minutes all fio processes
are killed and the whole system is back to being alive.
This is exactly what is expected when recursive I/O happens on dm-thin
volume when it is used as swap.
Depends-on: a666e5c05e ("dm: fix deadlock when swapping to encrypted device")
Cc: stable@vger.kernel.org
Signed-off-by: Coly Li <colyli@suse.de>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
__copy_to_user_memcpy() and __clear_user_memset() had been calling
memcpy() and memset() respectively, leading to false-positive KASAN
reports when starting userspace:
[ 10.707901] Run /init as init process
[ 10.731892] process '/bin/busybox' started with executable stack
[ 10.745234] ==================================================================
[ 10.745796] BUG: KASAN: user-memory-access in __clear_user_memset+0x258/0x3ac
[ 10.747260] Write of size 2687 at addr 000de581 by task init/1
Use __memcpy() and __memset() instead to allow userspace access, which
is of course the intent of these functions.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Add QUIRK_NO_CLX to disable the CLx state for hardware which
doesn't supports it.
AMD Yellow Carp and Pink Sardine don't support CLx state,
hence disabling it using QUIRK_NO_CLX.
Cc: stable@vger.kernel.org
Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
[mw: added debug log when the quirk is run]
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
When SCMI raw coexistence mode is enabled make the core stack probe
successfully even when the initial base protocol exchanges with the
platform/server failed.
This behaviour enables the system to boot with a broken regular SCMI
stack but with a fully functional and accessible SCMI raw debugfs
interface that can be used to further debug the issue.
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20230223152330.2707260-1-cristian.marussi@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
With commit b203e6f1e8 ("arm64: dts: ls1028a: sl28: get MAC addresses
from VPD"), the network adapter now depends on the nvmem device to be
present, which isn't the case and thus breaks networking on this board.
Revert it.
Fixes: b203e6f1e8 ("arm64: dts: ls1028a: sl28: get MAC addresses from VPD")
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The correct clock order is "fspi_en" and "fspi". As they are identical
just reordering the names is sufficient.
Fixes: 6276d66984 ("arm64: dts: imx8dxl: add flexspi0 support")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Both Xavier (Tegra194) and Orin (Tegra234) support a 40-bit address map,
so bump the CBB ranges property to cover all of the 1 TiB address space.
This fixes an issue where some of the PCIe regions could not be remapped
because of they were outside the memory specified by the CBB's ranges
property.
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
There is a potential race condition in amdtee_open_session that may
lead to use-after-free. For instance, in amdtee_open_session() after
sess->sess_mask is set, and before setting:
sess->session_info[i] = session_info;
if amdtee_close_session() closes this same session, then 'sess' data
structure will be released, causing kernel panic when 'sess' is
accessed within amdtee_open_session().
The solution is to set the bit sess->sess_mask as the last step in
amdtee_open_session().
Fixes: 757cc3e9ff ("tee: add AMD-TEE driver")
Cc: stable@vger.kernel.org
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Acked-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Eduard Zingerman says:
====================
This patch-set modifies BPF verifier to accept programs that read from
uninitialized stack locations, but only if executed in privileged mode.
This provides significant verification performance gains: 30% to 70% less
processed states for big number of test programs.
The reason for performance gains comes from treating STACK_MISC and
STACK_INVALID as compatible, when cached state is compared to current state
in verifier.c:stacksafe().
The change should not affect safety, because any value read from STACK_MISC
location has full binary range (e.g. 0x00-0xff for byte-sized reads).
Details and measurements are provided in the description for the patch #1.
The change was suggested by Andrii Nakryiko, the initial patch was created
by Alexei Starovoitov. The discussion could be found at [1].
Changes v1 -> v2 (v1 available at [2]):
- Calls to helper functions now convert STACK_INVALID to STACK_MISC
(suggested by Andrii);
- The test case progs/test_global_func10.c is updated to expect new
error message. Before recent commit [3] exact content of error
messages was not verified for this test.
- Replaced incorrect '//'-style comments in test case asm blocks by
'/*...*/'-style comments in order to fix compilation issues;
- Changed the tag from "Suggested-By" to "Co-developed-by" for Alexei
on patch #1, please let me know if this is appropriate use of the tag.
[1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
[2] https://lore.kernel.org/bpf/20230216183606.2483834-1-eddyz87@gmail.com/
[3] 95ebb37617 ("selftests/bpf: Convert test_global_funcs test to test_loader framework")
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Three testcases to make sure that stack reads from uninitialized
locations are accepted by verifier when executed in privileged mode:
- read from a fixed offset;
- read from a variable offset;
- passing a pointer to stack to a helper converts
STACK_INVALID to STACK_MISC.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230219200427.606541-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This commits updates the following functions to allow reads from
uninitialized stack locations when env->allow_uninit_stack option is
enabled:
- check_stack_read_fixed_off()
- check_stack_range_initialized(), called from:
- check_stack_read_var_off()
- check_helper_mem_access()
Such change allows to relax logic in stacksafe() to treat STACK_MISC
and STACK_INVALID in a same way and make the following stack slot
configurations equivalent:
| Cached state | Current state |
| stack slot | stack slot |
|------------------+------------------|
| STACK_INVALID or | STACK_INVALID or |
| STACK_MISC | STACK_SPILL or |
| | STACK_MISC or |
| | STACK_ZERO or |
| | STACK_DYNPTR |
This leads to significant verification speed gains (see below).
The idea was suggested by Andrii Nakryiko [1] and initial patch was
created by Alexei Starovoitov [2].
Currently the env->allow_uninit_stack is allowed for programs loaded
by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities.
A number of test cases from verifier/*.c were expecting uninitialized
stack access to be an error. These test cases were updated to execute
in unprivileged mode (thus preserving the tests).
The test progs/test_global_func10.c expected "invalid indirect read
from stack" error message because of the access to uninitialized
memory region. This error is no longer possible in privileged mode.
The test is updated to provoke an error "invalid indirect access to
stack" because of access to invalid stack address (such error is not
verified by progs/test_global_func*.c series of tests).
The following tests had to be removed because these can't be made
unprivileged:
- verifier/sock.c:
- "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
stack_value"
BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
- verifier/var_off.c:
- "indirect variable-offset stack access, max_off+size > max_initialized"
- "indirect variable-offset stack access, uninitialized"
These tests verify that access to uninitialized stack values is
detected when stack offset is not a constant. However, variable
stack access is prohibited in unprivileged mode, thus these tests
are no longer valid.
* * *
Here is veristat log comparing this patch with current master on a
set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
and cilium BPF binaries (see [3]):
$ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
File Program States (A) States (B) States (DIFF)
-------------------------- -------------------------- ---------- ---------- ----------------
bpf_host.o tail_handle_ipv6_from_host 349 244 -105 (-30.09%)
bpf_host.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
bpf_lxc.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
bpf_sock.o cil_sock4_connect 70 48 -22 (-31.43%)
bpf_sock.o cil_sock4_sendmsg 68 46 -22 (-32.35%)
bpf_xdp.o tail_handle_nat_fwd_ipv4 1554 803 -751 (-48.33%)
bpf_xdp.o tail_lb_ipv4 6457 2473 -3984 (-61.70%)
bpf_xdp.o tail_lb_ipv6 7249 3908 -3341 (-46.09%)
pyperf600_bpf_loop.bpf.o on_event 287 145 -142 (-49.48%)
strobemeta.bpf.o on_event 15915 4772 -11143 (-70.02%)
strobemeta_nounroll2.bpf.o on_event 17087 3820 -13267 (-77.64%)
xdp_synproxy_kern.bpf.o syncookie_tc 21271 6635 -14636 (-68.81%)
xdp_synproxy_kern.bpf.o syncookie_xdp 23122 6024 -17098 (-73.95%)
-------------------------- -------------------------- ---------- ---------- ----------------
Note: I limited selection by states_pct<-30%.
Inspection of differences in pyperf600_bpf_loop behavior shows that
the following patch for the test removes almost all differences:
- a/tools/testing/selftests/bpf/progs/pyperf.h
+ b/tools/testing/selftests/bpf/progs/pyperf.h
@ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args *ctx)
}
if (event->pthread_match || !pidData->use_tls) {
- void* frame_ptr;
- FrameData frame;
+ void* frame_ptr = 0;
+ FrameData frame = {};
Symbol sym = {};
int cur_cpu = bpf_get_smp_processor_id();
W/o this patch the difference comes from the following pattern
(for different variables):
static bool get_frame_data(... FrameData *frame ...)
{
...
bpf_probe_read_user(&frame->f_code, ...);
if (!frame->f_code)
return false;
...
bpf_probe_read_user(&frame->co_name, ...);
if (frame->co_name)
...;
}
int __on_event(struct bpf_raw_tracepoint_args *ctx)
{
FrameData frame;
...
get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
...
}
SEC("raw_tracepoint/kfree_skb")
int on_event(struct bpf_raw_tracepoint_args* ctx)
{
...
ret |= __on_event(ctx);
ret |= __on_event(ctx);
...
}
With regards to value `frame->co_name` the following is important:
- Because of the conditional `if (!frame->f_code)` each call to
__on_event() produces two states, one with `frame->co_name` marked
as STACK_MISC, another with it as is (and marked STACK_INVALID on a
first call).
- The call to bpf_probe_read_user() does not mark stack slots
corresponding to `&frame->co_name` as REG_LIVE_WRITTEN but it marks
these slots as BPF_MISC, this happens because of the following loop
in the check_helper_call():
for (i = 0; i < meta.access_size; i++) {
err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
BPF_WRITE, -1, false);
if (err)
return err;
}
Note the size of the write, it is a one byte write for each byte
touched by a helper. The BPF_B write does not lead to write marks
for the target stack slot.
- Which means that w/o this patch when second __on_event() call is
verified `if (frame->co_name)` will propagate read marks first to a
stack slot with STACK_MISC marks and second to a stack slot with
STACK_INVALID marks and these states would be considered different.
[1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
[2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
[3] git@github.com:anakryiko/cilium.git
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Co-developed-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230219200427.606541-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-22 12:34:50 -08:00
366 changed files with 4215 additions and 1576 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.