Pull section attribute fix from Miguel Ojeda:
"Fix Oops in Clang-compiled kernels (Nick Desaulniers)"
* tag 'compiler-attributes-for-linus-v5.3-rc8' of git://github.com/ojeda/linux:
include/linux/compiler.h: fix Oops for Clang-compiled kernels
Pull GPIO fixes from Linus Walleij:
"All related to the PCA953x driver when handling chips with more than 8
ports, now that works again"
* tag 'gpio-v5.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: pca953x: use pca953x_read_regs instead of regmap_bulk_read
gpio: pca953x: correct type of reg_direction
This reverts commit 558682b529.
Chris Wilson reports that it breaks his CPU hotplug test scripts. In
particular, it breaks offlining and then re-onlining the boot CPU, which
we treat specially (and the BIOS does too).
The symptoms are that we can offline the CPU, but it then does not come
back online again:
smpboot: CPU 0 is now offline
smpboot: Booting Node 0 Processor 0 APIC 0x0
smpboot: do_boot_cpu failed(-1) to wakeup CPU#0
Thomas says he knows why it's broken (my personal suspicion: our magic
handling of the "cpu0_logical_apicid" thing), but for 5.3 the right fix
is to just revert it, since we've never touched the LDR bits before, and
it's not worth the risk to do anything else at this stage.
[ Hotpluging of the boot CPU is special anyway, and should be off by
default. See the "BOOTPARAM_HOTPLUG_CPU0" config option and the
cpu0_hotplug kernel parameter.
In general you should not do it, and it has various known limitations
(hibernate and suspend require the boot CPU, for example).
But it should work, even if the boot CPU is special and needs careful
treatment - Linus ]
Link: https://lore.kernel.org/lkml/156785100521.13300.14461504732265570003@skylake-alporthouse-com/
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bandan Das <bsd@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull Documentation updates from Greg KH:
"A few small patches for the documenation file that came in through the
char-misc tree in -rc7 for your tree.
They fix the mistake in the .rst format that kept the table of
companies from showing up in the html output, and most importantly,
add people's names to the list showing support for our process"
* tag 'char-misc-5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Documentation/process: Add Qualcomm process ambassador for hardware security issues
Documentation/process/embargoed-hardware-issues: Microsoft ambassador
Documentation/process: Add Google contact for embargoed hardware issues
Documentation/process: Volunteer as the ambassador for Xen
Pull dmaengine fixes from Vinod Koul:
"Some late fixes for drivers:
- memory leak in ti crossbar dma driver
- cleanup of omap dma probe
- Fix for link list configuration in sprd dma driver
- Handling fixed for DMACHCLR if iommu is mapped in rcar dma"
* tag 'dmaengine-fix-5.3' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: rcar-dmac: Fix DMACHCLR handling if iommu is mapped
dmaengine: sprd: Fix the DMA link-list configuration
dmaengine: ti: omap-dma: Add cleanup in omap_dma_probe()
dmaengine: ti: dma-crossbar: Fix a memory leak bug
Pull SCSI fix from James Bottomley:
"Just a single lpfc fix adjusting the number of available queues for
high CPU count systems"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: lpfc: Raise config max for lpfc_fcp_mq_threshold variable
Pull libnvdimm fix from Dan Williams:
"Restore support for 1GB alignment namespaces, truncate the end of
misaligned namespaces"
* tag 'libnvdimm-fix-5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm/pfn: Fix namespace creation on misaligned addresses
Pull input fix from Dmitry Torokhov:
"A tiny update from Benjamin removing a mistakenly added Elan PNP ID so
that the device is again handled by hid-multitouch"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - remove Lenovo Legion Y7000 PnpID
Looks like the Bios of the Lenovo Legion Y7000 is using ELAN061B
when the actual device is supposed to be used with hid-multitouch.
Remove it from the list of the supported device, hoping that
no one will complain about the loss in functionality.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203467
Fixes: 738c06d0e4 ("Input: elan_i2c - add hardware ID for multiple Lenovo laptops")
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Pull ARM SoC fixes from Arnd Bergmann:
"There are three more fixes for this week:
- The Windows-on-ARM laptops require a workaround to prevent crashing
at boot from ACPI
- The Renesas 'draak' board needs one bugfix for the backlight
regulator
- Also for Renesas, the 'hihope' board accidentally had its eMMC
turned off in the 5.3 merge window"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
soc: qcom: geni: Provide parameter error checking
arm64: dts: renesas: hihope-common: Fix eMMC status
arm64: dts: renesas: r8a77995: draak: Fix backlight regulator name
Pull configfs fixes from Christoph Hellwig:
"Late configfs fixes from Al that fix pretty nasty removal vs attribute
access races"
* tag 'configfs-for-5.3' of git://git.infradead.org/users/hch/configfs:
configfs: provide exclusion between IO and removals
configfs: new object reprsenting tree fragments
configfs_register_group() shouldn't be (and isn't) called in rmdirable parts
configfs: stash the data we need into configfs_buffer at open time
Pull IOMMU fixes from Joerg Roedel:
- Revert an Intel VT-d patch that caused problems for some users.
- Removal of a feature in the Intel VT-d driver that was never
supported in hardware. This qualifies as a fix because the code for
this feature sets reserved bits in the invalidation queue descriptor,
causing failed invalidations on real hardware.
- Two fixes for AMD IOMMU driver to fix a race condition and to add a
missing IOTLB flush when kernel is booted in kdump mode.
* tag 'iommu-fixes-v5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Fix race in increase_address_space()
iommu/amd: Flush old domains in kdump kernel
iommu/vt-d: Remove global page flush support
Revert "iommu/vt-d: Avoid duplicated pci dma alias consideration"
Pull MMC fix from Ulf Hansson:
"Revert in order to fix card init for some eMMCs that need retries for
CMD6"
* tag 'mmc-v5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
Revert "mmc: core: do not retry CMD6 in __mmc_switch()"
Pull drm fixes from Dave Airlie:
"Live from my friend's couch in Barcelona, latest round of drm fixes.
The command line parser regression fixes look a bit larger because
they come with selftests included for the bugs they fix. Otherwise a
single nouveau, single ingenic and single vmwgfx fix:
nouveau:
- add missing MODULE_FIRMWARE definitions
igenic:
- hardcode panel type DPI
vmwgfx:
- double free fix
core:
- command line mode parser fixes"
* tag 'drm-fixes-2019-09-06' of git://anongit.freedesktop.org/drm/drm:
drm/vmwgfx: Fix double free in vmw_recv_msg()
drm/nouveau/sec2/gp102: add missing MODULE_FIRMWAREs
drm/selftests: modes: Add more unit tests for the cmdline parser
drm/modes: Introduce a whitelist for the named modes
drm/modes: Fix the command line parser to take force options into account
drm/modes: Add a switch to differentiate free standing options
drm/ingenic: Hardcode panel type to DPI
Pull virtio fixes from Michael Tsirkin:
"virtio, vhost, and balloon bugfixes.
A couple of last minute bugfixes. And a revert of a failed attempt at
metadata access optimization - we'll try again in the next cycle"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
mm/balloon_compaction: suppress allocation warnings
Revert "vhost: access vq metadata through kernel virtual address"
vhost: Remove unnecessary variable
virtio-net: lower min ring num_free for efficiency
vhost/test: fix build for vhost test
vhost/test: fix build for vhost test
Pull powerpc fixes from Michael Ellerman:
"One fix for a boot hang on some Freescale machines when PREEMPT is
enabled.
Two CVE fixes for bugs in our handling of FP registers and
transactional memory, both of which can result in corrupted FP state,
or FP state leaking between processes.
Thanks to: Chris Packham, Christophe Leroy, Gustavo Romero, Michael
Neuling"
* tag 'powerpc-5.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts
powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction
powerpc/64e: Drop stale call to smp_processor_id() which hangs SMP startup
After the conversion to lock-less dma-api call the
increase_address_space() function can be called without any
locking. Multiple CPUs could potentially race for increasing
the address space, leading to invalid domain->mode settings
and invalid page-tables. This has been happening in the wild
under high IO load and memory pressure.
Fix the race by locking this operation. The function is
called infrequently so that this does not introduce
a performance regression in the dma-api path again.
Reported-by: Qian Cai <cai@lca.pw>
Fixes: 256e4621c2 ('iommu/amd: Make use of the generic IOVA allocator')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When devices are attached to the amd_iommu in a kdump kernel, the old device
table entries (DTEs), which were copied from the crashed kernel, will be
overwritten with a new domain number. When the new DTE is written, the IOMMU
is told to flush the DTE from its internal cache--but it is not told to flush
the translation cache entries for the old domain number.
Without this patch, AMD systems using the tg3 network driver fail when kdump
tries to save the vmcore to a network system, showing network timeouts and
(sometimes) IOMMU errors in the kernel log.
This patch will flush IOMMU translation cache entries for the old domain when
a DTE gets overwritten with a new domain number.
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Fixes: 3ac3e5ee5e ('iommu/amd: Copy old trans table from old kernel')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If a request_key authentication token key gets revoked, there's a window in
which request_key_auth_describe() can see it with a NULL payload - but it
makes no check for this and something like the following oops may occur:
BUG: Kernel NULL pointer dereference at 0x00000038
Faulting instruction address: 0xc0000000004ddf30
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [...] request_key_auth_describe+0x90/0xd0
LR [...] request_key_auth_describe+0x54/0xd0
Call Trace:
[...] request_key_auth_describe+0x54/0xd0 (unreliable)
[...] proc_keys_show+0x308/0x4c0
[...] seq_read+0x3d0/0x540
[...] proc_reg_read+0x90/0x110
[...] __vfs_read+0x3c/0x70
[...] vfs_read+0xb4/0x1b0
[...] ksys_read+0x7c/0x130
[...] system_call+0x5c/0x70
Fix this by checking for a NULL pointer when describing such a key.
Also make the read routine check for a NULL pointer to be on the safe side.
[DH: Modified to not take already-held rcu lock and modified to also check
in the read routine]
Fixes: 04c567d931 ("[PATCH] Keys: Fix race between two instantiators of a key")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull sound fixes from Takashi Iwai:
"A collection of small HD-audio fixes:
- A regression fix for Realtek codecs due to the recent
initialization procedure change
- A fix for potential endless loop at the quirk table lookup
- Quirks for Lenovo, ASUS and HP machines"
* tag 'sound-5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - Fix the problem of two front mics on a ThinkCentre
ALSA: hda/realtek - Enable internal speaker & headset mic of ASUS UX431FL
ALSA: hda/realtek - Add quirk for HP Pavilion 15
ALSA: hda/realtek - Fix overridden device-specific initialization
ALSA: hda - Fix potential endless loop at applying quirks
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- EFI boot fix for signed kernels
- an AC flags fix related to UBSAN
- Hyper-V infinite loop fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hyper-v: Fix overflow bug in fill_gva_list()
x86/uaccess: Don't leak the AC flags into __get_user() argument evaluation
x86/boot: Preserve boot_params.secure_boot from sanitizing
Pull scheduler fixes from Ingo Molnar:
"This fixes an ABI bug introduced this cycle, plus fixes a throttling
bug"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code
sched/fair: Don't assign runtime for throttled cfs_rq
Pull clang-format update from Miguel Ojeda:
"Update with the latest for_each macro list"
* tag 'clang-format-for-linus-v5.3-rc8' of git://github.com/ojeda/linux:
clang-format: Update with the latest for_each macro list
Second Round of Renesas ARM Based SoC Fixes for v5.3
* RZ/G2M based HiHope main board
- Re-enabled accidently disabled SDHI3 (eMMC) support
* tag 'renesas-fixes2-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: dts: renesas: hihope-common: Fix eMMC status
Link: https://lore.kernel.org/r/cover.1567675986.git.horms+renesas@verge.net.au
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
We recently added a kfree() after the end of the loop:
if (retries == RETRIES) {
kfree(reply);
return -EINVAL;
}
There are two problems. First the test is wrong and because retries
equals RETRIES if we succeed on the last iteration through the loop.
Second if we fail on the last iteration through the loop then the kfree
is a double free.
When you're reading this code, please note the break statement at the
end of the while loop. This patch changes the loop so that if it's not
successful then "reply" is NULL and we can test for that afterward.
Cc: <stable@vger.kernel.org>
Fixes: 6b7c3b86f0 ("drm/vmwgfx: fix memory leak when too many retries have occurred")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Make sure that attribute methods are not called after the item
has been removed from the tree. To do so, we
* at the point of no return in removals, grab ->frag_sem
exclusive and mark the fragment dead.
* call the methods of attributes with ->frag_sem taken
shared and only after having verified that the fragment is still
alive.
The main benefit is for method instances - they are
guaranteed that the objects they are accessing *and* all ancestors
are still there. Another win is that we don't need to bother
with extra refcount on config_item when opening a file -
the item will be alive for as long as it stays in the tree, and
we won't touch it/attributes/any associated data after it's
been removed from the tree.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Thadeu Lima de Souza Cascardo reported that 'chrt' broke on recent kernels:
$ chrt -p $$
chrt: failed to get pid 26306's policy: Argument list too long
and he has root-caused the bug to the following commit increasing sched_attr
size and breaking sched_read_attr() into returning -EFBIG:
a509a7cd79 ("sched/uclamp: Extend sched_setattr() to support utilization clamping")
The other, bigger bug is that the whole sched_getattr() and sched_read_attr()
logic of checking non-zero bits in new ABI components is arguably broken,
and pretty much any extension of the ABI will spuriously break the ABI.
That's way too fragile.
Instead implement the perf syscall's extensible ABI instead, which we
already implement on the sched_setattr() side:
- if user-attributes have the same size as kernel attributes then the
logic is unchanged.
- if user-attributes are larger than the kernel knows about then simply
skip the extra bits, but set attr->size to the (smaller) kernel size
so that tooling can (in principle) handle older kernel as well.
- if user-attributes are smaller than the kernel knows about then just
copy whatever user-space can accept.
Also clean up the whole logic:
- Simplify the code flow - there's no need for 'ret' for example.
- Standardize on 'kattr/uattr' and 'ksize/usize' naming to make sure we
always know which side we are dealing with.
- Why is it called 'read' when what it does is to copy to user? This
code is so far away from VFS read() semantics that the naming is
actively confusing. Name it sched_attr_copy_to_user() instead, which
mirrors other copy_to_user() functionality.
- Move the attr->size assignment from the head of sched_getattr() to the
sched_attr_copy_to_user() function. Nothing else within the kernel
should care about the size of the structure.
With these fixes the sched_getattr() syscall now nicely supports an
extensible ABI in both a forward and backward compatible fashion, and
will also fix the chrt bug.
As an added bonus the bogus -EFBIG return is removed as well, which as
Thadeu noted should have been -E2BIG to begin with.
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a509a7cd79 ("sched/uclamp: Extend sched_setattr() to support utilization clamping")
Link: https://lkml.kernel.org/r/20190904075532.GA26751@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Renesas ARM Based SoC Fixes for v5.3
* R-Car D3 (r8a77995) based Draak Board
- Correct backlight regulator name in device tree
* tag 'renesas-fixes-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: dts: renesas: r8a77995: draak: Fix backlight regulator name
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
When in userspace and MSR FP=0 the hardware FP state is unrelated to
the current process. This is extended for transactions where if tbegin
is run with FP=0, the hardware checkpoint FP state will also be
unrelated to the current process. Due to this, we need to ensure this
hardware checkpoint is updated with the correct state before we enable
FP for this process.
Unfortunately we get this wrong when returning to a process from a
hardware interrupt. A process that starts a transaction with FP=0 can
take an interrupt. When the kernel returns back to that process, we
change to FP=1 but with hardware checkpoint FP state not updated. If
this transaction is then rolled back, the FP registers now contain the
wrong state.
The process looks like this:
Userspace: Kernel
Start userspace
with MSR FP=0 TM=1
< -----
...
tbegin
bne
Hardware interrupt
---- >
<do_IRQ...>
....
ret_from_except
restore_math()
/* sees FP=0 */
restore_fp()
tm_active_with_fp()
/* sees FP=1 (Incorrect) */
load_fp_state()
FP = 0 -> 1
< -----
Return to userspace
with MSR TM=1 FP=1
with junk in the FP TM checkpoint
TM rollback
reads FP junk
When returning from the hardware exception, tm_active_with_fp() is
incorrectly making restore_fp() call load_fp_state() which is setting
FP=1.
The fix is to remove tm_active_with_fp().
tm_active_with_fp() is attempting to handle the case where FP state
has been changed inside a transaction. In this case the checkpointed
and transactional FP state is different and hence we must restore the
FP state (ie. we can't do lazy FP restore inside a transaction that's
used FP). It's safe to remove tm_active_with_fp() as this case is
handled by restore_tm_state(). restore_tm_state() detects if FP has
been using inside a transaction and will set load_fp and call
restore_math() to ensure the FP state (checkpoint and transaction) is
restored.
This is a data integrity problem for the current process as the FP
registers are corrupted. It's also a security problem as the FP
registers from one process may be leaked to another.
Similarly for VMX.
A simple testcase to replicate this will be posted to
tools/testing/selftests/powerpc/tm/tm-poison.c
This fixes CVE-2019-15031.
Fixes: a7771176b4 ("powerpc: Don't enable FP/Altivec if not checkpointed")
Cc: stable@vger.kernel.org # 4.15+
Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190904045529.23002-2-gromero@linux.vnet.ibm.com
When we take an FP unavailable exception in a transaction we have to
account for the hardware FP TM checkpointed registers being
incorrect. In this case for this process we know the current and
checkpointed FP registers must be the same (since FP wasn't used
inside the transaction) hence in the thread_struct we copy the current
FP registers to the checkpointed ones.
This copy is done in tm_reclaim_thread(). We use thread->ckpt_regs.msr
to determine if FP was on when in userspace. thread->ckpt_regs.msr
represents the state of the MSR when exiting userspace. This is setup
by check_if_tm_restore_required().
Unfortunatley there is an optimisation in giveup_all() which returns
early if tsk->thread.regs->msr (via local variable `usermsr`) has
FP=VEC=VSX=SPE=0. This optimisation means that
check_if_tm_restore_required() is not called and hence
thread->ckpt_regs.msr is not updated and will contain an old value.
This can happen if due to load_fp=255 we start a userspace process
with MSR FP=1 and then we are context switched out. In this case
thread->ckpt_regs.msr will contain FP=1. If that same process is then
context switched in and load_fp overflows, MSR will have FP=0. If that
process now enters a transaction and does an FP instruction, the FP
unavailable will not update thread->ckpt_regs.msr (the bug) and MSR
FP=1 will be retained in thread->ckpt_regs.msr. tm_reclaim_thread()
will then not perform the required memcpy and the checkpointed FP regs
in the thread struct will contain the wrong values.
The code path for this happening is:
Userspace: Kernel
Start userspace
with MSR FP/VEC/VSX/SPE=0 TM=1
< -----
...
tbegin
bne
fp instruction
FP unavailable
---- >
fp_unavailable_tm()
tm_reclaim_current()
tm_reclaim_thread()
giveup_all()
return early since FP/VMX/VSX=0
/* ckpt MSR not updated (Incorrect) */
tm_reclaim()
/* thread_struct ckpt FP regs contain junk (OK) */
/* Sees ckpt MSR FP=1 (Incorrect) */
no memcpy() performed
/* thread_struct ckpt FP regs not fixed (Incorrect) */
tm_recheckpoint()
/* Put junk in hardware checkpoint FP regs */
....
< -----
Return to userspace
with MSR TM=1 FP=1
with junk in the FP TM checkpoint
TM rollback
reads FP junk
This is a data integrity problem for the current process as the FP
registers are corrupted. It's also a security problem as the FP
registers from one process may be leaked to another.
This patch moves up check_if_tm_restore_required() in giveup_all() to
ensure thread->ckpt_regs.msr is updated correctly.
A simple testcase to replicate this will be posted to
tools/testing/selftests/powerpc/tm/tm-poison.c
Similarly for VMX.
This fixes CVE-2019-15030.
Fixes: f48e91e87e ("powerpc/tm: Fix FP and VMX register corruption")
Cc: stable@vger.kernel.org # 4.12+
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190904045529.23002-1-gromero@linux.vnet.ibm.com
There is no reason to print warnings when balloon page allocation fails,
as they are expected and can be handled gracefully. Since VMware
balloon now uses balloon-compaction infrastructure, and suppressed these
warnings before, it is also beneficial to suppress these warnings to
keep the same behavior that the balloon had before.
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
This reverts commit 7f466032dc ("vhost: access vq metadata through
kernel virtual address"). The commit caused a bunch of issues, and
while commit 73f628ec9e ("vhost: disable metadata prefetch
optimization") disabled the optimization it's not nice to keep lots of
dead code around.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It is unnecessary to use ret variable to return the error
code, just return the error code directly.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This change lowers ring buffer reclaim threshold from 1/2*queue to budget
for better performance. According to our test with qemu + dpdk, packet
dropping happens when the guest is not able to provide free buffer in
avail ring timely with default 1/2*queue. The value in the patch has been
tested and does show better performance.
Test setup: iperf3 to generate packets to guest (total 30mins, pps 400k, UDP)
avg packets drop before: 2842
avg packets drop after: 360(-87.3%)
Further, current code suffers from a starvation problem: the amount of
work done by try_fill_recv is not bounded by the budget parameter, thus
(with large queues) once in a while userspace gets blocked for a long
time while queue is being refilled. Trigger refills earlier to make sure
the amount of work to do is limited.
Signed-off-by: jiangkidd <jiangkidd@hotmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since vhost_exceeds_weight() was introduced, callers need to specify
the packet weight and byte weight in vhost_dev_init(). Note that, the
packet weight isn't counted in this patch to keep the original behavior
unchanged.
Fixes: e82b9b0727 ("vhost: introduce vhost_exceeds_weight()")
Cc: stable@vger.kernel.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
This ThinkCentre machine has a new realtek codec alc222, it is not
in the support list, we add it in the realtek.c then this machine
can apply FIXUPs for the realtek codec.
And this machine has two front mics which can't be handled
by PA so far, it uses the pin 0x18 and 0x19 as the front mics, as
a result the existing FIXUP ALC294_FIXUP_LENOVO_MIC_LOCATION doesn't
work on this machine. Fortunately another FIXUP
ALC283_FIXUP_HEADSET_MIC also can change the location for one of the
two mics on this machine.
Link: https://lore.kernel.org/r/20190904055327.9883-1-hui.wang@canonical.com
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The commit 20c169aceb ("dmaengine: rcar-dmac: clear pertinence
number of channels") forgets to clear the last channel by
DMACHCLR in rcar_dmac_init() (and doesn't need to clear the first
channel) if iommu is mapped to the device. So, this patch fixes it
by using "channels_mask" bitfield.
Note that the hardware and driver don't support more than 32 bits
in DMACHCLR register anyway, so this patch should reject more than
32 channels in rcar_dmac_parse_of().
Fixes: 20c169aceb ("dmaengine: rcar-dmac: clear pertinence number of channels")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/1567424643-26629-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
For the Spreadtrum DMA link-list mode, when the DMA engine got a slave
hardware request, which will trigger the DMA engine to load the DMA
configuration from the link-list memory automatically. But before the
slave hardware request, the slave will get an incorrect residue due
to the first node used to trigger the link-list was configured as the
last source address and destination address.
Thus we should make sure the first node was configured the start source
address and destination address, which can fix this issue.
Fixes: 4ac6954647 ("dmaengine: sprd: Support DMA link-list mode")
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Link: https://lore.kernel.org/r/77868edb7aff9d5cb12ac3af8827ef2e244441a6.1567150471.git.baolin.wang@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Turns out the commit 3a0681c744 ("mmc: core: do not retry CMD6 in
__mmc_switch()") breaks initialization of a Toshiba THGBMNG5 eMMC card,
when using the meson-gx-mmc.c driver on a custom board based on Amlogic
A113D.
The CMD6 that switches the card into HS200 mode is then one that fails and
according to the below printed messages from the log:
[ 1.648951] mmc0: mmc_select_hs200 failed, error -84
[ 1.648988] mmc0: error -84 whilst initialising MMC card
After some analyze, it turns out that adding a delay of ~5ms inside
mmc_select_bus_width() but after mmc_compare_ext_csds() has been executed,
also fixes the problem. Adding yet some more debug code, trying to figure
out if potentially the card could be in a busy state, both by using CMD13
and ->card_busy() ops concluded that this was not the case.
Therefore, let's simply revert the commit that dropped support for retrying
of CMD6, as this also fixes the problem.
Fixes: 3a0681c744 ("mmc: core: do not retry CMD6 in __mmc_switch()")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kaisrlik <ja.kaisrlik@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Global pages support is removed from VT-d spec 3.0. Since global pages G
flag only affects first-level paging structures and because DMA request
with PASID are only supported by VT-d spec. 3.0 and onward, we can
safely remove global pages support.
For kernel shared virtual address IOTLB invalidation, PASID
granularity and page selective within PASID will be used. There is
no global granularity supported. Without this fix, IOTLB invalidation
will cause invalid descriptor error in the queued invalidation (QI)
interface.
Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Reported-by: Sanjay K Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
do_sched_cfs_period_timer() will refill cfs_b runtime and call
distribute_cfs_runtime to unthrottle cfs_rq, sometimes cfs_b->runtime
will allocate all quota to one cfs_rq incorrectly, then other cfs_rqs
attached to this cfs_b can't get runtime and will be throttled.
We find that one throttled cfs_rq has non-negative
cfs_rq->runtime_remaining and cause an unexpetced cast from s64 to u64
in snippet:
distribute_cfs_runtime() {
runtime = -cfs_rq->runtime_remaining + 1;
}
The runtime here will change to a large number and consume all
cfs_b->runtime in this cfs_b period.
According to Ben Segall, the throttled cfs_rq can have
account_cfs_rq_runtime called on it because it is throttled before
idle_balance, and the idle_balance calls update_rq_clock to add time
that is accounted to the task.
This commit prevents cfs_rq to be assgined new runtime if it has been
throttled until that distribute_cfs_runtime is called.
Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Ben Segall <bsegall@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: shanpeic@linux.alibaba.com
Cc: stable@vger.kernel.org
Cc: xlpang@linux.alibaba.com
Fixes: d3d9dc3302 ("sched: Throttle entities exceeding their allowed bandwidth")
Link: https://lkml.kernel.org/r/20190826121633.6538-1-liangyan.peng@linux.alibaba.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Original pin node values of ASUS UX431FL with ALC294:
0x12 0xb7a60140
0x13 0x40000000
0x14 0x90170110
0x15 0x411111f0
0x16 0x411111f0
0x17 0x90170111
0x18 0x411111f0
0x19 0x411111f0
0x1a 0x411111f0
0x1b 0x411111f0
0x1d 0x4066852d
0x1e 0x411111f0
0x1f 0x411111f0
0x21 0x04211020
1. Has duplicated internal speakers (0x14 & 0x17) which makes the output
route become confused. So, the output volume cannot be changed by
setting.
2. Misses the headset mic pin node.
This patch disables the confusing speaker (NID 0x14) and enables the
headset mic (NID 0x19).
Link: https://lore.kernel.org/r/20190902100054.6941-1-jian-hong@endlessm.com
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Refcounted, hangs of configfs_dirent, created by operations that add
fragments to configfs tree (mkdir and configfs_register_{subsystem,group}).
Will be used in the next commit to provide exclusion between fragment
removal and ->show/->store calls.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
revert cc57c07343 "configfs: fix registered group removal"
It was an attempt to handle something that fundamentally doesn't
work - configfs_register_group() should never be done in a part
of tree that can be rmdir'ed. And in mainline it never had been,
so let's not borrow trouble; the fix was racy anyway, it would take
a lot more to make that work and desired semantics is not clear.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
simplifies the ->read()/->write()/->release() instances nicely
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When the 'start' parameter is >= 0xFF000000 on 32-bit
systems, or >= 0xFFFFFFFF'FF000000 on 64-bit systems,
fill_gva_list() gets into an infinite loop.
With such inputs, 'cur' overflows after adding HV_TLB_FLUSH_UNIT
and always compares as less than end. Memory is filled with
guest virtual addresses until the system crashes.
Fix this by never incrementing 'cur' to be larger than 'end'.
Reported-by: Jong Hyun Park <park.jonghyun@yonsei.ac.kr>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 2ffd9e33ce ("x86/hyper-v: Use hypercall for remote TLB flush")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We want to throw out the attrbute if it refers to the mounted on fileid,
and not the real fileid. However we do not want to block cache consistency
updates from NFSv4 writes.
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Fixes: 7e10cc25bf ("NFS: Don't refresh attributes with mounted-on-file...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Pull char/misc driver fixes from Greg KH:
"Here are some small char and misc driver fixes for reported issues for
5.3-rc7
Also included in here is the documentation for how we are handling
hardware issues under embargo that everyone has finally agreed on, as
well as a MAINTAINERS update for the suckers who agreed to handle the
LICENSES/ files.
All of these have been in linux-next last week with no reported
issues"
* tag 'char-misc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
fsi: scom: Don't abort operations for minor errors
vmw_balloon: Fix offline page marking with compaction
VMCI: Release resource if the work is already queued
Documentation/process: Embargoed hardware security issues
lkdtm/bugs: fix build error in lkdtm_EXHAUST_STACK
mei: me: add Tiger Lake point LP device ID
intel_th: pci: Add Tiger Lake support
intel_th: pci: Add support for another Lewisburg PCH
stm class: Fix a double free of stm_source_device
MAINTAINERS: add entry for LICENSES and SPDX stuff
fpga: altera-ps-spi: Fix getting of optional confd gpio
Pull USB fixes from Greg KH:
"Here are some small USB fixes that have been in linux-next this past
week for 5.3-rc7
They fix the usual xhci, syzbot reports, and other small issues that
have come up last week.
All have been in linux-next with no reported issues"
* tag 'usb-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: cdc-wdm: fix race between write and disconnect due to flag abuse
usb: host: xhci: rcar: Fix typo in compatible string matching
usb: host: xhci-tegra: Set DMA mask correctly
USB: storage: ums-realtek: Whitelist auto-delink support
USB: storage: ums-realtek: Update module parameter description for auto_delink_en
usb: host: ohci: fix a race condition between shutdown and irq
usb: hcd: use managed device resources
typec: tcpm: fix a typo in the comparison of pdo_max_voltage
usb-storage: Add new JMS567 revision to unusual_devs
usb: chipidea: udc: don't do hardware access if gadget has stopped
usbtmc: more sanity checking for packet size
usb: udc: lpc32xx: silence fall-through warning
HP Pavilion 15 (AMD Ryzen-based model) with 103c:84e7 needs the same
quirk like HP Envy/Spectre x360 for enabling the mute LED over Mic3 pin.
[ rearranged in the SSID number order by tiwai ]
Signed-off-by: Sam Bazley <sambazley@fastmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit
a90118c445 ("x86/boot: Save fields explicitly, zero out everything else")
now zeroes the secure boot setting information (enabled/disabled/...)
passed by the boot loader or by the kernel's EFI handover mechanism.
The problem manifests itself with signed kernels using the EFI handoff
protocol with grub and the kernel loses the information whether secure
boot is enabled in the firmware, i.e., the log message "Secure boot
enabled" becomes "Secure boot could not be determined".
efi_main() arch/x86/boot/compressed/eboot.c sets this field early but it
is subsequently zeroed by the above referenced commit.
Include boot_params.secure_boot in the preserve field list.
[ bp: restructure commit message and massage. ]
Fixes: a90118c445 ("x86/boot: Save fields explicitly, zero out everything else")
Signed-off-by: John S. Gruber <JohnSGruber@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: stable <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/CAPotdmSPExAuQcy9iAHqX3js_fc4mMLQOTr5RBGvizyCOPcTQQ@mail.gmail.com
Pull networking fixes from David Miller:
1) Fix some length checks during OGM processing in batman-adv, from
Sven Eckelmann.
2) Fix regression that caused netfilter conntrack sysctls to not be
per-netns any more. From Florian Westphal.
3) Use after free in netpoll, from Feng Sun.
4) Guard destruction of pfifo_fast per-cpu qdisc stats with
qdisc_is_percpu_stats(), from Davide Caratti. Similar bug is fixed
in pfifo_fast_enqueue().
5) Fix memory leak in mld_del_delrec(), from Eric Dumazet.
6) Handle neigh events on internal ports correctly in nfp, from John
Hurley.
7) Clear SKB timestamp in NF flow table code so that it does not
confuse fq scheduler. From Florian Westphal.
8) taprio destroy can crash if it is invoked in a failure path of
taprio_init(), because the list head isn't setup properly yet and
the list del is unconditional. Perform the list add earlier to
address this. From Vladimir Oltean.
9) Make sure to reapply vlan filters on device up, in aquantia driver.
From Dmitry Bogdanov.
10) sgiseeq driver releases DMA memory using free_page() instead of
dma_free_attrs(). From Christophe JAILLET.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (58 commits)
net: seeq: Fix the function used to release some memory in an error handling path
enetc: Add missing call to 'pci_free_irq_vectors()' in probe and remove functions
net: bcmgenet: use ethtool_op_get_ts_info()
tc-testing: don't hardcode 'ip' in nsPlugin.py
net: dsa: microchip: add KSZ8563 compatibility string
dt-bindings: net: dsa: document additional Microchip KSZ8563 switch
net: aquantia: fix out of memory condition on rx side
net: aquantia: linkstate irq should be oneshot
net: aquantia: reapply vlan filters on up
net: aquantia: fix limit of vlan filters
net: aquantia: fix removal of vlan 0
net/sched: cbs: Set default link speed to 10 Mbps in cbs_set_port_rate
taprio: Set default link speed to 10 Mbps in taprio_set_picos_per_byte
taprio: Fix kernel panic in taprio_destroy
net: dsa: microchip: fill regmap_config name
rxrpc: Fix lack of conn cleanup when local endpoint is cleaned up [ver #2]
net: stmmac: dwmac-rk: Don't fail if phy regulator is absent
amd-xgbe: Fix error path in xgbe_mod_init()
netfilter: nft_meta_bridge: Fix get NFT_META_BRI_IIFVPROTO in network byteorder
mac80211: Correctly set noencrypt for PAE frames
...
In commit 99cd149efe ("sgiseeq: replace use of dma_cache_wback_inv"),
a call to 'get_zeroed_page()' has been turned into a call to
'dma_alloc_coherent()'. Only the remove function has been updated to turn
the corresponding 'free_page()' into 'dma_free_attrs()'.
The error hndling path of the probe function has not been updated.
Fix it now.
Rename the corresponding label to something more in line.
Fixes: 99cd149efe ("sgiseeq: replace use of dma_cache_wback_inv")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Fix the bogus detection of 32bit user mode for uretprobes which
caused corruption of the user return address resulting in
application crashes. In the uprobes handler in_ia32_syscall() is
obviously always returning false on a 64bit kernel. Use
user_64bit_mode() instead which works correctly.
- Prevent large page splitting when ftrace flips RW/RO on the kernel
text which caused iTLB performance issues. Ftrace wants to be
converted to text_poke() which avoids the problem, but for now
allow large page preservation in the static protections check when
the change request spawns a full large page.
- Prevent arch_dynirq_lower_bound() from returning 0 when the IOAPIC
is configured via device tree. In the device tree case the GSI 1:1
mapping is meaningless therefore the lower bound which protects the
GSI range on ACPI machines is irrelevant. Return the lower bound
which the core hands to the function instead of blindly returning 0
which causes the core to allocate the invalid virtual interupt
number 0 which in turn prevents all drivers from allocating and
requesting an interrupt.
- Remove the bogus initialization of LDR and DFR in the 32bit bigsmp
APIC driver. That uses physical destination mode where LDR/DFR are
ignored, but the initialization and the missing clear of LDR caused
the APIC to be left in a inconsistent state on kexec/reboot.
- Clear LDR when clearing the APIC registers so the APIC is in a well
defined state.
- Initialize variables proper in the find_trampoline_placement()
code.
- Silence GCC( build warning for the real mode part of the build"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text
x86/build: Add -Wnoaddress-of-packed-member to REALMODE_CFLAGS, to silence GCC9 build warning
x86/boot/compressed/64: Fix missing initialization in find_trampoline_placement()
x86/apic: Include the LDR when clearing out APIC registers
x86/apic: Do not initialize LDR and DFR for bigsmp
uprobes/x86: Fix detection of 32-bit user mode
x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines
Pull perf fixes from Thomas Gleixner:
"Two fixes for perf x86 hardware implementations:
- Restrict the period on Nehalem machines to prevent perf from
hogging the CPU
- Prevent the AMD IBS driver from overwriting the hardwre controlled
and pre-seeded reserved bits (0-6) in the count register which
caused a sample bias for dispatched micro-ops"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops
perf/x86/intel: Restrict period on Nehalem
Pull turbostat updates from Len Brown:
"User-space turbostat (and x86_energy_perf_policy) patches.
They are primarily bug fixes from users"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: update version number
tools/power turbostat: Add support for Hygon Fam 18h (Dhyana) RAPL
tools/power turbostat: Fix caller parameter of get_tdp_amd()
tools/power turbostat: Fix CPU%C1 display value
tools/power turbostat: do not enforce 1ms
tools/power turbostat: read from pipes too
tools/power turbostat: Add Ice Lake NNPI support
tools/power turbostat: rename has_hsw_msrs()
tools/power turbostat: Fix Haswell Core systems
tools/power turbostat: add Jacobsville support
tools/power turbostat: fix buffer overrun
tools/power turbostat: fix file descriptor leaks
tools/power turbostat: fix leak of file descriptor on error return path
tools/power turbostat: Make interval calculation per thread to reduce jitter
tools/power turbostat: remove duplicate pc10 column
tools/power x86_energy_perf_policy: Fix argument parsing
tools/power: Fix typo in man page
tools/power/x86: Enable compiler optimisations and Fortify by default
tools/power x86_energy_perf_policy: Fix "uninitialized variable" warnings at -O2
Call to 'pci_free_irq_vectors()' are missing both in the error handling
path of the probe function, and in the remove function.
Add them.
Fixes: 19971f5ea0 ("enetc: add PTP clock driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change enables the use of SW timestamping on the Raspberry Pi 4.
bcmgenet's transmit function bcmgenet_xmit() implements software
timestamping. However the SOF_TIMESTAMPING_TX_SOFTWARE capability was
missing and only SOF_TIMESTAMPING_RX_SOFTWARE was announced. By using
ethtool_ops bcmgenet_ethtool_ops() as get_ts_info(), the
SOF_TIMESTAMPING_TX_SOFTWARE capability is announced.
Similar to commit a8f5cb9e79 ("smsc95xx: use ethtool_op_get_ts_info()")
Signed-off-by: Ryan M. Collins <rmc032@bucknell.edu>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the following tdc test fails on Fedora:
# ./tdc.py -e 2638
-- ns/SubPlugin.__init__
Test 2638: Add matchall and try to get it
-----> prepare stage *** Could not execute: "$TC qdisc add dev $DEV1 clsact"
-----> prepare stage *** Error message: "/bin/sh: ip: command not found"
returncode 127; expected [0]
-----> prepare stage *** Aborting test run.
Let nsPlugin.py use the 'IP' variable introduced with commit 92c1a19e2f
("tc-tests: added path to ip command in tdc"), so that the path to 'ip' is
correctly resolved to the value we have in tdc_config.py.
# ./tdc.py -e 2638
-- ns/SubPlugin.__init__
Test 2638: Add matchall and try to get it
All test results:
1..1
ok 1 2638 - Add matchall and try to get it
Fixes: 489ce2f425 ("tc-testing: Restore original behaviour for namespaces in tdc")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Razvan Stefanescu says:
====================
net: dsa: microchip: add KSZ8563 support
This patchset adds compatibility string for the KSZ8563 switch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh says:
====================
net: aquantia: fixes on vlan filters and other conditions
Here is a set of various bug fixes related to vlan filter offload and
two other rare cases.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
On embedded environments with hard memory limits it is a normal although
rare case when skb can't be allocated on rx part under high traffic.
In such OOM cases napi_complete_done() was not called.
So the napi object became in an invalid state like it is "scheduled".
Kernel do not re-schedules the poll of that napi object.
Consequently, kernel can not remove that object the system hangs on
`ifconfig down` waiting for a poll.
We are fixing this by gracefully closing napi poll routine with correct
invocation of napi_complete_done.
This was reproduced with artificially failing the allocation of skb to
simulate an "out of memory" error case and check that traffic does
not get stuck.
Fixes: 970a2e9864 ("net: ethernet: aquantia: Vector operations")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Declaring threaded irq handler should also indicate the irq is
oneshot. It is oneshot indeed, because HW implements irq automasking
on trigger.
Not declaring this causes some kernel configurations to fail
on interface up, because request_threaded_irq returned an err code.
The issue was originally hidden on normal x86_64 configuration with
latest kernel, because depending on interrupt controller, irq driver
added ONESHOT flag on its own.
Issue was observed on older kernels (4.14) where no such logic exists.
Fixes: 4c83f170b3 ("net: aquantia: link status irq handling")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Reported-by: Michael Symolkin <Michael.Symolkin@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of device reconfiguration the driver may reset the device invisible
for other modules, vlan module in particular. So vlans will not be
removed&created and vlan filters will not be configured in the device.
The patch reapplies the vlan filters at device start.
Fixes: 7975d2aff5 ("net: aquantia: add support of rx-vlan-filter offload")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a limit condition of vlans on the interface before setting vlan
promiscuous mode
Fixes: 48dd73d08d ("net: aquantia: fix vlans not working over bridged network")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to absence of checking against the rx flow rule when vlan 0 is being
removed, the other rule could be removed instead of the rule with vlan 0
Fixes: 7975d2aff5 ("net: aquantia: add support of rx-vlan-filter offload")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean says:
====================
Fix issues in tc-taprio and tc-cbs
This series fixes one panic and one WARN_ON found in the tc-taprio
qdisc, while trying to apply it:
- On an interface which is not multi-queue
- On an interface which has no carrier
The tc-cbs was also visually found to suffer of the same issue as
tc-taprio, and the fix was only compile-tested in that case.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The discussion to be made is absolutely the same as in the case of
previous patch ("taprio: Set default link speed to 10 Mbps in
taprio_set_picos_per_byte"). Nothing is lost when setting a default.
Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
Fixes: e0a7683d30 ("net/sched: cbs: fix port_rate miscalculation")
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The taprio budget needs to be adapted at runtime according to interface
link speed. But that handling is problematic.
For one thing, installing a qdisc on an interface that doesn't have
carrier is not illegal. But taprio prints the following stack trace:
[ 31.851373] ------------[ cut here ]------------
[ 31.856024] WARNING: CPU: 1 PID: 207 at net/sched/sch_taprio.c:481 taprio_dequeue+0x1a8/0x2d4
[ 31.864566] taprio: dequeue() called with unknown picos per byte.
[ 31.864570] Modules linked in:
[ 31.873701] CPU: 1 PID: 207 Comm: tc Not tainted 5.3.0-rc5-01199-g8838fe023cd6 #1689
[ 31.881398] Hardware name: Freescale LS1021A
[ 31.885661] [<c03133a4>] (unwind_backtrace) from [<c030d8cc>] (show_stack+0x10/0x14)
[ 31.893368] [<c030d8cc>] (show_stack) from [<c10ac958>] (dump_stack+0xb4/0xc8)
[ 31.900555] [<c10ac958>] (dump_stack) from [<c0349d04>] (__warn+0xe0/0xf8)
[ 31.907395] [<c0349d04>] (__warn) from [<c0349d64>] (warn_slowpath_fmt+0x48/0x6c)
[ 31.914841] [<c0349d64>] (warn_slowpath_fmt) from [<c0f38db4>] (taprio_dequeue+0x1a8/0x2d4)
[ 31.923150] [<c0f38db4>] (taprio_dequeue) from [<c0f227b0>] (__qdisc_run+0x90/0x61c)
[ 31.930856] [<c0f227b0>] (__qdisc_run) from [<c0ec82ac>] (net_tx_action+0x12c/0x2bc)
[ 31.938560] [<c0ec82ac>] (net_tx_action) from [<c0302298>] (__do_softirq+0x130/0x3c8)
[ 31.946350] [<c0302298>] (__do_softirq) from [<c03502a0>] (irq_exit+0xbc/0xd8)
[ 31.953536] [<c03502a0>] (irq_exit) from [<c03a4808>] (__handle_domain_irq+0x60/0xb4)
[ 31.961328] [<c03a4808>] (__handle_domain_irq) from [<c0754478>] (gic_handle_irq+0x58/0x9c)
[ 31.969638] [<c0754478>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90)
[ 31.977076] Exception stack(0xe8167b20 to 0xe8167b68)
[ 31.982100] 7b20: e9d4bd80 00000cc0 000000cf 00000000 e9d4bd80 c1f38958 00000cc0 c1f38960
[ 31.990234] 7b40: 00000001 000000cf 00000004 e9dc0800 00000000 e8167b70 c0f478ec c0f46d94
[ 31.998363] 7b60: 60070013 ffffffff
[ 32.001833] [<c0301a8c>] (__irq_svc) from [<c0f46d94>] (netlink_trim+0x18/0xd8)
[ 32.009104] [<c0f46d94>] (netlink_trim) from [<c0f478ec>] (netlink_broadcast_filtered+0x34/0x414)
[ 32.017930] [<c0f478ec>] (netlink_broadcast_filtered) from [<c0f47cec>] (netlink_broadcast+0x20/0x28)
[ 32.027102] [<c0f47cec>] (netlink_broadcast) from [<c0eea378>] (rtnetlink_send+0x34/0x88)
[ 32.035238] [<c0eea378>] (rtnetlink_send) from [<c0f25890>] (notify_and_destroy+0x2c/0x44)
[ 32.043461] [<c0f25890>] (notify_and_destroy) from [<c0f25e08>] (qdisc_graft+0x398/0x470)
[ 32.051595] [<c0f25e08>] (qdisc_graft) from [<c0f27a00>] (tc_modify_qdisc+0x3a4/0x724)
[ 32.059470] [<c0f27a00>] (tc_modify_qdisc) from [<c0ee4c84>] (rtnetlink_rcv_msg+0x260/0x2ec)
[ 32.067864] [<c0ee4c84>] (rtnetlink_rcv_msg) from [<c0f4a988>] (netlink_rcv_skb+0xb8/0x110)
[ 32.076172] [<c0f4a988>] (netlink_rcv_skb) from [<c0f4a170>] (netlink_unicast+0x1b4/0x22c)
[ 32.084392] [<c0f4a170>] (netlink_unicast) from [<c0f4a5e4>] (netlink_sendmsg+0x33c/0x380)
[ 32.092614] [<c0f4a5e4>] (netlink_sendmsg) from [<c0ea9f40>] (sock_sendmsg+0x14/0x24)
[ 32.100403] [<c0ea9f40>] (sock_sendmsg) from [<c0eaa780>] (___sys_sendmsg+0x214/0x228)
[ 32.108279] [<c0eaa780>] (___sys_sendmsg) from [<c0eabad0>] (__sys_sendmsg+0x50/0x8c)
[ 32.116068] [<c0eabad0>] (__sys_sendmsg) from [<c0301000>] (ret_fast_syscall+0x0/0x54)
[ 32.123938] Exception stack(0xe8167fa8 to 0xe8167ff0)
[ 32.128960] 7fa0: b6fa68c8 000000f8 00000003 bea142d0 00000000 00000000
[ 32.137093] 7fc0: b6fa68c8 000000f8 0052154c 00000128 5d6468a2 00000000 00000028 00558c9c
[ 32.145224] 7fe0: 00000070 bea14278 00530d64 b6e17e64
[ 32.150659] ---[ end trace 2139c9827c3e5177 ]---
This happens because the qdisc ->dequeue callback gets called. Which
again is not illegal, the qdisc will dequeue even when the interface is
up but doesn't have carrier (and hence SPEED_UNKNOWN), and the frames
will be dropped further down the stack in dev_direct_xmit().
And, at the end of the day, for what? For calculating the initial budget
of an interface which is non-operational at the moment and where frames
will get dropped anyway.
So if we can't figure out the link speed, default to SPEED_10 and move
along. We can also remove the runtime check now.
Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
Fixes: 7b9eba7ba0 ("net/sched: taprio: fix picos_per_byte miscalculation")
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
taprio_init may fail earlier than this line:
list_add(&q->taprio_list, &taprio_list);
i.e. due to the net device not being multi queue.
Attempting to remove q from the global taprio_list when it is not part
of it will result in a kernel panic.
Fix it by matching list_add and list_del better to one another in the
order of operations. This way we can keep the deletion unconditional
and with lower complexity - O(1).
Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
Fixes: 7b9eba7ba0 ("net/sched: taprio: fix picos_per_byte miscalculation")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the register value width as the regmap_config name to prevent the
following error when the second and third regmap_configs are
initialized.
"debugfs: Directory '${bus-id}' with parent 'regmap' already present!"
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Wunderlich says:
====================
Here are two batman-adv bugfixes:
- Fix OGM and OGMv2 header read boundary check,
by Sven Eckelmann (2 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 9392bd98bb ("tools/power turbostat: Add support for AMD
Fam 17h (Zen) RAPL") and the commit 3316f99a9f ("tools/power
turbostat: Also read package power on AMD F17h (Zen)") add AMD Fam 17h
RAPL support.
Hygon Family 18h(Dhyana) support RAPL in bit 14 of CPUID 0x80000007 EDX,
and has MSRs RAPL_PWR_UNIT/CORE_ENERGY_STAT/PKG_ENERGY_STAT. So add Hygon
Dhyana Family 18h support for RAPL.
Already tested on Hygon multi-node systems and it shows correct per-core
energy usage and the total package power.
Signed-off-by: Pu Wen <puwen@hygon.cn>
Reviewed-by: Calvin Walton <calvin.walton@kepstin.ca>
Signed-off-by: Len Brown <len.brown@intel.com>
Commit 9392bd98bb ("tools/power turbostat: Add support for AMD
Fam 17h (Zen) RAPL") add a function get_tdp_amd(), the parameter is CPU
family. But the rapl_probe_amd() function use wrong model parameter.
Fix the wrong caller parameter of get_tdp_amd() to use family.
Cc: <stable@vger.kernel.org> # v5.1+
Signed-off-by: Pu Wen <puwen@hygon.cn>
Reviewed-by: Calvin Walton <calvin.walton@kepstin.ca>
Signed-off-by: Len Brown <len.brown@intel.com>
In some case C1% will be wrong value, when platform doesn't have MSR for
C1 residency.
For example:
Core CPU CPU%c1
- - 100.00
0 0 100.00
0 2 100.00
1 1 100.00
1 3 100.00
But adding Busy% will fix this
Core CPU Busy% CPU%c1
- - 99.77 0.23
0 0 99.77 0.23
0 2 99.77 0.23
1 1 99.77 0.23
1 3 99.77 0.23
This issue can be reproduced on most of the recent systems including
Broadwell, Skylake and later.
This is because if we don't select Busy% or Avg_MHz or Bzy_MHz then
mperf value will not be read from MSR, so it will be 0. But this
is required for C1% calculation when MSR for C1 residency is not present.
Same is true for C3, C6 and C7 column selection.
So add another define DO_BIC_READ(), which doesn't depend on user
column selection and use for mperf, C3, C6 and C7 related counters.
So when there is no platform support for C1 residency counters,
we still read these counters, if the CPU has support and user selected
display of CPU%c1.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Turbostat works by taking a snapshot of counters, sleeping, taking another
snapshot, calculating deltas, and printing out the table.
The sleep time is controlled via -i option or by user sending a signal or a
character to stdin. In the latter case, turbostat always adds 1 ms
sleep before it reads the counters, in order to avoid larger imprecisions
in the results in prints.
While the 1 ms delay may be a good idea for a "dumb" user, it is a
problem for an "aware" user. I do thousands and thousands of measurements
over a short period of time (like 2ms), and turbostat unconditionally adds
a 1ms to my interval, so I cannot get what I really need.
This patch removes the unconditional 1ms sleep. This is an expert user
tool, after all, and non-experts will unlikely ever use it in the non-fixed
interval mode anyway, so I think it is OK to remove the 1ms delay.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Commit '47936f944e78 tools/power turbostat: fix printing on input' make
a valid fix, but it completely disabled piped stdin support, which is
a valuable use-case. Indeed, if stdin is a pipe, turbostat won't read
anything from it, so it becomes impossible to get turbostat output at
user-defined moments, instead of the regular intervals.
There is no reason why this should works for terminals, but not for
pipes. This patch improves the situation. Instead of ignoring pipes, we
read data from them but gracefully handle the EOF case.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Perhaps if this more descriptive name had been used,
then we wouldn't have had the HSW ULT vs HSW CORE bug,
fixed by the previous commit.
Signed-off-by: Len Brown <len.brown@intel.com>
turbostat: cpu0: msr offset 0x630 read failed: Input/output error
because Haswell Core does not have C8-C10.
Output C8-C10 only on Haswell ULT.
Fixes: f5a4c76ad7 ("tools/power turbostat: consolidate duplicate model numbers")
Reported-by: Prarit Bhargava <prarit@redhat.com>
Suggested-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
Signed-off-by: Len Brown <len.brown@intel.com>
turbostat could be terminated by general protection fault on some latest
hardwares which (for example) support 9 levels of C-states and show 18
"tADDED" lines. That bloats the total output and finally causes buffer
overrun. So let's extend the buffer to avoid this.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Currently the error return path does not close the file fp and leaks
a file descriptor. Fix this by closing the file.
Fixes: 5ea7647b33 ("tools/power turbostat: Warn on bad ACPI LPIT data")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Turbostat currently normalizes TSC and other values by dividing by an
interval. This interval is the delta between the start of one global
(all counters on all CPUs) sampling and the start of another. However,
this introduces a lot of jitter into the data.
In order to reduce jitter, the interval calculation should be based on
timestamps taken per thread and close to the start of the thread's
sampling.
Define a per thread time value to hold the delta between samples taken
on the thread.
Use the timestamp taken at the beginning of sampling to calculate the
delta.
Move the thread's beginning timestamp to after the CPU migration to
avoid jitter due to the migration.
Use the global time delta for the average time delta.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The -w argument in x86_energy_perf_policy currently triggers an
unconditional segfault.
This is because the argument string reads: "+a:c:dD:E:e:f:m:M:rt:u:vw" and
yet the argument handler expects an argument.
When parse_optarg_string is called with a null argument, we then proceed to
crash in strncmp, not horribly friendly.
The man page describes -w as taking an argument, the long form
(--hwp-window) is correctly marked as taking a required argument, and the
code expects it.
As such, this patch simply marks the short form (-w) as requiring an
argument.
Signed-off-by: Zephaniah E. Loss-Cutler-Hull <zephaniah@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Compiling without optimisations is silly, especially since some
warnings depend on the optimiser. Use -O2.
Fortify adds warnings for unchecked I/O (among other things), which
seems to be a good idea for user-space code. Enable that too.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
x86_energy_perf_policy first uses __get_cpuid() to check the maximum
CPUID level and exits if it is too low. It then assumes that later
calls will succeed (which I think is architecturally guaranteed). It
also assumes that CPUID works at all (which is not guaranteed on
x86_32).
If optimisations are enabled, gcc warns about potentially
uninitialized variables. Fix this by adding an exit-on-error after
every call to __get_cpuid() instead of just checking the maximum
level.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
Pull i2c fixes from Wolfram Sang:
"I2C has a bunch of driver fixes and a core improvement to make the
on-going API transition more robust"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mediatek: disable zero-length transfers for mt8183
i2c: iproc: Stop advertising support of SMBUS quick cmd
MAINTAINERS: i2c mv64xxx: Update documentation path
i2c: piix4: Fix port selection for AMD Family 16h Model 30h
i2c: designware: Synchronize IRQs when unregistering slave client
i2c: i801: Avoid memory leak in check_acpi_smo88xx_device()
i2c: make i2c_unregister_device() ERR_PTR safe
Pull tracing fixes from Steven Rostedt:
"Small fixes and minor cleanups for tracing:
- Make exported ftrace function not static
- Fix NULL pointer dereference in reading probes as they are created
- Fix NULL pointer dereference in k/uprobe clean up path
- Various documentation fixes"
* tag 'trace-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Correct kdoc formats
ftrace/x86: Remove mcount() declaration
tracing/probe: Fix null pointer dereference
tracing: Make exported ftrace_set_clr_event non-static
ftrace: Check for successful allocation of hash
ftrace: Check for empty hash and comment the race with registering probes
ftrace: Fix NULL pointer dereference in t_probe_next()
Pull RISC-V fix from Paul Walmsley:
"One significant fix for 32-bit RISC-V systems:
Fix the RV32 memory map to prevent userspace from corrupting the
FIXMAP area. Without this patch, the system can crash very early
during the boot"
* tag 'riscv/for-v5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Fix FIXMAP area corruption on RV32 systems
Pull KVM fixes from Radim Krčmář:
"PPC:
- Fix bug which could leave locks held in the host on return to a
guest.
x86:
- Prevent infinitely looping emulation of a failing syscall while
single stepping.
- Do not crash the host when nesting is disabled"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Don't update RIP or do single-step on faulting emulation
KVM: x86: hyper-v: don't crash on KVM_GET_SUPPORTED_HV_CPUID when kvm_intel.nested is disabled
KVM: PPC: Book3S: Fix incorrect guest-to-user-translation error handling
Merge misc mm fixes from Andrew Morton:
"7 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: memcontrol: fix percpu vmstats and vmevents flush
mm, memcg: do not set reclaim_state on soft limit reclaim
mailmap: add aliases for Dmitry Safonov
mm/z3fold.c: fix lock/unlock imbalance in z3fold_page_isolate
mm, memcg: partially revert "mm/memcontrol.c: keep local VM counters in sync with the hierarchical ones"
mm/zsmalloc.c: fix build when CONFIG_COMPACTION=n
mm: memcontrol: flush percpu slab vmstats on kmem offlining
Fix the following kdoc warnings:
kernel/trace/trace.c:1579: warning: Function parameter or member 'tr' not described in 'update_max_tr_single'
kernel/trace/trace.c:1579: warning: Function parameter or member 'tsk' not described in 'update_max_tr_single'
kernel/trace/trace.c:1579: warning: Function parameter or member 'cpu' not described in 'update_max_tr_single'
kernel/trace/trace.c:1776: warning: Function parameter or member 'type' not described in 'register_tracer'
kernel/trace/trace.c:2239: warning: Function parameter or member 'task' not described in 'tracing_record_taskinfo'
kernel/trace/trace.c:2239: warning: Function parameter or member 'flags' not described in 'tracing_record_taskinfo'
kernel/trace/trace.c:2269: warning: Function parameter or member 'prev' not described in 'tracing_record_taskinfo_sched_switch'
kernel/trace/trace.c:2269: warning: Function parameter or member 'next' not described in 'tracing_record_taskinfo_sched_switch'
kernel/trace/trace.c:2269: warning: Function parameter or member 'flags' not described in 'tracing_record_taskinfo_sched_switch'
kernel/trace/trace.c:3078: warning: Function parameter or member 'ip' not described in 'trace_vbprintk'
kernel/trace/trace.c:3078: warning: Function parameter or member 'fmt' not described in 'trace_vbprintk'
kernel/trace/trace.c:3078: warning: Function parameter or member 'args' not described in 'trace_vbprintk'
Link: http://lkml.kernel.org/r/20190828052549.2472-2-jakub.kicinski@netronome.com
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The function ftrace_set_clr_event is declared static and marked
EXPORT_SYMBOL_GPL(), which is at best an odd combination. Because the
function was decided to be a part of API, this commit removes the static
attribute and adds the declaration to the header.
Link: http://lkml.kernel.org/r/20190704172110.27041-1-efremov@linux.com
Fixes: f45d1225ad ("tracing: Kernel access to Ftrace instances")
Reviewed-by: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull crypto fix from Herbert Xu:
"Fix a potential crash in the ccp driver"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ccp - Ignore unconfigured CCP device on suspend/resume
Commit dfe2a77fd2 ("kfifo: fix kfifo_alloc() and kfifo_init()") made
the kfifo code round the number of elements up. That was good for
__kfifo_alloc(), but it's actually wrong for __kfifo_init().
The difference? __kfifo_alloc() will allocate the rounded-up number of
elements, but __kfifo_init() uses an allocation done by the caller. We
can't just say "use more elements than the caller allocated", and have
to round down.
The good news? All the normal cases will be using power-of-two arrays
anyway, and most users of kfifo's don't use kfifo_init() at all, but one
of the helper macros to declare a KFIFO that enforce the proper
power-of-two behavior. But it looks like at least ibmvscsis might be
affected.
The bad news? Will Deacon refers to an old thread and points points out
that the memory ordering in kfifo's is questionable. See
https://lore.kernel.org/lkml/20181211034032.32338-1-yuleixzhang@tencent.com/
for more.
Fixes: dfe2a77fd2 ("kfifo: fix kfifo_alloc() and kfifo_init()")
Reported-by: laokz <laokz@foxmail.com>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Greg KH <greg@kroah.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adric Blake has noticed[1] the following warning:
WARNING: CPU: 7 PID: 175 at mm/vmscan.c:245 set_task_reclaim_state+0x1e/0x40
[...]
Call Trace:
mem_cgroup_shrink_node+0x9b/0x1d0
mem_cgroup_soft_limit_reclaim+0x10c/0x3a0
balance_pgdat+0x276/0x540
kswapd+0x200/0x3f0
? wait_woken+0x80/0x80
kthread+0xfd/0x130
? balance_pgdat+0x540/0x540
? kthread_park+0x80/0x80
ret_from_fork+0x35/0x40
---[ end trace 727343df67b2398a ]---
which tells us that soft limit reclaim is about to overwrite the
reclaim_state configured up in the call chain (kswapd in this case but
the direct reclaim is equally possible). This means that reclaim stats
would get misleading once the soft reclaim returns and another reclaim
is done.
Fix the warning by dropping set_task_reclaim_state from the soft reclaim
which is always called with reclaim_state set up.
[1] http://lkml.kernel.org/r/CAE1jjeePxYPvw1mw2B3v803xHVR_BNnz0hQUY_JDMN8ny29M6w@mail.gmail.com
Link: http://lkml.kernel.org/r/20190828071808.20410-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Adric Blake <promarbler14@gmail.com>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hillf Danton <hdanton@sina.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 766a4c19d8 ("mm/memcontrol.c: keep local VM counters in sync
with the hierarchical ones") effectively decreased the precision of
per-memcg vmstats_local and per-memcg-per-node lruvec percpu counters.
That's good for displaying in memory.stat, but brings a serious
regression into the reclaim process.
One issue I've discovered and debugged is the following:
lruvec_lru_size() can return 0 instead of the actual number of pages in
the lru list, preventing the kernel to reclaim last remaining pages.
Result is yet another dying memory cgroups flooding. The opposite is
also happening: scanning an empty lru list is the waste of cpu time.
Also, inactive_list_is_low() can return incorrect values, preventing the
active lru from being scanned and freed. It can fail both because the
size of active and inactive lists are inaccurate, and because the number
of workingset refaults isn't precise. In other words, the result is
pretty random.
I'm not sure, if using the approximate number of slab pages in
count_shadow_number() is acceptable, but issues described above are
enough to partially revert the patch.
Let's keep per-memcg vmstat_local batched (they are only used for
displaying stats to the userspace), but keep lruvec stats precise. This
change fixes the dead memcg flooding on my setup.
Link: http://lkml.kernel.org/r/20190817004726.2530670-1-guro@fb.com
Fixes: 766a4c19d8 ("mm/memcontrol.c: keep local VM counters in sync with the hierarchical ones")
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I've noticed that the "slab" value in memory.stat is sometimes 0, even
if some children memory cgroups have a non-zero "slab" value. The
following investigation showed that this is the result of the kmem_cache
reparenting in combination with the per-cpu batching of slab vmstats.
At the offlining some vmstat value may leave in the percpu cache, not
being propagated upwards by the cgroup hierarchy. It means that stats
on ancestor levels are lower than actual. Later when slab pages are
released, the precise number of pages is substracted on the parent
level, making the value negative. We don't show negative values, 0 is
printed instead.
To fix this issue, let's flush percpu slab memcg and lruvec stats on
memcg offlining. This guarantees that numbers on all ancestor levels
are accurate and match the actual number of outstanding slab pages.
Link: http://lkml.kernel.org/r/20190819202338.363363-3-guro@fb.com
Fixes: fb2f2b0adb ("mm: memcg/slab: reparent memcg kmem_caches on cgroup removal")
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Spurious warning when loading rules using the physdev match,
from Todd Seidelmann.
2) Fix FTP conntrack helper debugging output, from Thomas Jarosch.
3) Restore per-netns nf_conntrack_{acct,helper,timeout} sysctl knobs,
from Florian Westphal.
4) Clear skbuff timestamp from the flowtable datapath, also from Florian.
5) Fix incorrect byteorder of NFT_META_BRI_IIFVPROTO, from wenxu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2019-08-31
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix 32-bit zero-extension during constant blinding which
has been causing a regression on ppc64, from Naveen.
2) Fix a latency bug in nfp driver when updating stack index
register, from Jiong.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When a local endpoint is ceases to be in use, such as when the kafs module
is unloaded, the kernel will emit an assertion failure if there are any
outstanding client connections:
rxrpc: Assertion failed
------------[ cut here ]------------
kernel BUG at net/rxrpc/local_object.c:433!
and even beyond that, will evince other oopses if there are service
connections still present.
Fix this by:
(1) Removing the triggering of connection reaping when an rxrpc socket is
released. These don't actually clean up the connections anyway - and
further, the local endpoint may still be in use through another
socket.
(2) Mark the local endpoint as dead when we start the process of tearing
it down.
(3) When destroying a local endpoint, strip all of its client connections
from the idle list and discard the ref on each that the list was
holding.
(4) When destroying a local endpoint, call the service connection reaper
directly (rather than through a workqueue) to immediately kill off all
outstanding service connections.
(5) Make the service connection reaper reap connections for which the
local endpoint is marked dead.
Only after destroying the connections can we close the socket lest we get
an oops in a workqueue that's looking at a connection or a peer.
Fixes: 3d18cbb7fd ("rxrpc: Fix conn expiry timers")
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells says:
====================
rxrpc: Fix use of skb_cow_data()
Here's a series of patches that replaces the use of skb_cow_data() in rxrpc
with skb_unshare() early on in the input process. The problem that is
being seen is that skb_cow_data() indirectly requires that the maximum
usage count on an sk_buff be 1, and it may generate an assertion failure in
pskb_expand_head() if not.
This can occur because rxrpc_input_data() may be still holding a ref when
it has just attached the sk_buff to the rx ring and given that attachment
its own ref. If recvmsg happens fast enough, skb_cow_data() can see the
ref still held by the softirq handler.
Further, a packet may contain multiple subpackets, each of which gets its
own attachment to the ring and its own ref - also making skb_cow_data() go
bang.
Fix this by:
(1) The DATA packet is currently parsed for subpackets twice by the input
routines. Parse it just once instead and make notes in the sk_buff
private data.
(2) Use the notes from (1) when attaching the packet to the ring multiple
times. Once the packet is attached to the ring, recvmsg can see it
and start modifying it, so the softirq handler is not permitted to
look inside it from that point.
(3) Pass the ref from the input code to the ring rather than getting an
extra ref. rxrpc_input_data() uses a ref on the second refcount to
prevent the packet from evaporating under it.
(4) Call skb_unshare() on secured DATA packets in rxrpc_input_packet()
before we take call->input_lock. Other sorts of packets don't get
modified and so can be left.
A trace is emitted if skb_unshare() eats the skb. Note that
skb_share() for our accounting in this regard as we can't see the
parameters in the packet to log in a trace line if it releases it.
(5) Remove the calls to skb_cow_data(). These are then no longer
necessary.
There are also patches to improve the rxrpc_skb tracepoint to make sure
that Tx-derived buffers are identified separately from Rx-derived buffers
in the trace.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The devicetree binding lists the phy phy as optional. As such, the
driver should not bail out if it can't find a regulator. Instead it
should just skip the remaining regulator related code and continue
on normally.
Skip the remainder of phy_power_on() if a regulator supply isn't
available. This also gets rid of the bogus return code.
Fixes: 2e12f53663 ("net: stmmac: dwmac-rk: Use standard devicetree property for phy regulator")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In xgbe_mod_init(), we should do cleanup if some error occurs
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: efbaa82833 ("amd-xgbe: Add support to handle device renaming")
Fixes: 47f164deab ("amd-xgbe: Add PCI device support")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The race between adding a function probe and reading the probes that exist
is very subtle. It needs a comment. Also, the issue can also happen if the
probe has has the EMPTY_HASH as its func_hash.
Cc: stable@vger.kernel.org
Fixes: 7b60f3d876 ("ftrace: Dynamically create the probe ftrace_ops for the trace_array")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
LTP testsuite on powerpc results in the below crash:
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc00000000029d800
Oops: Kernel access of bad area, sig: 11 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
...
CPU: 68 PID: 96584 Comm: cat Kdump: loaded Tainted: G W
NIP: c00000000029d800 LR: c00000000029dac4 CTR: c0000000001e6ad0
REGS: c0002017fae8ba10 TRAP: 0300 Tainted: G W
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28022422 XER: 20040000
CFAR: c00000000029d90c DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 0
...
NIP [c00000000029d800] t_probe_next+0x60/0x180
LR [c00000000029dac4] t_mod_start+0x1a4/0x1f0
Call Trace:
[c0002017fae8bc90] [c000000000cdbc40] _cond_resched+0x10/0xb0 (unreliable)
[c0002017fae8bce0] [c0000000002a15b0] t_start+0xf0/0x1c0
[c0002017fae8bd30] [c0000000004ec2b4] seq_read+0x184/0x640
[c0002017fae8bdd0] [c0000000004a57bc] sys_read+0x10c/0x300
[c0002017fae8be30] [c00000000000b388] system_call+0x5c/0x70
The test (ftrace_set_ftrace_filter.sh) is part of ftrace stress tests
and the crash happens when the test does 'cat
$TRACING_PATH/set_ftrace_filter'.
The address points to the second line below, in t_probe_next(), where
filter_hash is dereferenced:
hash = iter->probe->ops.func_hash->filter_hash;
size = 1 << hash->size_bits;
This happens due to a race with register_ftrace_function_probe(). A new
ftrace_func_probe is created and added into the func_probes list in
trace_array under ftrace_lock. However, before initializing the filter,
we drop ftrace_lock, and re-acquire it after acquiring regex_lock. If
another process is trying to read set_ftrace_filter, it will be able to
acquire ftrace_lock during this window and it will end up seeing a NULL
filter_hash.
Fix this by just checking for a NULL filter_hash in t_probe_next(). If
the filter_hash is NULL, then this probe is just being added and we can
simply return from here.
Link: http://lkml.kernel.org/r/05e021f757625cbbb006fad41380323dbe4e3b43.1562249521.git.naveen.n.rao@linux.vnet.ibm.com
Cc: stable@vger.kernel.org
Fixes: 7b60f3d876 ("ftrace: Dynamically create the probe ftrace_ops for the trace_array")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull ARM fixes from Russell King:
"Three fixes for ARM this time around:
- A fix for update_sections_early() to cope with NULL ->mm pointers.
- A correction to the backtrace code to allow proper backtraces.
- Reinforcement of pfn_valid() with PFNs >= 4GiB"
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8901/1: add a criteria for pfn_valid of arm
ARM: 8897/1: check stmfd instruction using right shift
ARM: 8874/1: mm: only adjust sections of valid mm structures
If check_cached_key() returns a non-NULL value, we still need to call
key_type::match_free() to undo key_type::match_preparse().
Fixes: 7743c48e54 ("keys: Cache result of request_key*() temporarily in task_struct")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ARM SoC fixes from Arnd Bergmann:
"The majority of the fixes this time are for OMAP hardware, here is a
breakdown of the significant changes:
Various device tree bug fixes:
- TI am57xx boards need a voltage level fix to avoid damaging SD
cards
- vf610-bk4 fails to detect its flash due to an incorrect description
- meson-g12a USB phy configuration fails
- meson-g12b reboot should not power off the SD card
- Some corrections for apparently harmless differences from the
documentation.
Regression fixes:
- ams-delta FIQ interrupts broke in 5.3
- TI am3/am4 mmc controllers broke in 5.2
The logic_pio driver (used on some Huawei ARM servers) got a few bug
fixes for reliability.
And a couple of compile-time warning fixes"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (26 commits)
soc: ixp4xx: Protect IXP4xx SoC drivers by ARCH_IXP4XX || COMPILE_TEST
soc: ti: pm33xx: Make two symbols static
soc: ti: pm33xx: Fix static checker warnings
ARM: OMAP: dma: Mark expected switch fall-throughs
ARM: dts: Fix incomplete dts data for am3 and am4 mmc
bus: ti-sysc: Simplify cleanup upon failures in sysc_probe()
ARM: OMAP1: ams-delta-fiq: Fix missing irq_ack
ARM: dts: dra74x: Fix iodelay configuration for mmc3
ARM: dts: am335x: Fix UARTs length
ARM: OMAP2+: Fix omap4 errata warning on other SoCs
bus: hisi_lpc: Add .remove method to avoid driver unbind crash
bus: hisi_lpc: Unregister logical PIO range to avoid potential use-after-free
lib: logic_pio: Add logic_pio_unregister_range()
lib: logic_pio: Avoid possible overlap for unregistering regions
lib: logic_pio: Fix RCU usage
arm64: dts: amlogic: odroid-n2: keep SD card regulator always on
arm64: dts: meson-g12a-sei510: enable IR controller
arm64: dts: meson-g12a: add missing dwc2 phy-names
ARM: dts: vf610-bk4: Fix qspi node description
ARM: dts: Fix incorrect dcan register mapping for am3, am4 and dra7
...
Pull rdma fix from Doug Ledford:
"Much calmer week this week. Just one patch queued up:
The way the siw driver was locking around the traversal of the list of
ipv6 addresses on a device was causing a scheduling while atomic
issue. Bernard straightened it out by using the rtnl_lock"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/siw: Fix IPv6 addr_list locking
Pull two ceph fixes from Ilya Dryomov:
"A fix for a -rc1 regression in rbd and a trivial static checker fix"
* tag 'ceph-for-5.3-rc7' of git://github.com/ceph/ceph-client:
rbd: restore zeroing past the overlap when reading from parent
libceph: don't call crypto_free_sync_skcipher() on a NULL tfm
Pull drm fixes from Dave Airlie:
"Nothing too crazy, there's probably more patches than I'd like at this
stage, but they are all pretty self contained:
amdgpu:
- Fix GFXOFF regression for PCO and RV2
- Fix missing fence reference
- Fix VG20 power readings on certain SMU firmware versions
- Fix dpm level setup for VG20
- Add an ATPX laptop quirk
i915:
- Fix DP MST max BPC property creation after DRM register
- Fix unused ggtt deballooning and NULL dereference in guest
- Fix DSC eDP transcoder identification
- Fix WARN from DMA API debug by setting DMA max segment size
qxl:
- Make qxl reservel the vga ports using vgaargb to prevent switching to vga compatibility mode.
omap:
- Fix omap port lookup for SDI output
virtio:
- Use virtio_max_dma_size to fix an issue with swiotlb.
komeda:
- Compiler fixes to komeda.
- Add missing of_node_get() call in komeda.
- Reorder the komeda de-init functions"
* tag 'drm-fixes-2019-08-30' of git://anongit.freedesktop.org/drm/drm:
drm/komeda: Reordered the komeda's de-init functions
drm/amdgpu: fix GFXOFF on Picasso and Raven2
drm/amdgpu: Add APTX quirk for Dell Latitude 5495
drm/amd/powerplay: correct Vega20 dpm level related settings
drm/i915: Call dma_set_max_seg_size() in i915_driver_hw_probe()
drm/i915/dp: Fix DSC enable code to use cpu_transcoder instead of encoder->type
drm/i915: Don't deballoon unused ggtt drm_mm_node in linux guest
drm/i915: Do not create a new max_bpc prop for MST connectors
drm/powerplay: Fix Vega20 power reading again
drm/powerplay: Fix Vega20 Average Power value v4
drm/amdgpu: fix dma_fence_wait without reference
drm/komeda: Add missing of_node_get() call
drm/komeda: Clean warning 'komeda_component_add' might be a candidate for 'gnu_printf'
drm/komeda: Fix warning -Wunused-but-set-variable
drm/komeda: Fix error: not allocating enough data 1592 vs 1584
drm/virtio: use virtio_max_dma_size
drm/omap: Fix port lookup for SDI output
drm/qxl: get vga ioports
This reverts commit 557529494d.
Commit 557529494d ("iommu/vt-d: Avoid duplicated pci dma alias
consideration") aimed to address a NULL pointer deference issue
happened when a thunderbolt device driver returned unexpectedly.
Unfortunately, this change breaks a previous pci quirk added by
commit cc346a4714 ("PCI: Add function 1 DMA alias quirk for
Marvell devices"), as the result, devices like Marvell 88SE9128
SATA controller doesn't work anymore.
We will continue to try to find the real culprit mentioned in
557529494d, but for now we should revert it to fix current
breakage.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204627
Cc: Stijn Tintel <stijn@linux-ipv6.be>
Cc: Petr Vandrovec <petr@vandrovec.name>
Reported-by: Stijn Tintel <stijn@linux-ipv6.be>
Reported-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Quoting from mt8183 datasheet, the number of transfers to be
transferred in one transaction should be set to bigger than 1,
so we should forbid zero-length transfer and update functionality.
Reported-by: Alexandru M Stan <amstan@chromium.org>
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Qii Wang <qii.wang@mediatek.com>
[wsa: shortened commit message a little]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The driver does not support the SMBUS Quick command so remove the
flag that indicates that level of support.
By default the i2c_detect tool uses the quick command to try and
detect devices at some bus addresses. If the quick command is used
then we will not detect the device, even though it is present.
Fixes: e6e5dd3566 (i2c: iproc: Add Broadcom iProc I2C Driver)
Signed-off-by: Lori Hikichi <lori.hikichi@broadcom.com>
Signed-off-by: Rayagonda Kokatanur <rayagonda.kokatanur@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
When counting dispatched micro-ops with cnt_ctl=1, in order to prevent
sample bias, IBS hardware preloads the least significant 7 bits of
current count (IbsOpCurCnt) with random values, such that, after the
interrupt is handled and counting resumes, the next sample taken
will be slightly perturbed.
The current count bitfield is in the IBS execution control h/w register,
alongside the maximum count field.
Currently, the IBS driver writes that register with the maximum count,
leaving zeroes to fill the current count field, thereby overwriting
the random bits the hardware preloaded for itself.
Fix the driver to actually retain and carry those random bits from the
read of the IBS control register, through to its write, instead of
overwriting the lower current count bits with zeroes.
Tested with:
perf record -c 100001 -e ibs_op/cnt_ctl=1/pp -a -C 0 taskset -c 0 <workload>
'perf annotate' output before:
15.70 65: addsd %xmm0,%xmm1
17.30 add $0x1,%rax
15.88 cmp %rdx,%rax
je 82
17.32 72: test $0x1,%al
jne 7c
7.52 movapd %xmm1,%xmm0
5.90 jmp 65
8.23 7c: sqrtsd %xmm1,%xmm0
12.15 jmp 65
'perf annotate' output after:
16.63 65: addsd %xmm0,%xmm1
16.82 add $0x1,%rax
16.81 cmp %rdx,%rax
je 82
16.69 72: test $0x1,%al
jne 7c
8.30 movapd %xmm1,%xmm0
8.13 jmp 65
8.24 7c: sqrtsd %xmm1,%xmm0
8.39 jmp 65
Tested on Family 15h and 17h machines.
Machines prior to family 10h Rev. C don't have the RDWROPCNT capability,
and have the IbsOpCurCnt bitfield reserved, so this patch shouldn't
affect their operation.
It is unknown why commit db98c5faf8 ("perf/x86: Implement 64-bit
counter support for IBS") ignored the lower 4 bits of the IbsOpCurCnt
field; the number of preloaded random bits has always been 7, AFAICT.
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Arnaldo Carvalho de Melo" <acme@kernel.org>
Cc: <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: "Namhyung Kim" <namhyung@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20190826195730.30614-1-kim.phillips@amd.com
The recent change to shuffle the codec initialization procedure for
Realtek via commit 607ca3bd22 ("ALSA: hda/realtek - EAPD turn on
later") caused the silent output on some machines. This change was
supposed to be safe, but it isn't actually; some devices have quirk
setups to override the EAPD via COEF or BTL in the additional verb
table, which is applied at the beginning of snd_hda_gen_init(). And
this EAPD setup is again overridden in alc_auto_init_amp().
For recovering from the regression, tell snd_hda_gen_init() not to
apply the verbs there by a new flag, then apply the verbs in
alc_init().
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=204727
Fixes: 607ca3bd22 ("ALSA: hda/realtek - EAPD turn on later")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The IP datasheet says this controller is compatible with SD Host
Specification Version v4.00.
As it turned out, the ADMA of this IP does not work with 64-bit mode
when it is in the Version 3.00 compatible mode; it understands the
old 64-bit descriptor table (as defined in SDHCI v2), but the ADMA
System Address Register (SDHCI_ADMA_ADDRESS) cannot point to the
64-bit address.
I noticed this issue only after commit bd2e75633c ("dma-contiguous:
use fallback alloc_pages for single pages"). Prior to that commit,
dma_set_mask_and_coherent() returned the dma address that fits in
32-bit range, at least for the default arm64 configuration
(arch/arm64/configs/defconfig). Now the host->adma_addr exceeds the
32-bit limit, causing the real problem for the Socionext SoCs.
(As a side-note, I was also able to reproduce the issue for older
kernels by turning off CONFIG_DMA_CMA.)
Call sdhci_enable_v4_mode() to fix this.
Cc: <stable@vger.kernel.org> # v4.20+
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The OCR register defines the supported range of VDD voltages for SD cards.
However, it has turned out that some SD cards reports an invalid voltage
range, for example having bit7 set.
When a host supports MMC_CAP2_FULL_PWR_CYCLE and some of the voltages from
the invalid VDD range, this triggers the core to run a power cycle of the
card to try to initialize it at the lowest common supported voltage.
Obviously this fails, since the card can't support it.
Let's fix this problem, by clearing invalid bits from the read OCR register
for SD cards, before proceeding with the VDD voltage negotiation.
Cc: stable@vger.kernel.org
Reported-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Philip Langdale <philipl@overt.org>
Tested-by: Philip Langdale <philipl@overt.org>
Tested-by: Manuel Presnitz <mail@mpy.de>
Pull cifs fixes from Steve French:
"A few small SMB3 fixes, and a larger one to fix various older string
handling functions"
* tag '5.3-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module number
cifs: replace various strncpy with strscpy and similar
cifs: Use kzfree() to zero out the password
cifs: set domainName when a domain-key is used in multiuser
Get the vlan_proto of ingress bridge in network byteorder as userspace
expects. Otherwise this is inconsistent with NFT_META_PROTOCOL.
Fixes: 2a3a93ef0b ("netfilter: nft_meta_bridge: Add NFT_META_BRI_IIFVPROTO support")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Johannes Berg says:
====================
We have
* one fix for a driver as I'm covering for Kalle while he's on vacation
* two fixes for eapol-over-nl80211 work
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Family 16h Model 30h SMBus controller needs the same port selection fix
as described and fixed in commit 0fe16195f8 ("i2c: piix4: Fix SMBus port
selection for AMD Family 17h chips")
commit 6befa3fde6 ("i2c: piix4: Support alternative port selection
register") also fixed the port selection for Hudson2, but unfortunately
this is not the exact same device and the AMD naming and PCI Device IDs
aren't particularly helpful here.
The SMBus port selection register is common to the following Families
and models, as documented in AMD's publicly available BIOS and Kernel
Developer Guides:
50742 - Family 15h Model 60h-6Fh (PCI_DEVICE_ID_AMD_KERNCZ_SMBUS)
55072 - Family 15h Model 70h-7Fh (PCI_DEVICE_ID_AMD_KERNCZ_SMBUS)
52740 - Family 16h Model 30h-3Fh (PCI_DEVICE_ID_AMD_HUDSON2_SMBUS)
The Hudson2 PCI Device ID (PCI_DEVICE_ID_AMD_HUDSON2_SMBUS) is shared
between Bolton FCH and Family 16h Model 30h, but the location of the
SmBus0Sel port selection bits are different:
51192 - Bolton Register Reference Guide
We distinguish between Bolton and Family 16h Model 30h using the PCI
Revision ID:
Bolton is device 0x780b, revision 0x15
Family 16h Model 30h is device 0x780b, revision 0x1F
Family 15h Model 60h and 70h are both device 0x790b, revision 0x4A.
The following additional public AMD BKDG documents were checked and do
not share the same port selection register:
42301 - Family 15h Model 00h-0Fh doesn't mention any
42300 - Family 15h Model 10h-1Fh doesn't mention any
49125 - Family 15h Model 30h-3Fh doesn't mention any
48751 - Family 16h Model 00h-0Fh uses the previously supported
index register SB800_PIIX4_PORT_IDX_ALT at 0x2e
Signed-off-by: Andrew Cooks <andrew.cooks@opengear.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: stable@vger.kernel.org [v4.6+]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
ftrace does not use text_poke() for enabling trace functionality. It uses
its own mechanism and flips the whole kernel text to RW and back to RO.
The CPA rework removed a loop based check of 4k pages which tried to
preserve a large page by checking each 4k page whether the change would
actually cover all pages in the large page.
This resulted in endless loops for nothing as in testing it turned out that
it actually never preserved anything. Of course testing missed to include
ftrace, which is the one and only case which benefitted from the 4k loop.
As a consequence enabling function tracing or ftrace based kprobes results
in a full 4k split of the kernel text, which affects iTLB performance.
The kernel RO protection is the only valid case where this can actually
preserve large pages.
All other static protections (RO data, data NX, PCI, BIOS) are truly
static. So a conflict with those protections which results in a split
should only ever happen when a change of memory next to a protected region
is attempted. But these conflicts are rightfully splitting the large page
to preserve the protected regions. In fact a change to the protected
regions itself is a bug and is warned about.
Add an exception for the static protection check for kernel text RO when
the to be changed region spawns a full large page which allows to preserve
the large mappings. This also prevents the syslog to be spammed about CPA
violations when ftrace is used.
The exception needs to be removed once ftrace switched over to text_poke()
which avoids the whole issue.
Fixes: 585948f4f6 ("x86/mm/cpa: Avoid the 4k pages check completely")
Reported-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Song Liu <songliubraving@fb.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908282355340.1938@nanos.tec.linutronix.de
Make sure interrupt handler i2c_dw_irq_handler_slave() has finished
before clearing the the dev->slave pointer in i2c_dw_unreg_slave().
There is possibility for a race if i2c_dw_irq_handler_slave() is running
on another CPU while clearing the dev->slave pointer.
Reported-by: Krzysztof Adamski <krzysztof.adamski@nokia.com>
Reported-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
check_acpi_smo88xx_device() utilizes acpi_get_object_info() which in its turn
allocates a buffer. User is responsible to clean allocated resources. The last
has been missed in the original code. Fix it here.
While here, replace !ACPI_SUCCESS() with ACPI_FAILURE().
Fixes: 19b07cb4a1 ("i2c: i801: Register optional lis3lv02d I2C device on Dell machines")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
We are moving towards returning ERR_PTRs when i2c_new_*_device() calls
fail. Make sure its counterpart for unregistering handles ERR_PTRs as
well.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Pull fallthrough fixes from Gustavo A. R. Silva:
"Fix fall-through warnings on arc and nds32 for multiple
configurations"
* tag 'Wimplicit-fallthrough-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
nds32: Mark expected switch fall-throughs
ARC: unwind: Mark expected switch fall-through
Pull mtd fix from Miquel Raynal:
"Add a 'depends on' in the core Hyperbus Kconfig entry to avoid build
errors"
* tag 'mtd/fixes-for-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: hyperbus: fix dependency and build error
Mark switch cases where we are expecting to fall through.
This patch fixes the following warnings (Building: allmodconfig nds32):
include/math-emu/soft-fp.h:124:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/nds32/kernel/signal.c:362:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/nds32/kernel/signal.c:315:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:417:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:430:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/soft-fp.h:124:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:417:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:430:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
This patch fixes the following warnings (Building: haps_hs_defconfig arc):
arch/arc/kernel/unwind.c: In function ‘read_pointer’:
./include/linux/compiler.h:328:5: warning: this statement may fall through [-Wimplicit-fallthrough=]
do { \
^
./include/linux/compiler.h:338:2: note: in expansion of macro ‘__compiletime_assert’
__compiletime_assert(condition, msg, prefix, suffix)
^~~~~~~~~~~~~~~~~~~~
./include/linux/compiler.h:350:2: note: in expansion of macro ‘_compiletime_assert’
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
^~~~~~~~~~~~~~~~
arch/arc/kernel/unwind.c:573:3: note: in expansion of macro ‘BUILD_BUG_ON’
BUILD_BUG_ON(sizeof(u32) != sizeof(value));
^~~~~~~~~~~~
arch/arc/kernel/unwind.c:575:2: note: here
case DW_EH_PE_native:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Hisilicon fixes for v5.3-rc
- Fixed RCU usage in logical PIO
- Added a function to unregister a logical PIO range in logical PIO
to support the fixes in the hisi-lpc driver
- Fixed and optimized hisi-lpc driver to avoid potential use-after-free
and driver unbind crash
* tag 'hisi-fixes-for-5.3' of git://github.com/hisilicon/linux-hisi:
bus: hisi_lpc: Add .remove method to avoid driver unbind crash
bus: hisi_lpc: Unregister logical PIO range to avoid potential use-after-free
lib: logic_pio: Add logic_pio_unregister_range()
lib: logic_pio: Avoid possible overlap for unregistering regions
lib: logic_pio: Fix RCU usage
Link: https://lore.kernel.org/r/5D562335.7000902@hisilicon.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
In ieee80211_deliver_skb_to_local_stack intercepts EAPoL frames if
mac80211 is configured to do so and forwards the contents over nl80211.
During this process some additional data is also forwarded, including
whether the frame was received encrypted or not. Unfortunately just
prior to the call to ieee80211_deliver_skb_to_local_stack, skb->cb is
cleared, resulting in incorrect data being exposed over nl80211.
Fixes: 018f6fbf54 ("mac80211: Send control port frames over nl80211")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Link: https://lore.kernel.org/r/20190827224120.14545-2-denkenz@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
We need to use a different firmware for C0 versions of killer Qu NICs.
Add structures for them and handle them in the if block that detects
C0 revisions.
Additionally, instead of having an inclusive check for QnJ devices,
make the selection exclusive, so that switching to QnJ is the
exception, not the default. This prevents us from having to add all
the non-QnJ cards to an exclusion list. To do so, only go into the
QnJ block if the device has an RF ID type HR and HW revision QnJ.
Cc: stable@vger.kernel.org # 5.2
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20190821171732.2266-1-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If 'fq' qdisc is used and a program has requested timestamps,
skb->tstamp needs to be cleared, else fq will treat these as
'transmit time'.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Since the chained quirks via chained_before flag is applied before the
depth check, it may lead to the endless recursive calls, when the
chain were set up incorrectly. Fix it by moving the depth check at
the beginning of the loop.
Fixes: 1f57825077 ("ALSA: hda - Add chained_before flag to the fixup entry")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
lib/devres.c, which implements devm_ioremap_resource(), is only built
when CONFIG_HAS_IOMEM is set/enabled, so MTD_HYPERBUS should depend
on HAS_IOMEM. Fixes a build error and a Kconfig warning (as seen on
UML builds):
WARNING: unmet direct dependencies detected for MTD_COMPLEX_MAPPINGS
Depends on [n]: MTD [=m] && HAS_IOMEM [=n]
Selected by [m]:
- MTD_HYPERBUS [=m] && MTD [=m]
ERROR: "devm_ioremap_resource" [drivers/mtd/hyperbus/hyperbus-core.ko] undefined!
Fixes: dcc7d3446a ("mtd: Add support for HyperBus memory devices")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-mtd@lists.infradead.org
Acked-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
A similar workaround for the suspend/resume problem is needed for yet
another ASUS machines, P6X models. Like the previous fix, the BIOS
doesn't provide the standard DMI_SYS_* entry, so again DMI_BOARD_*
entries are used instead.
Reported-and-tested-by: SteveM <swm@swm1.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
nfp: flower: fix bugs in merge tunnel encap code
John says:
There are few bugs in the merge encap code that have come to light with
recent driver changes. Effectively, flow bind callbacks were being
registered twice when using internal ports (new 'busy' code triggers
this). There was also an issue with neighbour notifier messages being
ignored for internal ports.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent code changes to NFP allowed the offload of neighbour entries to FW
when the next hop device was an internal port. This allows for offload of
tunnel encap when the end-point IP address is applied to such a port.
Unfortunately, the neighbour event handler still rejects events that are
not associated with a repr dev and so the firmware neighbour table may get
out of sync for internal ports.
Fix this by allowing internal port neighbour events to be correctly
processed.
Fixes: 45756dfeda ("nfp: flower: allow tunnels to output to internal port")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Internal port TC offload is implemented through user-space applications
(such as OvS) by adding filters at egress via TC clsact qdiscs. Indirect
block offload support in the NFP driver accepts both ingress qdisc binds
and egress binds if the device is an internal port. However, clsact sends
bind notification for both ingress and egress block binds which can lead
to the driver registering multiple callbacks and receiving multiple
notifications of new filters.
Fix this by rejecting ingress block bind callbacks when the port is
internal and only adding filter callbacks for egress binds.
Fixes: 4d12ba4278 ("nfp: flower: allow offloading of matches on 'internal' ports")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang says:
====================
r8152: fix side effect
v3:
Update the commit message for patch #1.
v2:
Replace patch #2 with "r8152: remove calling netif_napi_del".
v1:
The commit 0ee1f47349 ("r8152: napi hangup fix after disconnect")
add a check to avoid using napi_disable after netif_napi_del. However,
the commit ffa9fec30c ("r8152: set RTL8152_UNPLUG only for real
disconnection") let the check useless.
Therefore, I revert commit 0ee1f47349 ("r8152: napi hangup fix
after disconnect") first, and add another patch to fix it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unnecessary use of netif_napi_del. This also avoids to call
napi_disable() after netif_napi_del().
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 0ee1f47349.
The commit 0ee1f47349 ("r8152: napi hangup fix after
disconnect") adds a check about RTL8152_UNPLUG to determine
if calling napi_disable() is invalid in rtl8152_close(),
when rtl8152_disconnect() is called. This avoids to use
napi_disable() after calling netif_napi_del().
Howver, commit ffa9fec30c ("r8152: set RTL8152_UNPLUG
only for real disconnection") causes that RTL8152_UNPLUG
is not always set when calling rtl8152_disconnect().
Therefore, I have to revert commit 0ee1f47349 ("r8152:
napi hangup fix after disconnect"), first. And submit
another patch to fix it.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Action sample doesn't properly handle psample_group pointer in overwrite
case. Following issues need to be fixed:
- In tcf_sample_init() function RCU_INIT_POINTER() is used to set
s->psample_group, even though we neither setting the pointer to NULL, nor
preventing concurrent readers from accessing the pointer in some way.
Use rcu_swap_protected() instead to safely reset the pointer.
- Old value of s->psample_group is not released or deallocated in any way,
which results resource leak. Use psample_group_put() on non-NULL value
obtained with rcu_swap_protected().
- The function psample_group_put() that released reference to struct
psample_group pointed by rcu-pointer s->psample_group doesn't respect rcu
grace period when deallocating it. Extend struct psample_group with rcu
head and use kfree_rcu when freeing it.
Fixes: 5c5670fae4 ("net/sched: Introduce sample tc action")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the ibmvnic driver will not schedule device resets
if the device is being removed, but does not check the device
state before the reset is actually processed. This leads to a race
where a reset is scheduled with a valid device state but is
processed after the driver has been removed, resulting in an oops.
Fix this by checking the device state before processing a queued
reset event.
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pfn_valid can be wrong when parsing a invalid pfn whose phys address
exceeds BITS_PER_LONG as the MSB will be trimed when shifted.
The issue originally arise from bellowing call stack, which corresponding to
an access of the /proc/kpageflags from userspace with a invalid pfn parameter
and leads to kernel panic.
[46886.723249] c7 [<c031ff98>] (stable_page_flags) from [<c03203f8>]
[46886.723264] c7 [<c0320368>] (kpageflags_read) from [<c0312030>]
[46886.723280] c7 [<c0311fb0>] (proc_reg_read) from [<c02a6e6c>]
[46886.723290] c7 [<c02a6e24>] (__vfs_read) from [<c02a7018>]
[46886.723301] c7 [<c02a6f74>] (vfs_read) from [<c02a778c>]
[46886.723315] c7 [<c02a770c>] (SyS_pread64) from [<c0108620>]
(ret_fast_syscall+0x0/0x28)
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Currently, various virtual memory areas of Linux RISC-V are organized
in increasing order of their virtual addresses is as follows:
1. User space area (This is lowest area and starts at 0x0)
2. FIXMAP area
3. VMALLOC area
4. Kernel area (This is highest area and starts at PAGE_OFFSET)
The maximum size of user space aread is represented by TASK_SIZE.
On RV32 systems, TASK_SIZE is defined as VMALLOC_START which causes the
user space area to overlap the FIXMAP area. This allows user space apps
to potentially corrupt the FIXMAP area and kernel OF APIs will crash
whenever they access corrupted FDT in the FIXMAP area.
On RV64 systems, TASK_SIZE is set to fixed 256GB and no other areas
happen to overlap so we don't see any FIXMAP area corruptions.
This patch fixes FIXMAP area corruption on RV32 systems by setting
TASK_SIZE to FIXADDR_START. We also move FIXADDR_TOP, FIXADDR_SIZE,
and FIXADDR_START defines to asm/pgtable.h so that we can avoid cyclic
header includes.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Only the first fragment in a datagram contains the L4 headers. When the
Open vSwitch module parses a packet, it always sets the IP protocol
field in the key, but can only set the L4 fields on the first fragment.
The original behavior would not clear the L4 portion of the key, so
garbage values would be sent in the key for "later" fragments. This
patch clears the L4 fields in that circumstance to prevent sending those
garbage values as part of the upcall.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When IP fragments are reassembled before being sent to conntrack, the
key from the last fragment is used. Unless there are reordering
issues, the last fragment received will not contain the L4 ports, so the
key for the reassembled datagram won't contain them. This patch updates
the key once we have a reassembled datagram.
The handle_fragments() function works on L3 headers so we pull the L3/L4
flow key update code from key_extract into a new function
'key_extract_l3l4'. Then we add a another new function
ovs_flow_key_update_l3l4() and export it so that it is accessible by
handle_fragments() for conntrack packet reassembly.
Co-authored-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
'TCQ_F_NOLOCK' bit in the parent qdisc, we need to be sure that per-cpu
counters are present when 'reset()' is called for pfifo_fast qdiscs.
Otherwise, the following script:
# tc q a dev lo handle 1: root htb default 100
# tc c a dev lo parent 1: classid 1:100 htb \
> rate 95Mbit ceil 100Mbit burst 64k
[...]
# tc f a dev lo parent 1: protocol arp basic classid 1:100
[...]
# tc q a dev lo parent 1:100 handle 100: pfifo_fast
[...]
# tc q d dev lo root
can generate the following splat:
Unable to handle kernel paging request at virtual address dfff2c01bd148000
Mem abort info:
ESR = 0x96000004
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
[dfff2c01bd148000] address between user and kernel address ranges
Internal error: Oops: 96000004 [#1] SMP
[...]
pstate: 80000005 (Nzcv daif -PAN -UAO)
pc : pfifo_fast_reset+0x280/0x4d8
lr : pfifo_fast_reset+0x21c/0x4d8
sp : ffff800d09676fa0
x29: ffff800d09676fa0 x28: ffff200012ee22e4
x27: dfff200000000000 x26: 0000000000000000
x25: ffff800ca0799958 x24: ffff1001940f332b
x23: 0000000000000007 x22: ffff200012ee1ab8
x21: 0000600de8a40000 x20: 0000000000000000
x19: ffff800ca0799900 x18: 0000000000000000
x17: 0000000000000002 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: ffff1001b922e6e2
x11: 1ffff001b922e6e1 x10: 0000000000000000
x9 : 1ffff001b922e6e1 x8 : dfff200000000000
x7 : 0000000000000000 x6 : 0000000000000000
x5 : 1fffe400025dc45c x4 : 1fffe400025dc357
x3 : 00000c01bd148000 x2 : 0000600de8a40000
x1 : 0000000000000007 x0 : 0000600de8a40004
Call trace:
pfifo_fast_reset+0x280/0x4d8
qdisc_reset+0x6c/0x370
htb_reset+0x150/0x3b8 [sch_htb]
qdisc_reset+0x6c/0x370
dev_deactivate_queue.constprop.5+0xe0/0x1a8
dev_deactivate_many+0xd8/0x908
dev_deactivate+0xe4/0x190
qdisc_graft+0x88c/0xbd0
tc_get_qdisc+0x418/0x8a8
rtnetlink_rcv_msg+0x3a8/0xa78
netlink_rcv_skb+0x18c/0x328
rtnetlink_rcv+0x28/0x38
netlink_unicast+0x3c4/0x538
netlink_sendmsg+0x538/0x9a0
sock_sendmsg+0xac/0xf8
___sys_sendmsg+0x53c/0x658
__sys_sendmsg+0xc8/0x140
__arm64_sys_sendmsg+0x74/0xa8
el0_svc_handler+0x164/0x468
el0_svc+0x10/0x14
Code: 910012a0 92400801 d343fc03 11000c21 (38fb6863)
Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
before dereferencing 'qdisc->cpu_qstats'.
Changes since v1:
- coding style improvements, thanks to Stefano Brivio
Fixes: 8a53e616de ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
CC: Paolo Abeni <pabeni@redhat.com>
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yash Shah says:
====================
macb: Update ethernet compatible string for SiFive FU540
This patch series renames the compatible property to a more appropriate
string. The patchset is based on Linux-5.3-rc6 and tested on SiFive
Unleashed board
Change history:
Since v1:
- Dropped PATCH3 because it's already merged
- Change the reference url in the patch descriptions to point to a
'lore.kernel.org' link instead of 'lkml.org'
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The scom driver currently fails out of operations if certain system
errors are flagged in the status register; system checkstop, special
attention, or recoverable error. These errors won't impact the ability
of the scom engine to perform operations, so the driver should continue
under these conditions.
Also, don't do a PIB reset for these conditions, since it won't help.
Fixes: 6b293258cd ("fsi: scom: Major overhaul")
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20190827041249.13381-1-jk@ozlabs.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Francois reported that VMware balloon gets stuck after a balloon reset,
when the VMCI doorbell is removed. A similar error can occur when the
balloon driver is removed with the following splat:
[ 1088.622000] INFO: task modprobe:3565 blocked for more than 120 seconds.
[ 1088.622035] Tainted: G W 5.2.0 #4
[ 1088.622087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1088.622205] modprobe D 0 3565 1450 0x00000000
[ 1088.622210] Call Trace:
[ 1088.622246] __schedule+0x2a8/0x690
[ 1088.622248] schedule+0x2d/0x90
[ 1088.622250] schedule_timeout+0x1d3/0x2f0
[ 1088.622252] wait_for_completion+0xba/0x140
[ 1088.622320] ? wake_up_q+0x80/0x80
[ 1088.622370] vmci_resource_remove+0xb9/0xc0 [vmw_vmci]
[ 1088.622373] vmci_doorbell_destroy+0x9e/0xd0 [vmw_vmci]
[ 1088.622379] vmballoon_vmci_cleanup+0x6e/0xf0 [vmw_balloon]
[ 1088.622381] vmballoon_exit+0x18/0xcc8 [vmw_balloon]
[ 1088.622394] __x64_sys_delete_module+0x146/0x280
[ 1088.622408] do_syscall_64+0x5a/0x130
[ 1088.622410] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1088.622415] RIP: 0033:0x7f54f62791b7
[ 1088.622421] Code: Bad RIP value.
[ 1088.622421] RSP: 002b:00007fff2a949008 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 1088.622426] RAX: ffffffffffffffda RBX: 000055dff8b55d00 RCX: 00007f54f62791b7
[ 1088.622426] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055dff8b55d68
[ 1088.622427] RBP: 000055dff8b55d00 R08: 00007fff2a947fb1 R09: 0000000000000000
[ 1088.622427] R10: 00007f54f62f5cc0 R11: 0000000000000206 R12: 000055dff8b55d68
[ 1088.622428] R13: 0000000000000001 R14: 000055dff8b55d68 R15: 00007fff2a94a3f0
The cause for the bug is that when the "delayed" doorbell is invoked, it
takes a reference on the doorbell entry and schedules work that is
supposed to run the appropriate code and drop the doorbell entry
reference. The code ignores the fact that if the work is already queued,
it will not be scheduled to run one more time. As a result one of the
references would not be dropped. When the code waits for the reference
to get to zero, during balloon reset or module removal, it gets stuck.
Fix it. Drop the reference if schedule_work() indicates that the work is
already queued.
Note that this bug got more apparent (or apparent at all) due to
commit ce664331b2 ("vmw_balloon: VMCI_DOORBELL_SET does not check status").
Fixes: 83e2ec765b ("VMCI: doorbell implementation.")
Reported-by: Francois Rigault <rigault.francois@gmail.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: Adit Ranadive <aditr@vmware.com>
Cc: Alexios Zavras <alexios.zavras@intel.com>
Cc: Vishnu DASA <vdasa@vmware.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Link: https://lore.kernel.org/r/20190820202638.49003-1-namit@vmware.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Falcon microcontroller that runs the XUSB firmware and which is
responsible for exposing the XHCI interface can address only 40 bits of
memory. Typically that's not a problem because Tegra devices don't have
enough system memory to exceed those 40 bits.
However, if the ARM SMMU is enable on Tegra186 and later, the addresses
passed to the XUSB controller can be anywhere in the 48-bit IOV address
space of the ARM SMMU. Since the DMA/IOMMU API starts allocating from
the top of the IOVA space, the Falcon microcontroller is not able to
load the firmware successfully.
Fix this by setting the DMA mask to 40 bits, which will force the DMA
API to map the buffer for the firmware to an IOVA that is addressable by
the Falcon.
Signed-off-by: Nagarjuna Kristam <nkristam@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/1566989697-13049-1-git-send-email-nkristam@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes an issue that the following error is
possible to happen when ohci hardware causes an interruption
and the system is shutting down at the same time.
[ 34.851754] usb 2-1: USB disconnect, device number 2
[ 35.166658] irq 156: nobody cared (try booting with the "irqpoll" option)
[ 35.173445] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.3.0-rc5 #85
[ 35.179964] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[ 35.187886] Workqueue: usb_hub_wq hub_event
[ 35.192063] Call trace:
[ 35.194509] dump_backtrace+0x0/0x150
[ 35.198165] show_stack+0x14/0x20
[ 35.201475] dump_stack+0xa0/0xc4
[ 35.204785] __report_bad_irq+0x34/0xe8
[ 35.208614] note_interrupt+0x2cc/0x318
[ 35.212446] handle_irq_event_percpu+0x5c/0x88
[ 35.216883] handle_irq_event+0x48/0x78
[ 35.220712] handle_fasteoi_irq+0xb4/0x188
[ 35.224802] generic_handle_irq+0x24/0x38
[ 35.228804] __handle_domain_irq+0x5c/0xb0
[ 35.232893] gic_handle_irq+0x58/0xa8
[ 35.236548] el1_irq+0xb8/0x180
[ 35.239681] __do_softirq+0x94/0x23c
[ 35.243253] irq_exit+0xd0/0xd8
[ 35.246387] __handle_domain_irq+0x60/0xb0
[ 35.250475] gic_handle_irq+0x58/0xa8
[ 35.254130] el1_irq+0xb8/0x180
[ 35.257268] kernfs_find_ns+0x5c/0x120
[ 35.261010] kernfs_find_and_get_ns+0x3c/0x60
[ 35.265361] sysfs_unmerge_group+0x20/0x68
[ 35.269454] dpm_sysfs_remove+0x2c/0x68
[ 35.273284] device_del+0x80/0x370
[ 35.276683] hid_destroy_device+0x28/0x60
[ 35.280686] usbhid_disconnect+0x4c/0x80
[ 35.284602] usb_unbind_interface+0x6c/0x268
[ 35.288867] device_release_driver_internal+0xe4/0x1b0
[ 35.293998] device_release_driver+0x14/0x20
[ 35.298261] bus_remove_device+0x110/0x128
[ 35.302350] device_del+0x148/0x370
[ 35.305832] usb_disable_device+0x8c/0x1d0
[ 35.309921] usb_disconnect+0xc8/0x2d0
[ 35.313663] hub_event+0x6e0/0x1128
[ 35.317146] process_one_work+0x1e0/0x320
[ 35.321148] worker_thread+0x40/0x450
[ 35.324805] kthread+0x124/0x128
[ 35.328027] ret_from_fork+0x10/0x18
[ 35.331594] handlers:
[ 35.333862] [<0000000079300c1d>] usb_hcd_irq
[ 35.338126] [<0000000079300c1d>] usb_hcd_irq
[ 35.342389] Disabling IRQ #156
ohci_shutdown() disables all the interrupt and rh_state is set to
OHCI_RH_HALTED. In other hand, ohci_irq() is possible to enable
OHCI_INTR_SF and OHCI_INTR_MIE on ohci_irq(). Note that OHCI_INTR_SF
is possible to be set by start_ed_unlink() which is called:
ohci_irq()
-> process_done_list()
-> takeback_td()
-> start_ed_unlink()
So, ohci_irq() has the following condition, the issue happens by
&ohci->regs->intrenable = OHCI_INTR_MIE | OHCI_INTR_SF and
ohci->rh_state = OHCI_RH_HALTED:
/* interrupt for some other device? */
if (ints == 0 || unlikely(ohci->rh_state == OHCI_RH_HALTED))
return IRQ_NOTMINE;
To fix the issue, ohci_shutdown() holds the spin lock while disabling
the interruption and changing the rh_state flag to prevent reenable
the OHCI_INTR_MIE unexpectedly. Note that io_watchdog_func() also
calls the ohci_shutdown() and it already held the spin lock, so that
the patch makes a new function as _ohci_shutdown().
This patch is inspired by a Renesas R-Car Gen3 BSP patch
from Tho Vu.
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/1566877910-6020-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Using managed device resources in usb_hcd_pci_probe() allows devm usage for
resource subranges, such as the mmio resource for the platform device
created to control host/device mode mux, which is a xhci extended
capability, and sits inside the xhci mmio region.
If managed device resources are not used then "parent" resource
is released before subrange at driver removal as .remove callback is
called before the devres list of resources for this device is walked
and released.
This has been observed with the xhci extended capability driver causing a
use-after-free which is now fixed.
An additional nice benefit is that error handling on driver initialisation
is simplified much.
Signed-off-by: Carsten Schmid <carsten_schmid@mentor.com>
Tested-by: Carsten Schmid <carsten_schmid@mentor.com>
Reviewed-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Fixes: fa31b3cb2a ("xhci: Add Intel extended cap / otg phy mux handling")
Cc: <stable@vger.kernel.org> # v4.19+
Link: https://lore.kernel.org/r/1566569488679.31808@mentor.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To address the requirements of embargoed hardware issues, like Meltdown,
Spectre, L1TF etc. it is necessary to define and document a process for
handling embargoed hardware security issues.
Following the discussion at the maintainer summit 2018 in Edinburgh
(https://lwn.net/Articles/769417/) the volunteered people have worked
out a process and a Memorandum of Understanding. The latter addresses
the fact that the Linux kernel community cannot sign NDAs for various
reasons.
The initial contact point for hardware security issues is different from
the regular kernel security contact to provide a known and neutral
interface for hardware vendors and researchers. The initial primary
contact team is proposed to be staffed by Linux Foundation Fellows, who
are not associated to a vendor or a distribution and are well connected
in the industry as a whole.
The process is designed with the experience of the past incidents in
mind and tries to address the remaining gaps, so future (hopefully rare)
incidents can be handled more efficiently. It won't remove the fact,
that most of this has to be done behind closed doors, but it is set up
to avoid big bureaucratic hurdles for individual developers.
The process is solely for handling hardware security issues and cannot
be used for regular kernel (software only) security bugs.
This memo can help with hardware companies who, and I quote, "[my
manager] doesn't want to bet his job on the list keeping things secret."
This despite numerous leaks directly from that company over the years,
and none ever so far from the kernel security team. Cognitive
dissidence seems to be a requirement to be a good manager.
To accelerate the adoption of this process, we introduce the concept of
ambassadors in participating companies. The ambassadors are there to
guide people to comply with the process, but are not automatically
involved in the disclosure of a particular incident.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Acked-by: Ben Hutchings <ben@decadent.org.uk>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Link: https://lore.kernel.org/r/20190815212505.GC12041@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Moritz writes:
FPGA Manager fixes for 5.3
A single fix for the altera-ps-spi driver that fixes the behavior when
the driver receives -EPROBE_DEFER when trying to obtain a GPIO desc.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
* tag 'fpga-fixes-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga:
fpga: altera-ps-spi: Fix getting of optional confd gpio
Pull arm64 fixes from Will Deacon:
"Hot on the heels of our last set of fixes are a few more for -rc7.
Two of them are fixing issues with our virtual interrupt controller
implementation in KVM/arm, while the other is a longstanding but
straightforward kallsyms fix which was been acked by Masami and
resolves an initialisation failure in kprobes observed on arm64.
- Fix GICv2 emulation bug (KVM)
- Fix deadlock in virtual GIC interrupt injection code (KVM)
- Fix kprobes blacklist init failure due to broken kallsyms lookup"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
KVM: arm/arm64: vgic-v2: Handle SGI bits in GICD_I{S,C}PENDR0 as WI
KVM: arm/arm64: vgic: Fix potential deadlock when ap_list is long
kallsyms: Don't let kallsyms_lookup_size_offset() fail on retrieving the first symbol
Yi reported[1] that after commit a3619190d6 ("libnvdimm/pfn: stop
padding pmem namespaces to section alignment"), it was no longer
possible to create a device dax namespace with a 1G alignment. The
reason was that the pmem region was not itself 1G-aligned. The code
happily skips past the first 512M, but fails to account for a now
misaligned end offset (since space was allocated starting at that
misaligned address, and extending for size GBs). Reintroduce
end_trunc, so that the code correctly handles the misaligned end
address. This results in the same behavior as before the introduction
of the offending commit.
[1] https://lists.01.org/pipermail/linux-nvdimm/2019-July/022813.html
Fixes: a3619190d6 ("libnvdimm/pfn: stop padding pmem namespaces ...")
Reported-and-tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Link: https://lore.kernel.org/r/x49ftll8f39.fsf@segfault.boston.devel.redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The de-init routine should be doing the following in order:-
1. Unregister the drm device
2. Shut down the crtcs - failing to do this might cause a connector leakage
See the 'commit 109c4d18e5 ("drm/arm/malidp: Ensure that the crtcs are
shutdown before removing any encoder/connector")'
3. Disable the interrupts
4. Unbind the components
5. Free up DRM mode_config info
Changes from v1:-
1. Re-ordered the header files inclusion
2. Rebased on top of the latest drm-misc-fixes
Signed-off-by:. Ayan Kumar Halder <Ayan.Halder@arm.com>
Reviewed-by: Mihail Atanassov <mihail.atanassov@arm.com>
Reviewed-by: James Qian Wang (Arm Technology China) <james.qian.wang@arm.com>
Link: https://patchwork.freedesktop.org/patch/327606/
One of the very few warnings I have in the current build comes from
arch/x86/boot/edd.c, where I get the following with a gcc9 build:
arch/x86/boot/edd.c: In function ‘query_edd’:
arch/x86/boot/edd.c:148:11: warning: taking address of packed member of ‘struct boot_params’ may result in an unaligned pointer value [-Waddress-of-packed-member]
148 | mbrptr = boot_params.edd_mbr_sig_buffer;
| ^~~~~~~~~~~
This warning triggers because we throw away all the CFLAGS and then make
a new set for REALMODE_CFLAGS, so the -Wno-address-of-packed-member we
added in the following commit is not present:
6f303d6053 ("gcc-9: silence 'address-of-packed-member' warning")
The simplest solution for now is to adjust the warning for this version
of CFLAGS as well, but it would definitely make sense to examine whether
REALMODE_CFLAGS could be derived from CFLAGS, so that it picks up changes
in the compiler flags environment automatically.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Walking the address list of an inet6_dev requires
appropriate locking. Since the called function
siw_listen_address() may sleep, we have to use
rtnl_lock() instead of read_lock_bh().
Also introduces sanity checks if we got a device
from in_dev_get() or in6_dev_get().
Reported-by: Bart Van Assche <bvanassche@acm.org>
Fixes: 6c52fdc244 ("rdma/siw: connection management")
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190828130355.22830-1-bmt@zurich.ibm.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
The register number needs to be translated for chips with more than 8
ports. This patch fixes a bug causing all chips with more than 8 GPIO pins
to not work correctly.
Fixes: 0f25fda840 ("gpio: pca953x: Zap ad-hoc reg_direction cache")
Cc: Cc: <stable@vger.kernel.org>
Signed-off-by: David Jander <david@protonic.nl>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
The parent image is read only up to the overlap point, the rest of
the buffer should be zeroed. This snuck in because as it turns out
the overlap test case has not been triggering this code path for
a while now.
Fixes: a9b67e6994 ("rbd: replace obj_req->tried_parent with obj_req->read_state")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
In set_secret(), key->tfm is assigned to NULL on line 55, and then
ceph_crypto_key_destroy(key) is executed.
ceph_crypto_key_destroy(key)
crypto_free_sync_skcipher(key->tfm)
crypto_free_skcipher(&tfm->base);
This happens to work because crypto_sync_skcipher is a trivial wrapper
around crypto_skcipher: &tfm->base is still 0 and crypto_free_skcipher()
handles that. Let's not rely on the layout of crypto_sync_skcipher.
This bug is found by a static analysis tool STCheck written by us.
Fixes: 69d6302b65 ("libceph: Remove VLA usage of skcipher").
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
A guest is not allowed to inject a SGI (or clear its pending state)
by writing to GICD_ISPENDR0 (resp. GICD_ICPENDR0), as these bits are
defined as WI (as per ARM IHI 0048B 4.3.7 and 4.3.8).
Make sure we correctly emulate the architecture.
Fixes: 96b298000d ("KVM: arm/arm64: vgic-new: Add PENDING registers handlers")
Cc: stable@vger.kernel.org # 4.7+
Reported-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Vladimir Rutsky reported stuck TCP sessions after memory pressure
events. Edge Trigger epoll() user would never receive an EPOLLOUT
notification allowing them to retry a sendmsg().
Jason tested the case of sk_stream_alloc_skb() returning NULL,
but there are other paths that could lead both sendmsg() and sendpage()
to return -1 (EAGAIN), with an empty skb queued on the write queue.
This patch makes sure we remove this empty skb so that
Jason code can detect that the queue is empty, and
call sk->sk_write_space(sk) accordingly.
Fixes: ce5ec44099 ("tcp: ensure epoll edge trigger wakeup when write queue is empty")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Reported-by: Vladimir Rutsky <rutsky@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rds6_inc_info_copy() function has a couple struct members which
are leaking stack information. The ->tos field should hold actual
information and the ->flags field needs to be zeroed out.
Fixes: 3eb450367d ("rds: add type of service(tos) infrastructure")
Fixes: b7ff8b1036 ("rds: Extend RDS API for IPv6 support")
Reported-by: 黄ID蝴蝶 <butterflyhuangxx@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After witnessing the discussion in https://lkml.org/lkml/2019/8/14/151
w.r.t. ioctl extensibility, it became clear that such an issue might
prevent that the 3 RSV bits inside the DSA 802.1Q tag might also suffer
the same fate and be useless for further extension.
So clearly specify that the reserved bits should currently be
transmitted as zero and ignored on receive. The DSA tagger already does
this (and has always did), and is the only known user so far (no
Wireshark dissection plugin, etc). So there should be no incompatibility
to speak of.
Fixes: 0471dd429c ("net: dsa: tag_8021q: Create a stable binary format")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 34786005ec ("net: phy: prevent PHYs w/o Clause 22 regs from calling
genphy_config_aneg") introduced a check that aborts phy_config_aneg()
if the phy is a C45 phy.
This causes phy_state_machine() to call phy_error() so that the phy
ends up in PHY_HALTED state.
Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg()
(analogous to the C22 case) so that the state machine can run
correctly.
genphy_c45_config_aneg() closely resembles mv3310_config_aneg()
in drivers/net/phy/marvell10g.c, excluding vendor specific
configurations for 1000BaseT.
Fixes: 22b56e8270 ("net: phy: replace genphy_10g_driver with genphy_c45_driver")
Signed-off-by: Marco Hartmann <marco.hartmann@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using strscpy is cleaner, and avoids some problems with
handling maximum length strings. Linus noticed the
original problem and Aurelien pointed out some additional
problems. Fortunately most of this is SMB1 code (and
in particular the ASCII string handling older, which
is less common).
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
The net pointer in struct xt_tgdtor_param is not explicitly
initialized therefore is still NULL when dereferencing it.
So we have to find a way to pass the correct net pointer to
ipt_destroy_target().
The best way I find is just saving the net pointer inside the per
netns struct tcf_idrinfo, which could make this patch smaller.
Fixes: 0c66dc1ea3 ("netfilter: conntrack: register hooks in netns when needed by ruleset")
Reported-and-tested-by: itugrok@yahoo.com
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's safer to zero out the password so that it can never be disclosed.
Fixes: 0c219f5799c7 ("cifs: set domainName when a domain-key is used in multiuser")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
RHBZ: 1710429
When we use a domain-key to authenticate using multiuser we must also set
the domainnmame for the new volume as it will be used and passed to the server
in the NTLMSSP Domain-name.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Stable fixes:
- Fix a page lock leak in nfs_pageio_resend()
- Ensure O_DIRECT reports an error if the bytes read/written is 0
- Don't handle errors if the bind/connect succeeded
- Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was
invalidat ed"
Bugfixes:
- Don't refresh attributes with mounted-on-file information
- Fix return values for nfs4_file_open() and nfs_finish_open()
- Fix pnfs layoutstats reporting of I/O errors
- Don't use soft RPC calls for pNFS/flexfiles I/O, and don't abort
for soft I/O errors when the user specifies a hard mount.
- Various fixes to the error handling in sunrpc
- Don't report writepage()/writepages() errors twice"
* tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: remove set but not used variable 'mapping'
NFSv2: Fix write regression
NFSv2: Fix eof handling
NFS: Fix writepage(s) error handling to not report errors twice
NFS: Fix spurious EIO read errors
pNFS/flexfiles: Don't time out requests on hard mounts
SUNRPC: Handle connection breakages correctly in call_status()
Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated"
SUNRPC: Handle EADDRINUSE and ENOBUFS correctly
pNFS/flexfiles: Turn off soft RPC calls
SUNRPC: Don't handle errors if the bind/connect succeeded
NFS: On fatal writeback errors, we need to call nfs_inode_remove_request()
NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup
NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0
NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend()
NFSv4: Fix return value in nfs_finish_open()
NFSv4: Fix return values for nfs4_file_open()
NFS: Don't refresh attributes with mounted-on-file information
Don't advance RIP or inject a single-step #DB if emulation signals a
fault. This logic applies to all state updates that are conditional on
clean retirement of the emulation instruction, e.g. updating RFLAGS was
previously handled by commit 38827dbd3f ("KVM: x86: Do not update
EFLAGS on faulting emulation").
Not advancing RIP is likely a nop, i.e. ctxt->eip isn't updated with
ctxt->_eip until emulation "retires" anyways. Skipping #DB injection
fixes a bug reported by Andy Lutomirski where a #UD on SYSCALL due to
invalid state with EFLAGS.TF=1 would loop indefinitely due to emulation
overwriting the #UD with #DB and thus restarting the bad SYSCALL over
and over.
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Fixes: 663f4c61b8 ("KVM: x86: handle singlestep during emulation")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
If kvm_intel is loaded with nested=0 parameter an attempt to perform
KVM_GET_SUPPORTED_HV_CPUID results in OOPS as nested_get_evmcs_version hook
in kvm_x86_ops is NULL (we assign it in nested_vmx_hardware_setup() and
this only happens in case nested is enabled).
Check that kvm_x86_ops->nested_get_evmcs_version is not NULL before
calling it. With this, we can remove the stub from svm as it is no
longer needed.
Cc: <stable@vger.kernel.org>
Fixes: e2e871ab2f ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Pull ARC updates from Vineet Gupta:
- support for Edge Triggered IRQs in ARC IDU intc
- other fixes here and there
* tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
arc: prefer __section from compiler_attributes.h
dt-bindings: IDU-intc: Add support for edge-triggered interrupts
dt-bindings: IDU-intc: Clean up documentation
ARCv2: IDU-intc: Add support for edge-triggered interrupts
ARC: unwind: Mark expected switch fall-throughs
ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations
ARC: fix typo in setup_dma_ops log message
ARCv2: entry: early return from exception need not clear U & DE bits
Pull MFD fix from Lee Jones:
"Identify potentially unused functions in rk808 driver when !PM"
* tag 'mfd-fixes-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: rk808: Make PM function declaration static
mfd: rk808: Mark pm functions __maybe_unused
Pull sound fixes from Takashi Iwai:
"A collection of small fixes as usual:
- More coverage of USB-audio descriptor sanity checks
- A fix for mute LED regression on Conexant HD-audio codecs
- A few device-specific fixes and quirks for USB-audio and HD-audio
- A fix for (die-hard remaining) possible race in sequencer core
- FireWire oxfw regression fix that was introduced in 5.3-rc1"
* tag 'sound-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: oxfw: fix to handle correct stream for PCM playback
ALSA: seq: Fix potential concurrent access to the deleted pool
ALSA: usb-audio: Check mixer unit bitmap yet more strictly
ALSA: line6: Fix memory leak at line6_init_pcm() error path
ALSA: usb-audio: Fix invalid NULL check in snd_emuusb_set_samplerate()
ALSA: hda/ca0132 - Add new SBZ quirk
ALSA: usb-audio: Add implicit fb quirk for Behringer UFX1604
ALSA: hda - Fixes inverted Conexant GPIO mic mute led
For picasso(adev->pdev->device == 0x15d8)&raven2(adev->rev_id >= 0x8),
firmware is sufficient to support gfxoff.
In commit 98f58ada2d, for picasso&raven2,
return directly and cause gfxoff disabled.
Fixes: 98f58ada2d ("drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Pull networking fixes from David Miller:
1) Use 32-bit index for tails calls in s390 bpf JIT, from Ilya
Leoshkevich.
2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for
SMC from Jason Baron.
3) ipv6_mc_may_pull() should return 0 for malformed packets, not
-EINVAL. From Stefano Brivio.
4) Don't forget to unpin umem xdp pages in error path of
xdp_umem_reg(). From Ivan Khoronzhuk.
5) Fix sta object leak in mac80211, from Johannes Berg.
6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2
switches. From Florian Fainelli.
7) Revert DMA sync removal from r8169 which was causing regressions on
some MIPS Loongson platforms. From Heiner Kallweit.
8) Use after free in flow dissector, from Jakub Sitnicki.
9) Fix NULL derefs of net devices during ICMP processing across
collect_md tunnels, from Hangbin Liu.
10) proto_register() memory leaks, from Zhang Lin.
11) Set NLM_F_MULTI flag in multipart netlink messages consistently,
from John Fastabend.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
r8152: Set memory to all 0xFFs on failed reg reads
openvswitch: Fix conntrack cache with timeout
ipv4: mpls: fix mpls_xmit for iptunnel
nexthop: Fix nexthop_num_path for blackhole nexthops
net: rds: add service level support in rds-info
net: route dump netlink NLM_F_MULTI flag missing
s390/qeth: reject oversized SNMP requests
sock: fix potential memory leak in proto_register()
MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT
xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode
ipv4/icmp: fix rt dst dev null pointer dereference
openvswitch: Fix log message in ovs conntrack
bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0
bpf: fix use after free in prog symbol exposure
bpf: fix precision tracking in presence of bpf2bpf calls
flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH
Revert "r8169: remove not needed call to dma_sync_single_for_device"
ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev
net/ncsi: Fix the payload copying for the request coming from Netlink
qed: Add cleanup in qed_slowpath_start()
...
When I merged the extension sysctl tables with the main one I forgot to
reset them on netns creation. They currently read/write init_net settings.
Fixes: d912dec124 ("netfilter: conntrack: merge acct and helper sysctl table with main one")
Fixes: cb2833ed00 ("netfilter: conntrack: merge ecache and timestamp sysctl tables with main one")
Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If the ap_list is longer than 256 entries, merge_final() in list_sort()
will call the comparison callback with the same element twice, causing
a deadlock in vgic_irq_cmp().
Fix it by returning early when irqa == irqb.
Cc: stable@vger.kernel.org # 4.7+
Fixes: 8e44474579 ("KVM: arm/arm64: vgic-new: Add IRQ sorting")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Heyi Guo <guoheyi@huawei.com>
[maz: massaged commit log and patch, added Fixes and Cc-stable]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
An arm64 kernel configured with
CONFIG_KPROBES=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
CONFIG_KALLSYMS_BASE_RELATIVE=y
reports the following kprobe failure:
[ 0.032677] kprobes: failed to populate blacklist: -22
[ 0.033376] Please take care of using kprobes.
It appears that kprobe fails to retrieve the symbol at address
0xffff000010081000, despite this symbol being in System.map:
ffff000010081000 T __exception_text_start
This symbol is part of the first group of aliases in the
kallsyms_offsets array (symbol names generated using ugly hacks in
scripts/kallsyms.c):
kallsyms_offsets:
.long 0x1000 // do_undefinstr
.long 0x1000 // efi_header_end
.long 0x1000 // _stext
.long 0x1000 // __exception_text_start
.long 0x12b0 // do_cp15instr
Looking at the implementation of get_symbol_pos(), it returns the
lowest index for aliasing symbols. In this case, it return 0.
But kallsyms_lookup_size_offset() considers 0 as a failure, which
is obviously wrong (there is definitely a valid symbol living there).
In turn, the kprobe blacklisting stops abruptly, hence the original
error.
A CONFIG_KALLSYMS_ALL kernel wouldn't fail as there is always
some random symbols at the beginning of this array, which are never
looked up via kallsyms_lookup_size_offset.
Fix it by considering that get_symbol_pos() is always successful
(which is consistent with the other uses of this function).
Fixes: ffc5089196 ("[PATCH] Create kallsyms_lookup_size_offset()")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Fixes gcc '-Wunused-but-set-variable' warning:
fs/nfs/write.c: In function nfs_page_async_flush:
fs/nfs/write.c:609:24: warning: variable mapping set but not used [-Wunused-but-set-variable]
It is not use since commit aefb623c422e ("NFS: Fix
writepage(s) error handling to not report errors twice")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If we received a reply from the server with a zero length read and
no error, then that implies we are at eof.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Avoids:
../drivers/mfd/rk808.c:771:1: warning: symbol 'rk8xx_pm_ops' \
was not declared. Should it be static?
Fixes: 5752bc4373 ("mfd: rk808: Mark pm functions __maybe_unused")
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
The find_pattern() debug output was printing the 'skip' character.
This can be a NULL-byte and messes up further pr_debug() output.
Output without the fix:
kernel: nf_conntrack_ftp: Pattern matches!
kernel: nf_conntrack_ftp: Skipped up to `<7>nf_conntrack_ftp: find_pattern `PORT': dlen = 8
kernel: nf_conntrack_ftp: find_pattern `EPRT': dlen = 8
Output with the fix:
kernel: nf_conntrack_ftp: Pattern matches!
kernel: nf_conntrack_ftp: Skipped up to 0x0 delimiter!
kernel: nf_conntrack_ftp: Match succeeded!
kernel: nf_conntrack_ftp: conntrack_ftp: match `172,17,0,100,200,207' (20 bytes at 4150681645)
kernel: nf_conntrack_ftp: find_pattern `PORT': dlen = 8
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Simplify the check in physdev_mt_check() to emit an error message
only when passed an invalid chain (ie, NF_INET_LOCAL_OUT).
This avoids cluttering up the log with errors against valid rules.
For large/heavily modified rulesets, current behavior can quickly
overwhelm the ring buffer, because this function gets called on
every change, regardless of the rule that was changed.
Signed-off-by: Todd Seidelmann <tseidelmann@linode.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The in-place decryption routines in AF_RXRPC's rxkad security module
currently call skb_cow_data() to make sure the data isn't shared and that
the skb can be written over. This has a problem, however, as the softirq
handler may be still holding a ref or the Rx ring may be holding multiple
refs when skb_cow_data() is called in rxkad_verify_packet() - and so
skb_shared() returns true and __pskb_pull_tail() dislikes that. If this
occurs, something like the following report will be generated.
kernel BUG at net/core/skbuff.c:1463!
...
RIP: 0010:pskb_expand_head+0x253/0x2b0
...
Call Trace:
__pskb_pull_tail+0x49/0x460
skb_cow_data+0x6f/0x300
rxkad_verify_packet+0x18b/0xb10 [rxrpc]
rxrpc_recvmsg_data.isra.11+0x4a8/0xa10 [rxrpc]
rxrpc_kernel_recv_data+0x126/0x240 [rxrpc]
afs_extract_data+0x51/0x2d0 [kafs]
afs_deliver_fs_fetch_data+0x188/0x400 [kafs]
afs_deliver_to_call+0xac/0x430 [kafs]
afs_wait_for_call_to_complete+0x22f/0x3d0 [kafs]
afs_make_call+0x282/0x3f0 [kafs]
afs_fs_fetch_data+0x164/0x300 [kafs]
afs_fetch_data+0x54/0x130 [kafs]
afs_readpages+0x20d/0x340 [kafs]
read_pages+0x66/0x180
__do_page_cache_readahead+0x188/0x1a0
ondemand_readahead+0x17d/0x2e0
generic_file_read_iter+0x740/0xc10
__vfs_read+0x145/0x1a0
vfs_read+0x8c/0x140
ksys_read+0x4a/0xb0
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by using skb_unshare() instead in the input path for DATA packets
that have a security index != 0. Non-DATA packets don't need in-place
encryption and neither do unencrypted DATA packets.
Fixes: 248f219cb8 ("rxrpc: Rewrite the data and ack handling code")
Reported-by: Julian Wollrath <jwollrath@web.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Use the previously-added transmit-phase skbuff private flag to simplify the
socket buffer tracing a bit. Which phase the skbuff comes from can now be
divined from the skb rather than having to be guessed from the call state.
We can also reduce the number of rxrpc_skb_trace values by eliminating the
difference between Tx and Rx in the symbols.
Signed-off-by: David Howells <dhowells@redhat.com>
Add a flag in the private data on an skbuff to indicate that this is a
transmission-phase buffer rather than a receive-phase buffer.
Signed-off-by: David Howells <dhowells@redhat.com>
Abstract out rxtx ring cleanup into its own function from its two callers.
This makes it easier to apply the same changes to both.
Signed-off-by: David Howells <dhowells@redhat.com>
Pass the reference held on a DATA skb in the rxrpc input handler into the
Rx ring rather than getting an additional ref for this and then dropping
the original ref at the end.
Signed-off-by: David Howells <dhowells@redhat.com>
Use the information now cached in the skbuff private data to avoid the need
to reparse a jumbo packet. We can find all the subpackets by dead
reckoning, so it's only necessary to note how many there are, whether the
last one is flagged as LAST_PACKET and whether any have the REQUEST_ACK
flag set.
This is necessary as once recvmsg() can see the packet, it can start
modifying it, such as doing in-place decryption.
Fixes: 248f219cb8 ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
Improve the information stored about jumbo packets so that we don't need to
reparse them so much later.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Currently, we don't call dma_set_max_seg_size() for i915 because we
intentionally do not limit the segment length that the device supports.
However, this results in a warning being emitted if we try to map
anything larger than SZ_64K on a kernel with CONFIG_DMA_API_DEBUG_SG
enabled:
[ 7.751926] DMA-API: i915 0000:00:02.0: mapping sg segment longer
than device claims to support [len=98304] [max=65536]
[ 7.751934] WARNING: CPU: 5 PID: 474 at kernel/dma/debug.c:1220
debug_dma_map_sg+0x20f/0x340
This was originally brought up on
https://bugs.freedesktop.org/show_bug.cgi?id=108517 , and the consensus
there was it wasn't really useful to set a limit (and that dma-debug
isn't really all that useful for i915 in the first place). Unfortunately
though, CONFIG_DMA_API_DEBUG_SG is enabled in the debug configs for
various distro kernels. Since a WARN_ON() will disable automatic problem
reporting (and cause any CI with said option enabled to start
complaining), we really should just fix the problem.
Note that as me and Chris Wilson discussed, the other solution for this
would be to make DMA-API not make such assumptions when a driver hasn't
explicitly set a maximum segment size. But, taking a look at the commit
which originally introduced this behavior, commit 78c47830a5
("dma-debug: check scatterlist segments"), there is an explicit mention
of this assumption and how it applies to devices with no segment size:
Conversely, devices which are less limited than the rather
conservative defaults, or indeed have no limitations at all
(e.g. GPUs with their own internal MMU), should be encouraged to
set appropriate dma_parms, as they may get more efficient DMA
mapping performance out of it.
So unless there's any concerns (I'm open to discussion!), let's just
follow suite and call dma_set_max_seg_size() with UINT_MAX as our limit
to silence any warnings.
Changes since v3:
* Drop patch for enabling CONFIG_DMA_API_DEBUG_SG in CI. It looks like
just turning it on causes the kernel to spit out bogus WARN_ONs()
during some igt tests which would otherwise require teaching igt to
disable the various DMA-API debugging options causing this. This is
too much work to be worth it, since DMA-API debugging is useless for
us. So, we'll just settle with this single patch to squelch WARN_ONs()
during driver load for users that have CONFIG_DMA_API_DEBUG_SG turned
on for some reason.
* Move dma_set_max_seg_size() call into i915_driver_hw_probe() - Chris
Wilson
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://patchwork.freedesktop.org/patch/msgid/20190823205251.14298-1-lyude@redhat.com
(cherry picked from commit acd674af95)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We're not allowed to create new properties after device registration
so for MST connectors we need to either create the max_bpc property
earlier, or we reuse one we already have. Let's do the latter apporach
since the corresponding SST connector already has the prop and its
min/max are correct also for the MST connector.
The problem was highlighted by commit 4f5368b554 ("drm/kms:
Catch mode_object lifetime errors") which results in the following
spew:
[ 1330.878941] WARNING: CPU: 2 PID: 1554 at drivers/gpu/drm/drm_mode_object.c:45 __drm_mode_object_add+0xa0/0xb0 [drm]
...
[ 1330.879008] Call Trace:
[ 1330.879023] drm_property_create+0xba/0x180 [drm]
[ 1330.879036] drm_property_create_range+0x15/0x30 [drm]
[ 1330.879048] drm_connector_attach_max_bpc_property+0x62/0x80 [drm]
[ 1330.879086] intel_dp_add_mst_connector+0x11f/0x140 [i915]
[ 1330.879094] drm_dp_add_port.isra.20+0x20b/0x440 [drm_kms_helper]
...
Cc: stable@vger.kernel.org
Cc: Lyude Paul <lyude@redhat.com>
Cc: sunpeng.li@amd.com
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sean Paul <sean@poorly.run>
Fixes: 5ca0ef8a56 ("drm/i915: Add max_bpc property for DP MST")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190820161657.9658-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
(cherry picked from commit 1b9bd09630)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The newly added suspend/resume functions are only used if CONFIG_PM
is enabled:
drivers/mfd/rk808.c:752:12: error: 'rk8xx_resume' defined but not used [-Werror=unused-function]
drivers/mfd/rk808.c:732:12: error: 'rk8xx_suspend' defined but not used [-Werror=unused-function]
Mark them as __maybe_unused so the compiler can silently drop them
when they are not needed.
Fixes: 586c1b4125 ("mfd: rk808: Add RK817 and RK809 support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
H_PUT_TCE_INDIRECT handlers receive a page with up to 512 TCEs from
a guest. Although we verify correctness of TCEs before we do anything
with the existing tables, there is a small window when a check in
kvmppc_tce_validate might pass and right after that the guest alters
the page of TCEs, causing an early exit from the handler and leaving
srcu_read_lock(&vcpu->kvm->srcu) (virtual mode) or lock_rmap(rmap)
(real mode) locked.
This fixes the bug by jumping to the common exit code with an appropriate
unlock.
Cc: stable@vger.kernel.org # v4.11+
Fixes: 121f80ba68 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
For the 40.46 SMU release, they changed CurrSocketPower to
AverageSocketPower, but this was changed back in 40.47 so just check if
it's 40.46 and make the appropriate change
Tested with 40.45, 40.46 and 40.47 successfully
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The SMU changed reading from CurrSocketPower to AverageSocketPower, so
reflect this accordingly. This fixes the issue where Average Power
Consumption was being reported as 0 from SMU 40.46-onward
v2: Fixed headline prefix
v3: Add check for SMU version for proper compatibility
v4: Style fix
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since BPF constant blinding is performed after the verifier pass, the
ALU32 instructions inserted for doubleword immediate loads don't have a
corresponding zext instruction. This is causing a kernel oops on powerpc
and can be reproduced by running 'test_cgroup_storage' with
bpf_jit_harden=2.
Fix this by emitting BPF_ZEXT during constant blinding if
prog->aux->verifier_zext is set.
Fixes: a4b1d3c1dd ("bpf: verifier: insert zero extension according to analysis result")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
NFP is using Local Memory to model stack. LM_addr could be used as base of
a 16 32-bit word region of Local Memory. Then, if the stack offset is
beyond the current region, the local index needs to be updated. The update
needs at least three cycles to take effect, therefore the sequence normally
looks like:
local_csr_wr[ActLMAddr3, gprB_5]
nop
nop
nop
If the local index switch happens on a narrow loads, then the instruction
preparing value to zero high 32-bit of the destination register could be
counted as one cycle, the sequence then could be something like:
local_csr_wr[ActLMAddr3, gprB_5]
nop
nop
immed[gprB_5, 0]
However, we have zero extension optimization that zeroing high 32-bit could
be eliminated, therefore above IMMED insn won't be available for which case
the first sequence needs to be generated.
Fixes: 0b4de1ff19 ("nfp: bpf: eliminate zero extension code-gen")
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
If writepage()/writepages() saw an error, but handled it without
reporting it, we should not be re-reporting that error on exit.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If the client attempts to read a page, but the read fails due to some
spurious error (e.g. an ACCESS error or a timeout, ...) then we need
to allow other processes to retry.
Also try to report errors correctly when doing a synchronous readpage.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If the mount is hard, we should ignore the 'io_maxretrans' module
parameter so that we always keep retrying.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If the connection breaks while we're waiting for a reply from the
server, then we want to immediately try to reconnect.
Fixes: ec6017d903 ("SUNRPC fix regression in umount of a secure mount")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
This reverts commit a79f194aa4.
The mechanism for aborting I/O is racy, since we are not guaranteed that
the request is asleep while we're changing both task->tk_status and
task->tk_action.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v5.1
If a connect or bind attempt returns EADDRINUSE, that means we want to
retry with a different port. It is not a fatal connection error.
Similarly, ENOBUFS is not fatal, but just indicates a memory allocation
issue. Retry after a short delay.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
The pNFS/flexfiles I/O requests are sent with the SOFTCONN flag set, so
they automatically time out if the connection breaks. It should
therefore not be necessary to have the soft flag set in addition.
Fixes: 5f01d95394 ("nfs41: create NFSv3 DS connection if specified")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Don't handle errors in call_bind_status()/call_connect_status()
if it turns out that a previous call caused it to succeed.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v5.1+
Although APIC initialization will typically clear out the LDR before
setting it, the APIC cleanup code should reset the LDR.
This was discovered with a 32-bit KVM guest jumping into a kdump
kernel. The stale bits in the LDR triggered a bug in the KVM APIC
implementation which caused the destination mapping for VCPUs to be
corrupted.
Note that this isn't intended to paper over the KVM APIC bug. The kernel
has to clear the LDR when resetting the APIC registers except when X2APIC
is enabled.
This lacks a Fixes tag because missing to clear LDR goes way back into pre
git history.
[ tglx: Made x2apic_enabled a function call as required ]
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190826101513.5080-3-bsd@redhat.com
Legacy apic init uses bigsmp for smp systems with 8 and more CPUs. The
bigsmp APIC implementation uses physical destination mode, but it
nevertheless initializes LDR and DFR. The LDR even ends up incorrectly with
multiple bit being set.
This does not cause a functional problem because LDR and DFR are ignored
when physical destination mode is active, but it triggered a problem on a
32-bit KVM guest which jumps into a kdump kernel.
The multiple bits set unearthed a bug in the KVM APIC implementation. The
code which creates the logical destination map for VCPUs ignores the
disabled state of the APIC and ends up overwriting an existing valid entry
and as a result, APIC calibration hangs in the guest during kdump
initialization.
Remove the bogus LDR/DFR initialization.
This is not intended to work around the KVM APIC bug. The LDR/DFR
ininitalization is wrong on its own.
The issue goes back into the pre git history. The fixes tag is the commit
in the bitkeeper import which introduced bigsmp support in 2003.
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: db7b9e9f26b8 ("[PATCH] Clustered APIC setup for >8 CPU systems")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190826101513.5080-2-bsd@redhat.com
This updates the documentation for supporting an optional extra interrupt
cell to specify edge vs level triggered.
Signed-off-by: Mischa Jonker <mischa.jonker@synopsys.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This adds support for an optional extra interrupt cell to specify edge
vs level triggered. It is backward compatible with dts files with only
one cell, and will default to level-triggered in such a case.
Note that I had to make a change to idu_irq_set_affinity as well, as
this function was setting the interrupt type to "level" unconditionally,
since this was the only type supported previously.
Signed-off-by: Mischa Jonker <mischa.jonker@synopsys.com>
Reviewed-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
When userspace application calls ioctl(2) to configure hardware for PCM
playback substream, ALSA OXFW driver handles incoming AMDTP stream.
In this case, outgoing AMDTP stream should be handled.
This commit fixes the bug for v5.3-rc kernel.
Fixes: 4f380d0070 ("ALSA: oxfw: configure packet format in pcm.hw_params callback")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
32-bit processes running on a 64-bit kernel are not always detected
correctly, causing the process to crash when uretprobes are installed.
The reason for the crash is that in_ia32_syscall() is used to determine the
process's mode, which only works correctly when called from a syscall.
In the case of uretprobes, however, the function is called from a exception
and always returns 'false' on a 64-bit kernel. In consequence this leads to
corruption of the process's return address.
Fix this by using user_64bit_mode() instead of in_ia32_syscall(), which
is correct in any situation.
[ tglx: Add a comment and the following historical info ]
This should have been detected by the rename which happened in commit
abfb9498ee ("x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()")
which states in the changelog:
The is_ia32_task()/is_x32_task() function names are a big misnomer: they
suggests that the compat-ness of a system call is a task property, which
is not true, the compatness of a system call purely depends on how it
was invoked through the system call layer.
.....
and then it went and blindly renamed every call site.
Sadly enough this was already mentioned here:
8faaed1b9f ("uprobes/x86: Introduce sizeof_long(), cleanup adjust_ret_addr() and
arch_uretprobe_hijack_return_addr()")
where the changelog says:
TODO: is_ia32_task() is not what we actually want, TS_COMPAT does
not necessarily mean 32bit. Fortunately syscall-like insns can't be
probed so it actually works, but it would be better to rename and
use is_ia32_frame().
and goes all the way back to:
0326f5a94d ("uprobes/core: Handle breakpoint and singlestep exceptions")
Oh well. 7+ years until someone actually tried a uretprobe on a 32bit
process on a 64bit kernel....
Fixes: 0326f5a94d ("uprobes/core: Handle breakpoint and singlestep exceptions")
Signed-off-by: Sebastian Mayr <me@sam.st>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190728152617.7308-1-me@sam.st
get_registers() blindly copies the memory written to by the
usb_control_msg() call even if the underlying urb failed.
This could lead to junk register values being read by the driver, since
some indirect callers of get_registers() ignore the return values. One
example is:
ocp_read_dword() ignores the return value of generic_ocp_read(), which
calls get_registers().
So, emulate PCI "Master Abort" behavior by setting the buffer to all
0xFFs when usb_control_msg() fails.
This patch is copied from the r8152 driver (v2.12.0) published by
Realtek (www.realtek.com).
Signed-off-by: Prashant Malani <pmalani@chromium.org>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch addresses a conntrack cache issue with timeout policy.
Currently, we do not check if the timeout extension is set properly in the
cached conntrack entry. Thus, after packet recirculate from conntrack
action, the timeout policy is not applied properly. This patch fixes the
aforementioned issue.
Fixes: 06bd2bdf19 ("openvswitch: Add timeout support to ct action")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using mpls over gre/gre6 setup, rt->rt_gw4 address is not set, the
same for rt->rt_gw_family. Therefore, when rt->rt_gw_family is checked
in mpls_xmit(), neigh_xmit() call is skipped. As a result, such setup
doesn't work anymore.
This issue was found with LTP mpls03 tests.
Fixes: 1550c17193 ("ipv4: Prepare rtable for IPv6 gateway")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Donald reported this sequence:
ip next add id 1 blackhole
ip next add id 2 blackhole
ip ro add 1.1.1.1/32 nhid 1
ip ro add 1.1.1.2/32 nhid 2
would cause a crash. Backtrace is:
[ 151.302790] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 151.304043] CPU: 1 PID: 277 Comm: ip Not tainted 5.3.0-rc5+ #37
[ 151.305078] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
[ 151.306526] RIP: 0010:fib_add_nexthop+0x8b/0x2aa
[ 151.307343] Code: 35 f7 81 48 8d 14 01 c7 02 f1 f1 f1 f1 c7 42 04 01 f4 f4 f4 48 89 f2 48 c1 ea 03 65 48 8b 0c 25 28 00 00 00 48 89 4d d0 31 c9 <80> 3c 02 00 74 08 48 89 f7 e8 1a e8 53 ff be 08 00 00 00 4c 89 e7
[ 151.310549] RSP: 0018:ffff888116c27340 EFLAGS: 00010246
[ 151.311469] RAX: dffffc0000000000 RBX: ffff8881154ece00 RCX: 0000000000000000
[ 151.312713] RDX: 0000000000000004 RSI: 0000000000000020 RDI: ffff888115649b40
[ 151.313968] RBP: ffff888116c273d8 R08: ffffed10221e3757 R09: ffff888110f1bab8
[ 151.315212] R10: 0000000000000001 R11: ffff888110f1bab3 R12: ffff888115649b40
[ 151.316456] R13: 0000000000000020 R14: ffff888116c273b0 R15: ffff888115649b40
[ 151.317707] FS: 00007f60b4d8d800(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
[ 151.319113] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 151.320119] CR2: 0000555671ffdc00 CR3: 00000001136ba005 CR4: 0000000000020ee0
[ 151.321367] Call Trace:
[ 151.321820] ? fib_nexthop_info+0x635/0x635
[ 151.322572] fib_dump_info+0xaa4/0xde0
[ 151.323247] ? fib_create_info+0x2431/0x2431
[ 151.324008] ? napi_alloc_frag+0x2a/0x2a
[ 151.324711] rtmsg_fib+0x2c4/0x3be
[ 151.325339] fib_table_insert+0xe2f/0xeee
...
fib_dump_info incorrectly has nhs = 0 for blackhole nexthops, so it
believes the nexthop object is a multipath group (nhs != 1) and ends
up down the nexthop_mpath_fill_node() path which is wrong for a
blackhole.
The blackhole check in nexthop_num_path is leftover from early days
of the blackhole implementation which did not initialize the device.
In the end the design was simpler (fewer special case checks) to set
the device to loopback in nh_info, so the check in nexthop_num_path
should have been removed.
Fixes: 430a049190 ("nexthop: Add support for nexthop groups")
Reported-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull auxdisplay cleanup from Miguel Ojeda:
"Make ht16k33_fb_fix and ht16k33_fb_var constant (Nishka Dasgupta)"
* tag 'auxdisplay-for-linus-v5.3-rc7' of git://github.com/ojeda/linux:
auxdisplay: ht16k33: Make ht16k33_fb_fix and ht16k33_fb_var constant
Pull UML fix from Richard Weinberger:
"Fix time travel mode"
* tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: fix time travel mode
Pull UBIFS and JFFS2 fixes from Richard Weinberger:
"UBIFS:
- Don't block too long in writeback_inodes_sb()
- Fix for a possible overrun of the log head
- Fix double unlock in orphan_delete()
JFFS2:
- Remove C++ style from UAPI header and unbreak picky toolchains"
* tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
ubifs: Limit the number of pages in shrink_liability
ubifs: Correctly initialize c->min_log_bytes
ubifs: Fix double unlock around orphan_delete()
jffs2: Remove C++ style comments from uapi header
Pull x86 fixes from Thomas Gleixner:
"A few fixes for x86:
- Fix a boot regression caused by the recent bootparam sanitizing
change, which escaped the attention of all people who reviewed that
code.
- Address a boot problem on machines with broken E820 tables caused
by an underflow which ended up placing the trampoline start at
physical address 0.
- Handle machines which do not advertise a legacy timer of any form,
but need calibration of the local APIC timer gracefully by making
the calibration routine independent from the tick interrupt. Marked
for stable as well as there seems to be quite some new laptops
rolled out which expose this.
- Clear the RDRAND CPUID bit on AMD family 15h and 16h CPUs which are
affected by broken firmware which does not initialize RDRAND
correctly after resume. Add a command line parameter to override
this for machine which either do not use suspend/resume or have a
fixed BIOS. Unfortunately there is no way to detect this on boot,
so the only safe decision is to turn it off by default.
- Prevent RFLAGS from being clobbers in CALL_NOSPEC on 32bit which
caused fast KVM instruction emulation to break.
- Explain the Intel CPU model naming convention so that the repeating
discussions come to an end"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
x86/boot: Fix boot regression caused by bootparam sanitizing
x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
x86/boot/compressed/64: Fix boot on machines with broken E820 table
x86/apic: Handle missing global clockevent gracefully
x86/cpu: Explain Intel model naming convention
Pull timekeeping fix from Thomas Gleixner:
"A single fix for a regression caused by the generic VDSO
implementation where a math overflow causes CLOCK_BOOTTIME to become a
random number generator"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping/vsyscall: Prevent math overflow in BOOTTIME update
Pull scheduler fix from Thomas Gleixner:
"Handle the worker management in situations where a task is scheduled
out on a PI lock contention correctly and schedule a new worker if
possible"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Schedule new worker even if PI-blocked
Pull perf fixes from Thomas Gleixner:
"Two small fixes for kprobes and perf:
- Prevent a deadlock in kprobe_optimizer() causes by reverse lock
ordering
- Fix a comment typo"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kprobes: Fix potential deadlock in kprobe_optimizer()
perf/x86: Fix typo in comment
Pull irq fix from Thomas Gleixner:
"A single fix for a imbalanced kobject operation in the irq decriptor
code which was unearthed by the new warnings in the kobject code"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Properly pair kobject_del() with kobject_add()
Mergr misc fixes from Andrew Morton:
"11 fixes"
Mostly VM fixes, one psi polling fix, and one parisc build fix.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y
mm/zsmalloc.c: fix race condition in zs_destroy_pool
mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
mm, page_owner: handle THP splits correctly
userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
psi: get poll_work to run when calling poll syscall next time
mm: memcontrol: flush percpu vmevents before releasing memcg
mm: memcontrol: flush percpu vmstats before releasing memcg
parisc: fix compilation errrors
mm, page_alloc: move_freepages should not examine struct page of reserved memory
mm/z3fold.c: fix race between migration and destruction
The input pool of a client might be deleted via the resize ioctl, the
the access to it should be covered by the proper locks. Currently the
only missing place is the call in snd_seq_ioctl_get_client_pool(), and
this patch papers over it.
Reported-by: syzbot+4a75454b9ca2777f35c7@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull dma-mapping fixes from Christoph Hellwig:
"Two fixes for regressions in this merge window:
- select the Kconfig symbols for the noncoherent dma arch helpers on
arm if swiotlb is selected, not just for LPAE to not break then Xen
build, that uses swiotlb indirectly through swiotlb-xen
- fix the page allocator fallback in dma_alloc_contiguous if the CMA
allocation fails"
* tag 'dma-mapping-5.3-5' of git://git.infradead.org/users/hch/dma-mapping:
dma-direct: fix zone selection after an unaddressable CMA allocation
arm: select the dma-noncoherent symbols for all swiotlb builds
The code like this:
ptr = kmalloc(size, GFP_KERNEL);
page = virt_to_page(ptr);
offset = offset_in_page(ptr);
kfree(page_address(page) + offset);
may produce false-positive invalid-free reports on the kernel with
CONFIG_KASAN_SW_TAGS=y.
In the example above we lose the original tag assigned to 'ptr', so
kfree() gets the pointer with 0xFF tag. In kfree() we check that 0xFF
tag is different from the tag in shadow hence print false report.
Instead of just comparing tags, do the following:
1) Check that shadow doesn't contain KASAN_TAG_INVALID. Otherwise it's
double-free and it doesn't matter what tag the pointer have.
2) If pointer tag is different from 0xFF, make sure that tag in the
shadow is the same as in the pointer.
Link: http://lkml.kernel.org/r/20190819172540.19581-1-aryabinin@virtuozzo.com
Fixes: 7f94ffbc4c ("kasan: add hooks implementation for tag-based mode")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Walter Wu <walter-zh.wu@mediatek.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In zs_destroy_pool() we call flush_work(&pool->free_work). However, we
have no guarantee that migration isn't happening in the background at
that time.
Since migration can't directly free pages, it relies on free_work being
scheduled to free the pages. But there's nothing preventing an
in-progress migrate from queuing the work *after*
zs_unregister_migration() has called flush_work(). Which would mean
pages still pointing at the inode when we free it.
Since we know at destroy time all objects should be free, no new
migrations can come in (since zs_page_isolate() fails for fully-free
zspages). This means it is sufficient to track a "# isolated zspages"
count by class, and have the destroy logic ensure all such pages have
drained before proceeding. Keeping that state under the class spinlock
keeps the logic straightforward.
In this case a memory leak could lead to an eventual crash if compaction
hits the leaked page. This crash would only occur if people are
changing their zswap backend at runtime (which eventually starts
destruction).
Link: http://lkml.kernel.org/r/20190809181751.219326-2-henryburns@google.com
Fixes: 48b4800a1c ("zsmalloc: page migration support")
Signed-off-by: Henry Burns <henryburns@google.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Henry Burns <henrywolfeburns@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Jonathan Adams <jwadams@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In zs_page_migrate() we call putback_zspage() after we have finished
migrating all pages in this zspage. However, the return value is
ignored. If a zs_free() races in between zs_page_isolate() and
zs_page_migrate(), freeing the last object in the zspage,
putback_zspage() will leave the page in ZS_EMPTY for potentially an
unbounded amount of time.
To fix this, we need to do the same thing as zs_page_putback() does:
schedule free_work to occur.
To avoid duplicated code, move the sequence to a new
putback_zspage_deferred() function which both zs_page_migrate() and
zs_page_putback() call.
Link: http://lkml.kernel.org/r/20190809181751.219326-1-henryburns@google.com
Fixes: 48b4800a1c ("zsmalloc: page migration support")
Signed-off-by: Henry Burns <henryburns@google.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Henry Burns <henrywolfeburns@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Jonathan Adams <jwadams@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Similar to vmstats, percpu caching of local vmevents leads to an
accumulation of errors on non-leaf levels. This happens because some
leftovers may remain in percpu caches, so that they are never propagated
up by the cgroup tree and just disappear into nonexistence with on
releasing of the memory cgroup.
To fix this issue let's accumulate and propagate percpu vmevents values
before releasing the memory cgroup similar to what we're doing with
vmstats.
Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate
only over online cpus.
Link: http://lkml.kernel.org/r/20190819202338.363363-4-guro@fb.com
Fixes: 42a3003535 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Percpu caching of local vmstats with the conditional propagation by the
cgroup tree leads to an accumulation of errors on non-leaf levels.
Let's imagine two nested memory cgroups A and A/B. Say, a process
belonging to A/B allocates 100 pagecache pages on the CPU 0. The percpu
cache will spill 3 times, so that 32*3=96 pages will be accounted to A/B
and A atomic vmstat counters, 4 pages will remain in the percpu cache.
Imagine A/B is nearby memory.max, so that every following allocation
triggers a direct reclaim on the local CPU. Say, each such attempt will
free 16 pages on a new cpu. That means every percpu cache will have -16
pages, except the first one, which will have 4 - 16 = -12. A/B and A
atomic counters will not be touched at all.
Now a user removes A/B. All percpu caches are freed and corresponding
vmstat numbers are forgotten. A has 96 pages more than expected.
As memory cgroups are created and destroyed, errors do accumulate. Even
1-2 pages differences can accumulate into large numbers.
To fix this issue let's accumulate and propagate percpu vmstat values
before releasing the memory cgroup. At this point these numbers are
stable and cannot be changed.
Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate
only over online cpus.
Link: http://lkml.kernel.org/r/20190819202338.363363-2-guro@fb.com
Fixes: 42a3003535 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 0cfaee2af3 ("include/asm-generic/5level-fixup.h: fix variable
'p4d' set but not used") converted a few functions from macros to static
inline, which causes parisc to complain,
In file included from include/asm-generic/4level-fixup.h:38:0,
from arch/parisc/include/asm/pgtable.h:5,
from arch/parisc/include/asm/io.h:6,
from include/linux/io.h:13,
from sound/core/memory.c:9:
include/asm-generic/5level-fixup.h:14:18: error: unknown type name 'pgd_t'; did you mean 'pid_t'?
#define p4d_t pgd_t
^
include/asm-generic/5level-fixup.h:24:28: note: in expansion of macro 'p4d_t'
static inline int p4d_none(p4d_t p4d)
^~~~~
It is because "4level-fixup.h" is included before "asm/page.h" where
"pgd_t" is defined.
Link: http://lkml.kernel.org/r/20190815205305.1382-1-cai@lca.pw
Fixes: 0cfaee2af3 ("include/asm-generic/5level-fixup.h: fix variable 'p4d' set but not used")
Signed-off-by: Qian Cai <cai@lca.pw>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After commit 907ec5fca3 ("mm: zero remaining unavailable struct
pages"), struct page of reserved memory is zeroed. This causes
page->flags to be 0 and fixes issues related to reading
/proc/kpageflags, for example, of reserved memory.
The VM_BUG_ON() in move_freepages_block(), however, assumes that
page_zone() is meaningful even for reserved memory. That assumption is
no longer true after the aforementioned commit.
There's no reason why move_freepages_block() should be testing the
legitimacy of page_zone() for reserved memory; its scope is limited only
to pages on the zone's freelist.
Note that pfn_valid() can be true for reserved memory: there is a
backing struct page. The check for page_to_nid(page) is also buggy but
reserved memory normally only appears on node 0 so the zeroing doesn't
affect this.
Move the debug checks to after verifying PageBuddy is true. This
isolates the scope of the checks to only be for buddy pages which are on
the zone's freelist which move_freepages_block() is operating on. In
this case, an incorrect node or zone is a bug worthy of being warned
about (and the examination of struct page is acceptable bcause this
memory is not reserved).
Why does move_freepages_block() gets called on reserved memory? It's
simply math after finding a valid free page from the per-zone free area
to use as fallback. We find the beginning and end of the pageblock of
the valid page and that can bring us into memory that was reserved per
the e820. pfn_valid() is still true (it's backed by a struct page), but
since it's zero'd we shouldn't make any inferences here about comparing
its node or zone. The current node check just happens to succeed most
of the time by luck because reserved memory typically appears on node 0.
The fix here is to validate that we actually have buddy pages before
testing if there's any type of zone or node strangeness going on.
We noticed it almost immediately after bringing 907ec5fca3 in on
CONFIG_DEBUG_VM builds. It depends on finding specific free pages in
the per-zone free area where the math in move_freepages() will bring the
start or end pfn into reserved memory and wanting to claim that entire
pageblock as a new migratetype. So the path will be rare, require
CONFIG_DEBUG_VM, and require fallback to a different migratetype.
Some struct pages were already zeroed from reserve pages before
907ec5fca3c so it theoretically could trigger before this commit. I
think it's rare enough under a config option that most people don't run
that others may not have noticed. I wouldn't argue against a stable tag
and the backport should be easy enough, but probably wouldn't single out
a commit that this is fixing.
Mel said:
: The overhead of the debugging check is higher with this patch although
: it'll only affect debug builds and the path is not particularly hot.
: If this was a concern, I think it would be reasonable to simply remove
: the debugging check as the zone boundaries are checked in
: move_freepages_block and we never expect a zone/node to be smaller than
: a pageblock and stuck in the middle of another zone.
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908122036560.10779@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>From IB specific 7.6.5 SERVICE LEVEL, Service Level (SL)
is used to identify different flows within an IBA subnet.
It is carried in the local route header of the packet.
Before this commit, run "rds-info -I". The outputs are as
below:
"
RDS IB Connections:
LocalAddr RemoteAddr Tos SL LocalDev RemoteDev
192.2.95.3 192.2.95.1 2 0 fe80::21:28:1a:39 fe80::21:28:10:b9
192.2.95.3 192.2.95.1 1 0 fe80::21:28:1a:39 fe80::21:28:10:b9
192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9
"
After this commit, the output is as below:
"
RDS IB Connections:
LocalAddr RemoteAddr Tos SL LocalDev RemoteDev
192.2.95.3 192.2.95.1 2 2 fe80::21:28:1a:39 fe80::21:28:10:b9
192.2.95.3 192.2.95.1 1 1 fe80::21:28:1a:39 fe80::21:28:10:b9
192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9
"
The commit fe3475af3b ("net: rds: add per rds connection cache
statistics") adds cache_allocs in struct rds_info_rdma_connection
as below:
struct rds_info_rdma_connection {
...
__u32 rdma_mr_max;
__u32 rdma_mr_size;
__u8 tos;
__u32 cache_allocs;
};
The peer struct in rds-tools of struct rds_info_rdma_connection is as
below:
struct rds_info_rdma_connection {
...
uint32_t rdma_mr_max;
uint32_t rdma_mr_size;
uint8_t tos;
uint8_t sl;
uint32_t cache_allocs;
};
The difference between userspace and kernel is the member variable sl.
In the kernel struct, the member variable sl is missing. This will
introduce risks. So it is necessary to use this commit to avoid this risk.
Fixes: fe3475af3b ("net: rds: add per rds connection cache statistics")
CC: Joe Jin <joe.jin@oracle.com>
CC: JUNXIAO_BI <junxiao.bi@oracle.com>
Suggested-by: Gerd Rausch <gerd.rausch@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An excerpt from netlink(7) man page,
In multipart messages (multiple nlmsghdr headers with associated payload
in one byte stream) the first and all following headers have the
NLM_F_MULTI flag set, except for the last header which has the type
NLMSG_DONE.
but, after (ee28906) there is a missing NLM_F_MULTI flag in the middle of a
FIB dump. The result is user space applications following above man page
excerpt may get confused and may stop parsing msg believing something went
wrong.
In the golang netlink lib [0] the library logic stops parsing believing the
message is not a multipart message. Found this running Cilium[1] against
net-next while adding a feature to auto-detect routes. I noticed with
multiple route tables we no longer could detect the default routes on net
tree kernels because the library logic was not returning them.
Fix this by handling the fib_dump_info_fnhe() case the same way the
fib_dump_info() handles it by passing the flags argument through the
call chain and adding a flags argument to rt_fill_info().
Tested with Cilium stack and auto-detection of routes works again. Also
annotated libs to dump netlink msgs and inspected NLM_F_MULTI and
NLMSG_DONE flags look correct after this.
Note: In inet_rtm_getroute() pass rt_fill_info() '0' for flags the same
as is done for fib_dump_info() so this looks correct to me.
[0] https://github.com/vishvananda/netlink/
[1] https://github.com/cilium/
Fixes: ee28906fd7 ("ipv4: Dump route exceptions if requested")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit d4c08afafa ("s390/qeth: streamline SNMP cmd code") removed
the bounds checking for req_len, under the assumption that the check in
qeth_alloc_cmd() would suffice.
But that code path isn't sufficiently robust to handle a user-provided
data_length, which could overflow (when adding the cmd header overhead)
before being checked against QETH_BUFSIZE. We end up allocating just a
tiny iob, and the subsequent copy_from_user() writes past the end of
that iob.
Special-case this path and add a coarse bounds check, to protect against
maliciuous requests. This let's the subsequent code flow do its normal
job and precise checking, without risk of overflow.
Fixes: d4c08afafa ("s390/qeth: streamline SNMP cmd code")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If protocols registered exceeded PROTO_INUSE_NR, prot will be
added to proto_list, but no available bit left for prot in
proto_inuse_idx.
Changes since v2:
* Propagate the error code properly
Signed-off-by: zhanglin <zhang.lin16@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2019-08-22
This series introduces some fixes to mlx5 driver.
1) Form Moshe, two fixes for firmware health reporter
2) From Eran, two ktls fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell king maintains phylink, as part of the SFP module support.
However, much of the review work is about drivers swapping from phylib
to phylink. Such changes don't make changes to the phylink core, and
so the F: rules in MAINTAINERS don't match. Add a K:, keywork rule,
which hopefully get_maintainers will match against for patches to MAC
drivers swapping to phylink.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
In decode_session{4,6} there is a possibility that the skb dst dev is NULL,
e,g, with tunnel collect_md mode, which will cause kernel crash.
Here is what the code path looks like, for GRE:
- ip6gre_tunnel_xmit
- ip6gre_xmit_ipv6
- __gre6_xmit
- ip6_tnl_xmit
- if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE
- icmpv6_send
- icmpv6_route_lookup
- xfrm_decode_session_reverse
- decode_session4
- oif = skb_dst(skb)->dev->ifindex; <-- here
- decode_session6
- oif = skb_dst(skb)->dev->ifindex; <-- here
The reason is __metadata_dst_init() init dst->dev to NULL by default.
We could not fix it in __metadata_dst_init() as there is no dev supplied.
On the other hand, the skb_dst(skb)->dev is actually not needed as we
called decode_session{4,6} via xfrm_decode_session_reverse(), so oif is not
used by: fl4->flowi4_oif = reverse ? skb->skb_iif : oif;
So make a dst dev check here should be clean and safe.
v4: No changes.
v3: No changes.
v2: fix the issue in decode_session{4,6} instead of updating shared dst dev
in {ip_md, ip6}_tunnel_xmit.
Fixes: 8d79266bc4 ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In __icmp_send() there is a possibility that the rt->dst.dev is NULL,
e,g, with tunnel collect_md mode, which will cause kernel crash.
Here is what the code path looks like, for GRE:
- ip6gre_tunnel_xmit
- ip6gre_xmit_ipv4
- __gre6_xmit
- ip6_tnl_xmit
- if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE
- icmp_send
- net = dev_net(rt->dst.dev); <-- here
The reason is __metadata_dst_init() init dst->dev to NULL by default.
We could not fix it in __metadata_dst_init() as there is no dev supplied.
On the other hand, the reason we need rt->dst.dev is to get the net.
So we can just try get it from skb->dev when rt->dst.dev is NULL.
v4: Julian Anastasov remind skb->dev also could be NULL. We'd better
still use dst.dev and do a check to avoid crash.
v3: No changes.
v2: fix the issue in __icmp_send() instead of updating shared dst dev
in {ip_md, ip6}_tunnel_xmit.
Fixes: c8b34e680a ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull GPIO fixes from Linus Walleij:
"Here is a (hopefully last) set of GPIO fixes for the v5.3 kernel
cycle. Two are pretty core:
- Fix not reporting open drain/source lines to userspace as "input"
- Fix a minor build error found in randconfigs
- Fix a chip select quirk on the Freescale SPI
- Fix the irqchip initialization semantic order to reflect what it
was using the old API"
* tag 'gpio-v5.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: Fix irqchip initialization order
gpio: of: fix Freescale SPI CS quirk handling
gpio: Fix build error of function redefinition
gpiolib: never report open-drain/source lines as 'input' to user-space
Stefan Schmidt says:
====================
pull-request: ieee802154 for net 2019-08-24
An update from ieee802154 for your *net* tree.
Yue Haibing fixed two bugs discovered by KASAN in the hwsim driver for
ieee802154 and Colin Ian King cleaned up a redundant variable assignment.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull Hyper-V fixes from Sasha Levin:
- Fix for panics and network failures on PAE guests by Dexuan Cui.
- Fix of a memory leak (and related cleanups) in the hyper-v keyboard
driver by Dexuan Cui.
- Code cleanups for hyper-v clocksource driver during the merge window
by Dexuan Cui.
- Fix for a false positive warning in the userspace hyper-v KVP store
by Vitaly Kuznetsov.
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE
Tools: hv: kvp: eliminate 'may be used uninitialized' warning
Input: hyperv-keyboard: Use in-place iterator API in the channel callback
Drivers: hv: vmbus: Remove the unused "tsc_page" from struct hv_context
Pull arm64 fixes from Will Deacon:
"Two KVM/arm fixes for MMIO emulation and UBSAN.
Unusually, we're routing them via the arm64 tree as per Paolo's
request on the list:
https://lore.kernel.org/kvm/21ae69a2-2546-29d0-bff6-2ea825e3d968@redhat.com/
We don't actually have any other arm64 fixes pending at the moment
(touch wood), so I've pulled from Marc, written a merge commit, tagged
the result and run it through my build/boot/bisect scripts"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
KVM: arm/arm64: Only skip MMIO insn once
Pull SCSI fixes from James Bottomley:
"Four fixes, three for edge conditions which don't occur very often.
The lpfc fix mitigates memory exhaustion for some high CPU systems"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: lpfc: Mitigate high memory pre-allocation by SCSI-MQ
scsi: ufs: Fix NULL pointer dereference in ufshcd_config_vreg_hpm()
scsi: target: tcmu: avoid use-after-free after command timeout
scsi: qla2xxx: Fix gnl.l memory leak on adapter init failure
Pull xfs fix from Darrick Wong:
"A single patch that fixes a xfs lockup problem when a chown/chgrp
operation fails due to running out of quota. It has survived the usual
xfstests runs and merges cleanly with this morning's master:
- Fix a forgotten inode unlock when chown/chgrp fail due to quota"
* tag 'xfs-5.3-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
Pull more drm fixes from Dave Airlie:
"Although the tree built for me fine on arm here, it appears either
header cleanups in next or some kconfig combo it breaks, so this
contains a fix to mediatek to include dma-mapping.h explicitly.
There was also one nouveau fix that came in late that I was going to
leave until next week, but since I was sending this I thought it may
as well be in here:
mediatek:
- fix build in some cases
nouveau:
- fix hang with i2c and mst docks"
* tag 'drm-fixes-2019-08-24' of git://anongit.freedesktop.org/drm/drm:
drm/mediatek: include dma-mapping header
drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX
Pull KVM/arm fixes from Marc Zyngier as per Paulo's request at:
https://lkml.kernel.org/r/21ae69a2-2546-29d0-bff6-2ea825e3d968@redhat.com
"One (hopefully last) set of fixes for KVM/arm for 5.3: an embarassing
MMIO emulation regression, and a UBSAN splat. Oh well...
- Don't overskip instructions on MMIO emulation
- Fix UBSAN splat when initializing PPI priorities"
* tag 'kvmarm-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm:
KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
KVM: arm/arm64: Only skip MMIO insn once
Although it builds fine here in my arm cross compile, it seems
either via some other patches in -next or some Kconfig combination,
this fails to build for everyone.
Include linux/dma-mapping.h should fix it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Daniel Borkmann says:
====================
pull-request: bpf 2019-08-24
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix verifier precision tracking with BPF-to-BPF calls, from Alexei.
2) Fix a use-after-free in prog symbol exposure, from Daniel.
3) Several s390x JIT fixes plus BE related fixes in BPF kselftests, from Ilya.
4) Fix memory leak by unpinning XDP umem pages in error path, from Ivan.
5) Fix a potential use-after-free on flow dissector detach, from Jakub.
6) Fix bpftool to close prog fd after showing metadata, from Quentin.
7) BPF kselftest config and TEST_PROGS_EXTENDED fixes, from Anders.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
test_select_reuseport fails on s390 due to verifier rejecting
test_select_reuseport_kern.o with the following message:
; data_check.eth_protocol = reuse_md->eth_protocol;
18: (69) r1 = *(u16 *)(r6 +22)
invalid bpf_context access off=22 size=2
This is because on big-endian machines casts from __u32 to __u16 are
generated by referencing the respective variable as __u16 with an offset
of 2 (as opposed to 0 on little-endian machines).
The verifier already has all the infrastructure in place to allow such
accesses, it's just that they are not explicitly enabled for
eth_protocol field. Enable them for eth_protocol field by using
bpf_ctx_range instead of offsetof.
Ditto for ip_protocol, bind_inany and len, since they already allow
narrowing, and the same problem can arise when working with them.
Fixes: 2dbb9b9e6d ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
syzkaller managed to trigger the warning in bpf_jit_free() which checks via
bpf_prog_kallsyms_verify_off() for potentially unlinked JITed BPF progs
in kallsyms, and subsequently trips over GPF when walking kallsyms entries:
[...]
8021q: adding VLAN 0 to HW filter on device batadv0
8021q: adding VLAN 0 to HW filter on device batadv0
WARNING: CPU: 0 PID: 9869 at kernel/bpf/core.c:810 bpf_jit_free+0x1e8/0x2a0
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x113/0x167 lib/dump_stack.c:113
panic+0x212/0x40b kernel/panic.c:214
__warn.cold.8+0x1b/0x38 kernel/panic.c:571
report_bug+0x1a4/0x200 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:178 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
RIP: 0010:bpf_jit_free+0x1e8/0x2a0
Code: 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 86 00 00 00 48 ba 00 02 00 00 00 00 ad de 0f b6 43 02 49 39 d6 0f 84 5f fe ff ff <0f> 0b e9 58 fe ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1
RSP: 0018:ffff888092f67cd8 EFLAGS: 00010202
RAX: 0000000000000007 RBX: ffffc90001947000 RCX: ffffffff816e9d88
RDX: dead000000000200 RSI: 0000000000000008 RDI: ffff88808769f7f0
RBP: ffff888092f67d00 R08: fffffbfff1394059 R09: fffffbfff1394058
R10: fffffbfff1394058 R11: ffffffff89ca02c7 R12: ffffc90001947002
R13: ffffc90001947020 R14: ffffffff881eca80 R15: ffff88808769f7e8
BUG: unable to handle kernel paging request at fffffbfff400d000
#PF error: [normal kernel read fault]
PGD 21ffee067 P4D 21ffee067 PUD 21ffed067 PMD 9f942067 PTE 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:495 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:558 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x107/0x2e0 kernel/bpf/core.c:632
Code: 00 f0 ff ff 44 38 c8 7f 08 84 c0 0f 85 fa 00 00 00 41 f6 45 02 01 75 02 0f 0b 48 39 da 0f 82 92 00 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 03 0f 8e 45 01 00 00 8b 03 48 c1 e0
[...]
Upon further debugging, it turns out that whenever we trigger this
issue, the kallsyms removal in bpf_prog_ksym_node_del() was /skipped/
but yet bpf_jit_free() reported that the entry is /in use/.
Problem is that symbol exposure via bpf_prog_kallsyms_add() but also
perf_event_bpf_event() were done /after/ bpf_prog_new_fd(). Once the
fd is exposed to the public, a parallel close request came in right
before we attempted to do the bpf_prog_kallsyms_add().
Given at this time the prog reference count is one, we start to rip
everything underneath us via bpf_prog_release() -> bpf_prog_put().
The memory is eventually released via deferred free, so we're seeing
that bpf_jit_free() has a kallsym entry because we added it from
bpf_prog_load() but /after/ bpf_prog_put() from the remote CPU.
Therefore, move both notifications /before/ we install the fd. The
issue was never seen between bpf_prog_alloc_id() and bpf_prog_new_fd()
because upon bpf_prog_get_fd_by_id() we'll take another reference to
the BPF prog, so we're still holding the original reference from the
bpf_prog_load().
Fixes: 6ee52e2a3f ("perf, bpf: Introduce PERF_RECORD_BPF_EVENT")
Fixes: 74451e66d5 ("bpf: make jited programs visible in traces")
Reported-by: syzbot+bd3bba6ff3fcea7a6ec6@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Song Liu <songliubraving@fb.com>
While adding extra tests for precision tracking and extra infra
to adjust verifier heuristics the existing test
"calls: cross frame pruning - liveness propagation" started to fail.
The root cause is the same as described in verifer.c comment:
* Also if parent's curframe > frame where backtracking started,
* the verifier need to mark registers in both frames, otherwise callees
* may incorrectly prune callers. This is similar to
* commit 7640ead939 ("bpf: verifier: make sure callees don't prune with caller differences")
* For now backtracking falls back into conservative marking.
Turned out though that returning -ENOTSUPP from backtrack_insn() and
doing mark_all_scalars_precise() in the current parentage chain is not enough.
Depending on how is_state_visited() heuristic is creating parentage chain
it's possible that callee will incorrectly prune caller.
Fix the issue by setting precise=true earlier and more aggressively.
Before this fix the precision tracking _within_ functions that don't do
bpf2bpf calls would still work. Whereas now precision tracking is completely
disabled when bpf2bpf calls are present anywhere in the program.
No difference in cilium tests (they don't have bpf2bpf calls).
No difference in test_progs though some of them have bpf2bpf calls,
but precision tracking wasn't effective there.
Fixes: b5dc0163d8 ("bpf: precise scalar_value tracking")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Call to bpf_prog_put(), with help of call_rcu(), queues an RCU-callback to
free the program once a grace period has elapsed. The callback can run
together with new RCU readers that started after the last grace period.
New RCU readers can potentially see the "old" to-be-freed or already-freed
pointer to the program object before the RCU update-side NULLs it.
Reorder the operations so that the RCU update-side resets the protected
pointer before the end of the grace period after which the program will be
freed.
Fixes: d58e468b11 ("flow_dissector: implements flow dissector BPF hook")
Reported-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Petar Penkov <ppenkov@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This reverts commit f072218cca.
As reported by Aaro this patch causes network problems on
MIPS Loongson platform. Therefore revert it.
Fixes: f072218cca ("r8169: remove not needed call to dma_sync_single_for_device")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull rdma fixes from Doug Ledford:
"No beating around the bush: this is a monster pull request for an -rc5
kernel. Intel hit me with a series of fixes for TID processing.
Mellanox hit me with a series for their UMR memory support.
And we had one fix for siw that fixes the 32bit build warnings and
because of the number of casts that had to be changed to properly
silence the warnings, that one patch alone is a full 40% of the LOC of
this entire pull request. Given that this is the initial release
kernel for siw, I'm trying to fix anything in it that we can, so that
adds to the impetus to take fixes for it like this one.
I had to do a rebase early in the week. Jason had thought he put a
patch on the rc queue that he needed to be there so he could base some
work off of it, and it had actually not been placed there. So he asked
me (on Tuesday) to fix that up before pushing my wip branch to the
official rc branch. I did, and that's why the early patches look like
they were all committed at the same time on Tuesday. That bunch had
been in my queue prior.
The various patches all pass my test for being legitimate fixes and
not attempts to slide new features or development into a late rc.
Well, they were all fixes with the exception of a couple clean up
patches people wrote for making the fixes they also wrote better (like
a cleanup patch to move UMR checking into a function so that the
remaining UMR fix patches can reference that function), so I left
those in place too.
My apologies for the LOC count and the number of patches here, it's
just how the cards fell this cycle.
Summary:
- Fix siw buffer mapping issue
- Fix siw 32/64 casting issues
- Fix a KASAN access issue in bnxt_re
- Fix several memory leaks (hfi1, mlx4)
- Fix a NULL deref in cma_cleanup
- Fixes for UMR memory support in mlx5 (4 patch series)
- Fix namespace check for restrack
- Fixes for counter support
- Fixes for hfi1 TID processing (5 patch series)
- Fix potential NULL deref in siw
- Fix memory page calculations in mlx5"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (21 commits)
RDMA/siw: Fix 64/32bit pointer inconsistency
RDMA/siw: Fix SGL mapping issues
RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message
infiniband: hfi1: fix memory leaks
infiniband: hfi1: fix a memory leak bug
IB/mlx4: Fix memory leaks
RDMA/cma: fix null-ptr-deref Read in cma_cleanup
IB/mlx5: Block MR WR if UMR is not possible
IB/mlx5: Fix MR re-registration flow to use UMR properly
IB/mlx5: Report and handle ODP support properly
IB/mlx5: Consolidate use_umr checks into single function
RDMA/restrack: Rewrite PID namespace check to be reliable
RDMA/counters: Properly implement PID checks
IB/core: Fix NULL pointer dereference when bind QP to counter
IB/hfi1: Drop stale TID RDMA packets that cause TIDErr
IB/hfi1: Add additional checks when handling TID RDMA WRITE DATA packet
IB/hfi1: Add additional checks when handling TID RDMA READ RESP packet
IB/hfi1: Unsafe PSN checking for TID RDMA READ Resp packet
IB/hfi1: Drop stale TID RDMA packets
RDMA/siw: Fix potential NULL de-ref
...
Currently, ipv6_find_idev returns NULL when ipv6_add_dev fails,
ignoring the specific error value. This results in addrconf_add_dev
returning ENOBUFS in all cases, which is unfortunate in cases such as:
# ip link add dummyX type dummy
# ip link set dummyX mtu 1200 up
# ip addr add 2000::/64 dev dummyX
RTNETLINK answers: No buffer space available
Commit a317a2f19d ("ipv6: fail early when creating netdev named all
or default") introduced error returns in ipv6_add_dev. Before that,
that function would simply return NULL for all failures.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull block fixes from Jens Axboe:
"Here's a set of fixes that should go into this release. This contains:
- Three minor fixes for NVMe.
- Three minor tweaks for the io_uring polling logic.
- Officially mark Song as the MD maintainer, after he's been filling
that role sucessfully for the last 6 months or so"
* tag 'for-linus-20190823' of git://git.kernel.dk/linux-block:
io_uring: add need_resched() check in inner poll loop
md: update MAINTAINERS info
io_uring: don't enter poll loop if we have CQEs pending
nvme: Add quirk for LiteON CL1 devices running FW 22301111
nvme: Fix cntlid validation when not using NVMEoF
nvme-multipath: fix possible I/O hang when paths are updated
io_uring: fix potential hang with polled IO
Pull device mapper fixes from Mike Snitzer:
- Revert a DM bufio change from during the 5.3 merge window now that a
proper fix has been made to the block loopback driver.
- Fix DM kcopyd to wakeup so failed subjobs get completed.
- Various fixes to DM zoned target to address error handling, and other
small tweaks (SPDX license identifiers and fix typos).
- Fix DM integrity range locking race by tracking whether journal has
changed.
- Fix DM dust target to detect reads of badblocks beyond the first 512b
sector (applicable if blocksize is larger than 512b).
- Fix DM persistent-data issue in both the DM btree and DM
space-map-metadata interfaces.
- Fix out of bounds memory access with certain DM table configurations.
* tag 'for-5.3/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm table: fix invalid memory accesses with too high sector number
dm space map metadata: fix missing store of apply_bops() return value
dm btree: fix order of block initialization in btree_split_beneath
dm raid: add missing cleanup in raid_ctr()
dm zoned: fix potential NULL dereference in dmz_do_reclaim()
dm dust: use dust block size for badblocklist index
dm integrity: fix a crash due to BUG_ON in __journal_read_write()
dm zoned: fix a few typos
dm zoned: add SPDX license identifiers
dm zoned: properly handle backing device failure
dm zoned: improve error handling in i/o map code
dm zoned: improve error handling in reclaim
dm kcopyd: always complete failed jobs
Revert "dm bufio: fix deadlock with loop device"
Pull xfs fixes from Darrick Wong:
"Here are a few more bug fixes that trickled in since the last pull.
They've survived the usual xfstests runs and merge cleanly with this
morning's master.
I expect there to be one more pull request tomorrow for the fix to
that quota related inode unlock bug that we were reviewing last night,
but it will continue to soak in the testing machine for several more
hours.
- Fix missing compat ioctl handling for get/setlabel
- Fix missing ioctl pointer sanitization on s390
- Fix a page locking deadlock in the dedupe comparison code
- Fix inadequate locking in reflink code w.r.t. concurrent directio
- Fix broken error detection when breaking layouts"
* tag 'xfs-5.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
fs/xfs: Fix return code of xfs_break_leased_layouts()
xfs: fix reflink source file racing with directio writes
vfs: fix page locking deadlocks when deduping files
xfs: compat_ioctl: use compat_ptr()
xfs: fall back to native ioctls for unhandled compat ones
At the moment we initialise the target *mask* of a virtual IRQ to the
VCPU it belongs to, even though this mask is only defined for GICv2 and
quickly runs out of bits for many GICv3 guests.
This behaviour triggers an UBSAN complaint for more than 32 VCPUs:
------
[ 5659.462377] UBSAN: Undefined behaviour in virt/kvm/arm/vgic/vgic-init.c:223:21
[ 5659.471689] shift exponent 32 is too large for 32-bit type 'unsigned int'
------
Also for GICv3 guests the reporting of TARGET in the "vgic-state" debugfs
dump is wrong, due to this very same problem.
Because there is no requirement to create the VGIC device before the
VCPUs (and QEMU actually does it the other way round), we can't safely
initialise mpidr or targets in kvm_vgic_vcpu_init(). But since we touch
every private IRQ for each VCPU anyway later (in vgic_init()), we can
just move the initialisation of those fields into there, where we
definitely know the VGIC type.
On the way make sure we really have either a VGICv2 or a VGICv3 device,
since the existing code is just checking for "VGICv3 or not", silently
ignoring the uninitialised case.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reported-by: Dave Martin <dave.martin@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Pull modules fixes from Jessica Yu:
"Fix BUG_ON() being triggered in frob_text() due to non-page-aligned
module sections"
* tag 'modules-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
modules: page-align module section allocations only for arches supporting strict module rwx
modules: always page-align module section allocations
Multiple batadv_ogm2_packet can be stored in an skbuff. The functions
batadv_v_ogm_send_to_if() uses batadv_v_ogm_aggr_packet() to check if there
is another additional batadv_ogm2_packet in the skb or not before they
continue processing the packet.
The length for such an OGM2 is BATADV_OGM2_HLEN +
batadv_ogm2_packet->tvlv_len. The check must first check that at least
BATADV_OGM2_HLEN bytes are available before it accesses tvlv_len (which is
part of the header. Otherwise it might try read outside of the currently
available skbuff to get the content of tvlv_len.
Fixes: 9323158ef9 ("batman-adv: OGMv2 - implement originators logic")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Multiple batadv_ogm_packet can be stored in an skbuff. The functions
batadv_iv_ogm_send_to_if()/batadv_iv_ogm_receive() use
batadv_iv_ogm_aggr_packet() to check if there is another additional
batadv_ogm_packet in the skb or not before they continue processing the
packet.
The length for such an OGM is BATADV_OGM_HLEN +
batadv_ogm_packet->tvlv_len. The check must first check that at least
BATADV_OGM_HLEN bytes are available before it accesses tvlv_len (which is
part of the header. Otherwise it might try read outside of the currently
available skbuff to get the content of tvlv_len.
Fixes: ef26157747 ("batman-adv: tvlv - basic infrastructure")
Reported-by: syzbot+355cab184197dbbfa384@syzkaller.appspotmail.com
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Pull ceph fixes from Ilya Dryomov:
"Three important fixes tagged for stable (an indefinite hang, a crash
on an assert and a NULL pointer dereference) plus a small series from
Luis fixing instances of vfree() under spinlock"
* tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client:
libceph: fix PG split vs OSD (re)connect race
ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
ceph: clear page dirty before invalidate page
ceph: fix buffer free while holding i_ceph_lock in fill_inode()
ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
Pull drm fixes from Dave Airlie:
"Live from the laundromat after my washing machine broke down, we have
the 5.3-rc6 fixes. Changelog is in the tag below, but nothing too
noteworthy in here:
rcar-du:
- LVDS dual-link mode fix
mediatek:
- of node refcount fix
- prime buffer import fix
- dma max seg fix
komeda:
- output polling fix
- abfc format fix
- memory-region DT fix
amdgpu:
- bpc display fix
- ioctl memory leak fix
- gfxoff fix
- smu warnings fix
i915:
- HDMI mode readout fix"
* tag 'drm-fixes-2019-08-23' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu/powerplay: silence a warning in smu_v11_0_setup_pptable
drm/amd/display: Calculate bpc based on max_requested_bpc
drm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl
drm/amd/amdgpu: disable MMHUB PG for navi10
drm/amd/powerplay: remove duplicate macro smu_get_uclk_dpm_states in amdgpu_smu.h
drm/amd/powerplay: fix variable type errors in smu_v11_0_setup_pptable
drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible
drm/i915: Fix HW readout for crtc_clock in HDMI mode
drm/mediatek: mtk_drm_drv.c: Add of_node_put() before goto
drm: rcar_lvds: Fix dual link mode operations
drm/mediatek: set DMA max segment size
drm/mediatek: use correct device to import PRIME buffers
drm/omap: ensure we have a valid dma_mask
drm/komeda: Add support for 'memory-region' DT node property
drm/komeda: Adds internal bpp computing for arm afbc only format YU08 YU10
drm/komeda: Initialize and enable output polling on Komeda
Use 'lea' instead of 'add' when adjusting %rsp in CALL_NOSPEC so as to
avoid clobbering flags.
KVM's emulator makes indirect calls into a jump table of sorts, where
the destination of the CALL_NOSPEC is a small blob of code that performs
fast emulation by executing the target instruction with fixed operands.
adcb_al_dl:
0x000339f8 <+0>: adc %dl,%al
0x000339fa <+2>: ret
A major motiviation for doing fast emulation is to leverage the CPU to
handle consumption and manipulation of arithmetic flags, i.e. RFLAGS is
both an input and output to the target of CALL_NOSPEC. Clobbering flags
results in all sorts of incorrect emulation, e.g. Jcc instructions often
take the wrong path. Sans the nops...
asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
0x0003595a <+58>: mov 0xc0(%ebx),%eax
0x00035960 <+64>: mov 0x60(%ebx),%edx
0x00035963 <+67>: mov 0x90(%ebx),%ecx
0x00035969 <+73>: push %edi
0x0003596a <+74>: popf
0x0003596b <+75>: call *%esi
0x000359a0 <+128>: pushf
0x000359a1 <+129>: pop %edi
0x000359a2 <+130>: mov %eax,0xc0(%ebx)
0x000359b1 <+145>: mov %edx,0x60(%ebx)
ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
0x000359a8 <+136>: mov -0x10(%ebp),%eax
0x000359ab <+139>: and $0x8d5,%edi
0x000359b4 <+148>: and $0xfffff72a,%eax
0x000359b9 <+153>: or %eax,%edi
0x000359bd <+157>: mov %edi,0x4(%ebx)
For the most part this has gone unnoticed as emulation of guest code
that can trigger fast emulation is effectively limited to MMIO when
running on modern hardware, and MMIO is rarely, if ever, accessed by
instructions that affect or consume flags.
Breakage is almost instantaneous when running with unrestricted guest
disabled, in which case KVM must emulate all instructions when the guest
has invalid state, e.g. when the guest is in Big Real Mode during early
BIOS.
Fixes: 776b043848fd2 ("x86/retpoline: Add initial retpoline support")
Fixes: 1a29b5b7f3 ("KVM: x86: Make indirect calls in emulator speculation safe")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190822211122.27579-1-sean.j.christopherson@intel.com
If the sector number is too high, dm_table_find_target() should return a
pointer to a zeroed dm_target structure (the caller should test it with
dm_target_is_valid).
However, for some table sizes, the code in dm_table_find_target() that
performs btree lookup will access out of bound memory structures.
Fix this bug by testing the sector number at the beginning of
dm_table_find_target(). Also, add an "inline" keyword to the function
dm_table_get_size() because this is a hot path.
Fixes: 512875bd96 ("dm: table detect io beyond device")
Cc: stable@vger.kernel.org
Reported-by: Zhang Tao <kontais@zoho.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Fixed two -Wunused-but-set-variable warnings:
/arm/linux/display/aosp-4.14-drm-next/drivers/gpu/drm/arm/display/komeda/komeda_kms.c: In function ‘komeda_crtc_normalize_zpos’:
/arm/linux/display/aosp-4.14-drm-next/drivers/gpu/drm/arm/display/komeda/komeda_kms.c:150:26: warning: variable ‘fb’ set but not used [-Wunused-but-set-variable]
struct drm_framebuffer *fb;
^~
/arm/linux/display/aosp-4.14-drm-next/drivers/gpu/drm/arm/display/komeda/komeda_kms.c: In function ‘komeda_kms_check’:
/arm/linux/display/aosp-4.14-drm-next/drivers/gpu/drm/arm/display/komeda/komeda_kms.c:209:25: warning: variable ‘old_crtc_st’ set but not used [-Wunused-but-set-variable]
struct drm_crtc_state *old_crtc_st, *new_crtc_st;
^~~~~~~~~~~
Signed-off-by: james qian wang (Arm Technology China) <james.qian.wang@arm.com>
Reviewed-by: Ayan Kumar Halder <ayan.halder@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190812112322.15990-1-james.qian.wang@arm.com
In the commit ef41b5c924 ("ARM: make kernel oops easier to read"),
- .word 0xe92d0000 >> 10 @ stmfd sp!, {}
+ .word 0xe92d0000 >> 11 @ stmfd sp!, {}
then the shift need to change to 11.
Signed-off-by: Lvqiang Huang <Lvqiang.Huang@unisoc.com>
Signed-off-by: Chunyan Zhang <zhang.lyra@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
A timing hazard exists when an early fork/exec thread begins
exiting and sets its mm pointer to NULL while a separate core
tries to update the section information.
This commit ensures that the mm pointer is not NULL before
setting its section parameters. The arguments provided by
commit 11ce4b33ae ("ARM: 8672/1: mm: remove tasklist locking
from update_sections_early()") are equally valid for not
requiring grabbing the task_lock around this check.
Fixes: 08925c2f12 ("ARM: 8464/1: Update all mm structures with section adjustments")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Cc: Peng Fan <peng.fan@nxp.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The new API for registering a gpio_irq_chip along with a
gpio_chip has a different semantic ordering than the old
API which added the irqchip explicitly after registering
the gpio_chip.
Move the calls to add the gpio_irq_chip *last* in the
function, so that the different hooks setting up OF and
ACPI and machine gpio_chips are called *before* we try
to register the interrupts, preserving the elder semantic
order.
This cropped up in the PL061 driver which used to work
fine with no special ACPI quirks, but started to misbehave
using the new API.
Fixes: e0d8972898 ("gpio: Implement tighter IRQ chip integration")
Cc: Thierry Reding <treding@nvidia.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Reported-by: Wei Xu <xuwei5@hisilicon.com>
Tested-by: Wei Xu <xuwei5@hisilicon.com>
Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20190820080527.11796-1-linus.walleij@linaro.org
qxl has two modes: "native" (used by the drm driver) and "vga" (vga
compatibility mode, typically used for boot display and firmware
framebuffers).
Accessing any vga ioport will switch the qxl device into vga mode.
The qxl driver never does that, but other drivers accessing vga ports
can trigger that too and therefore disturb qxl operation. So aquire
the legacy vga ioports from vgaarb to avoid that.
Reproducer: Boot kvm guest with both qxl and i915 vgpu, with qxl being
first in pci scan order.
v2: Skip this for secondary qxl cards which don't have vga mode in the
first place (Frediano).
Cc: Frediano Ziglio <fziglio@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190805105401.29874-1-kraxel@redhat.com
Benjamin Moody reported to Debian that XFS partially wedges when a chgrp
fails on account of being out of disk quota. I ran his reproducer
script:
# adduser dummy
# adduser dummy plugdev
# dd if=/dev/zero bs=1M count=100 of=test.img
# mkfs.xfs test.img
# mount -t xfs -o gquota test.img /mnt
# mkdir -p /mnt/dummy
# chown -c dummy /mnt/dummy
# xfs_quota -xc 'limit -g bsoft=100k bhard=100k plugdev' /mnt
(and then as user dummy)
$ dd if=/dev/urandom bs=1M count=50 of=/mnt/dummy/foo
$ chgrp plugdev /mnt/dummy/foo
and saw:
================================================
WARNING: lock held when returning to user space!
5.3.0-rc5 #rc5 Tainted: G W
------------------------------------------------
chgrp/47006 is leaving the kernel with locks still held!
1 lock held by chgrp/47006:
#0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs]
...which is clearly caused by xfs_setattr_nonsize failing to unlock the
ILOCK after the xfs_qm_vop_chown_reserve call fails. Add the missing
unlock.
Reported-by: benjamin.moody@gmail.com
Fixes: 253f4911f2 ("xfs: better xfs_trans_alloc interface")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
While I had thought I had fixed this issue in:
commit 342406e4fb ("drm/nouveau/i2c: Disable i2c bus access after
->fini()")
It turns out that while I did fix the error messages I was seeing on my
P50 when trying to access i2c busses with the GPU in runtime suspend, I
accidentally had missed one important detail that was mentioned on the
bug report this commit was supposed to fix: that the CPU would only lock
up when trying to access i2c busses _on connected devices_ _while the
GPU is not in runtime suspend_. Whoops. That definitely explains why I
was not able to get my machine to hang with i2c bus interactions until
now, as plugging my P50 into it's dock with an HDMI monitor connected
allowed me to finally reproduce this locally.
Now that I have managed to reproduce this issue properly, it looks like
the problem is much simpler then it looks. It turns out that some
connected devices, such as MST laptop docks, will actually ACK i2c reads
even if no data was actually read:
[ 275.063043] nouveau 0000:01:00.0: i2c: aux 000a: 1: 0000004c 1
[ 275.063447] nouveau 0000:01:00.0: i2c: aux 000a: 00 01101000 10040000
[ 275.063759] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000001
[ 275.064024] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[ 275.064285] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[ 275.064594] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
Because we don't handle the situation of i2c ack without any data, we
end up entering an infinite loop in nvkm_i2c_aux_i2c_xfer() since the
value of cnt always remains at 0. This finally properly explains how
this could result in a CPU hang like the ones observed in the
aforementioned commit.
So, fix this by retrying transactions if no data is written or received,
and give up and fail the transaction if we continue to not write or
receive any data after 32 retries.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The request coming from Netlink should use the OEM generic handler.
The standard command handler expects payload in bytes/words/dwords
but the actual payload is stored in data if the request is coming from Netlink.
Signed-off-by: Justin Lee <justin.lee1@dell.com>
Reviewed-by: Vijay Khemka <vijaykhemka@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VDSO update for CLOCK_BOOTTIME has a overflow issue as it shifts the
nanoseconds based boot time offset left by the clocksource shift. That
overflows once the boot time offset becomes large enough. As a consequence
CLOCK_BOOTTIME in the VDSO becomes a random number causing applications to
misbehave.
Fix it by storing a timespec64 representation of the offset when boot time
is adjusted and add that to the MONOTONIC base time value in the vdso data
page. Using the timespec64 representation avoids a 64bit division in the
update code.
Fixes: 44f57d788e ("timekeeping: Provide a generic update_vsyscall() implementation")
Reported-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908221257580.1983@nanos.tec.linutronix.de
Kalle Valo says:
====================
wireless-drivers fixes for 5.3
Third set of fixes for 5.3, and most likely the last one. The rt2x00
regression has been reported multiple times, others are of lower
priority.
mt76
* fix hang on resume on certain machines
rt2x00
* fix AP mode regression related to encryption
iwlwifi
* avoid unnecessary error messages due to multicast frames when not
associated
* fix configuration for ax201 devices
* fix recognition of QuZ devices
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If qed_mcp_send_drv_version() fails, no cleanup is executed, leading to
memory leaks. To fix this issue, introduce the label 'err4' to perform the
cleanup work before returning the error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Acked-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The trap action should be copying the frame to CPU and
dropping it for forwarding, but current setting was just
copying frame to CPU.
Fixes: b596229448 ("net: mscc: ocelot: Add support for tcam")
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Allan W. Nielsen <allan.nielsen@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unfortunately, my build fix for when time travel mode isn't
enabled broke time travel mode, because I forgot that we need
to use the timer time after the timer has been marked disabled,
and thus need to leave the time stored instead of zeroing it.
Fix that by splitting the inline into two, so we can call only
the _mode() one in the relevant code path.
Fixes: b482e48d29 ("um: fix build without CONFIG_UML_TIME_TRAVEL_SUPPORT")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The outer poll loop checks for whether we need to reschedule, and
returns to userspace if we do. However, it's possible to get stuck
in the inner loop as well, if the CPU we are running on needs to
reschedule to finish the IO work.
Add the need_resched() check in the inner loop as well. This fixes
a potential hang if the kernel is configured with
CONFIG_PREEMPT_VOLUNTARY=y.
Reported-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull PCI fixes from Bjorn Helgaas:
- Reset both NVIDIA GPU and HDA in ThinkPad P50 quirk, which was broken
by another quirk that enabled the HDA device (Lyude Paul)
- Fix pciebus-howto.rst documentation filename typo (Bjorn Helgaas)
* tag 'pci-v5.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Documentation PCI: Fix pciebus-howto.rst filename typo
PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround
Dump WQE shall not include Ethernet segment. Define mlx5e_dump_wqe to be
used for "Dump WQEs" instead of sharing it with the general mlx5e_tx_wqe
layout.
Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
For TLS WQEs, metadata info did not include num_bytes. Due to this issue,
tx_tls_dump_bytes counter did not increment.
Modify tx_fill_wi() to fill num bytes. When it is called for non-traffic
WQE, zero is expected.
Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When fw fatal error occurs, poll health() first detects and reports on a
fw error. Afterwards, it detects and reports on the fw fatal error
itself.
That can cause a long delay in fw fatal error handling which waits in a
queue for the fw error handling to be finished. The fw error handle will
try asking for fw core dump command while fw in fatal state may not
respond and driver will wait for command timeout.
Changing the flow to detect and handle first fw fatal errors and only if
no fatal error detected look for a fw error to handle.
Fixes: d1bf0e2cc4 ("net/mlx5: Report devlink health on FW issues")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Crdump repeats itself every chunk of 256bytes.
That is due to bug of missing progressing offset while copying the data
from buffer to devlink_fmsg.
Fixes: 9b1f298236 ("net/mlx5: Add support for FW fatal reporter dump")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In commit 6096d91af0 ("dm space map metadata: fix occasional leak
of a metadata block on resize"), we refactor the commit logic to a new
function 'apply_bops'. But when that logic was replaced in out() the
return value was not stored. This may lead out() returning a wrong
value to the caller.
Fixes: 6096d91af0 ("dm space map metadata: fix occasional leak of a metadata block on resize")
Cc: stable@vger.kernel.org
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
When btree_split_beneath() splits a node to two new children, it will
allocate two blocks: left and right. If right block's allocation
failed, the left block will be unlocked and marked dirty. If this
happened, the left block'ss content is zero, because it wasn't
initialized with the btree struct before the attempot to allocate the
right block. Upon return, when flushing the left block to disk, the
validator will fail when check this block. Then a BUG_ON is raised.
Fix this by completely initializing the left block before allocating and
initializing the right block.
Fixes: 4dcb8b57df ("dm btree: fix leak of bufio-backed block in btree_split_beneath error path")
Cc: stable@vger.kernel.org
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Pull more fallthrough fixes from Gustavo A. R. Silva:
"Fix fall-through warnings on arm and mips for multiple configurations"
* tag 'Wimplicit-fallthrough-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
video: fbdev: acornfb: Mark expected switch fall-through
scsi: libsas: sas_discover: Mark expected switch fall-through
MIPS: Octeon: Mark expected switch fall-through
power: supply: ab8500_charger: Mark expected switch fall-through
watchdog: wdt285: Mark expected switch fall-through
mtd: sa1100: Mark expected switch fall-through
drm/sun4i: tcon: Mark expected switch fall-through
drm/sun4i: sun6i_mipi_dsi: Mark expected switch fall-through
ARM: riscpc: Mark expected switch fall-through
dmaengine: fsldma: Mark expected switch fall-through
Pull chrome platform fix from Benson Leung:
"Fix a kernel crash during suspend/resume of cros_ec_ishtp"
* tag 'tag-chrome-platform-fixes-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
platform/chrome: cros_ec_ishtp: fix crash during suspend
Pull AFS fixes from David Howells:
- Fix a cell record leak due to the default error not being cleared.
- Fix an oops in tracepoint due to a pointer that may contain an error.
- Fix the ACL storage op for YFS where the wrong op definition is being
used. By luck, this only actually affects the information appearing
in traces.
* tag 'afs-fixes-20190822' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: use correct afs_call_type in yfs_fs_store_opaque_acl2
afs: Fix possible oops in afs_lookup trace event
afs: Fix leak in afs_lookup_cell_rcu()
If the number of dirty pages to be written back is large,
then writeback_inodes_sb will block waiting for a long time,
causing hung task detection alarm. Therefore, we should limit
the maximum number of pages written back this time, which let
the budget be completed faster. The remaining dirty pages
tend to rely on the writeback mechanism to complete the
synchronization.
Fixes: b6e51316da ("writeback: separate starting of sync vs opportunistic writeback")
Signed-off-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: Richard Weinberger <richard@nod.at>
Currently on a freshly mounted UBIFS, c->min_log_bytes is 0.
This can lead to a log overrun and make commits fail.
Recent kernels will report the following assert:
UBIFS assert failed: c->lhead_lnum != c->ltail_lnum, in fs/ubifs/log.c:412
c->min_log_bytes can have two states, 0 and c->leb_size.
It controls how much bytes of the log area are reserved for non-bud
nodes such as commit nodes.
After a commit it has to be set to c->leb_size such that we have always
enough space for a commit. While a commit runs it can be 0 to make the
remaining bytes of the log available to writers.
Having it set to 0 right after mount is wrong since no space for commits
is reserved.
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Reported-and-tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
We unlock after orphan_delete(), so no need to unlock
in the function too.
Reported-by: Han Xu <han.xu@nxp.com>
Fixes: 8009ce956c ("ubifs: Don't leak orphans on memory during commit")
Signed-off-by: Richard Weinberger <richard@nod.at>
Linux kernel tolerates C++ style comments these days. Actually, the
SPDX License tags for .c files start with //.
On the other hand, uapi headers are written in more strict C, where
the C++ comment style is forbidden.
I simply dropped these lines instead of fixing the comment style.
This code has been always commented out since it was added around
Linux 2.4.9 (i.e. commented out for more than 17 years).
'Maybe later...' will never happen.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
All user level and most in-kernel applications submit WQEs
where the SG list entries are all of a single type.
iSER in particular, however, will send us WQEs with mixed SG
types: sge[0] = kernel buffer, sge[1] = PBL region.
Check and set is_kva on each SG entry individually instead of
assuming the first SGE type carries through to the last.
This fixes iSER over siw.
Fixes: b9be6f18cf ("rdma/siw: transmit path")
Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Tested-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190822150741.21871-1-bmt@zurich.ibm.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
It seems that 'yfs_RXYFSStoreOpaqueACL2' should be use in
yfs_fs_store_opaque_acl2().
Fixes: f5e4546347 ("afs: Implement YFS ACL setting")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
The afs_lookup trace event can cause the following:
[ 216.576777] BUG: kernel NULL pointer dereference, address: 000000000000023b
[ 216.576803] #PF: supervisor read access in kernel mode
[ 216.576813] #PF: error_code(0x0000) - not-present page
...
[ 216.576913] RIP: 0010:trace_event_raw_event_afs_lookup+0x9e/0x1c0 [kafs]
If the inode from afs_do_lookup() is an error other than ENOENT, or if it
is ENOENT and afs_try_auto_mntpt() returns an error, the trace event will
try to dereference the error pointer as a valid pointer.
Use IS_ERR_OR_NULL to only pass a valid pointer for the trace, or NULL.
Ideally the trace would include the error value, but for now just avoid
the oops.
Fixes: 80548b0399 ("afs: Add more tracepoints")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Fix a leak on the cell refcount in afs_lookup_cell_rcu() due to
non-clearance of the default error in the case a NULL cell name is passed
and the workstation default cell is used.
Also put a bit at the end to make sure we don't leak a cell ref if we're
going to be returning an error.
This leak results in an assertion like the following when the kafs module is
unloaded:
AFS: Assertion failed
2 == 1 is false
0x2 == 0x1 is false
------------[ cut here ]------------
kernel BUG at fs/afs/cell.c:770!
...
RIP: 0010:afs_manage_cells+0x220/0x42f [kafs]
...
process_one_work+0x4c2/0x82c
? pool_mayday_timeout+0x1e1/0x1e1
? do_raw_spin_lock+0x134/0x175
worker_thread+0x336/0x4a6
? rescuer_thread+0x4af/0x4af
kthread+0x1de/0x1ee
? kthread_park+0xd4/0xd4
ret_from_fork+0x24/0x30
Fixes: 989782dcdc ("afs: Overhaul cell database management")
Signed-off-by: David Howells <dhowells@redhat.com>
If after an MMIO exit to userspace a VCPU is immediately run with an
immediate_exit request, such as when a signal is delivered or an MMIO
emulation completion is needed, then the VCPU completes the MMIO
emulation and immediately returns to userspace. As the exit_reason
does not get changed from KVM_EXIT_MMIO in these cases we have to
be careful not to complete the MMIO emulation again, when the VCPU is
eventually run again, because the emulation does an instruction skip
(and doing too many skips would be a waste of guest code :-) We need
to use additional VCPU state to track if the emulation is complete.
As luck would have it, we already have 'mmio_needed', which even
appears to be used in this way by other architectures already.
Fixes: 0d640732db ("arm64: KVM: Skip MMIO insn after emulation")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
HS200 is not implemented in the driver, but the controller claims it
through caps. Remove it via a quirk, to make sure the mmc core do not try
to enable HS200, as it causes the eMMC initialization to fail.
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: bb5f8ea4d5 ("mmc: sdhci-of-at91: introduce driver for the Atmel SDMMC")
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
We can't rely on ->peer_features in calc_target() because it may be
called both when the OSD session is established and open and when it's
not. ->peer_features is not valid unless the OSD session is open. If
this happens on a PG split (pg_num increase), that could mean we don't
resend a request that should have been resent, hanging the client
indefinitely.
In userspace this was fixed by looking at require_osd_release and
get_xinfo[osd].features fields of the osdmap. However these fields
belong to the OSD section of the osdmap, which the kernel doesn't
decode (only the client section is decoded).
Instead, let's drop this feature check. It effectively checks for
luminous, so only pre-luminous OSDs would be affected in that on a PG
split the kernel might resend a request that should not have been
resent. Duplicates can occur in other scenarios, so both sides should
already be prepared for them: see dup/replay logic on the OSD side and
retry_attempt check on the client side.
Cc: stable@vger.kernel.org
Fixes: 7de030d6b1 ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT")
Link: https://tracker.ceph.com/issues/41162
Reported-by: Jerry Lee <leisurelysw24@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Jerry Lee <leisurelysw24@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
When ceph_mdsc_do_request returns an error, we can't assume that the
filelock_reply pointer will be set. Only try to fetch fields out of
the r_reply_info when it returns success.
Cc: stable@vger.kernel.org
Reported-by: Hector Martin <hector@marcansoft.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Calling ceph_buffer_put() in fill_inode() may result in freeing the
i_xattrs.blob buffer while holding the i_ceph_lock. This can be fixed by
postponing the call until later, when the lock is released.
The following backtrace was triggered by fstests generic/070.
BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
in_atomic(): 1, irqs_disabled(): 0, pid: 3852, name: kworker/0:4
6 locks held by kworker/0:4/3852:
#0: 000000004270f6bb ((wq_completion)ceph-msgr){+.+.}, at: process_one_work+0x1b8/0x5f0
#1: 00000000eb420803 ((work_completion)(&(&con->work)->work)){+.+.}, at: process_one_work+0x1b8/0x5f0
#2: 00000000be1c53a4 (&s->s_mutex){+.+.}, at: dispatch+0x288/0x1476
#3: 00000000559cb958 (&mdsc->snap_rwsem){++++}, at: dispatch+0x2eb/0x1476
#4: 000000000d5ebbae (&req->r_fill_mutex){+.+.}, at: dispatch+0x2fc/0x1476
#5: 00000000a83d0514 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: fill_inode.isra.0+0xf8/0xf70
CPU: 0 PID: 3852 Comm: kworker/0:4 Not tainted 5.2.0+ #441
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
Workqueue: ceph-msgr ceph_con_workfn
Call Trace:
dump_stack+0x67/0x90
___might_sleep.cold+0x9f/0xb1
vfree+0x4b/0x60
ceph_buffer_release+0x1b/0x60
fill_inode.isra.0+0xa9b/0xf70
ceph_fill_trace+0x13b/0xc70
? dispatch+0x2eb/0x1476
dispatch+0x320/0x1476
? __mutex_unlock_slowpath+0x4d/0x2a0
ceph_con_workfn+0xc97/0x2ec0
? process_one_work+0x1b8/0x5f0
process_one_work+0x244/0x5f0
worker_thread+0x4d/0x3e0
kthread+0x105/0x140
? process_one_work+0x5f0/0x5f0
? kthread_park+0x90/0x90
ret_from_fork+0x3a/0x50
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Calling ceph_buffer_put() in __ceph_build_xattrs_blob() may result in
freeing the i_xattrs.blob buffer while holding the i_ceph_lock. This can
be fixed by having this function returning the old blob buffer and have
the callers of this function freeing it when the lock is released.
The following backtrace was triggered by fstests generic/117.
BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
in_atomic(): 1, irqs_disabled(): 0, pid: 649, name: fsstress
4 locks held by fsstress/649:
#0: 00000000a7478e7e (&type->s_umount_key#19){++++}, at: iterate_supers+0x77/0xf0
#1: 00000000f8de1423 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: ceph_check_caps+0x7b/0xc60
#2: 00000000562f2b27 (&s->s_mutex){+.+.}, at: ceph_check_caps+0x3bd/0xc60
#3: 00000000f83ce16a (&mdsc->snap_rwsem){++++}, at: ceph_check_caps+0x3ed/0xc60
CPU: 1 PID: 649 Comm: fsstress Not tainted 5.2.0+ #439
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x67/0x90
___might_sleep.cold+0x9f/0xb1
vfree+0x4b/0x60
ceph_buffer_release+0x1b/0x60
__ceph_build_xattrs_blob+0x12b/0x170
__send_cap+0x302/0x540
? __lock_acquire+0x23c/0x1e40
? __mark_caps_flushing+0x15c/0x280
? _raw_spin_unlock+0x24/0x30
ceph_check_caps+0x5f0/0xc60
ceph_flush_dirty_caps+0x7c/0x150
? __ia32_sys_fdatasync+0x20/0x20
ceph_sync_fs+0x5a/0x130
iterate_supers+0x8f/0xf0
ksys_sync+0x4f/0xb0
__ia32_sys_sync+0xa/0x10
do_syscall_64+0x50/0x1c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fc6409ab617
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the
i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be
fixed by postponing the call until later, when the lock is released.
The following backtrace was triggered by fstests generic/117.
BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress
3 locks held by fsstress/650:
#0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50
#1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0
#2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810
CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x67/0x90
___might_sleep.cold+0x9f/0xb1
vfree+0x4b/0x60
ceph_buffer_release+0x1b/0x60
__ceph_setxattr+0x2b4/0x810
__vfs_setxattr+0x66/0x80
__vfs_setxattr_noperm+0x59/0xf0
vfs_setxattr+0x81/0xa0
setxattr+0x115/0x230
? filename_lookup+0xc9/0x140
? rcu_read_lock_sched_held+0x74/0x80
? rcu_sync_lockdep_assert+0x2e/0x60
? __sb_start_write+0x142/0x1a0
? mnt_want_write+0x20/0x50
path_setxattr+0xba/0xd0
__x64_sys_lsetxattr+0x24/0x30
do_syscall_64+0x50/0x1c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7ff23514359a
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The bmControls (for UAC1) or bmMixerControls (for UAC2/3) bitmap has a
variable size depending on both input and output pins. Its size is to
fit with input * output bits. The problem is that the input size
can't be determined simply from the unit descriptor itself but it
needs to parse the whole connected sources. Although the
uac_mixer_unit_get_channels() tries to check some possible overflow of
this bitmap, it's incomplete due to the lack of the evaluation of
input pins.
For covering possible overflows, this patch adds the bitmap overflow
check in the loop of input pins in parse_audio_mixer_unit().
Fixes: 0bfe5e434e ("ALSA: usb-audio: Check mixer unit descriptors more strictly")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
If a CCP is unconfigured (e.g. there are no available queues) then
there will be no data structures allocated for the device. Thus, we
must check for validity of a pointer before trying to access structure
members.
Fixes: 720419f018 ("crypto: ccp - Introduce the AMD Secure Processor device")
Cc: <stable@vger.kernel.org>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
I have been reviewing patches for md in the past few months. Mark me
as the MD maintainer, as I have effectively been filling that role.
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is a copy and paste error so we have "rx" where "tx" was intended
in the priv->tx[] array.
Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 93a714d6b5 ("multicast: Extend ip address command to enable
multicast group join/leave on") we added a new flag IFA_F_MCAUTOJOIN
to make user able to add multicast address on ethernet interface.
This works for IPv4, but not for IPv6. See the inet6_addr_add code.
static int inet6_addr_add()
{
...
if (cfg->ifa_flags & IFA_F_MCAUTOJOIN) {
ipv6_mc_config(net->ipv6.mc_autojoin_sk, true...)
}
ifp = ipv6_add_addr(idev, cfg, true, extack); <- always fail with maddr
if (!IS_ERR(ifp)) {
...
} else if (cfg->ifa_flags & IFA_F_MCAUTOJOIN) {
ipv6_mc_config(net->ipv6.mc_autojoin_sk, false...)
}
}
But in ipv6_add_addr() it will check the address type and reject multicast
address directly. So this feature is never worked for IPv6.
We should not remove the multicast address check totally in ipv6_add_addr(),
but could accept multicast address only when IFA_F_MCAUTOJOIN flag supplied.
v2: update commit description
Fixes: 93a714d6b5 ("multicast: Extend ip address command to enable multicast group join/leave on")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SF2 binding does not specify that the CPU port should have
properties mandatory for successfully instantiating a PHYLINK object. As
such, there will be missing properties (including fixed-link) and when
attempting to validate and later configure link modes, we will have an
incorrect set of parameters (interface, speed, duplex).
Simply prevent the CPU port from being configured through PHYLINK since
bcm_sf2_imp_setup() takes care of that already.
Fixes: 0e27921816 ("net: dsa: Use PHYLINK for the CPU/DSA ports")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[Why]
The only place where state->max_bpc is updated on the connector is
at the start of atomic check during drm_atomic_connector_check. It
isn't updated when adding the connectors to the atomic state after
the fact. It also doesn't necessarily reflect the right value when
called in amdgpu during mode validation outside of atomic check.
This can cause the wrong bpc to be used even if the max_requested_bpc
is the correct value.
[How]
Don't rely on state->max_bpc reflecting the real bpc value and just
do the min(...) based on display info bpc and max_requested_bpc.
Fixes: 01933ba42d ("drm/amd/display: Use current connector state if NULL when checking bpc")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Error out if the AMDGPU_CS ioctl is called with multiple SYNCOBJ_OUT and/or
TIMELINE_SIGNAL chunks, since otherwise the last chunk wins while the
allocated array as well as the reference counts of sync objects are leaked.
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix size type errors, from uint32_t to uint16_t.
it will cause only initializes the highest 16 bits in
smu_get_atom_data_table function.
bug report:
This fixes the following static checker warning.
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:390 smu_v11_0_setup_pptable()
warn: passing casted pointer '&size' to 'smu_get_atom_data_table()' 32 vs 16.
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to set certain power gating flags after we determine
if the firmware version is sufficient to support gfxoff.
Previously we set the pg flags in early init, but we later
we might have disabled gfxoff if the firmware versions didn't
support it. Move adding the additional pg flags after we
determine whether or not to support gfxoff.
Fixes: 005440066f ("drm/amdgpu: enable gfxoff again on raven series (v2)")
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable@vger.kernel.org
In certain cases when the probe function fails the error path calls
cpsw_remove_dt() before calling platform_set_drvdata(). This is an
issue as cpsw_remove_dt() uses platform_get_drvdata() to retrieve the
cpsw_common data and leds to a NULL pointer exception. This patches
fixes it by calling platform_set_drvdata() earlier in the probe.
Fixes: 83a8471ba2 ("net: ethernet: ti: cpsw: refactor probe to group common hw initialization")
Reported-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Wunderlich says:
====================
Here is a batman-adv bugfix:
- fix uninit-value in batadv_netlink_get_ifindex(), by Eric Dumazet
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace 'decided' with 'decide' so that comment would be
/* To decide when the network namespace should be freed. */
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Just three fixes:
* extended key ID key installation
* regulatory processing
* possible memory leak in an error path
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull KVM fixes from Paolo Bonzini:
"A couple bugfixes, and mostly selftests changes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
selftests/kvm: make platform_info_test pass on AMD
Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"
selftests: kvm: fix state save/load on processors without XSAVE
selftests: kvm: fix vmx_set_nested_state_test
selftests: kvm: provide common function to enable eVMCS
selftests: kvm: do not try running the VM in vmx_set_nested_state_test
KVM: x86: svm: remove redundant assignment of var new_entry
MAINTAINERS: add KVM x86 reviewers
MAINTAINERS: change list for KVM/s390
kvm: x86: skip populating logical dest map if apic is not sw enabled
I forgot to release the allocated object at the early error path in
line6_init_pcm(). For addressing it, slightly shuffle the code so
that the PCM destructor (pcm->private_free) is assigned properly
before all error paths.
Fixes: 3450121997 ("ALSA: line6: Fix write on zero-sized buffer")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
test_msr_platform_info_disabled() generates EXIT_SHUTDOWN but VMCB state
is undefined after that so an attempt to launch this guest again from
test_msr_platform_info_enabled() fails. Reorder the tests to make test
pass.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull nfsd fixes from Bruce Fields:
"Fix nfsd bugs: three in the new nfsd/clients/ code, one in the reply
cache containerization"
* tag 'nfsd-5.3-1' of git://linux-nfs.org/~bfields/linux:
nfsd4: Fix kernel crash when reading proc file reply_cache_stats
nfsd: initialize i_private before d_add
nfsd: use i_wrlock instead of rcu for nfsdfs i_private
nfsd: fix dentry leak upon mkdir failure.
After _gadget_stop_activity is executed, we can consider the hardware
operation for gadget has finished, and the udc can be stopped and enter
low power mode. So, any later hardware operations (from usb_ep_ops APIs
or usb_gadget_ops APIs) should be considered invalid, any deinitializatons
has been covered at _gadget_stop_activity.
I meet this problem when I plug out usb cable from PC using mass_storage
gadget, my callstack like: vbus interrupt->.vbus_session->
composite_disconnect ->pm_runtime_put_sync(&_gadget->dev),
the composite_disconnect will call fsg_disable, but fsg_disable calls
usb_ep_disable using async way, there are register accesses for
usb_ep_disable. So sometimes, I get system hang due to visit register
without clock, sometimes not.
The Linux Kernel USB maintainer Alan Stern suggests this kinds of solution.
See: http://marc.info/?l=linux-usb&m=138541769810983&w=2.
Cc: <stable@vger.kernel.org> #v4.9+
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Link: https://lore.kernel.org/r/20190820020503.27080-2-peter.chen@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If rs_prepare_reshape() fails, no cleanup is executed, leading to
leak of the raid_set structure allocated at the beginning of
raid_ctr(). To fix this issue, go to the label 'bad' if the error
occurs.
Fixes: 11e4723206 ("dm raid: stop keeping raid set frozen altogether")
Cc: stable@vger.kernel.org
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This function is supposed to return error pointers so it matches the
dmz_get_rnd_zone_for_reclaim() function. The current code could lead to
a NULL dereference in dmz_do_reclaim()
Fixes: b234c6d7a7 ("dm zoned: improve error handling in reclaim")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Change the "frontend" dust_remove_block, dust_add_block, and
dust_query_block functions to store the "dust block number", instead
of the sector number corresponding to the "dust block number".
For the "backend" functions dust_map_read and dust_map_write,
right-shift by sect_per_block_shift. This fixes the inability to
emulate failure beyond the first sector of each "dust block" (for
devices with a "dust block size" larger than 512 bytes).
Fixes: e4f3fabd67 ("dm: add dust target")
Cc: stable@vger.kernel.org
Signed-off-by: Bryan Gurney <bgurney@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
When ./test_xdp_vlan_mode_generic.sh runs it complains that it can't
find file test_xdp_vlan.sh.
# selftests: bpf: test_xdp_vlan_mode_generic.sh
# ./test_xdp_vlan_mode_generic.sh: line 9: ./test_xdp_vlan.sh: No such
file or directory
Rework so that test_xdp_vlan.sh gets installed, added to the variable
TEST_PROGS_EXTENDED.
Fixes: d35661fcf9 ("selftests/bpf: add wrapper scripts for test_xdp_vlan.sh")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
When running test_kmod.sh the following shows up
# sysctl cannot stat /proc/sys/net/core/bpf_jit_enable No such file or directory
cannot: stat_/proc/sys/net/core/bpf_jit_enable #
# sysctl cannot stat /proc/sys/net/core/bpf_jit_harden No such file or directory
cannot: stat_/proc/sys/net/core/bpf_jit_harden #
Rework to enable CONFIG_BPF_JIT to solve "No such file or directory"
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
test_btf_dump fails when run with O=, because it needs to access source
files and assumes they live in ./progs/, which is not the case in this
scenario.
Fix by instructing kselftest to copy btf_dump_test_case_*.c files to the
test directory. Since kselftest does not preserve directory structure,
adjust the test to look in ./progs/ and then in ./.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
test_cgroup_storage fails on s390 with an assertion failure: packets are
dropped when they shouldn't. The problem is that BPF_DW packet count is
accessed as BPF_W with an offset of 0, which is not correct on
big-endian machines.
Since the point of this test is not to verify narrow loads/stores,
simply use BPF_DW when working with packet counts.
Fixes: 68cfa3ac6b ("selftests/bpf: add a cgroup storage test")
Fixes: 919646d2a3 ("selftests/bpf: extend the storage test to test per-cpu cgroup storage")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The WRITE_PROTECT bit is always in a "protected mode" on Tegra and
WP-GPIO state need to be used instead. In a case of the GPIO absence,
write-enable should be assumed. External SD is writable once again as
a result of this patch because the offending commit changed behaviour for
the case of a missing WP-GPIO to fall back to WRITE_PROTECT bit-checking,
which is incorrect for Tegra.
Cc: stable@vger.kernel.org # v5.1+
Fixes: e8391453e2 ("mmc: sdhci-tegra: drop ->get_ro() implementation")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
We should keep the case of "#define debug_align(X) (X)" for all arches
without CONFIG_HAS_STRICT_MODULE_RWX ability, which would save people, who
are sensitive to system size, a lot of memory when using modules,
especially for embedded systems. This is also the intention of the
original #ifdef... statement and still valid for now.
Note that this still keeps the effect of the fix of the following commit,
38f054d549 ("modules: always page-align module section allocations"),
since when CONFIG_ARCH_HAS_STRICT_MODULE_RWX is enabled, module pages are
aligned.
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
This reverts commit 96cce12ff6 ("cfg80211: fix processing world
regdomain when non modular").
Re-triggering a reg_process_hint with the last request on all events,
can make the regulatory domain fail in case of multiple WiFi modules. On
slower boards (espacially with mdev), enumeration of the WiFi modules
can end up in an intersected regulatory domain, and user cannot set it
with 'iw reg set' anymore.
This is happening, because:
- 1st module enumerates, queues up a regulatory request
- request gets processed by __reg_process_hint_driver():
- checks if previous was set by CORE -> yes
- checks if regulator domain changed -> yes, from '00' to e.g. 'US'
-> sends request to the 'crda'
- 2nd module enumerates, queues up a regulator request (which triggers
the reg_todo() work)
- reg_todo() -> reg_process_pending_hints() sees, that the last request
is not processed yet, so it tries to process it again.
__reg_process_hint driver() will run again, and:
- checks if the last request's initiator was the core -> no, it was
the driver (1st WiFi module)
- checks, if the previous initiator was the driver -> yes
- checks if the regulator domain changed -> yes, it was '00' (set by
core, and crda call did not return yet), and should be changed to 'US'
------> __reg_process_hint_driver calls an intersect
Besides, the reg_process_hint call with the last request is meaningless
since the crda call has a timeout work. If that timeout expires, the
first module's request will lost.
Cc: stable@vger.kernel.org
Fixes: 96cce12ff6 ("cfg80211: fix processing world regdomain when non modular")
Signed-off-by: Robert Hodaszi <robert.hodaszi@digi.com>
Link: https://lore.kernel.org/r/20190614131600.GA13897@a1-hr
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This reverts commit 4e103134b8.
Alex Williamson reported regressions with device assignment with
this patch. Even though the bug is probably elsewhere and still
latent, this is needed to fix the regression.
Fixes: 4e103134b8 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
Reported-by: Alex Willamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
state_test and smm_test are failing on older processors that do not
have xcr0. This is because on those processor KVM does provide
support for KVM_GET/SET_XSAVE (to avoid having to rely on the older
KVM_GET/SET_FPU) but not for KVM_GET/SET_XCRS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix two shortcomings in the Extended Key ID API:
1) Allow the userspace to install pairwise keys using keyid 1 without
NL80211_KEY_NO_TX set. This allows the userspace to install and
activate pairwise keys with keyid 1 in the same way as for keyid 0,
simplifying the API usage for e.g. FILS and FT key installs.
2) IEEE 802.11 - 2016 restricts Extended Key ID usage to CCMP/GCMP
ciphers in IEEE 802.11 - 2016 "9.4.2.25.4 RSN capabilities".
Enforce that when installing a key.
Cc: stable@vger.kernel.org # 5.2
Fixes: 6cdd3979a2 ("nl80211/cfg80211: Extended Key ID support")
Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
Link: https://lore.kernel.org/r/20190805123400.51567-1-alexander@wetzel-home.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The quirk function snd_emuusb_set_samplerate() has a NULL check for
the mixer element, but this is useless in the current code. It used
to be a check against mixer->id_elems[unitid] but it was changed later
to the value after mixer_eleme_list_to_info() which is always non-NULL
due to the container_of() usage.
This patch fixes the check before the conversion.
While we're at it, correct a typo in the comment in the function,
too.
Fixes: 8c558076c7 ("ALSA: usb-audio: Clean up mixer element list traverse")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: rpc_defconfig arm):
drivers/video/fbdev/acornfb.c: In function ‘acornfb_parse_dram’:
drivers/video/fbdev/acornfb.c:860:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
size *= 1024;
~~~~~^~~~~~~
drivers/video/fbdev/acornfb.c:861:3: note: here
case 'K':
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: mtx1_defconfig mips):
drivers/scsi/libsas/sas_discover.c: In function ‘sas_discover_domain’:
./include/linux/printk.h:309:2: warning: this statement may fall through [-Wimplicit-fallthrough=]
printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/libsas/sas_discover.c:459:3: note: in expansion of macro ‘pr_notice’
pr_notice("ATA device seen but CONFIG_SCSI_SAS_ATA=N so cannot attach\n");
^~~~~~~~~
drivers/scsi/libsas/sas_discover.c:462:2: note: here
default:
^~~~~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: cavium_octeon_defconfig mips):
arch/mips/include/asm/octeon/cvmx-sli-defs.h:47:6: warning: this statement
may fall through [-Wimplicit-fallthrough=]
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: allmodconfig arm):
drivers/power/supply/ab8500_charger.c: In function ‘ab8500_charger_max_usb_curr’:
drivers/power/supply/ab8500_charger.c:738:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (di->vbus_detected) {
^
drivers/power/supply/ab8500_charger.c:745:2: note: here
case USB_STAT_HM_IDGND:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: footbridge_defconfig arm):
drivers/watchdog/wdt285.c: In function ‘watchdog_ioctl’:
drivers/watchdog/wdt285.c:170:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
watchdog_ping();
^~~~~~~~~~~~~~~
drivers/watchdog/wdt285.c:172:2: note: here
case WDIOC_GETTIMEOUT:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: assabet_defconfig arm):
drivers/mtd/maps/sa1100-flash.c: In function ‘sa1100_probe_subdev’:
drivers/mtd/maps/sa1100-flash.c:82:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
printk(KERN_WARNING "SA1100 flash: unknown base address "
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"0x%08lx, assuming CS0\n", phys);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/mtd/maps/sa1100-flash.c:85:2: note: here
case SA1100_CS0_PHYS:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: sunxi_defconfig arm):
drivers/gpu/drm/sun4i/sun4i_tcon.c: In function ‘sun4i_tcon0_mode_set_dithering’:
drivers/gpu/drm/sun4i/sun4i_tcon.c:318:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
val |= SUN4I_TCON0_FRM_CTL_MODE_B;
drivers/gpu/drm/sun4i/sun4i_tcon.c:319:2: note: here
case MEDIA_BUS_FMT_RGB666_1X18:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: multi_v7_defconfig arm):
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c: In function ‘sun6i_dsi_transfer’:
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:993:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (msg->rx_len == 1) {
^
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:998:2: note: here
default:
^~~~~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: rpc_defconfig arm):
arch/arm/mach-rpc/riscpc.c: In function ‘parse_tag_acorn’:
arch/arm/mach-rpc/riscpc.c:48:13: warning: this statement may fall through [-Wimplicit-fallthrough=]
vram_size += PAGE_SIZE * 256;
~~~~~~~~~~^~~~~~~~~~~~~~~~~~
arch/arm/mach-rpc/riscpc.c:49:2: note: here
case 256:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warnings (Building: powerpc-ppa8548_defconfig powerpc):
drivers/dma/fsldma.c: In function ‘fsl_dma_chan_probe’:
drivers/dma/fsldma.c:1165:26: warning: this statement may fall through [-Wimplicit-fallthrough=]
chan->toggle_ext_pause = fsl_chan_toggle_ext_pause;
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/dma/fsldma.c:1166:2: note: here
case FSL_DMA_IP_83XX:
^~~~
Reported-by: kbuild test robot <lkp@intel.com>
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
The new dma_alloc_contiguous hides if we allocate CMA or regular
pages, and thus fails to retry a ZONE_NORMAL allocation if the CMA
allocation succeeds but isn't addressable. That means we either fail
outright or dip into a small zone that might not succeed either.
Thanks to Hillf Danton for debugging this issue.
Fixes: b1d2dc009d ("dma-contiguous: add dma_{alloc,free}_contiguous() helpers")
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
in ip_mc_inc_group, memory allocation flag, not mcast mode, is expected
by __ip_mc_inc_group
similar issue in __ip_mc_join_group, both mcase mode and gfp_t are needed
here, so use ____ip_mc_inc_group(...)
Fixes: 9fb20801da ("net: Fix ip_mc_{dec,inc}_group allocation context")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NCSI spec indicates that if the data does not end on a 32 bit
boundary, one to three padding bytes equal to 0x00 shall be present to
align the checksum field to a 32-bit boundary.
Signed-off-by: Terry S. Duncan <terry.s.duncan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, we are only explicitly setting SOCK_NOSPACE on a write timeout
for non-blocking sockets. Epoll() edge-trigger mode relies on SOCK_NOSPACE
being set when -EAGAIN is returned to ensure that EPOLLOUT is raised.
Expand the setting of SOCK_NOSPACE to non-blocking sockets as well that can
use SO_SNDTIMEO to adjust their write timeout. This mirrors the behavior
that Eric Dumazet introduced for tcp sockets.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ursula Braun <ubraun@linux.ibm.com>
Cc: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull HID fixes from Jiri Kosina:
- a few regression fixes for wacom driver (including fix for my earlier
mismerge) from Aaron Armstrong Skomra and Jason Gerecke
- revert of a few Logitech device ID additions which turn out to not
work perfectly with the hidpp driver at the moment; proper support is
now scheduled for 5.4. Fixes from Benjamin Tissoires
- scheduling-in-atomic fix for cp2112 driver, from Benjamin Tissoires
- new device ID to intel-ish, from Even Xu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: wacom: correct misreported EKR ring values
HID: cp2112: prevent sleeping function called from invalid context
HID: intel-ish-hid: ipc: add EHL device id
HID: wacom: Correct distance scale for 2nd-gen Intuos devices
HID: logitech-hidpp: remove support for the G700 over USB
Revert "HID: logitech-hidpp: add USB PID for a few more supported mice"
HID: wacom: add back changes dropped in merge commit
ODP depends on the several device capabilities, among them is the ability
to send UMR WQEs with that modify atomic and entity size of the MR.
Therefore, only if all conditions to send such a UMR WQE are met then
driver can report that ODP is supported. Use this check of conditions
in all places where driver needs to know about ODP support.
Also, implicit ODP support depends on ability of driver to send UMR WQEs
for an indirect mkey. Therefore, verify that all conditions to do so are
met when reporting support.
Fixes: c8d75a980f ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-7-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
task_active_pid_ns() is wrong API to check PID namespace because it
posses some restrictions and return PID namespace where the process
was allocated. It created mismatches with current namespace, which
can be different.
Rewrite whole rdma_is_visible_in_pid_ns() logic to provide reliable
results without any relation to allocated PID namespace.
Fixes: 8be565e65f ("RDMA/nldev: Factor out the PID namespace check")
Fixes: 6a6c306a09 ("RDMA/restrack: Make is_visible_in_pid_ns() as an API")
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-4-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order. A stale TID RDMA data packet
could lead to TidErr if the TID entries have been released by duplicate
data packets generated from retries, and subsequently erroneously force
the qp into error state in the current implementation.
Since the payload has already been dropped by hardware, the packet can
be simply dropped and it is no longer necessary to put the qp into
error state.
Fixes: 9905bf06e8 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192058.105923.72324.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
In a congested fabric with adaptive routing enabled, traces show that
the sender could receive stale TID RDMA NAK packets that contain newer
KDETH PSNs and older Verbs PSNs. If not dropped, these packets could
cause the incorrect rewinding of the software flows and the incorrect
completion of TID RDMA WRITE requests, and eventually leading to memory
corruption and kernel crash.
The current code drops stale TID RDMA ACK/NAK packets solely based
on KDETH PSNs, which may lead to erroneous processing. This patch
fixes the issue by also checking the Verbs PSN. Addition checks are
added before rewinding the TID RDMA WRITE DATA packets.
Fixes: 9e93e967f7 ("IB/hfi1: Add a function to receive TID RDMA ACK packet")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192033.105923.44192.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
We need to check if we have CQEs pending before starting a poll loop,
as those could be the events we will be spinning for (and hence we'll
find none). This can happen if a CQE triggers an error, or if it is
found by eg an IRQ before we get a chance to find it through polling.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
One of the components in LiteON CL1 device has limitations that
can be encountered based upon boundary race conditions using the
nvme bus specific suspend to idle flow.
When this situation occurs the drive doesn't resume properly from
suspend-to-idle.
LiteON has confirmed this problem and fixed in the next firmware
version. As this firmware is already in the field, avoid running
nvme specific suspend to idle flow.
Fixes: d916b1be94 ("nvme-pci: use host managed power state for suspend")
Link: http://lists.infradead.org/pipermail/linux-nvme/2019-July/thread.html
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Charles Hyde <charles.hyde@dellteam.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit 1b1031ca63 ("nvme: validate cntlid during controller initialisation")
introduced a validation for controllers with duplicate cntlid that runs
on nvme_init_subsystem(). The problem is that the validation relies on
ctrl->cntlid, and this value is assigned (from id_ctrl value) after the
call for nvme_init_subsystem() in nvme_init_identify() for non-fabrics
scenario. That leads to ctrl->cntlid always being 0 in case we have a
physical set of controllers in the same subsystem.
This patch fixes that by loading the discovered cntlid id_ctrl value into
ctrl->cntlid before the subsystem initialization, only for the non-fabrics
case. The patch was tested with emulated nvme devices (qemu) having two
controllers in a single subsystem. Without the patch, we couldn't make
it work failing in the duplicate check; when running with the patch, we
could see the subsystem holding both controllers.
For the fabrics case we see ctrl->cntlid has a more intricate relation
with the admin connect, so we didn't change that.
Fixes: 1b1031ca63 ("nvme: validate cntlid during controller initialisation")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
nvme_state_set_live() making a path available triggers requeue_work
in order to resubmit requests that ended up on requeue_list when no
paths were available.
This requeue_work may race with concurrent nvme_ns_head_make_request()
that do not observe the live path yet.
Such concurrent requests may by made by either:
- New IO submission.
- Requeue_work triggered by nvme_failover_req() or another ana_work.
A race may cause requeue_work capture the state of requeue_list before
more requests get onto the list. These requests will stay on the list
forever unless requeue_work is triggered again.
In order to prevent such race, nvme_state_set_live() should
synchronize_srcu(&head->srcu) before triggering the requeue_work and
prevent nvme_ns_head_make_request referencing an old snapshot of the
path list.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If a request issue ends up being punted to async context to avoid
blocking, we can get into a situation where the original application
enters the poll loop for that very request before it has been issued.
This should not be an issue, except that the polling will hold the
io_uring uring_ctx mutex for the duration of the poll. When the async
worker has actually issued the request, it needs to acquire this mutex
to add the request to the poll issued list. Since the application
polling is already holding this mutex, the workqueue sleeps on the
mutex forever, and the application thus never gets a chance to poll for
the very request it was interested in.
Fix this by ensuring that the polling drops the uring_ctx occasionally
if it's not making any progress.
Reported-by: Jeffrey M. Birnbaum <jmbnyc@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In the case of X86_PAE, unsigned long is u32, but the physical address type
should be u64. Due to the bug here, the netvsc driver can not load
successfully, and sometimes the VM can panic due to memory corruption (the
hypervisor writes data to the wrong location).
Fixes: 6ba34171bc ("Drivers: hv: vmbus: Remove use of slow_virt_to_phys()")
Cc: stable@vger.kernel.org
Cc: Michael Kelley <mikelley@microsoft.com>
Reported-and-tested-by: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
When building hv_kvp_daemon GCC-8.3 complains:
hv_kvp_daemon.c: In function ‘kvp_get_ip_info.constprop’:
hv_kvp_daemon.c:812:30: warning: ‘ip_buffer’ may be used uninitialized in this function [-Wmaybe-uninitialized]
struct hv_kvp_ipaddr_value *ip_buffer;
this seems to be a false positive: we only use ip_buffer when
op == KVP_OP_GET_IP_INFO and it is only unset when op == KVP_OP_ENUMERATE.
Silence the warning by initializing ip_buffer to NULL.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Simplify the ring buffer handling with the in-place API.
Also avoid the dynamic allocation and the memory leak in the channel
callback function.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
This field is no longer used after the commit
63ed4e0c67 ("Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code")
, because it's replaced by the global variable
"struct ms_hyperv_tsc_page *tsc_pg;" (now, the variable is in
drivers/clocksource/hyperv_timer.c).
Fixes: 63ed4e0c67 ("Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
If the HW revision of Qu devices we found is QuZ, then we need to
switch the configuration accordingly in order to use the correct FW.
Add a block of ifs in order do that.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
We have a too generic condition that switches from Qu configurations
to QnJ configurations. We need to exclude some configurations so that
they are not erroneously switched. Add the ax201 configuration to the
list of exclusions.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The MAC context configuration always allowed multicast data frames
to pass to the driver for all MAC context types, and in the
case of station MAC context both when associated and when not
associated.
One of the outcomes of this configuration is having the FW forward
encrypted multicast frames to the driver with Rx status indicating
that the frame was not decrypted (as expected, since no keys were
configured yet) which in turn results with unnecessary error
messages.
Change this behavior to allow multicast data frames only when they
are actually expected, e.g., station MAC context is associated etc.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
To do not brake HW restart we should keep initialization vectors data.
I assumed that on start the data is already initialized to zeros, but
that not true on some scenarios and we should clear it. So add
additional flag to check if we are under HW restart and clear IV's
data if we are not.
Patch fixes AP mode regression.
Reported-and-tested-by: Emil Karlson <jekarl@iki.fi>
Fixes: 710e6cc159 ("rt2800: do not nullify initialization vector data")
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In ti_dra7_xbar_probe(), 'rsv_events' is allocated through kcalloc(). Then
of_property_read_u32_array() is invoked to search for the property.
However, if this process fails, 'rsv_events' is not deallocated, leading to
a memory leak bug. To fix this issue, free 'rsv_events' before returning
the error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/1565938136-7249-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The static structures ht16k33_fb_fix and ht16k33_fb_var, of types
fb_fix_screeninfo and fb_var_screeninfo respectively, are not used
except to be copied into other variables. Hence make both of them
constant to prevent unintended modification.
Issue found with
Coccinelle.
Acked-by: Robin van der Gracht <robin@protonic.nl>
Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
The EKR ring claims a range of 0 to 71 but actually reports
values 1 to 72. The ring is used in relative mode so this
change should not affect users.
Signed-off-by: Aaron Armstrong Skomra <aaron.skomra@wacom.com>
Fixes: 72b236d602 ("HID: wacom: Add support for Express Key Remote.")
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Reviewed-by: Jason Gerecke <jason.gerecke@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
We need to provide the arch hooks for non-coherent dma-direct
and swiotlb for all swiotlb builds, not just when LPAS is enabled.
Without that the Xen build that selects SWIOTLB indirectly through
SWIOTLB_XEN fails to build.
Fixes: ad3c7b18c5 ("arm: use swiotlb for bounce buffering on LPAE configs")
Reported-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Stefan Wahren <wahrenst@gmx.net>
When SCSI-MQ is enabled, the SCSI-MQ layers will do pre-allocation of MQ
resources based on shost values set by the driver. In newer cases of the
driver, which attempts to set nr_hw_queues to the cpu count, the
multipliers become excessive, with a single shost having SCSI-MQ
pre-allocation reaching into the multiple GBytes range. NPIV, which
creates additional shosts, only multiply this overhead. On lower-memory
systems, this can exhaust system memory very quickly, resulting in a system
crash or failures in the driver or elsewhere due to low memory conditions.
After testing several scenarios, the situation can be mitigated by limiting
the value set in shost->nr_hw_queues to 4. Although the shost values were
changed, the driver still had per-cpu hardware queues of its own that
allowed parallelization per-cpu. Testing revealed that even with the
smallish number for nr_hw_queues for SCSI-MQ, performance levels remained
near maximum with the within-driver affiinitization.
A module parameter was created to allow the value set for the nr_hw_queues
to be tunable.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The parens used in the while loop would result in error being assigned
the value 1 rather than the intended errno value.
This is required to return -ETXTBSY from follow on break_layout()
changes.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Commit ba5ea61462 ("bridge: simplify ip_mc_check_igmp() and
ipv6_mc_check_mld() calls") replaces direct calls to pskb_may_pull()
in br_ipv6_multicast_mld2_report() with calls to ipv6_mc_may_pull(),
that returns -EINVAL on buffers too short to be valid IPv6 packets,
while maintaining the previous handling of the return code.
This leads to the direct opposite of the intended effect: if the
packet is malformed, -EINVAL evaluates as true, and we'll happily
proceed with the processing.
Return 0 if the packet is too short, in the same way as this was
fixed for IPv4 by commit 083b78a9ed ("ip: fix ip_mc_may_pull()
return value").
I don't have a reproducer for this, unlike the one referred to by
the IPv4 commit, but this is clearly broken.
Fixes: ba5ea61462 ("bridge: simplify ip_mc_check_igmp() and ipv6_mc_check_mld() calls")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull clk fixes from Stephen Boyd:
"A couple fixes to the core framework logic that finds clk parents, a
handful of samsung clk driver fixes for audio and display clks, and a
small fix for the Stratix10 SoC driver that was checking the wrong
register for validity"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: Fix potential NULL dereference in clk_fetch_parent_index()
clk: Fix falling back to legacy parent string matching
clk: socfpga: stratix10: fix rate caclulationg for cnt_clks
clk: samsung: exynos542x: Move MSCL subsystem clocks to its sub-CMU
clk: samsung: exynos5800: Move MAU subsystem clocks to MAU sub-CMU
clk: samsung: Change signature of exynos5_subcmus_init() function
Pull kernel thread signal handling fix from Eric Biederman:
"I overlooked the fact that kernel threads are created with all signals
set to SIG_IGN, and accidentally caused a regression in cifs and drbd
when replacing force_sig with send_sig.
This is my fix for that regression. I add a new function
allow_kernel_signal which allows kernel threads to receive signals
sent from the kernel, but continues to ignore all signals sent from
userspace. This ensures the user space interface for cifs and drbd
remain the same.
These kernel threads depend on blocking networking calls which block
until something is received or a signal is pending. Making receiving
of signals somewhat necessary for these kernel threads.
Perhaps someday we can cleanup those interfaces and remove
allow_kernel_signal. If not allow_kernel_signal is pretty trivial and
clearly documents what is going on so I don't think we will mind
carrying it"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Allow cifs and drbd to receive their terminating signals
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Remove IP MASQUERADING record in MAINTAINERS file,
from Denis Efremov.
2) Counter arguments are swapped in ebtables, from
Todd Seidelmann.
3) Missing netlink attribute validation in flow_offload
extension.
4) Incorrect alignment in xt_nfacct that breaks 32-bits
userspace / 64-bits kernels, from Juliana Rodrigueiro.
5) Missing include guard in nf_conntrack_h323_types.h,
from Masahiro Yamada.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
As Jason Baron explained in commit 790ba4566c ("tcp: set SOCK_NOSPACE
under memory pressure"), it is crucial we properly set SOCK_NOSPACE
when needed.
However, Jason patch had a bug, because the 'nonblocking' status
as far as sk_stream_wait_memory() is concerned is governed
by MSG_DONTWAIT flag passed at sendmsg() time :
long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(),
and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE
cleared, if sk->sk_sndtimeo has been set to a small (but not zero)
value.
This patch removes the 'noblock' variable since we must always
set SOCK_NOSPACE if -EAGAIN is returned.
It also renames the do_nonblock label since we might reach this
code path even if we were in blocking mode.
Fixes: 790ba4566c ("tcp: set SOCK_NOSPACE under memory pressure")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Reported-by: Vladimir Rutsky <rutsky@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If alloc_descs() fails before irq_sysfs_init() has run, free_desc() in the
cleanup path will call kobject_del() even though the kobject has not been
added with kobject_add().
Fix this by making the call to kobject_del() conditional on whether
irq_sysfs_init() has run.
This problem surfaced because commit aa30f47cf6 ("kobject: Add support
for default attribute groups to kobj_type") makes kobject_del() stricter
about pairing with kobject_add(). If the pairing is incorrrect, a WARNING
and backtrace occur in sysfs_remove_group() because there is no parent.
[ tglx: Add a comment to the code and make it work with CONFIG_SYSFS=n ]
Fixes: ecb3f394c5 ("genirq: Expose interrupt information through sysfs")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1564703564-4116-1-git-send-email-mikelley@microsoft.com
"enabled" parameter historically referred to the device input or
output, not to the led indicator. After the changes added with the led
helper functions the mic mute led logic refers to the led and not to
the mic input which caused led indicator to be negated.
Fixing logic in cxt_update_gpio_led and updated
cxt_fixup_gpio_mute_hook
Also updated debug messages to ease further debugging if necessary.
Fixes: 184e302b46 ("ALSA: hda/conexant - Use the mic-mute LED helper")
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jeronimo Borque <jeronimo@borque.com.ar>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull networking fixes from David Miller:
1) Fix jmp to 1st instruction in x64 JIT, from Alexei Starovoitov.
2) Severl kTLS fixes in mlx5 driver, from Tariq Toukan.
3) Fix severe performance regression due to lack of SKB coalescing of
fragments during local delivery, from Guillaume Nault.
4) Error path memory leak in sch_taprio, from Ivan Khoronzhuk.
5) Fix batched events in skbedit packet action, from Roman Mashak.
6) Propagate VLAN TX offload to hw_enc_features in bond and team
drivers, from Yue Haibing.
7) RXRPC local endpoint refcounting fix and read after free in
rxrpc_queue_local(), from David Howells.
8) Fix endian bug in ibmveth multicast list handling, from Thomas
Falcon.
9) Oops, make nlmsg_parse() wrap around the correct function,
__nlmsg_parse not __nla_parse(). Fix from David Ahern.
10) Memleak in sctp_scend_reset_streams(), fro Zheng Bin.
11) Fix memory leak in cxgb4, from Wenwen Wang.
12) Yet another race in AF_PACKET, from Eric Dumazet.
13) Fix false detection of retransmit failures in tipc, from Tuong
Lien.
14) Use after free in ravb_tstamp_skb, from Tho Vu.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
ravb: Fix use-after-free ravb_tstamp_skb
netfilter: nf_tables: map basechain priority to hardware priority
net: sched: use major priority number as hardware priority
wimax/i2400m: fix a memory leak bug
net: cavium: fix driver name
ibmvnic: Unmap DMA address of TX descriptor buffers after use
bnxt_en: Fix to include flow direction in L2 key
bnxt_en: Use correct src_fid to determine direction of the flow
bnxt_en: Suppress HWRM errors for HWRM_NVM_GET_VARIABLE command
bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
bnxt_en: Improve RX doorbell sequence.
bnxt_en: Fix VNIC clearing logic for 57500 chips.
net: kalmia: fix memory leaks
cx82310_eth: fix a memory leak bug
bnx2x: Fix VF's VLAN reconfiguration in reload.
Bluetooth: Add debug setting for changing minimum encryption key size
tipc: fix false detection of retransmit failures
lan78xx: Fix memory leaks
MAINTAINERS: r8169: Update path to the driver
MAINTAINERS: PHY LIBRARY: Update files in the record
...
The maximum key description size is 4095. Commit f771fde820 ("keys:
Simplify key description management") inadvertantly reduced that to 255
and made sizes between 256 and 4095 work weirdly, and any size whereby
size & 255 == 0 would cause an assertion in __key_link_begin() at the
following line:
BUG_ON(index_key->desc_len == 0);
This can be fixed by simply increasing the size of desc_len in struct
keyring_index_key to a u16.
Note the argument length test in keyutils only checked empty
descriptions and descriptions with a size around the limit (ie. 4095)
and not for all the values in between, so it missed this. This has been
addressed and
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/commit/?id=066bf56807c26cd3045a25f355b34c1d8a20a5aa
now exhaustively tests all possible lengths of type, description and
payload and then some.
The assertion failure looks something like:
kernel BUG at security/keys/keyring.c:1245!
...
RIP: 0010:__key_link_begin+0x88/0xa0
...
Call Trace:
key_create_or_update+0x211/0x4b0
__x64_sys_add_key+0x101/0x200
do_syscall_64+0x5b/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
It can be triggered by:
keyctl add user "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" a @s
Fixes: f771fde820 ("keys: Simplify key description management")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
BIOS on Samsung 500C Chromebook reports very rudimentary E820 table that
consists of 2 entries:
BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] usable
BIOS-e820: [mem 0x00000000fffff000-0x00000000ffffffff] reserved
It breaks logic in find_trampoline_placement(): bios_start lands on the
end of the first 4k page and trampoline start gets placed below 0.
Detect underflow and don't touch bios_start for such cases. It makes
kernel ignore E820 table on machines that doesn't have two usable pages
below BIOS_START_MAX.
Fixes: 1b3a626436 ("x86/boot/compressed/64: Validate trampoline placement against E820")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203463
Link: https://lkml.kernel.org/r/20190813131654.24378-1-kirill.shutemov@linux.intel.com
If the writeback error is fatal, we need to remove the tracking structures
(i.e. the nfs_page) from the inode.
Fixes: 6fbda89b25 ("NFS: Replace custom error reporting mechanism...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Initialise the result count to 0 rather than initialising it to the
argument count. The reason is that we want to ensure we record the
I/O stats correctly in the case where an error is returned (for
instance in the layoutstats).
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If the attempt to resend the I/O results in no bytes being read/written,
we must ensure that we report the error.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Fixes: 0a00b77b33 ("nfs: mirroring support for direct io")
Cc: stable@vger.kernel.org # v3.20+
If the attempt to resend the pages fails, we need to ensure that we
clean up those pages that were not transmitted.
Fixes: d600ad1f2b ("NFS41: pop some layoutget errors to application")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.5+
If the file turns out to be of the wrong type after opening, we want
to revalidate the path and retry, so return EOPENSTALE rather than
ESTALE.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Currently, we are translating RPC level errors such as timeouts,
as well as interrupts etc into EOPENSTALE, which forces a single
replay of the open attempt. What we actually want to do is
force the replay only in the cases where the returned error
indicates that the file may have changed on the server.
So the fix is to spell out the exact set of errors where we want
to return EOPENSTALE.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If we've been given the attributes of the mounted-on-file, then do not
use those to check or update the attributes on the application-visible
inode.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
When calling request_threaded_irq() with a CP2112, the function
cp2112_gpio_irq_startup() is called in a IRQ context.
Therefore we can not sleep, and we can not call
cp2112_gpio_direction_input() there.
Move the call to cp2112_gpio_direction_input() earlier to have a working
driver.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Distance values reported by 2nd-gen Intuos tablets are on an inverted
scale (0 == far, 63 == near). We need to change them over to a normal
scale before reporting to userspace or else userspace drivers and
applications can get confused.
Ref: https://github.com/linuxwacom/input-wacom/issues/98
Fixes: eda01dab53 ("HID: wacom: Add four new Intuos devices")
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
My recent to change to only use force_sig for a synchronous events
wound up breaking signal reception cifs and drbd. I had overlooked
the fact that by default kthreads start out with all signals set to
SIG_IGN. So a change I thought was safe turned out to have made it
impossible for those kernel thread to catch their signals.
Reverting the work on force_sig is a bad idea because what the code
was doing was very much a misuse of force_sig. As the way force_sig
ultimately allowed the signal to happen was to change the signal
handler to SIG_DFL. Which after the first signal will allow userspace
to send signals to these kernel threads. At least for
wake_ack_receiver in drbd that does not appear actively wrong.
So correct this problem by adding allow_kernel_signal that will allow
signals whose siginfo reports they were sent by the kernel through,
but will not allow userspace generated signals, and update cifs and
drbd to call allow_kernel_signal in an appropriate place so that their
thread can receive this signal.
Fixing things this way ensures that userspace won't be able to send
signals and cause problems, that it is clear which signals the
threads are expecting to receive, and it guarantees that nothing
else in the system will be affected.
This change was partly inspired by similar cifs and drbd patches that
added allow_signal.
Reported-by: ronnie sahlberg <ronniesahlberg@gmail.com>
Reported-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Tested-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: Steve French <smfrench@gmail.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Fixes: 247bc9470b ("cifs: fix rmmod regression in cifs.ko caused by force_sig changes")
Fixes: 72abe3bcf0 ("signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig")
Fixes: fee109901f ("signal/drbd: Use send_sig not force_sig")
Fixes: 3cf5d076fb ("signal: Remove task parameter from force_sig")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Some newer machines do not advertise legacy timers. The kernel can handle
that situation if the TSC and the CPU frequency are enumerated by CPUID or
MSRs and the CPU supports TSC deadline timer. If the CPU does not support
TSC deadline timer the local APIC timer frequency has to be known as well.
Some Ryzens machines do not advertize legacy timers, but there is no
reliable way to determine the bus frequency which feeds the local APIC
timer when the machine allows overclocking of that frequency.
As there is no legacy timer the local APIC timer calibration crashes due to
a NULL pointer dereference when accessing the not installed global clock
event device.
Switch the calibration loop to a non interrupt based one, which polls
either TSC (if frequency is known) or jiffies. The latter requires a global
clockevent. As the machines which do not have a global clockevent installed
have a known TSC frequency this is a non issue. For older machines where
TSC frequency is not known, there is no known case where the legacy timers
do not exist as that would have been reported long ago.
Reported-by: Daniel Drake <drake@endlessm.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Drake <drake@endlessm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908091443030.21433@nanos.tec.linutronix.de
Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1142926#c12
lockdep reports the following deadlock scenario:
WARNING: possible circular locking dependency detected
kworker/1:1/48 is trying to acquire lock:
000000008d7a62b2 (text_mutex){+.+.}, at: kprobe_optimizer+0x163/0x290
but task is already holding lock:
00000000850b5e2d (module_mutex){+.+.}, at: kprobe_optimizer+0x31/0x290
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (module_mutex){+.+.}:
__mutex_lock+0xac/0x9f0
mutex_lock_nested+0x1b/0x20
set_all_modules_text_rw+0x22/0x90
ftrace_arch_code_modify_prepare+0x1c/0x20
ftrace_run_update_code+0xe/0x30
ftrace_startup_enable+0x2e/0x50
ftrace_startup+0xa7/0x100
register_ftrace_function+0x27/0x70
arm_kprobe+0xb3/0x130
enable_kprobe+0x83/0xa0
enable_trace_kprobe.part.0+0x2e/0x80
kprobe_register+0x6f/0xc0
perf_trace_event_init+0x16b/0x270
perf_kprobe_init+0xa7/0xe0
perf_kprobe_event_init+0x3e/0x70
perf_try_init_event+0x4a/0x140
perf_event_alloc+0x93a/0xde0
__do_sys_perf_event_open+0x19f/0xf30
__x64_sys_perf_event_open+0x20/0x30
do_syscall_64+0x65/0x1d0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 (text_mutex){+.+.}:
__lock_acquire+0xfcb/0x1b60
lock_acquire+0xca/0x1d0
__mutex_lock+0xac/0x9f0
mutex_lock_nested+0x1b/0x20
kprobe_optimizer+0x163/0x290
process_one_work+0x22b/0x560
worker_thread+0x50/0x3c0
kthread+0x112/0x150
ret_from_fork+0x3a/0x50
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(module_mutex);
lock(text_mutex);
lock(module_mutex);
lock(text_mutex);
*** DEADLOCK ***
As a reproducer I've been using bcc's funccount.py
(https://github.com/iovisor/bcc/blob/master/tools/funccount.py),
for example:
# ./funccount.py '*interrupt*'
That immediately triggers the lockdep splat.
Fix by acquiring text_mutex before module_mutex in kprobe_optimizer().
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: d5b844a2cf ("ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()")
Link: http://lkml.kernel.org/r/20190812184302.GA7010@xps-13
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If a task is PI-blocked (blocking on sleeping spinlock) then we don't want to
schedule a new kworker if we schedule out due to lock contention because !RT
does not do that as well. A spinning spinlock disables preemption and a worker
does not schedule out on lock contention (but spin).
On RT the RW-semaphore implementation uses an rtmutex so
tsk_is_pi_blocked() will return true if a task blocks on it. In this case we
will now start a new worker which may deadlock if one worker is waiting on
progress from another worker. Since a RW-semaphore starts a new worker on !RT,
we should do the same on RT.
XFS is able to trigger this deadlock.
Allow to schedule new worker if the current worker is PI-blocked.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20190816160626.12742-1-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When running a 64-bit kernel with a 32-bit iptables binary, the size of
the xt_nfacct_match_info struct diverges.
kernel: sizeof(struct xt_nfacct_match_info) : 40
iptables: sizeof(struct xt_nfacct_match_info)) : 36
Trying to append nfacct related rules results in an unhelpful message.
Although it is suggested to look for more information in dmesg, nothing
can be found there.
# iptables -A <chain> -m nfacct --nfacct-name <acct-object>
iptables: Invalid argument. Run `dmesg' for more information.
This patch fixes the memory misalignment by enforcing 8-byte alignment
within the struct's first revision. This solution is often used in many
other uapi netfilter headers.
Signed-off-by: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The netlink attribute policy for NFTA_FLOW_TABLE_NAME is missing.
Fixes: a3c90f7a23 ("netfilter: nf_tables: flow offload expression")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The ordering of arguments to the x_tables ADD_COUNTER macro
appears to be wrong in ebtables (cf. ip_tables.c, ip6_tables.c,
and arp_tables.c).
This causes data corruption in the ebtables userspace tools
because they get incorrect packet & byte counts from the kernel.
Fixes: d72133e628 ("netfilter: ebtables: use ADD_COUNTER macro")
Signed-off-by: Todd Seidelmann <tseidelmann@linode.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This entry is in MAINTAINERS for historical purpose.
It doesn't match current sources since the commit
adf82accc5 ("netfilter: x_tables: merge ip and
ipv6 masquerade modules") moved the module.
The net/netfilter/xt_MASQUERADE.c module is already under
the netfilter section. Thus, there is no purpose to keep this
separate entry in MAINTAINERS.
Cc: Florian Westphal <fw@strlen.de>
Cc: Juanjo Ciarlante <jjciarla@raiz.uncu.edu.ar>
Cc: netfilter-devel@vger.kernel.org
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
While trawling through the dedupe file comparison code trying to fix
page deadlocking problems, Dave Chinner noticed that the reflink code
only takes shared IOLOCK/MMAPLOCKs on the source file. Because
page_mkwrite and directio writes do not take the EXCL versions of those
locks, this means that reflink can race with writer processes.
For pure remapping this can lead to undefined behavior and file
corruption; for dedupe this means that we cannot be sure that the
contents are identical when we decide to go ahead with the remapping.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Currently the driver does not handle EPROBE_DEFER for the confd gpio.
Use devm_gpiod_get_optional() instead of devm_gpiod_get() and return
error codes from altera_ps_probe().
Fixes: 5692fae074 ("fpga manager: Add altera-ps-spi driver for Altera FPGAs")
Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Each iteration of for_each_child_of_node puts the previous
node, but in the case of a goto from the middle of the loop, there is
no put, thus causing a memory leak. Hence add an of_node_put before the
goto in two places.
Issue found with Coccinelle.
Fixes: 119f517362 (drm/mediatek: Add DRM Driver for Mediatek SoC MT8173)
Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
When a Tx timestamp is requested, a pointer to the skb is stored in the
ravb_tstamp_skb struct. This was done without an skb_get. There exists
the possibility that the skb could be freed by ravb_tx_free (when
ravb_tx_free is called from ravb_start_xmit) before the timestamp was
processed, leading to a use-after-free bug.
Use skb_get when filling a ravb_tstamp_skb struct, and add appropriate
frees/consumes when a ravb_tstamp_skb struct is freed.
Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Tho Vu <tho.vu.wh@rvc.renesas.com>
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
flow_offload hardware priority fixes
This patchset contains two updates for the flow_offload users:
1) Pass the major tc priority to drivers so they do not have to
lshift it. This is a preparation patch for the fix coming in
patch #2.
2) Set the hardware priority from the netfilter basechain priority,
some drivers break when using the existing hardware priority
number that is set to zero.
v5: fix patch 2/2 to address a clang warning and to simplify
the priority mapping.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds initial support for offloading basechains using the
priority range from 1 to 65535. This is restricting the netfilter
priority range to 16-bit integer since this is what most drivers assume
so far from tc. It should be possible to extend this range of supported
priorities later on once drivers are updated to support for 32-bit
integer priorities.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
tc transparently maps the software priority number to hardware. Update
it to pass the major priority which is what most drivers expect. Update
drivers too so they do not need to lshift the priority field of the
flow_cls_common_offload object. The stmmac driver is an exception, since
this code assumes the tc software priority is fine, therefore, lshift it
just to be conservative.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In i2400m_barker_db_init(), 'options_orig' is allocated through kstrdup()
to hold the original command line options. Then, the options are parsed.
However, if an error occurs during the parsing process, 'options_orig' is
not deallocated, leading to a memory leak bug. To fix this issue, free
'options_orig' before returning the error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver name gets exposed in sysfs under /sys/bus/pci/drivers
so it should look like other devices. Change it to be common
format (instead of "Cavium PTP").
This is a trivial fix that was observed by accident because
Debian kernels were building this driver into kernel (bug).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no need to wait until a completion is received to unmap
TX descriptor buffers that have been passed to the hypervisor.
Instead unmap it when the hypervisor call has completed. This patch
avoids the possibility that a buffer will not be unmapped because
a TX completion is lost or mishandled.
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Tested-by: Devesh K. Singh <devesh_singh@in.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Bug fixes.
2 Bug fixes related to 57500 shutdown sequence and doorbell sequence,
2 TC Flower bug fixes related to the setting of the flow direction,
1 NVRAM update bug fix, and a minor fix to suppress an unnecessary
error message. Please queue for -stable as well. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
FW expects the driver to provide unique flow reference handles
for Tx or Rx flows. When a Tx flow and an Rx flow end up sharing
a reference handle, flow offload does not seem to work.
This could happen in the case of 2 flows having their L2 fields
wildcarded but in different direction.
Fix to incorporate the flow direction as part of the L2 key
v2: Move the dir field to the end of the bnxt_tc_l2_key struct to
fix the warning reported by kbuild test robot <lkp@intel.com>.
There is existing code that initializes the structure using
nested initializer and will warn with the new u8 field added to
the beginning. The structure also packs nicer when this new u8 is
added to the end of the structure [MChan].
Fixes: abd43a1352 ("bnxt_en: Support for 64-bit flow handle.")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Direction of the flow is determined using src_fid. For an RX flow,
src_fid is PF's fid and for TX flow, src_fid is VF's fid. Direction
of the flow must be specified, when getting statistics for that flow.
Currently, for DECAP flow, direction is determined incorrectly, i.e.,
direction is initialized as TX for DECAP flow, instead of RX. Because
of which, stats are not reported for this DECAP flow, though it is
offloaded and there is traffic for that flow, resulting in flow age out.
This patch fixes the problem by determining the DECAP flow's direction
using correct fid. Set the flow direction in all cases for consistency
even if 64-bit flow handle is not used.
Fixes: abd43a1352 ("bnxt_en: Support for 64-bit flow handle.")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For newly added NVM parameters, older firmware may not have the support.
Suppress the error message to avoid the unncessary error message which is
triggered when devlink calls the driver during initialization.
Fixes: 782a624d00 ("bnxt_en: Add bnxt_en initial params table and register it.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If FW returns FRAG_ERR in response error code, driver is resending the
command only when HWRM command returns success. Fix the code to resend
NVM_INSTALL_UPDATE command with DEFRAG install flags, if FW returns
FRAG_ERR in its response error code.
Fixes: cb4d1d6261 ("bnxt_en: Retry failed NVM_INSTALL_UPDATE with defragmentation flag enabled.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When both RX buffers and RX aggregation buffers have to be
replenished at the end of NAPI, post the RX aggregation buffers first
before RX buffers. Otherwise, we may run into a situation where
there are only RX buffers without RX aggregation buffers for a split
second. This will cause the hardware to abort the RX packet and
report buffer errors, which will cause unnecessary cleanup by the
driver.
Ringing the Aggregation ring doorbell first before the RX ring doorbell
will prevent some of these buffer errors. Use the same sequence during
ring initialization as well.
Fixes: 697197e5a1 ("bnxt_en: Re-structure doorbells.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During device shutdown, the VNIC clearing sequence needs to be modified
to free the VNIC first before freeing the RSS contexts. The current
code is doing the reverse and we can get mis-directed RX completions
to CP ring ID 0 when the RSS contexts are freed and zeroed. The clearing
of RSS contexts is not required with the new sequence.
Refactor the VNIC clearing logic into a new function bnxt_clear_vnic()
and do the chip specific VNIC clearing sequence.
Fixes: 7b3af4f75b ("bnxt_en: Add RSS support for 57500 chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In kalmia_init_and_get_ethernet_addr(), 'usb_buf' is allocated through
kmalloc(). In the following execution, if the 'status' returned by
kalmia_send_init_packet() is not 0, 'usb_buf' is not deallocated, leading
to memory leaks. To fix this issue, add the 'out' label to free 'usb_buf'.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
In cx82310_bind(), 'dev->partial_data' is allocated through kmalloc().
Then, the execution waits for the firmware to become ready. If the firmware
is not ready in time, the execution is terminated. However, the allocated
'dev->partial_data' is not deallocated on this path, leading to a memory
leak bug. To fix this issue, free 'dev->partial_data' before returning the
error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 04f05230c5 ("bnx2x: Remove configured vlans as
part of unload sequence."), introduced a regression in driver
that as a part of VF's reload flow, VLANs created on the VF
doesn't get re-configured in hardware as vlan metadata/info
was not getting cleared for the VFs which causes vlan PING to stop.
This patch clears the vlan metadata/info so that VLANs gets
re-configured back in the hardware in VF's reload flow and
PING/traffic continues for VLANs created over the VFs.
Fixes: 04f05230c5 ("bnx2x: Remove configured vlans as part of unload sequence.")
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Sudarsana Kalluru <skalluru@marvell.com>
Signed-off-by: Shahed Shaikh <shshaikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth 2019-08-17
Here's a set of Bluetooth fixes for the 5.3-rc series:
- Multiple fixes for Qualcomm (btqca & hci_qca) drivers
- Minimum encryption key size debugfs setting (this is required for
Bluetooth Qualification)
- Fix hidp_send_message() to have a meaningful return value
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
For testing and qualification purposes it is useful to allow changing
the minimum encryption key size value that the host stack is going to
enforce. This adds a new debugfs setting min_encrypt_key_size to achieve
this functionality.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When dedupe wants to use the page cache to compare parts of two files
for dedupe, we must be very careful to handle locking correctly. The
current code doesn't do this. It must lock and unlock the page only
once if the two pages are the same, since the overlapping range check
doesn't catch this when blocksize < pagesize. If the pages are distinct
but from the same file, we must observe page locking order and lock them
in order of increasing offset to avoid clashing with writeback locking.
Fixes: 876bec6f9b ("vfs: refactor clone/dedupe_file_range common functions")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
For 31-bit s390 user space, we have to pass pointer arguments through
compat_ptr() in the compat_ioctl handler.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Always try the native ioctl if we don't have a compat handler. This
removes a lot of boilerplate code as 'modern' ioctls should generally
be compat clean, and fixes the missing entries for the recently added
FS_IOC_GETFSLABEL/FS_IOC_SETFSLABEL ioctls.
Fixes: f7664b3197 ("xfs: implement online get/set fs label")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
This commit eliminates the use of the link 'stale_limit' & 'prev_from'
(besides the already removed - 'stale_cnt') variables in the detection
of repeated retransmit failures as there is no proper way to initialize
them to avoid a false detection, i.e. it is not really a retransmission
failure but due to a garbage values in the variables.
Instead, a jiffies variable will be added to individual skbs (like the
way we restrict the skb retransmissions) in order to mark the first skb
retransmit time. Later on, at the next retransmissions, the timestamp
will be checked to see if the skb in the link transmq is "too stale",
that is, the link tolerance time has passed, so that a link reset will
be ordered. Note, just checking on the first skb in the queue is fine
enough since it must be the oldest one.
A counter is also added to keep track the actual skb retransmissions'
number for later checking when the failure happens.
The downside of this approach is that the skb->cb[] buffer is about to
be exhausted, however it is always able to allocate another memory area
and keep a reference to it when needed.
Fixes: 77cf8edbc0 ("tipc: simplify stale link failure criteria")
Reported-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
In lan78xx_probe(), a new urb is allocated through usb_alloc_urb() and
saved to 'dev->urb_intr'. However, in the following execution, if an error
occurs, 'dev->urb_intr' is not deallocated, leading to memory leaks. To fix
this issue, invoke usb_free_urb() to free the allocated urb before
returning from the function.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't compare the parent clock name with a NULL name in the
clk_parent_map. This prevents a kernel crash when passing NULL
core->parents[i].name to strcmp().
An example which triggered this is a mux clock with four parents when
each of them is referenced in the clock driver using
clk_parent_data.fw_name and then calling clk_set_parent(clk, 3rd_parent)
on this mux.
In this case the first parent is also the HW default so
core->parents[i].hw is populated when the clock is registered. Calling
clk_set_parent(clk, 3rd_parent) will then go through all parents and
skip the first parent because it's hw pointer doesn't match. For the
second parent no hw pointer is cached yet and clk_core_get(core, 1)
returns a non-matching pointer (which is correct because we are comparing
the second with the third parent). Comparing the result of
clk_core_get(core, 2) with the requested parent gives a match. However
we don't reach this point because right after the clk_core_get(core, 1)
mismatch the old code tried to !strcmp(parent->name, NULL) (where the
second argument is actually core->parents[i].name, but that was never
populated by the clock driver).
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://lkml.kernel.org/r/20190815223155.21384-1-martin.blumenstingl@googlemail.com
Fixes: fc0c209c14 ("clk: Allow parents to be specified without string names")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Calls to clk_core_get() will return ERR_PTR(-EINVAL) if we've started
migrating a clk driver to use the DT based style of specifying parents
but we haven't made any DT updates yet. This happens when we pass a
non-NULL value as the 'name' argument of of_parse_clkspec(). That
function returns -EINVAL in such a situation, instead of -ENOENT like we
expected. The return value comes back up to clk_core_fill_parent_index()
which proceeds to skip calling clk_core_lookup() because the error
pointer isn't equal to -ENOENT, it's -EINVAL.
Furthermore, we blindly overwrite the error pointer returned by
clk_core_get() with NULL when there isn't a legacy .name member
specified in the parent map. This isn't too bad right now because we
don't really care to differentiate NULL from an error, but in the future
we should only try to do a legacy lookup if we know we might find
something. This way DT lookups that fail don't try to lookup based on
strings when there isn't any string to match, hiding the error from DT
parsing.
Fix both these problems so that clk provider drivers can use the new
style of parent mapping without having to also update their DT at the
same time. This patch is based on an earlier patch from Taniya Das which
checked for -EINVAL in addition to -ENOENT return values from
clk_core_get().
Fixes: 601b6e9330 ("clk: Allow parents to be specified via clkspec index")
Cc: Taniya Das <tdas@codeaurora.org>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Reported-by: Taniya Das <tdas@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lkml.kernel.org/r/20190813214147.34394-1-sboyd@kernel.org
Tested-by: Taniya Das <tdas@codeaurora.org>
The R-Car LVDS encoder units support dual-link operations by splitting
the pixel output between the primary encoder and the companion encoder.
Currently the companion encoder fails at probe time, causing the
registration of the primary to fail as well, preventing the whole DU unit
from being registered at all.
Fix this by not bailing out from probe with error if the
"renesas,companion" property is not specified.
Fixes: fa440d8703 ("drm: rcar-du: lvds: Add support for dual-link mode")
Reported-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
When showing metadata about a single program by invoking
"bpftool prog show PROG", the file descriptor referring to the program
is not closed before returning from the function. Let's close it.
Fixes: 71bb428fe2 ("tools: bpf: add bpftool")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
David Howells says:
====================
rxrpc: Fix local endpoint handling
Here's a pair of patches that fix two issues in the handling of local
endpoints (rxrpc_local structs):
(1) Use list_replace_init() rather than list_replace() if we're going to
unconditionally delete the replaced item later, lest the list get
corrupted.
(2) Don't access the rxrpc_local object after passing our ref to the
workqueue, not even to illuminate tracepoints, as the work function
may cause the object to be freed. We have to cache the information
beforehand.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
This patchset contains Netfilter fixes for net:
1) Extend selftest to cover flowtable with ipsec, from Florian Westphal.
2) Fix interaction of ipsec with flowtable, also from Florian.
3) User-after-free with bound set to rule that fails to load.
4) Adjust state and timeout for flows that expire.
5) Timeout update race with flows in teardown state.
6) Ensure conntrack id hash calculation use invariants as input,
from Dirk Morris.
7) Do not push flows into flowtable for TCP fin/rst packets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
A process could race in an open and attempt to read one of these files
before i_private is initialized, and get a spurious error.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Fix a crash that was introduced by the commit 724376a04d. The crash is
reported here: https://gitlab.com/cryptsetup/cryptsetup/issues/468
When reading from the integrity device, the function
dm_integrity_map_continue calls find_journal_node to find out if the
location to read is present in the journal. Then, it calculates how many
sectors are consecutively stored in the journal. Then, it locks the range
with add_new_range and wait_and_add_new_range.
The problem is that during wait_and_add_new_range, we hold no locks (we
don't hold ic->endio_wait.lock and we don't hold a range lock), so the
journal may change arbitrarily while wait_and_add_new_range sleeps.
The code then goes to __journal_read_write and hits
BUG_ON(journal_entry_get_sector(je) != logical_sector); because the
journal has changed.
In order to fix this bug, we need to re-check the journal location after
wait_and_add_new_range. We restrict the length to one block in order to
not complicate the code too much.
Fixes: 724376a04d ("dm integrity: implement fair range locks")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
dm-zoned is observed to lock up or livelock in case of hardware
failure or some misconfiguration of the backing zoned device.
This patch adds a new dm-zoned target function that checks the status of
the backing device. If the request queue of the backing device is found
to be in dying state or the SCSI backing device enters offline state,
the health check code sets a dm-zoned target flag prompting all further
incoming I/O to be rejected. In order to detect backing device failures
timely, this new function is called in the request mapping path, at the
beginning of every reclaim run and before performing any metadata I/O.
The proper way out of this situation is to do
dmsetup remove <dm-zoned target>
and recreate the target when the problem with the backing device
is resolved.
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Some errors are ignored in the I/O path during queueing chunks
for processing by chunk works. Since at least these errors are
transient in nature, it should be possible to retry the failed
incoming commands.
The fix -
Errors that can happen while queueing chunks are carried upwards
to the main mapping function and it now returns DM_MAPIO_REQUEUE
for any incoming requests that can not be properly queued.
Error logging/debug messages are added where needed.
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
There are several places in reclaim code where errors are not
propagated to the main function, dmz_reclaim(). This function
is responsible for unlocking zones that might be still locked
at the end of any failed reclaim iterations. As the result,
some device zones may be left permanently locked for reclaim,
degrading target's capability to reclaim zones.
This patch fixes these issues as follows -
Make sure that dmz_reclaim_buf(), dmz_reclaim_seq_data() and
dmz_reclaim_rnd_data() return error codes to the caller.
dmz_reclaim() function is renamed to dmz_do_reclaim() to avoid
clashing with "struct dmz_reclaim" and is modified to return the
error to the caller.
dmz_get_zone_for_reclaim() now returns an error instead of NULL
pointer and reclaim code checks for that error.
Error logging/debug messages are added where necessary.
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This patch fixes a problem in dm-kcopyd that may leave jobs in
complete queue indefinitely in the event of backing storage failure.
This behavior has been observed while running 100% write file fio
workload against an XFS volume created on top of a dm-zoned target
device. If the underlying storage of dm-zoned goes to offline state
under I/O, kcopyd sometimes never issues the end copy callback and
dm-zoned reclaim work hangs indefinitely waiting for that completion.
This behavior was traced down to the error handling code in
process_jobs() function that places the failed job to complete_jobs
queue, but doesn't wake up the job handler. In case of backing device
failure, all outstanding jobs may end up going to complete_jobs queue
via this code path and then stay there forever because there are no
more successful I/O jobs to wake up the job handler.
This patch adds a wake() call to always wake up kcopyd job wait queue
for all I/O jobs that fail before dm_io() gets called for that job.
The patch also sets the write error status in all sub jobs that are
failed because their master job has failed.
Fixes: b73c67c2cb ("dm kcopyd: add sequential write feature")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Revert the commit bd293d071f. The proper
fix has been made available with commit d0a255e795 ("loop: set
PF_MEMALLOC_NOIO for the worker thread").
Note that the fix offered by commit bd293d071f doesn't really prevent
the deadlock from occuring - if we look at the stacktrace reported by
Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex -
i.e. it has already successfully taken the mutex. Changing the mutex
from mutex_lock to mutex_trylock won't help with deadlocks that happen
afterwards.
PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0"
#0 [ffff8813dedfb938] __schedule at ffffffff8173f405
#1 [ffff8813dedfb990] schedule at ffffffff8173fa27
#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
#6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
#9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
#13 [ffff8813dedfbec0] kthread at ffffffff810a8428
#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: bd293d071f ("dm bufio: fix deadlock with loop device")
Depends-on: d0a255e795 ("loop: set PF_MEMALLOC_NOIO for the worker thread")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
In myri10ge_probe(), myri10ge_alloc_slices() is invoked to allocate slices
related structures. Later on, myri10ge_request_irq() is used to get an irq.
However, if this process fails, the allocated slices related structures are
not deallocated, leading to memory leaks. To fix this issue, revise the
target label of the goto statement to 'abort_with_slices'.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ctx->sk_write_space pointer is only set when TLS tx mode is enabled.
When running without TX mode its a null pointer but we still set the
sk sk_write_space pointer on close().
Fix the close path to only overwrite sk->sk_write_space when the current
pointer is to the tls_write_space function indicating the tls module should
clean it up properly as well.
Reported-by: Hillf Danton <hdanton@sina.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Fixes: 57c722e932 ("net/tls: swap sk_write_space on close")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If oct->fn_list.enable_io_queues() fails, no cleanup is executed, leading
to memory/resource leaks. To fix this issue, invoke
octeon_delete_instr_queue() before returning from the function.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2019-08-15
This series introduces two fixes to mlx5 driver.
1) Eran fixes a compatibility issue with ethtool flash.
2) Maxim fixes a race in XSK wakeup flow.
Please pull and let me know if there is any problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
synchronize_rcu() gets called multiple times each time a client is
destroyed. If the laundromat thread has a lot of clients to destroy,
the delay can be noticeable. This was causing pynfs test RENEW3 to
fail.
We could embed an rcu_head in each inode and do the kref_put in an rcu
callback. But simplest is just to take a lock here.
(I also wonder if the laundromat thread would be better replaced by a
bunch of scheduled work or timers or something.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Cited patch deleted ethtool flash device support, as ethtool core can
fallback into devlink flash callback. However, this is supported only if
there is a devlink port registered over the corresponding netdevice.
As mlx5e do not have devlink port support over native netdevice, it broke
the ability to flash device via ethtool.
This patch re-add the ethtool callback to avoid user functionality breakage
when trying to flash device via ethtool.
Fixes: 9c8bca2637 ("mlx5: Move firmware flash implementation to devlink")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add a missing spinlock around XSKICOSQ usage at the activation stage,
because there is a race between a configuration change and the
application calling sendto().
Fixes: db05815b36 ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When running tcp_fastopen_backup_key.sh the following issue was seen in
a busybox environment.
./tcp_fastopen_backup_key.sh: line 33: [: -ne: unary operator expected
Shellcheck showed the following issue.
$ shellcheck tools/testing/selftests/net/tcp_fastopen_backup_key.sh
In tools/testing/selftests/net/tcp_fastopen_backup_key.sh line 33:
if [ $val -ne 0 ]; then
^-- SC2086: Double quote to prevent globbing and word splitting.
Rework to do a string comparison instead.
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
quirk_reset_lenovo_thinkpad_50_nvgpu() resets NVIDIA GPUs to work around
an apparent BIOS defect. It previously used pci_reset_function(), and
the available method was a bus reset, which was fine because there was
only one function on the bus. After b516ea586d ("PCI: Enable NVIDIA
HDA controllers"), there are now two functions (the HDA controller and
the GPU itself) on the bus, so the reset fails.
Use pci_reset_bus() explicitly instead of pci_reset_function() since it's
OK to reset both devices.
[bhelgaas: commit log, add e0547c81bf]
Fixes: b516ea586d ("PCI: Enable NVIDIA HDA controllers")
Fixes: e0547c81bf ("PCI: Reset Lenovo ThinkPad P50 nvgpu at boot if necessary")
Link: https://lore.kernel.org/r/20190801220117.14952-1-lyude@redhat.com
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Daniel Drake <drake@endlessm.com>
Cc: Aaron Plattner <aplattner@nvidia.com>
Cc: Peter Wu <peter@lekensteyn.nl>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Maik Freudenberg <hhfeuer@gmx.de>
arm64: dts: Amlogic fixes for v5.3-rc
- a few small DT fixes for g12a/g12b platforms
* tag 'amlogic-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic:
arm64: dts: amlogic: odroid-n2: keep SD card regulator always on
arm64: dts: meson-g12a-sei510: enable IR controller
arm64: dts: meson-g12a: add missing dwc2 phy-names
Fixes for omap variants for v5.3-rc cycle
We have another fix to disable voltage switching for am57xx SDIO as
the bootrom cannot handle all the voltages after a reset that thought
I had already sent a pull request for earlier but forgot. And we also
update dra74x iodelay configuration for mmc3 to use the recommended
values.
Then I noticed we had introduced few new boot warnings with the various
recent ti-sysc changes and wanted to fix those. I also noticed we still
have too many warnings to be able to spot the real ones easily and fixed
up few of those. Sure some of the warnings have been around for a long
time and few of the fixes could have waited for the merge window, but
having more usable dmesg log level output is a valuable.
Other fixes are IO size correction for am335x UARTs that cause issues
for at least FreeBSD using the same device tree file that checks that
the child IO range is not larger than the parent has.
For omap1 ams-delta keyboard we need to fix a irq ack that broke with
all the recent gpio changes.
And there are also few static checker warning fixes for recent am335x
PM changes and ti-sysc driver and one switch fall-though update.
* tag 'omap-for-v5.3/fixes-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
soc: ti: pm33xx: Make two symbols static
soc: ti: pm33xx: Fix static checker warnings
ARM: OMAP: dma: Mark expected switch fall-throughs
ARM: dts: Fix incomplete dts data for am3 and am4 mmc
bus: ti-sysc: Simplify cleanup upon failures in sysc_probe()
ARM: OMAP1: ams-delta-fiq: Fix missing irq_ack
ARM: dts: dra74x: Fix iodelay configuration for mmc3
ARM: dts: am335x: Fix UARTs length
ARM: OMAP2+: Fix omap4 errata warning on other SoCs
ARM: dts: Fix incorrect dcan register mapping for am3, am4 and dra7
ARM: dts: Fix flags for gpio7
bus: ti-sysc: Fix using configured sysc mask value
bus: ti-sysc: Fix handling of forced idle
ARM: OMAP2+: Fix missing SYSC_HAS_RESET_STATUS for dra7 epwmss
ARM: dts: am57xx: Disable voltage switching for SD card
Link: https://lore.kernel.org/r/pull-1565844391-332885@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
vmx_set_nested_state_test is trying to use the KVM_STATE_NESTED_EVMCS without
enabling enlightened VMCS first. Correct the outcome of the test, and actually
test that it succeeds after the capability is enabled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are two tests already enabling eVMCS and a third is coming.
Add a function that enables the capability and tests the result.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This test is only covering various edge cases of the
KVM_SET_NESTED_STATE ioctl. Running the VM does not really
add anything.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In blocked_fl_write(), 't' is not deallocated if bitmap_parse_user() fails,
leading to a memory leak bug. To fix this issue, free t before returning
the error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following BUG:
[ 187.065689] BUG: kernel NULL pointer dereference, address: 000000000000001c
[ 187.065790] RIP: 0010:ufshcd_vreg_set_hpm+0x3c/0x110 [ufshcd_core]
[ 187.065938] Call Trace:
[ 187.065959] ufshcd_resume+0x72/0x290 [ufshcd_core]
[ 187.065980] ufshcd_system_resume+0x54/0x140 [ufshcd_core]
[ 187.065993] ? pci_pm_restore+0xb0/0xb0
[ 187.066005] ufshcd_pci_resume+0x15/0x20 [ufshcd_pci]
[ 187.066017] pci_pm_thaw+0x4c/0x90
[ 187.066030] dpm_run_callback+0x5b/0x150
[ 187.066043] device_resume+0x11b/0x220
Voltage regulators are optional, so functions must check they exist
before dereferencing.
Note this issue is hidden if CONFIG_REGULATORS is not set, because the
offending code is optimised away.
Notes for stable:
The issue first appears in commit 57d104c153 ("ufs: add UFS power
management support") but is inadvertently fixed in commit 60f0187031
("scsi: ufs: disable vccq if it's not needed by UFS device") which in
turn was reverted by commit 730679817d ("Revert "scsi: ufs: disable vccq
if it's not needed by UFS device""). So fix applies v3.18 to v4.5 and
v5.1+
Fixes: 57d104c153 ("ufs: add UFS power management support")
Fixes: 730679817d ("Revert "scsi: ufs: disable vccq if it's not needed by UFS device"")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In tcmu_handle_completion() function, the variable called read_len is
always initialized with a value taken from se_cmd structure. If this
function is called to complete an expired (timed out) out command, the
session command pointed by se_cmd is likely to be already deallocated by
the target core at that moment. As the result, this access triggers a
use-after-free warning from KASAN.
This patch fixes the code not to touch se_cmd when completing timed out
TCMU commands. It also resets the pointer to se_cmd at the time when the
TCMU_CMD_BIT_EXPIRED flag is set because it is going to become invalid
after calling target_complete_cmd() later in the same function,
tcmu_check_expired_cmd().
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If HBA initialization fails unexpectedly (exiting via probe_failed:), we
may fail to free vha->gnl.l. So that we don't attempt to double free, set
this pointer to NULL after a free and check for NULL at probe_failed: so we
know whether or not to call dma_free_coherent.
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This driver requires imported PRIME buffers to appear contiguously in
its IO address space. Make sure this is the case by setting the maximum
DMA segment size to a more suitable value than the default 64KB.
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
PRIME buffers should be imported using the DMA device. To this end, use
a custom import function that mimics drm_gem_prime_import_dev(), but
passes the correct device.
Fixes: 119f517362 ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.")
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
"bind4 allow specific IP & port" and "bind6 deny specific IP & port"
fail on s390 because of endianness issue: the 4 IP address bytes are
loaded as a word and compared with a constant, but the value of this
constant should be different on big- and little- endian machines, which
is not the case right now.
Use __bpf_constant_ntohl to generate proper value based on machine
endianness.
Fixes: 1d436885b2 ("selftests/bpf: Selftest for sys_bind post-hooks.")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
new_entry is reassigned a new value next line. So
it's redundant and remove it.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is probably overdue---KVM x86 has quite a few contributors that
usually review each other's patches, which is really helpful to me.
Formalize this by listing them as reviewers. I am including people
with various expertise:
- Joerg for SVM (with designated reviewers, it makes more sense to have
him in the main KVM/x86 stanza)
- Sean for MMU and VMX
- Jim for VMX
- Vitaly for Hyper-V and possibly SVM
- Wanpeng for LAPIC and paravirtualization.
Please ack if you are okay with this arrangement, otherwise speak up.
In other news, Radim is going to leave Red Hat soon. However, he has
not been very much involved in upstream KVM development for some time,
and in the immediate future he is still going to help maintain kvm/queue
while I am on vacation. Since not much is going to change, I will let
him decide whether he wants to keep the maintainer role after he leaves.
Acked-by: Joerg Roedel <joro@8bytes.org>
Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Wanpeng Li <wanpengli@tencent.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM/s390 does not have a list of its own, and linux-s390 is in the
loop anyway thanks to the generic arch/s390 match. So use the generic
KVM list for s390 patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
recalculate_apic_map does not santize ldr and it's possible that
multiple bits are set. In that case, a previous valid entry
can potentially be overwritten by an invalid one.
This condition is hit when booting a 32 bit, >8 CPU, RHEL6 guest and then
triggering a crash to boot a kdump kernel. This is the sequence of
events:
1. Linux boots in bigsmp mode and enables PhysFlat, however, it still
writes to the LDR which probably will never be used.
2. However, when booting into kdump, the stale LDR values remain as
they are not cleared by the guest and there isn't a apic reset.
3. kdump boots with 1 cpu, and uses Logical Destination Mode but the
logical map has been overwritten and points to an inactive vcpu.
Signed-off-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Don't fall through to print error message when receive sleep indication
in HCI_IBS_RX_ASLEEP state, this is allowed behavior.
Signed-off-by: Rocky Liao <rjliao@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
rxrpc_queue_local() attempts to queue the local endpoint it is given and
then, if successful, prints a trace line. The trace line includes the
current usage count - but we're not allowed to look at the local endpoint
at this point as we passed our ref on it to the workqueue.
Fix this by reading the usage count before queuing the work item.
Also fix the reading of local->debug_id for trace lines, which must be done
with the same consideration as reading the usage count.
Fixes: 09d2bf595d ("rxrpc: Add a tracepoint to track rxrpc_local refcounting")
Reported-by: syzbot+78e71c5bab4f76a6a719@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
When a local endpoint (struct rxrpc_local) ceases to be in use by any
AF_RXRPC sockets, it starts the process of being destroyed, but this
doesn't cause it to be removed from the namespace endpoint list immediately
as tearing it down isn't trivial and can't be done in softirq context, so
it gets deferred.
If a new socket comes along that wants to bind to the same endpoint, a new
rxrpc_local object will be allocated and rxrpc_lookup_local() will use
list_replace() to substitute the new one for the old.
Then, when the dying object gets to rxrpc_local_destroyer(), it is removed
unconditionally from whatever list it is on by calling list_del_init().
However, list_replace() doesn't reset the pointers in the replaced
list_head and so the list_del_init() will likely corrupt the local
endpoints list.
Fix this by using list_replace_init() instead.
Fixes: 730c5fd42c ("rxrpc: Fix local endpoint refcounting")
Reported-by: syzbot+193e29e9387ea5837f1d@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
TCP rst and fin packets do not qualify to place a flow into the
flowtable. Most likely there will be no more packets after connection
closure. Without this patch, this flow entry expires and connection
tracking picks up the entry in ESTABLISHED state using the fixup
timeout, which makes this look inconsistent to the user for a connection
that is actually already closed.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
when do randbuilding, I got this error:
In file included from drivers/hwmon/pmbus/ucd9000.c:19:0:
./include/linux/gpio/driver.h:576:1: error: redefinition of gpiochip_add_pin_range
gpiochip_add_pin_range(struct gpio_chip *chip, const char *pinctl_name,
^~~~~~~~~~~~~~~~~~~~~~
In file included from drivers/hwmon/pmbus/ucd9000.c:18:0:
./include/linux/gpio.h:245:1: note: previous definition of gpiochip_add_pin_range was here
gpiochip_add_pin_range(struct gpio_chip *chip, const char *pinctl_name,
^~~~~~~~~~~~~~~~~~~~~~
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 964cb34188 ("gpio: move pincontrol calls to <linux/gpio/driver.h>")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20190731123814.46624-1-yuehaibing@huawei.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
If the driver doesn't support open-drain/source config options, we
emulate this behavior when setting the direction by calling
gpiod_direction_input() if the default value is 0 (open-source) or
1 (open-drain), thus not actively driving the line in those cases.
This however clears the FLAG_IS_OUT bit for the GPIO line descriptor
and makes the LINEINFO ioctl() incorrectly report this line's mode as
'input' to user-space.
This commit modifies the ioctl() to always set the GPIOLINE_FLAG_IS_OUT
bit in the lineinfo structure's flags field. Since it's impossible to
use the input mode and open-drain/source options at the same time, we
can be sure the reported information will be correct.
Fixes: 521a2ad6f8 ("gpio: add userspace ABI for GPIO line information")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Link: https://lore.kernel.org/r/20190806114151.17652-1-brgl@bgdev.pl
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Eric reported a syzbot warning:
BUG: KMSAN: uninit-value in nh_valid_get_del_req+0x6f1/0x8c0 net/ipv4/nexthop.c:1510
CPU: 0 PID: 11812 Comm: syz-executor444 Not tainted 5.3.0-rc3+ #17
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x191/0x1f0 lib/dump_stack.c:113
kmsan_report+0x162/0x2d0 mm/kmsan/kmsan_report.c:109
__msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:294
nh_valid_get_del_req+0x6f1/0x8c0 net/ipv4/nexthop.c:1510
rtm_del_nexthop+0x1b1/0x610 net/ipv4/nexthop.c:1543
rtnetlink_rcv_msg+0x115a/0x1580 net/core/rtnetlink.c:5223
netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5241
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0xf6c/0x1050 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x110f/0x1330 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg net/socket.c:657 [inline]
___sys_sendmsg+0x14ff/0x1590 net/socket.c:2311
__sys_sendmmsg+0x53a/0xae0 net/socket.c:2413
__do_sys_sendmmsg net/socket.c:2442 [inline]
__se_sys_sendmmsg+0xbd/0xe0 net/socket.c:2439
__x64_sys_sendmmsg+0x56/0x70 net/socket.c:2439
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:297
entry_SYSCALL_64_after_hwframe+0x63/0xe7
The root cause is nlmsg_parse calling __nla_parse which means the
header struct size is not checked.
nlmsg_parse should be a wrapper around __nlmsg_parse with
NL_VALIDATE_STRICT for the validate argument very much like
nlmsg_parse_deprecated is for NL_VALIDATE_LIBERAL.
Fixes: 3de6440354 ("netlink: re-add parse/validate functions in strict mode")
Reported-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
After configuring and restarting aneg we immediately try to read the
link status. On some systems the PHY may not yet have cleared the
"aneg complete" and "link up" bits, resulting in a false link-up
signal. See [0] for a report.
Clause 22 and 45 both require the PHY to keep the AN_RESTART
bit set until the PHY actually starts auto-negotiation.
Let's consider this in the generic functions for reading link status.
The commit marked as fixed is the first one where the patch applies
cleanly.
[0] https://marc.info/?t=156518400300003&r=1&w=2
Fixes: c1164bb1a6 ("net: phy: check PMAPMD link status only in genphy_c45_read_link")
Tested-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
In mlx4_en_config_rss_steer(), 'rss_map->indir_qp' is allocated through
kzalloc(). After that, mlx4_qp_alloc() is invoked to configure RSS
indirection. However, if mlx4_qp_alloc() fails, the allocated
'rss_map->indir_qp' is not deallocated, leading to a memory leak bug.
To fix the above issue, add the 'qp_alloc_err' label to free
'rss_map->indir_qp'.
Fixes: 4931c6ef04 ("net/mlx4_en: Optimized single ring steering")
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The ibm,mac-address-filters property defines the maximum number of
addresses the hypervisor's multicast filter list can support. It is
encoded as a big-endian integer in the OF device tree, but the virtual
ethernet driver does not convert it for use by little-endian systems.
As a result, the driver is not behaving as it should on affected systems
when a large number of multicast addresses are assigned to the device.
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Callbacks for a cmd reply run outside the protection of card->lock, to
allow for additional cmds to be issued & enqueued in parallel.
When qeth_send_control_data() bails out for a cmd without having
received a reply (eg. due to timeout), its callback may concurrently be
processing a reply that just arrived. In this case, the callback
potentially accesses a stale reply->reply_param area that eg. was
on-stack and has already been released.
To avoid this race, add some locking so that qeth_send_control_data()
can (1) wait for a concurrently running callback, and (2) zap any
pending callback that still wants to run.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
As the annotation says in sctp_do_8_2_transport_strike():
"If the transport error count is greater than the pf_retrans
threshold, and less than pathmaxrtx ..."
It should be transport->error_count checked with pathmaxrxt,
instead of asoc->pf_retrans.
Fixes: 5aa93bcf66 ("sctp: Implement quick failover draft from tsvwg")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Variable rc is initialized to a value that is never read and it is
re-assigned later. The initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Change ct id hash calculation to only use invariants.
Currently the ct id hash calculation is based on some fields that can
change in the lifetime on a conntrack entry in some corner cases. The
current hash uses the whole tuple which contains an hlist pointer which
will change when the conntrack is placed on the dying list resulting in
a ct id change.
This patch also removes the reply-side tuple and extension pointer from
the hash calculation so that the ct id will will not change from
initialization until confirmation.
Fixes: 3c79107631 ("netfilter: ctnetlink: don't use conntrack/expect object addresses as id")
Signed-off-by: Dirk Morris <dmorris@metaloft.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The G700 suffers from the same issue than the G502:
when plugging it in, the driver tries to contact it but it fails.
This timeout is problematic as it introduce a delay in the boot,
and having only the mouse event node means that the hardware
macros keys can not be relayed to the userspace.
Link: https://github.com/libratbag/libratbag/issues/797
Fixes: 91cf9a98ae ("HID: logitech-hidpp: make .probe usbhid capable")
Cc: stable@vger.kernel.org # v5.2
Reviewed-by: Filipe Laíns <lains@archlinux.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
This partially reverts commit 27fc32fd94.
It turns out that the G502 has some issues with hid-logitech-hidpp:
when plugging it in, the driver tries to contact it but it fails.
So the driver bails out leaving only the mouse event node available.
This timeout is problematic as it introduce a delay in the boot,
and having only the mouse event node means that the hardware
macros keys can not be relayed to the userspace.
Filipe and I just gave a shot at the following devices:
G403 Wireless (0xC082)
G703 (0xC087)
G703 Hero (0xC090)
G903 (0xC086)
G903 Hero (0xC091)
G Pro (0xC088)
Reverting the devices we are not sure that works flawlessly.
Reviewed-by: Filipe Laíns <lains@archlinux.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
"p runtime/jit: pass > 32bit index to tail_call" fails when
bpf_jit_enable=1, because the tail call is not executed.
This in turn is because the generated code assumes index is 64-bit,
while it must be 32-bit, and as a result prog array bounds check fails,
while it should pass. Even if bounds check would have passed, the code
that follows uses 64-bit index to compute prog array offset.
Fix by using clrj instead of clgrj for comparing index with array size,
and also by using llgfr for truncating index to 32 bits before using it
to compute prog array offset.
Fixes: 6651ee070b ("s390/bpf: implement bpf_tail_call() helper")
Reported-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fix sparse warnings:
drivers/soc/ti/pm33xx.c:144:27: warning: symbol 'rtc_wake_src' was not declared. Should it be static?
drivers/soc/ti/pm33xx.c:160:5: warning: symbol 'am33xx_rtc_only_idle' was not declared. Should it be static?
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Mark switch cases where we are expecting to fall through.
This patch fixes the following warnings:
arch/arm/plat-omap/dma.c: In function 'omap_set_dma_src_burst_mode':
arch/arm/plat-omap/dma.c:384:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (dma_omap2plus()) {
^
arch/arm/plat-omap/dma.c:393:2: note: here
case OMAP_DMA_DATA_BURST_16:
^~~~
arch/arm/plat-omap/dma.c:394:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (dma_omap2plus()) {
^
arch/arm/plat-omap/dma.c:402:2: note: here
default:
^~~~~~~
arch/arm/plat-omap/dma.c: In function 'omap_set_dma_dest_burst_mode':
arch/arm/plat-omap/dma.c:473:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (dma_omap2plus()) {
^
arch/arm/plat-omap/dma.c:481:2: note: here
default:
^~~~~~~
Notice that, in this particular case, the code comment is
modified in accordance with what GCC is expecting to find.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Commit 4e27f752ab ("ARM: OMAP2+: Drop mmc platform data for am330x and
am43xx") dropped legacy mmc platform data for am3 and am4, but missed the
fact that we never updated the dts files for mmc3 that is directly on l3
interconnect instead of l4 interconnect. This leads to a situation with
no legacy platform data and incomplete dts data.
Let's update the mmc instances on l3 interconnect to probe properly with
ti-sysc interconnect target module driver to make mmc3 work again. Let's
still keep legacy "ti,hwmods" property around for v5.2 kernel and only
drop it later on.
Note that there is no need to use property status = "disabled" for mmc3.
The default for dts is enabled, and runtime PM will idle unused instances
just fine.
Fixes: 4e27f752ab ("ARM: OMAP2+: Drop mmc platform data for am330x and am43xx")
Reported-by: David Lechner <david@lechnology.com>
Tested-by: David Lechner <david@lechnology.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The clocks are not yet parsed and prepared until after a successful
sysc_get_clocks(), so there is no need to unprepare the clocks upon
any failure of any of the prior functions in sysc_probe(). The current
code path would have been a no-op because of the clock validity checks
within sysc_unprepare(), but let's just simplify the cleanup path by
returning the error directly.
While at this, also fix the cleanup path for a sysc_init_resets()
failure which is executed after the clocks are prepared.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Non-serio path of Amstrad Delta FIQ deferred handler depended on
irq_ack() method provided by OMAP GPIO driver. That method has been
removed by commit 693de831c6 ("gpio: omap: remove irq_ack method").
Remove useless code from the deferred handler and reimplement the
missing operation inside the base FIQ handler.
Should another dependency - irq_unmask() - be ever removed from the OMAP
GPIO driver, WARN once if missing.
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
As seen on the AM335x TRM all the UARTs controller only are 0x1000 in size.
Fix this in the DTS.
Signed-off-by: Emmanuel Vadot <manu@freebsd.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
We have errata i688 workaround produce warnings on SoCs other than
omap4 and omap5:
omap4_sram_init:Unable to allocate sram needed to handle errata I688
omap4_sram_init:Unable to get sram pool needed to handle errata I688
This is happening because there is no ti,omap4-mpu node, or no SRAM
to configure for the other SoCs, so let's remove the warning based
on the SoC revision checks.
As nobody has complained it seems that the other SoC variants do not
need this workaround.
Signed-off-by: Tony Lindgren <tony@atomide.com>
The original driver author seemed to be under the impression that a driver
cannot be removed if it does not have a .remove method. Or maybe if it is
a built-in platform driver.
This is not true. This crash can be created:
root@ubuntu:/sys/bus/platform/drivers/hisi-lpc# echo HISI0191\:00 > unbind
root@ubuntu:/sys/bus/platform/drivers/hisi-lpc# ipmitool raw 6 1
Unable to handle kernel paging request at virtual address ffff000010035010
Mem abort info:
ESR = 0x96000047
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000047
CM = 0, WnR = 1
swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000000118b000
[ffff000010035010] pgd=0000041ffbfff003, pud=0000041ffbffe003, pmd=0000041ffbffd003, pte=0000000000000000
Internal error: Oops: 96000047 [#1] PREEMPT SMP
Modules linked in:
CPU: 17 PID: 1473 Comm: ipmitool Not tainted 5.2.0-rc5-00003-gf68c53b414a3-dirty #198
Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
pstate: 20000085 (nzCv daIf -PAN -UAO)
pc : hisi_lpc_target_in+0x7c/0x120
lr : hisi_lpc_target_in+0x70/0x120
sp : ffff00001efe3930
x29: ffff00001efe3930 x28: ffff841f9f599200
x27: 0000000000000002 x26: 0000000000000000
x25: 0000000000000080 x24: 00000000000000e4
x23: 0000000000000000 x22: 0000000000000064
x21: ffff801fb667d280 x20: 0000000000000001
x19: ffff00001efe39ac x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000
x9 : 0000000000000000 x8 : ffff841febe60340
x7 : ffff801fb55c52e8 x6 : 0000000000000000
x5 : 0000000000ffc0e3 x4 : 0000000000000001
x3 : ffff801fb667d280 x2 : 0000000000000001
x1 : ffff000010035010 x0 : ffff000010035000
Call trace:
hisi_lpc_target_in+0x7c/0x120
hisi_lpc_comm_in+0x88/0x98
logic_inb+0x5c/0xb8
port_inb+0x18/0x20
bt_event+0x38/0x808
smi_event_handler+0x4c/0x5a0
check_start_timer_thread.part.4+0x40/0x58
sender+0x78/0x88
smi_send.isra.6+0x94/0x108
i_ipmi_request+0x2c4/0x8f8
ipmi_request_settime+0x124/0x160
handle_send_req+0x19c/0x208
ipmi_ioctl+0x2c0/0x990
do_vfs_ioctl+0xb8/0x8f8
ksys_ioctl+0x80/0xb8
__arm64_sys_ioctl+0x1c/0x28
el0_svc_common.constprop.0+0x64/0x160
el0_svc_handler+0x28/0x78
el0_svc+0x8/0xc
Code: 941d1511 aa0003f9 f94006a0 91004001 (b9000034)
---[ end trace aa842b86af7069e4 ]---
The problem here is that the host goes away but the associated logical PIO
region remains registered, as do the children devices.
Fix by adding a .remove method to tidy-up by removing the child devices
and unregistering the logical PIO region.
Cc: stable@vger.kernel.org
Fixes: adf38bb0b5 ("HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings")
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
If, after registering a logical PIO range, the driver probe later fails,
the logical PIO range memory will be released automatically.
This causes an issue, in that the logical PIO range is not unregistered
and the released range memory may be later referenced.
Fix by unregistering the logical PIO range.
And since we now unregister the logical PIO range for probe failure, avoid
the special ordering of setting logical PIO range ops, which was the
previous (poor) attempt at a safeguard against this.
Cc: stable@vger.kernel.org
Fixes: adf38bb0b5 ("HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings")
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
Add a function to unregister a logical PIO range.
Logical PIO space can still be leaked when unregistering certain
LOGIC_PIO_CPU_MMIO regions, but this acceptable for now since there are no
callers to unregister LOGIC_PIO_CPU_MMIO regions, and the logical PIO
region allocation scheme would need significant work to improve this.
Cc: stable@vger.kernel.org
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
The code was originally written to not support unregistering logical PIO
regions.
To accommodate supporting unregistering logical PIO regions, subtly modify
LOGIC_PIO_CPU_MMIO region registration code, such that the "end" of the
registered regions is the "end" of the last region, and not the sum of
the sizes of all the registered regions.
Cc: stable@vger.kernel.org
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
The traversing of io_range_list with list_for_each_entry_rcu()
is not properly protected by rcu_read_lock() and rcu_read_unlock(),
so add them.
These functions mark the critical section scope where the list is
protected for the reader, it cannot be "reclaimed". Any updater - in
this case, the logical PIO registration functions - cannot update the
list until the reader exits this critical section.
In addition, the list traversing used in logic_pio_register_range()
does not need to use the rcu variant.
This is because we are already using io_range_mutex to guarantee mutual
exclusion from mutating the list.
Cc: stable@vger.kernel.org
Fixes: 031e360186 ("lib: Add generic PIO mapping method")
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
This driver does a funny dance disabling and re-enabling
RX and/or TX delays. In any of the RGMII-ID modes, it first
disables the delays, just to re-enable them again right
away. This looks like a needless exercise.
Just enable the respective delays when in any of the
relevant 'id' modes, and disable them otherwise.
Also, remove comments which don't add anything that can't be
seen by looking at the code.
Signed-off-by: André Draszik <git@andred.net>
CC: Andrew Lunn <andrew@lunn.ch>
CC: Florian Fainelli <f.fainelli@gmail.com>
CC: Heiner Kallweit <hkallweit1@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch will reset the download flag to default value
before retrieving the download mode type.
Fixes: 32646db8cc ("Bluetooth: btqca: inject command complete event during fw download")
Signed-off-by: Balakrishna Godavarthi <bgodavar@codeaurora.org>
Tested-by: Claire Chang <tientzu@chromium.org>
Reviewed-by: Claire Chang <tientzu@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Let hidp_send_message return the number of successfully queued bytes
instead of an unconditional 0.
With the return value fixed to 0, other drivers relying on hidp, such as
hidraw, can not return meaningful values from their respective
implementations of write(). In particular, with the current behavior, a
hidraw device's write() will have different return values depending on
whether the device is connected via USB or Bluetooth, which makes it
harder to abstract away the transport layer.
Signed-off-by: Fabian Henneke <fabian.henneke@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
WCN399x chips are coex chips, it needs a VS pre shutdown
command while turning off the BT. So that chip can inform
BT is OFF to other active clients.
Signed-off-by: Harish Bandi <c-hbandi@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The opcode of the command injected by commit 32646db8cc ("Bluetooth:
btqca: inject command complete event during fw download") uses the CPU
byte format, however it should always be little endian. In practice it
shouldn't really matter, since all we need is an opcode != 0, but still
let's do things correctly and keep sparse happy.
Fixes: 32646db8cc ("Bluetooth: btqca: inject command complete event during fw download")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
On WCN3990 downloading the NVM sometimes fails with a "TLV response
size mismatch" error:
[ 174.949955] Bluetooth: btqca.c:qca_download_firmware() hci0: QCA Downloading qca/crnv21.bin
[ 174.958718] Bluetooth: btqca.c:qca_tlv_send_segment() hci0: QCA TLV response size mismatch
It seems the controller needs a short time after downloading the
firmware before it is ready for the NVM. A delay as short as 1 ms
seems sufficient, make it 10 ms just in case. No event is received
during the delay, hence we don't just silently drop an extra event.
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Fix to return error code -EINVAL from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: a1c49c434e ("Bluetooth: btusb: Add protocol support for MediaTek MT7668U USB devices")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
"masking, test in bounds 3" fails on s390, because
BPF_ALU64_IMM(BPF_NEG, BPF_REG_2, 0) ignores the top 32 bits of
BPF_REG_2. The reason is that JIT emits lcgfr instead of lcgr.
The associated comment indicates that the code was intended to
emit lcgr in the first place, it's just that the wrong opcode
was used.
Fix by using the correct opcode.
Fixes: 0546231057 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The omapdrm driver uses dma_set_coherent_mask(), but that's not enough
anymore when LPAE is enabled.
From Christoph Hellwig <hch@lst.de>:
> The traditional arm DMA code ignores, but the generic dma-direct/swiotlb
> has stricter checks and thus fails mappings without a DMA mask. As we
> use swiotlb for arm with LPAE now, omapdrm needs to catch up and
> actually set a DMA mask.
Change the dma_set_coherent_mask() call to
dma_coerce_mask_and_coherent() so that the dev->dma_mask is also set.
Fixes: ad3c7b18c5 ("arm: use swiotlb for bounce buffering on LPAE configs")
Reported-by: "H. Nikolaus Schaller" <hns@goldelico.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c219e7e6-0f66-d6fd-e0cf-59c803386825@ti.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Backport requested for omap dma mask fix. I'm not sure it still
requires it, but just in case. :)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
clang warns:
drivers/net/ethernet/toshiba/tc35815.c:1507:30: warning: use of logical
'&&' with constant operand [-Wconstant-logical-operand]
if (!HAVE_DMA_RXALIGN(lp) && NET_IP_ALIGN)
^ ~~~~~~~~~~~~
drivers/net/ethernet/toshiba/tc35815.c:1507:30: note: use '&' for a
bitwise operation
if (!HAVE_DMA_RXALIGN(lp) && NET_IP_ALIGN)
^~
&
drivers/net/ethernet/toshiba/tc35815.c:1507:30: note: remove constant to
silence this warning
if (!HAVE_DMA_RXALIGN(lp) && NET_IP_ALIGN)
~^~~~~~~~~~~~~~~
1 warning generated.
Explicitly check that NET_IP_ALIGN is not zero, which matches how this
is checked in other parts of the tree. Because NET_IP_ALIGN is a build
time constant, this check will be constant folded away during
optimization.
Fixes: 82a9928db5 ("tc35815: Enable StripCRC feature")
Link: https://github.com/ClangBuiltLinux/linux/issues/608
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We set the field 'addr_trial_end' to 'jiffies', instead of the current
value 0, at the moment the node address is initialized. This guarantees
we don't inadvertently enter an address trial period when the node
address is explicitly set by the user.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To identify timestamps for matching with their packets, Spectrum-1 uses a
five-tuple of (port, direction, domain number, message type, sequence ID).
If there are several clients from the same domain behind a single port
sending Delay_Req's, the only thing differentiating these packets, as far
as Spectrum-1 is concerned, is the sequence ID. Should sequence IDs between
individual clients be similar, conflicts may arise. That is not a problem
to hardware, which will simply deliver timestamps on a first comes, first
served basis.
However the driver uses a simple hash table to store the unmatched pieces.
When a new conflicting piece arrives, it pushes out the previously stored
one, which if it is a packet, is delivered without timestamp. Later on as
the corresponding timestamps arrive, the first one is mismatched to the
second packet, and the second one is never matched and eventually is GCd.
To correct this issue, instead of using a simple rhashtable, use rhltable
to keep the unmatched entries.
Previously, a found unmatched entry would always be removed from the hash
table. That is not the case anymore--an incompatible entry is left in the
hash table. Therefore removal from the hash table cannot be used to confirm
the validity of the looked-up pointer, instead the lookup would simply need
to be redone. Therefore move it inside the critical section. This
simplifies a lot of the code.
Fixes: 8748642751 ("mlxsw: spectrum: PTP: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls")
Reported-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjust the function names in two doc comments to match the corresponding
functions.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix rxrpc_unuse_local() to handle a NULL local pointer as it can be called
on an unbound socket on which rx->local is not yet set.
The following reproduced (includes omitted):
int main(void)
{
socket(AF_RXRPC, SOCK_DGRAM, AF_INET);
return 0;
}
causes the following oops to occur:
BUG: kernel NULL pointer dereference, address: 0000000000000010
...
RIP: 0010:rxrpc_unuse_local+0x8/0x1b
...
Call Trace:
rxrpc_release+0x2b5/0x338
__sock_release+0x37/0xa1
sock_close+0x14/0x17
__fput+0x115/0x1e9
task_work_run+0x72/0x98
do_exit+0x51b/0xa7a
? __context_tracking_exit+0x4e/0x10e
do_group_exit+0xab/0xab
__x64_sys_exit_group+0x14/0x17
do_syscall_64+0x89/0x1d4
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Reported-by: syzbot+20dee719a2e090427b5f@syzkaller.appspotmail.com
Fixes: 730c5fd42c ("rxrpc: Fix local endpoint refcounting")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to the commit in the fixes tag, the resource controller in netdevsim
tracked fib entries and rules per network namespace. Restore that behavior.
Fixes: 5fc494225c ("netdevsim: create devlink instance per netdevsim instance")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2019-08-11
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) x64 JIT code generation fix for backward-jumps to 1st insn, from Alexei.
2) Fix buggy multi-closing of BTF file descriptor in libbpf, from Andrii.
3) Fix libbpf_num_possible_cpus() to make it thread safe, from Takshak.
4) Fix bpftool to dump an error if pinning fails, from Jakub.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we swap the original proto and clear the ULP pointer
on close we have to make sure no callback will try to access
the freed state. sk_write_space is not part of sk_prot, remember
to swap it.
Reported-by: syzbot+dcdc9deefaec44785f32@syzkaller.appspotmail.com
Fixes: 95fa145479 ("bpf: sockmap/tls, close can race with map free")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a warning of "suspicious rcu_dereference_check() usage"
when nload runs.
Fixes: 776e726bfb ("netvsc: fix RCU warning in get_stats")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2019-08-08
This series introduces some fixes to mlx5 driver.
Highlights:
1) From Tariq, Critical mlx5 kTLS fixes to better align with hw specs.
2) From Aya, Fixes to mlx5 tx devlink health reporter.
3) From Maxim, aRFs parsing to use flow dissector to avoid relying on
invalid skb fields.
Please pull and let me know if there is any problem.
For -stable v4.3
('net/mlx5e: Only support tx/rx pause setting for port owner')
For -stable v4.9
('net/mlx5e: Use flow keys dissector to parse packets for ARFS')
For -stable v5.1
('net/mlx5e: Fix false negative indication on tx reporter CQE recovery')
('net/mlx5e: Remove redundant check in CQE recovery flow of tx reporter')
('net/mlx5e: ethtool, Avoid setting speed to 56GBASE when autoneg off')
Note: when merged with net-next this minor conflict will pop up:
++<<<<<<< (net-next)
+ if (is_eswitch_flow) {
+ flow->esw_attr->match_level = match_level;
+ flow->esw_attr->tunnel_match_level = tunnel_match_level;
++=======
+ if (flow->flags & MLX5E_TC_FLOW_ESWITCH) {
+ flow->esw_attr->inner_match_level = inner_match_level;
+ flow->esw_attr->outer_match_level = outer_match_level;
++>>>>>>> (net)
To resolve, use hunks from net (2nd) and replace:
if (flow->flags & MLX5E_TC_FLOW_ESWITCH)
with
if (is_eswitch_flow)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ixgbe_service_task() calls unregister_netdev() under rtnl_lock().
But unregister_netdev() internally calls rtnl_lock().
So deadlock would occur.
Fixes: 59dd45d550 ("ixgbe: firmware recovery mode")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
Fix collisions in socket cookie generation
This change makes the socket cookie generator as a global counter
instead of per netns in order to fix cookie collisions for BPF use
cases we ran into. See main patch #1 for more details.
Given the change is small/trivial and fixes an issue we're seeing
my preference would be net tree (though it cleanly applies to
net-next as well). Went for net tree instead of bpf tree here given
the main change is in net/core/sock_diag.c, but either way would be
fine with me.
v1 -> v2:
- Fix up commit description in patch #1, thanks Eric!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull in updates in BPF helper function description.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Generating and retrieving socket cookies are a useful feature that is
exposed to BPF for various program types through bpf_get_socket_cookie()
helper.
The fact that the cookie counter is per netns is quite a limitation
for BPF in practice in particular for programs in host namespace that
use socket cookies as part of a map lookup key since they will be
causing socket cookie collisions e.g. when attached to BPF cgroup hooks
or cls_bpf on tc egress in host namespace handling container traffic
from veth or ipvlan devices with peer in different netns. Change the
counter to be global instead.
Socket cookie consumers must assume the value as opqaue in any case.
Not every socket must have a cookie generated and knowledge of the
counter value itself does not provide much value either way hence
conversion to global is fine.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Martynas Pumputis <m@lambda.lt>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there are two nodes named "regulator1" in the Draak DTS: a
3.3V regulator for the eMMC and the LVDS decoder, and a 12V regulator
for the backlight. This causes the former to be overwritten by the
latter.
Fix this by renaming all regulators with numerical suffixes to use named
suffixes, which are less likely to conflict.
Fixes: 4fbd4158fe ("arm64: dts: renesas: r8a77995: draak: Add backlight")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
David Howells says:
====================
Here's a couple of fixes for rxrpc:
(1) Fix refcounting of the local endpoint.
(2) Don't calculate or report packet skew information. This has been
obsolete since AFS 3.1 and so is a waste of resources.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
First make sure we don't use "prog" in error messages because
the pinning operation could be performed on a map. Second add
back missing error message if pin syscall failed.
====================
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Don't bother generating maxSkew in the ACK packet as it has been obsolete
since AFS 3.1.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
The object lifetime management on the rxrpc_local struct is broken in that
the rxrpc_local_processor() function is expected to clean up and remove an
object - but it may get requeued by packets coming in on the backing UDP
socket once it starts running.
This may result in the assertion in rxrpc_local_rcu() firing because the
memory has been scheduled for RCU destruction whilst still queued:
rxrpc: Assertion failed
------------[ cut here ]------------
kernel BUG at net/rxrpc/local_object.c:468!
Note that if the processor comes around before the RCU free function, it
will just do nothing because ->dead is true.
Fix this by adding a separate refcount to count active users of the
endpoint that causes the endpoint to be destroyed when it reaches 0.
The original refcount can then be used to refcount objects through the work
processor and cause the memory to be rcu freed when that reaches 0.
Fixes: 4f95dd78a7 ("rxrpc: Rework local endpoint management")
Reported-by: syzbot+1e0edc4b8b7494c28450@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Flows that are in teardown state (due to RST / FIN TCP packet) still
have their offload flag set on. Hence, the conntrack garbage collector
may race to undo the timeout adjustment that the fixup routine performs,
leaving the conntrack entry in place with the internal offload timeout
(one day).
Update teardown flow state to ESTABLISHED and set tracking to liberal,
then once the offload bit is cleared, adjust timeout if it is more than
the default fixup timeout (conntrack might already have set a lower
timeout from the packet path).
Fixes: da5984e510 ("netfilter: nf_flow_table: add support for sending flows back to the slow path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Update conntrack entry to pick up expired flows, otherwise the conntrack
entry gets stuck with the internal offload timeout (one day). The TCP
state also needs to be adjusted to ESTABLISHED state and tracking is set
to liberal mode in order to give conntrack a chance to pick up the
expired flow.
Fixes: ac2a66665e ("netfilter: add generic flow table infrastructure")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If a rule that has already a bound anonymous set fails to be added, the
preparation phase releases the rule and the bound set. However, the
transaction object from the abort path still has a reference to the set
object that is stale, leading to a use-after-free when checking for the
set->bound field. Add a new field to the transaction that specifies if
the set is bound, so the abort path can skip releasing it since the rule
command owns it and it takes care of releasing it. After this update,
the set->bound field is removed.
[ 24.649883] Unable to handle kernel paging request at virtual address 0000000000040434
[ 24.657858] Mem abort info:
[ 24.660686] ESR = 0x96000004
[ 24.663769] Exception class = DABT (current EL), IL = 32 bits
[ 24.669725] SET = 0, FnV = 0
[ 24.672804] EA = 0, S1PTW = 0
[ 24.675975] Data abort info:
[ 24.678880] ISV = 0, ISS = 0x00000004
[ 24.682743] CM = 0, WnR = 0
[ 24.685723] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000428952000
[ 24.692207] [0000000000040434] pgd=0000000000000000
[ 24.697119] Internal error: Oops: 96000004 [#1] SMP
[...]
[ 24.889414] Call trace:
[ 24.891870] __nf_tables_abort+0x3f0/0x7a0
[ 24.895984] nf_tables_abort+0x20/0x40
[ 24.899750] nfnetlink_rcv_batch+0x17c/0x588
[ 24.904037] nfnetlink_rcv+0x13c/0x190
[ 24.907803] netlink_unicast+0x18c/0x208
[ 24.911742] netlink_sendmsg+0x1b0/0x350
[ 24.915682] sock_sendmsg+0x4c/0x68
[ 24.919185] ___sys_sendmsg+0x288/0x2c8
[ 24.923037] __sys_sendmsg+0x7c/0xd0
[ 24.926628] __arm64_sys_sendmsg+0x2c/0x38
[ 24.930744] el0_svc_common.constprop.0+0x94/0x158
[ 24.935556] el0_svc_handler+0x34/0x90
[ 24.939322] el0_svc+0x8/0xc
[ 24.942216] Code: 37280300 f9404023 91014262 aa1703e0 (f9401863)
[ 24.948336] ---[ end trace cebbb9dcbed3b56f ]---
Fixes: f6ac858589 ("netfilter: nf_tables: unbind set in rule from commit path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
As spin_unlock_irq will enable interrupts.
Function tsi108_stat_carry is called from interrupt handler tsi108_irq.
Interrupts are enabled in interrupt handler.
Use spin_lock_irqsave/spin_unlock_irqrestore instead of spin_(un)lock_irq
in IRQ context to avoid this.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should also enable team's vlan tx offload in hw_enc_features,
pass the vlan packets to the slave devices with vlan tci, let the
slave handle vlan tunneling offload implementation.
Fixes: 3268e5cb49 ("team: Advertise tunneling offload features")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk_validate_xmit_skb() and drivers depend on the sk member of
struct sk_buff to identify segments requiring encryption.
Any operation which removes or does not preserve the original TLS
socket such as skb_orphan() or skb_clone() will cause clear text
leaks.
Make the TCP socket underlying an offloaded TLS connection
mark all skbs as decrypted, if TLS TX is in offload mode.
Then in sk_validate_xmit_skb() catch skbs which have no socket
(or a socket with no validation) and decrypted flag set.
Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
sk->sk_validate_xmit_skb are slightly interchangeable right now,
they all imply TLS offload. The new checks are guarded by
CONFIG_TLS_DEVICE because that's the option guarding the
sk_buff->decrypted member.
Second, smaller issue with orphaning is that it breaks
the guarantee that packets will be delivered to device
queues in-order. All TLS offload drivers depend on that
scheduling property. This means skb_orphan_partial()'s
trick of preserving partial socket references will cause
issues in the drivers. We need a full orphan, and as a
result netem delay/throttling will cause all TLS offload
skbs to be dropped.
Reusing the sk_buff->decrypted flag also protects from
leaking clear text when incoming, decrypted skb is redirected
(e.g. by TC).
See commit 0608c69c9a ("bpf: sk_msg, sock{map|hash} redirect
through ULP") for justification why the internal flag is safe.
The only location which could leak the flag in is tcp_bpf_sendmsg(),
which is taken care of by clearing the previously unused bit.
v2:
- remove superfluous decrypted mark copy (Willem);
- remove the stale doc entry (Boris);
- rely entirely on EOR marking to prevent coalescing (Boris);
- use an internal sendpages flag instead of marking the socket
(Boris).
v3 (Willem):
- reorganize the can_skb_orphan_partial() condition;
- fix the flag leak-in through tcp_bpf_sendmsg.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak says:
====================
Fix batched event generation for skbedit action
When adding or deleting a batch of entries, the kernel sends up to
TCA_ACT_MAX_PRIO (defined to 32 in kernel) entries in an event to user
space. However it does not consider that the action sizes may vary and
require different skb sizes.
For example, consider the following script adding 32 entries with all
supported skbedit parameters and cookie (in order to maximize netlink
messages size):
% cat tc-batch.sh
TC="sudo /mnt/iproute2.git/tc/tc"
$TC actions flush action skbedit
for i in `seq 1 $1`;
do
cmd="action skbedit queue_mapping 2 priority 10 mark 7/0xaabbccdd \
ptype host inheritdsfield \
index $i cookie aabbccddeeff112233445566778800a1 "
args=$args$cmd
done
$TC actions add $args
%
% ./tc-batch.sh 32
Error: Failed to fill netlink attributes while adding TC action.
We have an error talking to the kernel
%
patch 1 adds callback in tc_action_ops of skbedit action, which calculates
the action size, and passes size to tcf_add_notify()/tcf_del_notify().
patch 2 updates the TDC test suite with relevant skbedit test cases.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Update TDC tests with cases varifying ability of TC to install or delete
batches of skbedit actions.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add get_fill_size() routine used to calculate the action size
when building a batch of events.
Fixes: ca9b0e27e ("pkt_action: add new action skbedit")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/dsa/sja1105/sja1105_main.c: In function sja1105_fdb_dump:
drivers/net/dsa/sja1105/sja1105_main.c:1226:14: warning:
variable tx_vid set but not used [-Wunused-but-set-variable]
drivers/net/dsa/sja1105/sja1105_main.c:1226:6: warning:
variable rx_vid set but not used [-Wunused-but-set-variable]
They are not used since commit 6d7c7d948a ("net: dsa:
sja1105: Fix broken learning with vlan_filtering disabled")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As commit 30d8177e8a ("bonding: Always enable vlan tx offload")
said, we should always enable bonding's vlan tx offload, pass the
vlan packets to the slave devices with vlan tci, let them to handle
vlan implementation.
Now if encapsulation protocols like VXLAN is used, skb->encapsulation
may be set, then the packet is passed to vlan device which based on
bonding device. However in netif_skb_features(), the check of
hw_enc_features:
if (skb->encapsulation)
features &= dev->hw_enc_features;
clears NETIF_F_HW_VLAN_CTAG_TX/NETIF_F_HW_VLAN_STAG_TX. This results
in same issue in commit 30d8177e8a like this:
vlan_dev_hard_start_xmit
-->dev_queue_xmit
-->validate_xmit_skb
-->netif_skb_features //NETIF_F_HW_VLAN_CTAG_TX is cleared
-->validate_xmit_vlan
-->__vlan_hwaccel_push_inside //skb->tci is cleared
...
--> bond_start_xmit
--> bond_xmit_hash //BOND_XMIT_POLICY_ENCAP34
--> __skb_flow_dissect // nhoff point to IP header
--> case htons(ETH_P_8021Q)
// skb_vlan_tag_present is false, so
vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
//vlan point to ip header wrongly
Fixes: b2a103e6d0 ("bonding: convert to ndo_fix_features")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In error case, all entries should be freed from the sched list
before deleting it. For simplicity use rcu way.
Fixes: 5a781ccbd1 ("tc: Add support for configuring the taprio scheduler")
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPX is no longer supported, but the example in the documentation
might useful. Replace it with IPv6.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both IPX and TR have not been supported for a while now.
Remove them from the /proc/sys/net documentation.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
At this point nr_frags has been incremented but the frag does not yet
have a page assigned so freeing the skb results in a crash. Reset
nr_frags before freeing the skb to prevent this.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before commit d4289fcc9b ("net: IP6 defrag: use rbtrees for IPv6
defrag"), a netperf UDP_STREAM test[0] using big IPv6 datagrams (thus
generating many fragments) and running over an IPsec tunnel, reported
more than 6Gbps throughput. After that patch, the same test gets only
9Mbps when receiving on a be2net nic (driver can make a big difference
here, for example, ixgbe doesn't seem to be affected).
By reusing the IPv4 defragmentation code, IPv6 lost fragment coalescing
(IPv4 fragment coalescing was dropped by commit 14fe22e334 ("Revert
"ipv4: use skb coalescing in defragmentation"")).
Without fragment coalescing, be2net runs out of Rx ring entries and
starts to drop frames (ethtool reports rx_drops_no_frags errors). Since
the netperf traffic is only composed of UDP fragments, any lost packet
prevents reassembly of the full datagram. Therefore, fragments which
have no possibility to ever get reassembled pile up in the reassembly
queue, until the memory accounting exeeds the threshold. At that point
no fragment is accepted anymore, which effectively discards all
netperf traffic.
When reassembly timeout expires, some stale fragments are removed from
the reassembly queue, so a few packets can be received, reassembled
and delivered to the netperf receiver. But the nic still drops frames
and soon the reassembly queue gets filled again with stale fragments.
These long time frames where no datagram can be received explain why
the performance drop is so significant.
Re-introducing fragment coalescing is enough to get the initial
performances again (6.6Gbps with be2net): driver doesn't drop frames
anymore (no more rx_drops_no_frags errors) and the reassembly engine
works at full speed.
This patch is quite conservative and only coalesces skbs for local
IPv4 and IPv6 delivery (in order to avoid changing skb geometry when
forwarding). Coalescing could be extended in the future if need be, as
more scenarios would probably benefit from it.
[0]: Test configuration
Sender:
ip xfrm policy flush
ip xfrm state flush
ip xfrm state add src fc00:1::1 dst fc00:2::1 proto esp spi 0x1000 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:1::1 dst fc00:2::1
ip xfrm policy add src fc00:1::1 dst fc00:2::1 dir in tmpl src fc00:1::1 dst fc00:2::1 proto esp mode transport action allow
ip xfrm state add src fc00:2::1 dst fc00:1::1 proto esp spi 0x1001 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:2::1 dst fc00:1::1
ip xfrm policy add src fc00:2::1 dst fc00:1::1 dir out tmpl src fc00:2::1 dst fc00:1::1 proto esp mode transport action allow
netserver -D -L fc00:2::1
Receiver:
ip xfrm policy flush
ip xfrm state flush
ip xfrm state add src fc00:2::1 dst fc00:1::1 proto esp spi 0x1001 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:2::1 dst fc00:1::1
ip xfrm policy add src fc00:2::1 dst fc00:1::1 dir in tmpl src fc00:2::1 dst fc00:1::1 proto esp mode transport action allow
ip xfrm state add src fc00:1::1 dst fc00:2::1 proto esp spi 0x1000 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:1::1 dst fc00:2::1
ip xfrm policy add src fc00:1::1 dst fc00:2::1 dir out tmpl src fc00:1::1 dst fc00:2::1 proto esp mode transport action allow
netperf -H fc00:2::1 -f k -P 0 -L fc00:1::1 -l 60 -t UDP_STREAM -I 99,5 -i 5,5 -T5,5 -6
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
M2M scaler clocks require special handling of their parent bus clock during
power domain on/off sequences. MSCL clocks were not initially added to the
sub-CMU handler, because that time there was no driver for the M2M scaler
device and it was not possible to test it.
This patch fixes this issue. Parent clock for M2M scaler devices is now
properly preserved during MSC power domain on/off sequence. This gives M2M
scaler devices proper performance: fullHD XRGB32 image 1000 rotations test
takes 3.17s instead of 45.08s.
Fixes: b06a532bf1 ("clk: samsung: Add Exynos5 sub-CMU clock driver")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lkml.kernel.org/r/20190808121839.23892-1-m.szyprowski@samsung.com
Acked-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
This patch fixes broken sound on Exynos5422/5800 platforms after
system/suspend resume cycle in cases where the audio root clock
is derived from MAU_EPLL_CLK.
In order to preserve state of the USER_MUX_MAU_EPLL_CLK clock mux
during system suspend/resume cycle for Exynos5800 we group the MAU
block input clocks in "MAU" sub-CMU and add the clock mux control
bit to .suspend_regs. This ensures that user configuration of the mux
is not lost after the PMU block changes the mux setting to OSC_DIV
when switching off the MAU power domain.
Adding the SRC_TOP9 register to exynos5800_clk_regs[] array is not
sufficient as at the time of the syscore_ops suspend call MAU power
domain is already turned off and we already save and subsequently
restore an incorrect register's value.
Fixes: b06a532bf1 ("clk: samsung: Add Exynos5 sub-CMU clock driver")
Reported-by: Jaafar Ali <jaafarkhalaf@gmail.com>
Suggested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Jaafar Ali <jaafarkhalaf@gmail.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Link: https://lkml.kernel.org/r/20190808144929.18685-2-s.nawrocki@samsung.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Remove check of recovery bit, in the beginning of the CQE recovery
function. This test is already performed right before the reporter
is invoked, when CQE error is detected.
Fixes: de8650a820 ("net/mlx5e: Add tx reporter support")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
CQE recovery function begins with test and set of recovery bit. Add an
error flow which ensures clearing of this bit when leaving the recovery
function, to allow further recoveries to take place. This allows removal
of clearing recovery bit on sq activate.
Fixes: de8650a820 ("net/mlx5e: Add tx reporter support")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Remove wrong error return value when SQ is not in error state.
CQE recovery on TX reporter queries the sq state. If the sq is not in
error state, the sq is either in ready or reset state. Ready state is
good state which doesn't require recovery and reset state is a temporal
state which ends in ready state. With this patch, CQE recovery in this
scenario is successful.
Fixes: de8650a820 ("net/mlx5e: Add tx reporter support")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Shift the tisn field in the WQE control segment, per the
HW specification.
Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Use the proper tisn field name from the union in struct mlx5_wqe_ctrl_seg.
Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The TLS progress params context WQE should not include an
Eth segment, drop it.
In addition, align the tls_progress_params layout with the
HW specification document:
- fix the tisn field name.
- remove the valid bit.
Fixes: a12ff35e0f ("net/mlx5: Introduce TLS TX offload hardware bits and structures")
Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Setting speed to 56GBASE is allowed only with auto-negotiation enabled.
This patch prevent setting speed to 56GBASE when auto-negotiation disabled.
Fixes: f62b8bb8f2 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality")
Signed-off-by: Mohamad Heib <mohamadh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Only support changing tx/rx pause frame setting if the net device
is the vport group manager.
Fixes: 3c2d18ef22 ("net/mlx5e: Support ethtool get/set_pauseparam")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We have an issue that OVS application creates an offloaded drop rule
that drops VXLAN traffic with both inner and outer header match
criteria. mlx5_core driver detects correctly the inner and outer
header match criteria but does not enable the inner header match criteria
due to an incorrect assumption in mlx5_eswitch_add_offloaded_rule that
only decap rule needs inner header criteria.
Solution:
Remove mlx5_esw_flow_attr's match_level and tunnel_match_level and add
two new members: inner_match_level and outer_match_level.
inner/outer_match_level is set to NONE if the inner/outer match criteria
is not specified in the tc rule creation request. The decap assumption is
removed and the code just needs to check for inner/outer_match_level to
enable the corresponding bit in firmware's match_criteria_enable value.
Fixes: 6363651d6d ("net/mlx5e: Properly set steering match levels for offloaded TC decap rules")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The current ARFS code relies on certain fields to be set in the SKB
(e.g. transport_header) and extracts IP addresses and ports by custom
code that parses the packet. The necessary SKB fields, however, are not
always set at that point, which leads to an out-of-bounds access. Use
skb_flow_dissect_flow_keys() to get the necessary information reliably,
fix the out-of-bounds access and reuse the code.
Fixes: 18c908e477 ("net/mlx5e: Add accelerated RFS support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- Fix netlink dumping of all mcast_flags buckets, by Sven Eckelmann
- Fix deletion of RTR(4|6) mcast list entries, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark switch cases where we are expecting to fall through.
This patch fixes the following warnings (Building: haps_hs_defconfig arc):
arch/arc/kernel/unwind.c:827:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/arc/kernel/unwind.c:836:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
When powering off the Odroid N2, the tflash_vdd regulator is
automatically turned off by the kernel. This is a problem
when issuing the "reboot" command while using an SD card.
The boot ROM does not power this regulator back on, blocking
the reboot process at the boot ROM stage, preventing the
SD card from being detected.
Adding the "regulator-always-on" property fixes the problem.
Signed-off-by: Xavier Ruppen <xruppen@gmail.com>
Suggested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Fixes: c35f6dc5c3 ("arm64: dts: meson: Add minimal support for Odroid-N2")
[khilman: minor subject change: s/meson/amlogic/]
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
The G12A USB2 OTG capable PHY uses a 8bit large UTMI bus, and the OTG
controller gets the PHY but width by probing the associated phy.
By default it will use 16bit wide settings if a phy is not specified,
in our case we specified the phy, but not the phy-names.
The dwc2 bindings specifies that if phys is present, phy-names shall be
"usb2-phy".
Adding phy-names = "usb2-phy" solves the OTG PHY bus configuration.
Fixes: 9baf7d6be7 ("arm64: dts: meson: g12a: Add G12A USB nodes")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
We want to use DW AXI DMAC on HSDK board in our automated verification
to test cache & dma kernel code changes. This is perfect candidate
as we don't depend on any external peripherals like MMC card / USB
storage / etc.
To increase test coverage we want to test both options:
* DW AXI DMAC is connected through IOC port & dma direct ops used
* DW AXI DMAC is connected to DDR port & dma noncoherent ops used
Introduce 'arc_hsdk_axi_dmac_coherent' global variable which can be
modified by debugger (same way as we patch 'ioc_enable') to switch
between these options without recompiling the kernel.
Depend on this value we tweak memory bridge configuration and
"dma-coherent" DTS property of DW AXI DMAC.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This makes the previously added 'encap test' pass.
Because its possible that the xfrm dst entry becomes stale while such
a flow is offloaded, we need to call dst_check() -- the notifier that
handles this for non-tunneled traffic isn't sufficient, because SA or
or policies might have changed.
If dst becomes stale the flow offload entry will be tagged for teardown
and packets will be passed to 'classic' forwarding path.
Removing the entry right away is problematic, as this would
introduce a race condition with the gc worker.
In case flow is long-lived, it could eventually be offloaded again
once the gc worker removes the entry from the flow table.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
'flow offload' expression should not offload flows that will be subject
to ipsec, but it does.
This results in a connectivity blackhole for the affected flows -- first
packets will go through (offload happens after established state is
reached), but all remaining ones bypass ipsec encryption and are thus
discarded by the peer.
This can be worked around by adding "rt ipsec exists accept"
before the 'flow offload' rule matches.
This test case will fail, support for such flows is added in
next patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Exception handlers call FAKE_RET_FROM_EXCPN to
- clear AE bit: drop down from exception active to pure kernel mode
allowing further excptions
- set IE bit: re-enable interrupts
It additionally also clears U bit (user mode) and DE bit (delay slot
execution) which is redundant as hardware does that already on any taken
exception. Morevoer the current software clearing is bogus anyways as
the KFLAG instruction being used for purpose can't possibly write those
bits anyways.
So don't pretend to clear them.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote changelog]
Before this change the device tree description of qspi node for
second memory on BK4 board was wrong (applicable to old, removed
fsl-quadspi.c driver).
As a result this memory was not recognized correctly when used
with the new spi-fsl-qspi.c driver.
From the dt-bindings:
"Required SPI slave node properties:
- reg: There are two buses (A and B) with two chip selects each.
This encodes to which bus and CS the flash is connected:
<0>: Bus A, CS 0
<1>: Bus A, CS 1
<2>: Bus B, CS 0
<3>: Bus B, CS 1"
According to above with new driver the second SPI-NOR memory shall
have reg=<2> as it is connected to Bus B, CS 0.
Fixes: a67d2c52a8 ("ARM: dts: Add support for Liebherr's BK4 device (vf610 based)")
Suggested-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Add 2 tests that check JIT code generation to jumps to 1st insn.
1st test is similar to syzbot reproducer.
The backwards branch is never taken at runtime.
2nd test has branch to 1st insn that executes.
The test is written as two bpf functions, since it's not possible
to construct valid single bpf program that jumps to 1st insn.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Introduction of bounded loops exposed old bug in x64 JIT.
JIT maintains the array of offsets to the end of all instructions to
compute jmp offsets.
addrs[0] - offset of the end of the 1st insn (that includes prologue).
addrs[1] - offset of the end of the 2nd insn.
JIT didn't keep the offset of the beginning of the 1st insn,
since classic BPF didn't have backward jumps and valid extended BPF
couldn't have a branch to 1st insn, because it didn't allow loops.
With bounded loops it's possible to construct a valid program that
jumps backwards to the 1st insn.
Fix JIT by computing:
addrs[0] - offset of the end of prologue == start of the 1st insn.
addrs[1] - offset of the end of 1st insn.
v1->v2:
- Yonghong noticed a bug in jit linfo.
Fix it by passing 'addrs + 1' to bpf_prog_fill_jited_linfo(),
since it expects insn_to_jit_off array to be offsets to last byte.
Reported-by: syzbot+35101610ff3e83119b1b@syzkaller.appspotmail.com
Fixes: 2589726d12 ("bpf: introduce bounded loops")
Fixes: 0a14842f5a ("net: filter: Just In Time compiler for x86-64")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
5d01ab7bac ("libbpf: fix erroneous multi-closing of BTF FD")
introduced backwards-compatibility issue, manifesting itself as -E2BIG
error returned on program load due to unknown non-zero btf_fd attribute
value for BPF_PROG_LOAD sys_bpf() sub-command.
This patch fixes bug by ensuring that we only ever associate BTF FD with
program if there is a BTF.ext data that was successfully loaded into
kernel, which automatically means kernel supports func_info/line_info
and associated BTF FD for progs (checked and ensured also by BTF
sanitization code).
Fixes: 5d01ab7bac ("libbpf: fix erroneous multi-closing of BTF FD")
Reported-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Having static variable `cpus` in libbpf_num_possible_cpus function
without guarding it with mutex makes this function thread-unsafe.
If multiple threads accessing this function, in the current form; it
leads to incrementing the static variable value `cpus` in the multiple
of total available CPUs.
Used local stack variable to calculate the number of possible CPUs and
then updated the static variable using WRITE_ONCE().
Changes since v1:
* added stack variable to calculate cpus
* serialized static variable update using WRITE_ONCE()
* fixed Fixes tag
Fixes: 6446b31555 ("bpf: add a new API libbpf_num_possible_cpus()")
Signed-off-by: Takshak Chahande <ctakshak@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Some arches (e.g., arm64, x86) have moved towards non-executable
module_alloc() allocations for security hardening reasons. That means
that the module loader will need to set the text section of a module to
executable, regardless of whether or not CONFIG_STRICT_MODULE_RWX is set.
When CONFIG_STRICT_MODULE_RWX=y, module section allocations are always
page-aligned to handle memory rwx permissions. On some arches with
CONFIG_STRICT_MODULE_RWX=n however, when setting the module text to
executable, the BUG_ON() in frob_text() gets triggered since module
section allocations are not page-aligned when CONFIG_STRICT_MODULE_RWX=n.
Since the set_memory_* API works with pages, and since we need to call
set_memory_x() regardless of whether CONFIG_STRICT_MODULE_RWX is set, we
might as well page-align all module section allocations for ease of
managing rwx permissions of module sections (text, rodata, etc).
Fixes: 2eef1399a8 ("modules: fix BUG when load module with rodata=n")
Reported-by: Martin Kaiser <lists@kaiser.cx>
Reported-by: Bartosz Golaszewski <brgl@bgdev.pl>
Tested-by: David Lechner <david@lechnology.com>
Tested-by: Martin Kaiser <martin@kaiser.cx>
Tested-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Kernel crashes during suspend due to wrong conversion in
suspend and resume functions.
Use the proper helper to get ishtp_cl_device instance.
Cc: <stable@vger.kernel.org> # 5.2.x: b12bbdc5: HID: intel-ish-hid: fix wrong driver_data usage
Signed-off-by: Hyungwoo Yang <hyungwoo.yang@intel.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Libbpf stores associated BTF FD per each instance of bpf_program. When
program is unloaded, that FD is closed. This is wrong, because leads to
a race and possibly closing of unrelated files, if application
simultaneously opens new files while bpf_programs are unloaded.
It's also unnecessary, because struct btf "owns" that FD, and
btf__free(), called from bpf_object__close() will close it. Thus the fix
is to never have per-program BTF FD and fetch it from obj->btf, when
necessary.
Fixes: 2993e0515b ("tools/bpf: add support to read .BTF.ext sections")
Reported-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We are currently using a wrong register for dcan revision. Although
this is currently only used for detecting the dcan module, let's
fix it to avoid confusion.
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The ti,no-idle-on-init and ti,no-reset-on-init flags need to be at
the interconnect target module level for the modules that have it
defined. Otherwise we get the following warnings:
dts flag should be at module level for ti,no-idle-on-init
dts flag should be at module level for ti,no-reset-on-init
Reviewed-by: Suman Anna <s-anna@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
We have cases where there are no softreset bits like with am335x lcdc.
In that case ti,sysc-mask = <0> needs to be handled properly.
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
For some devices we can get the following warning on boot:
ti-sysc 48485200.target-module: sysc_disable_module: invalid midlemode
Fix this by treating SYSC_IDLE_FORCE like we do for the other bits
for idlemodes mask.
Fixes: d59b60564c ("bus: ti-sysc: Add generic enable/disable functions")
Cc: Roger Quadros <rogerq@ti.com>
Reviewed-by: Suman Anna <s-anna@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
TRM says PWMSS_SYSCONFIG bit for SOFTRESET changes to zero when
reset is completed. Let's configure it as otherwise we get warnings
on boot when we check the data against dts provided data. Eventually
the legacy platform data will be just dropped, but let's fix the
warning first.
Reviewed-by: Suman Anna <s-anna@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The multicast code uses the lists bat_priv->mcast.want_all_rtr*_list to
store all all originator nodes which don't have the flag no-RTR4 or no-RTR6
set. When an originator is purged, it has to be removed from these lists.
Since all entries without the BATADV_MCAST_WANT_NO_RTR4/6 are stored in
these lists, they have to be handled like entries which have these flags
set to force the update routines to remove them from the lists when purging
the originator.
Not doing so will leave a pointer to a freed memory region inside the list.
Trying to operate on these lists will then cause an use-after-free error:
BUG: KASAN: use-after-free in batadv_mcast_want_rtr4_update+0x335/0x3a0 [batman_adv]
Write of size 8 at addr ffff888007b41a38 by task swapper/0/0
Fixes: 61caf3d109 ("batman-adv: mcast: detect, distribute and maintain multicast router presence")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The bucket variable is only updated outside the loop over the mcast_flags
buckets. It will only be updated during a dumping run when the dumping has
to be interrupted and a new message has to be started.
This could result in repeated or missing entries when the multicast flags
are dumped to userspace.
Fixes: d2d489b7d8 ("batman-adv: Add inconsistent multicast netlink dump detection")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
If UHS speed modes are enabled, a compatible SD card switches down to
1.8V during enumeration. If after this a software reboot/crash takes
place and on-chip ROM tries to enumerate the SD card, the difference in
IO voltages (host @ 3.3V and card @ 1.8V) may end up damaging the card.
The fix for this is to have support for power cycling the card in
hardware (with a PORz/soft-reset line causing a power cycle of the
card). Because the beaglebone X15 (rev A,B and C), am57xx-idks and
am57xx-evms don't have this capability, disable voltage switching for
these boards.
The major effect of this is that the maximum supported speed
mode is now high speed(50 MHz) down from SDR104(200 MHz).
commit 88a748419b ("ARM: dts: am57xx-idk: Remove support for voltage
switching for SD card") did this only for idk boards. Do it for all
affected boards.
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.