Pull MIPS fixes from Ralf Baechle:
"Here's a final round of fixes for 4.12:
- Fix misordered instructions in assembly code making kenel startup
via UHB unreliable.
- Fix special case of MADDF and MADDF emulation.
- Fix alignment issue in address calculation in pm-cps on 64 bit.
- Fix IRQ tracing & lockdep when rescheduling
- Systems with MAARs require post-DMA cache flushes.
The reordering fix and the MADDF/MSUBF fix have sat in linux-next for
a number of days. The others haven't propagated from my pull tree to
linux-next yet but all have survived manual testing and Imagination's
automated test system and there are no pending bug reports"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Avoid accidental raw backtrace
MIPS: Perform post-DMA cache flushes on systems with MAARs
MIPS: Fix IRQ tracing & lockdep when rescheduling
MIPS: pm-cps: Drop manual cache-line alignment of ready_count
MIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately
MIPS: head: Reorder instructions missing a delay slot
Pull ARM fix from Russell King:
"One final fix for 4.12 - Doug found a boot failure case triggered by
requesting a non-even MB vmalloc size"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8685/1: ensure memblock-limit is pmd-aligned
Pull x86 fixes from Thomas Gleixner:
"Fixlets for x86:
- Prevent kexec crash when KASLR is enabled, which was caused by an
address calculation bug
- Restore the freeing of PUDs on memory hot remove
- Correct a negated pointer check in the intel uncore performance
monitoring driver
- Plug a memory leak in an error exit path in the RDT code"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/intel_rdt: Fix memory leak on mount failure
x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bug
x86/boot/KASLR: Add checking for the offset of kernel virtual address randomization
perf/x86/intel/uncore: Fix wrong box pointer check
x86/mm/hotplug: Fix BUG_ON() after hot-remove by not freeing PUD
Pull perf fix from Thomas Gleixner:
"The last fix for perf for this cycles:
- Prevent a segfault when kernel.kptr_restrict=2 is set by avoiding a
null pointer dereference"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf machine: Fix segfault for kernel.kptr_restrict=2
Pull pinctrl fix from Linus Walleij:
"Brian noticed that this regression has not got a proper fix for the
entire merge window and consequently we need to revert the offending
commit.
It's part of the RT-mainstream work, the dance goes like this, two
steps forward, one step back.
Summary:
- A last fix for v4.12, an IRQ problem reported early in the merge
window appears not to have been properly fixed, so the offending
commit will be reverted and we will find the proper fix for v4.13.
Hopefully"
* tag 'pinctrl-v4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
Revert "pinctrl: rockchip: avoid hardirq-unsafe functions in irq_chip"
Pull last minute fixes for GPIO from Linus Walleij:
- Fix another ACPI problem with broken BIOSes.
- Filter out the right GPIO events, making a very user-visible bug go
away.
* tag 'gpio-v4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: acpi: Skip _AEI entries without a handler rather then aborting the scan
gpiolib: fix filtering out unwanted events
Pull last-minute tracing fixes from Steven Rostedt:
"Two fixes:
One is for a crash when using the :mod: trace probe command into
stack_trace_filter. This bug was introduced during the last merge
window.
The other was there forever. It's a small bug that makes it impossible
to name a module function for kprobes when the module starts with a
digit"
* tag 'trace-v4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Allow to create probe with a module name starting with a digit
ftrace: Fix regression with module command in stack_trace_filter
uapi/linux/a.out.h uses a number of predefined macros that are
deprecated because they're in the application namespace
(e.g. '#ifdef linux' instead of '#ifdef __linux__').
This patch either corrects or just removes them if they are not
applicable to Linux.
The primary reason this is worth bothering to fix, considering how
obsolete a.out binary support is, is that the GCC build process
considers this such a severe error that it will copy the header into a
private directory and change the macro names, which causes future
updates to the header to be masked. This header probably doesn't get
updated very often anymore, but it is the _only_ uapi header that gets
this treatment, so IMHO it is worth patching just to drive that number
all the way to zero.
Signed-off-by: Zack Weinberg <zackw@panix.com>
[hch: removed dead conditionals]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull powerpc fixes from Michael Ellerman:
"Hopefully the last two powerpc fixes for 4.12.
The CXL one is larger than I'd usually send at rc7, but it fixes new
code this cycle, so better to have it working for the release. It was
actually sent a few weeks back but got blocked in testing behind
another fix that was causing issues.
We are still tracking one crash in v4.12-rc7, but only one person has
reproduced it and the commit identified by bisect doesn't touch any of
the relevant code, so I think it's 50/50 whether that commit is
actually the problem or it's some code layout / toolchain issue.
Two fixes for code we merged this cycle:
- cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
- Avoid miscompilation w/GCC 4.6.3 on 32-bit - don't inline
copy_to/from_user()
Thanks to Al Viro, Larry Finger, Christophe Lombard"
* tag 'powerpc-4.12-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/32: Avoid miscompilation w/GCC 4.6.3 - don't inline copy_to/from_user()
cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
Pull IOMMU fixes from Joerg Roedel:
"Two fixes:
- A fix for AMD IOMMU interrupt remapping code when IRQs are
forwarded directly to KVM guests
- Fixed check in the recently merged code to allow tboot with
Intel VT-d disabled"
* tag 'iommu-fixes-v4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Fix interrupt remapping when disable guest_mode
iommu/vt-d: Correctly disable Intel IOMMU force on
Pull sound fixes from Takashi Iwai:
"Two last-minute HD-audio fixes"
* tag 'sound-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix endless loop of codec configure
ALSA: hda - set input_path bitmap to zero after moving it to new place
Pull overlayfs fixes from Miklos Szeredi:
"Fix two bugs in copy-up code. One introduced in 4.11 and one in
4.12-rc"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: don't set origin on broken lower hardlink
ovl: copy-up: don't unlock between lookup and link
Kernel text KASLR is separated into physical address and virtual
address randomization. And for virtual address randomization, we
only randomiza to get an offset between 16M and KERNEL_IMAGE_SIZE.
So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR,
but not the original kernel loading address 'output'.
The bug will cause kernel boot failure if kernel is loaded at a different
position than the address, 16M, which is decided at compiled time.
Kexec/kdump is such practical case.
To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial
value.
Tested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately")
Link: http://lkml.kernel.org/r/1498567146-11990-3-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For kernel text KASLR, the virtual address is confined to area of 1G,
[0xffffffff80000000, 0xffffffffc0000000). For the implemenataion of
virtual address randomization, we only randomize to get an offset
between 16M and 1G, then add this offset to the starting address,
0xffffffff80000000. Here 16M is the offset which is decided at linking
stage. So the amount of the local variable 'virt_addr' which respresents
the offset plus the kernel output size can not exceed KERNEL_IMAGE_SIZE.
Add a debug check for the offset. If out of bounds, print error
message and hang there.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1498567146-11990-2-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since commit 81a76d7119 ("MIPS: Avoid using unwind_stack() with
usermode") show_backtrace() invokes the raw backtracer when
cp0_status & ST0_KSU indicates user mode to fix issues on EVA kernels
where user and kernel address spaces overlap.
However this is used by show_stack() which creates its own pt_regs on
the stack and leaves cp0_status uninitialised in most of the code paths.
This results in the non deterministic use of the raw back tracer
depending on the previous stack content.
show_stack() deals exclusively with kernel mode stacks anyway, so
explicitly initialise regs.cp0_status to KSU_KERNEL (i.e. 0) to ensure
we get a useful backtrace.
Fixes: 81a76d7119 ("MIPS: Avoid using unwind_stack() with usermode")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15+
Patchwork: https://patchwork.linux-mips.org/patch/16656/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Recent CPUs from Imagination Technologies such as the I6400 or P6600 are
able to speculatively fetch data from memory into caches. This means
that if used in a system with non-coherent DMA they require that caches
be invalidated after a device performs DMA, and before the CPU reads the
DMA'd data, in order to ensure that stale values weren't speculatively
prefetched.
Such CPUs also introduced Memory Accessibility Attribute Registers
(MAARs) in order to control the regions in which they are allowed to
speculate. Thus we can use the presence of MAARs as a good indication
that the CPU requires the above cache maintenance. Use the presence of
MAARs to determine the result of cpu_needs_post_dma_flush() in the
default case, in order to handle these recent CPUs correctly.
Note that the return type of cpu_needs_post_dma_flush() is changed to
bool, such that it's clearer what's happening when cpu_has_maar is cast
to bool for the return value. If this patch were backported to a
pre-v4.7 kernel then MIPS_CPU_MAAR was 1ull<<34, so when cast to an int
we would incorrectly return 0. It so happens that MIPS_CPU_MAAR is
currently 1ull<<30, so when truncated to an int gives a non-zero value
anyway, but even so the implicit conversion from long long int to bool
makes it clearer to understand what will happen than the implicit
conversion from long long int to int would. The bool return type also
fits this usage better semantically, so seems like an all-round win.
Thanks to Ed for spotting the issue for pre-v4.7 kernels & suggesting
the return type change.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Ed Blake <ed.blake@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16363/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When the scheduler sets TIF_NEED_RESCHED & we call into the scheduler
from arch/mips/kernel/entry.S we disable interrupts. This is true
regardless of whether we reach work_resched from syscall_exit_work,
resume_userspace or by looping after calling schedule(). Although we
disable interrupts in these paths we don't call trace_hardirqs_off()
before calling into C code which may acquire locks, and we therefore
leave lockdep with an inconsistent view of whether interrupts are
disabled or not when CONFIG_PROVE_LOCKING & CONFIG_DEBUG_LOCKDEP are
both enabled.
Without tracing this interrupt state lockdep will print warnings such
as the following once a task returns from a syscall via
syscall_exit_partial with TIF_NEED_RESCHED set:
[ 49.927678] ------------[ cut here ]------------
[ 49.934445] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3687 check_flags.part.41+0x1dc/0x1e8
[ 49.946031] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
[ 49.946355] CPU: 0 PID: 1 Comm: init Not tainted 4.10.0-00439-gc9fd5d362289-dirty #197
[ 49.963505] Stack : 0000000000000000 ffffffff81bb5d6a 0000000000000006 ffffffff801ce9c4
[ 49.974431] 0000000000000000 0000000000000000 0000000000000000 000000000000004a
[ 49.985300] ffffffff80b7e487 ffffffff80a24498 a8000000ff160000 ffffffff80ede8b8
[ 49.996194] 0000000000000001 0000000000000000 0000000000000000 0000000077c8030c
[ 50.007063] 000000007fd8a510 ffffffff801cd45c 0000000000000000 a8000000ff127c88
[ 50.017945] 0000000000000000 ffffffff801cf928 0000000000000001 ffffffff80a24498
[ 50.028827] 0000000000000000 0000000000000001 0000000000000000 0000000000000000
[ 50.039688] 0000000000000000 a8000000ff127bd0 0000000000000000 ffffffff805509bc
[ 50.050575] 00000000140084e0 0000000000000000 0000000000000000 0000000000040a00
[ 50.061448] 0000000000000000 ffffffff8010e1b0 0000000000000000 ffffffff805509bc
[ 50.072327] ...
[ 50.076087] Call Trace:
[ 50.079869] [<ffffffff8010e1b0>] show_stack+0x80/0xa8
[ 50.086577] [<ffffffff805509bc>] dump_stack+0x10c/0x190
[ 50.093498] [<ffffffff8015dde0>] __warn+0xf0/0x108
[ 50.099889] [<ffffffff8015de34>] warn_slowpath_fmt+0x3c/0x48
[ 50.107241] [<ffffffff801c15b4>] check_flags.part.41+0x1dc/0x1e8
[ 50.114961] [<ffffffff801c239c>] lock_is_held_type+0x8c/0xb0
[ 50.122291] [<ffffffff809461b8>] __schedule+0x8c0/0x10f8
[ 50.129221] [<ffffffff80946a60>] schedule+0x30/0x98
[ 50.135659] [<ffffffff80106278>] work_resched+0x8/0x34
[ 50.142397] ---[ end trace 0cb4f6ef5b99fe21 ]---
[ 50.148405] possible reason: unannotated irqs-off.
[ 50.154600] irq event stamp: 400463
[ 50.159566] hardirqs last enabled at (400463): [<ffffffff8094edc8>] _raw_spin_unlock_irqrestore+0x40/0xa8
[ 50.171981] hardirqs last disabled at (400462): [<ffffffff8094eb98>] _raw_spin_lock_irqsave+0x30/0xb0
[ 50.183897] softirqs last enabled at (400450): [<ffffffff8016580c>] __do_softirq+0x4ac/0x6a8
[ 50.195015] softirqs last disabled at (400425): [<ffffffff80165e78>] irq_exit+0x110/0x128
Fix this by using the TRACE_IRQS_OFF macro to call trace_hardirqs_off()
when CONFIG_TRACE_IRQFLAGS is enabled. This is done before invoking
schedule() following the work_resched label because:
1) Interrupts are disabled regardless of the path we take to reach
work_resched() & schedule().
2) Performing the tracing here avoids the need to do it in paths which
disable interrupts but don't call out to C code before hitting a
path which uses the RESTORE_SOME macro that will call
trace_hardirqs_on() or trace_hardirqs_off() as appropriate.
We call trace_hardirqs_on() using the TRACE_IRQS_ON macro before calling
syscall_trace_leave() for similar reasons, ensuring that lockdep has a
consistent view of state after we re-enable interrupts.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: linux-mips@linux-mips.org
Cc: stable <stable@vger.kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/15385/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We allocate memory for a ready_count variable per-CPU, which is accessed
via a cached non-coherent TLB mapping to perform synchronisation between
threads within the core using LL/SC instructions. In order to ensure
that the variable is contained within its own data cache line we
allocate 2 lines worth of memory & align the resulting pointer to a line
boundary. This is however unnecessary, since kmalloc is guaranteed to
return memory which is at least cache-line aligned (see
ARCH_DMA_MINALIGN). Stop the redundant manual alignment.
Besides cleaning up the code & avoiding needless work, this has the side
effect of avoiding an arithmetic error found by Bryan on 64 bit systems
due to the 32 bit size of the former dlinesz. This led the ready_count
variable to have its upper 32b cleared erroneously for MIPS64 kernels,
causing problems when ready_count was later used on MIPS64 via cpuidle.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 3179d37ee1 ("MIPS: pm-cps: add PM state entry code for CPS systems")
Reported-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: stable <stable@vger.kernel.org> # v3.16+
Patchwork: https://patchwork.linux-mips.org/patch/15383/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The pmd containing memblock_limit is cleared by prepare_page_table()
which creates the opportunity for early_alloc() to allocate unmapped
memory if memblock_limit is not pmd aligned causing a boot-time hang.
Commit 965278dcb8 ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
attempted to resolve this problem, but there is a path through the
adjust_lowmem_bounds() routine where if all memory regions start and
end on pmd-aligned addresses the memblock_limit will be set to
arm_lowmem_limit.
Since arm_lowmem_limit can be affected by the vmalloc early parameter,
the value of arm_lowmem_limit may not be pmd-aligned. This commit
corrects this oversight such that memblock_limit is always rounded
down to pmd-alignment.
Fixes: 965278dcb8 ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Pull networking fixes from David Miller:
1) Need to access netdev->num_rx_queues behind an accessor in netvsc
driver otherwise the build breaks with some configs, from Arnd
Bergmann.
2) Add dummy xfrm_dev_event() so that build doesn't fail when
CONFIG_XFRM_OFFLOAD is not set. From Hangbin Liu.
3) Don't OOPS when pfkey_msg2xfrm_state() signals an erros, from Dan
Carpenter.
4) Fix MCDI command size for filter operations in sfc driver, from
Martin Habets.
5) Fix UFO segmenting so that we don't calculate incorrect checksums,
from Michal Kubecek.
6) When ipv6 datagram connects fail, reset destination address and
port. From Wei Wang.
7) TCP disconnect must reset the cached receive DST, from WANG Cong.
8) Fix sign extension bug on 32-bit in dev_get_stats(), from Eric
Dumazet.
9) fman driver has to depend on HAS_DMA, from Madalin Bucur.
10) Fix bpf pointer leak with xadd in verifier, from Daniel Borkmann.
11) Fix negative page counts with GFO, from Michal Kubecek.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
sfc: fix attempt to translate invalid filter ID
net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()
bpf: prevent leaking pointer via xadd on unpriviledged
arcnet: com20020-pci: add missing pdev setup in netdev structure
arcnet: com20020-pci: fix dev_id calculation
arcnet: com20020: remove needless base_addr assignment
Trivial fix to spelling mistake in arc_printk message
arcnet: change irq handler to lock irqsave
rocker: move dereference before free
mlxsw: spectrum_router: Fix NULL pointer dereference
net: sched: Fix one possible panic when no destroy callback
virtio-net: serialize tx routine during reset
net: usb: asix88179_178a: Add support for the Belkin B2B128
fsl/fman: add dependency on HAS_DMA
net: prevent sign extension in dev_get_stats()
tcp: reset sk_rx_dst in tcp_disconnect()
net: ipv6: reset daddr and dport in sk if connect() fails
bnx2x: Don't log mc removal needlessly
bnxt_en: Fix netpoll handling.
bnxt_en: Add missing logic to handle TPA end error conditions.
...
Pull device mapper fixes from Mike Snitzer:
- dm thinp fix for crash that will occur when metadata device failure
races with discard passdown to the underlying data device.
- dm raid fix to not access the superblock's >= 1.9.0 'sectors' member
unconditionally.
* tag 'for-4.12/dm-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm thin: do not queue freed thin mapping for next stage processing
dm raid: fix oops on upgrading to extended superblock format
Pull block fixes from Jens Axboe:
"Two fixes that should go into this release.
One is an nvme regression fix from Keith, fixing a missing queue
freeze if the controller is being reset. This causes the reset to
hang.
The other is a fix for a leak of the bio protection info, if smaller
sized O_DIRECT is used. This fix should be more involved as we have
other problematic paths in the kernel, but given as this isn't a
regression in this series, we'll tackle those for 4.13"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: provide bio_uninit() free freeing integrity/task associations
nvme/pci: Fix stuck nvme reset
When filter insertion fails with no rollback, we were trying to convert
EFX_EF10_FILTER_ID_INVALID to an id to store in 'ids' (which is either
vlan->uc or vlan->mc). This would WARN_ON_ONCE and then record a bogus
filter ID of 0x1fff, neither of which is a good thing.
Fixes: 0ccb998bf4 ("sfc: fix filter_id misinterpretation in edge case")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently I started seeing warnings about pages with refcount -1. The
problem was traced to packets being reused after their head was merged into
a GRO packet by skb_gro_receive(). While bisecting the issue pointed to
commit c21b48cc1b ("net: adjust skb->truesize in ___pskb_trim()") and
I have never seen it on a kernel with it reverted, I believe the real
problem appeared earlier when the option to merge head frag in GRO was
implemented.
Handling NAPI_GRO_FREE_STOLEN_HEAD state was only added to GRO_MERGED_FREE
branch of napi_skb_finish() so that if the driver uses napi_gro_frags()
and head is merged (which in my case happens after the skb_condense()
call added by the commit mentioned above), the skb is reused including the
head that has been merged. As a result, we release the page reference
twice and eventually end up with negative page refcount.
To fix the problem, handle NAPI_GRO_FREE_STOLEN_HEAD in napi_frags_finish()
the same way it's done in napi_skb_finish().
Fixes: d7e8883cfc ("net: make GRO aware of skb->head_frag")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Leaking kernel addresses on unpriviledged is generally disallowed,
for example, verifier rejects the following:
0: (b7) r0 = 0
1: (18) r2 = 0xffff897e82304400
3: (7b) *(u64 *)(r1 +48) = r2
R2 leaks addr into ctx
Doing pointer arithmetic on them is also forbidden, so that they
don't turn into unknown value and then get leaked out. However,
there's xadd as a special case, where we don't check the src reg
for being a pointer register, e.g. the following will pass:
0: (b7) r0 = 0
1: (7b) *(u64 *)(r1 +48) = r0
2: (18) r2 = 0xffff897e82304400 ; map
4: (db) lock *(u64 *)(r1 +48) += r2
5: (95) exit
We could store the pointer into skb->cb, loose the type context,
and then read it out from there again to leak it eventually out
of a map value. Or more easily in a different variant, too:
0: (bf) r6 = r1
1: (7a) *(u64 *)(r10 -8) = 0
2: (bf) r2 = r10
3: (07) r2 += -8
4: (18) r1 = 0x0
6: (85) call bpf_map_lookup_elem#1
7: (15) if r0 == 0x0 goto pc+3
R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R6=ctx R10=fp
8: (b7) r3 = 0
9: (7b) *(u64 *)(r0 +0) = r3
10: (db) lock *(u64 *)(r0 +0) += r6
11: (b7) r0 = 0
12: (95) exit
from 7 to 11: R0=inv,min_value=0,max_value=0 R6=ctx R10=fp
11: (b7) r0 = 0
12: (95) exit
Prevent this by checking xadd src reg for pointer types. Also
add a couple of test cases related to this.
Fixes: 1be7f75d16 ("bpf: enable non-root eBPF programs")
Fixes: 17a5267067 ("bpf: verifier (add verifier core)")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Grzeschik says:
====================
arcnet: Collection of latest fixes
Here we sum up the recent fixes I collected on the way to use and
stabilise the framework. Part of it is an possible deadlock that we
prevent as well to fix the calculation of the dev_id that can be setup
by an rotary encoder. Beside that we added an trivial spelling patch and
fix some wrong and missing assignments that improves the code footprint.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We add the pdev data to the pci devices netdev structure. This way
the interface get consistent device names in the userspace (udev).
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dev_id was miscalculated. Only the two bits 4-5 are relevant for the
MA1 card. PCIARC1 and PCIFB2 use the four bits 4-7 for id selection.
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
My static checker complains that ofdpa_neigh_del() can sometimes free
"found". It just makes sense to use it first before deleting it.
Fixes: ecf244f753 ("rocker: fix maybe-uninitialized warning")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case a VLAN device is enslaved to a bridge we shouldn't create a
router interface (RIF) for it when it's configured with an IP address.
This is already handled by the driver for other types of netdevs, such
as physical ports and LAG devices.
If this IP address is then removed and the interface is subsequently
unlinked from the bridge, a NULL pointer dereference can happen, as the
original 802.1d FID was replaced with an rFID which was then deleted.
To reproduce:
$ ip link set dev enp3s0np9 up
$ ip link add name enp3s0np9.111 link enp3s0np9 type vlan id 111
$ ip link set dev enp3s0np9.111 up
$ ip link add name br0 type bridge
$ ip link set dev br0 up
$ ip link set enp3s0np9.111 master br0
$ ip address add dev enp3s0np9.111 192.168.0.1/24
$ ip address del dev enp3s0np9.111 192.168.0.1/24
$ ip link set dev enp3s0np9.111 nomaster
Fixes: 99724c18fc ("mlxsw: spectrum: Introduce support for router interfaces")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Petr Machata <petrm@mellanox.com>
Tested-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When qdisc fail to init, qdisc_create would invoke the destroy callback
to cleanup. But there is no check if the callback exists really. So it
would cause the panic if there is no real destroy callback like the qdisc
codel, fq, and so on.
Take codel as an example following:
When a malicious user constructs one invalid netlink msg, it would cause
codel_init->codel_change->nla_parse_nested failed.
Then kernel would invoke the destroy callback directly but qdisc codel
doesn't define one. It causes one panic as a result.
Now add one the check for destroy to avoid the possible panic.
Fixes: 87b60cfacf ("net_sched: fix error recovery at qdisc creation")
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't hold any tx lock when trying to disable TX during reset, this
would lead a use after free since ndo_start_xmit() tries to access
the virtqueue which has already been freed. Fix this by using
netif_tx_disable() before freeing the vqs, this could make sure no tx
after vq freeing.
Reported-by: Jean-Philippe Menil <jpmenil@gmail.com>
Tested-by: Jean-Philippe Menil <jpmenil@gmail.com>
Fixes commit f600b69050 ("virtio_net: Add XDP support")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Robert McCabe <robert.mccabe@rockwellcollins.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When doing the following command:
# echo ":mod:kvm_intel" > /sys/kernel/tracing/stack_trace_filter
it triggered a crash.
This happened with the clean up of probes. It required all callers to the
regex function (doing ftrace filtering) to have ops->private be a pointer to
a trace_array. But for the stack tracer, that is not the case.
Allow for the ops->private to be NULL, and change the function command
callbacks to handle the trace_array pointer being NULL as well.
Fixes: d2afd57a4b ("tracing/ftrace: Allow instances to have their own function probes")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
acpi_walk_resources will stop as soon as the callback passed in returns
an error status. On a x86 tablet I have the first GpioInt in the _AEI
resource list has no handler defined in the DSDT, causing
acpi_walk_resources to abort scanning the rest of the resource list,
which does define valid ACPI GPIO events.
This commit changes the return for not finding a handler from
AE_BAD_PARAMETER to AE_OK so that the rest of the resource list will
get scanned normally in case of missing event handlers.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
GPIOEVENT_REQUEST_BOTH_EDGES is not a single flag, but a binary OR of
GPIOEVENT_REQUEST_RISING_EDGE and GPIOEVENT_REQUEST_FALLING_EDGE.
The expression 'le->eflags & GPIOEVENT_REQUEST_BOTH_EDGES' we'll get
evaluated to true even if only one event type was requested.
Fix it by checking both RISING & FALLING flags explicitly.
Cc: stable@vger.kernel.org
Fixes: 61f922db72 ("gpio: userspace ABI for reading GPIO line events")
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The only user of thread_saved_pc() in non-arch-specific code was removed
in commit 8243d55977 ("sched/core: Remove pointless printout in
sched_show_task()"). Remove the implementations as well.
Some architectures use thread_saved_pc() in their arch-specific code.
Leave their thread_saved_pc() intact.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wen reports significant memory leaks with DIF and O_DIRECT:
"With nvme devive + T10 enabled, On a system it has 256GB and started
logging /proc/meminfo & /proc/slabinfo for every minute and in an hour
it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute
leaking.
/proc/meminfo | grep SUnreclaim...
SUnreclaim: 6752128 kB
SUnreclaim: 6874880 kB
SUnreclaim: 7238080 kB
....
SUnreclaim: 22307264 kB
SUnreclaim: 22485888 kB
SUnreclaim: 22720256 kB
When testcases with T10 enabled call into __blkdev_direct_IO_simple,
code doesn't free memory allocated by bio_integrity_alloc. The patch
fixes the issue. HTX has been run with +60 hours without failure."
Since __blkdev_direct_IO_simple() allocates the bio on the stack, it
doesn't go through the regular bio free. This means that any ancillary
data allocated with the bio through the stack is not freed. Hence, we
can leak the integrity data associated with the bio, if the device is
using DIF/DIX.
Fix this by providing a bio_uninit() and export it, so that we can use
it to free this data. Note that this is a minimal fix for this issue.
Any current user of bio's that are allocated outside of
bio_alloc_bioset() suffers from this issue, most notably some drivers.
We will fix those in a more comprehensive patch for 4.13. This also
means that the commit marked as being fixed by this isn't the real
culprit, it's just the most obvious one out there.
Fixes: 542ff7bf18 ("block: new direct I/O implementation")
Reported-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull NFS client bugfixes from Trond Myklebust:
"Bugfixes include:
- stable fix for exclusive create if the server supports the umask
attribute
- trunking detection should handle ERESTARTSYS/EINTR
- stable fix for a race in the LAYOUTGET function
- stable fix to revert "nfs_rename() handle -ERESTARTSYS dentry left
behind"
- nfs4_callback_free_slot() cannot call nfs4_slot_tbl_drain_complete()"
* tag 'nfs-for-4.12-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4.1: nfs4_callback_free_slot() cannot call nfs4_slot_tbl_drain_complete()
Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
NFSv4.1: Fix a race in nfs4_proc_layoutget
NFS: Trunking detection should handle ERESTARTSYS/EINTR
NFSv4.2: Don't send mode again in post-EXCLUSIVE4_1 SETATTR with umask
Pull drm fixes from Dave Airlie:
"This is the final set of fixes for -rc8, just a few i915 and one
vmwgfx ones.
I'm off on holidays for a week, so if anything shows up for fixes I've
asked Daniel or Sean Paul to herd it in the right direction"
[ The additional etnaviv fixes were already herded towards me as seen in
my previous pull - Linus ]
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr
drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations
drm/i915: Hold struct_mutex for per-file stats in debugfs/i915_gem_object
drm/i915: Retire the VMA's fence tracker before unbinding
Pull drm/etnaviv fixes from Lucas Stach:
"I realized I just missed the cut-off point for the final drm fixes
pull, but I have 2 more etnaviv fixes that need to go into 4.12, as
they fix fallout from the explicit sync work introduced in the last
merge window"
[ Pulling directly because Dave is on vacation. Noted by Daniel Vetter,
and acked by Dave Airlie - Linus ]
* 'etnaviv/fixes' of git://git.pengutronix.de/git/lst/linux:
drm/etnaviv: Fix implicit/explicit sync sense inversion
drm/etnaviv: fix submit flags getting overwritten by BO content
Pass-through devices to VM guest can get updated IRQ affinity
information via irq_set_affinity() when not running in guest mode.
Currently, AMD IOMMU driver in GA mode ignores the updated information
if the pass-through device is setup to use vAPIC regardless of guest_mode.
This could cause invalid interrupt remapping.
Also, the guest_mode bit should be set and cleared only when
SVM updates posted-interrupt interrupt remapping information.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Joerg Roedel <jroedel@suse.de>
Fixes: d98de49a53 ('iommu/amd: Enable vAPIC interrupt remapping mode by default')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When copying up a file that has multiple hard links we need to break any
association with the origin file. This makes copy-up be essentially an
atomic replace.
The new file has nothing to do with the old one (except having the same
data and metadata initially), so don't set the overlay.origin attribute.
We can relax this in the future when we are able to index upper object by
origin.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 3a1e819b4e ("ovl: store file handle of lower inode on copy up")
Nothing prevents mischief on upper layer while we are busy copying up the
data.
Move the lookup right before the looked up dentry is actually used.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 01ad3eb8a0 ("ovl: concurrent copy up of regular files")
Cc: <stable@vger.kernel.org> # v4.11
azx_codec_configure() loops over the codecs found on the given
controller via a linked list. The code used to work in the past, but
in the current version, this may lead to an endless loop when a codec
binding returns an error.
The culprit is that the snd_hda_codec_configure() unregisters the
device upon error, and this eventually deletes the given codec object
from the bus. Since the list is initialized via list_del_init(), the
next object points to the same device itself. This behavior change
was introduced at splitting the HD-audio code code, and forgotten to
adapt it here.
For fixing this bug, just use a *_safe() version of list iteration.
Fixes: d068ebc25e ("ALSA: hda - Move some codes up to hdac_bus struct")
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
We were reading the no-implicit sync flag the wrong way around,
synchronizing too much for the explicit case, and not at all for the
implicit case. Oops.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The addition of the flags member to etnaviv_gem_submit structure didn't
take into account that the last member of this structure is a variable
length array.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Just a few minor fixes. Important one is the execbuf async fix (aka
ANDROID_native_sync). There was another patch for a display coherency
corner case on APL, but we've random-walked in that space too much,
and the cherry-pick looked really invasive.
* tag 'drm-intel-fixes-2017-06-27' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations
drm/i915: Hold struct_mutex for per-file stats in debugfs/i915_gem_object
drm/i915: Retire the VMA's fence tracker before unbinding
Single vmwgfx fix
* 'vmwgfx-fixes-4.12' of git://people.freedesktop.org/~thomash/linux:
drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr
Recently we met a problem, the codec has valid adcs and input pins,
and they can form valid input paths, but the driver does not build
valid controls for them like "Mic boost", "Capture Volume" and
"Capture Switch".
Through debugging, I found the driver needs to shrink the invalid
adcs and input paths for this machine, so it will move the whole
column bitmap value to the previous column, after moving it, the
driver forgets to set the original column bitmap value to zero, as a
result, the driver will invalidate the path whose index value is the
original colume bitmap value. After executing this function, all
valid input paths are invalidated by a mistake, there are no any
valid input paths, so the driver won't build controls for them.
Fixes: 3a65bcdc57 ("ALSA: hda - Fix inconsistent input_paths after ADC reduction")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The current code works only for the case where we have exactly one slot,
which is no longer true.
nfs4_free_slot() will automatically declare the callback channel to be
drained when all slots have been returned.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If the task calling layoutget is signalled, then it is possible for the
calls to nfs4_sequence_free_slot() and nfs4_layoutget_prepare() to race,
in which case we leak a slot.
The fix is to move the call to nfs4_sequence_free_slot() into the
nfs4_layoutget_release() so that it gets called at task teardown time.
Fixes: 2e80dbe7ac ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
The controller state is set to resetting prior to disabling the
controller, so this patch accounts for that state when deciding if it
needs to freeze the queues. Without this, an 'nvme reset /dev/nvme0'
blocks forever because the queues were never frozen.
Fixes: 82b057caef ("nvme-pci: fix multiple ctrl removal scheduling")
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The Belkin B2B128 is a USB 3.0 Hub + Gigabit Ethernet Adapter, the
Ethernet adapter uses the ASIX AX88179 USB 3.0 to Gigabit Ethernet
chip supported by this driver, add the USB ID for the same.
This patch is based on work by Geoffrey Tran <geoffrey.tran@gmail.com>
who has indicated they would like this upstreamed by someone more
familiar with the upstreaming process.
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A previous commit (5567e98919) inserted a dependency on DMA
API that requires HAS_DMA to be added in Kconfig.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
process_prepared_discard_passdown_pt1() should cleanup
dm_thin_new_mapping in cases of error.
dm_pool_inc_data_range() can fail trying to get a block reference:
metadata operation 'dm_pool_inc_data_range' failed: error = -61
When dm_pool_inc_data_range() fails, dm thin aborts current metadata
transaction and marks pool as PM_READ_ONLY. Memory for thin mapping
is released as well. However, current thin mapping will be queued
onto next stage as part of queue_passdown_pt2() or passdown_endio().
This dangling thin mapping memory when processed and accessed in
next stage will lead to device mapper crashing.
Code flow without fix:
-> process_prepared_discard_passdown_pt1(m)
-> dm_thin_remove_range()
-> discard passdown
--> passdown_endio(m) queues m onto next stage
-> dm_pool_inc_data_range() fails, frees memory m
but does not remove it from next stage queue
-> process_prepared_discard_passdown_pt2(m)
-> processes freed memory m and crashes
One such stack:
Call Trace:
[<ffffffffa037a46f>] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison]
[<ffffffffa039b6dc>] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool]
[<ffffffffa039b88b>] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool]
[<ffffffffa0399611>] process_prepared+0x81/0xa0 [dm_thin_pool]
[<ffffffffa039e735>] do_worker+0xc5/0x820 [dm_thin_pool]
[<ffffffff8152bf54>] ? __schedule+0x244/0x680
[<ffffffff81087e72>] ? pwq_activate_delayed_work+0x42/0xb0
[<ffffffff81089f53>] process_one_work+0x153/0x3f0
[<ffffffff8108a71b>] worker_thread+0x12b/0x4b0
[<ffffffff8108a5f0>] ? rescuer_thread+0x350/0x350
[<ffffffff8108fd6a>] kthread+0xca/0xe0
[<ffffffff8108fca0>] ? kthread_park+0x60/0x60
[<ffffffff81530b45>] ret_from_fork+0x25/0x30
The fix is to first take the block ref count for discarded block and
then do a passdown discard of this block. If block ref count fails,
then bail out aborting current metadata transaction, mark pool as
PM_READ_ONLY and also free current thin mapping memory (existing error
handling code) without queueing this thin mapping onto next stage of
processing. If block ref count succeeds, then passdown discard of this
block. Discard callback of passdown_endio() will queue this thin mapping
onto next stage of processing.
Code flow with fix:
-> process_prepared_discard_passdown_pt1(m)
-> dm_thin_remove_range()
-> dm_pool_inc_data_range()
--> if fails, free memory m and bail out
-> discard passdown
--> passdown_endio(m) queues m onto next stage
Cc: stable <stable@vger.kernel.org> # v4.9+
Reviewed-by: Eduardo Valentin <eduval@amazon.com>
Reviewed-by: Cristian Gafton <gafton@amazon.com>
Reviewed-by: Anchal Agarwal <anchalag@amazon.com>
Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com>
Reviewed-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Similar to the fix provided by Dominik Heidler in commit
9b3dc0a17d ("l2tp: cast l2tp traffic counter to unsigned")
we need to take care of 32bit kernels in dev_get_stats().
When using atomic_long_read(), we add a 'long' to u64 and
might misinterpret high order bit, unless we cast to unsigned.
Fixes: caf586e5f2 ("net: add a core netdev->rx_dropped counter")
Fixes: 015f0688f5 ("net: net: add a core netdev->tx_dropped counter")
Fixes: 6e7333d315 ("net: add rx_nohandler stat counter")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ARM fixes from Russell King:
"Three more fixes:
- Fix the previous fix merged in the last pull for the Thumb2
decompressor.
- A fix from Vladimir to correctly identify the V7M cache type.
- The optimised 3G vmsplit case does not work with LPAE, so don't
allow this to be selected for LPAE configurations"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8682/1: V7M: Set cacheid iff DminLine or IminLine is nonzero
ARM: 8681/1: make VMSPLIT_3G_OPT depends on !ARM_LPAE
ARM: 8680/1: boot/compressed: fix inappropriate Thumb2 mnemonic for __nop
Pull c6x fixlet from Mark Salter:
"Update maintainer email"
* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
MAINTAINERS: update email address for C6x maintainer
Pull s390 bugfix from Martin Schwidefsky:
"One last s390 patch for 4.12
Revert the re-IPL semantics back to the v4.7 state. It turned out that
the memory layout may change due to memory hotplug if load-normal is
used"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL
Michael reported the segfault when kernel.kptr_restrict=2 is set.
$ perf record ls
...
perf: Segmentation fault
Obtained 16 stack frames.
./perf(dump_stack+0x2d) [0x5068df]
./perf(sighandler_dump_stack+0x2d) [0x5069bf]
./perf() [0x43e47b]
/lib64/libc.so.6(+0x3594f) [0x7f762004794f]
/lib64/libc.so.6(strlen+0x26) [0x7f762009ef86]
/lib64/libc.so.6(__strdup+0xd) [0x7f762009ecbd]
./perf(maps__set_kallsyms_ref_reloc_sym+0x4d) [0x51590f]
./perf(machine__create_kernel_maps+0x136) [0x50a7de]
./perf(perf_session__create_kernel_maps+0x2c) [0x510a81]
./perf(perf_session__new+0x13d) [0x510e23]
./perf() [0x43fd61]
./perf(cmd_record+0x704) [0x441823]
./perf() [0x4bc1a0]
./perf() [0x4bc40d]
./perf() [0x4bc55f]
./perf(main+0x2d5) [0x4bc939]
Segmentation fault (core dumped)
The reason is that with kernel.kptr_restrict=2, we don't get
the symbol from machine__get_running_kernel_start, which we
want to use in maps__set_kallsyms_ref_reloc_sym and we crash.
Check the symbol name value before calling
maps__set_kallsyms_ref_reloc_sym() and succeed without ref_reloc_sym
being set. It's safe because we check its existence before we use it.
Reported-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170626095153.553-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Larry Finger reported that his Powerbook G4 was no longer booting with v4.12-rc,
userspace was up but giving weird errors such as:
udevd[64]: starting version 175
udevd[64]: Unable to receive ctrl message: Bad address.
modprobe: chdir(4.12-rc1): No such file or directory
He bisected the problem to commit 3448890c32 ("powerpc: get rid of zeroing,
switch to RAW_COPY_USER").
Al identified that the problem is actually a miscompilation by GCC 4.6.3, which
is exposed by the above commit.
Al also pointed out that inlining copy_to/from_user() is probably of little or
no benefit, which is correct. Using Anton's copy_to_user benchmark, with a
pathological single byte copy, we see a small increase in performance
by *removing* inlining:
Before (inlined):
# time ./copy_to_user -w -l 1 -i 10000000 ( x 3 )
real 0m22.063s
real 0m22.059s
real 0m22.076s
After:
# time ./copy_to_user -w -l 1 -i 10000000 ( x 3 )
real 0m21.325s
real 0m21.299s
real 0m21.364s
So as a small performance improvement and to avoid the miscompilation, drop
inlining copy_to/from_user() on 32-bit.
Fixes: 3448890c32 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER")
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The hash table created during vmw_cmdbuf_res_man_create was
never freed. This causes memory leak in context creation.
Added the corresponding drm_ht_remove in vmw_cmdbuf_res_man_destroy.
Tested for memory leak by running piglit overnight and kernel
memory is not inflated which earlier was.
Cc: <stable@vger.kernel.org>
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
If we write a relocation into the buffer, we require our own implicit
synchronisation added after the start of the execbuf, outside of the
user's control. As we may end up clflushing, or doing the patch itself
on the GPU, asynchronously we need to look at the implicit serialisation
on obj->resv and hence need to disable EXEC_OBJECT_ASYNC for this
object.
If the user does trigger a stall for relocations, we make sure the stall
is complete enough so that the batch is not submitted before we complete
those relocations.
Fixes: 77ae995789 ("drm/i915: Enable userspace to opt-out of implicit fencing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 071750e550)
[danvet: Resolve conflicts, resolution reviewed by Tvrtko on irc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pull x86 fix from Thomas Gleixner:
"A single fix to unbreak the vdso32 build for 64bit kernels caused by
excess #includes in the mshyperv header"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mshyperv: Remove excess #includes from mshyperv.h
Pull timer fixes from Thomas Gleixner:
"A few fixes for timekeeping and timers:
- Plug a subtle race due to a missing READ_ONCE() in the timekeeping
code where reloading of a pointer results in an inconsistent
callback argument being supplied to the clocksource->read function.
- Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the
time keeping core code, to prevent a possible discontuity.
- Apply a similar fix to the arm64 vdso clock_gettime()
implementation
- Add missing includes to clocksource drivers, which relied on
indirect includes which fails in certain configs.
- Use the proper iomem pointer for read/iounmap in a probe function"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW
time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
time: Fix clock->read(clock) race around clocksource changes
clocksource: Explicitly include linux/clocksource.h when needed
clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
Pull perf fixes from Thomas Gleixner:
"Three fixlets for perf:
- Return the proper error code if aux buffers for a event are not
supported.
- Calculate the probe offset for inlined functions correctly
- Update the Skylake DTLB load/store miss event so it can count 1G
TLB entries as well"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf probe: Fix probe definition for inlined functions
perf/x86/intel: Add 1G DTLB load/store miss support for SKL
perf/aux: Correct return code of rb_alloc_aux() if !has_aux(ev)
Pull irq fix from Thomas Gleixner:
"A single fix for the MIPS GIC to prevent ftrace recursion"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mips-gic: Mark count and compare accessors notrace
Pull input fixes from Dmitry Torokhov:
- a quirk to i8042 to ignore timeout bit on Lifebook AH544
- a fixup to Synaptics RMI function 54 that was breaking some Dells
- a fix for memory leak in soc_button_array driver
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - only read the F54 query registers which are used
Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list
Input: soc_button_array - fix leaking the ACPI button descriptor buffer
Pull SCSI target fixes from Nicholas Bellinger:
"Here are the target-pending fixes for v4.12-rc7 that have been queued
up for the last 2 weeks. This includes:
- Fix a TMR related kref underflow detected by the recent refcount_t
conversion in upstream.
- Fix a iscsi-target corner case during explicit connection logout
timeout failure.
- Address last fallout in iscsi-target immediate data handling from
v4.4 target-core now allowing control CDB payload underflow"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target: Reject immediate data underflow larger than SCSI transfer length
iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP
target: Fix kref->refcount underflow in transport_cmd_finish_abort
We have to reset the sk->sk_rx_dst when we disconnect a TCP
connection, because otherwise when we re-connect it this
dst reference is simply overridden in tcp_finish_connect().
This fixes a dst leak which leads to a loopback dev refcnt
leak. It is a long-standing bug, Kevin reported a very similar
(if not same) bug before. Thanks to Andrei for providing such
a reliable reproducer which greatly narrows down the problem.
Fixes: 41063e9dd1 ("ipv4: Early TCP socket demux.")
Reported-by: Andrei Vagin <avagin@gmail.com>
Reported-by: Kevin Xu <kaiwen.xu@hulu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In __ip6_datagram_connect(), reset sk->sk_v6_daddr and inet->dport if
error occurs.
In udp_v6_early_demux(), check for sk_state to make sure it is in
TCP_ESTABLISHED state.
Together, it makes sure unconnected UDP socket won't be considered as a
valid candidate for early demux.
v3: add TCP_ESTABLISHED state check in udp_v6_early_demux()
v2: fix compilation error
Fixes: 5425077d73 ("net: ipv6: Add early demux handler for UDP unicast")
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When mc configuration changes bnx2x_config_mcast() can return 0 for
success, negative for failure and positive for benign reason preventing
its immediate work, e.g., when the command awaits the completion of
a previously sent command.
When removing all configured macs on a 578xx adapter, if a positive
value would be returned driver would errneously log it as an error.
Fixes: c7b7b483cc ("bnx2x: Don't flush multicast MACs")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull Kbuild fixes from Masahiro Yamada:
"Nothing scary, just some random fixes:
- fix warnings of host programs
- fix "make tags" when COMPILED_SOURCE=1 is specified along with O=
- clarify help message of C=1 option
- fix dependency for ncurses compatibility check
- fix "make headers_install" for fakechroot environment"
* tag 'kbuild-fixes-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: fix sparse warnings in nconfig
kbuild: fix header installation under fakechroot environment
kconfig: Check for libncurses before menuconfig
Kbuild: tiny correction on `make help`
tags: honor COMPILED_SOURCE with apart output directory
genksyms: add printf format attribute to error_with_pos()
Pull timer fix from Eric Biederman:
"This fixes an issue of confusing injected signals with the signals
from posix timers that has existed since posix timers have been in the
kernel.
This patch is slightly simpler than my earlier version of this patch
as I discovered in testing that I had misspelled "#ifdef
CONFIG_POSIX_TIMERS". So I deleted that unnecessary test and made
setting of resched_timer uncondtional.
I have tested this and verified that without this patch there is a
nasty hang that is easy to trigger, and with this patch everything
works properly"
Thomas Gleixner dixit:
"It fixes the problem at hand and covers the ptrace case as well, which
I missed.
Reviewed-and-tested-by: Thomas Gleixner <tglx@linutronix.de>"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Only reschedule timers on signals timers have sent
A recent commit included linux/slab.h in linux/irq.h. This breaks the build
of vdso32 on a 64-bit kernel.
The reason is that linux/irq.h gets included into the vdso code via
linux/interrupt.h which is included from asm/mshyperv.h. That makes the
32-bit vdso compile fail, because slab.h includes the pgtable headers for
64-bit on a 64-bit build.
Neither linux/clocksource.h nor linux/interrupt.h are needed in the
mshyperv.h header file itself - it has a dependency on <linux/atomic.h>.
Remove the includes and unbreak the build.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: devel@linuxdriverproject.org
Fixes: dee863b571 ("hv: export current Hyper-V clocksource")
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1706231038460.2647@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.12. Most of these actually came in last
week but got held up for some more testing.
- three fixes for kprobes/ftrace/livepatch interactions.
- properly handle data breakpoints when using the Radix MMU.
- fix for perf sampling of registers during call_usermodehelper().
- properly initialise the thread_info on our emergency stacks
- add an explicit flush when doing TLB invalidations for a process
using NPU2.
Thanks to: Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi
Bangoria, Masami Hiramatsu"
* tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64: Initialise thread_info for emergency stacks
powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD
powerpc/perf: Fix oops when kthread execs user process
powerpc/64s: Handle data breakpoints in Radix mode
powerpc/kprobes: Skip livepatch_handler() for jprobes
powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS
powerpc/kprobes: Pause function_graph tracing during jprobes handling
Pull ACPI fix from Rafael Wysocki:
"This fixes the ACPI-based enumeration of some I2C and SPI devices
broken in 4.11.
Specifics:
- I2C and SPI devices are expected to be enumerated by the I2C and
SPI subsystems, respectively, but due to a change made during the
4.11 cycle, in some cases the ACPI core marks them as already
enumerated which causes the I2C and SPI subsystems to overlook
them, so fix that (Jarkko Nikula)"
* tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / scan: Fix enumeration for special SPI and I2C devices
Pull i2c fix from Wolfram Sang.
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx: Use correct function to write to register
Pull GPIO fix from Linus Walleij:
"A single GPIO patch fixing the compatible string for the MVEBU PWM
controller embedded in the GPIO controller before we release v4.12.
Hopefully"
* tag 'gpio-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: mvebu: change compatible string for PWM support
Pull sound fixes from Takashi Iwai:
"Nothing exciting here, just a few stable fixes:
- suppress spurious kernel WARNING in PCM core
- fix potential spin deadlock at error handling in firewire
- HD-audio PCI ID addition / fixup"
* tag 'sound-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Apply quirks to Broxton-T, too
ALSA: firewire-lib: Fix stall of process context at packet error
ALSA: pcm: Don't treat NULL chmap as a fatal error
ALSA: hda - Add Coffelake PCI ID
Pull drm fixes from Dave Airlie:
"A varied bunch of fixes, one for an API regression with connectors.
Otherwise amdgpu and i915 have a bunch of varied fixes, the shrinker
ones being the most important"
* tag 'drm-fixes-for-v4.12-rc7' of git://people.freedesktop.org/~airlied/linux:
drm: Fix GETCONNECTOR regression
drm/radeon: add a quirk for Toshiba Satellite L20-183
drm/radeon: add a PX quirk for another K53TK variant
drm/amdgpu: adjust default display clock
drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating
drm/amdgpu: add Polaris12 DID
drm/i915: Don't enable backlight at setup time.
drm/i915: Plumb the correct acquire ctx into intel_crtc_disable_noatomic()
drm/i915: Fix deadlock witha the pipe A quirk during resume
drm/i915: Remove __GFP_NORETRY from our buffer allocator
drm/i915: Encourage our shrinker more when our shmemfs allocations fails
drm/i915: Differentiate between sw write location into ring and last hw read
Pull random fixes from Ted Ts'o:
"Fix some locking and gcc optimization issues from the most recent
random_for_linus_stable pull request"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: silence compiler warnings and fix race
Pull device mapper fixes from Mike Snitzer:
- a revert of a DM mirror commit that has proven to make the code prone
to crash
- a DM io reference count fix that resolves a NULL pointer seen when
issuing discards to a DM mirror target's device whose mirror legs do
not all support discards
- a couple DM integrity fixes
* tag 'for-4.12/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm io: fix duplicate bio completion due to missing ref count
dm integrity: fix to not disable/enable interrupts from interrupt context
Revert "dm mirror: use all available legs on multiple failures"
dm integrity: reject mappings too large for device
Merge misc fixes from Andrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
fs/exec.c: account for argv/envp pointers
ocfs2: fix deadlock caused by recursive locking in xattr
slub: make sysfs file removal asynchronous
lib/cmdline.c: fix get_options() overflow while parsing ranges
fs/dax.c: fix inefficiency in dax_writeback_mapping_range()
autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL
mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings
mm, thp: remove cond_resched from __collapse_huge_page_copy
When limiting the argv/envp strings during exec to 1/4 of the stack limit,
the storage of the pointers to the strings was not included. This means
that an exec with huge numbers of tiny strings could eat 1/4 of the stack
limit in strings and then additional space would be later used by the
pointers to the strings.
For example, on 32-bit with a 8MB stack rlimit, an exec with 1677721
single-byte strings would consume less than 2MB of stack, the max (8MB /
4) amount allowed, but the pointers to the strings would consume the
remaining additional stack space (1677721 * 4 == 6710884).
The result (1677721 + 6710884 == 8388605) would exhaust stack space
entirely. Controlling this stack exhaustion could result in
pathological behavior in setuid binaries (CVE-2017-1000365).
[akpm@linux-foundation.org: additional commenting from Kees]
Fixes: b6a2fea393 ("mm: variable length argument support")
Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Another deadlock path caused by recursive locking is reported. This
kind of issue was introduced since commit 743b5f1434 ("ocfs2: take
inode lock in ocfs2_iop_set/get_acl()"). Two deadlock paths have been
fixed by commit b891fa5024 ("ocfs2: fix deadlock issue when taking
inode lock at vfs entry points"). Yes, we intend to fix this kind of
case in incremental way, because it's hard to find out all possible
paths at once.
This one can be reproduced like this. On node1, cp a large file from
home directory to ocfs2 mountpoint. While on node2, run
setfacl/getfacl. Both nodes will hang up there. The backtraces:
On node1:
__ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2]
ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2]
ocfs2_write_begin+0x43/0x1a0 [ocfs2]
generic_perform_write+0xa9/0x180
__generic_file_write_iter+0x1aa/0x1d0
ocfs2_file_write_iter+0x4f4/0xb40 [ocfs2]
__vfs_write+0xc3/0x130
vfs_write+0xb1/0x1a0
SyS_write+0x46/0xa0
On node2:
__ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2]
ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2]
ocfs2_xattr_set+0x12e/0xe80 [ocfs2]
ocfs2_set_acl+0x22d/0x260 [ocfs2]
ocfs2_iop_set_acl+0x65/0xb0 [ocfs2]
set_posix_acl+0x75/0xb0
posix_acl_xattr_set+0x49/0xa0
__vfs_setxattr+0x69/0x80
__vfs_setxattr_noperm+0x72/0x1a0
vfs_setxattr+0xa7/0xb0
setxattr+0x12d/0x190
path_setxattr+0x9f/0xb0
SyS_setxattr+0x14/0x20
Fix this one by using ocfs2_inode_{lock|unlock}_tracker, which is
exported by commit 439a36b8ef ("ocfs2/dlmglue: prepare tracking logic
to avoid recursive cluster lock").
Link: http://lkml.kernel.org/r/20170622014746.5815-1-zren@suse.com
Fixes: 743b5f1434 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
Signed-off-by: Eric Ren <zren@suse.com>
Reported-by: Thomas Voegtle <tv@lio96.de>
Tested-by: Thomas Voegtle <tv@lio96.de>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit bf5eb3de38 ("slub: separate out sysfs_slab_release() from
sysfs_slab_remove()") made slub sysfs file removals synchronous to
kmem_cache shutdown.
Unfortunately, this created a possible ABBA deadlock between slab_mutex
and sysfs draining mechanism triggering the following lockdep warning.
======================================================
[ INFO: possible circular locking dependency detected ]
4.10.0-test+ #48 Not tainted
-------------------------------------------------------
rmmod/1211 is trying to acquire lock:
(s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40
but task is already holding lock:
(slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (slab_mutex){+.+.+.}:
lock_acquire+0xf6/0x1f0
__mutex_lock+0x75/0x950
mutex_lock_nested+0x1b/0x20
slab_attr_store+0x75/0xd0
sysfs_kf_write+0x45/0x60
kernfs_fop_write+0x13c/0x1c0
__vfs_write+0x28/0x120
vfs_write+0xc8/0x1e0
SyS_write+0x49/0xa0
entry_SYSCALL_64_fastpath+0x1f/0xc2
-> #0 (s_active#120){++++.+}:
__lock_acquire+0x10ed/0x1260
lock_acquire+0xf6/0x1f0
__kernfs_remove+0x254/0x320
kernfs_remove+0x23/0x40
sysfs_remove_dir+0x51/0x80
kobject_del+0x18/0x50
__kmem_cache_shutdown+0x3e6/0x460
kmem_cache_destroy+0x1fb/0x2d0
kvm_exit+0x2d/0x80 [kvm]
vmx_exit+0x19/0xa1b [kvm_intel]
SyS_delete_module+0x198/0x1f0
entry_SYSCALL_64_fastpath+0x1f/0xc2
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(slab_mutex);
lock(s_active#120);
lock(slab_mutex);
lock(s_active#120);
*** DEADLOCK ***
2 locks held by rmmod/1211:
#0: (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80
#1: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0
stack backtrace:
CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
Call Trace:
print_circular_bug+0x1be/0x210
__lock_acquire+0x10ed/0x1260
lock_acquire+0xf6/0x1f0
__kernfs_remove+0x254/0x320
kernfs_remove+0x23/0x40
sysfs_remove_dir+0x51/0x80
kobject_del+0x18/0x50
__kmem_cache_shutdown+0x3e6/0x460
kmem_cache_destroy+0x1fb/0x2d0
kvm_exit+0x2d/0x80 [kvm]
vmx_exit+0x19/0xa1b [kvm_intel]
SyS_delete_module+0x198/0x1f0
? SyS_delete_module+0x5/0x1f0
entry_SYSCALL_64_fastpath+0x1f/0xc2
It'd be the cleanest to deal with the issue by removing sysfs files
without holding slab_mutex before the rest of shutdown; however, given
the current code structure, it is pretty difficult to do so.
This patch punts sysfs file removal to a work item. Before commit
bf5eb3de38, the removal was punted to a RCU delayed work item which is
executed after release. Now, we're punting to a different work item on
shutdown which still maintains the goal removing the sysfs files earlier
when destroying kmem_caches.
Link: http://lkml.kernel.org/r/20170620204512.GI21326@htj.duckdns.org
Fixes: bf5eb3de38 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Existing code that uses vmalloc_to_page() may assume that any address
for which is_vmalloc_addr() returns true may be passed into
vmalloc_to_page() to retrieve the associated struct page.
This is not un unreasonable assumption to make, but on architectures
that have CONFIG_HAVE_ARCH_HUGE_VMAP=y, it no longer holds, and we need
to ensure that vmalloc_to_page() does not go off into the weeds trying
to dereference huge PUDs or PMDs as table entries.
Given that vmalloc() and vmap() themselves never create huge mappings or
deal with compound pages at all, there is no correct answer in this
case, so return NULL instead, and issue a warning.
When reading /proc/kcore on arm64, you will hit an oops as soon as you
hit the huge mappings used for the various segments that make up the
mapping of vmlinux. With this patch applied, you will no longer hit the
oops, but the kcore contents willl be incorrect (these regions will be
zeroed out)
We are fixing this for kcore specifically, so it avoids vread() for
those regions. At least one other problematic user exists, i.e.,
/dev/kmem, but that is currently broken on arm64 for other reasons.
Link: http://lkml.kernel.org/r/20170609082226.26152-1-ard.biesheuvel@linaro.org
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: zhong jiang <zhongjiang@huawei.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a partial revert of commit 338a16ba15 ("mm, thp: copying user
pages must schedule on collapse") which added a cond_resched() to
__collapse_huge_page_copy().
On x86 with CONFIG_HIGHPTE, __collapse_huge_page_copy is called in
atomic context and thus scheduling is not possible. This is only a
possible config on arm and i386.
Although need_resched has been shown to be set for over 100 jiffies
while doing the iteration in __collapse_huge_page_copy, this is better
than doing
if (in_atomic())
cond_resched()
to cover only non-CONFIG_HIGHPTE configs.
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706191341550.97821@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull SCSI fixes from James Bottomley:
"Two fixes to remove spurious WARN_ONs from the new(ish) qedi driver.
The driver already prints a warning message, there's no need to panic
users by printing something that looks like an oops as well"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qedi: Remove WARN_ON from clear task context.
scsi: qedi: Remove WARN_ON for untracked cleanup.
Pull xfs fixes from Darrick Wong:
"I have one more bugfix for you for 4.12-rc7 to fix a disk corruption
problem:
- don't allow swapon on files on the realtime device, because the
swap code will swap pages out to blocks on the data device, thereby
corrupting the filesystem"
* tag 'xfs-4.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't allow bmap on rt files
Michael Chan says:
====================
bnxt_en: Error handling and netpoll fixes.
Add missing error handling and fix netpoll handling. The current code
handles RX and TX events in netpoll mode and is causing lots of warnings
and errors in the RX code path in netpoll mode. The fix is to only handle
TX events in netpoll mode.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
To handle netpoll properly, the driver must only handle TX packets
during NAPI. Handling RX events cause warnings and errors in
netpoll mode. The ndo_poll_controller() method should call
napi_schedule() directly so that a NAPI weight of zero will be used
during netpoll mode.
The bnxt_en driver supports 2 ring modes: combined, and separate rx/tx.
In separate rx/tx mode, the ndo_poll_controller() method will only
process the tx rings. In combined mode, the rx and tx completion
entries are mixed in the completion ring and we need to drop the rx
entries and recycle the rx buffers.
Add a function bnxt_force_rx_discard() to handle this in netpoll mode
when we see rx entries in combined ring mode.
Reported-by: Calvin Owens <calvinowens@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we get a TPA_END completion to handle a completed LRO packet, it
is possible that hardware would indicate errors. The current code is
not checking for the error condition. Define the proper error bits and
the macro to check for this error and abort properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function, skb_complete_tx_timestamp(), used to allow passing in a
NULL pointer for the time stamps, but that was changed in commit
62bccb8cdb ("net-timestamp: Make the
clone operation stand-alone from phy timestamping"), and the existing
call sites, all of which are in the dp83640 driver, were fixed up.
Even though the kernel-doc was subsequently updated in commit
7a76a021cd ("net-timestamp: Update
skb_complete_tx_timestamp comment"), still a bug fix from Manfred
Rudigier came into the driver using the old semantics. Probably
Manfred derived that patch from an older kernel version.
This fix should be applied to the stable trees as well.
Fixes: 81e8f2e930 ("net: dp83640: Fix tx timestamp overflow handling.")
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2017-06-23
1) Fix xfrm garbage collecting when unregistering a netdevice.
From Hangbin Liu.
2) Fix NULL pointer derefernce when exiting a network namespace.
From Hangbin Liu.
3) Fix some error codes in pfkey to prevent a NULL pointer derefernce.
From Dan Carpenter.
4) Fix NULL pointer derefernce on allocation failure in pfkey.
From Dan Carpenter.
5) Adjust IPv6 payload_len to include extension headers. Otherwise
we corrupt the packets when doing ESP GRO on transport mode.
From Yossi Kuperman.
6) Set nhoff to the proper offset of the IPv6 nexthdr when doing ESP GRO.
From Yossi Kuperman.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The memory allocation size is controlled by user-space,
if it is too large just fail silently and return NULL,
not to mention there is a fallback allocation later.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Our customer encountered stuck NFS writes for blocks starting at specific
offsets w.r.t. page boundary caused by networking stack sending packets via
UFO enabled device with wrong checksum. The problem can be reproduced by
composing a long UDP datagram from multiple parts using MSG_MORE flag:
sendto(sd, buff, 1000, MSG_MORE, ...);
sendto(sd, buff, 1000, MSG_MORE, ...);
sendto(sd, buff, 3000, 0, ...);
Assume this packet is to be routed via a device with MTU 1500 and
NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
this condition is tested (among others) to decide whether to call
ip_ufo_append_data():
((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))
At the moment, we already have skb with 1028 bytes of data which is not
marked for GSO so that the test is false (fragheaderlen is usually 20).
Thus we append second 1000 bytes to this skb without invoking UFO. Third
sendto(), however, has sufficient length to trigger the UFO path so that we
end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
uses udp_csum() to calculate the checksum but that assumes all fragments
have correct checksum in skb->csum which is not true for UFO fragments.
When checking against MTU, we need to add skb->len to length of new segment
if we already have a partially filled skb and fragheaderlen only if there
isn't one.
In the IPv6 case, skb can only be null if this is the first segment so that
we have to use headersize (length of the first IPv6 header) rather than
fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
Fixes: e89e9cf539 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Fixes: e4c5e13aa4 ("ipv6: Should use consistent conditional judgement for
ip6 fragment between __ip6_append_data and ip6_finish_output")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a RAID set was created on dm-raid version < 1.9.0 (old RAID
superblock format), all of the new 1.9.0 members of the superblock are
uninitialized (zero) -- including the device sectors member needed to
support shrinking.
All the other accesses to superblock fields new in 1.9.0 were reviewed
and verified to be properly guarded against invalid use. The 'sectors'
member was the only one used when the superblock version is < 1.9.
Don't access the superblock's >= 1.9.0 'sectors' member unconditionally.
Also add respective comments.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Pull 'perf probe' fix from Arnaldo Carvalho de Melo:
- Do not double the offset of inline expansions when using
'perf probe' on inlined functions (Björn Töpel)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The F54 driver is currently only using the first 6 bytes of F54 so there is
no need to read all 27 bytes. Some Dell systems (Dell XP13 9333 and
similar) have an issue with the touchpad or I2C bus when reading reports
larger then 16 bytes. Reads larger then 16 bytes are reported in two HID
reports. Something about the back to back reports seems to cause the next
read to report incorrect data. This results in F30 failing to load and the
click button failing to work.
Previous issues with the I2C controller or touchpad were addressed in:
commit 5b65c2a029 ("HID: rmi: check sanity of the incoming report")
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=195949
Signed-off-by: Andrew Duggan <aduggan@synaptics.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Reviewed-by: Nick Dyer <nick@shmanahar.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
A previous set of patches "cxl: Add support for Coherent Accelerator
Interface Architecture 2.0" has introduced a new support for the CAPI
cards. These patches have been tested on Simulation environment and
quite a bit of them have been tested on real hardware.
This patch brings new fixes after a series of tests carried out on new
equipment:
- Add POWER9 definition.
- Re-enable any masked interrupts when the AFU is not activated
after resetting the AFU.
- Remove the api cxl_is_psl8/9 which is no longer useful.
- Do not dump CAPI1 registers.
- Rewrite cxl_is_page_fault() function.
- Do not register slb callack on P9.
Fixes: f24be42aab ("cxl: Add psl9 specific code")
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Emergency stacks have their thread_info mostly uninitialised, which in
particular means garbage preempt_count values.
Emergency stack code runs with interrupts disabled entirely, and is
used very rarely, so this has been unnoticed so far. It was found by a
proposed new powerpc watchdog that takes a soft-NMI directly from the
masked_interrupt handler and using the emergency stack. That crashed
at BUG_ON(in_nmi()) in nmi_enter(). preempt_count()s were found to be
garbage.
To fix this, zero the entire THREAD_SIZE allocation, and initialize
the thread_info.
Cc: stable@vger.kernel.org
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move it all into setup_64.c, use a function not a macro. Fix
crashes on Cell by setting preempt_count to 0 not HARDIRQ_OFFSET]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Fix sparse warnings in scripts/kconfig/nconf* ('make nconfig'):
../scripts/kconfig/nconf.c:1071:32: warning: Using plain integer as NULL pointer
../scripts/kconfig/nconf.c:1238:30: warning: Using plain integer as NULL pointer
../scripts/kconfig/nconf.c:511:51: warning: Using plain integer as NULL pointer
../scripts/kconfig/nconf.c:1460:6: warning: symbol 'setup_windows' was not declared. Should it be static?
../scripts/kconfig/nconf.c:274:12: warning: symbol 'current_instructions' was not declared. Should it be static?
../scripts/kconfig/nconf.c:308:22: warning: symbol 'function_keys' was not declared. Should it be static?
../scripts/kconfig/nconf.gui.c:132:17: warning: non-ANSI function declaration of function 'set_colors'
../scripts/kconfig/nconf.gui.c:195:24: warning: Using plain integer as NULL pointer
nconf.gui.o before/after files are the same.
nconf.o before/after files are the same until the 'static' function
declarations are added.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
In commit 613f050d68 ("perf probe: Fix to probe on gcc generated
functions in modules"), the offset from symbol is, incorrectly, added
to the trace point address. This leads to incorrect probe trace points
for inlined functions and when using relative line number on symbols.
Prior this patch:
$ perf probe -m nf_nat -D in_range
p:probe/in_range nf_nat:in_range.isra.9+0
$ perf probe -m i40e -D i40e_clean_rx_irq
p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2212
$ perf probe -m i40e -D i40e_clean_rx_irq:16
p:probe/i40e_clean_rx_irq i40e:i40e_lan_xmit_frame+626
After:
$ perf probe -m nf_nat -D in_range
p:probe/in_range nf_nat:in_range.isra.9+0
$ perf probe -m i40e -D i40e_clean_rx_irq
p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+1106
$ perf probe -m i40e -D i40e_clean_rx_irq:16
p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2665
Committer testing:
Using 'pfunct', a tool found in the 'dwarves' package [1], one can ask what are
the functions that while not being explicitely marked as inline, were inlined
by the compiler:
# pfunct --cc_inlined /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko | head
__ew32
e1000_regdump
e1000e_dump_ps_pages
e1000_desc_unused
e1000e_systim_to_hwtstamp
e1000e_rx_hwtstamp
e1000e_update_rdt_wa
e1000e_update_tdt_wa
e1000_put_txbuf
e1000_consume_page
Then ask 'perf probe' to produce the kprobe_tracer probe definitions for two of
them:
# perf probe -m e1000e -D e1000e_rx_hwtstamp
p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+74
# perf probe -m e1000e -D e1000_consume_page
p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+876
p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+1506
p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074
Now lets concentrate on the 'e1000_consume_page' one, that was inlined twice in
e1000_clean_jumbo_rx_irq(), lets see what readelf says about the DWARF tags for
that function:
$ readelf -wi /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
<SNIP>
<1><13e27b>: Abbrev Number: 121 (DW_TAG_subprogram)
<13e27c> DW_AT_name : (indirect string, offset: 0xa8945): e1000_clean_jumbo_rx_irq
<13e287> DW_AT_low_pc : 0x17a30
<3><13e6ef>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
<13e6f0> DW_AT_abstract_origin: <0x13ed2c>
<13e6f4> DW_AT_low_pc : 0x17be6
<SNIP>
<1><13ed2c>: Abbrev Number: 142 (DW_TAG_subprogram)
<13ed2e> DW_AT_name : (indirect string, offset: 0xa54c3): e1000_consume_page
So, the first time in e1000_clean_jumbo_rx_irq() where e1000_consume_page() is
inlined is at PC 0x17be6, which subtracted from e1000_clean_jumbo_rx_irq()'s
address, gives us the offset we should use in the probe definition:
0x17be6 - 0x17a30 = 438
but above we have 876, which is twice as much.
Lets see the second inline expansion of e1000_consume_page() in
e1000_clean_jumbo_rx_irq():
<3><13e86e>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
<13e86f> DW_AT_abstract_origin: <0x13ed2c>
<13e873> DW_AT_low_pc : 0x17d21
0x17d21 - 0x17a30 = 753
So we where adding it at twice the offset from the containing function as we
should.
And then after this patch:
# perf probe -m e1000e -D e1000e_rx_hwtstamp
p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+37
# perf probe -m e1000e -D e1000_consume_page
p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+438
p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+753
p:probe/e1000_consume_page_2 e1000e:e1000_clean_jumbo_rx_irq+1353
#
Which matches the two first expansions and shows that because we were
doubling the offset it would spill over the next function:
readelf -sw /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
673: 0000000000017a30 1626 FUNC LOCAL DEFAULT 2 e1000_clean_jumbo_rx_irq
674: 0000000000018090 2013 FUNC LOCAL DEFAULT 2 e1000_clean_rx_irq_ps
This is the 3rd inline expansion of e1000_consume_page() in
e1000_clean_jumbo_rx_irq():
<3><13ec77>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
<13ec78> DW_AT_abstract_origin: <0x13ed2c>
<13ec7c> DW_AT_low_pc : 0x17f79
0x17f79 - 0x17a30 = 1353
So:
0x17a30 + 2 * 1353 = 0x184c2
And:
0x184c2 - 0x18090 = 1074
Which explains the bogus third expansion for e1000_consume_page() to end up at:
p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074
All fixed now :-)
[1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 613f050d68 ("perf probe: Fix to probe on gcc generated functions in modules")
Link: http://lkml.kernel.org/r/20170621164134.5701-1-bjorn.topel@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull cifs fixes from Steve French:
"Various small fixes for stable"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Fix some return values in case of error in 'crypt_message'
cifs: remove redundant return in cifs_creation_time_get
CIFS: Improve readdir verbosity
CIFS: check if pages is null rather than bv for a failed allocation
CIFS: Set ->should_dirty in cifs_user_readv()
Pull MFD fixes from Lee Jones:
- arizona: use address passed in, rather than hard coded value
- correct STM32 clock-names value in DT binding documentation
* tag 'mfd-fixes-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
dt-bindings: mfd: Update STM32 timers clock names
mfd: arizona: Fix typo using hard-coded register
The 8000 series adapters uses catch-all filters for encapsulated traffic
to support filtering VXLAN, NVGRE and GENEVE traffic.
This new filter functionality requires a longer MCDI command.
This patch increases the size of buffers on stack that were missed, which
fixes a kernel panic from the stack protector.
Fixes: 9b41080125 ("sfc: insert catch-all filters for encapsulated traffic")
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Acked-by: Bert Kenward bkenward@solarflare.com
Signed-off-by: David S. Miller <davem@davemloft.net>
This structure member is hidden behind CONFIG_SYSFS, and we
get a build error when that is disabled:
drivers/net/hyperv/netvsc_drv.c: In function 'netvsc_set_channels':
drivers/net/hyperv/netvsc_drv.c:754:49: error: 'struct net_device' has no member named 'num_rx_queues'; did you mean 'num_tx_queues'?
drivers/net/hyperv/netvsc_drv.c: In function 'netvsc_set_rxfh':
drivers/net/hyperv/netvsc_drv.c:1181:25: error: 'struct net_device' has no member named 'num_rx_queues'; did you mean 'num_tx_queues'?
As the value is only set once to the argument of alloc_netdev_mq(),
we can compare against that constant directly.
Fixes: ff4a441990 ("netvsc: allow get/set of RSS indirection table")
Fixes: 2b01888d1b ("netvsc: allow more flexible setting of number of channels")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The per netns loopback_dev->ip6_ptr is unregistered and set to
NULL when its mtu is set to smaller than IPV6_MIN_MTU, this
leads to that we could set rt->rt6i_idev NULL after a
rt6_uncached_list_flush_dev() and then crash after another
call.
In this case we should just bring its inet6_dev down, rather
than unregistering it, at least prior to commit 176c39af29
("netns: fix addrconf_ifdown kernel panic") we always
override the case for loopback.
Thanks a lot to Andrey for finding a reliable reproducer.
Fixes: 176c39af29 ("netns: fix addrconf_ifdown kernel panic")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Yasevich says:
====================
macvlan: Fix some issues with changing mac addresses
There are some issues in macvlan wrt to changing it's mac address.
* An error is returned in the specified address is the same as an already
assigned address.
* In passthru mode, the mac address of the macvlan device doesn't change.
* After changing the mac address of a passthru macvlan and then removing it,
the mac address of the physical device remains changed.
This patch series attempts to resolve these issues.
V2: Address a small issue in p4 where we save the address from the lowerdev
(from girish.moodalbail@oracle.com)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Passthru macvlans directly change the mac address of the lower
level device. That's OK, but after the macvlan is deleted,
the lower device is left with changed address and one needs to
reboot to bring back the origina HW addresses.
This scenario is actually quite common with passthru macvtap devices.
This patch attempts to solve this, by storing the mac address
of the lower device in macvlan_port structure and keeping track of
it through the changes.
After this patch, any changes to the lower device mac address
done trough the macvlan device, will be reverted back. Any
changs done directly to the lower device mac address will be kept.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the port passthru boolean into flags with accesor functions.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a lower device of the passthru macvlan changes it's address,
passthru macvlan is supposed to change it's own address as well.
However, that doesn't happen correctly because the check in
macvlan_addr_busy() will catch the fact that the lower level
(port) mac address is the same as the address we are trying to
assign to the macvlan, and return an error. As a reasult,
the address of the passthru macvlan device is never changed.
The same thing happens when the user attempts to change the
mac address of the passthru macvlan.
The simple solution appers to be to not check against
the lower device in case of passthru macvlan device, since
the 2 addresses are _supposed_ to be the same.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The user currently gets an EBUSY error when attempting to set
the mac address on a macvlan device to the same value.
This should really be a no-op as nothing changes. Catch
the condition and return early.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a flag to indicate if a queue is rate-limited. Test the flag in
NAPI poll handler and avoid rescheduling the queue if true, otherwise
we risk locking up the host. The rescheduling will be done in the
timer callback function.
Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: Jean-Louis Dupond <jean-louis@dupond.be>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are number of problems with configuration peer
network device in absence of IFLA_VETH_PEER attributes
where attributes for main network device shared with
peer.
First it is not feasible to configure both network
devices with same MAC address since this makes
communication in such configuration problematic.
This case can be reproduced with following sequence:
# ip link add address 02:11:22:33:44:55 type veth
# ip li sh
...
26: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc \
noop state DOWN mode DEFAULT qlen 1000
link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
27: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc \
noop state DOWN mode DEFAULT qlen 1000
link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
Second it is not possible to register both main and
peer network devices with same name, that happens
when name for main interface is given with IFLA_IFNAME
and same attribute reused for peer.
This case can be reproduced with following sequence:
# ip link add dev veth1a type veth
RTNETLINK answers: File exists
To fix both of the cases check if corresponding netlink
attributes are taken from peer_tb when valid or
name based on rtnl ops kind and random address is used.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cpsw driver tries to get macid for am43xx SoCs using the compatible
ti,am4372. But not all variants of am43x uses this complatible like
epos evm uses ti,am438x. So use a generic compatible ti,am43 to get
macid for all am43 based platforms.
Reviewed-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 242d3a49a2 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf")
I assumed NETDEV_REGISTER and NETDEV_UNREGISTER are paired,
unfortunately, as reported by jeffy, netdev_wait_allrefs()
could rebroadcast NETDEV_UNREGISTER event until all refs are
gone.
We have to add an additional check to avoid this corner case.
For netdev_wait_allrefs() dev->reg_state is NETREG_UNREGISTERED,
for dev_change_net_namespace(), dev->reg_state is
NETREG_REGISTERED. So check for dev->reg_state != NETREG_UNREGISTERED.
Fixes: 242d3a49a2 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf")
Reported-by: jeffy <jeffy.chen@rock-chips.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit ("net/phy: micrel: Add workaround for bad autoneg") fixes an
autoneg failure case by resetting the hardware. This turns off
intterupts. Things will work themselves out if the phy polls, as it will
figure out it's state during a poll. However if the phy uses only
intterupts, the phy will stall, since interrupts are off. This patch
fixes the issue by calling config_intr after resetting the phy.
Fixes: d2fd719bcb ("net/phy: micrel: Add workaround for bad autoneg ")
Signed-off-by: Zach Brown <zach.brown@ni.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TF is handled a bit differently for syscall and sysret, compared
to the other instructions: TF is checked after the instruction completes,
so that the OS can disable #DB at a syscall by adding TF to FMASK.
When the sysret is executed the #DB is taken "as if" the syscall insn
just completed.
KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
Fix the behavior, otherwise you could get #DB on a user stack which is not
nice. This does not affect Linux guests, as they use an IST or task gate
for #DB.
This fixes CVE-2017-7518.
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
KVM: s390: fix shadow table handling for nested guests
Some odd-ball cases (real-space designation ASCEs) are handled wrong
for the shadow page tables. Fix it.
NPU2 requires an extra explicit flush to an active GPU PID when
sending address translation shoot downs (ATSDs) to reliably flush the
GPU TLB. This patch adds just such a flush at the end of each sequence
of ATSDs.
We can safely use PID 0 which is always reserved and active on the
GPU. PID 0 is only used for init_mm which will never be a user mm on
the GPU. To enforce this we add a check in pnv_npu2_init_context()
just in case someone tries to use PID 0 on the GPU.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[mpe: Use true/false for bool literals]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For real-space designation asces the asce origin part is only a token.
The asce token origin must not be used to generate an effective
address for storage references. This however is erroneously done
within kvm_s390_shadow_tables().
Furthermore within the same function the wrong parts of virtual
addresses are used to generate a corresponding real address
(e.g. the region second index is used as region first index).
Both of the above can result in incorrect address translations. Only
for real space designations with a token origin of zero and addresses
below one megabyte the translation was correct.
Furthermore replace a "!asce.r" statement with a "!*fake" statement to
make it more obvious that a specific condition has nothing to do with
the architecture, but with the fake handling of real space designations.
Fixes: 3218f7094b ("s390/mm: support real-space for gmap shadows")
Cc: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The i2c-imx driver incorrectly uses readb()/writeb() to read and
write to the appropriate registers when performing a repeated start.
The appropriate imx_i2c_read_reg()/imx_i2c_write_reg() functions
should be used instead. Performing a repeated start results in
a kernel panic. The platform is imx.
Signed-off-by: Michail G Etairidis <m.etairidis@beck-ipc.com>
Fixes: ce1a78840f ("i2c: imx: add DMA support for freescale i2c driver")
Fixes: 054b62d9f2 ("i2c: imx: fix the i2c bus hang issue when do repeat restart")
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
IP6CB(skb)->nhoff is the offset of the nexthdr field in an IPv6
header, unless there are extension headers present, in which case
nhoff points to the nexthdr field of the last extension header.
In non-GRO code path, nhoff is set by ipv6_rcv before any XFRM code
is executed. Conversely, in GRO code path (when esp6_offload is loaded),
nhoff is not set. The following functions fail to read the correct value
and eventually the packet is dropped:
xfrm6_transport_finish
xfrm6_tunnel_input
xfrm6_rcv_tnl
Set nhoff to the proper offset of nexthdr in esp6_gro_receive.
Fixes: 7785bba299 ("esp: Add a software GRO codepath")
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
IPv6 payload length indicates the size of the payload, including any
extension headers.
In xfrm6_transport_finish, ipv6_hdr(skb)->payload_len is set to the
payload size only, regardless of the presence of any extension headers.
After ESP GRO transport mode decapsulation, ipv6_rcv trims the packet
according to the wrong payload_len, thus corrupting the packet.
Set payload_len to account for extension headers as well.
Fixes: 7785bba299 ("esp: Add a software GRO codepath")
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Pull block fixes from Jens Axboe:
"This contains a set of fixes for xen-blkback by way of Konrad, and a
performance regression fix for blk-mq for shared tags.
The latter could account for as much as a 50x reduction in
performance, with the test case from the user with 500 name spaces. A
more realistic setup on my end with 32 drives showed a 3.5x drop. The
fix has been thoroughly tested before being committed"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: fix performance regression with shared tags
xen-blkback: don't leak stack data via response ring
xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
xen/blkback: don't free be structure too early
xen/blkback: fix disconnect while I/Os in flight
bmap returns a dumb LBA address but not the block device that goes with
that LBA. Swapfiles don't care about this and will blindly assume that
the data volume is the correct blockdev, which is totally bogus for
files on the rt subvolume. This results in the swap code doing IOs to
arbitrary locations on the data device(!) if the passed in mapping is a
realtime file, so just turn off bmap for rt files.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Since commit fcc8487d47 ("uapi: export all headers under uapi
directories") fakechroot make bindeb-pkg fails, mismatching files for
directories:
touch: cannot touch 'usr/include/video/uvesafb.h/.install': Not a
directory
This due to a bug in fakechroot:
when using the function $(wildcard $(srcdir)/*/.) in a makefile, under a
fakechroot environment, not only directories but also files are
returned.
To circumvent that, we are using the functions:
$(sort $(dir $(wildcard $(srcdir)/*/))))
Fixes: fcc8487d47 ("uapi: export all headers under uapi directories")
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Commit f406270bf7 ("ACPI / scan: Set the visited flag for all
enumerated devices") caused that two group of special SPI or I2C
devices do not enumerate. SPI and I2C devices are expected to be
enumerated by the SPI and I2C subsystems but change caused that
acpi_bus_attach() marks those devices with acpi_device_set_enumerated().
First group of devices are matched using Device Tree compatible property
with special _HID "PRP0001". Those devices have matched scan handler,
acpi_scan_attach_handler() retuns 1 and acpi_bus_attach() marks them
with acpi_device_set_enumerated().
Second group of devices without valid _HID such as "LNXVIDEO" have
device->pnp.type.platform_id set to zero and change again marks them
with acpi_device_set_enumerated().
Fix this by flagging the SPI and I2C devices during struct acpi_device
object initialization time and let the code in acpi_bus_attach() to go
through the device_attach() and acpi_default_enumeration() path for all
SPI and I2C devices.
Fixes: f406270bf7 (ACPI / scan: Set the visited flag for all enumerated devices)
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull networking fixes from David Miller:
1) Fix refcounting wrt timers which hold onto inet6 address objects,
from Xin Long.
2) Fix an ancient bug in wireless wext ioctls, from Johannes Berg.
3) Firmware handling fixes in brcm80211 driver, from Arend Van Spriel.
4) Several mlx5 driver fixes (firmware readiness, timestamp cap
reporting, devlink command validity checking, tc offloading, etc.)
From Eli Cohen, Maor Dickman, Chris Mi, and Or Gerlitz.
5) Fix dst leak in IP/IP6 tunnels, from Haishuang Yan.
6) Fix dst refcount bug in decnet, from Wei Wang.
7) Netdev can be double freed in register_vlan_device(). Fix from Gao
Feng.
8) Don't allow object to be destroyed while it is being dumped in SCTP,
from Xin Long.
9) Fix dpaa_eth build when modular, from Madalin Bucur.
10) Fix throw route leaks, from Serhey Popovych.
11) IFLA_GROUP missing from if_nlmsg_size() and ifla_policy[] table,
also from Serhey Popovych.
12) Fix premature TX SKB free in stmmac, from Niklas Cassel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
igmp: add a missing spin_lock_init()
net: stmmac: free an skb first when there are no longer any descriptors using it
sfc: remove duplicate up_write on VF filter_sem
rtnetlink: add IFLA_GROUP to ifla_policy
ipv6: Do not leak throw route references
dt-bindings: net: sms911x: Add missing optional VDD regulators
dpaa_eth: reuse the dma_ops provided by the FMan MAC device
fsl/fman: propagate dma_ops
net/core: remove explicit do_softirq() from busy_poll_stop()
fib_rules: Resolve goto rules target on delete
sctp: ensure ep is not destroyed before doing the dump
net/hns:bugfix of ethtool -t phy self_test
net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
cxgb4: notify uP to route ctrlq compl to rdma rspq
ip6_tunnel: Correct tos value in collect_md mode
decnet: always not take dst->__refcnt when inserting dst into hash table
ip6_tunnel: fix potential issue in __ip6_tnl_rcv
ip_tunnel: fix potential issue in ip_tunnel_rcv
brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2()
net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it
...
Pull more pin control fixes from Linus Walleij:
"Some late arriving fixes. I should have sent earlier, just swamped
with work as usual. Thomas patch makes AMD systems usable despite
firmware bugs so it is fairly important.
- Make the AMD driver use a regular interrupt rather than a chained
one, so the system does not lock up.
- Fix a function call error deep inside the STM32 driver"
* tag 'pinctrl-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: stm32: Fix bad function call
pinctrl/amd: Use regular interrupt instead of chained
Pull HID fixes from Jiri Kosina:
- revert of a commit to magicmouse driver that regressess certain
devices, from Daniel Stone
- quirk for a specific Dell mouse, from Sebastian Parschauer
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
Revert "HID: magicmouse: Set multi-touch keybits for Magic Mouse"
HID: Add quirk for Dell PIXART OEM mouse
Pull livepatching fix from Jiri Kosina:
"Fix the way how livepatches are being stacked with respect to RCU,
from Petr Mladek"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: Fix stacking of patches with respect to RCU
Pull more ufs fixes from Al Viro:
"More UFS fixes, unfortunately including build regression fix for the
64-bit s_dsize commit. Fixed in this pile:
- trivial bug in signedness of 32bit timestamps on ufs1
- ESTALE instead of ufs_error() when doing open-by-fhandle on
something deleted
- build regression on 32bit in ufs_new_fragments() - calculating that
many percents of u64 pulls libgcc stuff on some of those. Mea
culpa.
- fix hysteresis loop broken by typo in 2.4.14.7 (right next to the
location of previous bug).
- fix the insane limits of said hysteresis loop on filesystems with
very low percentage of reserved blocks. If it's 5% or less, just
use the OPTSPACE policy.
- calculate those limits once and mount time.
This tree does pass xfstests clean (both ufs1 and ufs2) and it _does_
survive cross-builds.
Again, my apologies for missing that, especially since I have noticed
a related percentage-of-64bit issue in earlier patches (when dealing
with amount of reserved blocks). Self-LART applied..."
* 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs: fix the logics for tail relocation
ufs_iget(): fail with -ESTALE on deleted inode
fix signedness of timestamps on ufs1
Fix expand_upwards() on architectures with an upward-growing stack (parisc,
metag and partly IA-64) to allow the stack to reliably grow exactly up to
the address space limit given by TASK_SIZE.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of
mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the
end of unmapped_area_topdown(). Linus points out how MAP_FIXED
(which does not have to respect our stack guard gap intentions)
could result in gap_end below gap_start there. Fix that, and
the similar case in its alternative, unmapped_area().
Cc: stable@vger.kernel.org
Fixes: 1be7107fbe ("mm: larger stack guard gap, between vmas")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Debugged-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we have shared tags enabled, then every IO completion will trigger
a full loop of every queue belonging to a tag set, and every hardware
queue for each of those queues, even if nothing needs to be done.
This causes a massive performance regression if you have a lot of
shared devices.
Instead of doing this huge full scan on every IO, add an atomic
counter to the main queue that tracks how many hardware queues have
been marked as needing a restart. With that, we can avoid looking for
restartable queues, if we don't have to.
Max reports that this restores performance. Before this patch, 4K
IOPS was limited to 22-23K IOPS. With the patch, we are running at
950-970K IOPS.
Fixes: 6d8c6c0f97 ("blk-mq: Restart a single queue if tag sets are shared")
Reported-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If only a subset of the devices associated with multiple regions support
a given special operation (eg. DISCARD) then the dec_count() that is
used to set error for the region must increment the io->count.
Otherwise, when the dec_count() is called it can cause the dm-io
caller's bio to be completed multiple times. As was reported against
the dm-mirror target that had mirror legs with a mix of discard
capabilities.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=196077
Reported-by: Zhang Yi <yizhan@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Use spin_lock_irqsave and spin_unlock_irqrestore rather than
spin_{lock,unlock}_irq in submit_flush_bio().
Otherwise lockdep issues the following warning:
DEBUG_LOCKS_WARN_ON(current->hardirq_context)
WARNING: CPU: 1 PID: 0 at kernel/locking/lockdep.c:2748 trace_hardirqs_on_caller+0x107/0x180
Reported-by: Ondrej Kozina <okozina@redhat.com>
Tested-by: Ondrej Kozina <okozina@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
'rc' is known to be 0 at this point. So if 'init_sg' or 'kzalloc' fails, we
should return -ENOMEM instead.
Also remove a useless 'rc' in a debug message as it is meaningless here.
Fixes: 026e93dc0a ("CIFS: Encrypt SMB3 requests before sending")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
A few fixes for 4.12:
- Add a new Polaris12 pci id
- A stack corruption fix
- Suspend/resume fix
- PX fix
- Display flickering fix
* 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: add a quirk for Toshiba Satellite L20-183
drm/radeon: add a PX quirk for another K53TK variant
drm/amdgpu: adjust default display clock
drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating
drm/amdgpu: add Polaris12 DID
drm/i915 fixes for v4.12-rc7
* tag 'drm-intel-fixes-2017-06-20' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Don't enable backlight at setup time.
drm/i915: Plumb the correct acquire ctx into intel_crtc_disable_noatomic()
drm/i915: Fix deadlock witha the pipe A quirk during resume
drm/i915: Remove __GFP_NORETRY from our buffer allocator
drm/i915: Encourage our shrinker more when our shmemfs allocations fails
drm/i915: Differentiate between sw write location into ring and last hw read
There is a redundant return in function cifs_creation_time_get
that appears to be old vestigial code than can be removed. So
remove it.
Detected by CoverityScan, CID#1361924 ("Structurally dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Downgrade the loglevel for SMB2 to prevent filling the log
with messages if e.g. readdir was interrupted. Also make SMB2
and SMB1 codepaths do the same logging during readdir.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
pages is being allocated however a null check on bv is being used
to see if the allocation failed. Fix this by checking if pages is
null.
Detected by CoverityScan, CID#1432974 ("Logically dead code")
Fixes: ccf7f4088a ("CIFS: Add asynchronous context to support kernel AIO")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The current code causes a static checker warning because ITER_IOVEC is
zero so the condition is never true.
Fixes: 6685c5e2d1 ("CIFS: Add asynchronous read support through kernel AIO")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Andrey reported a lockdep warning on non-initialized
spinlock:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x292/0x395 lib/dump_stack.c:52
register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755
? 0xffffffffa0000000
__lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255
lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
__raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135
_raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175
spin_lock_bh ./include/linux/spinlock.h:304
ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076
igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194
ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736
We miss a spin_lock_init() in igmpv3_add_delrec(), probably
because previously we never use it on this code path. Since
we already unlink it from the global mc_tomb list, it is
probably safe not to acquire this spinlock here. It does not
harm to have it although, to avoid conditional locking.
Fixes: c38b7d327a ("igmp: acquire pmc lock for ip_mc_clear_src()")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kalle Valo says:
====================
wireless-drivers fixes for 4.12
Two important fixes for brcmfmac. The rest of the brcmfmac patches are
either code preparation and fixing a new build warning.
brcmfmac
* fix a NULL pointer dereference during resume
* fix a NULL pointer dereference with USB devices, a regression from
v4.12-rc1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When having the skb pointer in the first descriptor, stmmac_tx_clean
can get called at a moment where the IP has only cleared the own bit
of the first descriptor, thus freeing the skb, even though there can
be several descriptors whose buffers point into the same skb.
By simply moving the skb pointer from the first descriptor to the last
descriptor, a skb will get freed only when the IP has cleared the
own bit of all the descriptors that are using that skb.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Somehow two copies of the line 'up_write(&vf->efx->filter_sem);' got into
efx_ef10_sriov_set_vf_vlan(). This would put the mutex in a bad state and
cause all subsequent down attempts to hang.
Fixes: 671b53eec2 ("sfc: Ensure down_write(&filter_sem) and up_write() are matched before calling efx_net_open()")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Network interface groups support added while ago, however
there is no IFLA_GROUP attribute description in policy
and netlink message size calculations until now.
Add IFLA_GROUP attribute to the policy.
Fixes: cbda10fa97 ("net_device: add support for network device groups")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While commit 73ba57bfae ("ipv6: fix backtracking for throw routes")
does good job on error propagation to the fib_rules_lookup()
in fib rules core framework that also corrects throw routes
handling, it does not solve route reference leakage problem
happened when we return -EAGAIN to the fib_rules_lookup()
and leave routing table entry referenced in arg->result.
If rule with matched throw route isn't last matched in the
list we overwrite arg->result losing reference on throw
route stored previously forever.
We also partially revert commit ab997ad408 ("ipv6: fix the
incorrect return value of throw route") since we never return
routing table entry with dst.error == -EAGAIN when
CONFIG_IPV6_MULTIPLE_TABLES is on. Also there is no point
to check for RTF_REJECT flag since it is always set throw
route.
Fixes: 73ba57bfae ("ipv6: fix backtracking for throw routes")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The lan911x family of devices require supplying from 3.3 V power
supplies (connected to VDD_IO, VDD_A and VREG_3.3 pins). The existing
driver however obtains only VDD_IO and VDD_A regulators in an optional
way so document this in bindings.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur says:
====================
net: fix loadable module for DPAA Ethernet
The DPAA Ethernet makes use of a symbol that is not exported.
Address the issue by propagating the dma_ops rather than calling
arch_setup_dma_ops().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the use of arch_setup_dma_ops() that was not exported
and was breaking loadable module compilation.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure dma_ops are set, to be later used by the Ethernet driver.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 217f697436 ("net: busy-poll: allow preemption in
sk_busy_loop()") there is an explicit do_softirq() invocation after
local_bh_enable() has been invoked.
I don't understand why we need this because local_bh_enable() will
invoke do_softirq() once the softirq counter reached zero and we have
softirq-related work pending.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should avoid marking goto rules unresolved when their
target is actually reachable after rule deletion.
Consolder following sample scenario:
# ip -4 ru sh
0: from all lookup local
32000: from all goto 32100
32100: from all lookup main
32100: from all lookup default
32766: from all lookup main
32767: from all lookup default
# ip -4 ru del pref 32100 table main
# ip -4 ru sh
0: from all lookup local
32000: from all goto 32100 [unresolved]
32100: from all lookup default
32766: from all lookup main
32767: from all lookup default
After removal of first rule with preference 32100 we
mark all goto rules as unreachable, even when rule with
same preference as removed one still present.
Check if next rule with same preference is available
and make all rules with goto action pointing to it.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit fixes a "maybe-uninitialized" build failure in
arch/mips/kvm/tlb.c when KVM, DYNAMIC_DEBUG and JUMP_LABEL are all
enabled. The failure is:
In file included from ./include/linux/printk.h:329:0,
from ./include/linux/kernel.h:13,
from ./include/asm-generic/bug.h:15,
from ./arch/mips/include/asm/bug.h:41,
from ./include/linux/bug.h:4,
from ./include/linux/thread_info.h:11,
from ./include/asm-generic/current.h:4,
from ./arch/mips/include/generated/asm/current.h:1,
from ./include/linux/sched.h:11,
from arch/mips/kvm/tlb.c:13:
arch/mips/kvm/tlb.c: In function ‘kvm_mips_host_tlb_inv’:
./include/linux/dynamic_debug.h:126:3: error: ‘idx_kernel’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
__dynamic_pr_debug(&descriptor, pr_fmt(fmt), \
^~~~~~~~~~~~~~~~~~
arch/mips/kvm/tlb.c:169:16: note: ‘idx_kernel’ was declared here
int idx_user, idx_kernel;
^~~~~~~~~~
There is a similar error relating to "idx_user". Both errors were
observed with GCC 6.
As far as I can tell, it is impossible for either idx_user or idx_kernel
to be uninitialized when they are later read in the calls to kvm_debug,
but to satisfy the compiler, add zero initializers to both variables.
Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Fixes: 57e3869cfa ("KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASID")
Cc: <stable@vger.kernel.org> # 4.11+
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* fix problems that could cause hangs or crashes in the host on POWER9
* fix problems that could allow guests to potentially affect or disrupt
the execution of the controlling userspace
As it turns out more than just Armada 370 and XP support using GPIO
lines as PWM lines. For example the Armada 38x family has the same
hardware support. As such "marvell,armada-370-xp-gpio" for the
compatible string is a misnomer.
Change the compatible string to "marvell,armada-370-gpio" before the
driver makes it out of the -rc stage. This also follows the practice of
using only the first device family supported as part of the name.
Also update the documentation and comments in the code accordingly.
Fixes: 757642f9a5 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull clockevents fixes from Daniel Lezcano:
- Fixed wrong iomem area unmapped in the arch_arm_timer (Frank Rowand)
- Added missing includes for sun5i and cadence-ttc (Stephen Rothwell)
rcu_read_(un)lock(), list_*_rcu(), and synchronize_rcu() are used for a secure
access and manipulation of the list of patches that modify the same function.
In particular, it is the variable func_stack that is accessible from the ftrace
handler via struct ftrace_ops and klp_ops.
Of course, it synchronizes also some states of the patch on the top of the
stack, e.g. func->transition in klp_ftrace_handler.
At the same time, this mechanism guards also the manipulation of
task->patch_state. It is modified according to the state of the transition and
the state of the process.
Now, all this works well as long as RCU works well. Sadly livepatching might
get into some corner cases when this is not true. For example, RCU is not
watching when rcu_read_lock() is taken in idle threads. It is because they
might sleep and prevent reaching the grace period for too long.
There are ways how to make RCU watching even in idle threads, see
rcu_irq_enter(). But there is a small location inside RCU infrastructure when
even this does not work.
This small problematic location can be detected either before calling
rcu_irq_enter() by rcu_irq_enter_disabled() or later by rcu_is_watching().
Sadly, there is no safe way how to handle it. Once we detect that RCU was not
watching, we might see inconsistent state of the function stack and the related
variables in klp_ftrace_handler(). Then we could do a wrong decision, use an
incompatible implementation of the function and break the consistency of the
system. We could warn but we could not avoid the damage.
Fortunately, ftrace has similar problems and they seem to be solved well there.
It uses a heavy weight implementation of some RCU operations. In particular, it
replaces:
+ rcu_read_lock() with preempt_disable_notrace()
+ rcu_read_unlock() with preempt_enable_notrace()
+ synchronize_rcu() with schedule_on_each_cpu(sync_work)
My understanding is that this is RCU implementation from a stone age. It meets
the core RCU requirements but it is rather ineffective. Especially, it does not
allow to batch or speed up the synchronize calls.
On the other hand, it is very trivial. It allows to safely trace and/or
livepatch even the RCU core infrastructure. And the effectiveness is a not a
big issue because using ftrace or livepatches on productive systems is a rare
operation. The safety is much more important than a negligible extra load.
Note that the alternative implementation follows the RCU principles. Therefore,
we could and actually must use list_*_rcu() variants when manipulating the
func_stack. These functions allow to access the pointers in the right
order and with the right barriers. But they do not use any other
information that would be set only by rcu_read_lock().
Also note that there are actually two problems solved in ftrace:
First, it cares about the consistency of RCU read sections. It is being solved
the way as described and used in this patch.
Second, ftrace needs to make sure that nobody is inside the dynamic trampoline
when it is being freed. For this, it also calls synchronize_rcu_tasks() in
preemptive kernel in ftrace_shutdown().
Livepatch has similar problem but it is solved by ftrace for free.
klp_ftrace_handler() is a good guy and never sleeps. In addition, it is
registered with FTRACE_OPS_FL_DYNAMIC. It causes that
unregister_ftrace_function() calls:
* schedule_on_each_cpu(ftrace_sync) - always
* synchronize_rcu_tasks() - in preemptive kernel
The effect is that nobody is neither inside the dynamic trampoline nor inside
the ftrace handler after unregister_ftrace_function() returns.
[jkosina@suse.cz: reformat changelog, fix comment]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Recently vDSO support for CLOCK_MONOTONIC_RAW was added in
49eea433b3 ("arm64: Add support for CLOCK_MONOTONIC_RAW in
clock_gettime() vDSO"). Noticing that the core timekeeping code
never set tkr_raw.xtime_nsec, the vDSO implementation didn't
bother exposing it via the data page and instead took the
unshifted tk->raw_time.tv_nsec value which was then immediately
shifted left in the vDSO code.
Unfortunately, by accellerating the MONOTONIC_RAW clockid, it
uncovered potential 1ns time inconsistencies caused by the
timekeeping core not handing sub-ns resolution.
Now that the core code has been fixed and is actually setting
tkr_raw.xtime_nsec, we need to take that into account in the
vDSO by adding it to the shifted raw_time value, in order to
fix the user-visible inconsistency. Rather than do that at each
use (and expand the data page in the process), instead perform
the shift/addition operation when populating the data page and
remove the shift from the vDSO code entirely.
[jstultz: minor whitespace tweak, tried to improve commit
message to make it more clear this fixes a regression]
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>
Acked-by: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: "stable #4 . 8+" <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Due to how the MONOTONIC_RAW accumulation logic was handled,
there is the potential for a 1ns discontinuity when we do
accumulations. This small discontinuity has for the most part
gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW
in their vDSO clock_gettime implementation, we've seen failures
with the inconsistency-check test in kselftest.
This patch addresses the issue by using the same sub-ns
accumulation handling that CLOCK_MONOTONIC uses, which avoids
the issue for in-kernel users.
Since the ARM64 vDSO implementation has its own clock_gettime
calculation logic, this patch reduces the frequency of errors,
but failures are still seen. The ARM64 vDSO will need to be
updated to include the sub-nanosecond xtime_nsec values in its
calculation for this issue to be completely fixed.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "stable #4 . 8+" <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In tests, which excercise switching of clocksources, a NULL
pointer dereference can be observed on AMR64 platforms in the
clocksource read() function:
u64 clocksource_mmio_readl_down(struct clocksource *c)
{
return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
}
This is called from the core timekeeping code via:
cycle_now = tkr->read(tkr->clock);
tkr->read is the cached tkr->clock->read() function pointer.
When the clocksource is changed then tkr->clock and tkr->read
are updated sequentially. The code above results in a sequential
load operation of tkr->read and tkr->clock as well.
If the store to tkr->clock hits between the loads of tkr->read
and tkr->clock, then the old read() function is called with the
new clock pointer. As a consequence the read() function
dereferences a different data structure and the resulting 'reg'
pointer can point anywhere including NULL.
This problem was introduced when the timekeeping code was
switched over to use struct tk_read_base. Before that, it was
theoretically possible as well when the compiler decided to
reload clock in the code sequence:
now = tk->clock->read(tk->clock);
Add a helper function which avoids the issue by reading
tk_read_base->clock once into a local variable clk and then issue
the read function via clk->read(clk). This guarantees that the
read() function always gets the proper clocksource pointer handed
in.
Since there is now no use for the tkr.read pointer, this patch
also removes it, and to address stopping the fast timekeeper
during suspend/resume, it introduces a dummy clocksource to use
rather then just a dummy read function.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: stable <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Daniel Mentz <danielmentz@google.com>
Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Setting these bits causes libinput to fail to initialize the device;
setting BTN_TOUCH and BTN_TOOL_FINGER causes it to treat the mouse as a
touchpad, and it then refuses to continue when it discovers ABS_X is not
set.
This breaks all known Wayland compositors, as well as Xorg when the
libinput driver is being used.
This reverts commit f4b65b9563.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Cc: Che-Liang Chiou <clchiou@chromium.org>
Cc: Thierry Escande <thierry.escande@collabora.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Broxton-T was a forgotten child and we didn't apply the quirks for
Skylake+ properly. Meanwhile, a quirk for reducing the DMA latency
seems specific to the early Broxton model, so we leave as is.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Without this quirk, the touchpad is not responsive on this product, with
the following message repeated in the logs:
psmouse serio1: bad data from KBC - timeout
Add it to the notimeout list alongside other similar Fujitsu laptops.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Pull clk fixes from Stephen Boyd:
"One build fix for an Amlogic clk driver and a handful of Allwinner clk
driver fixes for some DT bindings and a randconfig build error that
all came in this merge window"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM
clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM
dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks
clk: sunxi-ng: sun5i: Fix ahb_bist_clk definition
clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM
clk: meson: gxbb: fix build error without RESET_CONTROLLER
clk: sunxi-ng: v3s: Fix usb otg device reset bit
clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
Pull NTB fixes from Jon Mason:
"NTB bug fixes to address the modinfo in ntb_perf, a couple of bugs in
the NTB transport QP calculations, skx doorbells, and sleeping in
ntb_async_tx_submit"
* tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb:
ntb: no sleep in ntb_async_tx_submit
ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits
ntb_transport: fix bug calculating num_qps_mw
ntb_transport: fix qp count bug
NTB: ntb_test: fix bug printing ntb_perf results
ntb: Correct modinfo usage statement for ntb_perf
Odd versions of gcc for the sh4 architecture will actually warn about
flags being used while uninitialized, so we set them to zero. Non crazy
gccs will optimize that out again, so it doesn't make a difference.
Next, over aggressive gccs could inline the expression that defines
use_lock, which could then introduce a race resulting in a lock
imbalance. By using READ_ONCE, we prevent that fate. Finally, we make
that assignment const, so that gcc can still optimize a nice amount.
Finally, we fix a potential deadlock between primary_crng.lock and
batched_entropy_reset_lock, where they could be called in opposite
order. Moving the call to invalidate_batched_entropy to outside the lock
rectifies this issue.
Fixes: b169c13de4
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Now before dumping a sock in sctp_diag, it only holds the sock while
the ep may be already destroyed. It can cause a use-after-free panic
when accessing ep->asocs.
This patch is to set sctp_sk(sk)->ep NULL in sctp_endpoint_destroy,
and check if this ep is already destroyed before dumping this ep.
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdrver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not sleep in ntb_async_tx_submit, which could deadlock.
This reverts commit "8c874cc140d667f84ae4642bb5b5e0d6396d2ca4"
Fixes: 8c874cc140 ("NTB: Address out of DMA descriptor issue with NTB")
Reported-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Fixing doorbell register length to 32bits per spec. On Skylake NTB, the
doorbell registers are 32bit write only registers. The source for the
doorbell is a 64bit register that shows the interrupt bits.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 783dfa6cc4 ("ntb: Adding Skylake Xeon NTB support")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
A divide by zero error occurs if qp_count is less than mw_count because
num_qps_mw is calculated to be zero. The calculation appears to be
incorrect.
The requirement is for num_qps_mw to be set to qp_count / mw_count
with any remainder divided among the earlier mws.
For example, if mw_count is 5 and qp_count is 12 then mws 0 and 1
will have 3 qps per window and mws 2 through 4 will have 2 qps per window.
Thus, when mw_num < qp_count % mw_count, num_qps_mw is 1 higher
than when mw_num >= qp_count.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
In cases where there are more mw's than spads/2-2, the mw count gets
reduced to match the limitation. ntb_transport also tries to ensure that
there are fewer qps than mws but uses the full mw count instead of
the reduced one. When this happens, the math in
'ntb_transport_setup_qp_mw' will get confused and result in a kernel
paging request bug.
This patch fixes the bug by reducing qp_count to the reduced mw count
instead of the full mw count.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The code mistakenly prints the local perf results for the remote test
so the script reports identical results for both directions. Fix this
by ensuring we print the remote result.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: a9c59ef774 ("ntb_test: Add a selftest script for the NTB subsystem")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The order parameters are powers of 2; adjust the usage information
to use correct mathematical representations.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Fixes: 8a7b6a778a ("ntb: ntb perf tool")
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
This patch fixes the phy loopback self_test failed issue. when
Marvell Phy Module is loaded, it will powerdown fiber when doing
phy loopback self test, which cause phy loopback self_test fail.
Signed-off-by: Lin Yun Sheng <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At Linux v3.5, packet processing can be done in process context of ALSA
PCM application as well as software IRQ context for OHCI 1394. Below is
an example of the callgraph (some calls are omitted).
ioctl(2) with e.g. HWSYNC
(sound/core/pcm_native.c)
->snd_pcm_common_ioctl1()
->snd_pcm_hwsync()
->snd_pcm_stream_lock_irq
(sound/core/pcm_lib.c)
->snd_pcm_update_hw_ptr()
->snd_pcm_udpate_hw_ptr0()
->struct snd_pcm_ops.pointer()
(sound/firewire/*)
= Each handler on drivers in ALSA firewire stack
(sound/firewire/amdtp-stream.c)
->amdtp_stream_pcm_pointer()
(drivers/firewire/core-iso.c)
->fw_iso_context_flush_completions()
->struct fw_card_driver.flush_iso_completion()
(drivers/firewire/ohci.c)
= flush_iso_completions()
->struct fw_iso_context.callback.sc
(sound/firewire/amdtp-stream.c)
= in_stream_callback() or out_stream_callback()
->...
->snd_pcm_stream_unlock_irq
When packet queueing error occurs or detecting invalid packets in
'in_stream_callback()' or 'out_stream_callback()', 'snd_pcm_stop_xrun()'
is called on local CPU with disabled IRQ.
(sound/firewire/amdtp-stream.c)
in_stream_callback() or out_stream_callback()
->amdtp_stream_pcm_abort()
->snd_pcm_stop_xrun()
->snd_pcm_stream_lock_irqsave()
->snd_pcm_stop()
->snd_pcm_stream_unlock_irqrestore()
The process is stalled on the CPU due to attempt to acquire recursive lock.
[ 562.630853] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 562.630861] 2-...: (1 GPs behind) idle=37d/140000000000000/0 softirq=38323/38323 fqs=7140
[ 562.630862] (detected by 3, t=15002 jiffies, g=21036, c=21035, q=5933)
[ 562.630866] Task dump for CPU 2:
[ 562.630867] alsa-source-OXF R running task 0 6619 1 0x00000008
[ 562.630870] Call Trace:
[ 562.630876] ? vt_console_print+0x79/0x3e0
[ 562.630880] ? msg_print_text+0x9d/0x100
[ 562.630883] ? up+0x32/0x50
[ 562.630885] ? irq_work_queue+0x8d/0xa0
[ 562.630886] ? console_unlock+0x2b6/0x4b0
[ 562.630888] ? vprintk_emit+0x312/0x4a0
[ 562.630892] ? dev_vprintk_emit+0xbf/0x230
[ 562.630895] ? do_sys_poll+0x37a/0x550
[ 562.630897] ? dev_printk_emit+0x4e/0x70
[ 562.630900] ? __dev_printk+0x3c/0x80
[ 562.630903] ? _raw_spin_lock+0x20/0x30
[ 562.630909] ? snd_pcm_stream_lock+0x31/0x50 [snd_pcm]
[ 562.630914] ? _snd_pcm_stream_lock_irqsave+0x2e/0x40 [snd_pcm]
[ 562.630918] ? snd_pcm_stop_xrun+0x16/0x70 [snd_pcm]
[ 562.630922] ? in_stream_callback+0x3e6/0x450 [snd_firewire_lib]
[ 562.630925] ? handle_ir_packet_per_buffer+0x8e/0x1a0 [firewire_ohci]
[ 562.630928] ? ohci_flush_iso_completions+0xa3/0x130 [firewire_ohci]
[ 562.630932] ? fw_iso_context_flush_completions+0x15/0x20 [firewire_core]
[ 562.630935] ? amdtp_stream_pcm_pointer+0x2d/0x40 [snd_firewire_lib]
[ 562.630938] ? pcm_capture_pointer+0x19/0x20 [snd_oxfw]
[ 562.630943] ? snd_pcm_update_hw_ptr0+0x47/0x3d0 [snd_pcm]
[ 562.630945] ? poll_select_copy_remaining+0x150/0x150
[ 562.630947] ? poll_select_copy_remaining+0x150/0x150
[ 562.630952] ? snd_pcm_update_hw_ptr+0x10/0x20 [snd_pcm]
[ 562.630956] ? snd_pcm_hwsync+0x45/0xb0 [snd_pcm]
[ 562.630960] ? snd_pcm_common_ioctl1+0x1ff/0xc90 [snd_pcm]
[ 562.630962] ? futex_wake+0x90/0x170
[ 562.630966] ? snd_pcm_capture_ioctl1+0x136/0x260 [snd_pcm]
[ 562.630970] ? snd_pcm_capture_ioctl+0x27/0x40 [snd_pcm]
[ 562.630972] ? do_vfs_ioctl+0xa3/0x610
[ 562.630974] ? vfs_read+0x11b/0x130
[ 562.630976] ? SyS_ioctl+0x79/0x90
[ 562.630978] ? entry_SYSCALL_64_fastpath+0x1e/0xad
This commit fixes the above bug. This assumes two cases:
1. Any error is detected in software IRQ context of OHCI 1394 context.
In this case, PCM substream should be aborted in packet handler. On the
other hand, it should not be done in any process context. TO distinguish
these two context, use 'in_interrupt()' macro.
2. Any error is detect in process context of ALSA PCM application.
In this case, PCM substream should not be aborted in packet handler
because PCM substream lock is acquired. The task to abort PCM substream
should be done in ALSA PCM core. For this purpose, SNDRV_PCM_POS_XRUN is
returned at 'struct snd_pcm_ops.pointer()'.
Suggested-by: Clemens Ladisch <clemens@ladisch.de>
Fixes: e9148dddc3c7("ALSA: firewire-lib: flush completed packets when reading PCM position")
Cc: <stable@vger.kernel.org> # 4.9+
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
During the module initialisation there is a possible race
(basically race between uld and lld) where neither the uld
nor lld notifies the uP about where to route the ctrl queue
completions. LLD skips notifying uP as the rdma queues were
not created by then (will leave it to ULD to notify the uP).
As the ULD comes up, it also skips notifying the uP as the
flag FULL_INIT_DONE is not set yet (ULD assumes that the
interface is not up yet).
Consequently, this race between uld and lld leaves uP
unnotified about where to send the ctrl queue completions
to, leading to iwarp RI_RES WR failure.
Here is the race:
CPU 0 CPU1
- allocates nic rx queus
- t4_sge_alloc_ctrl_txq()
(if rdma rsp queues exists,
tell uP to route ctrl queue
compl to rdma rspq)
- acquires the mutex_lock
- allocates rdma response queues
- if FULL_INIT_DONE set,
tell uP to route ctrl queue compl
to rdma rspq
- relinquishes mutex_lock
- acquires the mutex_lock
- enable_rx()
- set FULL_INIT_DONE
- relinquishes mutex_lock
This patch fixes the above issue.
Fixes: e7519f9926f1('cxgb4: avoid enabling napi twice to the same queue')
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
CC: Stable <stable@vger.kernel.org> # 4.9+
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ARM SoC fixes from Olof Johansson:
"Stream of fixes has slowed down, only a few this week:
- Some DT fixes for Allwinner platforms, and addition of a clock to
the R_CCU clock controller that had been missed.
- A couple of small DT fixes for am335x-sl50"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
I tried __GFP_NORETRY in the belief that __GFP_RECLAIM was effective. It
struggles with handling reclaim of our dirty buffers and relies on
reclaim via kswapd. As a result, a single pass of direct reclaim is
unreliable when i915 occupies the majority of available memory, and the
only means of effectively waiting on kswapd to amke progress is by not
setting the __GFP_NORETRY flag and lopping. That leaves us with the
dilemma of invoking the oomkiller instead of propagating the allocation
failure back to userspace where it can be handled more gracefully (one
hopes). In the future we may have __GFP_MAYFAIL to allow repeats up until
we genuinely run out of memory and the oomkiller would have been invoked.
Until then, let the oomkiller wreck havoc.
v2: Stop playing with side-effects of gfp flags and await __GFP_MAYFAIL
v3: Update comments that direct reclaim only appears to be ignoring our
dirty buffers!
Fixes: 24f8e00a8a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
Testcase: igt/gem_tiled_swapping
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michal Hocko <mhocko@suse.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit eaf4180155)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Commit 24f8e00a8a ("drm/i915: Prefer to report ENOMEM rather than
incur the oom for gfx allocations") made the bold decision to try and
avoid the oomkiller by reporting -ENOMEM to userspace if our allocation
failed after attempting to free enough buffer objects. In short, it
appears we were giving up too easily (even before we start wondering if
one pass of reclaim is as strong as we would like). Part of the problem
is that if we only shrink just enough pages for our expected allocation,
the likelihood of those pages becoming available to us is less than 100%
To counter-act that we ask for twice the number of pages to be made
available. Furthermore, we allow the shrinker to pull pages from the
active list in later passes.
v2: Be a little more cautious in paging out gfx buffers, and leave that
to a more balanced approach from shrink_slab(). Important when combined
with "drm/i915: Start writeback from the shrinker" as anything shrunk is
immediately swapped out and so should be more conservative.
Fixes: 24f8e00a8a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-1-chris@chris-wilson.co.uk
(cherry picked from commit 4846bf0ca8)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Johannes Berg says:
====================
Here's just the fix for that ancient bug:
* remove wext calling ndo_do_ioctl, since nobody needs
that now and it makes the type change easier
* use struct iwreq instead of struct ifreq almost everywhere
in wireless extensions code
* copy only struct iwreq from userspace in dev_ioctl for the
wireless extensions, since it's smaller than struct ifreq
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Allwinner fixes for 4.12
A few fixes around the PRCM support that got in 4.12 with a wrong
compatible, and a missing clock in the binding.
* tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
Signed-off-by: Olof Johansson <olof@lixom.net>
Two fixes for am335x-sl50 to fix a boot time error
for claiming SPI pins, and to fix a SDIO card detect
pin for production version of the device.
* tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull virtio bugfix from Michael Tsirkin:
"It turns out balloon does not handle IOMMUs correctly. We should fix
that at some point, for now let's just disable this configuration"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: disable VIOMMU support
Pull i2c fixes from Wolfram Sang:
"Two driver bugfixes"
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: ismt: fix wrong device address when unmap the data buffer
i2c: rcar: use correct length when unmapping DMA
Pull MIPS fixes from Ralf Baechle:
- Three highmem fixes:
+ Fixed mapping initialization
+ Adjust the pkmap location
+ Ensure we use at most one page for PTEs
- Fix makefile dependencies for .its targets to depend on vmlinux
- Fix reversed condition in BNEZC and JIALC software branch emulation
- Only flush initialized flush_insn_slot to avoid NULL pointer
dereference
- perf: Remove incorrect odd/even counter handling for I6400
- ftrace: Fix init functions tracing
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: .its targets depend on vmlinux
MIPS: Fix bnezc/jialc return address calculation
MIPS: kprobes: flush_insn_slot should flush only if probe initialised
MIPS: ftrace: fix init functions tracing
MIPS: mm: adjust PKMAP location
MIPS: highmem: ensure that we don't use more than one page for PTEs
MIPS: mm: fixed mappings: correct initialisation
MIPS: perf: Remove incorrect odd/even counter handling for I6400
We are passing a buffer with ACPI_ALLOCATE_BUFFER set to
acpi_evaluate_object, so we must free it when we are done with it.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
virtio balloon bypasses the DMA API entirely so does not support the
VIOMMU right now. It's not clear we need that support, for now let's
just make sure we don't pretend to support it.
Cc: stable@vger.kernel.org
Cc: Wei Wang <wei.w.wang@intel.com>
Fixes: 1a93769399 ("virtio: new feature to detect IOMMU device quirk")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Pull x86 fixes from Thomas Gleixner:
"Two fixlets for x86:
- Handle WARN_ONs proper with the new UD based WARN implementation
- Disable 1G mappings when 2M mappings are disabled by kmemleak or
debug_pagealloc. Otherwise 1G mappings might still be used,
confusing the debug mechanisms"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Disable 1GB direct mappings when disabling 2MB mappings
x86/debug: Handle early WARN_ONs proper
Pull timer fixes from Thomas Gleixner:
"Three fixlets for timers:
- Two hot-fixes for the alarmtimer based posix timers, which prevent
a nasty DOS by self rescheduling timers. The proper cleanup of that
mess is queued for 4.13
- Make a function static"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/broadcast: Make tick_broadcast_setup_oneshot() static
alarmtimer: Rate limit periodic intervals
alarmtimer: Prevent overflow of relative timers
Pull scheduler fixes from Thomas Gleixner:
"Two small fixes for the schedulre core:
- Use the proper switch_mm() variant in idle_task_exit() because that
code is not called with interrupts disabled.
- Fix a confusing typo in a printk"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off()
sched/fair: Fix typo in printk message
Pull perf fixes from Thomas Gleixner:
"Three fixes for the perf user space side:
- Fix the probing of precise_ip level, which got broken recently for
x86.
- Unbreak the ARCH=x86_64 build
- Report module before trying to unwind into the module code, which
avoids broken stack frames displayed"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf unwind: Report module before querying isactivation in dwfl unwind
perf tools: Fix build with ARCH=x86_64
perf evsel: Fix probing of precise_ip level for default cycles event
Pull irq fix from Thomas Gleixner:
"Add a missing resource release to an error path"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Release resources in __setup_irq() error path
Pull objtool fix from Thomas Gleixner:
"A single fix which adds fortify_panic to the list of no return
functions"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Add fortify_panic as __noreturn function
Pull LED fixes from Jacek Anaszewski:
"Two LED fixes:
- fix signal source assignment for leds-bcm6328
- revert patch that intended to fix LED behavior on suspend but it
had a side effect preventing suspend at all due to uevent being
sent on trigger removal"
* tag 'led_fixes_for_4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
Revert "leds: handle suspend/resume in heartbeat trigger"
leds: bcm6328: fix signal source assignment for leds 4 to 7
Pull USB fixes from Greg KH:
"Here are some small gadget and xhci USB fixes for 4.12-rc6.
Nothing major, but one of the gadget patches does fix a reported oops,
and the xhci ones resolve reported problems. All have been in
linux-next with no reported issues"
* tag 'usb-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks
usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
usb: xhci: Fix USB 3.1 supported protocol parsing
USB: gadget: fix GPF in gadgetfs
usb: gadget: composite: make sure to reactivate function on unbind
Pull staging and IIO fixes from Greg KH:
"Here are some small staging and IIO driver fixes for 4.12-rc6.
Nothing huge, just a few small driver fixes for reported issues. All
have been in linux-next with no reported issues"
* tag 'staging-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Staging: rtl8723bs: fix an error code in isFileReadable()
iio: buffer-dmaengine: Add missing header buffer_impl.h
iio: buffer-dma: Add missing header buffer_impl.h
iio: adc: meson-saradc: fix potential crash in meson_sar_adc_clear_fifo
iio: adc: mxs-lradc: Fix return value check in mxs_lradc_adc_probe()
iio: imu: inv_mpu6050: add accel lpf setting for chip >= MPU6500
staging: iio: ad7152: Fix deadlock in ad7152_write_raw_samp_freq()
Pull ceph fixes from Ilya Dryomov:
"A fix for an old ceph ->fh_to_* bug from Luis and two timestamp fixups
from Zheng, prompted by the ongoing y2038 work"
* tag 'ceph-for-4.12-rc6' of git://github.com/ceph/ceph-client:
ceph: unify inode i_ctime update
ceph: use current_kernel_time() to get request time stamp
ceph: check i_nlink while converting a file handle to dentry
* original hysteresis loop got broken by typo back in 2002; now
it never switches out of OPTTIME state. Fixed.
* critical levels for switching from OPTTIME to OPTSPACE and back
ought to be calculated once, at mount time.
* we should use mul_u64_u32_div() for those calculations, now that
->s_dsize is 64bit.
* to quote Kirk McKusick (in 1995 FreeBSD commit message):
The threshold for switching from time-space and space-time is too small
when minfree is 5%...so make it stay at space in this case.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Thomas Gleixner wrote:
> The CRIU support added a 'feature' which allows a user space task to send
> arbitrary (kernel) signals to itself. The changelog says:
>
> The kernel prevents sending of siginfo with positive si_code, because
> these codes are reserved for kernel. I think we can allow a task to
> send such a siginfo to itself. This operation should not be dangerous.
>
> Quite contrary to that claim, it turns out that it is outright dangerous
> for signals with info->si_code == SI_TIMER. The following code sequence in
> a user space task allows to crash the kernel:
>
> id = timer_create(CLOCK_XXX, ..... signo = SIGX);
> timer_set(id, ....);
> info->si_signo = SIGX;
> info->si_code = SI_TIMER:
> info->_sifields._timer._tid = id;
> info->_sifields._timer._sys_private = 2;
> rt_[tg]sigqueueinfo(..., SIGX, info);
> sigemptyset(&sigset);
> sigaddset(&sigset, SIGX);
> rt_sigtimedwait(sigset, info);
>
> For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
> results in a kernel crash because sigwait() dequeues the signal and the
> dequeue code observes:
>
> info->si_code == SI_TIMER && info->_sifields._timer._sys_private != 0
>
> which triggers the following callchain:
>
> do_schedule_next_timer() -> posix_cpu_timer_schedule() -> arm_timer()
>
> arm_timer() executes a list_add() on the timer, which is already armed via
> the timer_set() syscall. That's a double list add which corrupts the posix
> cpu timer list. As a consequence the kernel crashes on the next operation
> touching the posix cpu timer list.
>
> Posix clocks which are internally implemented based on hrtimers are not
> affected by this because hrtimer_start() can handle already armed timers
> nicely, but it's a reliable way to trigger the WARN_ON() in
> hrtimer_forward(), which complains about calling that function on an
> already armed timer.
This problem has existed since the posix timer code was merged into
2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
inject not just a signal (which linux has supported since 1.0) but the
full siginfo of a signal.
The core problem is that the code will reschedule in response to
signals getting dequeued not just for signals the timers sent but
for other signals that happen to a si_code of SI_TIMER.
Avoid this confusion by testing to see if the queued signal was
preallocated as all timer signals are preallocated, and so far
only the timer code preallocates signals.
Move the check for if a timer needs to be rescheduled up into
collect_signal where the preallocation check must be performed,
and pass the result back to dequeue_signal where the code reschedules
timers. This makes it clear why the code cares about preallocated
timers.
Cc: stable@vger.kernel.org
Reported-by: Thomas Gleixner <tglx@linutronix.de>
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Reference: 66dd34ad31 ("signal: allow to send any siginfo to itself")
Reference: 1669ce53e2ff ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
Fixes: db8b50ba75f2 ("[PATCH] POSIX clocks & timers")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Pull xfs fix from Darrick Wong:
"One more bugfix for you for 4.12-rc6 to fix something that came up in
an earlier rc:
- Fix some bogus ASSERT failures on CONFIG_SMP=n and CONFIG_XFS_DEBUG=y"
* tag 'xfs-4.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix spurious spin_is_locked() assert failures on non-smp kernels
Pull ufs fixes from Al Viro:
"Fix assorted ufs bugs: a couple of deadlocks, fs corruption in
truncate(), oopsen on tail unpacking and truncate when racing with
vmscan, mild fs corruption (free blocks stats summary buggered, *BSD
fsck would complain and fix), several instances of broken logics
around reserved blocks (starting with "check almost never triggers
when it should" and then there are issues with sufficiently large
UFS2)"
[ Note: ufs hasn't gotten any loving in a long time, because nobody
really seems to use it. These ufs fixes are triggered by people
actually caring now, not some sudden influx of new bugs. - Linus ]
* 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs_truncate_blocks(): fix the case when size is in the last direct block
ufs: more deadlock prevention on tail unpacking
ufs: avoid grabbing ->truncate_mutex if possible
ufs_get_locked_page(): make sure we have buffer_heads
ufs: fix s_size/s_dsize users
ufs: fix reserved blocks check
ufs: make ufs_freespace() return signed
ufs: fix logics in "ufs: make fsck -f happy"
Pull vfs fixes from Al Viro:
"A couple of fixes; a leak in mntns_install() caught by Andrei (this
cycle regression) + d_invalidate() softlockup fix - that had been
reported by a bunch of people lately, but the problem is pretty old"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: don't forget to put old mntns in mntns_install
Hang/soft lockup in d_invalidate with simultaneous calls
Pull PCI fixes from Bjorn Helgaas:
- fix another PCI_ENDPOINT build error (merged for v4.12)
- fix error codes added to config accessors for v4.12
* tag 'pci-v4.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: endpoint: Select CRC32 to fix test build error
PCI: Make error code types consistent in pci_{read,write}_config_*
Merge misc fixes from Andrew Morton:
"5 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: correct the comment when reclaimed pages exceed the scanned pages
userfaultfd: shmem: handle coredumping in handle_userfault()
mm: numa: avoid waiting on freed migrated pages
swap: cond_resched in swap_cgroup_prepare()
mm/memory-failure.c: use compound_head() flags for huge pages
Anon and hugetlbfs handle FOLL_DUMP set by get_dump_page() internally to
__get_user_pages().
shmem as opposed has no special FOLL_DUMP handling there so
handle_mm_fault() is invoked without mmap_sem and ends up calling
handle_userfault() that isn't expecting to be invoked without mmap_sem
held.
This makes handle_userfault() fail immediately if invoked through
shmem_vm_ops->fault during coredumping and solves the problem.
The side effect is a BUG_ON with no lock held triggered by the
coredumping process which exits. Only 4.11 is affected, pre-4.11 anon
memory holes are skipped in __get_user_pages by checking FOLL_DUMP
explicitly against empty pagetables (mm/gup.c:no_page_table()).
It's zero cost as we already had a check for current->flags to prevent
futex to trigger userfaults during exit (PF_EXITING).
Link: http://lkml.kernel.org/r/20170615214838.27429-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: <stable@vger.kernel.org> [4.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry. However,
we can race with migrate_misplaced_transhuge_page():
// do_huge_pmd_numa_page // migrate_misplaced_transhuge_page()
// Holds 0 refs on page // Holds 2 refs on page
vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
/* ... */
if (pmd_trans_migrating(*vmf->pmd)) {
page = pmd_page(*vmf->pmd);
spin_unlock(vmf->ptl);
ptl = pmd_lock(mm, pmd);
if (page_count(page) != 2)) {
/* roll back */
}
/* ... */
mlock_migrate_page(new_page, page);
/* ... */
spin_unlock(ptl);
put_page(page);
put_page(page); // page freed here
wait_on_page_locked(page);
goto out;
}
This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions. This has been observed on arm64 KVM guests.
We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait().
When we hit the race, migrate_misplaced_transhuge_page() will see the
reference and abort the migration, as it may do today in other cases.
Fixes: b8916634b7 ("mm: Prevent parallel splits during THP migration")
Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memory_failure() chooses a recovery action function based on the page
flags. For huge pages it uses the tail page flags which don't have
anything interesting set, resulting in:
> Memory failure: 0x9be3b4: Unknown page state
> Memory failure: 0x9be3b4: recovery action for unknown page: Failed
Instead, save a copy of the head page's flags if this is a huge page,
this means if there are no relevant flags for this tail page, we use the
head pages flags instead. This results in the me_huge_page() recovery
action being called:
> Memory failure: 0x9b7969: recovery action for huge page: Delayed
For hugepages that have not yet been allocated, this allows the hugepage
to be dequeued.
Fixes: 524fca1e73 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages")
Link: http://lkml.kernel.org/r/20170524130204.21845-1-james.morse@arm.com
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull powerpc fixes from Michael Ellerman:
"Three small fixes for recently merged code:
- remove a spurious WARN_ON when a PCI device has no of_node, it's
allowed in some circumstances for there to be no of_node.
- fix the offset for store EOI MMIOs in the XIVE interrupt
controller.
- fix non-const WARN_ONs which were becoming BUGs due to them losing
BUGFLAG_WARNING in a recent cleanup patch.
Thanks to: Alexey Kardashevskiy, Alistair Popple, Benjamin
Herrenschmidt"
* tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/debug: Add missing warn flag to WARN_ON's non-builtin path
powerpc/xive: Fix offset for store EOI MMIOs
powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix probing of precise_ip level for default cycles event, that
got broken recently on x86_64 when its arch code started
considering invalid requesting precise samples when not sampling
(i.e. when attr.sample_period == 0).
This also fixes another problem in s/390 where the precision
probing with sample_period == 0 returned precise_ip > 0, that
then, when setting up the real cycles event (not probing) would
return EOPNOTSUPP for precise_ip > 0 (as determined previously
by probing) and sample_period > 0.
These problems resulted in attr_precise not being set to the
highest precision available on x86.64 when no event was specified,
i.e. the canonical:
perf record ./workload
would end up using attr.precise_ip = 0. As a workaround this would
need to be done:
perf record -e cycles:P ./workload
And on s/390 it would plain not work, requiring using:
perf record -e cycles ./workload
as a workaround. (Arnaldo Carvalho de Melo)
- Fix perf build with ARCH=x86_64, when ARCH should be transformed
into ARCH=x86, just like with the main kernel Makefile and
tools/objtool's, i.e. use SRCARCH. (Jiada Wang)
- Avoid accessing uninitialized data structures when unwinding with
elfutils's libdw, making it more closely mimic libunwind's unwinder.
(Milian Wolff)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In the existing dn_route.c code, dn_route_output_slow() takes
dst->__refcnt before calling dn_insert_route() while dn_route_input_slow()
does not take dst->__refcnt before calling dn_insert_route().
This makes the whole routing code very buggy.
In dn_dst_check_expire(), dnrt_free() is called when rt expires. This
makes the routes inserted by dn_route_output_slow() not able to be
freed as the refcnt is not released.
In dn_dst_gc(), dnrt_drop() is called to release rt which could
potentially cause the dst->__refcnt to be dropped to -1.
In dn_run_flush(), dst_free() is called to release all the dst. Again,
it makes the dst inserted by dn_route_output_slow() not able to be
released and also, it does not wait on the rcu and could potentially
cause crash in the path where other users still refer to this dst.
This patch makes sure both input and output path do not take
dst->__refcnt before calling dn_insert_route() and also makes sure
dnrt_free()/dst_free() is called when removing dst from the hash table.
The only difference between those 2 calls is that dnrt_free() waits on
the rcu while dst_free() does not.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a kthread calls call_usermodehelper() the steps are:
1. allocate current->mm
2. load_elf_binary()
3. populate current->thread.regs
While doing this, interrupts are not disabled. If there is a perf
interrupt in the middle of this process (i.e. step 1 has completed
but not yet reached to step 3) and if perf tries to read userspace
regs, kernel oops with following log:
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc0000000000da0fc
...
Call Trace:
perf_output_sample_regs+0x6c/0xd0
perf_output_sample+0x4e4/0x830
perf_event_output_forward+0x64/0x90
__perf_event_overflow+0x8c/0x1e0
record_and_restart+0x220/0x5c0
perf_event_interrupt+0x2d8/0x4d0
performance_monitor_exception+0x54/0x70
performance_monitor_common+0x158/0x160
--- interrupt: f01 at avtab_search_node+0x150/0x1a0
LR = avtab_search_node+0x100/0x1a0
...
load_elf_binary+0x6e8/0x15a0
search_binary_handler+0xe8/0x290
do_execveat_common.isra.14+0x5f4/0x840
call_usermodehelper_exec_async+0x170/0x210
ret_from_kernel_thread+0x5c/0x7c
Fix it by setting abi to PERF_SAMPLE_REGS_ABI_NONE when userspace
pt_regs are not set.
Fixes: ed4a4ef85c ("powerpc/perf: Add support for sampling interrupt register state")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On Power9, trying to use data breakpoints throws the splat shown
below. This is because the check for a data breakpoint in DSISR is in
do_hash_page(), which is not called when in Radix mode.
Unable to handle kernel paging request for data at address 0xc000000000e19218
Faulting instruction address: 0xc0000000001155e8
cpu 0x0: Vector: 300 (Data Access) at [c0000000ef1e7b20]
pc: c0000000001155e8: find_pid_ns+0x48/0xe0
lr: c000000000116ac4: find_task_by_vpid+0x44/0x90
sp: c0000000ef1e7da0
msr: 9000000000009033
dar: c000000000e19218
dsisr: 400000
Move the check to handle_page_fault() so as to catch data breakpoints
in both Hash and Radix MMU modes.
We have to change the check in do_hash_page() against 0xa410 to use
0xa450, so as to include the value of (DSISR_DABRMATCH << 16).
There are two sites that call handle_page_fault() when in Radix, both
already pass DSISR in r4.
Fixes: caca285e5a ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code")
Cc: stable@vger.kernel.org # v4.7+
Reported-by: Shriya R. Kulkarni <shriykul@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Fix the fall-through case on hash, we need to reload DSISR]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
ftrace_caller() depends on a modified regs->nip to detect if a certain
function has been livepatched. However, with KPROBES_ON_FTRACE, it is
possible for regs->nip to have been modified by the kprobes pre_handler
(jprobes, for instance). In this case, we do not want to invoke the
livepatch_handler so as not to consume the livepatch stack.
To distinguish between the two (kprobes and livepatch), we check if
there is an active kprobe on the current function. If there is, then we
know for sure that it must have modified the NIP as we don't support
livepatching a kprobe'd function. In this case, we simply skip the
livepatch_handler and branch to the new NIP. Otherwise, the
livepatch_handler is invoked.
Fixes: ead514d5fb ("powerpc/kprobes: Add support for KPROBES_ON_FTRACE")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For DYNAMIC_FTRACE_WITH_REGS, we should be passing-in the original set
of registers in pt_regs, to capture the state _before_ ftrace_caller.
However, we are instead passing the stack pointer *after* allocating a
stack frame in ftrace_caller. Fix this by saving the proper value of r1
in pt_regs. Also, use SAVE_10GPRS() to simplify the code.
Fixes: 153086644f ("powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This fixes a crash when function_graph and jprobes are used together.
This is essentially commit 237d28db03 ("ftrace/jprobes/x86: Fix
conflict between jprobes and function graph tracing"), but for powerpc.
Jprobes breaks function_graph tracing since the jprobe hook needs to use
jprobe_return(), which never returns back to the hook, but instead to
the original jprobe'd function. The solution is to momentarily pause
function_graph tracing before invoking the jprobe hook and re-enable it
when returning back to the original jprobe'd function.
Fixes: 6794c78243 ("powerpc64: port of the function graph tracer")
Cc: stable@vger.kernel.org # v2.6.30+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull configfs updates from Christoph Hellwig:
"A fix from Nic for a race seen in production (including a stable tag).
And while I'm sending you this I'm also sneaking in a trivial new
helper from Bart so that we don't need inter-tree dependencies for the
next merge window"
* tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs:
configfs: Introduce config_item_get_unless_zero()
configfs: Fix race between create_link and configfs_rmdir
This fixes the following warning:
drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c: In function
'brcmf_usb_probe_phase2':
drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c:1198:2:
warning: 'devinfo' may be used uninitialized in this function
[-Wmaybe-uninitialized]
mutex_unlock(&devinfo->dev_init_lock);
Fixes: 6d0507a777 ("brcmfmac: add parameter to pass error code in firmware callback")
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Pull MMC fix from Ulf Hansson:
"MMC meson-gx host: work around broken SDIO with certain WiFi chips"
* tag 'mmc-v4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: meson-gx: work around broken SDIO with certain WiFi chips
Pull drm fixes from Dave Airlie:
"This is the main fixes pull for 4.12-rc6, all pretty normal for this
stage, nothing really stands out. The mxsfb one is probably the
largest and it's for a black screen boot problem.
AMD, i915, mgag200, msxfb, tegra fixes"
* tag 'drm-fixes-for-v4.12-rc6' of git://people.freedesktop.org/~airlied/linux:
drm: mxsfb_crtc: Reset the eLCDIF controller
drm/mgag200: Fix to always set HiPri for G200e4 V2
drm/tegra: Correct idr_alloc() minimum id
drm/tegra: Fix lockup on a use of staging API
gpu: host1x: Fix error handling
drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.
drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
drm/radeon: fix "force the UVD DPB into VRAM as well"
drm/i915: Fix GVT-g PVINFO version compatibility check
drm/i915: Fix SKL+ watermarks for 90/270 rotation
drm/i915: Fix scaling check for 90/270 degree plane rotation
drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO
Pull rdma fixes from Doug Ledford:
"I had thought at the time of the last pull request that there wouldn't
be much more to go, but several things just kept trickling in over the
last week.
Instead of just the six patches to bnxt_re that I had anticipated,
there are another five IPoIB patches, two qedr patches, and a few
other miscellaneous patches.
The bnxt_re patches are more lines of diff than I like to submit this
late in the game. That's mostly because of the first two patches in
the series of six. I almost dropped them just because of the lines of
churn, but on a close review, a lot of the churn came from removing
duplicated code sections and consolidating them into callable
routines. I felt like this made the number of lines of change more
acceptable, and they address problems, so I left them. The remainder
of the patches are all small, well contained, and well understood.
These have passed 0day testing, but have not been submitted to
linux-next (but a local merge test with your current master was
without any conflicts).
Summary:
- A fix for fix eea40b8f62 ("infiniband: call ipv6 route lookup via
the stub interface")
- Six patches against bnxt_re...the first two are considerably larger
than I would like, but as they address real issues I went ahead and
submitted them (it also helped that a good deal of the churn was
removing code repeated in multiple places and consolidating it to
one common function)
- Two fixes against qedr that just came in
- One fix against rxe that took a few revisions to get right plus
time to get the proper reviews
- Five late breaking IPoIB fixes
- One late cxgb4 fix"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
rdma/cxgb4: Fix memory leaks during module exit
IB/ipoib: Fix memory leak in create child syscall
IB/ipoib: Fix access to un-initialized napi struct
IB/ipoib: Delete napi in device uninit default
IB/ipoib: Limit call to free rdma_netdev for capable devices
IB/ipoib: Fix memory leaks for child interfaces priv
rxe: Fix a sleep-in-atomic bug in post_one_send
RDMA/qedr: Add 64KB PAGE_SIZE support to user-space queues
RDMA/qedr: Initialize byte_len in WC of READ and SEND commands
RDMA/bnxt_re: Remove FMR support
RDMA/bnxt_re: Fix RQE posting logic
RDMA/bnxt_re: Add HW workaround for avoiding stall for UD QPs
RDMA/bnxt_re: Dereg MR in FW before freeing the fast_reg_page_list
RDMA/bnxt_re: HW workarounds for handling specific conditions
RDMA/bnxt_re: Fixing the Control path command and response handling
IB/addr: Fix setting source address in addr6_resolve()
Pull x86 platform driver fix from Darren Hart:
"Just a single patch to fix an oops in the intel_telemetry_debugfs
module load/unload"
* tag 'platform-drivers-x86-v4.12-2' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: intel_telemetry_debugfs: fix oops when load/unload module
Pull block layer fix from Jens Axboe:
"Just a single fix this week, fixing a regression introduced in this
release.
When we put the final reference to the queue, we may need to block.
Ensure that we can safely do so. From Bart"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Fix a blk_exit_rl() regression
Pull dmi fixes from Jean Delvare.
* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi_scan: Check DMI structure length
firmware: dmi: Fix permissions of product_family
firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codes
firmware: dmi_scan: Look for SMBIOS 3 entry point first
Pull selinux fix from James Morris:
"Fix for a double free bug in SELinux"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: fix double free in selinux_parse_opts_str()
When trapped on WARN_ON(), report_bug() is expected to return
BUG_TRAP_TYPE_WARN so the caller will increment NIP by 4 and continue.
The __builtin_constant_p() path of the PPC's WARN_ON()
calls (indirectly) __WARN_FLAGS() which has BUGFLAG_WARNING set,
however the other branch does not which makes report_bug() report a
bug rather than a warning.
Fixes: f26dee1510 ("debug: Avoid setting BUGFLAG_WARNING twice")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
POWER9 DD1 has an erratum where writing to the TBU40 register, which
is used to apply an offset to the timebase, can cause the timebase to
lose counts. This results in the timebase on some CPUs getting out of
sync with other CPUs, which then results in misbehaviour of the
timekeeping code.
To work around the problem, we make KVM ignore the timebase offset for
all guests on POWER9 DD1 machines. This means that live migration
cannot be supported on POWER9 DD1 machines.
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Saeed Mahameed says:
====================
Mellanox mlx5 fixes 2017-06-14
This series contains some fixes for the mlx5 core and netdev driver.
Please pull and let me know if there's any problem.
For -stable:
("net/mlx5: Wait for FW readiness before initializing command interface") kernels >= 4.4
("net/mlx5e: Fix timestamping capabilities reporting") kernels >= 4.5
("net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it") kernels >= 4.9
("net/mlx5e: Fix min inline value for VF rep SQs") kernels >= 4.11
The "net/mlx5e: Fix min inline .." (a oneliner patch) doesn't cleanly apply
to 4.11, it hits a contextual conflict and can be easily resolved by:
+ mlx5_query_min_inline(mdev, &priv->params.tx_min_inline_mode);
to the end of mlx5e_build_rep_netdev_priv. Note the 2nd parameter of
mlx5_query_min_inline is slightly different from the original one.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
At present, HV KVM on POWER8 and POWER9 machines loses any instruction
or data breakpoint set in the host whenever a guest is run.
Instruction breakpoints are currently only used by xmon, but ptrace
and the perf_event subsystem can set data breakpoints as well as xmon.
To fix this, we save the host values of the debug registers (CIABR,
DAWR and DAWRX) before entering the guest and restore them on exit.
To provide space to save them in the stack frame, we expand the stack
frame allocated by kvmppc_hv_entry() from 112 to 144 bytes.
Fixes: b005255e12 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Driver Changes:
- dw-hdmi: Fix compilation error if REGMAP_MMIO not selected (Laurent)
- host1x: Fix incorrect return value (Christophe)
- tegra: Shore up idr API usage in tegra staging code (Dmitry)
- mgag200: Always use HiPri mode for G200e4v2 and limit max bandwidth (Mathieu)
- mxsfb: Ensure display can be lit up without bootloader initialization (Fabio)
Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Mathieu Larouche <mathieu.larouche@matrox.com>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
* tag 'drm-misc-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc:
drm: mxsfb_crtc: Reset the eLCDIF controller
drm/mgag200: Fix to always set HiPri for G200e4 V2
drm/tegra: Correct idr_alloc() minimum id
drm/tegra: Fix lockup on a use of staging API
gpu: host1x: Fix error handling
drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO
A few fixes for 4.12:
- fix a UVD regression on SI
- fix overflow in watermark calcs on large modes
* 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.
drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
drm/radeon: fix "force the UVD DPB into VRAM as well"
The error flow of mlx5e_create_netdev calls the cleanup call
of the given profile without checking if it exists, fix that.
Currently the VF reps don't register that callback and we crash
if getting into error -- can be reproduced by the user doing ctrl^C
while attempting to change the sriov mode from legacy to switchdev.
Fixes: 26e59d8077 '(net/mlx5e: Implement mlx5e interface attach/detach callbacks')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Sabrina Dubroca <sdubroca@redhat.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently the firmware API is partial and allows to offload only
the dscp part of the tos, also, ipv6 support isn't there yet.
As such, remove the offloading option of ipv4 dscp till the FW
APIs are more comprehensive.
Fixes: d79b6df6b1 ('net/mlx5e: Add parsing of TC pedit actions to HW format')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently we don't check that the link type is Eth and hence crash
on IB ports when attempting to deref esw->xxx, fix that.
To avoid repeating this check over and over, put the existing
checks and the one on link type in a single helper.
Fixes: 7768d1971d ('net/mlx5: E-Switch, Add control for encapsulation')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Mohamad Badarnah <mohamadb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The offending commit only changed the code path for PF/VF, but it
didn't take care of VF representors. As a result, since
params->tx_min_inline_mode for VF representors is kzalloced to 0
(MLX5_INLINE_MODE_NONE), all VF reps SQs were set to that mode.
This actually works on CX5 by default but broke CX4. Fix that by
adding a call to query the min inline mode from the VF rep build up code.
Fixes: a6f402e499 ("net/mlx5e: Tx, no inline copy on ConnectX-5")
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Before attempting to initialize the command interface we must wait till
the fw_initializing bit is clear.
If we fail to meet this condition the hardware will drop our
configuration, specifically the descriptors page address. This scenario
can happen when the firmware is still executing an FLR flow and did not
finish yet so the driver needs to wait for that to finish.
Fixes: e3297246c2 ('net/mlx5_core: Wait for FW readiness on startup')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Using the syzkaller kernel fuzzer, Andrey Konovalov generated the
following error in gadgetfs:
> BUG: KASAN: use-after-free in __lock_acquire+0x3069/0x3690
> kernel/locking/lockdep.c:3246
> Read of size 8 at addr ffff88003a2bdaf8 by task kworker/3:1/903
>
> CPU: 3 PID: 903 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #35
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> Call Trace:
> __dump_stack lib/dump_stack.c:16 [inline]
> dump_stack+0x292/0x395 lib/dump_stack.c:52
> print_address_description+0x78/0x280 mm/kasan/report.c:252
> kasan_report_error mm/kasan/report.c:351 [inline]
> kasan_report+0x230/0x340 mm/kasan/report.c:408
> __asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:429
> __lock_acquire+0x3069/0x3690 kernel/locking/lockdep.c:3246
> lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:299 [inline]
> gadgetfs_suspend+0x89/0x130 drivers/usb/gadget/legacy/inode.c:1682
> set_link_state+0x88e/0xae0 drivers/usb/gadget/udc/dummy_hcd.c:455
> dummy_hub_control+0xd7e/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2074
> rh_call_control drivers/usb/core/hcd.c:689 [inline]
> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
> usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
> usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
> usb_clear_port_feature+0x74/0xa0 drivers/usb/core/hub.c:412
> hub_port_disable+0x123/0x510 drivers/usb/core/hub.c:4177
> hub_port_init+0x1ed/0x2940 drivers/usb/core/hub.c:4648
> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> port_event drivers/usb/core/hub.c:5105 [inline]
> hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
> process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
> process_scheduled_works kernel/workqueue.c:2157 [inline]
> worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
> kthread+0x363/0x440 kernel/kthread.c:231
> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
>
> Allocated by task 9958:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
> kmem_cache_alloc_trace+0x87/0x280 mm/slub.c:2745
> kmalloc include/linux/slab.h:492 [inline]
> kzalloc include/linux/slab.h:665 [inline]
> dev_new drivers/usb/gadget/legacy/inode.c:170 [inline]
> gadgetfs_fill_super+0x24f/0x540 drivers/usb/gadget/legacy/inode.c:1993
> mount_single+0xf6/0x160 fs/super.c:1192
> gadgetfs_mount+0x31/0x40 drivers/usb/gadget/legacy/inode.c:2019
> mount_fs+0x9c/0x2d0 fs/super.c:1223
> vfs_kern_mount.part.25+0xcb/0x490 fs/namespace.c:976
> vfs_kern_mount fs/namespace.c:2509 [inline]
> do_new_mount fs/namespace.c:2512 [inline]
> do_mount+0x41b/0x2d90 fs/namespace.c:2834
> SYSC_mount fs/namespace.c:3050 [inline]
> SyS_mount+0xb0/0x120 fs/namespace.c:3027
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> Freed by task 9960:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
> slab_free_hook mm/slub.c:1357 [inline]
> slab_free_freelist_hook mm/slub.c:1379 [inline]
> slab_free mm/slub.c:2961 [inline]
> kfree+0xed/0x2b0 mm/slub.c:3882
> put_dev+0x124/0x160 drivers/usb/gadget/legacy/inode.c:163
> gadgetfs_kill_sb+0x33/0x60 drivers/usb/gadget/legacy/inode.c:2027
> deactivate_locked_super+0x8d/0xd0 fs/super.c:309
> deactivate_super+0x21e/0x310 fs/super.c:340
> cleanup_mnt+0xb7/0x150 fs/namespace.c:1112
> __cleanup_mnt+0x1b/0x20 fs/namespace.c:1119
> task_work_run+0x1a0/0x280 kernel/task_work.c:116
> exit_task_work include/linux/task_work.h:21 [inline]
> do_exit+0x18a8/0x2820 kernel/exit.c:878
> do_group_exit+0x14e/0x420 kernel/exit.c:982
> get_signal+0x784/0x1780 kernel/signal.c:2318
> do_signal+0xd7/0x2130 arch/x86/kernel/signal.c:808
> exit_to_usermode_loop+0x1ac/0x240 arch/x86/entry/common.c:157
> prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
> entry_SYSCALL_64_fastpath+0xbc/0xbe
>
> The buggy address belongs to the object at ffff88003a2bdae0
> which belongs to the cache kmalloc-1024 of size 1024
> The buggy address is located 24 bytes inside of
> 1024-byte region [ffff88003a2bdae0, ffff88003a2bdee0)
> The buggy address belongs to the page:
> page:ffffea0000e8ae00 count:1 mapcount:0 mapping: (null)
> index:0x0 compound_mapcount: 0
> flags: 0x100000000008100(slab|head)
> raw: 0100000000008100 0000000000000000 0000000000000000 0000000100170017
> raw: ffffea0000ed3020 ffffea0000f5f820 ffff88003e80efc0 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff88003a2bd980: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88003a2bda00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff88003a2bda80: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb
> ^
> ffff88003a2bdb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88003a2bdb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
What this means is that the gadgetfs_suspend() routine was trying to
access dev->lock after it had been deallocated. The root cause is a
race in the dummy_hcd driver; the dummy_udc_stop() routine can race
with the rest of the driver because it contains no locking. And even
when proper locking is added, it can still race with the
set_link_state() function because that function incorrectly drops the
private spinlock before invoking any gadget driver callbacks.
The result of this race, as seen above, is that set_link_state() can
invoke a callback in gadgetfs even after gadgetfs has been unbound
from dummy_hcd's UDC and its private data structures have been
deallocated.
include/linux/usb/gadget.h documents that the ->reset, ->disconnect,
->suspend, and ->resume callbacks may be invoked in interrupt context.
In general this is necessary, to prevent races with gadget driver
removal. This patch fixes dummy_hcd to retain the spinlock across
these calls, and it adds a spinlock acquisition to dummy_udc_stop() to
prevent the race.
The net2280 driver makes the same mistake of dropping the private
spinlock for its ->disconnect and ->reset callback invocations. The
patch fixes it too.
Lastly, since gadgetfs_suspend() may be invoked in interrupt context,
it cannot assume that interrupts are enabled when it runs. It must
use spin_lock_irqsave() instead of spin_lock_irq(). The patch fixes
that bug as well.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com>
CC: <stable@vger.kernel.org>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The referenced file dsa.txt is located at
Documentation/devicetree/bindings/net/dsa/dsa.txt
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
In sctp_for_each_transport, pos is used to save how many objs it has
dumped. Now it gets the last obj by sctp_transport_get_idx, then gets
the next obj by sctp_transport_get_next.
The issue is that in the meanwhile if some objs in transport hashtable
are removed and the objs nums are less than pos, sctp_transport_get_idx
would return NULL and hti.walker.tbl is NULL as well. At this moment
it should stop hti, instead of continue getting the next obj. Or it
would cause a NULL pointer dereference in sctp_transport_get_next.
This patch is to pass pos + 1 into sctp_transport_get_idx to get the
next obj directly, even if pos > objs nums, it would return NULL and
stop hti.
Fixes: 626d16f50f ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the eLCDIF initialization steps listed in the MX6SX
Reference Manual the eLCDIF block reset is mandatory.
Without performing the eLCDIF reset the display shows garbage content
when the kernel boots.
In earlier tests this issue has not been observed because the bootloader
was previously showing a splash screen and the bootloader display driver
does properly implement the eLCDIF reset.
Add the eLCDIF reset to the driver, so that it can operate correctly
independently of the bootloader.
Tested on a imx6sx-sdb board.
Cc: <stable@vger.kernel.org>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1494007301-14535-1-git-send-email-fabio.estevam@nxp.com
This fixes CVE-2017-7482.
When a kerberos 5 ticket is being decoded so that it can be loaded into an
rxrpc-type key, there are several places in which the length of a
variable-length field is checked to make sure that it's not going to
overrun the available data - but the data is padded to the nearest
four-byte boundary and the code doesn't check for this extra. This could
lead to the size-remaining variable wrapping and the data pointer going
over the end of the buffer.
Fix this by making the various variable-length data checks use the padded
length.
Reported-by: 石磊 <shilei-c@360.cn>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.c.dionne@auristor.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
USB devices rely on queuing functionality provided by the fwsignal
module regardless the mode fwsignal is operating in. For this some
data structure needs to be reserved which is tied to the interface,
which is done by brcmf_fws_add_interface(). However, it checks the
mode. Replace that by checking result from brcmf_fws_queue_skbs().
Otherwise the driver will crash in a null pointer dereference when
data is transmitted on the interface.
Fixes: fc0471e3e8 ("brcmfmac: ignore interfaces when fwsignal is disabled")
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
When request firmware fails, brcmf_ops_sdio_remove is being called and
brcmf_bus freed. In such circumstancies if you do a suspend/resume cycle
the kernel hangs on resume due a NULL pointer dereference in resume
function. So in brcmf_sdio_firmware_callback() we need to unbind the
driver from both sdio_func devices when firmware load failure is indicated.
Cc: stable@vger.kernel.org # 4.9.x-
Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
When firmware loading failed the code used to unbind the device provided
by the calling code. However, for the sdio driver two devices are bound
and both need to be released upon failure. The callback has been extended
with parameter to pass error code so add that in this commit upon firmware
loading failure.
Cc: stable@vger.kernel.org # 4.9.x-
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Now when starting the dad work in addrconf_mod_dad_work, if the dad work
is idle and queued, it needs to hold ifa.
The problem is there's one gap in [1], during which if the pending dad work
is removed elsewhere. It will miss to hold ifa, but the dad word is still
idea and queue.
if (!delayed_work_pending(&ifp->dad_work))
in6_ifa_hold(ifp);
<--------------[1]
mod_delayed_work(addrconf_wq, &ifp->dad_work, delay);
An use-after-free issue can be caused by this.
Chen Wei found this issue when WARN_ON(!hlist_unhashed(&ifp->addr_lst)) in
net6_ifa_finish_destroy was hit because of it.
As Hannes' suggestion, this patch is to fix it by holding ifa first in
addrconf_mod_dad_work, then calling mod_delayed_work and putting ifa if
the dad_work is already in queue.
Note that this patch did not choose to fix it with:
if (!mod_delayed_work(delay))
in6_ifa_hold(ifp);
As with it, when delay == 0, dad_work would be scheduled immediately, all
addrconf_mod_dad_work(0) callings had to be moved under ifp->lock.
Reported-by: Wei Chen <weichen@redhat.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I made a mistake in commit bfd20f1. We should skip the force on with the
option enabled instead of vice versa. Not sure why this passed our
performance test, sorry.
Fixes: bfd20f1cc8 ('x86, iommu/vt-d: Add an option to disable Intel IOMMU force on')
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Architecturally we should apply a 0x400 offset for these. Not doing
it will break future HW implementations.
The offset of 0 is supposed to remain for "triggers" though not all
sources support both trigger and store EOI, and in P9 specifically,
some sources will treat 0 as a store EOI. But future chips will not.
So this makes us use the properly architected offset which should work
always.
Fixes: 243e25112d ("powerpc/xive: Native exploitation of the XIVE interrupt controller")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This reverts commit 12a7cf5ba6.
This commit apparently attempted to fix an issue that didn't really
exist, furthermore: this commit is the source of deadlocks and crashes
seen in multiple cases related to failing the primary mirror dev while
syncing.
Reported-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Currently they return -1 on error, which will confuse callers if
they try to interpret it as a normal negative error code.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Since version 3.0.0 of the SMBIOS specification, there can be
multiple entry points in memory, pointing to one or two DMI tables.
If both a 32-bit ("_SM_") entry point and a 64-bit ("_SM3_") entry
point are present, the specification requires that the latter points
to a table which is a super-set of the table pointed to by the
former. Therefore we should give preference to the 64-bit ("_SM3_")
entry point.
However, currently the code is picking the first valid entry point
it finds. Per specification, we should look for a 64-bit ("_SM3_")
entry point first, and if we can't find any, look for a 32-bit
("_SM_" or "_DMI_") entry point. Modify the code to do that.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
It's not hard to trigger a bunch of d_invalidate() on the same
dentry in parallel. They end up fighting each other - any
dentry picked for removal by one will be skipped by the rest
and we'll go for the next iteration through the entire
subtree, even if everything is being skipped. Morevoer, we
immediately go back to scanning the subtree. The only thing
we really need is to dissolve all mounts in the subtree and
as soon as we've nothing left to do, we can just unhash the
dentry and bugger off.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The .its targets require information about the kernel binary, such as
its entry point, which is extracted from the vmlinux ELF. We therefore
require that the ELF is built before the .its files are generated.
Declare this requirement in the Makefile such that make will ensure this
is always the case, otherwise in corner cases we can hit issues as the
.its is generated with an incorrect (either invalid or stale) entry
point.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: cf2a5e0bb4 ("MIPS: Support generating Flattened Image Trees (.itb)")
Cc: linux-mips@linux-mips.org
Cc: stable <stable@vger.kernel.org> # v4.9+
Patchwork: https://patchwork.linux-mips.org/patch/16179/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The code handling the pop76 opcode (ie. bnezc & jialc instructions) in
__compute_return_epc_for_insn() needs to set the value of $31 in the
jialc case, which is encoded with rs = 0. However its check to
differentiate bnezc (rs != 0) from jialc (rs = 0) was unfortunately
backwards, meaning that if we emulate a bnezc instruction we clobber $31
& if we emulate a jialc instruction it actually behaves like a jic
instruction.
Fix this by inverting the check of rs to match the way the instructions
are actually encoded.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 28d6f93d20 ("MIPS: Emulate the new MIPS R6 BNEZC and JIALC instructions")
Cc: stable <stable@vger.kernel.org> # v4.0+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16178/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull networking fixes from David Miller:
1) The netlink attribute passed in to dev_set_alias() is not
necessarily NULL terminated, don't use strlcpy() on it. From
Alexander Potapenko.
2) Fix implementation of atomics in arm64 bpf JIT, from Daniel
Borkmann.
3) Correct the release of netdevs and driver private data in certain
circumstances.
4) Sanitize netlink message length properly in decnet, from Mateusz
Jurczyk.
5) Don't leak kernel data in rtnl_fill_vfinfo() netlink blobs. From
Yuval Mintz.
6) Hash secret is never initialized in ipv6 ILA translation code, from
Arnd Bergmann. I guess those clang warnings about unused inline
functions are useful for something!
7) Fix endian selection in bpf_endian.h, from Daniel Borkmann.
8) Sanitize sockaddr length before dereferncing any fields in AF_UNIX
and CAIF. From Mateusz Jurczyk.
9) Fix timestamping for GMAC3 chips in stmmac driver, from Mario
Molitor.
10) Do not leak netdev on dev_alloc_name() errors in mac80211, from
Johannes Berg.
11) Fix locking in sctp_for_each_endpoint(), from Xin Long.
12) Fix wrong memset size on 32-bit in snmp6, from Christian Perle.
13) Fix use after free in ip_mc_clear_src(), from WANG Cong.
14) Fix regressions caused by ICMP rate limiting changes in 4.11, from
Jesper Dangaard Brouer.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
i40e: Fix a sleep-in-atomic bug
net: don't global ICMP rate limit packets originating from loopback
net/act_pedit: fix an error code
net: update undefined ->ndo_change_mtu() comment
net_sched: move tcf_lock down after gen_replace_estimator()
caif: Add sockaddr length check before accessing sa_family in connect handler
qed: fix dump of context data
qmi_wwan: new Telewell and Sierra device IDs
net: phy: Fix MDIO_THUNDER dependencies
netconsole: Remove duplicate "netconsole: " logging prefix
igmp: acquire pmc lock for ip_mc_clear_src()
r8152: give the device version
net: rps: fix uninitialized symbol warning
mac80211: don't send SMPS action frame in AP mode when not needed
mac80211/wpa: use constant time memory comparison for MACs
mac80211: set bss_info data before configuring the channel
mac80211: remove 5/10 MHz rate code from station MLME
mac80211: Fix incorrect condition when checking rx timestamp
mac80211: don't look at the PM bit of BAR frames
i40e: fix handling of HW ATR eviction
...
Pull crypto fix from Herbert Xu:
"This fixes a bug on sparc where we may dereference freed stack memory"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: Work around deallocated stack frame reference gcc bug on sparc.
Pull ACPI fixes from Rafael Wysocki:
"These revert an ACPICA commit from the 4.11 cycle that causes problems
to happen on some systems and add a protection against possible kernel
crashes due to table reference counter imbalance.
Specifics:
- Revert a 4.11 ACPICA change that made assumptions which are not
satisfied on some systems and caused the enumeration of resources
to fail on them (Rafael Wysocki).
- Add a mechanism to prevent tables from being unmapped prematurely
due to reference counter overflows (Lv Zheng)"
* tag 'acpi-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance
Revert "ACPICA: Disassembler: Enhance resource descriptor detection"
Pull power management fixes from Rafael Wysocki:
"These revert a recent cpufreq schedutil governor change that turned
out to be problematic and fix a few minor issues in cpufreq, cpuidle
and the Exynos devfreq drivers.
Specifics:
- Revert a recent cpufreq schedutil governor change that caused some
systems to behave undesirably (Rafael Wysocki).
- Fix a cpufreq conservative governor issue introduced during the
3.10 cycle that prevents it from working as expected in some
situations (Tomasz Wilczyński).
- Fix an error code path in the generic cpuidle driver for DT-based
systems (Christophe Jaillet).
- Fix three minor issues in devfreq drivers for Exynos (Arvind Yadav,
Krzysztof Kozlowski)"
* tag 'pm-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: dt: Add missing 'of_node_put()'
cpufreq: conservative: Allow down_threshold to take values from 1 to 10
Revert "cpufreq: schedutil: Reduce frequencies slower"
PM / devfreq: exynos-ppmu: Staticize event list
PM / devfreq: exynos-ppmu: Handle return value of clk_prepare_enable
PM / devfreq: exynos-nocp: Handle return value of clk_prepare_enable
Pull HID fix from Jiri Kosina:
- ifdef-based bandaid for a long-standing issue with HID driver
matching, avoiding regressions in cases where specific driver is not
enabled in kernel .config, from Jiri Kosina
* 'for-4.12/driver-matching-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: let generic driver yield control iff specific driver has been enabled
Pull media fixes from Mauro Carvalho Chehab:
- some build dependency issues at CEC core with randconfigs
- fix an off by one error at vb2
- a race fix at cec core
- driver fixes at tc358743, sir_ir and rainshadow-cec
* tag 'media/v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] media/cec.h: use IS_REACHABLE instead of IS_ENABLED
[media] cec: race fix: don't return -ENONET in cec_receive()
[media] sir_ir: infinite loop in interrupt handler
[media] cec-notifier.h: handle unreachable CONFIG_CEC_CORE
[media] cec: improve MEDIA_CEC_RC dependencies
[media] vb2: Fix an off by one error in 'vb2_plane_vaddr'
[media] rainshadow-cec: Fix missing spin_lock_init()
[media] tc358743: fix register i2c_rd/wr function fix
The logics when deciding whether we need to do anything with direct blocks
is broken when new size is within the last direct block. It's better to
find the path to the last byte _not_ to be removed and use that instead
of the path to the beginning of the first block to be freed...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
If userspace attempts to call the KVM_RUN ioctl when it has hardware
transactional memory (HTM) enabled, the values that it has put in the
HTM-related SPRs TFHAR, TFIAR and TEXASR will get overwritten by
guest values. To fix this, we detect this condition and save those
SPR values in the thread struct, and disable HTM for the task. If
userspace goes to access those SPRs or the HTM facility in future,
a TM-unavailable interrupt will occur and the handler will reload
those SPRs and re-enable HTM.
If userspace has started a transaction and suspended it, we would
currently lose the transactional state in the guest entry path and
would almost certainly get a "TM Bad Thing" interrupt, which would
cause the host to crash. To avoid this, we detect this case and
return from the KVM_RUN ioctl with an EINVAL error, with the KVM
exit reason set to KVM_EXIT_FAIL_ENTRY.
Fixes: b005255e12 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This restores several special-purpose registers (SPRs) to sane values
on guest exit that were missed before.
TAR and VRSAVE are readable and writable by userspace, and we need to
save and restore them to prevent the guest from potentially affecting
userspace execution (not that TAR or VRSAVE are used by any known
program that run uses the KVM_RUN ioctl). We save/restore these
in kvmppc_vcpu_run_hv() rather than on every guest entry/exit.
FSCR affects userspace execution in that it can prohibit access to
certain facilities by userspace. We restore it to the normal value
for the task on exit from the KVM_RUN ioctl.
IAMR is normally 0, and is restored to 0 on guest exit. However,
with a radix host on POWER9, it is set to a value that prevents the
kernel from executing user-accessible memory. On POWER9, we save
IAMR on guest entry and restore it on guest exit to the saved value
rather than 0. On POWER8 we continue to set it to 0 on guest exit.
PSPB is normally 0. We restore it to 0 on guest exit to prevent
userspace taking advantage of the guest having set it non-zero
(which would allow userspace to set its SMT priority to high).
UAMOR is normally 0. We restore it to 0 on guest exit to prevent
the AMR from being used as a covert channel between userspace
processes, since the AMR is not context-switched at present.
Fixes: b005255e12 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
tail unpacking is done in a wrong place; the deadlocks galore
is best dealt with by doing that in ->write_iter() (and switching
to iomap, while we are at it), but that's rather painful to
backport. The trouble comes from grabbing pages that cover
the beginning of tail from inside of ufs_new_fragments(); ongoing
pageout of any of those is going to deadlock on ->truncate_mutex
with process that got around to extending the tail holding that
and waiting for page to get unlocked, while ->writepage() on
that page is waiting on ->truncate_mutex.
The thing is, we don't need ->truncate_mutex when the fragment
we are trying to map is within the tail - the damn thing is
allocated (tail can't contain holes).
Let's do a plain lookup and if the fragment is present, we can
just pretend that we'd won the race in almost all cases. The
only exception is a fragment between the end of tail and the
end of block containing tail.
Protect ->i_lastfrag with ->meta_lock - read_seqlock_excl() is
sufficient.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The driver may sleep under a spin lock, and the function call path is:
i40e_ndo_set_vf_port_vlan (acquire the lock by spin_lock_bh)
i40e_vsi_remove_pvid
i40e_vlan_stripping_disable
i40e_aq_update_vsi_params
i40e_asq_send_command
mutex_lock --> may sleep
To fixed it, the spin lock is released before "i40e_vsi_remove_pvid", and
the lock is acquired again after this function.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
callers rely upon that, but find_lock_page() racing with attempt of
page eviction by memory pressure might have left us with
* try_to_free_buffers() successfully done
* __remove_mapping() failed, leaving the page in our mapping
* find_lock_page() returning an uptodate page with no
buffer_heads attached.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Allwinner clock fixes for 4.12
Some fixes that fix some bindings that went in 4.12, fix a few reset and
clock offsets and a build error fix
* tag 'sunxi-clk-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM
clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM
dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks
clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM
clk: sunxi-ng: v3s: Fix usb otg device reset bit
clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
For UFS2 we need 64bit variants; we even store them in uspi, but
use 32bit ones instead. One wrinkle is in handling of reserved
space - recalculating it every time had been stupid all along, but
now it would become really ugly. Just calculate it once...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a) honour ->s_minfree; don't just go with default (5)
b) don't bother with capability checks until we know we'll need them
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Florian Weimer seems to have a glibc test-case which requires that
loopback interfaces does not get ICMP ratelimited. This was broken by
commit c0303efeab ("net: reduce cycles spend on ICMP replies that
gets rate limited").
An ICMP response will usually be routed back-out the same incoming
interface. Thus, take advantage of this and skip global ICMP
ratelimit when the incoming device is loopback. In the unlikely event
that the outgoing it not loopback, due to strange routing policy
rules, ICMP rate limiting still works via peer ratelimiting via
icmpv4_xrlim_allow(). Thus, we should still comply with RFC1812
(section 4.3.2.8 "Rate Limiting").
This seems to fix the reproducer given by Florian. While still
avoiding to perform expensive and unneeded outgoing route lookup for
rate limited packets (in the non-loopback case).
Fixes: c0303efeab ("net: reduce cycles spend on ICMP replies that gets rate limited")
Reported-by: Florian Weimer <fweimer@redhat.com>
Reported-by: "H.J. Lu" <hjl.tools@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid that the following complaint is reported:
BUG: sleeping function called from invalid context at kernel/workqueue.c:2790
in_atomic(): 1, irqs_disabled(): 0, pid: 41, name: rcuop/3
1 lock held by rcuop/3/41:
#0: (rcu_callback){......}, at: [<ffffffff8111f9a2>] rcu_nocb_kthread+0x282/0x500
Call Trace:
dump_stack+0x86/0xcf
___might_sleep+0x174/0x260
__might_sleep+0x4a/0x80
flush_work+0x7e/0x2e0
__cancel_work_timer+0x143/0x1c0
cancel_work_sync+0x10/0x20
blk_throtl_exit+0x25/0x60
blkcg_exit_queue+0x35/0x40
blk_release_queue+0x42/0x130
kobject_put+0xa9/0x190
This happens since we invoke callbacks that need to block from the
queue release handler. Fix this by pushing the final release to
a workqueue.
Reported-by: Ross Zwisler <zwisler@gmail.com>
Fixes: commit b425e50492 ("block: Avoid that blk_exit_rl() triggers a use-after-free")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Updated changelog
Signed-off-by: Jens Axboe <axboe@fb.com>
I'm reviewing static checker warnings where we do ERR_PTR(0), which is
the same as NULL. I'm pretty sure we intended to return ERR_PTR(-EINVAL)
here. Sometimes these bugs lead to a NULL dereference but I don't
immediately see that problem here.
Fixes: 71d0ed7079 ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Storing stats _only_ at new locations is wrong for UFS1; old
locations should always be kept updated. The check for "has
been converted to use of new locations" is also wrong - it
should be "->fs_maxbsize is equal to ->fs_bsize".
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The flow of creating a new child goes through ipoib_vlan_add
which allocates a new interface and checks the rtnl_lock.
If the lock is taken, restart_syscall will be called to restart
the system call again. In this case we are not releasing the
already allocated interface, causing a leak.
Fixes: 9baa0b0364 ("IB/ipoib: Add rtnl_link_ops support")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch mekas init_default and uninit_default symmetric
with a call to delete napi. Additionally, the uninit_default
gained delete napi call in case of init_default fails.
Fixes: 515ed4f3aa ('IB/IPoIB: Separate control and data related initializations')
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
There is a need to free priv explicitly and not just to release
the device, child priv is freed explicitly on remove flow and this
patch also includes priv free on error flow in P_key creation
and also in add_port.
Fixes: cd565b4b51 ('IB/IPoIB: Support acceleration options callbacks')
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Update ->ndo_change_mtu() callback comment to remove text
about returning error in case of undefined callback. This
change makes the comment match the existing code behavior.
Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 18e7a45af9 ("perf/x86: Reject non sampling events with
precise_ip") returns -EINVAL for sys_perf_event_open() with an attribute
with (attr.precise_ip > 0 && attr.sample_period == 0), just like is done
in the routine used to probe the max precise level when no events were
passed to 'perf record' or 'perf top', i.e.:
perf_evsel__new_cycles()
perf_event_attr__set_max_precise_ip()
The x86 code, in x86_pmu_hw_config(), which is called all the way from
sys_perf_event_open() did, starting with the aforementioned commit:
/* There's no sense in having PEBS for non sampling events: */
if (!is_sampling_event(event))
return -EINVAL;
Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using
just the non precise cycles variant.
To make sure that this is the case, I tested it, before this patch,
with:
# perf probe -L x86_pmu_hw_config
<x86_pmu_hw_config@/home/acme/git/linux/arch/x86/events/core.c:0>
0 int x86_pmu_hw_config(struct perf_event *event)
1 {
2 if (event->attr.precise_ip) {
<SNIP>
17 if (event->attr.precise_ip > precise)
18 return -EOPNOTSUPP;
/* There's no sense in having PEBS for non sampling events: */
21 if (!is_sampling_event(event))
22 return -EINVAL;
}
<SNIP>
# perf probe x86_pmu_hw_config:22
Added new events:
probe:x86_pmu_hw_config (on x86_pmu_hw_config:22)
probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22)
You can now use it in all perf tools, such as:
perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1
# perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1
0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.015 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.000 ( 0.021 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.025 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.023 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.030 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.028 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
#
I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times.
So fix it by just setting attr.sample_period
Now, after this patch:
# perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1
[ perf record: Woken up 1 times to write data ]
0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_event_open_cloexec_flag (/home/acme/bin/perf)
0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evlist__config (/home/acme/bin/perf)
0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evlist__config (/home/acme/bin/perf)
0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1 ) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
#
The probe one called from perf_event_attr__set_max_precise_ip() works
the first time, with attr.precise_ip = 3, wit hthe next ones being the
per cpu ones for the cycles:ppp event.
And here is the text from a report and alternative proposed patch by
Thomas-Mich Richter:
---
On s390 the counter and sampling facility do not support a precise IP
skid level and sometimes returns EOPNOTSUPP when structure member
precise_ip in struct perf_event_attr is not set to zero.
On s390 commnd 'perf record -- true' fails with error EOPNOTSUPP. This
happens only when no events are specified on command line.
The functions called are
...
--> perf_evlist__add_default
--> perf_evsel__new_cycles
--> perf_event_attr__set_max_precise_ip
The last function determines the value of structure member precise_ip by
invoking the perf_event_open() system call and checking the return code.
The first successful open is the value for precise_ip.
However the value is determined without setting member sample_period and
indicates no sampling.
On s390 the counter facility and sampling facility are different. The
above procedure determines a precise_ip value of 3 using the counter
facility. Later it uses the sampling facility with a value of 3 and
fails with EOPNOTSUPP.
---
v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members
of unnamed union members in the container struct initialization, so
move from:
struct perf_event_attr attr = {
...
.sample_period = 1,
};
to right after it as:
struct perf_event_attr attr = {
...
};
attr.sample_period = 1;
v3: We need to reset .sample_period to 0 to let the users of
perf_evsel__new_cycles() to properly setup attr.sample_period or
attr.sample_freq. Reported by Ingo Molnar.
Reported-and-Acked-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 18e7a45af9 ("perf/x86: Reject non sampling events with precise_ip")
Link: http://lkml.kernel.org/n/tip-yv6nnkl7tzqocrm0hl3x7vf1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Laura reported a sleep-in-atomic kernel warning inside
tcf_act_police_init() which calls gen_replace_estimator() with
spinlock protection.
It is not necessary in this case, we already have RTNL lock here
so it is enough to protect concurrent writers. For the reader,
i.e. tcf_act_police(), it needs to make decision based on this
rate estimator, in the worst case we drop more/less packets than
necessary while changing the rate in parallel, it is still acceptable.
Reported-by: Laura Abbott <labbott@redhat.com>
Reported-by: Nick Huber <nicholashuber@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current __ceph_setattr() can set inode's i_ctime to current_time(),
req->r_stamp or attr->ia_ctime. These time stamps may have minor
differences. It may cause potential problem.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
ceph uses ktime_get_real_ts() to get request time stamp. In most
other cases, current_kernel_time() is used to get time stamp for
filesystem operations (called by current_time()).
There is granularity difference between ktime_get_real_ts() and
current_kernel_time(). The later one can be up to one jiffy behind
the former one. This can causes inode's ctime to go back.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Converting a file handle to a dentry can be done call after the inode
unlink. This means that __fh_to_dentry() requires an extra check to
verify the number of links is not 0.
The issue can be easily reproduced using xfstest generic/426, which does
something like:
name_to_handle_at(&fh)
echo 3 > /proc/sys/vm/drop_caches
unlink()
open_by_handle_at(&fh)
The call to open_by_handle_at() should fail, as the file doesn't exist
anymore.
Link: http://tracker.ceph.com/issues/19958
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
init_send_wqe
copy_from_user --> may sleep
There is no flow that makes "qp->is_user" true, and copy_from_user may
cause bug when a non-user pointer is used. So the lines of copy_from_user
and check of "qp->is_user" are removed.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add 64KB PAGE_SIZE support to user-space CQ, SQ and RQ queues.
De-facto it means that code was added to translate 64KB
pages to smaller 4KB pages that the FW can handle. Otherwise,
the FW would wrap (or jump to the next page) when reaching 4KB
while the user space library will continue on the same large page.
Note that MR code remains as is since the FW supports larger pages
for MRs.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Initialize byte_len in work completion of RDMA_READ and RDMA_SEND.
Exposed by uDAPL application.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Some issues observed with FMR implementation
while running stress traffic. So removing the
FMR verbs support for now.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch adds code to ring RQ Doorbell aggressively
so that the adapter can DMA RQ buffers sooner, instead
of DMA all WQEs in the post_recv WR list together at the
end of the post_recv verb.
Also use spinlock to serialize RQ posting
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
HW stalls out after 0x800000 WQEs are posted for UD QPs.
To workaround this problem, driver will send a modify_qp cmd
to the HW at around the halfway mark(0x400000) so that FW
can accordingly modify the QP context in the HW to prevent this
stall.
This workaround needs to be done for UD, QP1 and Raw Ethertype
packets. Added a counter to keep track of WQEs posted during post_send.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
If the host buffers are freed before destroying MR in HW,
HW could try accessing these buffers. This could cause a host
crash. Fixing the code to avoid this condition.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch implements the following HW workarounds
1. The SQ depth needs to be augmented by 128 + 1 to avoid running
into an Out of order CQE issue
2. Workaround to handle the problem where the HW fast path engine continues
to access DMA memory in retranmission mode even after the WQE has
already been completed. If the HW reports this condition, driver detects
it and posts a Fence WQE. The driver stops reporting the completions
to stack until it receives completion for Fence WQE.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The standard PCM chmap helper callbacks treat the NULL info->chmap as
a fatal error and spews the kernel warning with stack trace when
CONFIG_SND_DEBUG is on. This was OK, originally it was supposed to be
always static and non-NULL. But, as the recent addition of Intel LPE
audio driver shows, the chmap content may vary dynamically, and it can
be even NULL when disconnected. The user still sees the kernel
warning unnecessarily.
For clearing such a confusion, this patch simply removes the
snd_BUG_ON() in each place, just returns an error without warning.
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This reverts the two commits
7afbeb6df2 ("s390/ipl: always use load normal for CCW-type re-IPL")
0f7451ff3a ("s390/ipl: use load normal for LPAR re-ipl")
The two commits did not take into account that behavior of standby
memory changes fundamentally if the re-IPL method is changed from
Load Clear to Load Normal.
In case of the old re-IPL clear method all memory that was initially
in standby state will be put into standby state again within the
re-IPL process. Or in other words: memory that was brought online
before a re-IPL will be offline again after a reboot.
Given that we use different re-IPL methods depending on the hypervisor
and CCW-type vs SCSI re-IPL it is not easy to tell in advance when and
why memory will stay online or will be offline after a re-IPL.
This does also have other side effects, since memory that is online
from the beginning will be in ZONE_NORMAL by default vs ZONE_MOVABLE
for memory that is offline.
Therefore, before the change, a user could online and offline memory
easily since standby memory was always in ZONE_NORMAL. After the
change, and a re-IPL, this depended on which memory parts were online
before the re-IPL.
From a usability point of view the current behavior is more than
suboptimal. Therefore revert these changes until we have a better
solution and get back to a consistent behavior. The bad thing about
this is that the time required for a re-IPL will be significantly
increased for configurations with several 100GB or 1TB of memory.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Unfortunately, struct iwreq isn't a proper subset of struct ifreq,
but is still handled by the same code path. Robert reported that
then applications may (randomly) fault if the struct iwreq they
pass happens to land within 8 bytes of the end of a mapping (the
struct is only 32 bytes, vs. struct ifreq's 40 bytes).
To fix this, pull out the code handling wireless extension ioctls
and copy only the smaller structure in this case.
This bug goes back a long time, I tracked that it was introduced
into mainline in 2.1.15, over 20 years ago!
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=195869
Reported-by: Robert O'Callahan <robert@ocallahan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
To make it clear that we never use struct ifreq, cast from it
directly in the wext entrypoint and use struct iwreq from there
on. The next patch will remove the cast again and pass the
correct struct from the beginning.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The caller only cares about zero vs non-zero so this code actually works
fine but we should be returning a negative error code instead of a valid
pointer casted to int.
Fixes: 554c0a3abf ("staging: Add rtl8723bs sdio wifi driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The default error code in pfkey_msg2xfrm_state() is -ENOBUFS. We
added a new call to security_xfrm_state_alloc() which sets "err" to zero
so there several places where we can return ERR_PTR(0) if kmalloc()
fails. The caller is expecting error pointers so it leads to a NULL
dereference.
Fixes: df71837d50 ("[LSM-IPSec]: Security association restriction.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
There are some missing error codes here so we accidentally return NULL
instead of an error pointer. It results in a NULL pointer dereference.
Fixes: df71837d50 ("[LSM-IPSec]: Security association restriction.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Drop log level for blanking from info to debug. Xorg likes to habitually
unblank when already unblanked and this can fill up logs over a long period
of time.
Signed-off-by: Mike Gerow <gerow@google.com>
Cc: bernie@plugable.com
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
When CONFIG_PROC_FS is disabled, we get warnings about unused variables
as remove_proc_entry() evaluates to an empty macro.
drivers/video/fbdev/via/viafbdev.c: In function 'viafb_remove_proc':
drivers/video/fbdev/via/viafbdev.c:1635:4: error: unused variable 'iga2_entry' [-Werror=unused-variable]
drivers/video/fbdev/via/viafbdev.c:1634:4: error: unused variable 'iga1_entry' [-Werror=unused-variable]
These are easy to avoid by using the pointer from the structure.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
gcc-7 suspects this code might be wrong because we use the
result of a multiplication as a bool:
drivers/video/fbdev/core/fbmon.c: In function 'fb_edid_add_monspecs':
drivers/video/fbdev/core/fbmon.c:1051:84: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context]
It's actually fine, so let's add a comparison to zero to make
that clear to the compiler too.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Jonathan writes:
Second set of IIO fixes for the 4.12 cycle.
* buffer-dma / buffer-dmaengine
- Fix missing include of buffer_impl.h after the split of buffer.h.
No driver in mainline is currently using these buffers so it wasn't
picked up by automated build tests.
* ad7152
- Fix a deadlock in ad7152_write_raw_samp_freq as the chip_state lock
was already held.
* inv_mpu6050
- Add low pass filter setting for chips newer than the MPU6500. None of
use previously picked up no the fact it was different on these newer
chips. It is separately set for the acceleration on these parts. There
is no normal reason to set it differently so the userspace interface
remains the same as for early parts.
* meson-saradc:
- Fix a potential crash by NULL pointer dereference in
meson_sar_adc_clear_fifo.
* mxs-lradc
- Fix a return value check where IS_ERR is used on a function that returns
NULL on error
There are no longer any drivers (in the tree proper, I didn't
check all the staging drivers) that take WEXT ioctls through
this API, the only remaining ones that even have ndo_do_ioctl
are using it only for private ioctls.
Therefore, we can remove this call.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Commit 4c3b89effc ("powerpc/powernv: Add sanity checks to
pnv_pci_get_{gpu|npu}_dev") introduced explicit warnings in
pnv_pci_get_npu_dev() when a PCIe device has no associated device-tree
node. However not all PCIe devices have an of_node and
pnv_pci_get_npu_dev() gets indirectly called at least once for every
PCIe device in the system. This results in spurious WARN_ON()'s so
remove it.
The same situation should not exist for pnv_pci_get_gpu_dev() as any
NPU based PCIe device requires a device-tree node.
Fixes: 4c3b89effc ("powerpc/powernv: Add sanity checks to pnv_pci_get_{gpu|npu}_dev")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rather than constructing a local structure instance on the stack, fill
the fields directly on the shared ring, just like other backends do.
Build on the fact that all response structure flavors are actually
identical (the old code did make this assumption too).
This is XSA-216.
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
There is no need to use xen_blkif_get()/xen_blkif_put() in the kthread
of xen-blkback. Thread stopping is synchronous and using the blkif
reference counting in the kthread will avoid to ever let the reference
count drop to zero at the end of an I/O running concurrent to
disconnecting and multiple rings.
Setting ring->xenblkd to NULL after stopping the kthread isn't needed
as the kthread does this already.
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in the connect()
handler of the AF_CAIF socket. Since the syscall doesn't enforce a minimum
size of the corresponding memory region, very short sockaddrs (zero or one
byte long) result in operating on uninitialized memory while referencing
sa_family.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The be structure must not be freed when freeing the blkif structure
isn't done. Otherwise a use-after-free of be when unmapping the ring
used for communicating with the frontend will occur in case of a
late call of xenblk_disconnect() (e.g. due to an I/O still active
when trying to disconnect).
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fixing a concurrency issue with creq handling. Each caller
was given a globally managed crsq element, which was
accessed outside a lock. This could result in corruption,
if lot of applications are simultaneously issuing Control Path
commands. Now, each caller will provide its own response buffer
and the responses will be copied under a lock.
Also, Fixing the queue full condition check for the CMDQ.
As a part of these changes, the control path code is refactored
to remove the code replication in the response status checking.
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Today disconnecting xen-blkback is broken in case there are still
I/Os in flight: xen_blkif_disconnect() will bail out early without
releasing all resources in the hope it will be called again when
the last request has terminated. This, however, won't happen as
xen_blkif_free() won't be called on termination of the last running
request: xen_blkif_put() won't decrement the blkif refcnt to 0 as
xen_blkif_disconnect() didn't finish before thus some xen_blkif_put()
calls in xen_blkif_disconnect() didn't happen.
To solve this deadlock xen_blkif_disconnect() and
xen_blkif_alloc_rings() shouldn't use xen_blkif_put() and
xen_blkif_get() but use some other way to do their accounting of
resources.
This at once fixes another error in xen_blkif_disconnect(): when it
returned early with -EBUSY for another ring than 0 it would call
xen_blkif_put() again for already handled rings on a subsequent call.
This will lead to inconsistencies in the refcnt handling.
Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Add buffer_impl.h as buffer.h was split into interface for using and
for internals. Without this industrialio-buffer-dmaengine.c fails
to compile.
Fixes:
commit 33dd94cb97 ("iio:buffer.h - split
into buffer.h and buffer_impl.h")
Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Add buffer_impl.h as buffer.h was split into interface for using and
for internals. Without this industrialio-buffer-dma.c fails
to compile.
Fixes:
commit 33dd94cb97 ("iio:buffer.h - split
into buffer.h and buffer_impl.h")
Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
This reverts commit 5ab92a7cb8.
System cannot enter suspend mode because of heartbeat led trigger.
In autosleep_wq, try_to_suspend function will try to enter suspend
mode in specific period. it will get wakeup_count then call pm_notifier
chain callback function and freeze processes.
Heartbeat_pm_notifier is called and it call led_trigger_unregister to
change the trigger of led device to none. It will send uevent message
and the wakeup source count changed. As wakeup_count changed, suspend
will abort.
Fixes: 5ab92a7cb8 ("leds: handle suspend/resume in heartbeat trigger")
Signed-off-by: Zhang Bo <bo.zhang@nxp.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Each nibble represents 4 LEDs, and in case of the higher register, bit 0
represents LED 4, so we need to use modulus for the LED number as well.
Fixes: fd7b025a23 ("leds: add BCM6328 LED driver")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Acked-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Simon Wunderlich says:
====================
Here are two batman-adv bugfixes:
- fix rx packet counters for local ARP replies, by Sven Eckelmann
- fix memory leaks for unicast packetes received from another gateway
in bridge loop avoidance, by Andreas Pape
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Some fixes:
* Avi fixes some fallout from my mac80211 RX flags changes
* Emmanuel fixes an issue with adhering to the spec, and
an oversight in the SMPS management code
* Jason's patch makes mac80211 use constant-time memory
comparisons for message authentication, to avoid having
potentially observable timing differences
* my fix makes mac80211 set the basic rates bitmap before
the channel so the next update to the driver has more
consistent data - this required another rework patch to
remove some useless 5/10 MHz code that can never be hit
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when dumping a context data only word number '1' is read for the
entire context.
Fixes: c965db4446 ("qed: Add support for debug data collection")
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A new Sierra Wireless EM7305 device ID used in a Toshiba laptop,
and two Longcheer device IDs entries used by Telewell TW-3G HSPA+
branded modems.
Reported-by: Petr Kloc <petr_kloc@yahoo.com>
Reported-by: Teemu Likonen <tlikonen@iki.fi>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 90eff9096c ("net: phy: Allow splitting MDIO
bus/device support from PHYs") we could create a configuration where
MDIO_DEVICE=y and PHYLIB=m which leads to the following undefined
references:
drivers/built-in.o: In function `thunder_mdiobus_pci_remove':
>> mdio-thunder.c:(.text+0x2a212f): undefined reference to
>> `mdiobus_unregister'
>> mdio-thunder.c:(.text+0x2a2138): undefined reference to
>> `mdiobus_free'
drivers/built-in.o: In function `thunder_mdiobus_pci_probe':
mdio-thunder.c:(.text+0x2a22e7): undefined reference to
`devm_mdiobus_alloc_size'
mdio-thunder.c:(.text+0x2a236f): undefined reference to
`of_mdiobus_register'
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 90eff9096c ("net: phy: Allow splitting MDIO bus/device support from PHYs")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's already added by pr_fmt so remove the explicit use.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrey reported a use-after-free in add_grec():
for (psf = *psf_list; psf; psf = psf_next) {
...
psf_next = psf->sf_next;
where the struct ip_sf_list's were already freed by:
kfree+0xe8/0x2b0 mm/slub.c:3882
ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
sock_release+0x8d/0x1e0 net/socket.c:597
sock_close+0x16/0x20 net/socket.c:1072
This happens because we don't hold pmc->lock in ip_mc_clear_src()
and a parallel mr_ifc_timer timer could jump in and access them.
The RCU lock is there but it is merely for pmc itself, this
spinlock could actually ensure we don't access them in parallel.
Thanks to Eric and Long for discussion on this bug.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Getting the device version out of the driver really aids debugging.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes uninitialized symbol warning that
got introduced by the following commit
773fc8f6e8 ("net: rps: send out pending IPI's on CPU hotplug")
Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are many situations where generic HID driver provides some basic level
of support for certain device, but later this support (usually by implementing
vendor-specific extensions of HID protocol) is extended and the support moved
over to a separate (usually per-vendor) specific driver.
This might bring a rather unpleasant suprise for users, as all of a sudden
there is a new config option they have to enable in order to get any support
for their device whatsoever, although previous kernel versions provided basic
support through the generic driver. Which is rightfully seen as a regression.
Fix this by including the entry for a particular device in
hid_have_special_driver[] iff the specific config option has been specified,
and let generic driver handle the device otherwise.
Also make the behavior of hid_scan_report() (where the same decision is being
taken on a per-report level) consistent.
While at it, reshuffle the hid_have_special_driver[] a bit to restore the
alphabetical ordering (first order by config option, and within those
sections order by VID).
This is considered a short-term solution, before generic way of giving
precedence to special drivers and falling back to generic driver is
figured out.
While at it, fixup a missing entry for GFRM driver; thanks to Hans de Geode for
spotting this (and for discovering a few issues in the conversion).
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
mac80211 allows to modify the SMPS state of an AP both,
when it is started, and after it has been started. Such a
change will trigger an action frame to all the peers that
are currently connected, and will be remembered so that
new peers will get notified as soon as they connect (since
the SMPS setting in the beacon may not be the right one).
This means that we need to remember the SMPS state
currently requested as well as the SMPS state that was
configured initially (and advertised in the beacon).
The former is bss->req_smps and the latter is
sdata->smps_mode.
Initially, the AP interface could only be started with
SMPS_OFF, which means that sdata->smps_mode was SMPS_OFF
always. Later, a nl80211 API was added to be able to start
an AP with a different AP mode. That code forgot to update
bss->req_smps and because of that, if the AP interface was
started with SMPS_DYNAMIC, we had:
sdata->smps_mode = SMPS_DYNAMIC
bss->req_smps = SMPS_OFF
That configuration made mac80211 think it needs to fire off
an action frame to any new station connecting to the AP in
order to let it know that the actual SMPS configuration is
SMPS_OFF.
Fix that by properly setting bss->req_smps in
ieee80211_start_ap.
Fixes: f699317487 ("mac80211: set smps_mode according to ap params")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When mac80211 changes the channel, it also calls into the driver's
bss_info_changed() callback, e.g. with BSS_CHANGED_IDLE. The driver
may, like iwlwifi does, access more data from bss_info in that case
and iwlwifi accesses the basic_rates bitmap, but if changing from a
band with more (basic) rates to one with fewer, an out-of-bounds
access of the rate array may result.
While we can't avoid having invalid data at some point in time, we
can avoid having it while we call the driver - so set up all the
data before configuring the channel, and then apply it afterwards.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=195677
Reported-by: Johannes Hirte <johannes.hirte@datenkhaos.de>
Tested-by: Johannes Hirte <johannes.hirte@datenkhaos.de>
Debugged-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There's no need for the station MLME code to handle bitrates for 5
or 10 MHz channels when it can't ever create such a configuration.
Remove the unnecessary code.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If the driver reports the rx timestamp at PLCP start, mac80211 can
only handle legacy encoding, but the code checks that the encoding
is not legacy. Fix this.
Fixes: da6a4352e7 ("mac80211: separate encoding/bandwidth from flags")
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch is based on a discussion generated by an earlier patch
from Tetsuo Handa:
* https://marc.info/?t=149035659300001&r=1&w=2
The double free problem involves the mnt_opts field of the
security_mnt_opts struct, selinux_parse_opts_str() frees the memory
on error, but doesn't set the field to NULL so if the caller later
attempts to call security_free_mnt_opts() we trigger the problem.
In order to play it safe we change selinux_parse_opts_str() to call
security_free_mnt_opts() on error instead of free'ing the memory
directly. This should ensure that everything is handled correctly,
regardless of what the caller may do.
Fixes: e000752989 ("LSM/SELinux: Interfaces to allow FS to control mount options")
Cc: stable@vger.kernel.org
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Pull Xtensa fixes from Max Filippov:
- don't use linux IRQ #0 in legacy irq domains: fixes timer interrupt
assignment when it's hardware IRQ # is 0 and the kernel is built w/o
device tree support
- reduce reservation size for double exception vector literals from 48
to 20 bytes: fixes build on cores with small user exception vector
- cleanups: use kmalloc_array instead of kmalloc in simdisk_init and
seq_puts instead of seq_printf in c_show.
* tag 'xtensa-20170612' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: don't use linux IRQ #0
xtensa: reduce double exception literal reservation
xtensa: ISS: Use kmalloc_array() in simdisk_init()
xtensa: Use seq_puts() in c_show()
Pull s390 fixes from Martin Schwidefsky:
- A fix for KVM to avoid kernel oopses in case of host protection
faults due to runtime instrumentation
- A fix for the AP bus to avoid dead devices after unbind / bind
- A fix for a compile warning merged from the vfio_ccw tree
- Updated default configurations
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: update defconfig
s390/zcrypt: Fix blocking queue device after unbind/bind.
s390/vfio_ccw: make some symbols static
s390/kvm: do not rely on the ILC on kvm host protection fauls
This adds code to save the values of three SPRs (special-purpose
registers) used by userspace to control event-based branches (EBBs),
which are essentially interrupts that get delivered directly to
userspace. These registers are loaded up with guest values when
entering the guest, and their values are saved when exiting the
guest, but we were not saving the host values and restoring them
before going back to userspace.
On POWER8 this would only affect userspace programs which explicitly
request the use of EBBs and also use the KVM_RUN ioctl, since the
only source of EBBs on POWER8 is the PMU, and there is an explicit
enable bit in the PMU registers (and those PMU registers do get
properly context-switched between host and guest). On POWER9 there
is provision for externally-generated EBBs, and these are not subject
to the control in the PMU registers.
Since these registers only affect userspace, we can save them when
we first come in from userspace and restore them before returning to
userspace, rather than saving/restoring the host values on every
guest entry/exit. Similarly, we don't need to worry about their
values on offline secondary threads since they execute in the context
of the idle task, which never executes in userspace.
Fixes: b005255e12 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
A recent commit to refactor the driver and remove the hw_disabled_flags
field accidentally introduced two regressions. First, we overwrote
pf->flags which removed various key flags including the MSI-X settings.
Additionally, it was intended that we have now two flags,
HW_ATR_EVICT_CAPABLE and HW_ATR_EVICT_ENABLED, but this was not done,
and we accidentally were mis-using HW_ATR_EVICT_CAPABLE everywhere.
This patch adds the missing piece, HW_ATR_EVICT_ENABLED, and safely
updates pf->flags instead of overwriting it.
Without this patch we will have many problems including disabling MSI-X
support, and we'll attempt to use HW ATR eviction on devices which do
not support it.
Fixes: 47994c119a ("i40e: remove hw_disabled_flags in favor of using separate flag bits", 2017-04-19)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dm-integrity would successfully create mappings with the number of
sectors greater than the provided data sector count. Attempts to read
sectors of this mapping that were beyond the provided data sector count
would then yield run-time messages of the form "device-mapper:
integrity: Too big sector number: ...".
Fix this by emitting an error when the requested mapping size is bigger
than the provided data sector count.
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The PCI endpoint test driver uses crc32_le() so it should select
CRC32. Fixes this build error (when CRC32=m):
drivers/built-in.o: In function `pci_epf_test_cmd_handler':
pci-epf-test.c:(.text+0x2d98d): undefined reference to `crc32_le'
Fixes: 349e7a85b2 ("PCI: endpoint: functions: Add an EP function to test PCI")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
When HSR interface is setup using ip link command, an annoying warning
appears with the trace as below:-
[ 203.019828] hsr_get_node: Non-HSR frame
[ 203.019833] Modules linked in:
[ 203.019848] CPU: 0 PID: 158 Comm: sd-resolve Tainted: G W 4.12.0-rc3-00052-g9fa6bf70 #2
[ 203.019853] Hardware name: Generic DRA74X (Flattened Device Tree)
[ 203.019869] [<c0110280>] (unwind_backtrace) from [<c010c2f4>] (show_stack+0x10/0x14)
[ 203.019880] [<c010c2f4>] (show_stack) from [<c04b9f64>] (dump_stack+0xac/0xe0)
[ 203.019894] [<c04b9f64>] (dump_stack) from [<c01374e8>] (__warn+0xd8/0x104)
[ 203.019907] [<c01374e8>] (__warn) from [<c0137548>] (warn_slowpath_fmt+0x34/0x44)
root@am57xx-evm:~# [ 203.019921] [<c0137548>] (warn_slowpath_fmt) from [<c081126c>] (hsr_get_node+0x148/0x170)
[ 203.019932] [<c081126c>] (hsr_get_node) from [<c0814240>] (hsr_forward_skb+0x110/0x7c0)
[ 203.019942] [<c0814240>] (hsr_forward_skb) from [<c0811d64>] (hsr_dev_xmit+0x2c/0x34)
[ 203.019954] [<c0811d64>] (hsr_dev_xmit) from [<c06c0828>] (dev_hard_start_xmit+0xc4/0x3bc)
[ 203.019963] [<c06c0828>] (dev_hard_start_xmit) from [<c06c13d8>] (__dev_queue_xmit+0x7c4/0x98c)
[ 203.019974] [<c06c13d8>] (__dev_queue_xmit) from [<c0782f54>] (ip6_finish_output2+0x330/0xc1c)
[ 203.019983] [<c0782f54>] (ip6_finish_output2) from [<c0788f0c>] (ip6_output+0x58/0x454)
[ 203.019994] [<c0788f0c>] (ip6_output) from [<c07b16cc>] (mld_sendpack+0x420/0x744)
As this is an expected path to hsr_get_node() with frame coming from
the master interface, add a check to ensure packet is not from the
master port and then warn.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cache support is optional feature in M-class cores, thus DminLine or
IminLine of Cache Type Register is zero if caches are not implemented,
but we check the whole CTR which has other features encoded there.
Let's be more precise and check for DminLine and IminLine of CTR
before we set cacheid.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
When both enable CONFIG_ARM_LPAE=y and CONFIG_VMSPLIT_3G_OPT=y, which
means use PAGE_OFFSET=0xB0000000 with ARM_LPAE, the kernel will boot
fail and stop after uncompressed:
Starting kernel ...
Uart base = 0x20001000
watchdog reg = 0x20013000
dtb addr = 0x80840308
Uncompressing Linux... done, booting the kernel.
For ARM_LPAE only support 3:1, 2:2, 1:3 split of TTBR1, which mention in:
http://elinux.org/images/6/6a/Elce11_marinas.pdf - p16
So we should make VMSPLIT_3G_OPT depends on !ARM_LPAE to avoid trigger
this bug.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Commit 06a4b6d009 ("ARM: 8677/1: boot/compressed: fix decompressor
header layout for v7-M") fixed an issue in the layout of the header
of the compressed kernel image that was caused by the assembler
emitting narrow opcodes for 'mov r0, r0', and for this reason, the
mnemonic was updated to use the W() macro, which will append the .w
suffix (which forces a wide encoding) if required, i.e., when building
the kernel in Thumb2 mode.
However, this failed to take into account that on Thumb2 kernels built
for CPUs that are also ARM capable, the entry point is entered in ARM
mode, and so the instructions emitted here will be ARM instructions
that only exist in a wide encoding to begin with, which is why the
assembler rejects the .w suffix here and aborts the build with the
following message:
head.S: Assembler messages:
head.S:132: Error: width suffixes are invalid in ARM mode -- `mov.w r0,r0'
So replace the W(mov) with separate ARM and Thumb2 instructions, where
the latter will only be used for THUMB2_ONLY builds.
Fixes: 06a4b6d009 ("ARM: 8677/1: boot/compressed: fix decompressor ...")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
When plugging an USB webcam I see the following message:
[106385.615559] xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
[106390.583860] handle_tx_event: 913 callbacks suppressed
With this patch applied, I get no more printing of this message.
Cc: <stable@vger.kernel.org>
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xHCI host controllers can have both USB 3.1 and 3.0 extended speed
protocol lists. If the USB3.1 speed is parsed first and 3.0 second then
the minor revision supported will be overwritten by the 3.0 speeds and
the USB3 roothub will only show support for USB 3.0 speeds.
This was the case with a xhci controller with the supported protocol
capability listed below.
In xhci-mem.c, the USB 3.1 speed is parsed first, the min_rev of usb3_rhub
is set as 0x10. And then USB 3.0 is parsed. However, the min_rev of
usb3_rhub will be changed to 0x00. If USB 3.1 device is connected behind
this host controller, the speed of USB 3.1 device just reports 5G speed
using lsusb.
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00 01 08 00 00 00 00 00 40 00 00 00 00 00 00 00 00
10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20 02 08 10 03 55 53 42 20 01 02 00 00 00 00 00 00 //USB 3.1
30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
40 02 08 00 03 55 53 42 20 03 06 00 00 00 00 00 00 //USB 3.0
50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60 02 08 00 02 55 53 42 20 09 0E 19 00 00 00 00 00 //USB 2.0
70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
This patch fixes the issue by only owerwriting the minor revision if
it is higher than the existing one.
[reword commit message -Mathias]
Cc: <stable@vger.kernel.org>
Signed-off-by: YD Tseng <yd_tseng@asmedia.com.tw>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Felipe writes:
usb: fixes for v4.12-rc5
Alan Stern fixed a GPF in gadgetfs found by the kernel fuzzying project
composite.c learned that if it deactivates a function during bind, it
must reactivate it during unbind.
Reading /proc/net/snmp6 yields bogus values on 32 bit kernels.
Use "u64" instead of "unsigned long" in sizeof().
Fixes: 4a4857b1c8 ("proc: Reduce cache miss in snmp6_seq_show")
Signed-off-by: Christian Perle <christian.perle@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix boot warning 'Trying to vfree() nonexistent vm area'
from arch_timer_mem_of_init().
Refactored code attempts to read and iounmap using address frame
instead of address ioremap(frame->cntbase).
Fixes: c389d701df ("clocksource: arm_arch_timer: split MMIO timer probing.")
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Reviewed-by: Fu Wei <fu.wei@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Pull devfreq fixes from MyungJoo Ham.
* 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq:
PM / devfreq: exynos-ppmu: Staticize event list
PM / devfreq: exynos-ppmu: Handle return value of clk_prepare_enable
PM / devfreq: exynos-nocp: Handle return value of clk_prepare_enable
'of_node_put()' should be called on pointer returned by
'of_parse_phandle()' when done. In this function this is done in all path
except this 'continue', so add it.
Fixes: 97735da074 (drivers: cpuidle: Add status property to ARM idle states)
Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 27ed3cd2eb (cpufreq: conservative: Fix the logic in frequency
decrease checking) removed the 10 point substraction when comparing the
load against down_threshold but did not remove the related limit for the
down_threshold value. As a result, down_threshold lower than 11 is not
allowed even though values from 1 to 10 do work correctly too. The
comment ("cannot be lower than 11 otherwise freq will not fall") also
does not apply after removing the substraction.
For this reason, allow down_threshold to take any value from 1 to 99
and fix the related comment.
Fixes: 27ed3cd2eb (cpufreq: conservative: Fix the logic in frequency decrease checking)
Signed-off-by: Tomasz Wilczyński <twilczynski@naver.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Revert commit 39b64aa1c0 (cpufreq: schedutil: Reduce frequencies
slower) that introduced unintentional changes in behavior leading
to adverse effects on some systems.
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Considering this case:
1. A program opens a sysfs table file 65535 times, it can increase
validation_count and first increment cause the table to be mapped:
validation_count = 65535
2. AML execution causes "Load" to be executed on the same
table, this time it cannot increase validation_count, so
validation_count remains:
validation_count = 65535
3. The program closes sysfs table file 65535 times, it can decrease
validation_count and the last decrement cause the table to be
unmapped:
validation_count = 0
4. AML code still accessing the loaded table, kernel crash can be
observed.
To prevent that from happening, add a validation_count threashold.
When it is reached, the validation_count can no longer be
incremented/decremented to invalidate the table descriptor (means
preventing table unmappings)
Note that code added in acpi_tb_put_table() is actually a no-op but
changes the warning message into a "warn once" one. Lv Zheng.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog, comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Now we will force to do garbage collection if any policy removed in
xfrm_policy_flush(). But during xfrm_net_exit(). We call flow_cache_fini()
first and set set fc->percpu to NULL. Then after we call xfrm_policy_fini()
-> frxm_policy_flush() -> flow_cache_flush(), we will get NULL pointer
dereference when check percpu_empty. The code path looks like:
flow_cache_fini()
- fc->percpu = NULL
xfrm_policy_fini()
- xfrm_policy_flush()
- xfrm_garbage_collect()
- flow_cache_flush()
- flow_cache_percpu_empty()
- fcp = per_cpu_ptr(fc->percpu, cpu)
To reproduce, just add ipsec in netns and then remove the netns.
v2:
As Xin Long suggested, since only two other places need to call it. move
xfrm_garbage_collect() outside xfrm_policy_flush().
v3:
Fix subject mismatch after v2 fix.
Fixes: 35db069121 ("xfrm: do the garbage collection after flushing policy")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
There have been reports about SDIO failing with certain WiFi chips in
descriptor chain mode. SD / eMMC are working fine.
So let's fall back to bounce buffer mode for command SD_IO_RW_EXTENDED.
This was reported to fix the error.
Fixes: 79ed05e329 "mmc: meson-gx: add support for descriptor chain mode"
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The ppmu_events array is accessed only in this compilation unit so it
can be made static.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Pull key subsystem fixes from James Morris:
"Here are a bunch of fixes for Linux keyrings, including:
- Fix up the refcount handling now that key structs use the
refcount_t type and the refcount_t ops don't allow a 0->1
transition.
- Fix a potential NULL deref after error in x509_cert_parse().
- Don't put data for the crypto algorithms to use on the stack.
- Fix the handling of a null payload being passed to add_key().
- Fix incorrect cleanup an uninitialised key_preparsed_payload in
key_update().
- Explicit sanitisation of potentially secure data before freeing.
- Fixes for the Diffie-Helman code"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits)
KEYS: fix refcount_inc() on zero
KEYS: Convert KEYCTL_DH_COMPUTE to use the crypto KPP API
crypto : asymmetric_keys : verify_pefile:zero memory content before freeing
KEYS: DH: add __user annotations to keyctl_kdf_params
KEYS: DH: ensure the KDF counter is properly aligned
KEYS: DH: don't feed uninitialized "otherinfo" into KDF
KEYS: DH: forbid using digest_null as the KDF hash
KEYS: sanitize key structs before freeing
KEYS: trusted: sanitize all key material
KEYS: encrypted: sanitize all key material
KEYS: user_defined: sanitize key payloads
KEYS: sanitize add_key() and keyctl() key payloads
KEYS: fix freeing uninitialized memory in key_update()
KEYS: fix dereferencing NULL payload with nonzero length
KEYS: encrypted: use constant-time HMAC comparison
KEYS: encrypted: fix race causing incorrect HMAC calculations
KEYS: encrypted: fix buffer overread in valid_master_desc()
KEYS: encrypted: avoid encrypting/decrypting stack buffers
KEYS: put keyring if install_session_keyring_to_cred() fails
KEYS: Delete an error message for a failed memory allocation in get_derived_key()
...
Commit abb2ea7dfd ("compiler, clang: suppress warning for unused
static inline functions") just caused more warnings due to re-defining
the 'inline' macro.
So undef it before re-defining it, and also add the 'notrace' attribute
like the gcc version that this is overriding does.
Maybe this makes clang happier.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes two issues:
1) When forwarding on *,G mroutes that are in a vrf, the
kernel was dropping information about the actual incoming
interface when calling ip_mr_forward from ip_mr_input.
This caused ip_mr_forward to send the multicast packet
back out the incoming interface. Fix this by
modifying ip_mr_forward to be handed the correctly
resolved dev.
2) When a unresolved cache entry is created we store
the incoming skb on the unresolved cache entry and
upon mroute resolution from the user space daemon,
we attempt to forward the packet. Again we were
not resolving to the correct incoming device for
a vrf scenario, before calling ip_mr_forward.
Fix this by resolving to the correct interface
and calling ip_mr_forward with the result.
Fixes: e58e415968 ("net: Enable support for VRF with ipv4 multicast")
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox mlx5 fixes 2017-06-11
This series contains some fixes for the mlx5 core and netdev driver.
Please pull and let me know if there's any problem.
For -stable:
("net/mlx5e: Added BW check for DIM decision mechanism") kernels >= 4.9
("net/mlx5e: Fix wrong indications in DIM due to counter wraparound") kernels >= 4.9
("net/mlx5: Remove several module events out of ethtool stats") kernels >= 4.10
("net/mlx5: Enable 4K UAR only when page size is bigger than 4K") kernels >= 4.11
*all patches apply with no issue on their -stable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal says:
====================
Bugs fixes in ena ethernet driver
This patchset contains fixes for the bugs that were discovered so far.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
check_for_missing_tx_completions() is called from a timer
task and looking for lost tx packets.
The old implementation accumulate all the lost tx packets
and did not check if those packets were retrieved on a later stage.
This cause to a situation where the driver reset
the device for no reason.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the rare case where the device runs out of free rx buffer
descriptors (in case of pressure on kernel memory),
and the napi handler continuously fail to refill new Rx descriptors
until device rx queue totally runs out of all free rx buffers
to post incoming packet, leading to a deadlock:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI that's not being invoked any more
The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) by the keepalive/watchdog task.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch also change the mapping functions to devm_ functions
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug:
"Completion context is occupied" error printout will be noticed in
dmesg.
This error will cause the admin command to fail, which will lead to
an ena_probe() failure or a watchdog reset (depends on which admin
command failed).
Root cause:
__ena_com_submit_admin_cmd() is the function that submits new entries to
the admin queue.
The function have a check that makes sure the queue is not full and the
function does not override any outstanding command.
It uses head and tail indexes for this check.
The head is increased by ena_com_handle_admin_completion() which runs
from interrupt context, and the tail index is increased by the submit
function (the function is running under ->q_lock, so there is no risk
of multithread increment).
Each command is associated with a completion context. This context
allocated before call to __ena_com_submit_admin_cmd() and freed by
ena_com_wait_and_process_admin_cq_interrupts(), right after the command
was completed.
This can lead to a state where the head was increased, the check passed,
but the completion context is still in use.
Solution:
Use the atomic variable ->outstanding_cmds instead of using the head and
the tail indexes.
This variable is safe for use since it is bumped in get_comp_ctx() in
__ena_com_submit_admin_cmd() and is freed by comp_ctxt_release()
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixing a bug that the driver does not unmask the IO interrupts
in ndo_open():
occasionally, the MSI-X interrupt (for one or more IO queues)
can be masked when ndo_close() was called.
If that is followed by ndo open(),
then the MSI-X will be still masked so no interrupt
will be received by the driver.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current flow to detect admin completion is:
while (command_not_completed) {
if (timeout)
error
check_for_completion()
sleep()
}
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.
The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull randomness fixes from Ted Ts'o:
"Improve performance by using a lockless update mechanism suggested by
Linus, and make sure we refresh per-CPU entropy returned get_random_*
as soon as the CRNG is initialized"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: invalidate batched entropy after crng init
random: use lockless method of accessing and updating f->reg_idx
Pull ext4 fixes from Ted Ts'o:
"Fix various bug fixes in ext4 caused by races and memory allocation
failures"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix fdatasync(2) after extent manipulation operations
ext4: fix data corruption for mmap writes
ext4: fix data corruption with EXT4_GET_BLOCKS_ZERO
ext4: fix quota charging for shared xattr blocks
ext4: remove redundant check for encrypted file on dio write path
ext4: remove unused d_name argument from ext4_search_dir() et al.
ext4: fix off-by-one error when writing back pages before dio read
ext4: fix off-by-one on max nr_pages in ext4_find_unwritten_pgoff()
ext4: keep existing extra fields when inode expands
ext4: handle the rest of ext4_mb_load_buddy() ENOMEM errors
ext4: fix off-by-in in loop termination in ext4_find_unwritten_pgoff()
ext4: fix SEEK_HOLE
jbd2: preserve original nofs flag during journal restart
ext4: clear lockdep subtype for quota files on quota off
Pull GPIO fixes from Linus Walleij:
"A few overdue GPIO patches for the v4.12 kernel.
- Fix debounce logic on the Aspeed platform.
- Fix the "virtual gpio" things on the Intel Crystal Cove.
- Fix the blink counter selection on the MVEBU platform"
* tag 'gpio-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: mvebu: fix gpio bank registration when pwm is used
gpio: mvebu: fix blink counter register selection
MAINTAINERS: remove self from GPIO maintainers
gpio: crystalcove: Do not write regular gpio registers for virtual GPIOs
gpio: aspeed: Don't attempt to debounce if disabled
Pull char/misc driver fixes from Greg KH:
"Here are some small driver fixes for 4.12-rc5. Nothing major here,
just some small bugfixes found by people testing, and a MAINTAINERS
file update for the genwqe driver.
All have been in linux-next with no reported issues"
[ The cxl driver fix came in through the powerpc tree earlier ]
* tag 'char-misc-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
cxl: Avoid double free_irq() for psl,slice interrupts
mei: make sysfs modalias format similar as uevent modalias
drivers: char: mem: Fix wraparound check to allow mappings up to the end
MAINTAINERS: Change maintainer of genwqe driver
goldfish_pipe: use GFP_ATOMIC under spin lock
firmware: vpd: do not leak kobjects
firmware: vpd: avoid potential use-after-free when destroying section
firmware: vpd: do not leave freed section attributes to the list
Pull staging/IIO fixes from Greg KH:
"These are mostly all IIO driver fixes, resolving a number of tiny
issues. There's also a ccree and lustre fix in here as well, both fix
problems found in those codebases.
All have been in linux-next with no reported issues"
* tag 'staging-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: ccree: fix buffer copy
staging/lustre/lov: remove set_fs() call from lov_getstripe()
staging: ccree: add CRYPTO dependency
iio: adc: sun4i-gpadc-iio: fix parent device being used in devm function
iio: light: ltr501 Fix interchanged als/ps register field
iio: adc: bcm_iproc_adc: swap primary and secondary isr handler's
iio: trigger: fix NULL pointer dereference in iio_trigger_write_current()
iio: adc: max9611: Fix attribute measure unit
iio: adc: ti_am335x_adc: allocating too much in probe
iio: adc: sun4i-gpadc-iio: Fix module autoload when OF devices are registered
iio: adc: sun4i-gpadc-iio: Fix module autoload when PLATFORM devices are registered
iio: proximity: as3935: fix iio_trigger_poll issue
iio: proximity: as3935: fix AS3935_INT mask
iio: adc: Max9611: checking for ERR_PTR instead of NULL in probe
iio: proximity: as3935: recalibrate RCO after resume
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 4.12-rc5
They are for some reported issues in the chipidea and gadget drivers.
Nothing major. All have been in linux-next for a while with no
reported issues"
* tag 'usb-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: gadget: udc: renesas_usb3: Fix PN_INT_ENA disabling timing
usb: gadget: udc: renesas_usb3: lock for PN_ registers access
usb: gadget: udc: renesas_usb3: fix deadlock by spinlock
usb: gadget: udc: renesas_usb3: fix pm_runtime functions calling
usb: gadget: f_mass_storage: Serialize wake and sleep execution
usb: dwc2: add support for the DWC2 controller on Meson8 SoCs
phy: qualcomm: phy-qcom-qmp: fix application of sizeof to pointer
usb: musb: dsps: keep VBUS on for host-only mode
usb: chipidea: core: check before accessing ci_role in ci_role_show
usb: chipidea: debug: check before accessing ci_role
phy: qcom-qmp: fix return value check in qcom_qmp_phy_create()
usb: chipidea: udc: fix NULL pointer dereference if udc_start failed
usb: chipidea: imx: Do not access CLKONOFF on i.MX51
Pull SCSI fixes from James Bottomley:
"This is a set of user visible fixes (excepting one format string
change).
Four of the qla2xxx fixes only affect the firmware dump path, but it's
still important to the enterprise. The rest are various NULL pointer
crash conditions or outright driver hangs"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: cxgb4i: libcxgbi: in error case RST tcp conn
scsi: scsi_debug: Avoid PI being disabled when TPGS is enabled
scsi: qla2xxx: Fix extraneous ref on sp's after adapter break
scsi: lpfc: prevent potential null pointer dereference
scsi: lpfc: Avoid NULL pointer dereference in lpfc_els_abort()
scsi: lpfc: nvmet_fc: fix format string
scsi: qla2xxx: Fix crash due to NULL pointer dereference of ctx
scsi: qla2xxx: Fix mailbox pointer error in fwdump capture
scsi: qla2xxx: Set bit 15 for DIAG_ECHO_TEST MBC
scsi: qla2xxx: Modify T262 FW dump template to specify same start/end to debug customer issues
scsi: qla2xxx: Fix crash due to mismatch mumber of Q-pair creation for Multi queue
scsi: qla2xxx: Fix NULL pointer access due to redundant fc_host_port_name call
scsi: qla2xxx: Fix recursive loop during target mode configuration for ISP25XX leaving system unresponsive
scsi: bnx2fc: fix race condition in bnx2fc_get_host_stats()
scsi: qla2xxx: don't disable a not previously enabled PCI device
Pull libnvdimm fix from Dan Williams:
"We expanded the device-dax fs type in 4.12 to be a generic provider of
a struct dax_device with an embedded inode. However, Sasha found some
basic negative testing was not run to verify that this fs cleanly
handles being mounted directly.
Note that the fresh rebase was done to remove an unnecessary Cc:
<stable> tag, but this commit otherwise had a build success
notification from the 0day robot."
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
device-dax: fix 'dax' device filesystem inode destruction crash
Pull hexagon fix from Guenter Roeck:
"This fixes a build error seen when building hexagon images.
Richard sent me an Ack, but didn't reply when asked if he wants me to
send the patch to you directly, so I figured I'd just do it"
* tag 'hexagon-for-linus-v4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hexagon: Use raw_copy_to_user
meson_sar_adc_clear_fifo passes a 0 as value-pointer to regmap_read().
In case of the meson-saradc driver this ends up in regmap_mmio_read(),
where the value-pointer is de-referenced unconditionally to assign the
value which was read.
Fix this by passing an actual pointer, even though all we want to do is
to discard the value.
As a side-effect this fixes a sparse warning ("Using plain integer as
NULL pointer") as reported by Paolo Cretaro.
Fixes: 3adbf34273 ("iio: adc: add a driver for the SAR ADC found in Amlogic Meson SoCs")
Reported-by: Paolo Cretaro <paolocretaro@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
When the page size isn't bigger than 4K, there is no added value of enabling 4K
UAR feature in the Firmware.
Modified the condition of enabling the 4K UAR accordingly.
Fixes: f502d83495 ("net/mlx5: Activate support for 4K UARs")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
Each iteration of the algorithm, DIM calculates the difference in
throughput, packet rate and interrupt rate from last iteration in order
to make a decision. DIM relies on counters for each metric. When these
counters get to their type's max value they wraparound. In this case
the delta between 'end' and 'start' samples is negative and when
translated to unsigned integers - very high. This results in a false
indication to the algorithm and might result in a wrong decision.
The fix calculates the 'distance' between 'end' and 'start' samples in a
cyclic way around the relevant type's max value. It can also be viewed as
an absolute value around the type's max value instead of around 0.
Testing show higher stability in DIM profile selection and no wraparound
issues.
Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
Until now only interrupt and packet rate were sampled.
We found a scenario on which we get a false indication since a change in
DIM caused more aggregation and reduced packet rate while increasing BW.
We now regard a change as succesfull iff:
current_BW > (prev_BW + threshold) or
current_BW ~= prev_BW and current_PR > (prev_PR + threshold) or
current_BW ~= prev_BW and current_PR ~= prev_PR and
current_IR < (prev_IR - threshold)
Where BW = Bandwidth, PR = Packet rate and IR = Interrupt rate
Improvements (ConnectX-4Lx 25GbE, single RX queue, LRO off)
--------------------------------------------------
packet size | before[Mb/s] | after[Mb/s] | gain |
2B | 343.4 | 359.4 | 4.5% |
16B | 2739.7 | 2814.8 | 2.7% |
64B | 9739 | 10185.3 | 4.5% |
Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Remove the following module event counters out of ethtool stats. The
reason for removing these event counters is that these events do not
occur without techinician's intervention.
module_pwr_budget_exd
module_long_range
module_no_eeprom
module_enforce_part
module_unknown_id
module_unknown_status
module_plug
Fixes: bedb7c909c ("net/mlx5e: Add port module event counters to ethtool stats")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The issue is that when we get an assert we will stop polling the health
and thus we cant enter error state when we have a real health issue.
Fixes: fd76ee4da5 ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
INFO: task gnome-terminal-:1734 blocked for more than 120 seconds.
Not tainted 4.12.0-rc4+ #8
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
gnome-terminal- D 0 1734 1015 0x00000000
Call Trace:
__schedule+0x3cd/0xb30
schedule+0x40/0x90
kvm_async_pf_task_wait+0x1cc/0x270
? __vfs_read+0x37/0x150
? prepare_to_swait+0x22/0x70
do_async_page_fault+0x77/0xb0
? do_async_page_fault+0x77/0xb0
async_page_fault+0x28/0x30
This is triggered by running both win7 and win2016 on L1 KVM simultaneously,
and then gives stress to memory on L1, I can observed this hang on L1 when
at least ~70% swap area is occupied on L0.
This is due to async pf was injected to L2 which should be injected to L1,
L2 guest starts receiving pagefault w/ bogus %cr2(apf token from the host
actually), and L1 guest starts accumulating tasks stuck in D state in
kvm_async_pf_task_wait() since missing PAGE_READY async_pfs.
This patch fixes the hang by doing async pf when executing L1 guest.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit ac4691fac8 ("hexagon: switch to RAW_COPY_USER") replaced
__copy_to_user_hexagon() with raw_copy_to_user(), but did not catch
all callers, resulting in the following build error.
arch/hexagon/mm/uaccess.c: In function '__clear_user_hexagon':
arch/hexagon/mm/uaccess.c:40:3: error:
implicit declaration of function '__copy_to_user_hexagon'
Fixes: ac4691fac8 ("hexagon: switch to RAW_COPY_USER")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Thomas Petazzoni says:
====================
net: mvpp2: driver fixes
As requested, here is a series of patches containing only bug fixes
for the mvpp2 driver. It is based on the latest "net" branch.
Changes since v1:
- Fixed a build breakage that occurred when only PATCH 1 was only,
and not later patches in the series. Was reported by the kbuild
report on the first submission.
- Added Tested-by from Marc Zyngier on PATCH 2.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
smp_processor_id() should not be used in migration-enabled contexts. We
originally thought it was OK in the specific situation of this driver,
but it was wrong, and calling smp_processor_id() in a migration-enabled
context prints a big fat warning when CONFIG_DEBUG_PREEMPT=y.
Therefore, this commit replaces the smp_processor_id() in
migration-enabled contexts by the appropriate get_cpu/put_cpu sections.
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Fixes: a786841df7 ("net: mvpp2: handle register mapping and access for PPv2.2")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit removes the useless remove
mvpp2_bm_cookie_{build,pool_get} functions. All what
mvpp2_bm_cookie_build() was doing is compute a 32-bit value by
concatenating the pool number and the CPU number... only to get the pool
number re-extracted by mvpp2_bm_cookie_pool_get() later on.
Instead, just get the pool number directly from RX descriptor status,
and pass it to mvpp2_pool_refill() and mvpp2_rx_refill().
This has the added benefit of dropping a smp_processor_id() call in a
migration-enabled context, which is wrong, and is the original
motivation for making this change.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the
function call path is:
tipc_l2_rcv_msg (acquire the lock by rcu_read_lock)
tipc_rcv
tipc_sk_rcv
tipc_msg_reverse
pskb_expand_head(GFP_KERNEL) --> may sleep
tipc_node_broadcast
tipc_node_xmit_skb
tipc_node_xmit
tipc_sk_rcv
tipc_msg_reverse
pskb_expand_head(GFP_KERNEL) --> may sleep
To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the
function call path is:
cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
cfctrl_linkdown_req
cfpkt_create
cfpkt_create_pfx
alloc_skb(GFP_KERNEL) --> may sleep
cfserl_receive (acquire the lock by rcu_read_lock)
cfpkt_split
cfpkt_create_pfx
alloc_skb(GFP_KERNEL) --> may sleep
There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or
"GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function
is called under a rcu read lock, instead in interrupt.
To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now sctp holds read_lock when foreach sctp_ep_hashtable without disabling
BH. If CPU schedules to another thread A at this moment, the thread A may
be trying to hold the write_lock with disabling BH.
As BH is disabled and CPU cannot schedule back to the thread holding the
read_lock, while the thread A keeps waiting for the read_lock. A dead
lock would be triggered by this.
This patch is to fix this dead lock by calling read_lock_bh instead to
disable BH when holding the read_lock in sctp_for_each_endpoint.
Fixes: 626d16f50f ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 2b30842b23 ("net: fec: Clear and enable MIB counters on imx51")
introduced fec_enet_clear_ethtool_stats(), but missed to add a stub
for the CONFIG_M5272=y case, causing build failure for the
m5272c3_defconfig.
Add the missing empty stub to fix the build failure.
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a counter problem on 32bit systems:
When the rx_bytes counter reached 2 GiB, it jumpd to (2^64 Bytes - 2GiB) Bytes.
rtnl_link_stats64 has __u64 type and atomic_long_read returns
atomic_long_t which is signed. Due to the conversation
we get an incorrect value on 32bit systems if the MSB of
the atomic_long_t value is set.
CC: Tom Parkin <tparkin@katalix.com>
Fixes: 7b7c0719cd ("l2tp: avoid deadlock in l2tp stats update")
Signed-off-by: Dominik Heidler <dheidler@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function is not defined, so no need to declare it.
As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz says:
====================
bnx2x: Fix malicious VFs indication
It was discovered that for a VF there's a simple [yet uncommon] scenario
which would cause device firmware to declare that VF as malicious -
Add a vlan interface on top of a VF and disable txvlan offloading for
that VF [causing VF to transmit packets where vlan is on payload].
Patch #1 corrects driver transmission to prevent this issue.
Patch #2 is a by-product correcting PF behavior once a VF is declared
malicious.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Once firmware indicates that a given VF is malicious and until
that VF passes an FLR all bets are off - PF can't know anything
is happening to the VF [since VF can't communicate anything to its PF].
But PF is currently still periodically asking device to collect
statistics for the VF which might in turn fill logs by IOMMU blocking
memory access done by the VF's PCI function [in the case VF has unmapped
its buffers].
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VF clients are configured as enforced, meaning firmware is validating
the correctness of their ethertype/vid during transmission.
Once txvlan is disabled, VF would start getting SKBs for transmission
here vlan is on the payload - but it'll pass the packet's ethertype
instead of the vid, leading to firmware declaring it as malicious.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull UFS fixes from Al Viro:
"This is just the obvious backport fodder; I'm pretty sure that there
will be more - definitely so wrt performance and quite possibly
correctness as well"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs: we need to sync inode before freeing it
excessive checks in ufs_write_failed() and ufs_evict_inode()
ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path
ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()
ufs: set correct ->s_maxsize
ufs: restore maintaining ->i_blocks
fix ufs_isblockset()
ufs: restore proper tail allocation
Pull btrfs fixes from Chris Mason:
"Some fixes that Dave Sterba collected.
We've been hitting an early enospc problem on production machines that
Omar tracked down to an old int->u64 mistake. I waited a bit on this
pull to make sure it was really the problem from production, but it's
on ~2100 hosts now and I think we're good.
Omar also noticed a commit in the queue would make new early ENOSPC
problems. I pulled that out for now, which is why the top three
commits are younger than the rest.
Otherwise these are all fixes, some explaining very old bugs that
we've been poking at for a while"
* 'for-linus-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix delalloc accounting leak caused by u32 overflow
Btrfs: clear EXTENT_DEFRAG bits in finish_ordered_io
btrfs: tree-log.c: Wrong printk information about namelen
btrfs: fix race with relocation recovery and fs_root setup
btrfs: fix memory leak in update_space_info failure path
btrfs: use correct types for page indices in btrfs_page_exists_in_range
btrfs: fix incorrect error return ret being passed to mapping_set_error
btrfs: Make flush bios explicitely sync
btrfs: fiemap: Cache and merge fiemap extent before submit it to user
Pull x86 fixes from Ingo Molnar:
"Misc fixes: a Geode fix plus a microcode loader fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/intel: Clear patch pointer before jettisoning the initrd
x86/cpu/cyrix: Add alternative Device ID of Geode GX1 SoC
Pull CPU hotplug fix from Ingo Molnar:
"An error handling corner case fix"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Drop the device lock on error
Pull RCU fixes from Ingo Molnar:
"Fix an SRCU bug affecting KVM IRQ injection"
* 'rcu-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
srcu: Allow use of Classic SRCU from both process and interrupt context
srcu: Allow use of Tiny/Tree SRCU from both process and interrupt context
Pull perf fixes from Ingo Molnar:
"This is mostly tooling fixes, plus an instruction pointer filtering
fix.
It's more fixes than usual - Arnaldo got back from a longer vacation
and there was a backlog"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
perf symbols: Kill dso__build_id_is_kmod()
perf symbols: Keep DSO->symtab_type after decompress
perf tests: Decompress kernel module before objdump
perf tools: Consolidate error path in __open_dso()
perf tools: Decompress kernel module when reading DSO data
perf annotate: Use dso__decompress_kmodule_path()
perf tools: Introduce dso__decompress_kmodule_{fd,path}
perf tools: Fix a memory leak in __open_dso()
perf annotate: Fix symbolic link of build-id cache
perf/core: Drop kernel samples even though :u is specified
perf script python: Remove dups in documentation examples
perf script python: Updated trace_unhandled() signature
perf script python: Fix wrong code snippets in documentation
perf script: Fix documentation errors
perf script: Fix outdated comment for perf-trace-python
perf probe: Fix examples section of documentation
perf report: Ensure the perf DSO mapping matches what libdw sees
perf report: Include partial stacks unwound with libdw
perf annotate: Add missing powerpc triplet
perf test: Disable breakpoint signal tests for powerpc
...
Pull EFI fix from Ingo Molnar:
"A boot crash fix for certain systems where the kernel would trust a
piece of firmware data it should not have"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: Fix boot panic because of invalid BGRT image address
Pull IOMMU fixes from Joerg Roedel:
- another compile-fix for my header cleanup
- a couple of fixes for the recently merged IOMMU probe deferal code
- fixes for ACPI/IORT code necessary with IOMMU probe deferal
* tag 'iommu-fixes-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
arm: dma-mapping: Reset the device's dma_ops
ACPI/IORT: Move the check to get iommu_ops from translated fwspec
ARM: dma-mapping: Don't tear down third-party mappings
ACPI/IORT: Ignore all errors except EPROBE_DEFER
iommu/of: Ignore all errors except EPROBE_DEFER
iommu/of: Fix check for returning EPROBE_DEFER
iommu/dma: Fix function declaration
Pull input fixes from Dmitry Torokhov:
- mark "guest" RMI device as pass-through port to avoid "phantom" ALPS
toouchpad on newer Lenovo Carbons
- add two more laptops to the Elantech's lists of devices using CRC
mode
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - register F03 port as pass-through serio
Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled
Pull MD bugfix from Shaohua Li:
"One bug fix from Neil Brown for MD. The bug was introduced in this
cycle"
* tag 'md/4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md: initialise ->writes_pending in personality modules.
Pull block fixes from Jens Axboe:
"A set of fixes in the area of block IO, that should go into the next
-rc release. This contains:
- An OOPS fix from Dmitry, fixing a regression with the bio integrity
code in this series.
- Fix truncation of elevator io context cache name, from Eric
Biggers.
- NVMe pull from Christoph includes FC fixes from James, APST
fixes/tweaks from Kai-Heng, removal fix from Rakesh, and an RDMA
fix from Sagi.
- Two tweaks for the block throttling code. One from Joseph Qi,
fixing an oops from the timer code, and one from Shaohua, improving
the behavior on rotatonal storage.
- Two blk-mq fixes from Ming, fixing corner cases with the direct
issue code.
- Locking fix for bfq cgroups from Paolo"
* 'for-linus' of git://git.kernel.dk/linux-block:
block, bfq: access and cache blkg data only when safe
Fix loop device flush before configure v3
blk-throttle: set default latency baseline for harddisk
blk-throttle: fix NULL pointer dereference in throtl_schedule_pending_timer
nvme: relax APST default max latency to 100ms
nvme: only consider exit latency when choosing useful non-op power states
nvme-fc: fix missing put reference on controller create failure
nvme-fc: on lldd/transport io error, terminate association
nvme-rdma: fast fail incoming requests while we reconnect
nvme-pci: fix multiple ctrl removal scheduling
nvme: fix hang in remove path
elevator: fix truncation of icq_cache_name
blk-mq: fix direct issue
blk-mq: pass correct hctx to blk_mq_try_issue_directly
bio-integrity: Do not allocate integrity context for bio w/o data
Pull sound fixes from Takashi Iwai:
"This update contains a slightly hight amount of changes due to the
pending ASoC fixes:
- ALSA timer core got a couple of fixes for races between read and
ioctl, leading to potential read of uninitialized kmalloced memory
- ASoC core fixed the de-registration pattern for use-after-free bug
- The rewrite of probe code in ASoC Intel Skylake for i915 component
- ASoC R-snd got a series of fixes for SSI
- ASoC simple-card, atmel, da7213, and rt286 trivial fixes
- HD-audio ALC269 quirk and rearrangement of quirk table"
* tag 'sound-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT
ALSA: timer: Fix race between read and ioctl
ALSA: hda/realtek - Reorder ALC269 ASUS quirk entries
ALSA: hda/realtek: Fix mic and headset jack sense on Asus X705UD
ASoC: rsnd: fixup parent_clk_name of AUDIO_CLKOUTx
ASoC: Intel: Skylake: Fix to parse consecutive string tkns in manifest
ASoC: Intel: Skylake: Fix IPC rx_list corruption
ASoC: rsnd: SSI PIO adjust to 24bit mode
MAINTAINERS: Update email address for patches to Wolfson parts
ASoC: Fix use-after-free at card unregistration
ASoC: simple-card: fix mic jack initialization
ASoC: rsnd: don't call free_irq() on Parent SSI
ASoC: atmel-classd: sync regcache when resuming
ASoC: rsnd: don't use PDTA bit for 24bit on SSI
ASoC: da7213: Fix incorrect usage of bitwise '&' operator for SRM check
rt286: add Thinkpad Helix 2 to force_combo_jack_table
ASoC: Intel: Skylake: Move i915 registration to worker thread
Pull drm fixes from Dave Airlie:
"Intel, nouveau, rockchip, vmwgfx, imx, meson, mediatek and core fixes.
Bit more spread out fixes this time, fixes for 7 drivers + a couple of
core fixes.
i915 and vmwgfx are the main ones. The vmwgfx ones fix a bunch of
regressions in their atomic rework, and a few fixes destined for
stable. i915 has some 4.12 regressions and older things that need to
be fixed in stable as well.
nouveau also has some runtime pm fixes and a timer list handling fix,
otherwise a couple of core and small driver regression fixes"
* tag 'drm-fixes-for-v4.12-rc5' of git://people.freedesktop.org/~airlied/linux: (37 commits)
drm/i915: fix warning for unused variable
drm/meson: Fix driver bind when only CVBS is available
drm/i915: Fix 90/270 rotated coordinates for FBC
drm/i915: Restore has_fbc=1 for ILK-M
drm/i915: Workaround VLV/CHV DSI scanline counter hardware fail
drm/i915: Fix logical inversion for gen4 quirking
drm/i915: Guard against i915_ggtt_disable_guc() being invoked unconditionally
drm/i915: Always recompute watermarks when distrust_bios_wm is set, v2.
drm/i915: Prevent the system suspend complete optimization
drm/i915/psr: disable psr2 for resolution greater than 32X20
drm/i915: Hold a wakeref for probing the ring registers
drm/i915: Short-circuit i915_gem_wait_for_idle() if already idle
drm/i915: Disable decoupled MMIO
drm/i915/guc: Remove stale comment for q_fail
drm/vmwgfx: Bump driver minor and date
drm/vmwgfx: Remove unused legacy cursor functions
drm/vmwgfx: fix spelling mistake "exeeds" -> "exceeds"
drm/vmwgfx: Fix large topology crash
drm/vmwgfx: Make sure to update STDU when FB is updated
drm/vmwgfx: Make sure backup_handle is always valid
...
As it is, short copy in write() to append-only file will fail
to truncate the excessive allocated blocks. As the matter of
fact, all checks in ufs_truncate_blocks() are either redundant
or wrong for that caller. As for the only other caller
(ufs_evict_inode()), we only need the file type checks there.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
btrfs_calc_trans_metadata_size() does an unsigned 32-bit multiplication,
which can overflow if num_items >= 4 GB / (nodesize * BTRFS_MAX_LEVEL * 2).
For a nodesize of 16kB, this overflow happens at 16k items. Usually,
num_items is a small constant passed to btrfs_start_transaction(), but
we also use btrfs_calc_trans_metadata_size() for metadata reservations
for extent items in btrfs_delalloc_{reserve,release}_metadata().
In drop_outstanding_extents(), num_items is calculated as
inode->reserved_extents - inode->outstanding_extents. The difference
between these two counters is usually small, but if many delalloc
extents are reserved and then the outstanding extents are merged in
btrfs_merge_extent_hook(), the difference can become large enough to
overflow in btrfs_calc_trans_metadata_size().
The overflow manifests itself as a leak of a multiple of 4 GB in
delalloc_block_rsv and the metadata bytes_may_use counter. This in turn
can cause early ENOSPC errors. Additionally, these WARN_ONs in
extent-tree.c will be hit when unmounting:
WARN_ON(fs_info->delalloc_block_rsv.size > 0);
WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);
WARN_ON(space_info->bytes_pinned > 0 ||
space_info->bytes_reserved > 0 ||
space_info->bytes_may_use > 0);
Fix it by casting nodesize to a u64 so that
btrfs_calc_trans_metadata_size() does a full 64-bit multiplication.
While we're here, do the same in btrfs_calc_trunc_metadata_size(); this
can't overflow with any existing uses, but it's better to be safe here
than have another hard-to-debug problem later on.
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Before this, we use 'filled' mode here, ie. if all range has been
filled with EXTENT_DEFRAG bits, get to clear it, but if the defrag
range joins the adjacent delalloc range, then we'll have EXTENT_DEFRAG
bits in extent_state until releasing this inode's pages, and that
prevents extent_data from being freed.
This clears the bit if any was found within the ordered extent.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
In verify_dir_item, it wants to printk name_len of dir_item but
printk data_len acutally.
Fix it by calling btrfs_dir_name_len instead of btrfs_dir_data_len.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Marc Kleine-Budde says:
====================
pull-request: can 2017-06-09
this is a pull request of 6 patches for net/master.
There's a patch by Stephane Grosjean that fixes an uninitialized symbol warning
in the peak_canfd driver. A patch by Johan Hovold to fix the product-id
endianness in an error message in the the peak_usb driver. A patch by Oliver
Hartkopp to enable CAN FD for virtual CAN devices by default. Three patches by
me, one makes the helper function can_change_state() robust to be called with
cf == NULL. The next patch fixes a memory leak in the gs_usb driver. And the
last one fixes a lockdep splat by properly initialize the per-net
can_rcvlists_lock spin_lock.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The change to remove free_netdev() from ieee80211_if_free()
erroneously didn't add the necessary free_netdev() for when
ieee80211_if_free() is called directly in one place, rather
than as the priv_destructor. Add the missing call.
Fixes: cf124db566 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPI's from the victim cpu are not handled in dev_cpu_callback.
So these pending IPI's would be sent to the remote cpu only when
NET_RX is scheduled on the victim cpu and since this trigger is
unpredictable it would result in packet latencies on the remote cpu.
This patch add support to send the pending ipi's of victim cpu.
Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull xen fix from Juergen Gross:
"A fix for Xen on ARM when dealing with 64kB page size of a guest"
* tag 'for-linus-4.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/privcmd: Support correctly 64KB page granularity when mapping memory
The 5th generation Thinkpad X1 Carbons use Synaptics touchpads accessible
over SMBus/RMI, combined with ALPS or Elantech trackpoint devices instead
of classic IBM/Lenovo trackpoints. Unfortunately there is no way for ALPS
driver to detect whether it is dealing with touchpad + trackpoint
combination or just a trackpoint, so we end up with a "phantom" dualpoint
ALPS device in addition to real touchpad and trackpoint.
Given that we do not have any special advanced handling for ALPS or
Elantech trackpoints (unlike IBM trackpoints that have separate driver and
a host of options) we are better off keeping the trackpoints in PS/2
emulation mode. We achieve that by setting serio type to SERIO_PS_PSTHRU,
which will limit number of protocols psmouse driver will try. In addition
to getting rid of the "phantom" touchpads, this will also speed up probing
of F03 pass-through port.
Reported-by: Damjan Georgievski <gdamjan@gmail.com>
Suggested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Pull powerpc fixes from Michael Ellerman:
"Mostly fairly minor, of note are:
- Fix percpu allocations to be NUMA aware
- Limit 4k page size config to 64TB virtual address space
- Avoid needlessly restoring FP and vector registers
Thanks to Aneesh Kumar K.V, Breno Leitao, Christophe Leroy, Frederic
Barrat, Madhavan Srinivasan, Michael Bringmann, Nicholas Piggin,
Vaibhav Jain"
* tag 'powerpc-4.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/book3s64: Move PPC_DT_CPU_FTRs and enable it by default
powerpc/mm/4k: Limit 4k page size config to 64TB virtual address space
cxl: Fix error path on bad ioctl
powerpc/perf: Fix Power9 test_adder fields
powerpc/numa: Fix percpu allocations to be NUMA aware
cxl: Avoid double free_irq() for psl,slice interrupts
powerpc/kernel: Initialize load_tm on task creation
powerpc/kernel: Fix FP and vector register restoration
powerpc/64: Reclaim CPU_FTR_SUBCORE
powerpc/hotplug-mem: Fix missing endian conversion of aa_index
powerpc/sysdev/simple_gpio: Fix oops in gpio save_regs function
powerpc/spufs: Fix coredump of SPU contexts
powerpc/64s: Add dt_cpu_ftrs boot time setup option
Pull ARM SoC fixes from Olof Johansson:
"Been sitting on these for a couple of weeks waiting on some larger
batches to come in but it's been pretty quiet.
Just your garden variety fixes here:
- A few maintainers updates (ep93xx, Exynos, TI, Marvell)
- Some PM fixes for Atmel/at91 and Marvell
- A few DT fixes for Marvell, Versatile, TI Keystone, bcm283x
- A reset driver patch to set module license for symbol access"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
MAINTAINERS: EP93XX: Update maintainership
MAINTAINERS: remove kernel@stlinux.com obsolete mailing list
ARM: dts: versatile: use #include "..." to include local DT
MAINTAINERS: add device-tree files to TI DaVinci entry
ARM: at91: select CONFIG_ARM_CPU_SUSPEND
ARM: dts: keystone-k2l: fix broken Ethernet due to disabled OSR
arm64: defconfig: enable some core options for 64bit Rockchip socs
arm64: marvell: dts: fix interrupts in 7k/8k crypto nodes
reset: hi6220: Set module license so that it can be loaded
MAINTAINERS: add irqchip related drivers to Marvell EBU maintainers
MAINTAINERS: sort F entries for Marvell EBU maintainers
ARM: davinci: PM: Do not free useful resources in normal path in 'davinci_pm_init'
ARM: davinci: PM: Free resources in error handling path in 'davinci_pm_init'
ARM: dts: bcm283x: Reserve first page for firmware
memory: atmel-ebi: mark PM ops as __maybe_unused
MAINTAINERS: Remove Javier Martinez Canillas as reviewer for Exynos
1.) Bugfix of function stmmac_get_tx_hwtstamp.
Corrected the tx timestamp available check (same as 4.8 and older)
Change printout from info syslevel to debug.
2.) Bugfix of function stmmac_get_rx_hwtstamp.
Corrected the rx timestamp available check (same as 4.8 and older)
Change printout from info syslevel to debug.
Fixes: ba1ffd74df ("stmmac: fix PTP support for GMAC4")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
According the CYCLON V documention only the bit 16 of snaptypesel should
set.
(more information see Table 17-20 (cv_5v4.pdf) :
Timestamp Snapshot Dependency on Register Bits)
Fixes: d2042052a0 ("stmmac: update the PTP header file")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a check and a nice user-friendly message when the curses
library is not present on the system and the user wants to do "make
menuconfig". It doesn't get issued, though. Instead, we fail the build
when mconf.c doesn't find the curses.h header:
HOSTCC scripts/kconfig/mconf.o
In file included from scripts/kconfig/mconf.c:23:0:
scripts/kconfig/lxdialog/dialog.h:38:20: fatal error: curses.h: No such file or directory
#include CURSES_LOC
^
compilation terminated.
Make that check a prerequisite to mconf so that the user sees the error
message instead:
$ make menuconfig
*** Unable to find the ncurses libraries or the
*** required header files.
*** 'make menuconfig' requires the ncurses libraries.
***
*** Install ncurses (ncurses-devel) and try again.
***
scripts/kconfig/Makefile:203: recipe for target 'scripts/kconfig/dochecklxdialog' failed
make[1]: *** [scripts/kconfig/dochecklxdialog] Error 1
Makefile:548: recipe for target 'menuconfig' failed
make: *** [menuconfig] Error 2
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
It looks like this:
Message from syslogd@flamingo at Apr 26 00:45:00 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4
They seem to coincide with net namespace teardown.
The message is emitted by netdev_wait_allrefs().
Forced a kdump in netdev_run_todo, but found that the refcount on the lo
device was already 0 at the time we got to the panic.
Used bcc to check the blocking in netdev_run_todo. The only places
where we're off cpu there are in the rcu_barrier() and msleep() calls.
That behavior is expected. The msleep time coincides with the amount of
time we spend waiting for the refcount to reach zero; the rcu_barrier()
wait times are not excessive.
After looking through the list of callbacks that the netdevice notifiers
invoke in this path, it appears that the dst_dev_event is the most
interesting. The dst_ifdown path places a hold on the loopback_dev as
part of releasing the dev associated with the original dst cache entry.
Most of our notifier callbacks are straight-forward, but this one a)
looks complex, and b) places a hold on the network interface in
question.
I constructed a new bcc script that watches various events in the
liftime of a dst cache entry. Note that dst_ifdown will take a hold on
the loopback device until the invalidated dst entry gets freed.
[ __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183
__dst_free
rcu_nocb_kthread
kthread
ret_from_fork
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The inode destruction path for the 'dax' device filesystem incorrectly
assumes that the inode was initialized through 'alloc_dax()'. However,
if someone attempts to directly mount the dax filesystem with 'mount -t
dax dax mnt' that will bypass 'alloc_dax()' and the following failure
signatures may occur as a result:
kill_dax() must be called before final iput()
WARNING: CPU: 2 PID: 1188 at drivers/dax/super.c:243 dax_destroy_inode+0x48/0x50
RIP: 0010:dax_destroy_inode+0x48/0x50
Call Trace:
destroy_inode+0x3b/0x60
evict+0x139/0x1c0
iput+0x1f9/0x2d0
dentry_unlink_inode+0xc3/0x160
__dentry_kill+0xcf/0x180
? dput+0x37/0x3b0
dput+0x3a3/0x3b0
do_one_tree+0x36/0x40
shrink_dcache_for_umount+0x2d/0x90
generic_shutdown_super+0x1f/0x120
kill_anon_super+0x12/0x20
deactivate_locked_super+0x43/0x70
deactivate_super+0x4e/0x60
general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
RIP: 0010:kfree+0x6d/0x290
Call Trace:
<IRQ>
dax_i_callback+0x22/0x60
? dax_destroy_inode+0x50/0x50
rcu_process_callbacks+0x298/0x740
ida_remove called for id=0 which is not allocated.
WARNING: CPU: 0 PID: 0 at lib/idr.c:383 ida_remove+0x110/0x120
[..]
Call Trace:
<IRQ>
ida_simple_remove+0x2b/0x50
? dax_destroy_inode+0x50/0x50
dax_i_callback+0x3c/0x60
rcu_process_callbacks+0x298/0x740
Add missing initialization of the 'struct dax_device' and inode so that
the destruction path does not kfree() or ida_simple_remove()
uninitialized data.
Fixes: 7b6be8444e ("dax: refactor dax-fs into a generic provider of 'struct dax_device' instances")
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maniaxx reported a kernel boot crash in the EFI code, which I emulated
by using same invalid phys addr in code:
BUG: unable to handle kernel paging request at ffffffffff280001
IP: efi_bgrt_init+0xfb/0x153
...
Call Trace:
? bgrt_init+0xbc/0xbc
acpi_parse_bgrt+0xe/0x12
acpi_table_parse+0x89/0xb8
acpi_boot_init+0x445/0x4e2
? acpi_parse_x2apic+0x79/0x79
? dmi_ignore_irq0_timer_override+0x33/0x33
setup_arch+0xb63/0xc82
? early_idt_handler_array+0x120/0x120
start_kernel+0xb7/0x443
? early_idt_handler_array+0x120/0x120
x86_64_start_reservations+0x29/0x2b
x86_64_start_kernel+0x154/0x177
secondary_startup_64+0x9f/0x9f
There is also a similar bug filed in bugzilla.kernel.org:
https://bugzilla.kernel.org/show_bug.cgi?id=195633
The crash is caused by this commit:
7b0a911478 efi/x86: Move the EFI BGRT init code to early init code
The root cause is the firmware on those machines provides invalid BGRT
image addresses.
In a kernel before above commit BGRT initializes late and uses ioremap()
to map the image address. Ioremap validates the address, if it is not a
valid physical address ioremap() just fails and returns. However in current
kernel EFI BGRT initializes early and uses early_memremap() which does not
validate the image address, and kernel panic happens.
According to ACPI spec the BGRT image address should fall into
EFI_BOOT_SERVICES_DATA, see the section 5.2.22.4 of below document:
http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
Fix this issue by validating the image address in efi_bgrt_init(). If the
image address does not fall into any EFI_BOOT_SERVICES_DATA areas we just
bail out with a warning message.
Reported-by: Maniaxx <tripleshiftone@gmail.com>
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 7b0a911478 ("efi/x86: Move the EFI BGRT init code to early init code")
Link: http://lkml.kernel.org/r/20170609084558.26766-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
CAN FD capable CAN interfaces can handle (classic) CAN 2.0 frames too.
New users usually fail at their first attempt to explore CAN FD on
virtual CAN interfaces due to the current CAN_MTU default.
Set the MTU to CANFD_MTU by default to reduce this confusion.
If someone *really* needs a 'classic CAN'-only device this can be set
with the 'ip' tool with e.g. 'ip link set vcan0 mtu 16' as before.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch adds the missing kfree() in gs_cmd_reset() to free the
memory that is not used anymore after usb_control_msg().
Cc: linux-stable <stable@vger.kernel.org>
Cc: Maximilian Schneider <max@schneidersoft.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Make sure to use the USB device product-id stored in host-byte order in
a probe error message.
Also remove a redundant reassignment of the local usb_dev variable which
had already been used to retrieve the product id.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch fixes two uninitialized symbol warnings in the new code adding
support of the PEAK-System PCAN-PCI Express FD boards, in the socket-CAN
network protocol family.
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
In OOM situations where no skb can be allocated, can_change_state() may
be called with cf == NULL. As this function updates the state and error
statistics it's not an option to skip the call to can_change_state() in
OOM situations.
This patch makes can_change_state() robust, so that it can be called
with cf == NULL.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
During an eeh call to cxl_remove can result in double free_irq of
psl,slice interrupts. This can happen if perst_reloads_same_image == 1
and call to cxl_configure_adapter() fails during slot_reset
callback. In such a case we see a kernel oops with following back-trace:
Oops: Kernel access of bad area, sig: 11 [#1]
Call Trace:
free_irq+0x88/0xd0 (unreliable)
cxl_unmap_irq+0x20/0x40 [cxl]
cxl_native_release_psl_irq+0x78/0xd8 [cxl]
pci_deconfigure_afu+0xac/0x110 [cxl]
cxl_remove+0x104/0x210 [cxl]
pci_device_remove+0x6c/0x110
device_release_driver_internal+0x204/0x2e0
pci_stop_bus_device+0xa0/0xd0
pci_stop_and_remove_bus_device+0x28/0x40
pci_hp_remove_devices+0xb0/0x150
pci_hp_remove_devices+0x68/0x150
eeh_handle_normal_event+0x140/0x580
eeh_handle_event+0x174/0x360
eeh_event_handler+0x1e8/0x1f0
This patch fixes the issue of double free_irq by checking that
variables that hold the virqs (err_hwirq, serr_hwirq, psl_virq) are
not '0' before un-mapping and resetting these variables to '0' when
they are un-mapped.
Cc: stable@vger.kernel.org
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In stm32_pconf_parse_conf function, stm32_pmx_gpio_set_direction is
called with wrong parameter value. Indeed, using NULL value for range
will raise an oops.
Fixes: aceb16dc2d ("pinctrl: Add STM32 MCUs support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The AMD pinctrl driver uses a chained interrupt to demultiplex the GPIO
interrupts. Kevin Vandeventer reported, that his new AMD Ryzen locks up
hard on boot when the AMD pinctrl driver is initialized. The reason is an
interrupt storm. It's not clear whether that's caused by hardware or
firmware or both.
Using chained interrupts on X86 is a dangerous endavour. If a system is
misconfigured or the hardware buggy there is no safety net to catch an
interrupt storm.
Convert the driver to use a regular interrupt for the demultiplex
handler. This allows the interrupt storm detector to catch the malfunction
and lets the system boot up.
This should be backported to stable because it's likely that more users run
into this problem as the AMD Ryzen machines are spreading.
Reported-by: Kevin Vandeventer
Link: https://bugzilla.suse.com/show_bug.cgi?id=1034261
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
If more than one gpio bank has the "pwm" property, only one will be
registered successfully, all the others will fail with:
mvebu-gpio: probe of f1018140.gpio failed with error -17
That's because in alloc_pwms(), the chip->base (aka "int pwm"), was not
set (thus, ==0) ; and 0 is a meaningful start value in alloc_pwm().
What was intended is mvpwm->chip->base = -1.
Like that, the numbering will be done auto-magically
Moreover, as the region might be already occupied by another pwm, we
shouldn't force:
mvpwm->chip->base = 0
nor
mvpwm->chip->base = id * MVEBU_MAX_GPIO_PER_BANK;
Tested on clearfog-pro (Marvell 88F6828)
Fixes: 757642f9a5 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The blink counter A was always selected because 0 was forced in the
blink select counter register.
The variable 'set' was obviously there to be used as the register value,
selecting the B counter when id==1 and A counter when id==0.
Tested on clearfog-pro (Marvell 88F6828)
Fixes: 757642f9a5 ("gpio: mvebu: Add limited PWM support")
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull RCU fix from Paul E. McKenney:
" This series enables srcu_read_lock() and srcu_read_unlock() to be used from
interrupt handlers, which fixes a bug in KVM's use of SRCU in delivery
of interrupts to guest OSes. "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When iscsi WRITE underflow occurs there are two different scenarios
that can happen.
Normally in practice, when an EDTL vs. SCSI CDB TRANSFER LENGTH
underflow is detected, the iscsi immediate data payload is the
smaller SCSI CDB TRANSFER LENGTH.
That is, when a host fabric LLD is using a fixed size EDTL for
a specific control CDB, the SCSI CDB TRANSFER LENGTH and actual
SCSI payload ends up being smaller than EDTL. In iscsi, this
means the received iscsi immediate data payload matches the
smaller SCSI CDB TRANSFER LENGTH, because there is no more
SCSI payload to accept beyond SCSI CDB TRANSFER LENGTH.
However, it's possible for a malicous host to send a WRITE
underflow where EDTL is larger than SCSI CDB TRANSFER LENGTH,
but incoming iscsi immediate data actually matches EDTL.
In the wild, we've never had a iscsi host environment actually
try to do this.
For this special case, it's wrong to truncate part of the
control CDB payload and continue to process the command during
underflow when immediate data payload received was larger than
SCSI CDB TRANSFER LENGTH, so go ahead and reject and drop the
bogus payload as a defensive action.
Note this potential bug was originally relaxed by the following
for allowing WRITE underflow in MSFT FCP host environments:
commit c72c525022
Author: Roland Dreier <roland@purestorage.com>
Date: Wed Jul 22 15:08:18 2015 -0700
target: allow underflow/overflow for PR OUT etc. commands
Cc: Roland Dreier <roland@purestorage.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a BUG() in iscsit_close_session() that could be
triggered when iscsit_logout_post_handler() execution from within
tx thread context was not run for more than SECONDS_FOR_LOGOUT_COMP
(15 seconds), and the TCP connection didn't already close before
then forcing tx thread context to automatically exit.
This would manifest itself during explicit logout as:
[33206.974254] 1 connection(s) still exist for iSCSI session to iqn.1993-08.org.debian:01:3f5523242179
[33206.980184] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 2100.772 msecs
[33209.078643] ------------[ cut here ]------------
[33209.078646] kernel BUG at drivers/target/iscsi/iscsi_target.c:4346!
Normally when explicit logout attempt fails, the tx thread context
exits and iscsit_close_connection() from rx thread context does the
extra cleanup once it detects conn->conn_logout_remove has not been
cleared by the logout type specific post handlers.
To address this special case, if the logout post handler in tx thread
context detects conn->tx_thread_active has already been cleared, simply
return and exit in order for existing iscsit_close_connection()
logic from rx thread context do failed logout cleanup.
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.14+
Tested-by: Gary Guo <ghg@datera.io>
Tested-by: Chu Yuan Lin <cyl@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a se_cmd->cmd_kref underflow during CMD_T_ABORTED
when a fabric driver drops it's second reference from below the
target_core_tmr.c based callers of transport_cmd_finish_abort().
Recently with the conversion of kref to refcount_t, this bug was
manifesting itself as:
[705519.601034] refcount_t: underflow; use-after-free.
[705519.604034] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 20116.512 msecs
[705539.719111] ------------[ cut here ]------------
[705539.719117] WARNING: CPU: 3 PID: 26510 at lib/refcount.c:184 refcount_sub_and_test+0x33/0x51
Since the original kref atomic_t based kref_put() didn't check for
underflow and only invoked the final callback when zero was reached,
this bug did not manifest in practice since all se_cmd memory is
using preallocated tags.
To address this, go ahead and propigate the existing return from
transport_put_cmd() up via transport_cmd_finish_abort(), and
change transport_cmd_finish_abort() + core_tmr_handle_tas_abort()
callers to only do their local target_put_sess_cmd() if necessary.
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.14+
Tested-by: Gary Guo <ghg@datera.io>
Tested-by: Chu Yuan Lin <cyl@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If a key's refcount is dropped to zero between key_lookup() peeking at
the refcount and subsequently attempting to increment it, refcount_inc()
will see a zero refcount. Here, refcount_inc() will WARN_ONCE(), and
will *not* increment the refcount, which will remain zero.
Once key_lookup() drops key_serial_lock, it is possible for the key to
be freed behind our back.
This patch uses refcount_inc_not_zero() to perform the peek and increment
atomically.
Fixes: fff292914d ("security, keys: convert key.usage from atomic_t to refcount_t")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
The initial Diffie-Hellman computation made direct use of the MPI
library because the crypto module did not support DH at the time. Now
that KPP is implemented, KEYCTL_DH_COMPUTE should use it to get rid of
duplicate code and leverage possible hardware acceleration.
This fixes an issue whereby the input to the KDF computation would
include additional uninitialized memory when the result of the
Diffie-Hellman computation was shorter than the input prime number.
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
If userspace called KEYCTL_DH_COMPUTE with kdf_params containing NULL
otherinfo but nonzero otherinfolen, the kernel would allocate a buffer
for the otherinfo, then feed it into the KDF without initializing it.
Fix this by always doing the copy from userspace (which will fail with
EFAULT in this scenario).
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Requesting "digest_null" in the keyctl_kdf_params caused an infinite
loop in kdf_ctr() because the "null" hash has a digest size of 0. Fix
it by rejecting hash algorithms with a digest size of 0.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
While a 'struct key' itself normally does not contain sensitive
information, Documentation/security/keys.txt actually encourages this:
"Having a payload is not required; and the payload can, in fact,
just be a value stored in the struct key itself."
In case someone has taken this advice, or will take this advice in the
future, zero the key structure before freeing it. We might as well, and
as a bonus this could make it a bit more difficult for an adversary to
determine which keys have recently been in use.
This is safe because the key_jar cache does not use a constructor.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
As the previous patch did for encrypted-keys, zero sensitive any
potentially sensitive data related to the "trusted" key type before it
is freed. Notably, we were not zeroing the tpm_buf structures in which
the actual key is stored for TPM seal and unseal, nor were we zeroing
the trusted_key_payload in certain error paths.
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Safford <safford@us.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
For keys of type "encrypted", consistently zero sensitive key material
before freeing it. This was already being done for the decrypted
payloads of encrypted keys, but not for the master key and the keys
derived from the master key.
Out of an abundance of caution and because it is trivial to do so, also
zero buffers containing the key payload in encrypted form, although
depending on how the encrypted-keys feature is used such information
does not necessarily need to be kept secret.
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Safford <safford@us.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Zero the payloads of user and logon keys before freeing them. This
prevents sensitive key material from being kept around in the slab
caches after a key is released.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Before returning from add_key() or one of the keyctl() commands that
takes in a key payload, zero the temporary buffer that was allocated to
hold the key payload copied from userspace. This may contain sensitive
key material that should not be kept around in the slab caches.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
sys_add_key() and the KEYCTL_UPDATE operation of sys_keyctl() allowed a
NULL payload with nonzero length to be passed to the key type's
->preparse(), ->instantiate(), and/or ->update() methods. Various key
types including asymmetric, cifs.idmap, cifs.spnego, and pkcs7_test did
not handle this case, allowing an unprivileged user to trivially cause a
NULL pointer dereference (kernel oops) if one of these key types was
present. Fix it by doing the copy_from_user() when 'plen' is nonzero
rather than when '_payload' is non-NULL, causing the syscall to fail
with EFAULT as expected when an invalid buffer is specified.
Cc: stable@vger.kernel.org # 2.6.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
The encrypted-keys module was using a single global HMAC transform,
which could be rekeyed by multiple threads concurrently operating on
different keys, causing incorrect HMAC values to be calculated. Fix
this by allocating a new HMAC transform whenever we need to calculate a
HMAC. Also simplify things a bit by allocating the shash_desc's using
SHASH_DESC_ON_STACK() for both the HMAC and unkeyed hashes.
The following script reproduces the bug:
keyctl new_session
keyctl add user master "abcdefghijklmnop" @s
for i in $(seq 2); do
(
set -e
for j in $(seq 1000); do
keyid=$(keyctl add encrypted desc$i "new user:master 25" @s)
datablob="$(keyctl pipe $keyid)"
keyctl unlink $keyid > /dev/null
keyid=$(keyctl add encrypted desc$i "load $datablob" @s)
keyctl unlink $keyid > /dev/null
done
) &
done
Output with bug:
[ 439.691094] encrypted_key: bad hmac (-22)
add_key: Invalid argument
add_key: Invalid argument
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Since v4.9, the crypto API cannot (normally) be used to encrypt/decrypt
stack buffers because the stack may be virtually mapped. Fix this for
the padding buffers in encrypted-keys by using ZERO_PAGE for the
encryption padding and by allocating a temporary heap buffer for the
decryption padding.
Tested with CONFIG_DEBUG_SG=y:
keyctl new_session
keyctl add user master "abcdefghijklmnop" @s
keyid=$(keyctl add encrypted desc "new user:master 25" @s)
datablob="$(keyctl pipe $keyid)"
keyctl unlink $keyid
keyid=$(keyctl add encrypted desc "load $datablob" @s)
datablob2="$(keyctl pipe $keyid)"
[ "$datablob" = "$datablob2" ] && echo "Success!"
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
In join_session_keyring(), if install_session_keyring_to_cred() were to
fail, we would leak the keyring reference, just like in the bug fixed by
commit 23567fd052 ("KEYS: Fix keyring ref leak in
join_session_keyring()"). Fortunately this cannot happen currently, but
we really should be more careful. Do this by adding and using a new
error label at which the keyring reference is dropped.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
We forgot to set the error code on this path so it could result in
returning NULL which leads to a NULL dereference.
Fixes: db6c43bd21 ("crypto: KEYS: convert public key and digsig asym to the akcipher api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
With the new standardized functions, we can replace all ACCESS_ONCE()
calls across relevant security/keyrings/.
ACCESS_ONCE() does not work reliably on non-scalar types. For example
gcc 4.6 and 4.7 might remove the volatile tag for such accesses during
the SRA (scalar replacement of aggregates) step:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
Update the new calls regardless of if it is a scalar type, this is
cleaner than having three alternatives.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
CONFIG_KEYS_COMPAT is defined in arch-specific Kconfigs and is missing for
several 64-bit architectures : mips, parisc, tile.
At the moment and for those architectures, calling in 32-bit userspace the
keyctl syscall would return an ENOSYS error.
This patch moves the CONFIG_KEYS_COMPAT option to security/keys/Kconfig, to
make sure the compatibility wrapper is registered by default for any 64-bit
architecture as long as it is configured with CONFIG_COMPAT.
[DH: Modified to remove arm64 compat enablement also as requested by Eric
Biggers]
Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
cc: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
A bunch of fixes for vmwgfx 4.12 regressions and older stuff. In the latter
case either trivial, cc'd stable or requiring backports for stable.
* 'vmwgfx-fixes-4.12' of git://people.freedesktop.org/~thomash/linux:
drm/vmwgfx: Bump driver minor and date
drm/vmwgfx: Remove unused legacy cursor functions
drm/vmwgfx: fix spelling mistake "exeeds" -> "exceeds"
drm/vmwgfx: Fix large topology crash
drm/vmwgfx: Make sure to update STDU when FB is updated
drm/vmwgfx: Make sure backup_handle is always valid
drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve()
drm/vmwgfx: Don't create proxy surface for cursor
drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
drm/i915 fixes for v4.12-rc5
* tag 'drm-intel-fixes-2017-06-08' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: fix warning for unused variable
drm/i915: Fix 90/270 rotated coordinates for FBC
drm/i915: Restore has_fbc=1 for ILK-M
drm/i915: Workaround VLV/CHV DSI scanline counter hardware fail
drm/i915: Fix logical inversion for gen4 quirking
drm/i915: Guard against i915_ggtt_disable_guc() being invoked unconditionally
drm/i915: Always recompute watermarks when distrust_bios_wm is set, v2.
drm/i915: Prevent the system suspend complete optimization
drm/i915/psr: disable psr2 for resolution greater than 32X20
drm/i915: Hold a wakeref for probing the ring registers
drm/i915: Short-circuit i915_gem_wait_for_idle() if already idle
drm/i915: Disable decoupled MMIO
drm/i915/guc: Remove stale comment for q_fail
drm/i915: Serialize GTT/Aperture accesses on BXT
Driver Changes:
- kirin: Use correct dt port for the bridge (John)
- meson: Fix regression caused by adding HDMI support to allow board
configurations without HDMI (Neil)
Cc: John Stultz <john.stultz@linaro.org>
Cc: Neil Armstrong <narmstrong@baylibre.com>
* tag 'drm-misc-fixes-2017-06-07' of git://anongit.freedesktop.org/git/drm-misc:
drm/meson: Fix driver bind when only CVBS is available
drm: kirin: Fix drm_of_find_panel_or_bridge conversion
imx-drm: PRE clock gating, panelless LDB, and VDIC CSI selection fixes
- Keep the external clock input to the PRE ungated and only use the internal
soft reset to keep the module in low power state, to avoid sporadic startup
failures.
- Ignore -ENODEV return values from drm_of_find_panel_or_bridge in the LDB
driver to fix probing for devices that still do not specify a panel in the
device tree.
- Fix the CSI input selection to the VDIC. According to experiments, the real
behaviour differs a bit from the documentation.
* tag 'imx-drm-fixes-2017-06-08' of git://git.pengutronix.de/git/pza/linux:
gpu: ipu-v3: Fix CSI selection for VDIC
drm/imx: imx-ldb: Accept drm_of_find_panel_or_bridge failure
gpu: ipu-v3: pre: only use internal clock gating
Pull power management fixes from Rafael Wysocki:
"These revert one problematic commit related to system sleep and fix
one recent intel_pstate regression.
Specifics:
- Revert a recent commit that attempted to avoid spurious wakeups
from suspend-to-idle via ACPI SCI, but introduced regressions on
some systems (Rafael Wysocki).
We will get back to the problem it tried to address in the next
cycle.
- Fix a possible division by 0 during intel_pstate initialization
due to a missing check (Rafael Wysocki)"
* tag 'pm-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle"
cpufreq: intel_pstate: Avoid division by 0 in min_perf_pct_min()
Pull module maintainer address change from Jessica Yu:
"A single patch that advertises my email address change"
* tag 'modules-for-v4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
MAINTAINERS: update email address for Jessica Yu
Commit 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
adds the l3mdev FIB rule the first time a VRF device is created. However,
it only creates the rule once and only in the namespace the first device
is created - which may not be init_net. Fix by using the net_generic
capability to make the add_fib_rules flag per network namespace.
Fixes: 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
Reported-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I noticed that test_l4lb was failing in selftests:
# ./test_progs
test_pkt_access:PASS:ipv4 77 nsec
test_pkt_access:PASS:ipv6 44 nsec
test_xdp:PASS:ipv4 2933 nsec
test_xdp:PASS:ipv6 1500 nsec
test_l4lb:PASS:ipv4 377 nsec
test_l4lb:PASS:ipv6 544 nsec
test_l4lb:FAIL:stats 6297600000 200000
test_tcp_estats:PASS: 0 nsec
Summary: 7 PASSED, 1 FAILED
Tracking down the issue actually revealed that endianness selection
in bpf_endian.h is broken when compiled with clang with bpf target.
test_pkt_access.c, test_l4lb.c is compiled with __BYTE_ORDER as
__BIG_ENDIAN, test_xdp.c as __LITTLE_ENDIAN! test_l4lb noticeably
fails, because the test accounts bytes via bpf_ntohs(ip6h->payload_len)
and bpf_ntohs(iph->tot_len), and compares them against a defined
value and given a wrong endianness, the test outcome is different,
of course.
Turns out that there are actually two bugs: i) when we do __BYTE_ORDER
comparison with __LITTLE_ENDIAN/__BIG_ENDIAN, then depending on the
include order we see different outcomes. Reason is that __BYTE_ORDER
is undefined due to missing endian.h include. Before we include the
asm/byteorder.h (e.g. through linux/in.h), then __BYTE_ORDER equals
__LITTLE_ENDIAN since both are undefined, after the include which
correctly pulls in linux/byteorder/little_endian.h, __LITTLE_ENDIAN
is defined, but given __BYTE_ORDER is still undefined, we match on
__BYTE_ORDER equals to __BIG_ENDIAN since __BIG_ENDIAN is also
undefined at that point, sigh. ii) But even that would be wrong,
since when compiling the test cases with clang, one can select between
bpfeb and bpfel targets for cross compilation. Hence, we can also not
rely on what the system's endian.h provides, but we need to look at
the compiler's defined endianness. The compiler defines __BYTE_ORDER__,
and we can match __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__,
which also reflects targets bpf (native), bpfel, bpfeb correctly,
thus really only rely on that. After patch:
# ./test_progs
test_pkt_access:PASS:ipv4 74 nsec
test_pkt_access:PASS:ipv6 42 nsec
test_xdp:PASS:ipv4 2340 nsec
test_xdp:PASS:ipv6 1461 nsec
test_l4lb:PASS:ipv4 400 nsec
test_l4lb:PASS:ipv6 530 nsec
test_tcp_estats:PASS: 0 nsec
Summary: 7 PASSED, 0 FAILED
Fixes: 43bcf707cc ("bpf: fix _htons occurences in test_progs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each time a new speed is added, the bonding 802.3ad isn't updated. Add a
comment to remind the developer to update this driver.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds 14 Gbps enum definition, and fixes
aggregated bandwidth calculation based on above slave links.
Fixes: 0d7e2d2166 ("IB/ipoib: add get_link_ksettings in ethtool")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds [5|50] Gbps enum definition, and fixes
aggregated bandwidth calculation based on above slave links.
Fixes: c9a70d4346 ("net-next: ethtool: Added port speed macros.")
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first netlink attribute (value 0) must always be defined
as none/unspec.
Because we cannot change an existing UAPI, I add a comment to point the
mistake and avoid to propagate it in a new ovs API in the future.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix messages like this:
adv7842.c:(.text+0x2edadd): undefined reference to `cec_unregister_adapter'
when CEC_CORE=m but the driver including media/cec.h is built-in. In that case
the static inlines provided in media/cec.h should be used by that driver.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
While discussing the possible merits of clang warning about unused initialized
functions, I found one function that was clearly meant to be called but
never actually is.
__ila_hash_secret_init() initializes the hash value for the ila locator,
apparently this is intended to prevent hash collision attacks, but this ends
up being a read-only zero constant since there is no caller. I could find
no indication of why it was never called, the earliest patch submission
for the module already was like this. If my interpretation is right, we
certainly want to backport the patch to stable kernels as well.
I considered adding it to the ila_xlat_init callback, but for best effect
the random data is read as late as possible, just before it is first used.
The underlying net_get_random_once() is already highly optimized to avoid
overhead when called frequently.
Fixes: 7f00feaf10 ("ila: Add generic ILA translation facility")
Cc: stable@vger.kernel.org
Link: https://www.spinics.net/lists/kernel/msg2527243.html
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull printk fix from Petr Mladek:
"This reverts a fix added into 4.12-rc1. It caused the kernel log to be
printed on another console when two consoles of the same type were
defined, e.g. console=ttyS0 console=ttyS1.
This configuration was never supported by kernel itself, but it
started to make sense with systemd. In other words, the commit broke
userspace"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
Revert "printk: fix double printing with earlycon"
Pull crypto fixes from Herbert Xu:
"This fixes a couple of places in the crypto code that were doing
interruptible sleeps dangerously. They have been converted to use
non-interruptible sleeps.
This also fixes a bug in asymmetric_keys where it would trigger a
use-after-free if a request returned EBUSY due to a full device queue"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: gcm - wait for crypto op not signal safe
crypto: drbg - wait for crypto op not signal safe
crypto: asymmetric_keys - handle EBUSY due to backlog correctly
drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function ‘rtw_cfg80211_add_monitor_if’:
drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:2670:10: error: ‘struct net_device’ has no member named ‘destructor’
mon_ndev->destructor = rtw_ndev_destructor;
^
Signed-off-by: David S. Miller <davem@davemloft.net>
In blk-cgroup, operations on blkg objects are protected with the
request_queue lock. This is no more the lock that protects
I/O-scheduler operations in blk-mq. In fact, the latter are now
protected with a finer-grained per-scheduler-instance lock. As a
consequence, although blkg lookups are also rcu-protected, blk-mq I/O
schedulers may see inconsistent data when they access blkg and
blkg-related objects. BFQ does access these objects, and does incur
this problem, in the following case.
The blkg_lookup performed in bfq_get_queue, being protected (only)
through rcu, may happen to return the address of a copy of the
original blkg. If this is the case, then the blkg_get performed in
bfq_get_queue, to pin down the blkg, is useless: it does not prevent
blk-cgroup code from destroying both the original blkg and all objects
directly or indirectly referred by the copy of the blkg. BFQ accesses
these objects, which typically causes a crash for NULL-pointer
dereference of memory-protection violation.
Some additional protection mechanism should be added to blk-cgroup to
address this issue. In the meantime, this commit provides a quick
temporary fix for BFQ: cache (when safe) blkg data that might
disappear right after a blkg_lookup.
In particular, this commit exploits the following facts to achieve its
goal without introducing further locks. Destroy operations on a blkg
invoke, as a first step, hooks of the scheduler associated with the
blkg. And these hooks are executed with bfqd->lock held for BFQ. As a
consequence, for any blkg associated with the request queue an
instance of BFQ is attached to, we are guaranteed that such a blkg is
not destroyed, and that all the pointers it contains are consistent,
while that instance is holding its bfqd->lock. A blkg_lookup performed
with bfqd->lock held then returns a fully consistent blkg, which
remains consistent until this lock is held. In more detail, this holds
even if the returned blkg is a copy of the original one.
Finally, also the object describing a group inside BFQ needs to be
protected from destruction on the blkg_free of the original blkg
(which invokes bfq_pd_free). This commit adds private refcounting for
this object, to let it disappear only after no bfq_queue refers to it
any longer.
This commit also removes or updates some stale comments on locking
issues related to blk-cgroup operations.
Reported-by: Tomas Konir <tomas.konir@gmail.com>
Reported-by: Lee Tibbert <lee.tibbert@gmail.com>
Reported-by: Marco Piazza <mpiazza@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Tested-by: Tomas Konir <tomas.konir@gmail.com>
Tested-by: Lee Tibbert <lee.tibbert@gmail.com>
Tested-by: Marco Piazza <mpiazza@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Stephen Hemminger says:
====================
netvsc: bug fixes
These are bugfixes for netvsc driver in 4.12.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The work queue and handling of network filter parameters should
be in rndis_device. This gets rid of warning from RCU checks,
eliminates a race and cleans up code.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ndo_poll_controller function needs to schedule NAPI to pick
up arriving packets and send completions. Otherwise no data
will ever be received. For simple case of netconsole, it also
will allow send completions to happen. Without this netpoll
will eventually get stuck.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool info command calls the netvsc get_sset_count with RTNL
but not with RCU. Which causes warning:
drivers/net/hyperv/netvsc_drv.c:1010 suspicious rcu_dereference_check() usage!
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting
down a guest running iperf on a VFIO assigned device. This happens
because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt
context, while a worker thread does the same inside kvm_set_irq(). If the
interrupt happens while the worker thread is executing __srcu_read_lock(),
updates to the Classic SRCU ->lock_count[] field or the Tree SRCU
->srcu_lock_count[] field can be lost.
The docs say you are not supposed to call srcu_read_lock() and
srcu_read_unlock() from irq context, but KVM interrupt injection happens
from (host) interrupt context and it would be nice if SRCU supported the
use case. KVM is using SRCU here not really for the "sleepable" part,
but rather due to its IPI-free fast detection of grace periods. It is
therefore not desirable to switch back to RCU, which would effectively
revert commit 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING",
2014-01-16).
However, the docs are overly conservative. You can have an SRCU instance
only has users in irq context, and you can mix process and irq context
as long as process context users disable interrupts. In addition,
__srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and
Classic SRCU. For those two implementations, only srcu_read_lock()
is unsafe.
When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(),
in commit 5a41344a3d ("srcu: Simplify __srcu_read_unlock() via
this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments.
Therefore it kept __this_cpu_inc(), with preempt_disable/enable in
the caller. Tree SRCU however only does one increment, so on most
architectures it is more efficient for __srcu_read_lock() to use
this_cpu_inc(), and any performance differences appear to be down in
the noise.
Cc: stable@vger.kernel.org
Fixes: 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING")
Reported-by: Linu Cherian <linuc.decode@gmail.com>
Suggested-by: Linu Cherian <linuc.decode@gmail.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting
down a guest running iperf on a VFIO assigned device. This happens
because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt
context, while a worker thread does the same inside kvm_set_irq(). If the
interrupt happens while the worker thread is executing __srcu_read_lock(),
updates to the Classic SRCU ->lock_count[] field or the Tree SRCU
->srcu_lock_count[] field can be lost.
The docs say you are not supposed to call srcu_read_lock() and
srcu_read_unlock() from irq context, but KVM interrupt injection happens
from (host) interrupt context and it would be nice if SRCU supported the
use case. KVM is using SRCU here not really for the "sleepable" part,
but rather due to its IPI-free fast detection of grace periods. It is
therefore not desirable to switch back to RCU, which would effectively
revert commit 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING",
2014-01-16).
However, the docs are overly conservative. You can have an SRCU instance
only has users in irq context, and you can mix process and irq context
as long as process context users disable interrupts. In addition,
__srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and
Classic SRCU. For those two implementations, only srcu_read_lock()
is unsafe.
When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(),
in commit 5a41344a3d ("srcu: Simplify __srcu_read_unlock() via
this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments.
Therefore it kept __this_cpu_inc(), with preempt_disable/enable in
the caller. Tree SRCU however only does one increment, so on most
architectures it is more efficient for __srcu_read_lock() to use
this_cpu_inc(), and any performance differences appear to be down in
the noise.
Unlike Classic and Tree SRCU, Tiny SRCU does increments and decrements on
a single variable. Therefore, as Peter Zijlstra pointed out, Tiny SRCU's
implementation already supports mixed-context use of srcu_read_lock()
and srcu_read_unlock(), at least as long as uses of srcu_read_lock()
and srcu_read_unlock() in each handler are nested and paired properly.
In other words, it is still illegal to (say) invoke srcu_read_lock()
in an interrupt handler and to invoke the matching srcu_read_unlock()
in a softirq handler. Therefore, the only change required for Tiny SRCU
is to its comments.
Fixes: 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING")
Reported-by: Linu Cherian <linuc.decode@gmail.com>
Suggested-by: Linu Cherian <linuc.decode@gmail.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
The 0-day kernel test robot reports assertion failures on
!CONFIG_SMP kernels due to failed spin_is_locked() checks. As it
turns out, spin_is_locked() is hardcoded to return zero on
!CONFIG_SMP kernels and so this function cannot be relied on to
verify spinlock state in this configuration.
To avoid this problem, replace the associated asserts with lockdep
variants that do the right thing regardless of kernel configuration.
Drop the one assert that checks for an unlocked lock as there is no
suitable lockdep variant for that case. This moves the spinlock
checks from XFS debug code to lockdep, but generally provides the
same level of protection.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Roopa reported attempts to delete a bond device that is referenced in a
multipath route is hanging:
$ ifdown bond2 # ifupdown2 command that deletes virtual devices
unregister_netdevice: waiting for bond2 to become free. Usage count = 2
Steps to reproduce:
echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
ip link add dev bond12 type bond
ip link add dev bond13 type bond
ip addr add 2001:db8:2::0/64 dev bond12
ip addr add 2001:db8:3::0/64 dev bond13
ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2
ip link del dev bond12
ip link del dev bond13
The root cause is the recent change to keep routes on a linkdown. Update
the check to detect when the device is unregistering and release the
route for that case.
Fixes: a1a22c1206 ("net: ipv6: Keep nexthop of multipath route on admin down")
Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the structure's fields are not initialized by the
rtnetlink. If driver doesn't set those in ndo_get_vf_config(),
they'd leak memory to user.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
CC: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Verify that the length of the socket buffer is sufficient to cover the
nlmsghdr structure before accessing the nlh->nlmsg_len field for further
input sanitization. If the client only supplies 1-3 bytes of data in
sk_buff, then nlh->nlmsg_len remains partially uninitialized and
contains leftover memory from the corresponding kernel allocation.
Operating on such data may result in indeterminate evaluation of the
nlmsg_len < sizeof(*nlh) expression.
The bug was discovered by a runtime instrumentation designed to detect
use of uninitialized memory in the kernel. The patch prevents this and
other similar tools (e.g. KMSAN) from flagging this behavior in the future.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 85eac2ba35.
There is an updated version of this fix which we should
use instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
emac_mdio_read_link() was not copying the requested phy settings
back into the emac driver's own phy api. This has caused a link
speed mismatch issue for the AR8035 as the emac driver kept
trying to connect with 10/100MBps on a 1GBit/s link.
This patch also unifies shared code between emac_setup_aneg()
and emac_mdio_setup_forced(). And furthermore it removes
a chunk of emac_mdio_init_phy(), that was copying the same
data into itself.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a problem where the AR8035 PHY can't be
detected on an Cisco Meraki MR24, if the ethernet cable is
not connected on boot.
Russell Senior provided steps to reproduce the issue:
|Disconnect ethernet cable, apply power, wait until device has booted,
|plug in ethernet, check for interfaces, no eth0 is listed.
|
|This appears to be a problem during probing of the AR8035 Phy chip.
|When ethernet has no link, the phy detection fails, and eth0 is not
|created. Plugging ethernet later has no effect, because there is no
|interface as far as the kernel is concerned. The relevant part of
|the boot log looks like this:
|this is the failing case:
|
|[ 0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.882532] /plb/opb/ethernet@ef600c00: reset timeout
|[ 0.888546] /plb/opb/ethernet@ef600c00: can't find PHY!
|and the succeeding case:
|
|[ 0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:..
|[ 0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01)
Based on the comment and the commit message of
commit 23fbb5a87c ("emac: Fix EMAC soft reset on 460EX/GT").
This is because the AR8035 PHY doesn't provide the TX Clock,
if the ethernet cable is not attached. This causes the reset
to timeout and the PHY detection code in emac_init_phy() is
unable to detect the AR8035 PHY. As a result, the emac driver
bails out early and the user left with no ethernet.
In order to stay compatible with existing configurations, the driver
tries the current reset approach at first. Only if the first attempt
timed out, it does perform one more retry with the clock temporarily
switched to the internal source for just the duration of the reset.
LEDE-Bug: #687 <https://bugs.lede-project.org/index.php?do=details&task_id=687>
Cc: Chris Blake <chrisrblake93@gmail.com>
Reported-by: Russell Senior <russell@personaltelco.net>
Fixes: 23fbb5a87c ("emac: Fix EMAC soft reset on 460EX/GT")
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Verify that the length of the socket buffer is sufficient to cover the
entire nlh->nlmsg_len field before accessing that field for further
input sanitization. If the client only supplies 1-3 bytes of data in
sk_buff, then nlh->nlmsg_len remains partially uninitialized and
contains leftover memory from the corresponding kernel allocation.
Operating on such data may result in indeterminate evaluation of the
nlmsg_len < sizeof(*nlh) expression.
The bug was discovered by a runtime instrumentation designed to detect
use of uninitialized memory in the kernel. The patch prevents this and
other similar tools (e.g. KMSAN) from flagging this behavior in the future.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph writes:
"A few NVMe fixes for 4.12-rc, PCIe reset fixes and APST fixes, a
RDMA reconnect fix, two FC fixes and a general controller removal fix."
> ../drivers/hsi/clients/ssi_protocol.c:1069:5: error: 'struct net_device' has no member named 'destructor'
Reported-by: Mark Brown <broonie@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/gpu/drm/i915/intel_engine_cs.c: In function ‘intel_engine_is_idle’:
drivers/gpu/drm/i915/intel_engine_cs.c:1103:27: error: unused variable ‘dev_priv’ [-Werror=unused-variable]
struct drm_i915_private *dev_priv = engine->i915;
^~~~~~~~
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
While installing SLES-12 (based on v4.4), I found that the installer
will stall for 60+ seconds during LVM disk scan. The root cause was
determined to be the removal of a bound device check in loop_flush()
by commit b5dd2f6047 ("block: loop: improve performance via blk-mq").
Restoring this check, examining ->lo_state as set by loop_set_fd()
eliminates the bad behavior.
Test method:
modprobe loop max_loop=64
dd if=/dev/zero of=disk bs=512 count=200K
for((i=0;i<4;i++))do losetup -f disk; done
mkfs.ext4 -F /dev/loop0
for((i=0;i<4;i++))do mkdir t$i; mount /dev/loop$i t$i;done
for f in `ls /dev/loop[0-9]*|sort`; do \
echo $f; dd if=$f of=/dev/null bs=512 count=1; \
done
Test output: stock patched
/dev/loop0 18.1217e-05 8.3842e-05
/dev/loop1 6.1114e-05 0.000147979
/dev/loop10 0.414701 0.000116564
/dev/loop11 0.7474 6.7942e-05
/dev/loop12 0.747986 8.9082e-05
/dev/loop13 0.746532 7.4799e-05
/dev/loop14 0.480041 9.3926e-05
/dev/loop15 1.26453 7.2522e-05
Note that from loop10 onward, the device is not mounted, yet the
stock kernel consumes several orders of magnitude more wall time
than it does for a mounted device.
(Thanks for Mike Galbraith <efault@gmx.de>, give a changelog review.)
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: James Wang <jnwang@suse.com>
Fixes: b5dd2f6047 ("block: loop: improve performance via blk-mq")
Signed-off-by: Jens Axboe <axboe@fb.com>
If "i" is the last element in the vcpu->arch.cpuid_entries[] array, it
potentially can be exploited the vulnerability. this will out-of-bounds
read and write. Luckily, the effect is small:
/* when no next entry is found, the current entry[i] is reselected */
for (j = i + 1; ; j = (j + 1) % nent) {
struct kvm_cpuid_entry2 *ej = &vcpu->arch.cpuid_entries[j];
if (ej->function == e->function) {
It reads ej->maxphyaddr, which is user controlled. However...
ej->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
After cpuid_entries there is
int maxphyaddr;
struct x86_emulate_ctxt emulate_ctxt; /* 16-byte aligned */
So we have:
- cpuid_entries at offset 1B50 (6992)
- maxphyaddr at offset 27D0 (6992 + 3200 = 10192)
- padding at 27D4...27DF
- emulate_ctxt at 27E0
And it writes in the padding. Pfew, writing the ops field of emulate_ctxt
would have been much worse.
This patch fixes it by modding the index to avoid the out-of-bounds
access. Worst case, i == j and ej->function == e->function,
the loop can bail out.
Reported-by: Moguofang <moguofang@huawei.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Guofang Mo <moguofang@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM/ARM Fixes for v4.12-rc5 - Take 2
Changes include:
- Fix an issue with migrating GICv2 VMs on GICv3 systems.
- Squashed a bug for gicv3 when figuring out preemption levels.
- Fix a potential null pointer derefence in KVM happening under memory
pressure.
- Maintain RES1 bits in the SCTLR_EL2 to make sure KVM works on new
architecture revisions.
- Allow unaligned accesses at EL2/HYP
Since introduction of tracing for init functions the in_kernel_space()
check is no longer correct, as it ignores the init sections. As a
result, when probes are inserted (and disabled) in the init functions,
a branch instruction is inserted instead of a nop, which is likely to
result in random crashes during boot.
Remove the MIPS-specific in_kernel_space() method and replace it with a
generic core_kernel_text() that also checks for init sections during
system boot stage.
Fixes: 42c269c88d ("ftrace: Allow for function tracing to record init functions on boot up")
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Tested-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16092/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Space reserved for PKMap should span from PKMAP_BASE to FIXADDR_START.
For large page sizes this is not the case as eg. for 64k pages the range
currently defined is from 0xfe000000 to 0x102000000(!!) which obviously
isn't right.
Remove the hardcoded location and set the BASE address as an offset from
FIXADDR_START.
Since all PKMAP ptes have to be placed in a contiguous memory, ensure
that this is the case by placing them all in a single page. This is
achieved by aligning the end address to pkmap pages count pages.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15950/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
All PTEs used by PKMAP should be allocated in a contiguous memory area,
but we do not currently have a mechanism to enforce that, so ensure that
we don't try to allocate more entries than would fit in a single page.
Current fixed value of 1024 would not work with XPA enabled when
sizeof(pte_t)==8 and we need two pages to store pte tables.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15949/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
fixrange_init operates at PMD-granularity and expects the addresses to
be PMD-size aligned, but currently that might not be the case for
PKMAP_BASE unless it is defined properly, so ensure a correct alignment
is used before passing the address to fixrange_init.
fixed mappings: only align the start address that is passed to
fixrange_init rather than the value before adding the size, as we may
end up with uninitialised upper part of the range.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15948/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The PPC_DT_CPU_FTRs is a bit misplaced in menuconfig, it shows up with
other general kernel options. It's really more at home in the "Platform
Support" section, so move it there.
Also enable it by default, for Book3s 64. It does mostly nothing unless
the device tree properties are found, and we will want it enabled
eventually in distro kernels, so turn it on to start getting more
testing.
Fixes: 5a61ef74f2 ("powerpc/64s: Support new device tree binding for discovering CPU features")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Supporting 512TB requires us to do a order 3 allocation for level 1 page
table (pgd). This results in page allocation failures with certain workloads.
For now limit 4k linux page size config to 64TB.
Fixes: f6eedbba7a ("powerpc/mm/hash: Increase VA range to 128TB")
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When calling CEC_RECEIVE do not check if the adapter is configured.
Typically CEC_RECEIVE is called after a select() and if that indicates
that there are messages in the receive queue, then you should always be
able to dequeue a message.
The race condition here is that a message has been received and is
queued, so select() tells userspace that a message is available. But
before the application calls CEC_RECEIVE the adapter is unconfigured
(e.g. the HDMI cable is removed). Now select will always report that
there is a message, but calling CEC_RECEIVE will always return -ENONET
because the adapter is no longer configured and so will never actually
dequeue the message.
There is really no need for this check, and in fact the ENONET error
code was never documented for CEC_RECEIVE. This may have been a left-over
of old code that was never updated.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: <stable@vger.kernel.org> # for v4.10 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This reverts commit cf39bf58af.
The commit regression to users that define both console=ttyS1
and console=ttyS0 on the command line, see
https://lkml.kernel.org/r/20170509082915.GA13236@bistromath.localdomain
The kernel log messages always appeared only on one serial port. It is
even documented in Documentation/admin-guide/serial-console.rst:
"Note that you can only define one console per device type (serial,
video)."
The above mentioned commit changed the order in which the command line
parameters are searched. As a result, the kernel log messages go to
the last mentioned ttyS* instead of the first one.
We long thought that using two console=ttyS* on the command line
did not make sense. But then we realized that console= parameters
were handled also by systemd, see
http://0pointer.de/blog/projects/serial-console.html
"By default systemd will instantiate one serial-getty@.service on
the main kernel console, if it is not a virtual terminal."
where
"[4] If multiple kernel consoles are used simultaneously, the main
console is the one listed first in /sys/class/tty/console/active,
which is the last one listed on the kernel command line."
This puts the original report into another light. The system is running
in qemu. The first serial port is used to store the messages into a file.
The second one is used to login to the system via a socket. It depends
on systemd and the historic kernel behavior.
By other words, systemd causes that it makes sense to define both
console=ttyS1 console=ttyS0 on the command line. The kernel fix
caused regression related to userspace (systemd) and need to be
reverted.
In addition, it went out that the fix helped only partially.
The messages still were duplicated when the boot console was
removed early by late_initcall(printk_late_init). Then the entire
log was replayed when the same console was registered as a normal one.
Link: 20170606160339.GC7604@pathway.suse.cz
Cc: Aleksey Makarov <aleksey.makarov@linaro.org>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Robin Murphy <robin.murphy@arm.com>,
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "Nair, Jayachandran" <Jayachandran.Nair@cavium.com>
Cc: linux-serial@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
On sparc, if we have an alloca() like situation, as is the case with
SHASH_DESC_ON_STACK(), we can end up referencing deallocated stack
memory. The result can be that the value is clobbered if a trap
or interrupt arrives at just the right instruction.
It only occurs if the function ends returning a value from that
alloca() area and that value can be placed into the return value
register using a single instruction.
For example, in lib/libcrc32c.c:crc32c() we end up with a return
sequence like:
return %i7+8
lduw [%o5+16], %o0 ! MEM[(u32 *)__shash_desc.1_10 + 16B],
%o5 holds the base of the on-stack area allocated for the shash
descriptor. But the return released the stack frame and the
register window.
So if an intererupt arrives between 'return' and 'lduw', then
the value read at %o5+16 can be corrupted.
Add a data compiler barrier to work around this problem. This is
exactly what the gcc fix will end up doing as well, and it absolutely
should not change the code generated for other cpus (unless gcc
on them has the same bug :-)
With crucial insight from Eric Sandeen.
Cc: <stable@vger.kernel.org>
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
During early boot, load_ucode_intel_ap() uses __load_ucode_intel()
to obtain a pointer to the relevant microcode patch (embedded in the
initrd), and stores this value in 'intel_ucode_patch' to speed up the
microcode patch application for subsequent CPUs.
On resuming from suspend-to-RAM, however, load_ucode_ap() calls
load_ucode_intel_ap() for each non-boot-CPU. By then the initramfs is
long gone so the pointer stored in 'intel_ucode_patch' no longer points to
a valid microcode patch.
Clear that pointer so that we effectively fall back to the CPU hotplug
notifier callbacks to update the microcode.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
[ Edit and massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.10..
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170607095819.9754-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I will be traveling in the upcoming months and it'll be much easier for me
to access my kernel.org email rather than my work one. Change my email
address in the MAINTAINERS file from jeyu@redhat.com to jeyu@kernel.org.
Signed-off-by: Jessica Yu <jeyu@redhat.com>
It's possible that get_random_{u32,u64} is used before the crng has
initialized, in which case, its output might not be cryptographically
secure. For this problem, directly, this patch set is introducing the
*_wait variety of functions, but even with that, there's a subtle issue:
what happens to our batched entropy that was generated before
initialization. Prior to this commit, it'd stick around, supplying bad
numbers. After this commit, we force the entropy to be re-extracted
after each phase of the crng has initialized.
In order to avoid a race condition with the position counter, we
introduce a simple rwlock for this invalidation. Since it's only during
this awkward transition period, after things are all set up, we stop
using it, so that it doesn't have an impact on performance.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org # v4.11+
Linus pointed out that there is a much more efficient way of avoiding
the problem that we were trying to address in commit 9dfa7bba35:
"fix race in drivers/char/random.c:get_reg()".
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Network devices can allocate reasources and private memory using
netdev_ops->ndo_init(). However, the release of these resources
can occur in one of two different places.
Either netdev_ops->ndo_uninit() or netdev->destructor().
The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.
netdev_ops->ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.
netdev->destructor(), on the other hand, does not run until the
netdev references all go away.
Further complicating the situation is that netdev->destructor()
almost universally does also a free_netdev().
This creates a problem for the logic in register_netdevice().
Because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.
If netdev_ops->ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops->ndo_uninit(). But
it is not able to invoke netdev->destructor().
This is because netdev->destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.
However, this means that the resources that would normally be released
by netdev->destructor() will not be.
Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.
Many drivers do not try to deal with this, and instead we have leaks.
Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev->destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().
netdev->priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev->destructor(), except for
free_netdev().
netdev->needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().
Now, register_netdevice() can sanely release all resources after
ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
and netdev->priv_destructor().
And at the end of unregister_netdevice(), we invoke
netdev->priv_destructor() and optionally call free_netdev().
Signed-off-by: David S. Miller <davem@davemloft.net>
Will reported that in BPF_XADD we must use a different register in stxr
instruction for the status flag due to otherwise CONSTRAINED UNPREDICTABLE
behavior per architecture. Reference manual says [1]:
If s == t, then one of the following behaviors must occur:
* The instruction is UNDEFINED.
* The instruction executes as a NOP.
* The instruction performs the store to the specified address, but
the value stored is UNKNOWN.
Thus, use a different temporary register for the status flag to fix it.
Disassembly extract from test 226/STX_XADD_DW from test_bpf.ko:
[...]
0000003c: c85f7d4b ldxr x11, [x10]
00000040: 8b07016b add x11, x11, x7
00000044: c80c7d4b stxr w12, x11, [x10]
00000048: 35ffffac cbnz w12, 0x0000003c
[...]
[1] https://static.docs.arm.com/ddi0487/b/DDI0487B_a_armv8_arm.pdf, p.6132
Fixes: 85f68fe898 ("bpf, arm64: implement jiting of BPF_XADD")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvpp22_port_mii_set() function was added by 2697582144, but the
function directly returns without doing anything. This return was used
when debugging and wasn't removed before sending the patch. Fix this.
Fixes: 2697582144 ("net: mvpp2: handle misc PPv2.1/PPv2.2 differences")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing the mtu is currently not supported in the ibmvnic driver.
Implement .ndo_change_mtu in the driver so that attempting to use ifconfig
to change the mtu will fail and present the user with an error message.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit eea40b8f62 ("infiniband: call ipv6 route lookup via the stub
interface") introduced a regression in address resolution when connecting
to IPv6 destination addresses. The old code called ip6_route_output(),
while the new code calls ipv6_stub->ipv6_dst_lookup(). The two are almost
the same, except that ipv6_dst_lookup() also calls ip6_route_get_saddr()
if the source address is in6addr_any.
This means that the test of ipv6_addr_any(&fl6.saddr) now never succeeds,
and so we never copy the source address out. This ends up causing
rdma_resolve_addr() to fail, because without a resolved source address,
cma_acquire_dev() will fail to find an RDMA device to use. For me, this
causes connecting to an NVMe over Fabrics target via RoCE / IPv6 to fail.
Fix this by copying out fl6.saddr if ipv6_addr_any() is true for the original
source address passed into addr6_resolve(). We can drop our call to
ipv6_dev_get_saddr() because ipv6_dst_lookup() already does that work.
Fixes: eea40b8f62 ("infiniband: call ipv6 route lookup via the stub interface")
Cc: <stable@vger.kernel.org> # 3.12+
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The Lifebook E546 and E557 touchpad were also not functioning and
worked after running:
echo "1" > /sys/devices/platform/i8042/serio2/crc_enabled
Add them to the list of machines that need this workaround.
Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Reviewed-by: Arjan Opmeer <arjan@opmeer.net>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
When freeing VF's DMA mappings, an already NULLed pointer was checked
again due to an apparent copy&paste error. Consequently, the pf2vf
bulletin DMA mapping was not freed.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KMSAN reported a use of uninitialized memory in dev_set_alias(),
which was caused by calling strlcpy() (which in turn called strlen())
on the user-supplied non-terminated string.
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Only print NMI watchdog hint in 'perf stat' when it is enabled (Andi Kleen)
- Fix sys_mmap/sys_old_mmap shandling in s390 in 'perf trace' (Jiri Olsa)
- Disable breakpoint signal tests in powerpc, that lacks the perf kernel
glue to set breakpoint events and makes 'perf test' always fail (Jiri Olsa)
- Fix 'perf annotate' for branch instruction with multiple operands (Kim Phillips)
- Add missing powerpc triplet when disassembling with 'objdump' in 'perf
annotate' (Kim Phillips)
- Do not trow away partial unwound stacks when using libdw, making
callchains produced with it similar to those produced when linked with
the other DWARF unwind library supported in perf, libunwind (Milian Wolff)
- Fixes to properly handle kernel modules when processing build-id meta
events (Namhyung Kim)
- Fix handling of compressed modules in the build-id cache (Namhyung Kim)
- Fix 'perf annotate' failure when filename has special chars (Ravi Bangoria)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
hard disk IO latency varies a lot depending on spindle move. The latency
range could be from several microseconds to several milliseconds. It's
pretty hard to get the baseline latency used by io.low.
We will use a different stragety here. The idea is only using IO with
spindle move to determine if cgroup IO is in good state. For HD, if io
latency is small (< 1ms), we ignore the IO. Such IO is likely from
sequential IO, and is helpless to help determine if a cgroup's IO is
impacted by other cgroups. With this, we only account IO with big
latency. Then we can choose a hardcoded baseline latency for HD (4ms,
which is typical IO latency with seek). With all these settings, the
io.low latency works for both HD and SSD.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
While introducing HDMI support, component matching on connectors node
were bypassed since no driver would actually bind on the DT node.
But when only a CVBS connector is present, only a single node is found
in the graph, but ignored and a NULL match table is given to the
component code.
This code permits bypassing the components framework by binding directly
the DRM driver when no components needs to be loaded.
Fixes: a41e82e6c4 ("drm/meson: Add support for components")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1496067352-8733-1-git-send-email-narmstrong@baylibre.com
I have encountered a NULL pointer dereference in
throtl_schedule_pending_timer:
[ 413.735396] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[ 413.735535] IP: [<ffffffff812ebbbf>] throtl_schedule_pending_timer+0x3f/0x210
[ 413.735643] PGD 22c8cf067 PUD 22cb34067 PMD 0
[ 413.735713] Oops: 0000 [#1] SMP
......
This is caused by the following case:
blk_throtl_bio
throtl_schedule_next_dispatch <= sq is top level one without parent
throtl_schedule_pending_timer
sq_to_tg(sq)->td->throtl_slice <= sq_to_tg(sq) returns NULL
Fix it by using sq_to_td instead of sq_to_tg(sq)->td, which will always
return a valid td.
Fixes: 297e3d8547 ("blk-throttle: make throtl_slice tunable")
Signed-off-by: Joseph Qi <qijiang.qj@alibaba-inc.com>
Reviewed-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The scanline counter is bonkers on VLV/CHV DSI. The scanline counter
increment is not lined up with the start of vblank like it is on
every other platform and output type. This causes problems for
both the vblank timestamping and atomic update vblank evasion.
On my FFRD8 machine at least, the scanline counter increment
happens about 1/3 of a scanline ahead of the start of vblank (which
is where all register latching happens still). That means we can't
trust the scanline counter to tell us whether we're in vblank or not
while we're on that particular line. In order to keep vblank
timestamping in working condition when called from the vblank irq,
we'll leave scanline_offset at one, which means that the entire
line containing the start of vblank is considered to be inside
the vblank.
For the vblank evasion we'll need to consider that entire line
to be bad, since we can't tell whether the registers already
got latched or not. And we can't actually use the start of vblank
interrupt to get us past that line as the interrupt would fire
too soon, and then we'd up waiting for the next start of vblank
instead. One way around that would using the frame start
interrupt instead since that wouldn't fire until the next
scanline, but that would require some bigger changes in the
interrupt code. So for simplicity we'll just poll until we get
past the bad line.
v2: Adjust the comments a bit
Cc: stable@vger.kernel.org
Cc: Jonas Aaberg <cja@gmx.net>
Tested-by: Jonas Aaberg <cja@gmx.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99086
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161215174734.28779-1-ville.syrjala@linux.intel.com
Tested-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
(cherry picked from commit ec1b4ee283)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Since
commit bac2a909a0
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Wed Jan 21 02:17:42 2015 +0100
PCI / PM: Avoid resuming PCI devices during system suspend
PCI devices will default to allowing the system suspend complete
optimization where devices are not woken up during system suspend if
they were already runtime suspended. This however breaks the i915/HDA
drivers for two reasons:
- The i915 driver has system suspend specific steps that it needs to
run, that bring the device to a different state than its runtime
suspended state.
- The HDA driver's suspend handler requires power that it will request
from the i915 driver's power domain handler. This in turn requires the
i915 driver to runtime resume itself, but this won't be possible if the
suspend complete optimization is in effect: in this case the i915
runtime PM is disabled and trying to get an RPM reference returns
-EACCESS.
Solve this by requiring the PCI/PM core to resume the device during
system suspend which in effect disables the suspend complete optimization.
Regardless of the above commit the optimization stayed disabled for DRM
devices until
commit d14d2a8453
Author: Lukas Wunner <lukas@wunner.de>
Date: Wed Jun 8 12:49:29 2016 +0200
drm: Remove dev_pm_ops from drm_class
so this patch is in practice a fix for this commit. Another reason for
the bug staying hidden for so long is that the optimization for a device
is disabled if it's disabled for any of its children devices. i915 may
have a backlight device as its child which doesn't support runtime PM
and so doesn't allow the optimization either. So if this backlight
device got registered the bug stayed hidden.
Credits to Marta, Tomi and David who enabled pstore logging,
that caught one instance of this issue across a suspend/
resume-to-ram and Ville who rememberd that the optimization was enabled
for some devices at one point.
The first WARN triggered by the problem:
[ 6250.746445] WARNING: CPU: 2 PID: 17384 at drivers/gpu/drm/i915/intel_runtime_pm.c:2846 intel_runtime_pm_get+0x6b/0xd0 [i915]
[ 6250.746448] pm_runtime_get_sync() failed: -13
[ 6250.746451] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core ptp mei_me pps_core snd_pcm lpc_ich mei prime_
numbers i2c_hid i2c_designware_platform i2c_designware_core [last unloaded: i915]
[ 6250.746512] CPU: 2 PID: 17384 Comm: kworker/u8:0 Tainted: G U W 4.11.0-rc5-CI-CI_DRM_334+ #1
[ 6250.746515] Hardware name: /NUC5i5RYB, BIOS RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
[ 6250.746521] Workqueue: events_unbound async_run_entry_fn
[ 6250.746525] Call Trace:
[ 6250.746530] dump_stack+0x67/0x92
[ 6250.746536] __warn+0xc6/0xe0
[ 6250.746542] ? pci_restore_standard_config+0x40/0x40
[ 6250.746546] warn_slowpath_fmt+0x46/0x50
[ 6250.746553] ? __pm_runtime_resume+0x56/0x80
[ 6250.746584] intel_runtime_pm_get+0x6b/0xd0 [i915]
[ 6250.746610] intel_display_power_get+0x1b/0x40 [i915]
[ 6250.746646] i915_audio_component_get_power+0x15/0x20 [i915]
[ 6250.746654] snd_hdac_display_power+0xc8/0x110 [snd_hda_core]
[ 6250.746661] azx_runtime_resume+0x218/0x280 [snd_hda_intel]
[ 6250.746667] pci_pm_runtime_resume+0x76/0xa0
[ 6250.746672] __rpm_callback+0xb4/0x1f0
[ 6250.746677] ? pci_restore_standard_config+0x40/0x40
[ 6250.746682] rpm_callback+0x1f/0x80
[ 6250.746686] ? pci_restore_standard_config+0x40/0x40
[ 6250.746690] rpm_resume+0x4ba/0x740
[ 6250.746698] __pm_runtime_resume+0x49/0x80
[ 6250.746703] pci_pm_suspend+0x57/0x140
[ 6250.746709] dpm_run_callback+0x6f/0x330
[ 6250.746713] ? pci_pm_freeze+0xe0/0xe0
[ 6250.746718] __device_suspend+0xf9/0x370
[ 6250.746724] ? dpm_watchdog_set+0x60/0x60
[ 6250.746730] async_suspend+0x1a/0x90
[ 6250.746735] async_run_entry_fn+0x34/0x160
[ 6250.746741] process_one_work+0x1f2/0x6d0
[ 6250.746749] worker_thread+0x49/0x4a0
[ 6250.746755] kthread+0x107/0x140
[ 6250.746759] ? process_one_work+0x6d0/0x6d0
[ 6250.746763] ? kthread_create_on_node+0x40/0x40
[ 6250.746768] ret_from_fork+0x2e/0x40
[ 6250.746778] ---[ end trace 102a62fd2160f5e6 ]---
v2:
- Use the new pci_dev->needs_resume flag, to avoid any overhead during
the ->pm_prepare hook. (Rafael)
v3:
- Update commit message to reference the actual regressing commit.
(Lukas)
v4:
- Rebase on v4 of patch 1/2.
Fixes: d14d2a8453 ("drm: Remove dev_pm_ops from drm_class")
References: https://bugs.freedesktop.org/show_bug.cgi?id=100378
References: https://bugs.freedesktop.org/show_bug.cgi?id=100770
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: linux-pci@vger.kernel.org
Cc: <stable@vger.kernel.org> # v4.10.x: 4d071c3 - PCI/PM: Add needs_resume flag
Cc: <stable@vger.kernel.org> # v4.10.x
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-and-tested-by: Marta Lofstedt <marta.lofstedt@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1493726649-32094-2-git-send-email-imre.deak@intel.com
(cherry picked from commit adfdf85d79)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
While the atomic modesetting capability is signaled also elsewhere, also
reflect it by a driver minor bump.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
These function implementations and/or declarations are no longer used
now that atomic is enabled.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reported-by: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Trivial fix to spelling mistake in DRM_ERROR error message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
The previous attempt at this had an issue with with num_clips > 1
because it would always end up using the coordinates of the last
clip while using width and height calculated from the bounding
box of all the clips.
So if the last clip happens to be not at the top-left corner of
the bounding box, the CPU blit operation would go out of bounds.
The original intent was to coalesce all the clips into one blit,
and to do that we need to also track the starting point of the
content buffer.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
When a new FB is bound, we have to send an update command otherwise
the new FB may not be shown
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
When vmw_gb_surface_define_ioctl() is called with an existing buffer,
we end up returning an uninitialized variable in the backup_handle.
The fix is to first initialize backup_handle to 0 just to be sure, and
second, when a user-provided buffer is found, we will use the
req->buffer_handle as the backup_handle.
Cc: <stable@vger.kernel.org>
Reported-by: Murray McAllister <murray.mcallister@insomniasec.com>
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
With atomic, the cursor surface is treated like a FB. Creating
a proxy surface for cursor doesn't gain us much benefit.
This fixes the issue on atomic enabled 2D VMs where the cursor
disappears.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
BXT has a H/W issue with IOMMU which can lead to system hangs when
Aperture accesses are queued within the GAM behind GTT Accesses.
This patch avoids the condition by wrapping all GTT updates in stop_machine
and using a flushing read prior to restarting the machine.
The stop_machine guarantees no new Aperture accesses can begin while
the PTE writes are being emmitted. The flushing read ensures that
any following Aperture accesses cannot begin until the PTE writes
have been cleared out of the GAM's fifo.
Only FOLLOWING Aperture accesses need to be separated from in flight
PTE updates. PTE Writes may follow tightly behind already in flight
Aperture accesses, so no flushing read is required at the start of
a PTE update sequence.
This issue was reproduced by running
igt/gem_readwrite and
igt/gem_render_copy
simultaneously from different processes, each in a tight loop,
with INTEL_IOMMU enabled.
This patch was originally published as:
drm/i915: Serialize GTT Updates on BXT
[Note: This will cause a performance penalty for some use cases, but
avoiding hangs trumps performance hits. This may need to be worked
around in Mesa to recover the lost performance.]
v2: Move bxt/iommu detection into static function
Remove #ifdef CONFIG_INTEL_IOMMU protection
Make function names more reflective of purpose
Move flushing read into static function
v3: Tidy up for checkpatch.pl
Testcase: igt/gem_concurrent_blit
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/1495641251-30022-1-git-send-email-jon.bloomfield@intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 0ef34ad622)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Commit 5995a68 "xen/privcmd: Add support for Linux 64KB page granularity" did
not go far enough to support 64KB in mmap_batch_fn.
The variable 'nr' is the number of 4KB chunk to map. However, when Linux
is using 64KB page granularity the array of pages (vma->vm_private_data)
contain one page per 64KB. Fix it by incrementing st->index correctly.
Furthermore, st->va is not correctly incremented as PAGE_SIZE !=
XEN_PAGE_SIZE.
Fixes: 5995a68 ("xen/privcmd: Add support for Linux 64KB page granularity")
CC: stable@vger.kernel.org
Reported-by: Feng Kan <fkan@apm.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Christoph Hellwig suggests we should to make APST work out of the box.
Hence relax the the default max latency to make them able to enter
deepest power state on default.
Here are id-ctrl excerpts from two high latency NVMes:
vid : 0x14a4
ssvid : 0x1b4b
mn : CX2-GB1024-Q11 NVMe LITEON 1024GB
ps 3 : mp:0.1000W non-operational enlat:5000 exlat:5000 rrt:3 rrl:3
rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0100W non-operational enlat:50000 exlat:100000 rrt:4 rrl:4
rwt:4 rwl:4 idle_power:- active_power:-
vid : 0x15b7
ssvid : 0x1b4b
mn : A400 NVMe SanDisk 512GB
ps 3 : mp:0.0500W non-operational enlat:51000 exlat:10000 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
ps 4 : mp:0.0055W non-operational enlat:1000000 exlat:100000 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When a NVMe is in non-op states, the latency is exlat.
The latency will be enlat + exlat only when the NVMe tries to transit
from operational state right atfer it begins to transit to
non-operational state, which should be a rare case.
Therefore, as Andy Lutomirski suggests, use exlat only when deciding power
states to trainsit to.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The failure case, of a create controller request, called
nvme_uninit_ctrl() but didn't do a put to allow the nvme
controller to be deleted.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Per FC-NVME, when lldd or transport detects an i/o error, the
connection must be terminated, which in turn requires the association
to be termianted. Currently the transport simply creates a nvme
completion status of transport error and returns the io. The FC-NVME
spec makes the mandate as initiator and host, depending on the error,
can get out of sync on outstanding io counts (sqhd/sqtail).
Implement the association teardown on lldd or transport detected
errors.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
When we encounter an transport/controller errors, error recovery
kicks in which performs:
1. stops io/admin queues
2. moves transport queues out of LIVE state
3. fast fail pending io
4. schedule periodic reconnects.
But we also need to fast fail incoming IO taht enters after we
already scheduled. Given that our queue is not LIVE anymore, simply
restart the request queues to fail in .queue_rq
Reported-by: Alex Turin <alex@vastdata.com>
Reported-by: shahar.salzman <shahar.salzman@gmail.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
We need to start admin queues too in nvme_kill_queues()
for avoiding hang in remove path[1].
This patch is very similar with 806f026f9b901eaf(nvme: use
blk_mq_start_hw_queues() in nvme_kill_queues()).
[1] hang stack trace
[<ffffffff813c9716>] blk_execute_rq+0x56/0x80
[<ffffffff815cb6e9>] __nvme_submit_sync_cmd+0x89/0xf0
[<ffffffff815ce7be>] nvme_set_features+0x5e/0x90
[<ffffffff815ce9f6>] nvme_configure_apst+0x166/0x200
[<ffffffff815cef45>] nvme_set_latency_tolerance+0x35/0x50
[<ffffffff8157bd11>] apply_constraint+0xb1/0xc0
[<ffffffff8157cbb4>] dev_pm_qos_constraints_destroy+0xf4/0x1f0
[<ffffffff8157b44a>] dpm_sysfs_remove+0x2a/0x60
[<ffffffff8156d951>] device_del+0x101/0x320
[<ffffffff8156db8a>] device_unregister+0x1a/0x60
[<ffffffff8156dc4c>] device_destroy+0x3c/0x50
[<ffffffff815cd295>] nvme_uninit_ctrl+0x45/0xa0
[<ffffffff815d4858>] nvme_remove+0x78/0x110
[<ffffffff81452b69>] pci_device_remove+0x39/0xb0
[<ffffffff81572935>] device_release_driver_internal+0x155/0x210
[<ffffffff81572a02>] device_release_driver+0x12/0x20
[<ffffffff815d36fb>] nvme_remove_dead_ctrl_work+0x6b/0x70
[<ffffffff810bf3bc>] process_one_work+0x18c/0x3a0
[<ffffffff810bf61e>] worker_thread+0x4e/0x3b0
[<ffffffff810c5ac9>] kthread+0x109/0x140
[<ffffffff8185800c>] ret_from_fork+0x2c/0x40
[<ffffffffffffffff>] 0xffffffffffffffff
Fixes: c5552fde102fc("nvme: Enable autonomous power state transitions")
Reported-by: Rakesh Pandit <rakesh@tuxera.com>
Tested-by: Rakesh Pandit <rakesh@tuxera.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
snd_timer_user_tselect() reallocates the queue buffer dynamically, but
it forgot to reset its indices. Since the read may happen
concurrently with ioctl and snd_timer_user_tselect() allocates the
buffer via kmalloc(), this may lead to the leak of uninitialized
kernel-space data, as spotted via KMSAN:
BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10
CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x143/0x1b0 lib/dump_stack.c:52
kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007
kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086
copy_to_user ./arch/x86/include/asm/uaccess.h:725
snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004
do_loop_readv_writev fs/read_write.c:716
__do_readv_writev+0x94c/0x1380 fs/read_write.c:864
do_readv_writev fs/read_write.c:894
vfs_readv fs/read_write.c:908
do_readv+0x52a/0x5d0 fs/read_write.c:934
SYSC_readv+0xb6/0xd0 fs/read_write.c:1021
SyS_readv+0x87/0xb0 fs/read_write.c:1018
This patch adds the missing reset of queue indices. Together with the
previous fix for the ioctl/read race, we cover the whole problem.
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The read from ALSA timer device, the function snd_timer_user_tread(),
may access to an uninitialized struct snd_timer_user fields when the
read is concurrently performed while the ioctl like
snd_timer_user_tselect() is invoked. We have already fixed the races
among ioctls via a mutex, but we seem to have forgotten the race
between read vs ioctl.
This patch simply applies (more exactly extends the already applied
range of) tu->ioctl_lock in snd_timer_user_tread() for closing the
race window.
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In commit d77e38e612 ("xfrm: Add an IPsec hardware offloading API") we
make xfrm_device.o only compiled when enable option CONFIG_XFRM_OFFLOAD.
But this will make xfrm_dev_event() missing if we only enable default XFRM
options.
Then if we set down and unregister an interface with IPsec on it. there
will no xfrm_garbage_collect(), which will cause dev usage count hold and
get error like:
unregister_netdevice: waiting for <dev> to become free. Usage count = 4
Fixes: d77e38e612 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Revert commit eed4d47efe (ACPI / sleep: Ignore spurious SCI wakeups
from suspend-to-idle) as it turned out to be premature and triggered
a number of different issues on various systems.
That includes, but is not limited to, premature suspend-to-RAM aborts
on Dell XPS 13 (9343) reported by Dominik.
The issue the commit in question attempted to address is real and
will need to be taken care of going forward, but evidently more work
is needed for this purpose.
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull networking fixes from David Miller:
1) Made TCP congestion control documentation match current reality,
from Anmol Sarma.
2) Various build warning and failure fixes from Arnd Bergmann.
3) Fix SKB list leak in ipv6_gso_segment().
4) Use after free in ravb driver, from Eugeniu Rosca.
5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet.
6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme
Piccoli.
7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping
Zhang.
8) Use after free in vxlan deletion, from Mark Bloch.
9) Fix ordering of NAPI poll enabled in ethoc driver, from Max
Filippov.
10) Fix stmmac hangs with TSO, from Niklas Cassel.
11) Fix crash in CALIPSO ipv6, from Richard Haines.
12) Clear nh_flags properly on mpls link up. From Roopa Prabhu.
13) Fix regression in sk_err socket error queue handling, noticed by
ping applications. From Soheil Hassas Yeganeh.
14) Update mlx4/mlx5 MAINTAINERS information.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
net: stmmac: fix a broken u32 less than zero check
net: stmmac: fix completely hung TX when using TSO
net: ethoc: enable NAPI before poll may be scheduled
net: bridge: fix a null pointer dereference in br_afspec
ravb: Fix use-after-free on `ifconfig eth0 down`
net/ipv6: Fix CALIPSO causing GPF with datagram support
net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
Revert "sit: reload iphdr in ipip6_rcv"
i40e/i40evf: proper update of the page_offset field
i40e: Fix state flags for bit set and clean operations of PF
iwlwifi: fix host command memory leaks
iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265
iwlwifi: mvm: clear new beacon command template struct
iwlwifi: mvm: don't fail when removing a key from an inexisting sta
iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3
iwlwifi: mvm: fix firmware debug restart recording
iwlwifi: tt: move ucode_loaded check under mutex
iwlwifi: mvm: support ibss in dqa mode
iwlwifi: mvm: Fix command queue number on d0i3 flow
iwlwifi: mvm: rs: start using LQ command color
...
Pull sparc fixes from David Miller:
1) Fix TLB context wrap races, from Pavel Tatashin.
2) Cure some gcc-7 build issues.
3) Handle invalid setup_hugepagesz command line values properly, from
Liam R Howlett.
4) Copy TSB using the correct address shift for the huge TSB, from Mike
Kravetz.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: delete old wrap code
sparc64: new context wrap
sparc64: add per-cpu mm of secondary contexts
sparc64: redefine first version
sparc64: combine activate_mm and switch_mm
sparc64: reset mm cpumask after wrap
sparc/mm/hugepages: Fix setup_hugepagesz for invalid values.
sparc: Machine description indices can vary
sparc64: mm: fix copy_tsb to correctly copy huge page TSBs
arch/sparc: support NR_CPUS = 4096
sparc64: Add __multi3 for gcc 7.x and later.
sparc64: Fix build warnings with gcc 7.
arch/sparc: increase CONFIG_NODES_SHIFT on SPARC64 to 5
GCC explicitly does not warn for unused static inline functions for
-Wunused-function. The manual states:
Warn whenever a static function is declared but not defined or
a non-inline static function is unused.
Clang does warn for static inline functions that are unused.
It turns out that suppressing the warnings avoids potentially complex
#ifdef directives, which also reduces LOC.
Suppress the warning for clang.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pavel Tatashin says:
====================
sparc64: context wrap fixes
This patch series contains fixes for context wrap: when we are out of
context ids, and need to get a new version.
It fixes memory corruption issues which happen when more than number of
context ids (currently set to 8K) number of processes are started
simultaneously, and processes can get a wrong context.
sparc64: new context wrap:
- contains explanation of new wrap method, and also explanation of races
that it solves
sparc64: reset mm cpumask after wrap
- explains issue of not reseting cpu mask on a wrap
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The old method that is using xcall and softint to get new context id is
deleted, as it is replaced by a method of using per_cpu_secondary_mm
without xcall to perform the context wrap.
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current wrap implementation has a race issue: it is called outside of
the ctx_alloc_lock, and also does not wait for all CPUs to complete the
wrap. This means that a thread can get a new context with a new version
and another thread might still be running with the same context. The
problem is especially severe on CPUs with shared TLBs, like sun4v. I used
the following test to very quickly reproduce the problem:
- start over 8K processes (must be more than context IDs)
- write and read values at a memory location in every process.
Very quickly memory corruptions start happening, and what we read back
does not equal what we wrote.
Several approaches were explored before settling on this one:
Approach 1:
Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for
every process to complete the wrap. (Note: every CPU must WAIT before
leaving smp_new_mmu_context_version_client() until every one arrives).
This approach ends up with deadlocks, as some threads own locks which other
threads are waiting for, and they never receive softint until these threads
exit smp_new_mmu_context_version_client(). Since we do not allow the exit,
deadlock happens.
Approach 2:
Handle wrap right during mondo interrupt. Use etrap/rtrap to enter into
into C code, and issue new versions to every CPU.
This approach adds some overhead to runtime: in switch_mm() we must add
some checks to make sure that versions have not changed due to wrap while
we were loading the new secondary context. (could be protected by PSTATE_IE
but that degrades performance as on M7 and older CPUs as it takes 50 cycles
for each access). Also, we still need a global per-cpu array of MMs to know
where we need to load new contexts, otherwise we can change context to a
thread that is going way (if we received mondo between switch_mm() and
switch_to() time). Finally, there are some issues with window registers in
rtrap() when context IDs are changed during CPU mondo time.
The approach in this patch is the simplest and has almost no impact on
runtime. We use the array with mm's where last secondary contexts were
loaded onto CPUs and bump their versions to the new generation without
changing context IDs. If a new process comes in to get a context ID, it
will go through get_new_mmu_context() because of version mismatch. But the
running processes do not need to be interrupted. And wrap is quicker as we
do not need to xcall and wait for everyone to receive and complete wrap.
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CTX_FIRST_VERSION defines the first context version, but also it defines
first context. This patch redefines it to only include the first context
version.
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only difference between these two functions is that in activate_mm we
unconditionally flush context. However, there is no need to keep this
difference after fixing a bug where cpumask was not reset on a wrap. So, in
this patch we combine these.
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a wrap (getting a new context version) a process must get a new
context id, which means that we would need to flush the context id from
the TLB before running for the first time with this ID on every CPU. But,
we use mm_cpumask to determine if this process has been running on this CPU
before, and this mask is not reset after a wrap. So, there are two possible
fixes for this issue:
1. Clear mm cpumask whenever mm gets a new context id
2. Unconditionally flush context every time process is running on a CPU
This patch implements the first solution
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hugetlb_bad_size needs to be called on invalid values. Also change the
pr_warn to a pr_err to better align with other platforms.
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VIO devices were being looked up by their index in the machine
description node block, but this often varies over time as devices are
added and removed. Instead, store the ID and look up using the type,
config handle and ID.
Signed-off-by: James Clarke <jrtc27@jrtc27.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541
Signed-off-by: David S. Miller <davem@davemloft.net>
When a TSB grows beyond its current capacity, a new TSB is allocated
and copy_tsb is called to copy entries from the old TSB to the new.
A hash shift based on page size is used to calculate the index of an
entry in the TSB. copy_tsb has hard coded PAGE_SHIFT in these
calculations. However, for huge page TSBs the value REAL_HPAGE_SHIFT
should be used. As a result, when copy_tsb is called for a huge page
TSB the entries are placed at the incorrect index in the newly
allocated TSB. When doing hardware table walk, the MMU does not
match these entries and we end up in the TSB miss handling code.
This code will then create and write an entry to the correct index
in the TSB. We take a performance hit for the table walk miss and
recreation of these entries.
Pass a new parameter to copy_tsb that is the page size shift to be
used when copying the TSB.
Suggested-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info()
only allocates a single page for NR_CPUS mondo entries. Thus we cannot
use all 4096 CPUs on some SPARC platforms.
To fix, allocate (2^order) pages where order is set according to the size
of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa
are not used in asm code, there are no imm13 offsets from the base PA
that will break because they can only reach one page.
Orabug: 25505750
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Atish Patra <atish.patra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check that queue is less or equal to zero is always true
because queue is a u32; queue is decremented and will wrap around
and never go -ve. Fix this by making queue an int.
Detected by CoverityScan, CID#1428988 ("Unsigned compared against 0")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stmmac_tso_allocator can fail to set the Last Descriptor bit
on a descriptor that actually was the last descriptor.
This happens when the buffer of the last descriptor ends
up having a size of exactly TSO_MAX_BUFF_SIZE.
When the IP eventually reaches the next last descriptor,
which actually has the bit set, the DMA will hang.
When the DMA hangs, we get a tx timeout, however,
since stmmac does not do a complete reset of the IP
in stmmac_tx_timeout, we end up in a state with
completely hung TX.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ethoc_reset enables device interrupts, ethoc_interrupt may schedule a
NAPI poll before NAPI is enabled in the ethoc_open, which results in
device being unable to send or receive anything until it's closed and
reopened. In case the device is flooded with ingress packets it may be
unable to recover at all.
Move napi_enable above ethoc_reset in the ethoc_open to fix that.
Fixes: a170285772 ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently have the HSCTLR.A bit set, trapping unaligned accesses
at HYP, but we're not really prepared to deal with it.
Since the rest of the kernel is pretty happy about that, let's follow
its example and set HSCTLR.A to zero. Modern CPUs don't really care.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
We currently have the SCTLR_EL2.A bit set, trapping unaligned accesses
at EL2, but we're not really prepared to deal with it. So far, this
has been unnoticed, until GCC 7 started emitting those (in particular
64bit writes on a 32bit boundary).
Since the rest of the kernel is pretty happy about that, let's follow
its example and set SCTLR_EL2.A to zero. Modern CPUs don't really
care.
Cc: stable@vger.kernel.org
Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
__do_hyp_init has the rather bad habit of ignoring RES1 bits and
writing them back as zero. On a v8.0-8.2 CPU, this doesn't do anything
bad, but may end-up being pretty nasty on future revisions of the
architecture.
Let's preserve those bits so that we don't have to fix this later on.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
We might call br_afspec() with p == NULL which is a valid use case if
the action is on the bridge device itself, but the bridge tunnel code
dereferences the p pointer without checking, so check if p is null
first.
Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Fixes: efa5356b0d ("bridge: per vlan dst_metadata netlink support")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using CALIPSO with IPPROTO_UDP it is possible to trigger a GPF as the
IP header may have moved.
Also update the payload length after adding the CALIPSO option.
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current comparison of entry < 0 will never be true since entry is an
unsigned integer. Make entry an int to ensure -ve error return values
from the call to jumbo_frm are correctly being caught.
Detected by CoverityScan, CID#1238760 ("Macro compares unsigned to 0")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ASoC: Fixes for v4.12
This is the usual collection of device specific fixes, all accumilated
since the merge window, plus one fix from Takashi for a nasty use after
free bug that bit some things with deferred probe and an update to the
maintainer address for the former Wolfson parts.
gcc 7.1 reports the following warning:
block/elevator.c: In function ‘elv_register’:
block/elevator.c:898:5: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
"%s_io_cq", e->elevator_name);
^~~~~~~~~~
block/elevator.c:897:3: note: ‘snprintf’ output between 7 and 22 bytes into a destination of size 21
snprintf(e->icq_cache_name, sizeof(e->icq_cache_name),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"%s_io_cq", e->elevator_name);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The bug is that the name of the icq_cache is 6 characters longer than
the elevator name, but only ELV_NAME_MAX + 5 characters were reserved
for it --- so in the case of a maximum-length elevator name, the 'q'
character in "_io_cq" would be truncated by snprintf(). Fix it by
reserving ELV_NAME_MAX + 6 characters instead.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Kalle Valo says:
====================
wireless-drivers fixes for 4.12
It has been a slow start of cycle and this the first set of fixes for
4.12. Nothing really major here.
wcn36xx
* fix an issue with module reload
brcmfmac
* fix aligment regression on 64 bit systems
iwlwifi
* fixes for memory leaks, runtime PM, memory initialisation and other
smaller problems
* fix IBSS on devices using DQA mode (7260 and up)
* fix the minimum firmware API requirement for 7265D, 3168, 8000 and
8265
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull media fixes from Mauro Carvalho Chehab:
"Some bug fixes:
- Don't fail build if atomisp has warnings
- Some CEC Kconfig changes to allow it to be used by DRM without
media dependencies
- A race fix at RC initialization code
- A driver fix at rainshadow-cec
IMHO, the one that affects most people in this series is a build fix:
if you try to build the Kernel with W=1 or using gcc7 and
all[yes|mod]config, build will fail due to -Werror at atomisp
makefiles"
* tag 'media/v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] rc-core: race condition during ir_raw_event_register()
[media] cec: drop MEDIA_CEC_DEBUG
[media] cec: rename MEDIA_CEC_NOTIFIER to CEC_NOTIFIER
[media] cec: select CEC_CORE instead of depend on it
[media] rainshadow-cec: ensure exit_loop is intialized
[media] atomisp: don't treat warnings as errors
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2017-06-06
This series contains fixes to i40e and i40evf only.
Mauro S. M. Rodrigues fixes a flood in the kernel log which was introduced
in a previous commit because of a mistaken substitution of __I40E_VSI_DOWN
instead of __I40E_DOWN when testing the state of the PF.
Björn Töpel fixes an issue introduced in a previous commit where the
offset was incorrect and could lead to data corruption for architectures
using PAGE_SIZE larger than 8191. Fixed the issue by updating the
page_offset correctly using the proper setting for truesize.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When direct issue is done on request picked up from plug list,
the hctx need to be updated with the actual hw queue, otherwise
wrong hctx is used and may hurt performance, especially when
wrong SRCU readlock is acquired/released
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This reverts commit b699d00358.
As per Eric Dumazet, the pskb_may_pull() is a NOP in this
particular case, so the 'iph' reload is unnecessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a bug where the copying of scatterlist buffers incorrectly
ignored bytes to skip in a scatterlist and ended 1 byte short.
This fixes testmgr hmac and hash test failures currently obscured
by hash import/export not being supported.
Fixes: abefd6741d ("staging: ccree: introduce CryptoCell HW driver").
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
WARNING: CPU: 3 PID: 2840 at arch/x86/kvm/vmx.c:10966 nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
CPU: 3 PID: 2840 Comm: qemu-system-x86 Tainted: G OE 4.12.0-rc3+ #23
RIP: 0010:nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
Call Trace:
? kvm_check_async_pf_completion+0xef/0x120 [kvm]
? rcu_read_lock_sched_held+0x79/0x80
vmx_queue_exception+0x104/0x160 [kvm_intel]
? vmx_queue_exception+0x104/0x160 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0x1171/0x1ce0 [kvm]
? kvm_arch_vcpu_load+0x47/0x240 [kvm]
? kvm_arch_vcpu_load+0x62/0x240 [kvm]
kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? __fget+0xf3/0x210
do_vfs_ioctl+0xa4/0x700
? __fget+0x114/0x210
SyS_ioctl+0x79/0x90
do_syscall_64+0x81/0x220
entry_SYSCALL64_slow_path+0x25/0x25
This is triggered occasionally by running both win7 and win2016 in L2, in
addition, EPT is disabled on both L1 and L2. It can't be reproduced easily.
Commit 0b6ac343fc (KVM: nVMX: Correct handling of exception injection) mentioned
that "KVM wants to inject page-faults which it got to the guest. This function
assumes it is called with the exit reason in vmcs02 being a #PF exception".
Commit e011c663 (KVM: nVMX: Check all exceptions for intercept during delivery to
L2) allows to check all exceptions for intercept during delivery to L2. However,
there is no guarantee the exit reason is exception currently, when there is an
external interrupt occurred on host, maybe a time interrupt for host which should
not be injected to guest, and somewhere queues an exception, then the function
nested_vmx_check_exception() will be called and the vmexit emulation codes will
try to emulate the "Acknowledge interrupt on exit" behavior, the warning is
triggered.
Reusing the exit reason from the L2->L0 vmexit is wrong in this case,
the reason must always be EXCEPTION_NMI when injecting an exception into
L1 as a nested vmexit.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Fixes: e011c663b9 ("KVM: nVMX: Check all exceptions for intercept during delivery to L2")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
This mouse is also known under other IDs. It needs the quirk
ALWAYS_POLL or will disconnect in runlevel 1 or 3.
Signed-off-by: Sebastian Parschauer <sparschauer@suse.de>
CC: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
If a function sets bind_deactivated flag, upon removal we will be left
with an unbalanced deactivation. Let's make sure that we conditionally
call usb_function_activate() from usb_remove_function() and make sure
usb_remove_function() is called from remove_config().
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Commit 8d911904f3 ('powerpc/perf: Add restrictions to PMC5 in power9 DD1')
was added to restrict the use of PMC5 in Power9 DD1. Intention was to disable
the use of PMC5 using raw event code. But instead of updating the
power9_isa207_pmu structure (used on DD1), the commit incorrectly updated the
power9_pmu structure. Fix it.
Fixes: 8d911904f3 ("powerpc/perf: Add restrictions to PMC5 in power9 DD1")
Reported-by: Shriya <shriyak@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Tested-by: Shriya <shriyak@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In commit 8c27226119 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we
switched to the generic implementation of cpu_to_node(), which uses a percpu
variable to hold the NUMA node for each CPU.
Unfortunately we neglected to notice that we use cpu_to_node() in the allocation
of our percpu areas, leading to a chicken and egg problem. In practice what
happens is when we are setting up the percpu areas, cpu_to_node() reports that
all CPUs are on node 0, so we allocate all percpu areas on node 0.
This is visible in the dmesg output, as all pcpu allocs being in group 0:
pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31
pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39
pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47
To fix it we need an early_cpu_to_node() which can run prior to percpu being
setup. We already have the numa_cpu_lookup_table we can use, so just plumb it
in. With the patch dmesg output shows two groups, 0 and 1:
pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31
pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39
pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47
We can also check the data_offset in the paca of various CPUs, with the fix we
see:
CPU 0: data_offset = 0x0ffe8b0000
CPU 24: data_offset = 0x1ffe5b0000
And we can see from dmesg that CPU 24 has an allocation on node 1:
node 0: [mem 0x0000000000000000-0x0000000fffffffff]
node 1: [mem 0x0000001000000000-0x0000001fffffffff]
Cc: stable@vger.kernel.org # v3.16+
Fixes: 8c27226119 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
A disorder is found in some ALC269 quirk entries for ASUS (1043:xxxx),
which should have been sorted in PCI SSID order. Rearrange them, so
that I won't overlook the already existing entry like I did a couple
of times in the past...
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The ASUS X705UD laptop requires the known fixup ALC256_FIXUP_ASUS_MIC
in order to fix headphone jack sensing and to enable use of the internal
microphone.
Unfortunately jack sensing for the headset mic is still not working.
[rearranged the position to keep the PCI SSID order -- tiwai]
Signed-off-by: Chris Chiu <chiu@endlessm.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Since this driver does no detection of hardware, it might be used with
a non-sir port. Escape out if we are spinning.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Fix a link error in this specific combination of config options:
CONFIG_MEDIA_CEC_SUPPORT=y
CONFIG_CEC_CORE=m
CONFIG_MEDIA_CEC_NOTIFIER=y
CONFIG_VIDEO_STI_HDMI_CEC=m
CONFIG_DRM_STI=y
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_remove':
sti_hdmi.c:(.text.sti_hdmi_remove+0x10): undefined reference to
`cec_notifier_set_phys_addr'
sti_hdmi.c:(.text.sti_hdmi_remove+0x34): undefined reference to
`cec_notifier_put'
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_connector_get_modes':
sti_hdmi.c:(.text.sti_hdmi_connector_get_modes+0x4a): undefined
reference to `cec_notifier_set_phys_addr_from_edid'
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_probe':
sti_hdmi.c:(.text.sti_hdmi_probe+0x204): undefined reference to
`cec_notifier_get'
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_connector_detect':
sti_hdmi.c:(.text.sti_hdmi_connector_detect+0x36): undefined reference
to `cec_notifier_set_phys_addr'
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_disable':
sti_hdmi.c:(.text.sti_hdmi_disable+0xc0): undefined reference to
`cec_notifier_set_phys_addr'
The version below seems to work, though I don't particularly
like the IS_REACHABLE() addition since that can be confusing
to users.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Changing the IS_REACHABLE() into a plain #ifdef broke the case of
CONFIG_MEDIA_RC=m && CONFIG_MEDIA_CEC=y:
drivers/media/cec/cec-core.o: In function `cec_unregister_adapter':
cec-core.c:(.text.cec_unregister_adapter+0x18): undefined reference to `rc_unregister_device'
drivers/media/cec/cec-core.o: In function `cec_delete_adapter':
cec-core.c:(.text.cec_delete_adapter+0x54): undefined reference to `rc_free_device'
drivers/media/cec/cec-core.o: In function `cec_register_adapter':
cec-core.c:(.text.cec_register_adapter+0x94): undefined reference to `rc_register_device'
cec-core.c:(.text.cec_register_adapter+0xa4): undefined reference to `rc_free_device'
cec-core.c:(.text.cec_register_adapter+0x110): undefined reference to `rc_unregister_device'
drivers/media/cec/cec-core.o: In function `cec_allocate_adapter':
cec-core.c:(.text.cec_allocate_adapter+0x234): undefined reference to `rc_allocate_device'
drivers/media/cec/cec-adap.o: In function `cec_received_msg':
cec-adap.c:(.text.cec_received_msg+0x734): undefined reference to `rc_keydown'
cec-adap.c:(.text.cec_received_msg+0x768): undefined reference to `rc_keyup'
This adds an additional dependency to explicitly forbid this combination.
Fixes: 5f2c467c54 ("[media] cec: add MEDIA_CEC_RC config option")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The driver allocates the spinlock but not initialize it.
Use spin_lock_init() on it to initialize it correctly.
This is detected by Coccinelle semantic patch.
Fixes: 0f314f6c2e ("[media] rainshadow-cec: new RainShadow Tech HDMI
CEC driver")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
In f8b45b74cc ("i40e/i40evf: Use build_skb to build frames")
i40e_build_skb updates the page_offset field with an incorrect offset,
which can lead to data corruption. This patch updates page_offset
correctly, by properly setting truesize.
Note that the bug only appears on architectures where PAGE_SIZE is
8192 or larger.
Fixes: f8b45b74cc ("i40e/i40evf: Use build_skb to build frames")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Commit 0da36b9774 ("i40e: use DECLARE_BITMAP for state fields")
introduced changes in the way i40e works with state flags converting
them to bitmaps using kernel bitmap API. This change introduced a
regression due to a mistaken substitution using __I40E_VSI_DOWN instead
of __I40E_DOWN when testing state of a PF at i40e_reset_subtask()
function. This caused a flood in the kernel log with the follow message:
[49.013] i40e 0002:01:00.0: bad reset request 0x00000020
Commit d19cb64b92 ("i40e: separate PF and VSI state flags")
also introduced some misuse of the VSI and PF flags, so both could be
considered as the offenders.
This patch simply fixes the flags where it makes sense by changing
__I40E_VSI_DOWN to __I40E_DOWN.
Fixes: 0da36b9774 ("i40e: use DECLARE_BITMAP for state fields")
Fixes: d19cb64b92 ("i40e: separate PF and VSI state flags")
Reviewed-by: "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A hardcoded register is accidentally used instead of the register
address passed into the function. Correct this and use the appropriate
variable. This would cause minor issues on wm5102, but all other
devices using this driver would have been unaffected.
Fixes: commit ef84f885e0 ("mfd: arizona: Refactor arizona_poll_reg")
Reported-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
During an eeh call to cxl_remove can result in double free_irq of
psl,slice interrupts. This can happen if perst_reloads_same_image == 1
and call to cxl_configure_adapter() fails during slot_reset
callback. In such a case we see a kernel oops with following back-trace:
Oops: Kernel access of bad area, sig: 11 [#1]
Call Trace:
free_irq+0x88/0xd0 (unreliable)
cxl_unmap_irq+0x20/0x40 [cxl]
cxl_native_release_psl_irq+0x78/0xd8 [cxl]
pci_deconfigure_afu+0xac/0x110 [cxl]
cxl_remove+0x104/0x210 [cxl]
pci_device_remove+0x6c/0x110
device_release_driver_internal+0x204/0x2e0
pci_stop_bus_device+0xa0/0xd0
pci_stop_and_remove_bus_device+0x28/0x40
pci_hp_remove_devices+0xb0/0x150
pci_hp_remove_devices+0x68/0x150
eeh_handle_normal_event+0x140/0x580
eeh_handle_event+0x174/0x360
eeh_event_handler+0x1e8/0x1f0
This patch fixes the issue of double free_irq by checking that
variables that hold the virqs (err_hwirq, serr_hwirq, psl_virq) are
not '0' before un-mapping and resetting these variables to '0' when
they are un-mapped.
Cc: stable@vger.kernel.org
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently tsk->thread.load_tm is not initialized in the task creation
and can contain garbage on a new task.
This is an undesired behaviour, since it affects the timing to enable
and disable the transactional memory laziness (disabling and enabling
the MSR TM bit, which affects TM reclaim and recheckpoint in the
scheduling process).
Fixes: 5d176f751e ("powerpc: tm: Enable transactional memory (TM) lazily for userspace")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The description of the CSI_SEL bit in the i.MX6 reference manual is
incorrect. It states "This bit defines which CSI is the input to the
IC. This bit is effective only if IC_INPUT is bit cleared".
From experiment it was found this is in fact not correct. The CSI_SEL
bit selects which CSI is input to _both_ the VDIC _and_ the IC. If the
IC_INPUT bit is set so that the IC is receiving from the VDIC, the IC
ignores the CSI_SEL bit, but CSI_SEL still selects which CSI the VDIC
receives from in that case.
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Steve Longerbeam <steve_longerbeam@mentor.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Not having an endpoint bound in DT should not cause a failure here,
there are fallbacks. So explicitly accept a missing endpoint.
This behavior change was introduced by refactoring in drm_of parsing
code and it should not require dts changes.
In particular this fixes imx6qdl-sabreauto boards.
Link: https://lists.freedesktop.org/archives/dri-devel/2017-May/141233.html
Fixes: ebc9446135 ("drm: convert drivers to use drm_of_find_panel_or_bridge")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
By setting the SFTRST bit, the PRE will be held in the lowest power state
with clocks to the internal blocks gated. When external clock gating is
used (from the external clock controller, or by setting the CLKGATE bit)
the PRE will sporadically fail to start.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Fixes: d2a3423258 ("gpu: ipu-v3: add driver for Prefetch Resolve Engine")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
We used to extract PRIbits from the ICH_VT_EL2 which was the upper field
in the register word, so a mask wasn't necessary, but as we switched to
looking at PREbits, which is bits 26 through 28 with the PRIbits field
being potentially non-zero, we really need to mask off the field value,
otherwise fun things may happen.
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Core Changes:
- Grab locks in drm_atomic_helper_resume() (Daniel)
- Fix oops when unplugging USB device (expand cleanup in drm_unplug_dev) (Hans)
Driver Changes:
- rockchip: Don't output 10-bit format to 8-bit encoders (Mark)
Cc: Mark yao <mark.yao@rock-chips.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Hans de Goede <hdegoede@redhat.com>
* tag 'drm-misc-fixes-2017-06-02' of git://anongit.freedesktop.org/git/drm-misc:
drm: Fix oops + Xserver hang when unplugging USB drm devices
drm: Fix locking in drm_atomic_helper_resume
drm/rockchip: Correct vop out_mode configure
4 nouveau regression fixes.
* 'linux-4.12' of git://github.com/skeggsb/linux:
drm/nouveau/tmr: fully separate alarm execution/pending lists
drm/nouveau: enable autosuspend only when it'll actually be used
drm/nouveau: replace multiple open-coded runpm support checks with function
drm/nouveau/kms/nv50: add null check before pointer dereference
Reusing the list_head for both is a bad idea. Callback execution is done
with the lock dropped so that alarms can be rescheduled from the callback,
which means that with some unfortunate timing, lists can get corrupted.
The execution list should not require its own locking, the single function
that uses it can only be called from a single context.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
Add null check before dereferencing pointer asyc
Addresses-Coverity-ID: 1397932
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Linux IRQ #0 is reserved for error reporting and may not be used.
Increase NR_IRQS for one additional slot and increase
irq_domain_add_legacy parameter first_irq value to 1, so that linux
IRQ #0 is not associated with hardware IRQ #0 in legacy IRQ domains.
Introduce macro XTENSA_PIC_LINUX_IRQ for static translation of xtensa
PIC hardware IRQ # to linux IRQ #. Use this macro in XTFPGA platform
data definitions.
This fixes inability to use hardware IRQ #0 in configurations that don't
use device tree and allows for non-identity mapping between linux IRQ #
and hardware IRQ #.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
When the kernel is compiled with an "O=" argument, the object files are
not in the source tree, but in the build tree.
This patch fixes O= build by looking for object files in the build tree.
Fixes: 923e02ecf3 ("scripts/tags.sh: Support compiled source")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The new per-cpu counter for writes_pending is initialised in
md_alloc(), which is not called by dm-raid.
So dm-raid fails when md_write_start() is called.
Move the initialization to the personality modules
that need it. This way it is always initialised when needed,
but isn't unnecessarily initialized (requiring memory allocation)
when the personality doesn't use writes_pending.
Reported-by: Heinz Mauelshagen <heinzm@redhat.com>
Fixes: 4ad23a9764 ("MD: use per-cpu counter for writes_pending")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Pull cgroup fixes from Tejun Heo:
"Two cgroup fixes. One to address RCU delay of cpuset removal affecting
userland visible behaviors. The other fixes a race condition between
controller disable and cgroup removal"
* 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: consider dying css as offline
cgroup: Prevent kill_css() from being called more than once
Pull libata fixes from Tejun Heo:
- Revert of sata_mv devm_ioremap_resource() conversion. It made init
fail if there are overlapping resources which led to detection
failures on some setups.
- A workaround for an Acer laptop which sometimes reports corrupt port
map.
- Other non-critical fixes.
* 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: fix error checking in in ata_parse_force_one()
Revert "ata: sata_mv: Convert to devm_ioremap_resource()"
ata: libahci: properly propagate return value of platform_get_irq()
ata: sata_rcar: Handle return value of clk_prepare_enable
ahci: Acer SA5-271 SSD Not Detected Fix
Revert commit da28e1955d (ACPICA: Disassembler: Enhance resource
descriptor detection) as it is based on an assumption that doesn't
hold all the time and causes problems to happen because of that.
Reported-by: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Double exception vector only needs 20 bytes of space for 5 literals, not
48. Reduce the reservation for double exception vector literals
accordingly. This fixes build for configurations with small user
exception vector size.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Sending host command with CMD_WANT_SKB flag demands the release of the
response buffer with iwl_free_resp function.
The patch adds the memory release in all the relevant places
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In a previous commit, we removed support for API versions earlier than
22 for these NICs. By mistake, the *_UCODE_API_MIN definitions were
set to 17. Fix that.
Fixes: 4b87e5af63 ("iwlwifi: remove support for fw older than -17 and -22")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Clear the struct so that all reserved fields are zero when we
send the struct down to the device.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The iwl_mvm_remove_sta_key() function handles removing a key when the
sta doesn't exist anymore. Mistakenly, this was changed to return an
error while fixing another bug.
If the mvm_sta doesn't exist, we continue normally, but just don't try
to remove the igtk key.
Fixes: cd4d23c1ea ("iwlwifi: mvm: Fix removal of IGTK")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When we want to stop the recording of the firmware debug
and restart it later without reloading the firmware we
don't need to resend the configuration that comes with
host commands.
Sending those commands confused the hardware and led to
an NMI 0x66.
Change the flow as following:
* read the relevant registers (DBGC_IN_SAMPLE, DBGC_OUT_CTRL)
* clear those registers
* wait for the hardware to complete its write to the buffer
* get the data
* restore the value of those registers (to restart the
recording)
For early start (where the configuration is already
compiled in the firmware), we don't need to set those
registers after the firmware has been loaded, but only
when we want to restart the recording without having
restarted the firmware.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The ucode_loaded check should be under the mutex, since it can
otherwise change state after we looked at it and before we got
the mutex. Fix that.
Fixes: 5c89e7bc55 ("iwlwifi: mvm: add registration to cooling device")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Allow working IBSS also when working in DQA mode.
This is done by setting it to treat the queues the
same as a BSS AP treats the queues.
Fixes: 7948b87308 ("iwlwifi: mvm: enable dynamic queue allocation mode")
Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
During d0i3 flow we flush all the queue except from the command queue.
Currently, in this flow the command queue is hard coded to 9.
In DQA the command queue number has changed from 9 to 0.
Fix that.
This fixes a problem in runtime PM resume flow.
Fixes: 097129c9e6 ("iwlwifi: mvm: move cmd queue to be #0 in dqa mode")
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Up until now, the driver was comparing the rate reported by the FW and
the rate of the latest LQ command to avoid processing data belonging
to the old LQ command. Recently, FW changed the meaning of the initial
rate field in tx response and it holds the actual rate (which is not
necessarily the initial rate of LQ's rate table). Use instead LQ cmd
color to be able to filter out tx responses/BA notifications which
where sent during earlier LQ commands' time frame.
This fixes some throughput degradation in noisy environments.
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Pull ARM fixes from Russell King:
"Three fixes this time around:
- Two fixes for noMMU, fixing the decompressor header layout, and
preventing a build error with some configurations.
- Fixing the hyp-stub updates that went in during the merge window
for platforms that use MCPM"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8677/1: boot/compressed: fix decompressor header layout for v7-M
ARM: 8676/1: NOMMU: provide pgprot_device() macro
ARM: 8675/1: MCPM: ensure not to enter __hyp_soft_restart from loopback and cpu_power_down
In some situations the libdw unwinder stopped working properly. I.e.
with libunwind we see:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
f199 call_init.part.0 (/usr/lib/ld-2.25.so)
f2a5 _dl_init (/usr/lib/ld-2.25.so)
db9 _dl_start_user (/usr/lib/ld-2.25.so)
~~~~~
But with libdw and without this patch this sample is not properly
unwound:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
~~~~~
Debug output showed me that libdw found a module for the last frame
address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch
double-checks what libdw sees and what perf knows. If the mappings
mismatch, we now report the elf known to perf. This fixes the situation
above, and the libdw unwinder produces the same stack as libunwind.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The following tests are failing on powerpc:
# perf test break
18: Breakpoint overflow signal handler : FAILED!
19: Breakpoint overflow sampling : FAILED!
The powerpc kenel so far does not have support to even create
instruction breakpoints using the perf event interface, so those tests
fail early in the config phase.
I added a '->is_supported()' callback to test struct to be able to
disable specific tests. It seems better than putting ifdefs directly to
the test array.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170601205450.GA398@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The decompress_kmodule() decompresses kernel modules in order to load
symbols from it. In the DSO_BINARY_TYPE__BUILD_ID_CACHE case, it needs
the full file path to extract the file extension to determine the
decompression method. But overwriting 'name' will fail the
decompression since it might point to a non-existing old file.
Instead, use dso->long_name for having the correct extension and use the
real filename to decompress.
In the DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP case, both names should
be the same. This allows resolving symbols in the old modules.
Before:
$ perf report -i perf.data.old | grep scsi_mod
0.00% cc1 [scsi_mod] [k] 0x0000000000004aa6
0.00% as [scsi_mod] [k] 0x00000000000099e1
0.00% cc1 [scsi_mod] [k] 0x0000000000009830
0.00% cc1 [scsi_mod] [k] 0x0000000000001b8f
After:
0.00% cc1 [scsi_mod] [k] scsi_handle_queue_ramp_up
0.00% as [scsi_mod] [k] scsi_sg_alloc
0.00% cc1 [scsi_mod] [k] scsi_setup_cmnd
0.00% cc1 [scsi_mod] [k] scsi_get_command
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When perf processes build-id event, it creates DSOs with the build-id.
But it didn't set the module short name (like '[module-name]') so when
processing a kernel mmap event of the module, it cannot found the DSO as
it only checks the short names.
That leads for perf to create a same DSO without the build-id info and
it'll lookup the system path even if the DSO is already in the build-id
cache. After kernel was updated, perf cannot find the DSO and cannot
show symbols in it anymore.
You can see this if you have an old data file (w/ old kernel version):
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz : cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
Failed to open /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz, continuing without symbols
...
The second message didn't show the build-id. With this patch:
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz: cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
/lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz with build id cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 not found, continuing without symbols
...
Now it shows the build-id but still cannot load the symbol table. This
is a different problem which will be fixed in the next patch.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-1-namhyung@kernel.org
[ Fix the build on older compilers (debian <= 8, fedora <= 21, etc) wrt kmod_path var init ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that we have umask support, we shouldn't re-send the mode in a SETATTR
following an exclusive CREATE, or we risk having the same problem fixed in
commit 5334c5bdac ("NFS: Send attributes in OPEN request for
NFS4_CREATE_EXCLUSIVE4_1"), which is that files with S_ISGID will have that
bit stripped away.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: dff25ddb48 ("nfs: add support for the umask attribute")
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
When compiling with -Wsuggest-attribute=format in HOSTCFLAGS, gcc
complains that error_with_pos() may be declared with a printf format
attribute:
scripts/genksyms/genksyms.c:726:3: warning: function might be
possible candidate for ‘gnu_printf’ format attribute
[-Wsuggest-attribute=format]
vfprintf(stderr, fmt, args);
^~~~~~~~
This would allow catching printf-format errors at compile time in
callers to error_with_pos(). Add this attribute.
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The Granular QoS per VF feature must be enabled in FW before it can be
used.
Thus, the driver cannot modify a QP's qos_vport value (via the UPDATE_QP FW
command) if the feature has not been enabled -- the FW returns an error if
this is attempted.
Fixes: 08068cd568 ("net/mlx4: Added qos_vport QP configuration in VST mode")
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel-doc warnings (typo) in drivers/net/phy/phy.c:
..//drivers/net/phy/phy.c:259: warning: No description found for parameter 'features'
..//drivers/net/phy/phy.c:259: warning: Excess function parameter 'feature' description in 'phy_lookup_setting'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update tcp.txt to fix mandatory congestion control ops and default
CCA selection. Also, fix comment in tcp.h for undo_cwnd.
Signed-off-by: Anmol Sarma <me@anmolsarma.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit c5a2ee7dde (cpufreq: intel_pstate: Active mode P-state
limits rework) incorrectly assumed that pstate.turbo_pstate would
always be nonzero for CPU0 in min_perf_pct_min() if
cpufreq_register_driver() had succeeded which may not be the case
in virtualized environments.
If that assumption doesn't hold, it leads to an early crash on boot
in intel_pstate_register_driver(), so add a sanity check to
min_perf_pct_min() to prevent the crash from happening.
Fixes: c5a2ee7dde (cpufreq: intel_pstate: Active mode P-state limits rework)
Reported-and-tested-by: Jongman Heo <jongman.heo@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As reported by Patrice, the header layout of the decompressor is
incorrect when building for v7-M. In this case, the __nop macro
resolves to 'mov r0, r0', which is emitted as a narrow encoding,
resulting in the header data fields to end up at lower offsets than
required.
Given the variety of targets we need to support with the same code,
the startup sequence is a bit of a jumble, and uses instructions
and macros whose encoding widths cannot be specified (badr), or only
exist in a narrow encoding (bx)
So force the use of a wide encoding in __nop, and replace the start
sequence with a simple jump to the label marking the start of code,
preceded by a Thumb2 mode switch if required (using explicit wide
encodings where appropriate). The label itself can be moved to the
start of code [where it belongs] due to the larger range of branch
instructions as compared to adr instructions.
Reported-by: Patrice CHOTARD <patrice.chotard@st.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
NOMMU build leads to the following error:
CC drivers/pci/mmap.o
drivers/pci/mmap.c: In function 'pci_mmap_resource_range':
drivers/pci/mmap.c:60:3: error: implicit declaration of function 'pgprot_device' [-Werror=implicit-function-declaration]
vma->vm_page_prot = pgprot_device(vma->vm_page_prot);
^
cc1: some warnings being treated as errors
scripts/Makefile.build:302: recipe for target 'drivers/pci/mmap.o' failed
make[2]: *** [drivers/pci/mmap.o] Error 1
scripts/Makefile.build:561: recipe for target 'drivers/pci' failed
make[1]: *** [drivers/pci] Error 2
Makefile:1016: recipe for target 'drivers' failed
make: *** [drivers] Error 2
Fix it with support of pgprot_device() macro for NOMMU.
Fixes: 00d2904ffe ("ARM/PCI: Use generic pci_mmap_resource_range()")
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
A SoC variant of Geode GX1, notably NSC branded SC1100, seems to
report an inverted Device ID in its DIR0 configuration register,
specifically 0xb instead of the expected 0x4.
Catch this presumably quirky version so it's properly recognized
as GX1 and has its cache switched to write-back mode, which provides
a significant performance boost in most workloads.
SC1100's datasheet "Geode™ SC1100 Information Appliance On a Chip",
states in section 1.1.7.1 "Device ID" that device identification
values are specified in SC1100's device errata. These, however,
seem to not have been publicly released.
Wading through a number of boot logs and /proc/cpuinfo dumps found on
pastebin and blogs, this patch should mostly be relevant for a number
of now admittedly aging Soekris NET4801 and PC Engines WRAP devices,
the latter being the platform this issue was discovered on.
Performance impact was verified using "openssl speed", with
write-back caching scaling throughput between -3% and +41%.
Signed-off-by: Christian Sünkenberg <christian.suenkenberg@student.kit.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1496596719.26725.14.camel@student.kit.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently tsk->thread->load_vec and load_fp are not initialized during
task creation, which can lead to garbage values in these variables (non-zero
values).
These variables will be checked later in restore_math() to validate if the
FP and vector registers are being utilized. Since these values might be
non-zero, the restore_math() will continue to save the FP and vectors even if
they were never utilized by the userspace application. load_fp and load_vec
counters will then overflow (they wrap at 255) and the FP and Altivec will be
finally disabled, but before that condition is reached (counter overflow)
several context switches will have restored FP and vector registers without
need, causing a performance degradation.
Fixes: 70fe3d980f ("powerpc: Restore FPU/VEC/VSX if previously used")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Gustavo Romero <gusbromero@gmail.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Our previous patch (cited below) introduced a regression
for RAW Eth QPs.
Fix it by checking if the QP number provided by user-space
exists, hence allowing steering rules to be added for valid
QPs only.
Fixes: 89c557687a ("net/mlx4_en: Avoid adding steering rules with invalid ring")
Reported-by: Or Gerlitz <gerlitz.or@gmail.com>
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since iptunnel_pull_header() can call pskb_may_pull(),
we must reload any pointer that was related to skb->head.
Fixes: a09a4c8dd1 ("tunnels: Remove encapsulation offloads on decap")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander reported various KASAN messages triggered in recent kernels
The problem is that ping sockets should not use udp_poll() in the first
place, and recent changes in UDP stack finally exposed this old bug.
Fixes: c319b4d76b ("net: ipv4: add IPPROTO_ICMP socket kind")
Fixes: 6d0bfe2261 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Tested-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 9520ed8fb8 ("net: dsa: use cpu_switch instead of ds[0]")
replaced the use of dst->ds[0] with dst->cpu_switch since that is
functionally equivalent, however, we can now run into an use after free
scenario after unbinding then rebinding the switch driver.
The use after free happens because we do correctly initialize
dst->cpu_switch the first time we probe in dsa_cpu_parse(), then we
unbind the driver: dsa_dst_unapply() is called, and we rebind again.
dst->cpu_switch now points to a freed "ds" structure, and so when we
finally dereference it in dsa_cpu_port_ethtool_setup(), we oops.
To fix this, simply set dst->cpu_switch to NULL in dsa_dst_unapply()
which guarantees that we always correctly re-assign dst->cpu_switch in
dsa_cpu_parse().
Fixes: 9520ed8fb8 ("net: dsa: use cpu_switch instead of ds[0]")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If ip6_find_1stfragopt() fails and we return an error we have to free
up 'segs' because nobody else is going to.
Fixes: 2423496af3 ("ipv6: Prevent overrun when parsing v6 header options")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 9b4437a5b8 ("geneve: Unify LWT and netdev handling.")
when using COLLECT_METADATA geneve devices are created with too small of
a needed_headroom and too large of a max_mtu. This is because
ip_tunnel_info_af() is not valid with the device level info when using
COLLECT_METADATA and we mistakenly fall into the IPv4 case.
For COLLECT_METADATA, always use the worst case of ipv6 since both
sockets are created.
Fixes: 9b4437a5b8 ("geneve: Unify LWT and netdev handling.")
Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to f5f99309fa (sock: do not set sk_err in
sock_dequeue_err_skb), sk_err was reset to the error of
the skb on the head of the error queue.
Applications, most notably ping, are relying on this
behavior to reset sk_err for ICMP packets.
Set sk_err to the ICMP error when there is an ICMP packet
at the head of the error queue.
Fixes: f5f99309fa (sock: do not set sk_err in sock_dequeue_err_skb)
Reported-by: Cyril Hrubis <chrubis@suse.cz>
Tested-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xgbe_map_rx_buffer is rather confused about what PAGE_ALLOC_COSTLY_ORDER
means. It uses PAGE_ALLOC_COSTLY_ORDER-1 assuming that
PAGE_ALLOC_COSTLY_ORDER is the first costly order which is not the case
actually because orders larger than that are costly. And even that
applies only to sleeping allocations which is not the case here. We
simply do not perform any costly operations like reclaim or compaction
for those. Simplify the code by dropping the order calculation and use
PAGE_ALLOC_COSTLY_ORDER directly.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip6_route_output() requires that the flowlabel contains the traffic
class for policy routing.
Commit 0e9a709560 ("ip6_tunnel, ip6_gre: fix setting of DSCP on
encapsulated packets") removed the code which previously added the
traffic class to the flowlabel.
The traffic class is added here because only route lookup needs the
flowlabel to contain the traffic class.
Fixes: 0e9a709560 ("ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets")
Signed-off-by: Liam McBirnie <liam.mcbirnie@boeing.com>
Acked-by: Peter Dawson <peter.a.dawson@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a problem with reading files larger than 2GB from a UFS-2
file system:
https://bugzilla.kernel.org/show_bug.cgi?id=195721
The incorrect UFS s_maxsize limit became a problem as of commit
c2a9737f45 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
which started using s_maxbytes to avoid a page index overflow in
do_generic_file_read().
That caused files to be truncated on UFS-2 file systems because the
default maximum file size is 2GB (MAX_NON_LFS) and UFS didn't update it.
Here I simply increase the default to a common value used by other file
systems.
Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Will B <will.brokenbourgh2877@gmail.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: <stable@vger.kernel.org> # v4.9 and backports of c2a9737f45
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use software polling (PHY_POLL) to check for link state changes instead
of relying on the EMAC's hardware polling feature. Some PHY drivers
are unable to get a functioning link because the HW polling is not
robust enough.
The EMAC is able to poll the PHY on the MDIO bus looking for link state
changes (via the Link Status bit in the Status Register at address 0x1).
When the link state changes, the EMAC triggers an interrupt and tells the
driver what the new state is. The feature eliminates the need for
software to poll the MDIO bus.
Unfortunately, this feature is incompatible with phylib, because it
ignores everything that the PHY core and PHY drivers are trying to do.
In particular:
1. It assumes a compatible register set, so PHYs with different registers
may not work.
2. It doesn't allow for hardware errata that have work-arounds implemented
in the PHY driver.
3. It doesn't support multiple register pages. If the PHY core switches
the register set to another page, the EMAC won't know the page has
changed and will still attempt to read the same PHY register.
4. It only checks the copper side of the link, not the SGMII side. Some
PHY drivers (e.g. at803x) may also check the SGMII side, and
report the link as not ready during autonegotiation if the SGMII link
is still down. Phylib then waits for another interrupt to query
the PHY again, but the EMAC won't send another interrupt because it
thinks the link is up.
Cc: stable@vger.kernel.org # 4.11.x
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull NFS client bugfixes from Trond Myklebust:
"Bugfixes include:
- Fix a typo in commit e092693443 ("NFS append COMMIT after
synchronous COPY") that breaks copy offload
- Fix the connect error propagation in xs_tcp_setup_socket()
- Fix a lock leak in nfs40_walk_client_list
- Verify that pNFS requests lie within the offset range of the layout
segment"
* tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs: Mark unnecessarily extern functions as static
SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()
NFSv4.0: Fix a lock leak in nfs40_walk_client_list
pnfs: Fix the check for requests in range of layout segment
xprtrdma: Delete an error message for a failed memory allocation in xprt_rdma_bc_setup()
pNFS/flexfiles: missing error code in ff_layout_alloc_lseg()
NFS fix COMMIT after COPY
Pull tty fix from Greg KH:
"Here is a single tty core fix for 4.12-rc4. It reverts a patch that a
lot of people reported as causing lockdep and other warnings.
Right after I reverted this in my tree, it seems like another
"correct" fix might have shown up, but it's too late in the release
cycle to be messing with tty core locking, so let's just revert this
for now to go back how things always have been and try it again for
4.13.
This has not been in linux-next as I only reverted it a few hours ago"
* tag 'tty-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "tty: fix port buffer locking"
Pull input subsystem fixes from Dmitry Torokhov:
- a couple of regression fixes in synaptics and axp20x-pek drivers
- try to ease transition from PS/2 to RMI for Synaptics touchpad users
by ensuring we do not try to activate RMI mode when RMI SMBus support
is not enabled, and nag users a bit to enable it
- plus a couple of other changes that seemed worthwhile for this
release
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: axp20x-pek - switch to acpi_dev_present and check for ACPI0011 too
Input: axp20x-pek - only check for "INTCFD9" ACPI device on Cherry Trail
Input: tm2-touchkey - use LEN_ON as boolean value instead of LED_FULL
Input: synaptics - tell users to report when they should be using rmi-smbus
Input: synaptics - warn the users when there is a better mode
Input: synaptics - keep PS/2 around when RMI4_SMB is not enabled
Input: synaptics - clear device info before filling in
Input: silead - disable interrupt during suspend
Pull RTC fixlet from Alexandre Belloni:
"A single patch, not really a fix but I don't think there is any reason
to delay it.
Change the mailing list address"
* tag 'rtc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
MAINTAINERS: update RTC mailing list
A rc device can call ir_raw_event_handle() after rc_allocate_device(),
but before rc_register_device() has completed. This is racey because
rcdev->raw is set before rcdev->raw->thread has a valid value.
Cc: stable@kernel.org
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Just depend on DEBUG_FS, no need to invent a new kernel config.
Especially since CEC can be enabled by drm without enabling
MEDIA_SUPPORT.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This config option is strictly speaking independent of the
media subsystem since it can be used by drm as well.
Besides, it looks odd when drivers select CEC_CORE and
MEDIA_CEC_NOTIFIER, that's inconsistent naming.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The CEC framework is used by both drm and media. That makes it tricky
to get the dependencies right.
This patch moves the CEC_CORE and MEDIA_CEC_NOTIFIER config options
out of the media menu and instead drivers that want to use CEC should
select CEC_CORE and MEDIA_CEC_NOTIFIER (if needed).
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
exit_loop is not being initialized, so it contains garbage. Ensure it is
initialized to false.
Detected by CoverityScan, CID#1436409 ("Uninitialized scalar variable")
Fixes: ea6a69defd ("[media] rainshadow-cec: avoid -Wmaybe-uninitialized warning")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Several atomisp files use:
ccflags-y += -Werror
As, on media, our usual procedure is to use W=1, and atomisp
has *a lot* of warnings with such flag enabled,like:
./drivers/staging/media/atomisp/pci/atomisp2/css2400/hive_isp_css_common/host/system_local.h:62:26: warning: 'DDR_BASE' defined but not used [-Wunused-const-variable=]
At the end, it causes our build to fail, impacting our workflow.
So, remove this crap. If one wants to force -Werror, he
can still build with it enabled by passing a parameter to
make.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Pull SCSI fixes from James Bottomley:
"This is nine fixes, seven of which are for the qedi driver (new as of
4.10) the other two are a use after free in the cxgbi drivers and a
potential NULL dereference in the rdac device handler"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: libcxgbi: fix skb use after free
scsi: qedi: Fix endpoint NULL panic during recovery.
scsi: qedi: set max_fin_rt default value
scsi: qedi: Set firmware tcp msl timer value.
scsi: qedi: Fix endpoint NULL panic in qedi_set_path.
scsi: qedi: Set dma_boundary to 0xfff.
scsi: qedi: Correctly set firmware max supported BDs.
scsi: qedi: Fix bad pte call trace when iscsiuio is stopped.
scsi: scsi_dh_rdac: Use ctlr directly in rdac_failover_get()
Pull rdma fixes from Doug Ledford:
"For the most part this is just a minor -rc cycle for the rdma
subsystem. Even given that this is all of the -rc patches since the
merge window closed, it's still only about 25 patches:
- Multiple i40iw, nes, iw_cxgb4, hfi1, qib, mlx4, mlx5 fixes
- A few upper layer protocol fixes (IPoIB, iSER, SRP)
- A modest number of core fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (26 commits)
RDMA/SA: Fix kernel panic in CMA request handler flow
RDMA/umem: Fix missing mmap_sem in get umem ODP call
RDMA/core: not to set page dirty bit if it's already set.
RDMA/uverbs: Declare local function static and add brackets to sizeof
RDMA/netlink: Reduce exposure of RDMA netlink functions
RDMA/srp: Fix NULL deref at srp_destroy_qp()
RDMA/IPoIB: Limit the ipoib_dev_uninit_default scope
RDMA/IPoIB: Replace netdev_priv with ipoib_priv for ipoib_get_link_ksettings
RDMA/qedr: add null check before pointer dereference
RDMA/mlx5: set UMR wqe fence according to HCA cap
net/mlx5: Define interface bits for fencing UMR wqe
RDMA/mlx4: Fix MAD tunneling when SRIOV is enabled
RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
RDMA/hfi1: Defer setting VL15 credits to link-up interrupt
RDMA/hfi1: change PCI bar addr assignments to Linux API functions
RDMA/hfi1: fix array termination by appending NULL to attr array
RDMA/iw_cxgb4: fix the calculation of ipv6 header size
RDMA/iw_cxgb4: calculate t4_eq_status_entries properly
RDMA/iw_cxgb4: Avoid touch after free error in ARP failure handlers
RDMA/nes: ACK MPA Reply frame
...
The alarmtimer code has another source of potentially rearming itself too
fast. Interval timers with a very samll interval have a similar CPU hog
effect as the previously fixed overflow issue.
The reason is that alarmtimers do not implement the normal protection
against this kind of problem which the other posix timer use:
timer expires -> queue signal -> deliver signal -> rearm timer
This scheme brings the rearming under scheduler control and prevents
permanently firing timers which hog the CPU.
Bringing this scheme to the alarm timer code is a major overhaul because it
lacks all the necessary mechanisms completely.
So for a quick fix limit the interval to one jiffie. This is not
problematic in practice as alarmtimers are usually backed by an RTC for
suspend which have 1 second resolution. It could be therefor argued that
the resolution of this clock should be set to 1 second in general, but
that's outside the scope of this fix.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kostya Serebryany <kcc@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de
Andrey reported a alartimer related RCU stall while fuzzing the kernel with
syzkaller.
The reason for this is an overflow in ktime_add() which brings the
resulting time into negative space and causes immediate expiry of the
timer. The following rearm with a small interval does not bring the timer
back into positive space due to the same issue.
This results in a permanent firing alarmtimer which hogs the CPU.
Use ktime_add_safe() instead which detects the overflow and clamps the
result to KTIME_SEC_MAX.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kostya Serebryany <kcc@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170530211655.802921648@linutronix.de
nfs_initialise_sb() and nfs_clone_super() are declared as extern even
though they are used only in fs/nfs/super.c. Mark them as static.
Also remove explicit 'inline' directive from nfs_initialise_sb() and
leave it upto compiler to decide whether inlining is worth it.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Pull hwmon fixes from Guenter Roeck:
"A couple of patches for the aspeed pwm fan driver"
* tag 'hwmon-for-linus-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (aspeed-pwm-tacho) make fan/pwm names start with index 1
hwmon: (aspeed-pwm-tacho) Call of_node_put() on a node not claimed
hwmon: (aspeed-pwm-tacho) On read failure return -ETIMEDOUT
hwmon: (aspeed-pwm-tacho) Select REGMAP
Pull MTD fixes from Brian Norris:
"NAND updates from Boris:
tango fixes:
- Add missing MODULE_DEVICE_TABLE() in tango_nand.c
- Update the number of corrected bitflips
core fixes:
- Fix a long standing memory leak in nand_scan_tail()
- Fix several bugs introduced by the per-vendor init/detection
infrastructure (introduced in 4.12)
- Add a static specifier to nand_ooblayout_lp_hamming_ops definition"
* tag 'for-linus-20170602' of git://git.infradead.org/linux-mtd:
mtd: nand: make nand_ooblayout_lp_hamming_ops static
mtd: nand: tango: Update ecc_stats.corrected
mtd: nand: tango: Export OF device ID table as module aliases
mtd: nand: samsung: warn about un-parseable ECC info
mtd: nand: free vendor-specific resources in init failure paths
mtd: nand: drop unneeded module.h include
mtd: nand: don't leak buffers when ->scan_bbt() fails
Make fan and pwm names in sysfs start with index 1 in accordance to
Documentation/hwmon/sysfs-interface conventions.
Current implementation starts with index 0, making tools such as
sensors(1) skip the first fan.
Signed-off-by: Stefan Schaeckeler <sschaeck@cisco.com>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Call of_node_put() on a node claimed with of_node_get() or by any other
means such as for_each_child_of_node().
Signed-off-by: Stefan Schaeckeler <sschaeck@cisco.com>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
modprobe is not able to resolve sysfs modalias for mei devices.
# cat
/sys/class/watchdog/watchdog0/device/watchdog/watchdog0/device/modalias
mei::05b79a6f-4628-4d7f-899d-a91514cb32ab:
# modprobe --set-version 4.9.6-200.fc25.x86_64 -R
mei::05b79a6f-4628-4d7f-899d-a91514cb32ab:
modprobe: FATAL: Module mei::05b79a6f-4628-4d7f-899d-a91514cb32ab: not
found in directory /lib/modules/4.9.6-200.fc25.x86_64
# cat /lib/modules/4.9.6-200.fc25.x86_64/modules.alias | grep
05b79a6f-4628-4d7f-899d-a91514cb32ab
alias mei:*:05b79a6f-4628-4d7f-899d-a91514cb32ab:*:* mei_wdt
commit b26864cad1 ("mei: bus: add client protocol
version to the device alias"), however sysfs modalias
is still in formmat mei:S:uuid:*.
This patch equates format of uevent and sysfs modalias so that modprobe
is able to resolve the aliases.
Cc: <stable@vger.kernel.org> 4.7+
Fixes: commit b26864cad1 ("mei: bus: add client protocol version to the device alias")
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A recent fix to /dev/mem prevents mappings from wrapping around the end
of physical address space. However, the check was written in a way that
also prevents a mapping reaching just up to the end of physical address
space, which may be a valid use case (especially on 32-bit systems).
This patch fixes it by checking the last mapped address (instead of the
first address behind that) for overflow.
Fixes: b299cde245 ("drivers: char: mem: Check for address space wraparound with mmap()")
Cc: <stable@vger.kernel.org>
Reported-by: Nico Huber <nico.h@gmx.de>
Signed-off-by: Julius Werner <jwerner@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case of error, the function devm_ioremap() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check should
be replaced with NULL test. Also add NULL test for iores.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Starting from MPU6500, accelerometer dlpf is set in a separate
register named ACCEL_CONFIG_2.
Add this new register in the map and set it for the corresponding
chips.
Signed-off-by: Jean-Baptiste Maneyrol <jmaneyrol@invensense.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct
lov_user_md pointer from user- or kernel-space. This changes the
behavior of copy_from_user() on SPARC and may result in a misaligned
access exception which in turn oopses the kernel. In fact the
relevant argument to lov_getstripe() is never called with a
kernel-space pointer and so changing the address limits is unnecessary
and so we remove the calls to save, set, and restore the address
limits.
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: http://review.whamcloud.com/6150
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a custom CPU target is specified and that one is not available _or_
can't be interrupted then the code returns to userland without dropping a
lock as notices by lockdep:
|echo 133 > /sys/devices/system/cpu/cpu7/hotplug/target
| ================================================
| [ BUG: lock held when returning to user space! ]
| ------------------------------------------------
| bash/503 is leaving the kernel with locks still held!
| 1 lock held by bash/503:
| #0: (device_hotplug_lock){+.+...}, at: [<ffffffff815b5650>] lock_device_hotplug_sysfs+0x10/0x40
So release the lock then.
Fixes: 757c989b99 ("cpu/hotplug: Make target state writeable")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170602142714.3ogo25f2wbq6fjpj@linutronix.de
The AR100 clock within the R_CCU (PRCM) has the PLL_PERIPH0 as one of
its parents.
This adds the reference in the device tree describing this relationship.
This patch uses a raw number for the clock index to ease merging by
avoiding cross tree dependencies.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
The AR100 clock within the R_CCU (PRCM) has the PLL_PERIPH0 as one of
its parents.
This adds the reference in the device tree describing this relationship.
This patch uses a raw number for the clock index to ease merging by
avoiding cross tree dependencies.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Kishon writes:
phy: for 4.12-rc
*) Fix return value check in phy-qcom-qmp driver
*) Fix memory allocation bug in phy-qcom-qmp driver
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
acpi_dev_found checks that there is a matching ACPI node, but it
may be disabled (_STA method returns 0) in which case the
soc_button_array driver will not bind to it and axp20x-pek should
handle the power-button.
This commit switches from acpi_dev_found to acpi_dev_present to
avoid not registering an input-dev for the powerbutton when there
is a disabled PNP0C40 device.
The ACPI-6.0 standard defines a standard gpio button device using
the ACPI0011 HID replacing the custom PNP0C40 gpio device, many
newer devices define both PNP0C40 and ACPI0011 devices enabling one
or the other depending on whether the BIOS thinks it is going to boot
Android or Windows.
This commit adds a check for the ACPI0011 device, so that if
either device is present *and* enabled we don't register an input-dev
for the powerbutton.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Commit 9b13a4ca8d ("Input: axp20x-pek - do not register input device
on some systems") added a check for the INTCFD9 ACPI device which also
handles the powerbutton as on some systems the powerbutton is connected
to both the PMIC, handled by axp20x-pek, and to a gpio on the SoC, handled
by soc_button_array which attaches itself to the INTCFD9 ACPI device.
Testing + comparing DSDTs has shown that this only happens on Cherry
Trail devices with an AXP288 PMIC, the AXP288 PMIC is also used on
Bay Trail devices but there the power button is only connected to
the PMIC and not handled by soc_button_array.
This means that the INTCFD9 check has caused a regression on Bay Trail
devices, causing power-button presses to no longer be seen.
This commit fixes this by limiting the check to devices where the ACPI
node for the AXP288 contains a _HRV (hardware revision) attribute with
a value of 3 which indicates we are dealing with a Cherry Trail platform.
Fixes: 9b13a4ca8d ("Input: axp20x-pek - do not register input ...")
Reported-by: Сергей Трусов <t.rus76@ya.ru>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Felipe writes:
usb: fixes for v4.12-rc4
A fix to a really old synchronization bug on mass storage gadget.
Support for Meson8 SoCs on dwc2
Synchronization fixes on renesas USB driver.
Pull ACPI fixes from Rafael Wysocki:
"These revert one more problematic commit related to the ACPI-based
handling of laptop lids and make some unuseful error messages coming
from ACPICA go away.
Specifics:
- Revert one more commit related to the ACPI-based handling of laptop
lids that changed the default behavior on laptops that booted with
closed lids and introduced a regression there (Benjamin Tissoires).
- Add a missing acpi_put_table() to the code implementing the
/sys/firmware/acpi/tables interface to prevent a counter in the
ACPICA core from overflowing (Dan Williams).
- Drop error messages printed by ACPICA on acpi_get_table() reference
counting mismatches as they need not indicate real errors at this
point (Lv Zheng)"
* tag 'acpi-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
Revert "ACPI / button: Change default behavior to lid_init_state=open"
ACPI / sysfs: fix acpi_get_table() leak / acpi-sysfs denial of service
Pull power management fixes from Rafael Wysocki:
"These fix two bugs in error code paths in the cpufreq core and in the
kirkwood-cpufreq driver.
Specifics:
- Make cpufreq_register_driver() return an error if the ->init()
calls fail for all CPUs to prevent non-functional drivers from
hanging around for no reason (David Arcari).
- Make kirkwood-cpufreq check the return value of
clk_prepare_enable() (which may fail) as appropriate (Arvind
Yadav)"
* tag 'pm-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: kirkwood-cpufreq:- Handle return value of clk_prepare_enable()
cpufreq: cpufreq_register_driver() should return -ENODEV if init fails
Pull /dev/random bug fix from Ted Ts'o:
"Fix a race on architectures with prioritized interrupts (such as m68k)
which can causes crashes in drivers/char/random.c:get_reg()"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
fix race in drivers/char/random.c:get_reg()
Merge misc fixes from Andrew Morton:
"15 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/gdb: make lx-dmesg command work (reliably)
mm: consider memblock reservations for deferred memory initialization sizing
mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified
mlock: fix mlock count can not decrease in race condition
mm/migrate: fix refcount handling when !hugepage_migration_supported()
dax: fix race between colliding PMD & PTE entries
mm: avoid spurious 'bad pmd' warning messages
mm/page_alloc.c: make sure OOM victim can try allocations with no watermarks once
pcmcia: remove left-over %Z format
slub/memcg: cure the brainless abuse of sysfs attributes
initramfs: fix disabling of initramfs (and its compression)
mm: clarify why we want kmalloc before falling backto vmallock
frv: declare jiffies to be located in the .data section
include/linux/gfp.h: fix ___GFP_NOLOCKDEP value
ksm: prevent crash after write_protect_page fails
lx-dmesg needs access to the log_buf symbol from printk.c.
Unfortunately, the symbol log_buf also exists in BPF's verifier.c and
hence gdb can pick one or the other. If it happens to pick BPF's
log_buf, lx-dmesg doesn't work:
(gdb) lx-dmesg
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x0:
Error occurred in Python command: Cannot access memory at address 0x0
(gdb) p log_buf
$15 = 0x0
Luckily, GDB has a way to deal with this, see
https://sourceware.org/gdb/onlinedocs/gdb/Symbols.html
(gdb) info variables ^log_buf$
All variables matching regular expression "^log_buf$":
File <linux.git>/kernel/bpf/verifier.c:
static char *log_buf;
File <linux.git>/kernel/printk/printk.c:
static char *log_buf;
(gdb) p 'verifier.c'::log_buf
$1 = 0x0
(gdb) p 'printk.c'::log_buf
$2 = 0x811a6aa0 <__log_buf> ""
(gdb) p &log_buf
$3 = (char **) 0x8120fe40 <log_buf>
(gdb) p &'verifier.c'::log_buf
$4 = (char **) 0x8120fe40 <log_buf>
(gdb) p &'printk.c'::log_buf
$5 = (char **) 0x8048b7d0 <log_buf>
By being explicit about the location of the symbol, we can make lx-dmesg
work again. While at it, do the same for the other symbols we need from
printk.c
Link: http://lkml.kernel.org/r/20170526112222.3414-1-git@andred.net
Signed-off-by: André Draszik <git@andred.net>
Tested-by: Kieran Bingham <kieran@bingham.xyz>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have seen an early OOM killer invocation on ppc64 systems with
crashkernel=4096M:
kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0
kthreadd cpuset=/ mems_allowed=7
CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1
Call Trace:
dump_stack+0xb0/0xf0 (unreliable)
dump_header+0xb0/0x258
out_of_memory+0x5f0/0x640
__alloc_pages_nodemask+0xa8c/0xc80
kmem_getpages+0x84/0x1a0
fallback_alloc+0x2a4/0x320
kmem_cache_alloc_node+0xc0/0x2e0
copy_process.isra.25+0x260/0x1b30
_do_fork+0x94/0x470
kernel_thread+0x48/0x60
kthreadd+0x264/0x330
ret_from_kernel_thread+0x5c/0xa4
Mem-Info:
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:5 slab_unreclaimable:73
mapped:0 shmem:0 pagetables:0 bounce:0
free:0 free_pcp:0 free_cma:0
Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB
0 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
819200 pages RAM
0 pages HighMem/MovableOnly
817481 pages reserved
0 pages cma reserved
0 pages hwpoisoned
the reason is that the managed memory is too low (only 110MB) while the
rest of the the 50GB is still waiting for the deferred intialization to
be done. update_defer_init estimates the initial memoty to initialize
to 2GB at least but it doesn't consider any memory allocated in that
range. In this particular case we've had
Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB)
so the low 2GB is mostly depleted.
Fix this by considering memblock allocations in the initial static
initialization estimation. Move the max_initialise to
reset_deferred_meminit and implement a simple memblock_reserved_memory
helper which iterates all reserved blocks and sums the size of all that
start below the given address. The cumulative size is than added on top
of the initial estimation. This is still not ideal because
reset_deferred_meminit doesn't consider holes and so reservation might
be above the initial estimation whihch we ignore but let's make the
logic simpler until we really need to handle more complicated cases.
Fixes: 3a80a7fa79 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> [4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KVM uses get_user_pages() to resolve its stage2 faults. KVM sets the
FOLL_HWPOISON flag causing faultin_page() to return -EHWPOISON when it
finds a VM_FAULT_HWPOISON. KVM handles these hwpoison pages as a
special case. (check_user_page_hwpoison())
When huge pages are involved, this doesn't work so well.
get_user_pages() calls follow_hugetlb_page(), which stops early if it
receives VM_FAULT_HWPOISON from hugetlb_fault(), eventually returning
-EFAULT to the caller. The step to map this to -EHWPOISON based on the
FOLL_ flags is missing. The hwpoison special case is skipped, and
-EFAULT is returned to user-space, causing Qemu or kvmtool to exit.
Instead, move this VM_FAULT_ to errno mapping code into a header file
and use it from faultin_page() and follow_hugetlb_page().
With this, KVM works as expected.
This isn't a problem for arm64 today as we haven't enabled
MEMORY_FAILURE, but I can't see any reason this doesn't happen on x86
too, so I think this should be a fix. This doesn't apply earlier than
stable's v4.11.1 due to all sorts of cleanup.
[james.morse@arm.com: add vm_fault_to_errno() call to faultin_page()]
suggested.
Link: http://lkml.kernel.org/r/20170525171035.16359-1-james.morse@arm.com
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170524160900.28786-1-james.morse@arm.com
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [4.11.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kefeng reported that when running the follow test, the mlock count in
meminfo will increase permanently:
[1] testcase
linux:~ # cat test_mlockal
grep Mlocked /proc/meminfo
for j in `seq 0 10`
do
for i in `seq 4 15`
do
./p_mlockall >> log &
done
sleep 0.2
done
# wait some time to let mlock counter decrease and 5s may not enough
sleep 5
grep Mlocked /proc/meminfo
linux:~ # cat p_mlockall.c
#include <sys/mman.h>
#include <stdlib.h>
#include <stdio.h>
#define SPACE_LEN 4096
int main(int argc, char ** argv)
{
int ret;
void *adr = malloc(SPACE_LEN);
if (!adr)
return -1;
ret = mlockall(MCL_CURRENT | MCL_FUTURE);
printf("mlcokall ret = %d\n", ret);
ret = munlockall();
printf("munlcokall ret = %d\n", ret);
free(adr);
return 0;
}
In __munlock_pagevec() we should decrement NR_MLOCK for each page where
we clear the PageMlocked flag. Commit 1ebb7cc6a5 ("mm: munlock: batch
NR_MLOCK zone state updates") has introduced a bug where we don't
decrement NR_MLOCK for pages where we clear the flag, but fail to
isolate them from the lru list (e.g. when the pages are on some other
cpu's percpu pagevec). Since PageMlocked stays cleared, the NR_MLOCK
accounting gets permanently disrupted by this.
Fix it by counting the number of page whose PageMlock flag is cleared.
Fixes: 1ebb7cc6a5 (" mm: munlock: batch NR_MLOCK zone state updates")
Link: http://lkml.kernel.org/r/1495678405-54569-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joern Engel <joern@logfs.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michel Lespinasse <walken@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: zhongjiang <zhongjiang@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On failing to migrate a page, soft_offline_huge_page() performs the
necessary update to the hugepage ref-count.
But when !hugepage_migration_supported() , unmap_and_move_hugepage()
also decrements the page ref-count for the hugepage. The combined
behaviour leaves the ref-count in an inconsistent state.
This leads to soft lockups when running the overcommitted hugepage test
from mce-tests suite.
Soft offlining pfn 0x83ed600 at process virtual address 0x400000000000
soft offline: 0x83ed600: migration failed 1, type 1fffc00000008008 (uptodate|head)
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-7): P2715
(detected by 7, t=5254 jiffies, g=963, c=962, q=321)
thugetlb_overco R running task 0 2715 2685 0x00000008
Call trace:
dump_backtrace+0x0/0x268
show_stack+0x24/0x30
sched_show_task+0x134/0x180
rcu_print_detail_task_stall_rnp+0x54/0x7c
rcu_check_callbacks+0xa74/0xb08
update_process_times+0x34/0x60
tick_sched_handle.isra.7+0x38/0x70
tick_sched_timer+0x4c/0x98
__hrtimer_run_queues+0xc0/0x300
hrtimer_interrupt+0xac/0x228
arch_timer_handler_phys+0x3c/0x50
handle_percpu_devid_irq+0x8c/0x290
generic_handle_irq+0x34/0x50
__handle_domain_irq+0x68/0xc0
gic_handle_irq+0x5c/0xb0
Address this by changing the putback_active_hugepage() in
soft_offline_huge_page() to putback_movable_pages().
This only triggers on systems that enable memory failure handling
(ARCH_SUPPORTS_MEMORY_FAILURE) but not hugepage migration
(!ARCH_ENABLE_HUGEPAGE_MIGRATION).
I imagine this wasn't triggered as there aren't many systems running
this configuration.
[akpm@linux-foundation.org: remove dead comment, per Naoya]
Link: http://lkml.kernel.org/r/20170525135146.32011-1-punit.agrawal@arm.com
Reported-by: Manoj Iyer <manoj.iyer@canonical.com>
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [3.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We currently have two related PMD vs PTE races in the DAX code. These
can both be easily triggered by having two threads reading and writing
simultaneously to the same private mapping, with the key being that
private mapping reads can be handled with PMDs but private mapping
writes are always handled with PTEs so that we can COW.
Here is the first race:
CPU 0 CPU 1
(private mapping write)
__handle_mm_fault()
create_huge_pmd() - FALLBACK
handle_pte_fault()
passes check for pmd_devmap()
(private mapping read)
__handle_mm_fault()
create_huge_pmd()
dax_iomap_pmd_fault() inserts PMD
dax_iomap_pte_fault() does a PTE fault, but we already have a DAX PMD
installed in our page tables at this spot.
Here's the second race:
CPU 0 CPU 1
(private mapping read)
__handle_mm_fault()
passes check for pmd_none()
create_huge_pmd()
dax_iomap_pmd_fault() inserts PMD
(private mapping write)
__handle_mm_fault()
create_huge_pmd() - FALLBACK
(private mapping read)
__handle_mm_fault()
passes check for pmd_none()
create_huge_pmd()
handle_pte_fault()
dax_iomap_pte_fault() inserts PTE
dax_iomap_pmd_fault() inserts PMD,
but we already have a PTE at
this spot.
The core of the issue is that while there is isolation between faults to
the same range in the DAX fault handlers via our DAX entry locking,
there is no isolation between faults in the code in mm/memory.c. This
means for instance that this code in __handle_mm_fault() can run:
if (pmd_none(*vmf.pmd) && transparent_hugepage_enabled(vma)) {
ret = create_huge_pmd(&vmf);
But by the time we actually get to run the fault handler called by
create_huge_pmd(), the PMD is no longer pmd_none() because a racing PTE
fault has installed a normal PMD here as a parent. This is the cause of
the 2nd race. The first race is similar - there is the following check
in handle_pte_fault():
} else {
/* See comment in pte_alloc_one_map() */
if (pmd_devmap(*vmf->pmd) || pmd_trans_unstable(vmf->pmd))
return 0;
So if a pmd_devmap() PMD (a DAX PMD) has been installed at vmf->pmd, we
will bail and retry the fault. This is correct, but there is nothing
preventing the PMD from being installed after this check but before we
actually get to the DAX PTE fault handlers.
In my testing these races result in the following types of errors:
BUG: Bad rss-counter state mm:ffff8800a817d280 idx:1 val:1
BUG: non-zero nr_ptes on freeing mm: 15
Fix this issue by having the DAX fault handlers verify that it is safe
to continue their fault after they have taken an entry lock to block
other racing faults.
[ross.zwisler@linux.intel.com: improve fix for colliding PMD & PTE entries]
Link: http://lkml.kernel.org/r/20170526195932.32178-1-ross.zwisler@linux.intel.com
Link: http://lkml.kernel.org/r/20170522215749.23516-2-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Pawel Lebioda <pawel.lebioda@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Pawel Lebioda <pawel.lebioda@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Xiong Zhou <xzhou@redhat.com>
Cc: Eryu Guan <eguan@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the pmd_devmap() checks were added by 5c7fb56e5e ("mm, dax:
dax-pmd vs thp-pmd vs hugetlbfs-pmd") to add better support for DAX huge
pages, they were all added to the end of if() statements after existing
pmd_trans_huge() checks. So, things like:
- if (pmd_trans_huge(*pmd))
+ if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd))
When further checks were added after pmd_trans_unstable() checks by
commit 7267ec008b ("mm: postpone page table allocation until we have
page to map") they were also added at the end of the conditional:
+ if (pmd_trans_unstable(fe->pmd) || pmd_devmap(*fe->pmd))
This ordering is fine for pmd_trans_huge(), but doesn't work for
pmd_trans_unstable(). This is because DAX huge pages trip the bad_pmd()
check inside of pmd_none_or_trans_huge_or_clear_bad() (called by
pmd_trans_unstable()), which prints out a warning and returns 1. So, we
do end up doing the right thing, but only after spamming dmesg with
suspicious looking messages:
mm/pgtable-generic.c:39: bad pmd ffff8808daa49b88(84000001006000a5)
Reorder these checks in a helper so that pmd_devmap() is checked first,
avoiding the error messages, and add a comment explaining why the
ordering is important.
Fixes: commit 7267ec008b ("mm: postpone page table allocation until we have page to map")
Link: http://lkml.kernel.org/r/20170522215749.23516-1-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Pawel Lebioda <pawel.lebioda@intel.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Xiong Zhou <xzhou@redhat.com>
Cc: Eryu Guan <eguan@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Gushchin has reported that the OOM killer can trivially selects
next OOM victim when a thread doing memory allocation from page fault
path was selected as first OOM victim.
allocate invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
allocate cpuset=/ mems_allowed=0
CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
oom_kill_process+0x219/0x3e0
out_of_memory+0x11d/0x480
__alloc_pages_slowpath+0xc84/0xd40
__alloc_pages_nodemask+0x245/0x260
alloc_pages_vma+0xa2/0x270
__handle_mm_fault+0xca9/0x10c0
handle_mm_fault+0xf3/0x210
__do_page_fault+0x240/0x4e0
trace_do_page_fault+0x37/0xe0
do_async_page_fault+0x19/0x70
async_page_fault+0x28/0x30
...
Out of memory: Kill process 492 (allocate) score 899 or sacrifice child
Killed process 492 (allocate) total-vm:2052368kB, anon-rss:1894576kB, file-rss:4kB, shmem-rss:0kB
allocate: page allocation failure: order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null)
allocate cpuset=/ mems_allowed=0
CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
__alloc_pages_slowpath+0xd32/0xd40
__alloc_pages_nodemask+0x245/0x260
alloc_pages_vma+0xa2/0x270
__handle_mm_fault+0xca9/0x10c0
handle_mm_fault+0xf3/0x210
__do_page_fault+0x240/0x4e0
trace_do_page_fault+0x37/0xe0
do_async_page_fault+0x19/0x70
async_page_fault+0x28/0x30
...
oom_reaper: reaped process 492 (allocate), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
...
allocate invoked oom-killer: gfp_mask=0x0(), nodemask=(null), order=0, oom_score_adj=0
allocate cpuset=/ mems_allowed=0
CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
oom_kill_process+0x219/0x3e0
out_of_memory+0x11d/0x480
pagefault_out_of_memory+0x68/0x80
mm_fault_error+0x8f/0x190
? handle_mm_fault+0xf3/0x210
__do_page_fault+0x4b2/0x4e0
trace_do_page_fault+0x37/0xe0
do_async_page_fault+0x19/0x70
async_page_fault+0x28/0x30
...
Out of memory: Kill process 233 (firewalld) score 10 or sacrifice child
Killed process 233 (firewalld) total-vm:246076kB, anon-rss:20956kB, file-rss:0kB, shmem-rss:0kB
There is a race window that the OOM reaper completes reclaiming the
first victim's memory while nothing but mutex_trylock() prevents the
first victim from calling out_of_memory() from pagefault_out_of_memory()
after memory allocation for page fault path failed due to being selected
as an OOM victim.
This is a side effect of commit 9a67f6488e ("mm: consolidate
GFP_NOFAIL checks in the allocator slowpath") because that commit
silently changed the behavior from
/* Avoid allocations with no watermarks from looping endlessly */
to
/*
* Give up allocations without trying memory reserves if selected
* as an OOM victim
*/
in __alloc_pages_slowpath() by moving the location to check TIF_MEMDIE
flag. I have noticed this change but I didn't post a patch because I
thought it is an acceptable change other than noise by warn_alloc()
because !__GFP_NOFAIL allocations are allowed to fail. But we
overlooked that failing memory allocation from page fault path makes
difference due to the race window explained above.
While it might be possible to add a check to pagefault_out_of_memory()
that prevents the first victim from calling out_of_memory() or remove
out_of_memory() from pagefault_out_of_memory(), changing
pagefault_out_of_memory() does not suppress noise by warn_alloc() when
allocating thread was selected as an OOM victim. There is little point
with printing similar backtraces and memory information from both
out_of_memory() and warn_alloc().
Instead, if we guarantee that current thread can try allocations with no
watermarks once when current thread looping inside
__alloc_pages_slowpath() was selected as an OOM victim, we can follow "who
can use memory reserves" rules and suppress noise by warn_alloc() and
prevent memory allocations from page fault path from calling
pagefault_out_of_memory().
If we take the comment literally, this patch would do
- if (test_thread_flag(TIF_MEMDIE))
- goto nopage;
+ if (alloc_flags == ALLOC_NO_WATERMARKS || (gfp_mask & __GFP_NOMEMALLOC))
+ goto nopage;
because gfp_pfmemalloc_allowed() returns false if __GFP_NOMEMALLOC is
given. But if I recall correctly (I couldn't find the message), the
condition is meant to apply to only OOM victims despite the comment.
Therefore, this patch preserves TIF_MEMDIE check.
Fixes: 9a67f6488e ("mm: consolidate GFP_NOFAIL checks in the allocator slowpath")
Link: http://lkml.kernel.org/r/201705192112.IAF69238.OQOHSJLFOFFMtV@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Roman Gushchin <guro@fb.com>
Tested-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org> [4.11]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcg_propagate_slab_attrs() abuses the sysfs attribute file functions
to propagate settings from the root kmem_cache to a newly created
kmem_cache. It does that with:
attr->show(root, buf);
attr->store(new, buf, strlen(bug);
Aside of being a lazy and absurd hackery this is broken because it does
not check the return value of the show() function.
Some of the show() functions return 0 w/o touching the buffer. That
means in such a case the store function is called with the stale content
of the previous show(). That causes nonsense like invoking
kmem_cache_shrink() on a newly created kmem_cache. In the worst case it
would cause handing in an uninitialized buffer.
This should be rewritten proper by adding a propagate() callback to
those slub_attributes which must be propagated and avoid that insane
conversion to and from ASCII, but that's too large for a hot fix.
Check at least the return value of the show() function, so calling
store() with stale content is prevented.
Steven said:
"It can cause a deadlock with get_online_cpus() that has been uncovered
by recent cpu hotplug and lockdep changes that Thomas and Peter have
been doing.
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(cpu_hotplug.lock);
lock(slab_mutex);
lock(cpu_hotplug.lock);
lock(slab_mutex);
*** DEADLOCK ***"
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705201244540.2255@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit db2aa7fd15 ("initramfs: allow again choice of the embedded
initram compression algorithm") introduced the possibility to select the
initramfs compression algorithm from Kconfig and while this is a nice
feature it broke the use case described below.
Here is what my build system does:
- kernel is initially configured not to have an initramfs included
- build the user space root file system
- re-configure the kernel to have an initramfs included
(CONFIG_INITRAMFS_SOURCE="/path/to/romfs") and set relevant
CONFIG_INITRAMFS options, in my case, no compression option
(CONFIG_INITRAMFS_COMPRESSION_NONE)
- kernel is re-built with these options -> kernel+initramfs image is
copied
- kernel is re-built again without these options -> kernel image is
copied
Building a kernel without an initramfs means setting this option:
CONFIG_INITRAMFS_SOURCE="" (and this one only)
whereas building a kernel with an initramfs means setting these options:
CONFIG_INITRAMFS_SOURCE="/home/fainelli/work/uclinux-rootfs/romfs /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev"
CONFIG_INITRAMFS_ROOT_UID=1000
CONFIG_INITRAMFS_ROOT_GID=1000
CONFIG_INITRAMFS_COMPRESSION_NONE=y
CONFIG_INITRAMFS_COMPRESSION=""
Commit db2aa7fd15 ("initramfs: allow again choice of the embedded
initram compression algorithm") is problematic because
CONFIG_INITRAMFS_COMPRESSION which is used to determine the
initramfs_data.cpio extension/compression is a string, and due to how
Kconfig works it will evaluate in order, how to assign it.
Setting CONFIG_INITRAMFS_COMPRESSION_NONE with CONFIG_INITRAMFS_SOURCE=""
cannot possibly work (because of the depends on INITRAMFS_SOURCE!=""
imposed on CONFIG_INITRAMFS_COMPRESSION ) yet we still get
CONFIG_INITRAMFS_COMPRESSION assigned to ".gz" because CONFIG_RD_GZIP=y
is set in my kernel, even when there is no initramfs being built.
So we basically end-up generating two initramfs_data.cpio* files, one
without extension, and one with .gz. This causes usr/Makefile to track
usr/initramfs_data.cpio.gz, and not usr/initramfs_data.cpio anymore,
that is also largely problematic after 9e3596b0c6 ("kbuild:
initramfs cleanup, set target from Kconfig") because we used to track
all possible initramfs_data files in the $(targets) variable before that
commit.
The end result is that the kernel with an initramfs clearly does not
contain what we expect it to, it has a stale initramfs_data.cpio file
built into it, and we keep re-generating an initramfs_data.cpio.gz file
which is not the one that we want to include in the kernel image proper.
The fix consists in hiding CONFIG_INITRAMFS_COMPRESSION when
CONFIG_INITRAMFS_SOURCE="". This puts us back in a state to the
pre-4.10 behavior where we can properly disable and re-enable initramfs
within the same kernel .config file, and be in control of what
CONFIG_INITRAMFS_COMPRESSION is set to.
Fixes: db2aa7fd15 ("initramfs: allow again choice of the embedded initram compression algorithm")
Fixes: 9e3596b0c6 ("kbuild: initramfs cleanup, set target from Kconfig")
Link: http://lkml.kernel.org/r/20170521033337.6197-1-f.fainelli@gmail.com
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Cc: P J P <ppandit@redhat.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While converting drm_[cm]alloc* helpers to kvmalloc* variants Chris
Wilson has wondered why we want to try kmalloc before vmalloc fallback
even for larger allocations requests. Let's clarify that one larger
physically contiguous block is less likely to fragment memory than many
scattered pages which can prevent more large blocks from being created.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170517080932.21423-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 7c30f352c8 ("jiffies.h: declare jiffies and jiffies_64 with
____cacheline_aligned_in_smp") removed a section specification from the
jiffies declaration that caused conflicts on some platforms.
Unfortunately this change broke the build for frv:
kernel/built-in.o: In function `__do_softirq': (.text+0x6460): relocation truncated to fit: R_FRV_GPREL12 against symbol
`jiffies' defined in *ABS* section in .tmp_vmlinux1
kernel/built-in.o: In function `__do_softirq': (.text+0x6574): relocation truncated to fit: R_FRV_GPREL12 against symbol
`jiffies' defined in *ABS* section in .tmp_vmlinux1
kernel/built-in.o: In function `pwq_activate_delayed_work': workqueue.c:(.text+0x15b9c): relocation truncated to fit: R_FRV_GPREL12 against
symbol `jiffies' defined in *ABS* section in .tmp_vmlinux1
...
Add __jiffy_arch_data to the declaration of jiffies and use it on frv to
include the section specification. For all other platforms
__jiffy_arch_data (currently) has no effect.
Fixes: 7c30f352c8 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp")
Link: http://lkml.kernel.org/r/20170516221333.177280-1-mka@chromium.org
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Igor Stoppa has noticed that __GFP_NOLOCKDEP can use a lower bit. At
the time commit 7e7844226f ("lockdep: allow to disable reclaim lockup
detection") was written we still had __GFP_OTHER_NODE but I have removed
it in commit 41b6167e8f ("mm: get rid of __GFP_OTHER_NODE") and forgot
to lower the bit value.
The current value is outside of __GFP_BITS_SHIFT so it cannot be used
actually.
Fixes: 7e7844226f ("lockdep: allow to disable reclaim lockup detection")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Igor Stoppa <igor.stoppa@nokia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
"err" needs to be left set to -EFAULT if split_huge_page succeeds.
Otherwise if "err" gets clobbered with zero and write_protect_page
fails, try_to_merge_one_page() will succeed instead of returning -EFAULT
and then try_to_merge_with_ksm_page() will continue thinking kpage is a
PageKsm when in fact it's still an anonymous page. Eventually it'll
crash in page_add_anon_rmap.
This has been reproduced on Fedora25 kernel but I can reproduce with
upstream too.
The bug was introduced in commit f765f54059 ("ksm: prepare to new THP
semantics") introduced in v4.5.
page:fffff67546ce1cc0 count:4 mapcount:2 mapping:ffffa094551e36e1 index:0x7f0f46673
flags: 0x2ffffc0004007c(referenced|uptodate|dirty|lru|active|swapbacked)
page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
page->mem_cgroup:ffffa09674bf0000
------------[ cut here ]------------
kernel BUG at mm/rmap.c:1222!
CPU: 1 PID: 76 Comm: ksmd Not tainted 4.9.3-200.fc25.x86_64 #1
RIP: do_page_add_anon_rmap+0x1c4/0x240
Call Trace:
page_add_anon_rmap+0x18/0x20
try_to_merge_with_ksm_page+0x50b/0x780
ksm_scan_thread+0x1211/0x1410
? prepare_to_wait_event+0x100/0x100
? try_to_merge_with_ksm_page+0x780/0x780
kthread+0xd9/0xf0
? kthread_park+0x60/0x60
ret_from_fork+0x25/0x30
Fixes: f765f54059 ("ksm: prepare to new THP semantics")
Link: http://lkml.kernel.org/r/20170513131040.21732-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Federico Simoncelli <fsimonce@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm-cpufreq:
cpufreq: kirkwood-cpufreq:- Handle return value of clk_prepare_enable()
cpufreq: cpufreq_register_driver() should return -ENODEV if init fails
Pull XFS fix from Darrick Wong:
"I've one more bugfix for you for 4.12-rc4: Fix an unmount hang due to
a race in io buffer accounting"
* tag 'xfs-4.12-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: use ->b_state to fix buffer I/O accounting release race
Pull ceph fix from Ilya Dryomov:
"A small fix for rbd FALLOC_FL_ZERO_RANGE/PUNCH_HOLE handling breakage
introduced in -rc1"
* tag 'ceph-for-4.12-rc4' of git://github.com/ceph/ceph-client:
rbd: implement REQ_OP_WRITE_ZEROES
If logout response is not received and ->ep_disconnect() is called then
close tcp conn by RST instead of FIN to cleanup conn resources
immediately.
Also move ->csk_push_tx_frames() above 'done:' to avoid calling
->csk_push_tx_frames() in error cases.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull device mapper fixes from Mike Snitzer:
- a DM verity fix for a mode when no salt is used
- a fix to DM to account for the possibility that PREFLUSH or FUA are
used without the SYNC flag if the underlying storage doesn't have a
volatile write-cache
- a DM ioctl memory allocation flag fix to use __GFP_HIGH to allow
emergency forward progress (by using memory reserves as last resort)
- a small DM integrity cleanup to use kvmalloc() instead of duplicating
the same
* tag 'for-4.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: make flush bios explicitly sync
dm ioctl: restore __GFP_HIGH in copy_params()
dm integrity: use kvmalloc() instead of dm_integrity_kvmalloc()
dm verity: fix no salt use case
Pull MD fixes from Shaohua Li:
"Several patches for MD. One notable is making flush bios sync, others
fix small issues"
* tag 'md/4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md: Make flush bios explicitely sync
md: report sector of stripes with check mismatches
md: uuid debug statement now in processor byte order.
md-cluster: fix potential lock issue in add_new_disk
Pull block fixes from Jens Axboe:
"A set of fixes that should go into the next -rc. This contains:
- A use-after-free in the request_list exit for the legacy IO path,
from Bart.
- A fix for CFQ, fixing a recent regression with the conversion to
higher resolution timing for iops mode. From Hou Tao.
- A single fix for nbd, split in two patches, fixing a leak of a data
structure.
- A regression fix from Keith, ensuring that callers of
blk_mq_update_nr_hw_queues() hold the right lock"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Avoid that blk_exit_rl() triggers a use-after-free
cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode
blk-mq: Take tagset lock when updating hw queues
nbd: don't leak nbd_config
nbd: nbd_reset() call in nbd_dev_add() is redundant
Pull drm displayport quirk support:
"DP quirk for usb c dongles.
As mentioned I have a separate request for fixing a regression, but
also keeping the broken hw working, for certain USB-C DP adapters they
require a minimised n/m parameters, but an attempt to do this
generically has failed, we need to quirk these specific adapters.
However doing it generically regressed some eDP panels.
This pull adds the infrastructure and a quirk for the adapter"
* tag 'drm-dp-quirk-for-v4.12-rc4' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Detect USB-C specific dongles before reducing M and N
drm/dp: start a DPCD based DP sink/branch device quirk database
drm/i915: use drm DP helper to read DPCD desc
drm/dp: add helper for reading DP sink/branch device desc from DPCD
commit d85b758f72 ("virtio_net: fix support for small rings")
was supposed to increase the buffer size for small rings but had an
unintentional side effect of decreasing it for large rings. This seems
to break some setups - it's not yet clear why, but increasing buffer
size back to what it was before helps.
Fixes: d85b758f72 ("virtio_net: fix support for small rings")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Qlogic's 82xx series adapter doesn't support
tunnel offloads, driver incorrectly assumes that it is
supported and causes firmware hang while running tunnel IO.
This patch fixes this by not advertising tunnel offloads
for 82xx adapters.
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding a vxlan interface to a socket isn't symmetrical, while adding
is done in vxlan_open() the deletion is done in vxlan_dellink().
This can cause a use-after-free error when we close the vxlan
interface before deleting it.
We add vxlan_vs_del_dev() to match vxlan_vs_add_dev() and call
it from vxlan_stop() to match the call from vxlan_open().
Fixes: 56ef9c909b ("vxlan: Move socket initialization to within rtnl scope")
Acked-by: Jiri Benc <jbenc@redhat.com>
Tested-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the sender switches its congestion control during loss
recovery, if the recovery is spurious then it may incorrectly
revert cwnd and ssthresh to the older values set by a previous
congestion control. Consider a congestion control (like BBR)
that does not use ssthresh and keeps it infinite: the connection
may incorrectly revert cwnd to an infinite value when switching
from BBR to another congestion control.
This patch fixes it by disallowing such cwnd undo operation
upon switching congestion control. Note that undo_marker
is not reset s.t. the packets that were incorrectly marked
lost would be corrected. We only avoid undoing the cwnd in
tcp_undo_cwnd_reduction().
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Take uld mutex to avoid race between cxgb_up() and
cxgb4_register_uld() to enable napi for the same uld
queue.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xfrm6_find_1stfragopt() may now return an error code and we must
not treat it as a length.
Fixes: 2423496af3 ("ipv6: Prevent overrun when parsing v6 header options")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sound fixes from Takashi Iwai:
"This contains the fixes for a few reported regression for HD-audio and
USB-audio. All small, trivial, and boring"
* tag 'sound-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix applying MSI dual-codec mobo quirk
ALSA: usb: Avoid VLA in mixer_us16x08.c
ALSA: usb: Fix a typo in Tascam US-16x08 mixer element
Revert "ALSA: usb-audio: purge needless variable length array"
Pull dmaengine fixes from Vinod Koul:
"Here is the dmaengine fixes request for 4.12. Fixes bunch of issues in
the driver, npthing exciting though..
- mv_xor_v2 driver fixes for handling descriptors, tx_submit
implementation, removing interrupt coalescing and setting DMA mask
properly
- fix usb-dmac DMAOR AE bit definition
- fix ep93xx start buffer from BASE0 and not drain the transfers in
terminate_all
- fix rcar-dmac to use right descriptor pointer for residue
calculation
- pl330 fix warn for irq freeup"
* tag 'dmaengine-fix-4.12-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: pl330: fix warning in pl330_remove
rcar-dmac: fixup descriptor pointer for descriptor mode
dmaengine: ep93xx: Don't drain the transfers in terminate_all()
dmaengine: ep93xx: Always start from BASE0
dmaengine: usb-dmac: Fix DMAOR AE bit definition
dmaengine: mv_xor_v2: set DMA mask to 40 bits
dmaengine: mv_xor_v2: remove interrupt coalescing
dmaengine: mv_xor_v2: fix tx_submit() implementation
dmaengine: mv_xor_v2: enable XOR engine after its configuration
dmaengine: mv_xor_v2: do not use descriptors not acked by async_tx
dmaengine: mv_xor_v2: properly handle wrapping in the array of HW descriptors
dmaengine: mv_xor_v2: handle mv_xor_v2_prep_sw_desc() error properly
Pull HID fixes from Jiri Kosina:
- corner-case oops fixes for Asus and Wacom drivers from Carlo Caione
and Jason Gerecke
- power management fix (reported on SIS0817 touchscreen) for i2c-hid
devices from Hans de Goede
- device-id-specific fixes and quirks from Hans de Goede, Diego Elio
Pettenò and Che-Liang Chiou
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: asus: Stop underlying hardware on remove
HID: i2c: Call acpi_device_fix_up_power for ACPI-enumerated devices
HID: asus: Add support for T100 keyboard
HID: elecom: extend to fix the descriptor for DEFT trackballs
HID: magicmouse: Set multi-touch keybits for Magic Mouse
HID: wacom: Have wacom_tpc_irq guard against possible NULL dereference
Pull livepatching fix from Jiri Kosina:
"Kconfig dependency fix for livepatching infrastructure from Miroslav
Benes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: Make livepatch dependent on !TRIM_UNUSED_KSYMS
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- revert a broken PAT commit that broke a number of systems
- fix two preemptability warnings/bugs that can trigger under certain
circumstances, in the debug code and in the microcode loader"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "x86/PAT: Fix Xorg regression on CPUs that don't support PAT"
x86/debug/32: Convert a smp_processor_id() call to raw to avoid DEBUG_PREEMPT warning
x86/microcode/AMD: Change load_microcode_amd()'s param to bool to fix preemptibility bug
Pull EFI fixes from Ingo Molnar:
"Misc fixes:
- three boot crash fixes for uncommon configurations
- silence a boot warning under virtualization
- plus a GCC 7 related (harmless) build warning fix"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/bgrt: Skip efi_bgrt_init() in case of non-EFI boot
x86/efi: Correct EFI identity mapping under 'efi=old_map' when KASLR is enabled
x86/efi: Disable runtime services on kexec kernel if booted with efi=old_map
efi: Remove duplicate 'const' specifiers
efi: Don't issue error message when booted under Xen
commit a39be606f9 ("drm: Do a full device unregister when unplugging")
causes backtraces like this one when unplugging an usb drm device while
it is in use:
usb 2-3: USB disconnect, device number 25
------------[ cut here ]------------
WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:424
drm_mode_config_cleanup+0x220/0x280 [drm]
...
RIP: 0010:drm_mode_config_cleanup+0x220/0x280 [drm]
...
Call Trace:
gm12u320_modeset_cleanup+0xe/0x10 [gm12u320]
gm12u320_driver_unload+0x35/0x70 [gm12u320]
drm_dev_unregister+0x3c/0xe0 [drm]
drm_unplug_dev+0x12/0x60 [drm]
gm12u320_usb_disconnect+0x36/0x40 [gm12u320]
usb_unbind_interface+0x72/0x280
device_release_driver_internal+0x158/0x210
device_release_driver+0x12/0x20
bus_remove_device+0x104/0x180
device_del+0x1d2/0x350
usb_disable_device+0x9f/0x270
usb_disconnect+0xc6/0x260
...
[drm:drm_mode_config_cleanup [drm]] *ERROR* connector Unknown-1 leaked!
------------[ cut here ]------------
WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:458
drm_mode_config_cleanup+0x268/0x280 [drm]
...
<same Call Trace>
---[ end trace 80df975dae439ed6 ]---
general protection fault: 0000 [#1] SMP
...
Call Trace:
? __switch_to+0x225/0x450
drm_mode_rmfb_work_fn+0x55/0x70 [drm]
process_one_work+0x193/0x3c0
worker_thread+0x4a/0x3a0
...
RIP: drm_framebuffer_remove+0x62/0x3f0 [drm] RSP: ffffb776c39dfd98
---[ end trace 80df975dae439ed7 ]---
After which the system is unusable this is caused by drm_dev_unregister
getting called immediately on unplug, which calls the drivers unload
function which calls drm_mode_config_cleanup which removes the framebuffer
object while userspace is still holding a reference to it.
Reverting commit a39be606f9 ("drm: Do a full device unregister
when unplugging") leads to the following oops on unplug instead,
when userspace closes the last fd referencing the drm_dev:
sysfs group 'power' not found for kobject 'card1-Unknown-1'
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2459 at fs/sysfs/group.c:237
sysfs_remove_group+0x80/0x90
...
RIP: 0010:sysfs_remove_group+0x80/0x90
...
Call Trace:
dpm_sysfs_remove+0x57/0x60
device_del+0xfd/0x350
device_unregister+0x1a/0x60
drm_sysfs_connector_remove+0x39/0x50 [drm]
drm_connector_unregister+0x5a/0x70 [drm]
drm_connector_unregister_all+0x45/0xa0 [drm]
drm_modeset_unregister_all+0x12/0x30 [drm]
drm_dev_unregister+0xca/0xe0 [drm]
drm_put_dev+0x32/0x60 [drm]
drm_release+0x2f3/0x380 [drm]
__fput+0xdf/0x1e0
...
---[ end trace ecfb91ac85688bbe ]---
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
IP: down_write+0x1f/0x40
...
Call Trace:
debugfs_remove_recursive+0x55/0x1b0
drm_debugfs_connector_remove+0x21/0x40 [drm]
drm_connector_unregister+0x62/0x70 [drm]
drm_connector_unregister_all+0x45/0xa0 [drm]
drm_modeset_unregister_all+0x12/0x30 [drm]
drm_dev_unregister+0xca/0xe0 [drm]
drm_put_dev+0x32/0x60 [drm]
drm_release+0x2f3/0x380 [drm]
__fput+0xdf/0x1e0
...
---[ end trace ecfb91ac85688bbf ]---
This is caused by the revert moving back to drm_unplug_dev calling
drm_minor_unregister which does:
device_del(minor->kdev);
dev_set_drvdata(minor->kdev, NULL); /* safety belt */
drm_debugfs_cleanup(minor);
Causing the sysfs entries to already be removed even though we still
have references to them in e.g. drm_connector.
Note we must call drm_minor_unregister to notify userspace of the unplug
of the device, so calling drm_dev_unregister is not completely wrong the
problem is that drm_dev_unregister does too much.
This commit fixes drm_unplug_dev by not only reverting
commit a39be606f9 ("drm: Do a full device unregister when unplugging")
but by also adding a call to drm_modeset_unregister_all before the
drm_minor_unregister calls to make sure all sysfs entries are removed
before calling device_del(minor->kdev) thereby also fixing the second
set of oopses caused by just reverting the commit.
Fixes: a39be606f9 ("drm: Do a full device unregister when unplugging")
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeffy <jeffy.chen@rock-chips.com>
Cc: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com>
Reported-by: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601115430.4113-1-hdegoede@redhat.com
Johannes Berg says:
====================
Just two fixes:
* fix the per-CPU drop counters to not be added to the
rx_packets counter, but really the drop counter
* fix TX aggregation start/stop callback races by setting
bits instead of allocating and queueing an skb
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
dsa_switch_suspend() and dsa_switch_resume() are functions that belong in
net/dsa/dsa.c and are not part of the legacy platform support code.
Fixes: a6a71f19fe ("net: dsa: isolate legacy code")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
On SYSTEMPORT Lite, since we have the main interrupt source in the first
cell, the second cell is the Wake-on-LAN interrupt, yet the code was not
properly updated to fetch the second cell, and instead looked at the
third and non-existing cell for Wake-on-LAN.
Fixes: 44a4524c54 ("net: systemport: Add support for SYSTEMPORT Lite")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The BAD_MADT_GICC_ENTRY() macro checks if a GICC MADT entry passes
muster from an ACPI specification standpoint. Current macro detects the
MADT GICC entry length through ACPI firmware version (it changed from 76
to 80 bytes in the transition from ACPI 5.1 to ACPI 6.0 specification)
but always uses (erroneously) the ACPICA (latest) struct (ie struct
acpi_madt_generic_interrupt - that is 80-bytes long) length to check if
the current GICC entry memory record exceeds the MADT table end in
memory as defined by the MADT table header itself, which may result in
false negatives depending on the ACPI firmware version and how the MADT
entries are laid out in memory (ie on ACPI 5.1 firmware MADT GICC
entries are 76 bytes long, so by adding 80 to a GICC entry start address
in memory the resulting address may well be past the actual MADT end,
triggering a false negative).
Fix the BAD_MADT_GICC_ENTRY() macro by reshuffling the condition checks
and update them to always use the firmware version specific MADT GICC
entry length in order to carry out boundary checks.
Fixes: b6cfb27737 ("ACPI / ARM64: add BAD_MADT_GICC_ENTRY() macro")
Reported-by: Julien Grall <julien.grall@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Al Stone <ahs3@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When the association between a queue device and the
driver is released via unbind and later re-associated
the queue device was not operational any more. Reason
was a wrong administration of the card/queue lists
within the ap device driver.
This patch introduces revised card/queue list handling
within the ap device driver: when an ap device is
detected it is initial not added to the card/queue list
any more. With driver probe the card device is added to
the card list/the queue device is added to the queue list
within a card. With driver remove the device is removed
from the card/queue list. Additionally there are some
situations within the ap device live where the lists
need update upon card/queue device release (for example
device hot unplug or suspend/resume).
Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We are missing a call to hid_hw_stop() on the remove hook.
Among other things this is causing an Oops when (re-)starting GNOME /
upowerd / ... after the module has been already rmmod-ed.
Signed-off-by: Carlo Caione <carlo@endlessm.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The PN_INT_ENA register should be used after usb3_pn_change() is called.
So, this patch moves the access from renesas_usb3_stop_controller() to
usb3_disable_pipe_n().
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This controller disallows to change the PIPE until reading/writing
a packet finishes. However. the previous code is not enough to hold
the lock in some functions. So, this patch fixes it.
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch fixes an issue that this driver is possible to cause
deadlock by double-spinclocked in renesas_usb3_stop_controller().
So, this patch removes spinlock API calling in renesas_usb3_stop().
(In other words, the previous code had a redundant lock.)
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch fixes an issue that this driver is possible to access
the registers before pm_runtime_get_sync() if a gadget driver is
installed first. After that, oops happens on R-Car Gen3 environment.
To avoid it, this patch changes the pm_runtime call timing from
probe/remove to udc_start/udc_stop.
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
f_mass_storage has a memorry barrier issue with the sleep and wake
functions that can cause a deadlock. This results in intermittent hangs
during MSC file transfer. The host will reset the device after receiving
no response to resume the transfer. This issue is seen when dwc3 is
processing 2 transfer-in-progress events at the same time, invoking
completion handlers for CSW and CBW. Also this issue occurs depending on
the system timing and latency.
To increase the chance to hit this issue, you can force dwc3 driver to
wait and process those 2 events at once by adding a small delay (~100us)
in dwc3_check_event_buf() whenever the request is for CSW and read the
event count again. Avoid debugging with printk and ftrace as extra
delays and memory barrier will mask this issue.
Scenario which can lead to failure:
-----------------------------------
1) The main thread sleeps and waits for the next command in
get_next_command().
2) bulk_in_complete() wakes up main thread for CSW.
3) bulk_out_complete() tries to wake up the running main thread for CBW.
4) thread_wakeup_needed is not loaded with correct value in
sleep_thread().
5) Main thread goes to sleep again.
The pattern is shown below. Note the 2 critical variables.
* common->thread_wakeup_needed
* bh->state
CPU 0 (sleep_thread) CPU 1 (wakeup_thread)
============================== ===============================
bh->state = BH_STATE_FULL;
smp_wmb();
thread_wakeup_needed = 0; thread_wakeup_needed = 1;
smp_rmb();
if (bh->state != BH_STATE_FULL)
sleep again ...
As pointed out by Alan Stern, this is an R-pattern issue. The issue can
be seen when there are two wakeups in quick succession. The
thread_wakeup_needed can be overwritten in sleep_thread, and the read of
the bh->state maybe reordered before the write to thread_wakeup_needed.
This patch applies full memory barrier smp_mb() in both sleep_thread()
and wakeup_thread() to ensure the order which the thread_wakeup_needed
and bh->state are written and loaded.
However, a better solution in the future would be to use wait_queue
method that takes care of managing memory barrier between waker and
waiter.
Cc: <stable@vger.kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
USB support in the Meson8 SoCs is provided by a DWC2 controller which
works with the same settings as Meson8b and GXBB. Using the generic
"snps,dwc2" binding results in an endless stream of "Overcurrent change
detected" messages.
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
When removing a device with less than 9 IRQs (AMBA_NR_IRQS), we'll get a
big WARN_ON from devres.c because pl330_remove calls devm_free_irqs for
unallocated irqs. Similarly to pl330_probe, check that IRQ number is
present before calling devm_free_irq.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Commit 4e552c8cb5 ("leds: add LED_ON brightness as boolean value")
has introduced the LED_ON enumeration value that can be used
instead of LED_FULL which has more of a linear value.
Because the tm2-touchscreen doesn't have brightness levels, but
it's a simple on/off led, use LED_ON instead of LED_FULL.
Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Reviewed-by: Jaechul Lee <jcsing.lee@samsung.com>
Tested-by: Jaechul Lee <jcsing.lee@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
DP sink specific quirks
* tag 'topic/dp-quirks-2017-05-31' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Detect USB-C specific dongles before reducing M and N
drm/dp: start a DPCD based DP sink/branch device quirk database
drm/i915: use drm DP helper to read DPCD desc
drm/dp: add helper for reading DP sink/branch device desc from DPCD
A fix to MAINTAINERS file adding relevant device-tree
files to mach-davinci entry.
* tag 'davinci-fixes-for-v4.12-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
MAINTAINERS: add device-tree files to TI DaVinci entry
Signed-off-by: Olof Johansson <olof@lixom.net>
mvebu non critical fixes for 4.12
Update MAINTAINER file for irqchip related drivers to Marvell EBU
* tag 'mvebu-fixes-non-critical-4.12-1' of git://git.infradead.org/linux-mvebu:
MAINTAINERS: add irqchip related drivers to Marvell EBU maintainers
MAINTAINERS: sort F entries for Marvell EBU maintainers
Signed-off-by: Olof Johansson <olof@lixom.net>
mvebu fixes for 4.12
Fix the interrupt description of the crypto node for device tree of
the Armada 7K/8K SoCs
* tag 'mvebu-fixes-4.12-1' of git://git.infradead.org/linux-mvebu: (316 commits)
arm64: marvell: dts: fix interrupts in 7k/8k crypto nodes
+ Linux 4.12-rc2
Signed-off-by: Olof Johansson <olof@lixom.net>
The STMicroelectronics dedicated mailing list kernel@stlinux.com
is no more available, remove it to avoid bouncing mails.
Several request to create a new mailing list has been send by Benjamin
Gaignard and me but without any answers.
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Reset controller fixes for v4.12
- Set hi6220_reset driver module license to GPL v2 to fix module loading.
* tag 'reset-fixes-for-4.12' of git://git.pengutronix.de/git/pza/linux:
reset: hi6220: Set module license so that it can be loaded
Signed-off-by: Olof Johansson <olof@lixom.net>
Fixes for 4.12:
Fix two compilation issues
* tag 'at91-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
ARM: at91: select CONFIG_ARM_CPU_SUSPEND
memory: atmel-ebi: mark PM ops as __maybe_unused
Signed-off-by: Olof Johansson <olof@lixom.net>
Most of DT files in ARM use #include "..." to make pre-processor
include DT in the same directory, but this is one of the exceptional
files that use #include <...> for that.
Fix it to remove -I$(srctree)/arch/$(SRCARCH)/boot/dts path from
dtc_cpp_flags.
ARM: dts: versatile: use #include "..." to include DT in the same directory
Most of DT files in ARM use #include "..." to make pre-processor
include DT in the same directory, but we have 3 exceptional files
that use #include <...> for that.
They must be fixed to remove -I$(srctree)/arch/$(SRCARCH)/boot/dts
path from dtc_cpp_flags.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull nfsd fixes from Bruce Fields:
"Revert patch accidentally included in the merge window pull request,
and fix a crash that was likely a result of buggy client behavior"
* tag 'nfsd-4.12-1' of git://linux-nfs.org/~bfields/linux:
nfsd4: fix null dereference on replay
nfsd: Revert "nfsd: check for oversized NFSv2/v3 arguments"
Pull gcc-plugin prepwork from Kees Cook:
"Use designated initializers for mtk-vcodec, powerplay, amdgpu, and
sgi-xp. Use ERR_CAST() to avoid cross-structure cast in ocf2, ntfs,
and NFS.
Christoph Hellwig recommended that I send these fixes now, rather than
waiting for the v4.13 merge window. These are all initializer and cast
fixes needed for the future randstruct plugin that haven't been picked
up by the respective maintainers"
* tag 'gcc-plugins-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
mtk-vcodec: Use designated initializers
drm/amd/powerplay: Use designated initializers
drm/amdgpu: Use designated initializers
sgi-xp: Use designated initializers
ocfs2: Use ERR_CAST() to avoid cross-structure cast
ntfs: Use ERR_CAST() to avoid cross-structure cast
NFS: Use ERR_CAST() to avoid cross-structure cast
Commit 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB and
ROCE specific fields) moved the service_id to be specific attribute
for IB and OPA SA Path Record, and thus wasn't assigned for RoCE.
This caused to the following kernel panic in the CMA request handler flow:
[ 27.074594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 27.074731] IP: __radix_tree_lookup+0x1d/0xe0
...
[ 27.075356] Workqueue: ib_cm cm_work_handler [ib_cm]
[ 27.075401] task: ffff88022e3b8000 task.stack: ffffc90001298000
[ 27.075449] RIP: 0010:__radix_tree_lookup+0x1d/0xe0
...
[ 27.075979] Call Trace:
[ 27.076015] radix_tree_lookup+0xd/0x10
[ 27.076055] cma_ps_find+0x59/0x70 [rdma_cm]
[ 27.076097] cma_id_from_event+0xd2/0x470 [rdma_cm]
[ 27.076144] ? ib_init_ah_from_path+0x39a/0x590 [ib_core]
[ 27.076193] cma_req_handler+0x25/0x480 [rdma_cm]
[ 27.076237] cm_process_work+0x25/0x120 [ib_cm]
[ 27.076280] ? cm_get_bth_pkey.isra.62+0x3c/0xa0 [ib_cm]
[ 27.076350] cm_req_handler+0xb03/0xd40 [ib_cm]
[ 27.076430] ? sched_clock_cpu+0x11/0xb0
[ 27.076478] cm_work_handler+0x194/0x1588 [ib_cm]
[ 27.076525] process_one_work+0x160/0x410
[ 27.076565] worker_thread+0x137/0x4a0
[ 27.076614] kthread+0x112/0x150
[ 27.076684] ? max_active_store+0x60/0x60
[ 27.077642] ? kthread_park+0x90/0x90
[ 27.078530] ret_from_fork+0x2c/0x40
This patch moves it back to the common SA Path Record structure
and removes the redundant setter and getter.
Tested on Connect-IB and Connect-X4 in Infiniband and RoCE respectively.
Fixes: 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB ands
ROCE specific fields)
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This change will optimize kernel memory deregistration operations.
__ib_umem_release() used to call set_page_dirty_lock() against every
writable page in its memory region. Its purpose is to keep data
synced between CPU and DMA device when swapping happens after mem
deregistration ops. Now we choose not to set page dirty bit if it's
already set by kernel prior to calling __ib_umem_release(). This
reduces memory deregistration time by half or even more when we ran
application simulation test program.
Signed-off-by: Qing Huang <qing.huang@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Commit 5752075144 ("IB/SA: Add OPA path record type") introduced
new local function __ib_copy_path_rec_to_user, but didn't limit its
scope. This produces the following sparse warning:
drivers/infiniband/core/uverbs_marshall.c:99:6: warning:
symbol '__ib_copy_path_rec_to_user' was not declared. Should it be
static?
In addition, it used sizeof ... notations instead of sizeof(...), which
is correct in C, but a little bit misleading. Let's change it too.
Fixes: 5752075144 ("IB/SA: Add OPA path record type")
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
RDMA netlink is part of ib_core, hence ibnl_chk_listeners(),
ibnl_init() and ibnl_cleanup() don't need to be published
in public header file.
Let's remove EXPORT_SYMBOL from ibnl_chk_listeners() and move all these
functions to private header file.
CC: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
If srp_init_qp() fails at srp_create_ch_ib() then ch->send_cq
may be NULL.
Calling directly to ib_destroy_qp() is sufficient because
no work requests were posted on the created qp.
Fixes: 9294000d6d ("IB/srp: Drain the send queue before destroying a QP")
Cc: <stable@vger.kernel.org>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>--
Signed-off-by: Doug Ledford <dledford@redhat.com>
ipoib_dev_uninit_default() call is used in ipoib_main.c file only
and it generates the following warning from smatch tool:
drivers/infiniband/ulp/ipoib/ipoib_main.c:1593:6: warning:
symbol 'ipoib_dev_uninit_default' was not declared. Should it
be static?
so let's declare that function as static.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
HW can implement UMR wqe re-transmission in various ways.
Thus, add HCA cap to distinguish the needed fence for UMR to make
sure that the wqe wouldn't fail on mkey checks.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The cited patch added a type field to structures ib_ah and rdma_ah_attr.
Function mlx4_ib_query_ah() builds an rdma_ah_attr structure from the
data in an mlx4_ib_ah structure (which contains both an ib_ah structure
and an address vector).
For mlx4_ib_query_ah() to work properly, the type field in the contained
ib_ah structure must be set correctly.
In the outgoing MAD tunneling flow, procedure mlx4_ib_multiplex_mad()
paravirtualizes a MAD received from a slave and sends the processed
mad out over the wire. During this processing, it populates an
mlx4_ib_ah structure and calls mlx4_ib_query_ah().
The cited commit overlooked setting the type field in the contained
ib_ah structure before invoking mlx4_ib_query_ah(). As a result, the
type field remained uninitialized, and the rdma_ah_attr structure was
incorrectly built. This resulted in improperly built MADs being sent out
over the wire.
This patch properly initializes the type field in the contained ib_ah
structure before calling mlx4_ib_query_ah(). The rdma_ah_attr structure
is then generated correctly.
Fixes: 44c58487d5 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The handling of IB_RDMA_WRITE_ONLY_WITH_IMMEDIATE will leak a memory
reference when a buffer cannot be allocated for returning the immediate
data.
The issue is that the rkey validation has already occurred and the RNR
nak fails to release the reference that was fruitlessly gotten. The
the peer will send the identical single packet request when its RNR
timer pops.
The fix is to release the held reference prior to the rnr nak exit.
This is the only sequence the requires both rkey validation and the
buffer allocation on the same packet.
Cc: Stable <stable@vger.kernel.org> # 4.7+
Tested-by: Tadeusz Struk <tadeusz.struk@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Keep VL15 credits at 0 during LNI, before link-up. Store
VL15 credits value during verify cap interrupt and set
in after link-up. This addresses an issue where VL15 MAD
packets could be sent by one side of the link before
the other side is ready to receive them.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The Omni-Path adapter driver fails to load on the ppc64le platform
due to invalid PCI setup.
This patch makes the PCI configuration more robust and will
fix 64 bit addressing for ppc64le.
Signed-off-by: Steven L Roberts <robers97@gmail.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Take care of ipv6 checks while computing header length for deducing mtu
size of ipv6 servers. Due to the incorrect header length computation for
ipv6 servers, wrong mss is reported to the peer (client).
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The patch 761e19a504 (RDMA/iw_cxgb4: Handle return value of
c4iw_ofld_send() in abort_arp_failure()) from May 6, 2016
leads to the following static checker warning:
drivers/infiniband/hw/cxgb4/cm.c:575 abort_arp_failure()
warn: passing freed memory 'skb'
Also fixes skb leak when l2t resolution fails
Fixes: 761e19a504 (RDMA/iw_cxgb4: Handle return value of
c4iw_ofld_send() in abort_arp_failure())
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Explicitly ACK the MPA Reply frame so the peer
does not retransmit the frame.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Don't set control flag for 0-length FULPDU (Send) RTR indication
in the enhanced MPA Request/Reply frames, because it isn't supported.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Some error paths in i40iw_initialize_dev are doing
additional and unnecessary work before exiting.
Correctly free resources allocated prior to error
and return with correct status code.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intelcom>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Don't set control flag for 0-length FULPDU (Send)
RTR indication in the enhanced MPA Request/Reply
frames, because it isn't supported.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In the commit enabling per-CPU station statistics, I inadvertedly
copy-pasted some code to update rx_packets and forgot to change it
to update rx_dropped_misc. Fix that.
This addresses https://bugzilla.kernel.org/show_bug.cgi?id=195953.
Fixes: c9c5962b56 ("mac80211: enable collecting station statistics per-CPU")
Reported-by: Petru-Florin Mihancea <petrum@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Leonard Crestez says:
====================
ARM: imx6ul-14x14-evk: Fix suspend over nfs by phy
Right now attempting doing suspend/resume while root is mounted over NFS
hangs on imx6ul-14x14-evk. This is happening because ksz8081 phy fixups are
lost on resume.
Fix this by using equivalent devicetree properties instead of a phy fixup
and handling those properties on resume in the micrel driver.
In theory it might now be possible to remove the phy fixup from mach-imx6ul
entirely but it is possible that this would break other imx6ul boards which
use the same phy. The solution would be to patch their dts but it's not
clear how to identify affected boards.
This code is shared with imx6ull-14x14-evk but 6ull suspend needs an
unrelated patch: https://lkml.org/lkml/2017/5/30/584
This is something of a corner case so there is no CC: stable.
Changes since v1: https://lkml.org/lkml/2017/5/30/672
* Split a kszphy_config_reset function for stuff shared between
config_init and resume. Calling config_init directly could be an option but
on some HW variants it does extra stuff like parsing devicetree options.
That would not be appropriate for resume code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
These bits seem to be lost after a suspend/resume cycle so just set them
again. Do this by splitting the handling of these bits into a function
that is also called on resume.
This patch fixes ethernet suspend/resume on imx6ul-14x14-evk boards.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now mach-imx6ul registers a fixup for the ksz8081 phy. The same
register values can be set through the micrel phy driver by using dts
properties.
This seems preferable and allows cleanly fixing suspend/resume.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver may sleep under a read spin lock, and the function call path is:
send_socklist (acquire the lock by read_lock)
skb_copy(GFP_KERNEL) --> may sleep
To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 0c1d70af92 ("net: use dst_cache for vxlan device"),
cached dst entries could be leaked when more than one remote was
present for a given vxlan_fdb entry, causing subsequent netns
operations to block indefinitely and "unregister_netdevice: waiting
for lo to become free." messages to appear in the kernel log.
Fix by properly releasing cached dst and freeing resources in this
case.
Fixes: 0c1d70af92 ("net: use dst_cache for vxlan device")
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From Boris:
"""
This pull request contains several fixes to the core and the tango
driver.
tango fixes:
* Add missing MODULE_DEVICE_TABLE() in tango_nand.c
* Update the number of corrected bitflips
core fixes:
* Fix a long standing memory leak in nand_scan_tail()
* Fix several bugs introduced by the per-vendor init/detection
infrastructure (introduced in 4.12)
* Add a static specifier to nand_ooblayout_lp_hamming_ops definition
"""
Pull KVM fixes from Paolo Bonzini:
"Many small x86 bug fixes: SVM segment registers access rights, nested
VMX, preempt notifiers, LAPIC virtual wire mode, NMI injection"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix nmi injection failure when vcpu got blocked
KVM: SVM: do not zero out segment attributes if segment is unusable or not present
KVM: SVM: ignore type when setting segment registers
KVM: nVMX: fix nested_vmx_check_vmptr failure paths under debugging
KVM: x86: Fix virtual wire mode
KVM: nVMX: Fix handling of lmsw instruction
KVM: X86: Fix preempt the preemption timer cancel
Pull Reiserfs and GFS2 fixes from Jan Kara:
"Fixes to GFS2 & Reiserfs for the fallout of the recent WRITE_FUA
cleanup from Christoph.
Fixes for other filesystems were already merged by respective
maintainers."
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
reiserfs: Make flush bios explicitely sync
gfs2: Make flush bios explicitely sync
Pull SCSI target fixes from Nicholas Bellinger:
"Here are the target-pending fixes for v4.12-rc4:
- ibmviscsis ABORT_TASK handling fixes that missed the v4.12 merge
window. (Bryant Ly and Michael Cyr)
- Re-add a target-core check enforcing WRITE overflow reject that was
relaxed in v4.3, to avoid unsupported iscsi-target immediate data
overflow. (nab)
- Fix a target-core-user OOPs during device removal. (MNC + Bryant
Ly)
- Fix a long standing iscsi-target potential issue where kthread exit
did not wait for kthread_should_stop(). (Jiang Yi)
- Fix a iscsi-target v3.12.y regression OOPs involving initial login
PDU processing during asynchronous TCP connection close. (MNC +
nab)
This is a little larger than usual for an -rc4, primarily due to the
iscsi-target v3.12.y regression OOPs bug-fix.
However, it's an important patch as MNC + Hannes where both able to
trigger it using a reduced iscsi initiator login timeout combined with
a backend taking a long time to complete I/Os during iscsi login
driven session reinstatement"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target: Always wait for kthread_should_stop() before kthread exit
iscsi-target: Fix initial login PDU asynchronous socket close OOPs
tcmu: fix crash during device removal
target: Re-add check to reject control WRITEs with overflow data
ibmvscsis: Fix the incorrect req_lim_delta
ibmvscsis: Clear left-over abort_cmd pointers
arch/sparc/kernel/ds.c: In function ‘register_services’:
arch/sparc/kernel/ds.c:912:3: error: ‘strcpy’: writing at least 1 byte
into a region of size 0 overflows the destination
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the transition of NO_STP -> KERNEL_STP was fixed by always calling
mod_timer in br_stp_start, it introduced a new regression which causes
the timer to be armed even when the bridge is down, and since we stop
the timers in its ndo_stop() function, they never get disabled if the
device is destroyed before it's upped.
To reproduce:
$ while :; do ip l add br0 type bridge hello_time 100; brctl stp br0 on;
ip l del br0; done;
CC: Xin Long <lucien.xin@gmail.com>
CC: Ivan Vecera <cera@cera.cz>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: 6d18c732b9 ("bridge: start hello_timer when enabling KERNEL_STP in br_stp_start")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Apparently multi-cos isn't working for bnx2x quite some time -
driver implements ndo_select_queue() to allow queue-selection
for FCoE, but the regular L2 flow would cause it to modulo the
fallback's result by the number of queues.
The fallback would return a queue matching the needed tc
[via __skb_tx_hash()], but since the modulo is by the number of TSS
queues where number of TCs is not accounted, transmission would always
be done by a queue configured into using TC0.
Fixes: ada7c19e6d ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change t4fw_version.h to update latest firmware version
number to 1.16.45.0.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NETLINK_F_LISTEN_ALL_NSID otion enables to listen all netns that have a
nsid assigned into the netns where the netlink socket is opened.
The nsid is sent as metadata to userland, but the existence of this nsid is
checked only for netns that are different from the socket netns. Thus, if
no nsid is assigned to the socket netns, NETNSA_NSID_NOT_ASSIGNED is
reported to the userland. This value is confusing and useless.
After this patch, only valid nsid are sent to userland.
Reported-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver may sleep under a write spin lock, and the function
call path is:
qlcnic_82xx_hw_write_wx_2M (acquire the lock by write_lock_irqsave)
crb_win_lock
qlcnic_pcie_sem_lock
usleep_range
qlcnic_82xx_hw_read_wx_2M (acquire the lock by write_lock_irqsave)
crb_win_lock
qlcnic_pcie_sem_lock
usleep_range
To fix it, the usleep_range is replaced with udelay.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we have to recover relocation during mount, we'll ultimately have to
evict the orphan inode. That goes through the reservation dance, where
priority_reclaim_metadata_space and flush_space expect fs_info->fs_root
to be valid. That's the next thing to be set up during mount, so we
crash, almost always in flush_space trying to join the transaction
but priority_reclaim_metadata_space is possible as well. This call
path has been problematic in the past WRT whether ->fs_root is valid
yet. Commit 957780eb27 (Btrfs: introduce ticketed enospc
infrastructure) added new users that are called in the direct path
instead of the async path that had already been worked around.
The thing is that we don't actually need the fs_root, specifically, for
anything. We either use it to determine whether the root is the
chunk_root for use in choosing an allocation profile or as a root to pass
btrfs_join_transaction before immediately committing it. Anything that
isn't the chunk root works in the former case and any root works in
the latter.
A simple fix is to use a root we know will always be there: the
extent_root.
Cc: <stable@vger.kernel.org> # v4.8+
Fixes: 957780eb27 (Btrfs: introduce ticketed enospc infrastructure)
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Variables start_idx and end_idx are supposed to hold a page index
derived from the file offsets. The int type is not the right one though,
offsets larger than 1 << 44 will get silently trimmed off the high bits.
(1 << 44 is 16TiB)
What can go wrong, if start is below the boundary and end gets trimmed:
- if there's a page after start, we'll find it (radix_tree_gang_lookup_slot)
- the final check "if (page->index <= end_idx)" will unexpectedly fail
The function will return false, ie. "there's no page in the range",
although there is at least one.
btrfs_page_exists_in_range is used to prevent races in:
* in hole punching, where we make sure there are not pages in the
truncated range, otherwise we'll wait for them to finish and redo
truncation, but we're going to replace the pages with holes anyway so
the only problem is the intermediate state
* lock_extent_direct: we want to make sure there are no pages before we
lock and start DIO, to prevent stale data reads
For practical occurence of the bug, there are several constaints. The
file must be quite large, the affected range must cross the 16TiB
boundary and the internal state of the file pages and pending operations
must match. Also, we must not have started any ordered data in the
range, otherwise we don't even reach the buggy function check.
DIO locking tries hard in several places to avoid deadlocks with
buffered IO and avoids waiting for ranges. The worst consequence seems
to be stale data read.
CC: Liu Bo <bo.li.liu@oracle.com>
CC: stable@vger.kernel.org # 3.16+
Fixes: fc4adbff82 ("btrfs: Drop EXTENT_UPTODATE check in hole punching and direct locking")
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The s390 architecture maps sys_mmap (nr 90) into sys_old_mmap. For this
reason perf trace can't find the proper syscall event to get args format
from and displays it wrongly as 'continued'.
To fix that fill the "alias" field with "old_mmap" for trace's mmap record
to get the correct translation.
Before:
0.042 ( 0.011 ms): vest/43052 fstat(statbuf: 0x3ffff89fd90 ) = 0
0.042 ( 0.028 ms): vest/43052 ... [continued]: mmap()) = 0x3fffd6e2000
0.072 ( 0.025 ms): vest/43052 read(buf: 0x3fffd6e2000, count: 4096 ) = 6
After:
0.045 ( 0.011 ms): fstat(statbuf: 0x3ffff8a0930 ) = 0
0.057 ( 0.018 ms): mmap(arg: 0x3ffff8a0858 ) = 0x3fffd14a000
0.076 ( 0.025 ms): read(buf: 0x3fffd14a000, count: 4096 ) = 6
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170531113557.19175-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We are running low on CPU feature bits, so we only want to use them when
it's really necessary.
CPU_FTR_SUBCORE is only used in one place, and only in C, so we don't
need it in order to make asm patching work. It can only be set on
"Power8" CPUs, which in practice means POWER8, POWER8E and POWER8NVL.
There are no plans to implement it on future CPUs, but if there ever
were we could retrofit it then.
Although KVM uses subcores, it never looks at the CPU feature, it either
looks at the ISA level or the threads_per_subcore value.
So drop the CPU feature and do a PVR check instead. Drop the device tree
"subcore" feature as we no longer support doing anything with it, and we
will drop it from skiboot too.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When adding or removing memory, the aa_index (affinity value) for the
memblock must also be converted to match the endianness of the rest
of the 'ibm,dynamic-memory' property. Otherwise, subsequent retrieval
of the attribute will likely lead to non-existent nodes, followed by
using the default node in the code inappropriately.
Fixes: 5f97b2a0d1 ("powerpc/pseries: Implement memory hotplug add in the kernel")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
If a process dumps core while it has SPU contexts active then we have
code to also dump information about the SPU contexts.
Unfortunately it's been broken for 3 1/2 years, and we didn't notice. In
commit 7b1f4020d0 ("spufs: get rid of dump_emit() wrappers") the nread
variable was removed and rc used instead. That means when the loop exits
successfully, rc has the number of bytes read, but it's then used as the
return value for the function, which should return 0 on success.
So fix it by setting rc = 0 before returning in the success case.
Fixes: 7b1f4020d0 ("spufs: get rid of dump_emit() wrappers")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Provide a dt_cpu_ftrs= cmdline option to disable the dt_cpu_ftrs CPU
feature discovery, and fall back to the "cputable" based version.
Also allow control of advertising unknown features to userspace and
with this parameter, and remove the clunky CONFIG option.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add explicit early check of bootargs in dt_cpu_ftrs_init()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
drivers/phy/qualcomm/phy-qcom-qmp.c:847:37-43: ERROR: application of sizeof to pointer
sizeof when applied to a pointer typed expression gives the size of
the pointer
Generated by: scripts/coccinelle/misc/noderef.cocci
CC: Vivek Gautam <vivek.gautam@codeaurora.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
When spin_lock_irqsave() deadlock occurs inside the guest, vcpu threads,
other than the lock-holding one, would enter into S state because of
pvspinlock. Then inject NMI via libvirt API "inject-nmi", the NMI could
not be injected into vm.
The reason is:
1 It sets nmi_queued to 1 when calling ioctl KVM_NMI in qemu, and sets
cpu->kvm_vcpu_dirty to true in do_inject_external_nmi() meanwhile.
2 It sets nmi_queued to 0 in process_nmi(), before entering guest, because
cpu->kvm_vcpu_dirty is true.
It's not enough just to check nmi_queued to decide whether to stay in
vcpu_block() or not. NMI should be injected immediately at any situation.
Add checking nmi_pending, and testing KVM_REQ_NMI replaces nmi_queued
in vm_vcpu_has_events().
Do the same change for SMIs.
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a fix for the problem [1], where VMCB.CPL was set to 0 and interrupt
was taken on userspace stack. The root cause lies in the specific AMD CPU
behaviour which manifests itself as unusable segment attributes on SYSRET.
The corresponding work around for the kernel is the following:
61f01dd941 ("x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue")
In other turn virtualization side treated unusable segment incorrectly and
restored CPL from SS attributes, which were zeroed out few lines above.
In current patch it is assured only that P bit is cleared in VMCB.save state
and segment attributes are not zeroed out if segment is not presented or is
unusable, therefore CPL can be safely restored from DPL field.
This is only one part of the fix, since QEMU side should be fixed accordingly
not to zero out attributes on its side. Corresponding patch will follow.
[1] Message id: CAJrWOzD6Xq==b-zYCDdFLgSRMPM-NkNuTSDFEtX=7MreT45i7Q@mail.gmail.com
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Mikhail Sennikovskii <mikhail.sennikovskii@profitbricks.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim KrÄmář <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For SDIO the alignment requirement for transfers from device to host
is configured in firmware. This configuration is limited to minimum
of 4-byte alignment. However, this is not correct for platforms using
64-bit DMA when the minimum alignment should be 8 bytes. This issue
appeared when the ALIGNMENT definition was set according the DMA
configuration. The configuration in firmware was not using that macro
defintion, but a hardcoded value of 4. Hence the driver reported
alignment failures for data coming from the device and causing
transfers to fail.
Fixes: 6e84ab604b ("brcmfmac: properly align buffers on certain platforms
Reported-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The previous commit [63691587f7: ALSA: hda - Apply dual-codec quirk
for MSI Z270-Gaming mobo] attempted to apply the existing dual-codec
quirk for a MSI mobo. But it turned out that this isn't applied
properly due to the MSI-vendor quirk before this entry. I overlooked
such two MSI entries just because they were put in the wrong position,
although we have a list ordered by PCI SSID numbers.
This patch fixes it by rearranging the unordered entries.
Fixes: 63691587f7 ("ALSA: hda - Apply dual-codec quirk for MSI Z270-Gaming mobo")
Reported-by: Rudolf Schmidt <info@rudolfschmidt.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull drm fixes from Dave Airlie:
"This is the main set of fixes for rc4, one amdgpu fix, some exynos
regression fixes, some msm fixes and some i915 and GVT fixes.
I've got a second regression fix for some DP chips that might be a
bit large, but I think we'd like to land it now, I'll send it along
tomorrow, once you are happy with this set"
* tag 'drm-fixes-for-v4.12-rc4' of git://people.freedesktop.org/~airlied/linux: (24 commits)
drm/amdgpu: Program ring for vce instance 1 at its register space
drm/exynos: clean up description of exynos_drm_crtc
drm/exynos: dsi: Remove bridge node reference in removal
drm/exynos: dsi: Fix the parse_dt function
drm/exynos: Merge pre/postclose hooks
drm/msm: Fix the check for the command size
drm/msm: Take the mutex before calling msm_gem_new_impl
drm/msm: for array in-fences, check if all backing fences are from our own context before waiting
drm/msm: constify irq_domain_ops
drm/msm/mdp5: release hwpipe(s) for unused planes
drm/msm: Reuse dma_fence_release.
drm/msm: Expose our reservation object when exporting a dmabuf.
drm/msm/gpu: check legacy clk names in get_clocks()
drm/msm/mdp5: use __drm_atomic_helper_plane_duplicate_state()
drm/msm: select PM_OPP
drm/i915: Stop pretending to mask/unmask LPE audio interrupts
drm/i915/selftests: Silence compiler warning in igt_ctx_exec
Revert "drm/i915: Restore lost "Initialized i915" welcome message"
drm/i915/gvt: clean up unsubmited workloads before destroying kmem cache
drm/i915/gvt: Disable compression workaround for Gen9
...
It was not possible to enable both T10 PI and TPGS because they share
the same byte in the INQUIRY response. Logically OR the TPGS value
instead of using assignment.
Reported-by: Ritika Srivastava <ritika.srivastava@oracle.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hung task timeouts can result if a qlogic board breaks unexpectedly
while running I/O. These tasks become hung because command srb reference
counts are not going to zero, hence the affected srbs and commands do
not get freed. This fix accounts for this extra reference in the srbs in
the case of a board failure.
Fixes: a465537ad1 ("qla2xxx: Disable the adapter and skip error recovery in case of register disconnect")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Null check at line 966: if (ndlp) {, implies that ndlp might be NULL.
Functions lpfc_nlp_set_state() and lpfc_issue_els_prli() dereference
pointer ndlp. Include these function calls inside the IF block that
tests pointer ndlp.
Addresses-Coverity-ID: 1401856
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We might have a NULL pring in lpfc_els_abort(), for example on error
recovery path, since queues are destroyed during error recovery
mechanism.
In this case, we should just drop the abort since the queues will be
recreated anyway. This patch just verifies for NULL pointer and stop the
abortion of the queue in case of a NULL pring.
Also, this patch converts return type of lpfc_els_abort() from int to
void, since it's not checked anywhere.
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The lpfc_nvmeio_data() tracing helper always takes a format string and
three additional arguments. The latest caller has a format string with
only two integer arguments, causing this harmless warning:
drivers/scsi/lpfc/lpfc_nvmet.c: In function 'lpfc_nvmet_xmt_fcp_release':
drivers/scsi/lpfc/lpfc_nvmet.c:802:25: error: too many arguments for format [-Werror=format-extra-args]
lpfc_nvmeio_data(phba, "NVMET FCP FREE: xri x%x ste %d\n", ctxp->oxid,
We could add a dummy argument here, but it seems reasonable to print
the 'abort' flag as the third argument.
Fixes: 19b58d9473 ("nvmet_fc: add req_release to lldd api")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
- Fix a regression to description of exynos_drm_crtc
- Remove preclose hook of Exynos
. This was a exynos change of the patch series[1] merged already.
- Fix one dt broken issue
- Make sure to release bridge_node of Exynos MIPI-DSI driver.
[1] https://lists.freedesktop.org/archives/dri-devel/2017-March/135111.html
* tag 'exynos-drm-fixes-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos: clean up description of exynos_drm_crtc
drm/exynos: dsi: Remove bridge node reference in removal
drm/exynos: dsi: Fix the parse_dt function
drm/exynos: Merge pre/postclose hooks
a few fixes for 4.12..
* 'msm-fixes-4.12-rc4' of git://people.freedesktop.org/~robclark/linux:
drm/msm: Fix the check for the command size
drm/msm: Take the mutex before calling msm_gem_new_impl
drm/msm: for array in-fences, check if all backing fences are from our own context before waiting
drm/msm: constify irq_domain_ops
drm/msm/mdp5: release hwpipe(s) for unused planes
drm/msm: Reuse dma_fence_release.
drm/msm: Expose our reservation object when exporting a dmabuf.
drm/msm/gpu: check legacy clk names in get_clocks()
drm/msm/mdp5: use __drm_atomic_helper_plane_duplicate_state()
drm/msm: select PM_OPP
drm/i915 fixes for v4.12-rc4
* tag 'drm-intel-fixes-2017-05-29' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Stop pretending to mask/unmask LPE audio interrupts
drm/i915/selftests: Silence compiler warning in igt_ctx_exec
Revert "drm/i915: Restore lost "Initialized i915" welcome message"
drm/i915/gvt: clean up unsubmited workloads before destroying kmem cache
drm/i915/gvt: Disable compression workaround for Gen9
drm/i915: set initialised only when init_context callback is NULL
drm/i915: Fix new -Wint-in-bool-context gcc compiler warning
drm/i915: use vma->size for appgtt allocate_va_range
drm/i915: Do not sync RCU during shrinking
There are three timing problems in the kthread usages of iscsi_target_mod:
- np_thread of struct iscsi_np
- rx_thread and tx_thread of struct iscsi_conn
In iscsit_close_connection(), it calls
send_sig(SIGINT, conn->tx_thread, 1);
kthread_stop(conn->tx_thread);
In conn->tx_thread, which is iscsi_target_tx_thread(), when it receive
SIGINT the kthread will exit without checking the return value of
kthread_should_stop().
So if iscsi_target_tx_thread() exit right between send_sig(SIGINT...)
and kthread_stop(...), the kthread_stop() will try to stop an already
stopped kthread.
This is invalid according to the documentation of kthread_stop().
(Fix -ECONNRESET logout handling in iscsi_target_tx_thread and
early iscsi_target_rx_thread failure case - nab)
Signed-off-by: Jiang Yi <jiangyilism@gmail.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a OOPs originally introduced by:
commit bb048357da
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Thu Sep 5 14:54:04 2013 -0700
iscsi-target: Add sk->sk_state_change to cleanup after TCP failure
which would trigger a NULL pointer dereference when a TCP connection
was closed asynchronously via iscsi_target_sk_state_change(), but only
when the initial PDU processing in iscsi_target_do_login() from iscsi_np
process context was blocked waiting for backend I/O to complete.
To address this issue, this patch makes the following changes.
First, it introduces some common helper functions used for checking
socket closing state, checking login_flags, and atomically checking
socket closing state + setting login_flags.
Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP
connection has dropped via iscsi_target_sk_state_change(), but the
initial PDU processing within iscsi_target_do_login() in iscsi_np
context is still running. For this case, it sets LOGIN_FLAGS_CLOSED,
but doesn't invoke schedule_delayed_work().
The original NULL pointer dereference case reported by MNC is now handled
by iscsi_target_do_login() doing a iscsi_target_sk_check_close() before
transitioning to FFP to determine when the socket has already closed,
or iscsi_target_start_negotiation() if the login needs to exchange
more PDUs (eg: iscsi_target_do_login returned 0) but the socket has
closed. For both of these cases, the cleanup up of remaining connection
resources will occur in iscsi_target_start_negotiation() from iscsi_np
process context once the failure is detected.
Finally, to handle to case where iscsi_target_sk_state_change() is
called after the initial PDU procesing is complete, it now invokes
conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once
existing iscsi_target_sk_check_close() checks detect connection failure.
For this case, the cleanup of remaining connection resources will occur
in iscsi_target_do_login_rx() from delayed workqueue process context
once the failure is detected.
Reported-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Reported-by: Hannes Reinecke <hare@suse.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Varun Prakash <varun@chelsio.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The PRCM takes PLL_PERIPH0 as one of its parents for the AR100 clock.
As such we need to be able to describe this relationship in the device
tree.
Export the PLL_PERIPH0 clock so we can reference it in the PRCM node.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
The PRCM takes PLL_PERIPH0 as one of its parents for the AR100 clock.
As such we need to be able to describe this relationship in the device
tree.
Export the PLL_PERIPH0 clock so we can reference it in the PRCM node.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
The AR100 clock in the PRCM has parents, one of which is pll-periph from
the main CCU.
Add it to the list of required clocks for the PRCM CCU.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
recent fixes to use WRITE_ONCE for nh_flags on link up,
accidently ended up leaving the deadflags on a nh. This patch
fixes the WRITE_ONCE to use freshly evaluated nh_flags.
Fixes: 39eb8cd175 ("net: mpls: rt_nhn_alive and nh_flags should be accessed using READ_ONCE")
Reported-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver may sleep under a spin lock, the function call path is:
isdn_ppp_mp_receive (acquire the lock)
isdn_ppp_mp_reassembly
isdn_ppp_push_higher
isdn_ppp_decompress
isdn_ppp_ccp_reset_trans
isdn_ppp_ccp_reset_alloc_state
kzalloc(GFP_KERNEL) --> may sleep
To fixed it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ata_parse_force_one() was incorrectly comparing @p to @endp when it
should have been comparing @id. The only consequence is that it may
end up using an invalid port number in "libata.force" module param
instead of rejecting it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Petru-Florin Mihancea <petrum@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195785
Add NULL check before dereferencing pointer _id_ in order to avoid
a potential NULL pointer dereference.
Addresses-Coverity-ID: 1397995
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Auto-loading of the Marvell DSA driver has stopped working with recent
kernels. This seems to be due to the change of binding for DSA devices,
moving them from the platform bus to the MDIO bus.
In order for module auto-loading to work, we need to provide a MODALIAS
string in the uevent file for the device. However, the device core does
not automatically provide this, and needs each bus_type to implement a
uevent method to generate these strings. The MDIO bus does not provide
such a method, so no MODALIAS string is provided:
.# cat /sys/bus/mdio_bus/devices/f1072004.mdio-mii\:04/uevent
DRIVER=mv88e6085
OF_NAME=switch
OF_FULLNAME=/soc/internal-regs/mdio@72004/switch@4
OF_COMPATIBLE_0=marvell,mv88e6085
OF_COMPATIBLE_N=1
In the case of OF-based devices, the solution is easy -
of_device_uevent_modalias() does the work for us. After this is done,
the uevent file looks like this:
.# cat /sys/bus/mdio_bus/devices/f1072004.mdio-mii\:04/uevent
DRIVER=mv88e6085
OF_NAME=switch
OF_FULLNAME=/soc/internal-regs/mdio@72004/switch@4
OF_COMPATIBLE_0=marvell,mv88e6085
OF_COMPATIBLE_N=1
MODALIAS=of:NswitchT<NULL>Cmarvell,mv88e6085
which results in auto-loading of the Marvell DSA driver on Clearfog
platforms.
Fixes: c0405563a6 ("ARM: dts: armada-388-clearfog: Utilize new DSA binding")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell driver incorrectly provides phydev->lp_advertising as the
logical and of the link partner's advert and our advert. This is
incorrect - this field is supposed to store the link parter's unmodified
advertisment.
This allows ethtool to report the correct link partner auto-negotiation
status.
Fixes: be937f1f89 ("Marvell PHY m88e1111 driver fix")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need program ring buffer on instance 1 register space domain,
when only if instance 1 available, with two instances or instance 0,
and we need only program instance 0 regsiter space domain for ring.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MTU probing initialization occurred only at connect() and at SYN or
SYN-ACK reception, but the former sets MSS to either the default or the
user set value (through TCP_MAXSEG sockopt) and the latter never happens
with repaired sockets.
The result was that, with MTU probing enabled and unless TCP_MAXSEG
sockopt was used before connect(), probing would be stuck at
tcp_base_mss value until tcp_probe_interval seconds have passed.
Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
If you attempt a TCP mount from an host that is unreachable in a way
that triggers an immediate error from kernel_connect(), that error
does not propagate up, instead EAGAIN is reported.
This results in call_connect_status receiving the wrong error.
A case that it easy to demonstrate is to attempt to mount from an
address that results in ENETUNREACH, but first deleting any default
route.
Without this patch, the mount.nfs process is persistently runnable
and is hard to kill. With this patch it exits as it should.
The problem is caused by the fact that xs_tcp_force_close() eventually
calls
xprt_wake_pending_tasks(xprt, -EAGAIN);
which causes an error return of -EAGAIN. so when xs_tcp_setup_sock()
calls
xprt_wake_pending_tasks(xprt, status);
the status is ignored.
Fixes: 4efdd92c92 ("SUNRPC: Remove TCP client connection reset hack")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
CC: linux-raid@vger.kernel.org
CC: Shaohua Li <shli@kernel.org>
Fixes: b685d3d65a
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Shaohua Li <shli@fb.com>
Pull overlayfs fixes from Miklos Szeredi:
"Fix regressions:
- missing CONFIG_EXPORTFS dependency
- failure if upper fs doesn't support xattr
- bad error cleanup
This also adds the concept of "impure" directories complementing the
"origin" marking introduced in -rc1. Together they enable getting
consistent st_ino and d_ino for directory listings.
And there's a bug fix and a cleanup as well"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: filter trusted xattr for non-admin
ovl: mark upper merge dir with type origin entries "impure"
ovl: mark upper dir with type origin entries "impure"
ovl: remove unused arg from ovl_lookup_temp()
ovl: handle rename when upper doesn't support xattr
ovl: don't fail copy-up if upper doesn't support xattr
ovl: check on mount time if upper fs supports setting xattr
ovl: fix creds leak in copy up error path
ovl: select EXPORTFS
When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY
as the delay of cfq_group's vdisktime if there have been other cfq_groups
already.
When cfq is under iops mode, commit 9a7f38c42c ("cfq-iosched: Convert
from jiffies to nanoseconds") could result in a large iops delay and
lead to an abnormal io schedule delay for the added cfq_group. To fix
it, we just need to revert to the old CFQ_IDLE_DELAY value: HZ / 5
when iops mode is enabled.
Despite having the same value, the delay of a cfq_queue in idle class
and the delay of cfq_group are different things, so I define two new
macros for the delay of a cfq_group under time-slice mode and iops mode.
Fixes: 9a7f38c42c ("cfq-iosched: Convert from jiffies to nanoseconds")
Cc: <stable@vger.kernel.org> # 4.8+
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
We've had user reports of unmount hangs in xfs_wait_buftarg() that
analysis shows is due to btp->bt_io_count == -1. bt_io_count
represents the count of in-flight asynchronous buffers and thus
should always be >= 0. xfs_wait_buftarg() waits for this value to
stabilize to zero in order to ensure that all untracked (with
respect to the lru) buffers have completed I/O processing before
unmount proceeds to tear down in-core data structures.
The value of -1 implies an I/O accounting decrement race. Indeed,
the fact that xfs_buf_ioacct_dec() is called from xfs_buf_rele()
(where the buffer lock is no longer held) means that bp->b_flags can
be updated from an unsafe context. While a user-level reproducer is
currently not available, some intrusive hacks to run racing buffer
lookups/ioacct/releases from multiple threads was used to
successfully manufacture this problem.
Existing callers do not expect to acquire the buffer lock from
xfs_buf_rele(). Therefore, we can not safely update ->b_flags from
this context. It turns out that we already have separate buffer
state bits and associated serialization for dealing with buffer LRU
state in the form of ->b_state and ->b_lock. Therefore, replace the
_XBF_IN_FLIGHT flag with a ->b_state variant, update the I/O
accounting wrappers appropriately and make sure they are used with
the correct locking. This ensures that buffer in-flight state can be
modified at buffer release time without racing with modifications
from a buffer lock holder.
Fixes: 9c7504aa72 ("xfs: track and serialize in-flight async buffers against unmount")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Libor Pechacek <lpechacek@suse.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Commit b685d3d65a ("block: treat REQ_FUA and REQ_PREFLUSH as
synchronous") removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions.
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
Fixes: b685d3d65a ("block: treat REQ_FUA and REQ_PREFLUSH as synchronous")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
In the conversion to drop drm_modeset_lock_all and the magic implicit
context I failed to realize that _resume starts out with a pile of
state copies, but not with the locks. And hence drm_atomic_commit
won't grab these for us.
v2: Add locking checks in helpers to make sure we catch this in the
future. Note we can only require the locks in the atomic_check phase,
not in the commit phase. But since any commit is guaranteed to first
run the checks (even for the resume stuff where we use stored
duplicated old state) this should give us full coverage. Requested by
Maarten.
Cc: Jyri Sarha <jsarha@ti.com>
Fixes: a5b8444e28 ("drm/atomic-helper: remove modeset_lock_all from helper_resume")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170531083813.1390-1-daniel.vetter@ffwll.ch
This is another attempt to work around the VLA used in
mixer_us16x08.c. Basically the temporary array is used individually
for two cases, and we can declare locally in each block, instead of
hackish max() usage.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This reverts commit 89b593c30e ("ALSA: usb-audio: purge needless
variable length array"). The patch turned out to cause a severe
regression, triggering an Oops at snd_usb_ctl_msg(). It was overseen
that snd_usb_ctl_msg() writes back the response to the given buffer,
while the patch changed it to a read-only const buffer. (One should
always double-check when an extra pointer cast is present...)
As a simple fix, just revert the affected commit. It was merely a
cleanup. Although it brings VLA again, it's clearer as a fix. We'll
address the VLA later in another patch.
Fixes: 89b593c30e ("ALSA: usb-audio: purge needless variable length array")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Force vop output mode on encoder driver seem not a good idea,
EDP, HDMI, DisplayPort all have 10bit input on rk3399,
On non-10bit vop, vop 8bit output bit[0-7] connect to the
encoder high 8bit [2-9].
So force RGB10 to RGB888 on vop driver would be better.
And another problem, EDP check crtc id on atomic_check,
but encoder maybe NULL, so out_mode configure would fail,
it cause edp no display.
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
Reviewed-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1495885416-22216-1-git-send-email-mark.yao@rock-chips.com
When the controller fails to provide an RPM reading within the alloted
time; the driver returns -ETIMEDOUT and no file contents.
Signed-off-by: Patrick Venture <venture@google.com>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The driver uses regmap and thus has to select it to avoid build
errors such as the following.
drivers/hwmon/aspeed-pwm-tacho.c:337:21: error: variable
'aspeed_pwm_tacho_regmap_config' has initializer but incomplete type
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Joel Stanley <joel@jms.id.au>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
This effectively reverts commit 8ee74a91ac ("proc: try to remove use
of FOLL_FORCE entirely")
It turns out that people do depend on FOLL_FORCE for the /proc/<pid>/mem
case, and we're talking not just debuggers. Talking to the affected people, the use-cases are:
Keno Fischer:
"We used these semantics as a hardening mechanism in the julia JIT. By
opening /proc/self/mem and using these semantics, we could avoid
needing RWX pages, or a dual mapping approach. We do have fallbacks to
these other methods (though getting EIO here actually causes an assert
in released versions - we'll updated that to make sure to take the
fall back in that case).
Nevertheless the /proc/self/mem approach was our favored approach
because it a) Required an attacker to be able to execute syscalls
which is a taller order than getting memory write and b) didn't double
the virtual address space requirements (as a dual mapping approach
would).
I think in general this feature is very useful for anybody who needs
to precisely control the execution of some other process. Various
debuggers (gdb/lldb/rr) certainly fall into that category, but there's
another class of such processes (wine, various emulators) which may
want to do that kind of thing.
Now, I suspect most of these will have the other process under ptrace
control, so maybe allowing (same_mm || ptraced) would be ok, but at
least for the sandbox/remote-jit use case, it would be perfectly
reasonable to not have the jit server be a ptracer"
Robert O'Callahan:
"We write to readonly code and data mappings via /proc/.../mem in lots
of different situations, particularly when we're adjusting program
state during replay to match the recorded execution.
Like Julia, we can add workarounds, but they could be expensive."
so not only do people use FOLL_FORCE for both reads and writes, but they
use it for both the local mm and remote mm.
With these comments in mind, we likely also cannot add the "are we
actively ptracing" check either, so this keeps the new code organization
and does not do a real revert that would add back the original comment
about "Maybe we should limit FOLL_FORCE to actual ptrace users?"
Reported-by: Keno Fischer <keno@juliacomputing.com>
Reported-by: Robert O'Callahan <robert@ocallahan.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The tagset lock needs to be held when iterating the tag_list, so a
lockdep assert was added when updating number of hardware queues. The
drivers calling this API, however, were unaware of the new requirement,
so are failing the assertion.
This patch takes the lock within the blk-mq function so the drivers do
not have to be modified in order to be safe.
Fixes: 705cda97e ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list")
Reported-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Tariq Toukan says:
====================
MAINTAINERS updates
This patchset contains updates to the MAINTAINERS file.
In the first patch, I replace Yishai as the maintainer of
the mlx4_core driver.
In the other two patches we move an RDMA header file from
the list of the mlx4/mlx5 core driver into the respective
IB driver, where it belongs.
Series generated against net commit:
468b0df61a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It belongs there, should not be under mlx5 Core driver.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It belongs there, should not be under mlx4 Core driver.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add myself as a maintainer for mlx4 core driver,
replacing Yishai Hadas.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Building the driver with CONFIG_SMP disabled results in a harmless
warning:
ethernet/mellanox/mlx5/core/main.c: In function 'mlx5_irq_set_affinity_hint':
ethernet/mellanox/mlx5/core/main.c:615:6: error: unused variable 'irq' [-Werror=unused-variable]
It's better to express the conditional compilation using IS_ENABLED()
here, as that lets the compiler see what the intented use for the variable
is, and that it can be silently discarded.
Fixes: b665d98edc ("net/mlx5: Tolerate irq_set_affinity_hint() failures")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'static' was not enough, the helpers must be 'static inline'
net/dsa/mv88e6xxx/global2.h:123:12: error: 'mv88e6xxx_g2_misc_4_bit_port' defined but not used [-Werror=unused-function]
net/dsa/mv88e6xxx/global2.h:117:12: error: 'mv88e6xxx_g2_pvt_write' defined but not used [-Werror=unused-function]
Fixes: c21fbe29f8 ("net: dsa: mv88e6xxx: Add missing static to stub functions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current implementation lacks the logic for providing management
firmware with RDMA-related statistics; [much] worse than that -
it logs such events by default to system logs.
Since the statistics' gathering is done periodically, using sufficiently
new management firmware the system logs would get filled with these
unnecessary prints.
For now, reduce the verbosity of the log so that it would not be
logged by default.
Fixes: 6c75424612 ("qed: Add support for NCSI statistics")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During PCI error recovery process, specifically on eeh_err_detected()
we might have a NULL netdev struct, hence a direct dereference will
lead to a kernel oops. This was observed with latest upstream kernel
(v4.12-rc2) on Chelsio adapter T422-CR in PowerPC machines.
This patch checks for NULL pointer and avoids the crash, both in
eeh_err_detected() and eeh_resume(). Also, we avoid to trigger
a fatal error or to try disabling interrupts on FW during PCI
error recovery, because: (a) driver might not be able to accurately
access PCI regions in this case, and (b) trigger a fatal error
_during_ the recovery steps is a mistake that could prevent the
recovery path to complete successfully.
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 19bca6ab75 ("KVM: SVM: Fix cross vendor migration issue with
unusable bit") added checking type when setting unusable.
So unusable can be set if present is 0 OR type is 0.
According to the AMD processor manual, long mode ignores the type value
in segment descriptor. And type can be 0 if it is read-only data segment.
Therefore type value is not related to unusable flag.
This patch is based on linux-next v4.12.0-rc3.
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_skip_emulated_instruction() will return 0 if userspace is
single-stepping the guest.
kvm_skip_emulated_instruction() uses return status convention of exit
handler: 0 means "exit to userspace" and 1 means "continue vm entries".
The problem is that nested_vmx_check_vmptr() return status means
something else: 0 is ok, 1 is error.
This means we would continue executing after a failure. Static checker
noticed it because vmptr was not initialized.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 6affcbedca ("KVM: x86: Add kvm_skip_emulated_instruction and use it.")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
nbd_config is allocated in nbd_alloc_config(), but never freed.
Fixes: 5ea8d10802 ("nbd: separate out the config information")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
There is nothing to clear -- nbd_device has just been allocated.
Fold nbd_reset() into its other caller, nbd_config_put().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
We saw perf IRQ init failures when running Linux kernel in an ACPI
guest without PMU (i.e. pmu=off). This is because perf IRQ is not
present when pmu=off, but arm_pmu_acpi still tries to register
or unregister GSI. This patch addresses the problem by checking
gicc->performance_interrupt. If it is 0, which is the value set
by qemu when pmu=off, we skip the IRQ register/unregister process.
[ 4.069470] bc00: 0000000000040b00 ffff0000089db190
[ 4.070267] [<ffff000008134f80>] enable_percpu_irq+0xdc/0xe4
[ 4.071192] [<ffff000008667cc4>] arm_perf_starting_cpu+0x108/0x10c
[ 4.072200] [<ffff0000080cbdd4>] cpuhp_invoke_callback+0x14c/0x4ac
[ 4.073210] [<ffff0000080ccd3c>] cpuhp_thread_fun+0xd4/0x11c
[ 4.074132] [<ffff0000080f1394>] smpboot_thread_fn+0x1b4/0x1c4
[ 4.075081] [<ffff0000080ec90c>] kthread+0x10c/0x138
[ 4.075921] [<ffff0000080833c0>] ret_from_fork+0x10/0x50
[ 4.076947] genirq: Setting trigger mode 4 for irq 43 failed
(gic_set_type+0x0/0x74)
Signed-off-by: Wei Huang <wei@redhat.com>
[will: add comment justifying deviation from ACPI spec, removed redundant hunk]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add device-tree files relevant to TI DaVinci platform
to its entry so mach-davinci sub-arch maintainers get
copied on patches with device-tree file updates.
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
,dma_ops should be cleared in the teardown path. Currently, only the
device's iommu mapping structures are cleared in arch_teardown_dma_ops,
but not the dma_ops. So on the next reprobe, dma_ops left in place is
stale from the first IOMMU setup, but iommu mappings has been disposed
of. This is a problem when the probe of the device is deferred and
recalled with the IOMMU probe deferral.
So for fixing this, slightly refactor by moving the code from
__arm_iommu_detach_device to arm_iommu_detach_device and cleanup
the former. This takes care of resetting the dma_ops in the teardown
path.
Fixes: 09515ef5dd ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices")
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With IOMMU probe deferral, iort_iommu_configure can be called
multiple times for the same device. Hence we have a check
to see if the device's fwspec is already translated and return
the iommu_ops from that directly. But the check is wrongly
placed in iort_iommu_xlate, which breaks devices with multiple
sids. Move the check to iort_iommu_configure.
Fixes: 5a1bb638d5 ("drivers: acpi: Handle IOMMU lookup failure with deferred probing or error")
Tested-by: Nate Watterson <nwatters@codeaurora.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
arch_setup_dma_ops() is used in device probe code paths to create an
IOMMU mapping and attach it to the device. The function assumes that the
device is attached to a device-specific IOMMU instance (or at least a
device-specific TLB in a shared IOMMU instance) and thus creates a
separate mapping for every device.
On several systems (Renesas R-Car Gen2 being one of them), that
assumption is not true, and IOMMU mappings must be shared between
multiple devices. In those cases the IOMMU driver knows better than the
generic ARM dma-mapping layer and attaches mapping to devices manually
with arm_iommu_attach_device(), which sets the DMA ops for the device.
The arch_setup_dma_ops() function takes this into account and bails out
immediately if the device already has DMA ops assigned. However, the
corresponding arch_teardown_dma_ops() function, called from driver
unbind code paths (including probe deferral), will tear the mapping down
regardless of who created it. When the device is reprobed
arch_setup_dma_ops() will be called again but won't perform any
operation as the DMA ops will still be set.
We need to reset the DMA ops in arch_teardown_dma_ops() to fix this.
However, we can't do so unconditionally, as then a new mapping would be
created by arch_setup_dma_ops() when the device is reprobed, regardless
of whether the device needs to share a mapping or not. We must thus keep
track of whether arch_setup_dma_ops() created the mapping, and only in
that case tear it down in arch_teardown_dma_ops().
Keep track of that information in the dev_archdata structure. As the
structure is embedded in all instances of struct device let's not grow
it, but turn the existing dma_coherent bool field into a bitfield that
can be used for other purposes.
Fixes: 09515ef5dd ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
While deferring the probe of IOMMU masters, xlate and
add_device callbacks called from iort_iommu_configure
can pass back error values like -ENODEV, which means
the IOMMU cannot be connected with that master for real
reasons. Before the IOMMU probe deferral, all such errors
were ignored. Now all those errors are propagated back,
killing the master's probe for such errors. Instead ignore
all the errors except EPROBE_DEFER, which is the only one
of concern and let the master work without IOMMU, thus
restoring the old behavior. Also make explicit that
acpi_dma_configure handles only -EPROBE_DEFER from
iort_iommu_configure.
Fixes: 5a1bb638d5 ("drivers: acpi: Handle IOMMU lookup failure with deferred probing or error")
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
While deferring the probe of IOMMU masters, xlate and
add_device callbacks called from of_iommu_configure
can pass back error values like -ENODEV, which means
the IOMMU cannot be connected with that master for real
reasons. Before the IOMMU probe deferral, all such errors
were ignored. Now all those errors are propagated back,
killing the master's probe for such errors. Instead ignore
all the errors except EPROBE_DEFER, which is the only one
of concern and let the master work without IOMMU, thus
restoring the old behavior. Also make explicit that
of_dma_configure handles only -EPROBE_DEFER from
of_iommu_configure.
Fixes: 7b07cbefb6 ("iommu: of: Handle IOMMU lookup failure with deferred probing or error")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Magnus Damn <magnus.damn@gmail.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Now with IOMMU probe deferral, we return -EPROBE_DEFER
for masters that are connected to an IOMMU which is not
probed yet, but going to get probed, so that we can attach
the correct dma_ops. So while trying to defer the probe of
the master, check if the of_iommu node that it is connected
to is marked in DT as 'status=disabled', then the IOMMU is never
is going to get probed. So simply return NULL and let the master
work without an IOMMU.
Fixes: 7b07cbefb6 ("iommu: of: Handle IOMMU lookup failure with deferred probing or error")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Tested-by: Magnus Damn <magnus.damn@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Newly added code in the ipmmu-vmsa driver showed a small mistake
in a header file that can't be included by itself without CONFIG_IOMMU_DMA
enabled:
In file included from drivers/iommu/ipmmu-vmsa.c:13:0:
include/linux/dma-iommu.h:105:94: error: 'struct device' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
This adds a forward declaration for 'struct device', similar to how
we treat the other struct types in this case.
Fixes: 3ae4729202 ("iommu/ipmmu-vmsa: Add new IOMMU_DOMAIN_DMA ops")
Fixes: 273df96353 ("iommu/dma: Make PCI window reservation generic")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When starting or stopping an aggregation session, one of the steps
is that the driver calls back to mac80211 that the start/stop can
proceed. This is handled by queueing up a fake SKB and processing
it from the normal iface/sdata work. Since this isn't flushed when
disassociating, the following race is possible:
* associate
* start aggregation session
* driver callback
* disassociate
* associate again to the same AP
* callback processing runs, leading to a WARN_ON() that
the TID hadn't requested aggregation
If the second association isn't to the same AP, there would only
be a message printed ("Could not find station: <addr>"), but the
same race could happen.
Fix this by not going the whole detour with a fake SKB etc. but
simply looking up the aggregation session in the driver callback,
marking it with a START_CB/STOP_CB bit and then scheduling the
regular aggregation work that will now process these bits as well.
This also simplifies the code and gets rid of the whole problem
with allocation failures of said skb, which could have left the
session in limbo.
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In descriptor mode, the descriptor running pointer is not maintained
by the interrupt handler, thus, driver finds the running descriptor
from the descriptor pointer field in the CHCRB register.
But, CHCRB::DPTR indicates *next* descriptor pointer, not current.
Thus, The residue calculation will be missed. This patch fixup it.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Conntrack SCTP CRC32c checksum mangling may operate on non-linear
skbuff, patch from Davide Caratti.
2) nf_tables rb-tree set backend does not handle element re-addition
after deletion in the same transaction, leading to infinite loop.
3) Atomically unclear the IPS_SRC_NAT_DONE_BIT on nat module removal,
from Liping Zhang.
4) Conntrack hashtable resizing while ctnetlink dump is progress leads
to a dead reference to released objects in the lists, also from
Liping.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Users should really consider switching to rmi-smbus instead of plain PS/2.
Notify them that they should report a missing pnpID in the file.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The Synaptics touchpads are now either using i2c-hid or rmi-smbus.
Warn the users if they are missing the rmi-smbus modules and have no
chance of reporting correct data.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
synaptics_query_hardware() was being passed a 'struct synaptics_device_info'
in uninitialized stack memory, then not always initializing all fields.
This caused garbage to show up in certain fields, making the touchpad
unusable.
Fix by zeroing the device info, so all fields default to 0.
Fixes: 6c53694fb2 ("Input: synaptics - split device info into a separate structure")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
When we put the touchscreen controller in low-power mode the irq
pin may trigger (float) and if we then try to read a data packet we
get the following error in dmesg:
[ 478.801017] silead_ts i2c-MSSL1680:00: Data read error -121
This commit disables the irq during suspend/resume fixing this error.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
For a driver that does not set the CPUFREQ_STICKY flag, if all of the
->init() calls fail, cpufreq_register_driver() should return an error.
This will prevent the driver from loading.
Fixes: ce1bcfe94d (cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs)
Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
Signed-off-by: David Arcari <darcari@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the Linux kernel, acpi_get_table() "clones" haven't been fully
balanced by acpi_put_table() invocations. In upstream ACPICA, due to
the design change, there are also unbalanced acpi_get_table_by_index()
invocations requiring special care.
acpi_get_table() reference counting mismatches may occor due to that
and printing error messages related to them is not useful at this
point. The strict balanced validation count check should only be
enabled after confirming that all invocations are safe and aligned
with their designed purposes.
Thus this patch removes the error value returned by acpi_tb_get_table()
in that case along with the accompanying error message to fix the
issue.
Fixes: 174cc7187e (ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel)
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Revert commit 77e9a4aa9d (ACPI / button: Change default behavior to
lid_init_state=open) which changed the kernel's behavior on laptops
that boot with closed lids and expect the lid switch state to be
reported accurately by the kernel.
If you boot or resume your laptop with the lid closed on a docking
station while using an external monitor connected to it, both internal
and external displays will light on, while only the external should.
There is a design choice in gdm to only provide the greeter on the
internal display when lit on, so users only see a gray area on the
external monitor. Also, the cursor will not show up as it's by
default on the internal display too.
To "fix" that, users have to open the laptop once and close it once
again to sync the state of the switch with the hardware state.
Even if the "method" operation mode implementation can be buggy on
some platforms, the "open" choice is worse. It breaks docking
stations basically and there is no way to have a user-space hwdb to
fix that.
On the contrary, it's rather easy in user-space to have a hwdb
with the problematic platforms. Then, libinput (1.7.0+) can fix
the state of the lid switch for us: you need to set the udev
property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
When libinput detects internal keyboard events, it will overwrite the
state of the switch to open, making it reliable again. Given that
logind only checks the lid switch value after a timeout, we can
assume the user will use the internal keyboard before this timeout
expires.
For example, such a hwdb entry is:
libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
Link: https://bugzilla.gnome.org/show_bug.cgi?id=782380
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, extent manipulation operations such as hole punch, range
zeroing, or extent shifting do not record the fact that file data has
changed and thus fdatasync(2) has a work to do. As a result if we crash
e.g. after a punch hole and fdatasync, user can still possibly see the
punched out data after journal replay. Test generic/392 fails due to
these problems.
Fix the problem by properly marking that file data has changed in these
operations.
CC: stable@vger.kernel.org
Fixes: a4bb6b64e3
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Pull pin control fixes from Linus Walleij:
"Here is an overdue pull request for pin control fixes, the most
prominent feature is to make Intel Chromebooks (and I suspect any
other Cherryview-based Intel thing) happy again, which we really want
to see.
There is a patch hitting drivers/firmware/* that I was uncertain to
who actually manages, but I got Andy Shevchenko's and Dmitry Torokov's
review tags on it and I trust them both 100% to do the right thing for
Intel platform drivers.
Summary:
- Make a few Intel Chromebooks with Cherryview DMI firmware work
smoothly.
- A fix for some bogus allocations in the generic group management
code.
- Some GPIO descriptor lookup table stubs. Merged through the pin
control tree for administrative reasons.
- Revert the "bi-directional" and "output-enable" generic properties:
we need more discussions around this. It seems other SoCs are using
input/output gate enablement and these terms are not correct.
- Fix mux and drive strength atomically in the MXS driver.
- Fix the SPDIF function on sunxi A83T.
- OF table terminators and other small fixes"
* tag 'pinctrl-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: sunxi: Fix SPDIF function name for A83T
pinctrl: mxs: atomically switch mux and drive strength config
pinctrl: cherryview: Extend the Chromebook DMI quirk to Intel_Strago systems
firmware: dmi: Add DMI_PRODUCT_FAMILY identification string
pinctrl: core: Fix warning by removing bogus code
gpiolib: Add stubs for gpiod lookup table interface
Revert "pinctrl: generic: Add bi-directional and output-enable"
pinctrl: cherryview: Add terminate entry for dmi_system_id tables
This fixes a regression in commit 4d6501dce0 where I didn't notice
that MIPS and OpenRISC were reinitialising p->{set,clear}_child_tid to
NULL after our initialisation in copy_process().
We can simply get rid of the arch-specific initialisation here since it
is now always done in copy_process() before hitting copy_thread{,_tls}().
Review notes:
- As far as I can tell, copy_process() is the only user of
copy_thread_tls(), which is the only caller of copy_thread() for
architectures that don't implement copy_thread_tls().
- After this patch, there is no arch-specific code touching
p->set_child_tid or p->clear_child_tid whatsoever.
- It may look like MIPS/OpenRISC wanted to always have these fields be
NULL, but that's not true, as copy_process() would unconditionally
set them again _after_ calling copy_thread_tls() before commit
4d6501dce0.
Fixes: 4d6501dce0 ("kthread: Fix use-after-free if kthread fork fails")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net> # MIPS only
Acked-by: Stafford Horne <shorne@gmail.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: openrisc@lists.librecores.org
Cc: Jamie Iles <jamie.iles@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Filesystems filter out extended attributes in the "trusted." domain for
unprivlieged callers.
Overlay calls underlying filesystem's method with elevated privs, so need
to do the filtering in overlayfs too.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
For ACPI devices which do not have a _PSC method, the ACPI subsys cannot
query their initial state at boot, so these devices are assumed to have
been put in D0 by the BIOS, but for touchscreens that is not always true.
This commit adds a call to acpi_device_fix_up_power to explicitly put
devices without a _PSC method into D0 state (for devices with a _PSC
method it is a nop). Note we only need to do this on probe, after a
resume the ACPI subsys knows the device is in D3 and will properly
put it in D0.
This fixes the SIS0817 i2c-hid touchscreen on a Peaq C1010 2-in-1
device failing to probe with a "hid_descr_cmd failed" error.
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Switch to using the common DP helpers instead of using our own.
v2: also remove leftover struct intel_dp_desc (Daniel)
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
An upper dir is marked "impure" to let ovl_iterate() know that this
directory may contain non pure upper entries whose d_ino may need to be
read from the origin inode.
We already mark a non-merge dir "impure" when moving a non-pure child
entry inside it, to let ovl_iterate() know not to iterate the non-merge
dir directly.
Mark also a merge dir "impure" when moving a non-pure child entry inside
it and when copying up a child entry inside it.
This can be used to optimize ovl_iterate() to perform a "pure merge" of
upper and lower directories, merging the content of the directories,
without having to read d_ino from origin inodes.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Commit 93c1defedc ("rbd: remove the discard_zeroes_data flag")
explicitly didn't implement REQ_OP_WRITE_ZEROES for rbd, while the
following commit 48920ff2a5 ("block: remove the discard_zeroes_data
flag") dropped ->discard_zeroes_data in favor of REQ_OP_WRITE_ZEROES.
rbd does support efficient zeroing via CEPH_OSD_OP_ZERO opcode and will
release either some or all blocks depending on whether the zeroing
request is rbd_obj_bytes() aligned. This is how we currently implement
discards, so REQ_OP_WRITE_ZEROES can be identical to REQ_OP_DISCARD for
now. Caveats:
- REQ_NOUNMAP is ignored, but AFAICT that's true of at least two other
current implementations - nvme and loop
- there is no ->write_zeroes_alignment and blk_bio_write_zeroes_split()
is hence less helpful than blk_bio_discard_split(), but this can (and
should) be fixed on the rbd side
In the future we will split these into two code paths to respect
REQ_NOUNMAP on zeroout and save on zeroing blocks that couldn't be
released on discard.
Fixes: 93c1defedc ("rbd: remove the discard_zeroes_data flag")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The reference to cpu_resume requires the corresponding
generic code to be enabled when CONFIG_PM is set:
arch/arm/mach-at91/pm.o: In function `sama5d2_pm_init':
pm.c:(.init.text+0x5e8): undefined reference to `cpu_resume'
Fixes: 24a0f5c539 ("ARM: at91: pm: Add sama5d2 backup mode")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
With CONFIG_DEBUG_PREEMPT enabled, I get:
BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
caller is debug_smp_processor_id
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc2+ #2
Call Trace:
dump_stack
check_preemption_disabled
debug_smp_processor_id
save_microcode_in_initrd_amd
? microcode_init
save_microcode_in_initrd
...
because, well, it says it above, we're using smp_processor_id() in
preemptible code.
But passing the CPU number is not really needed. It is only used to
determine whether we're on the BSP, and, if so, to save the microcode
patch for early loading.
[ We don't absolutely need to do it on the BSP but we do that
customarily there. ]
Instead, convert that function parameter to a boolean which denotes
whether the patch should be saved or not, thereby avoiding the use of
smp_processor_id() in preemptible code.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170528200414.31305-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
POWER9 introduces a new mode for the decrementer register, called
large decrementer mode, in which the decrementer counter is 56 bits
wide rather than 32, and reads are sign-extended rather than
zero-extended. For the decrementer, this new mode is optional and
controlled by a bit in the LPCR. The hypervisor decrementer (HDEC)
is 56 bits wide on POWER9 and has no mode control.
Since KVM code reads and writes the decrementer and hypervisor
decrementer registers in a few places, it needs to be aware of the
need to treat the decrementer value as a 64-bit quantity, and only do
a 32-bit sign extension when large decrementer mode is not in effect.
Similarly, the HDEC should always be treated as a 64-bit quantity on
POWER9. We define a new EXTEND_HDEC macro to encapsulate the feature
test for POWER9 and the sign extension.
To enable the sign extension to be removed in large decrementer mode,
we test the LPCR_LD bit in the host LPCR image stored in the struct
kvm for the guest. If is set then large decrementer mode is enabled
and the sign extension should be skipped.
This is partly based on an earlier patch by Oliver O'Halloran.
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This patch removes unnecessary descriptions on
exynos_drm_crtc structure and adds one description
which specifies what pipe_clk member does.
pipe_clk support had been added by below patch without any description,
drm/exynos: add support for pipeline clock to the framework
Commit-id : f26b9343f5
Signed-off-by: Inki Dae <inki.dae@samsung.com>
The dsi + panel is a parental relationship, so OF grpah is not needed.
Therefore, the current dsi_parse_dt function will throw an error,
because there is no linked OF graph for the case fimd + dsi + panel.
Parse the Pll burst and esc clock frequency properties in dsi_parse_dt()
and create a bridge_node only if there is an OF graph associated with dsi.
Signed-off-by: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Pull thermal SoC management fixes from Eduardo Valentin:
- fixes to TI SoC driver, Broadcom, qoriq
- small sparse warning fix on thermal core
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: broadcom: ns-thermal: default on iProc SoCs
ti-soc-thermal: Fix a typo in a comment line
ti-soc-thermal: Delete error messages for failed memory allocations in ti_bandgap_build()
ti-soc-thermal: Use devm_kcalloc() in ti_bandgap_build()
thermal: core: make thermal_emergency_poweroff static
thermal: qoriq: remove useless call for of_thermal_get_trip_points()
Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. In this case, no initializers
are needed (they can be NULL initialized and callers adjusted to check
for NULL, which is more efficient than an indirect call).
Cc: Robin Holt <robinmholt@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
When trying to propagate an error result, the error return path attempts
to retain the error, but does this with an open cast across very different
types, which the upcoming structure layout randomization plugin flags as
being potentially dangerous in the face of randomization. This is a false
positive, but what this code actually wants to do is use ERR_CAST() to
retain the error value.
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
When trying to propagate an error result, the error return path attempts
to retain the error, but does this with an open cast across very different
types, which the upcoming structure layout randomization plugin flags as
being potentially dangerous in the face of randomization. This is a false
positive, but what this code actually wants to do is use ERR_CAST() to
retain the error value.
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
When the call to nfs_devname() fails, the error path attempts to retain
the error via the mnt variable, but this requires a cast across very
different types (char * to struct vfsmount *), which the upcoming
structure layout randomization plugin flags as being potentially
dangerous in the face of randomization. This is a false positive, but
what this code actually wants to do is retain the error value, so this
patch explicitly sets it, instead of using what seems to be an
unexpected cast.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Stub functions in header files need to be static, or we can have
multiple definitions errors.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 6335e9f244 ("net: dsa: mv88e6xxx: mv88e6390X SERDES support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
ad7152_write_raw_samp_freq() is called by ad7152_write_raw() with
chip->state_lock held. So, there is unavoidable deadlock when
ad7152_write_raw_samp_freq() locks the mutex itself.
The patch removes unneeded locking.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Fixes: 6572389bcc ("staging: iio: cdc: ad7152: Implement
IIO_CHAN_INFO_SAMP_FREQ attribute")
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Sabrina Dubroca reported an early panic:
BUG: unable to handle kernel paging request at ffffffffff240001
IP: efi_bgrt_init+0xdc/0x134
[...]
---[ end Kernel panic - not syncing: Attempted to kill the idle task!
... which was introduced by:
7b0a911478 ("efi/x86: Move the EFI BGRT init code to early init code")
The cause is that on this machine the firmware provides the EFI ACPI BGRT
table even on legacy non-EFI bootups - which table should be EFI only.
The garbage BGRT data causes the efi_bgrt_init() panic.
Add a check to skip efi_bgrt_init() in case non-EFI bootup to work around
this firmware bug.
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: <stable@vger.kernel.org> # v4.11+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 7b0a911478 ("efi/x86: Move the EFI BGRT init code to early init code")
Link: http://lkml.kernel.org/r/20170526113652.21339-6-matt@codeblueprint.co.uk
[ Rewrote the changelog to be more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For EFI with the 'efi=old_map' kernel option specified, the kernel will panic
when KASLR is enabled:
BUG: unable to handle kernel paging request at 000000007febd57e
IP: 0x7febd57e
PGD 1025a067
PUD 0
Oops: 0010 [#1] SMP
Call Trace:
efi_enter_virtual_mode()
start_kernel()
x86_64_start_reservations()
x86_64_start_kernel()
start_cpu()
The root cause is that the identity mapping is not built correctly
in the 'efi=old_map' case.
On 'nokaslr' kernels, PAGE_OFFSET is 0xffff880000000000 which is PGDIR_SIZE
aligned. We can borrow the PUD table from the direct mappings safely. Given a
physical address X, we have pud_index(X) == pud_index(__va(X)).
However, on KASLR kernels, PAGE_OFFSET is PUD_SIZE aligned. For a given physical
address X, pud_index(X) != pud_index(__va(X)). We can't just copy the PGD entry
from direct mapping to build identity mapping, instead we need to copy the
PUD entries one by one from the direct mapping.
Fix it.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Frank Ramsay <frank.ramsay@hpe.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170526113652.21339-5-matt@codeblueprint.co.uk
[ Fixed and reworded the changelog and code comments to be more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Booting kexec kernel with "efi=old_map" in kernel command line hits
kernel panic as shown below.
BUG: unable to handle kernel paging request at ffff88007fe78070
IP: virt_efi_set_variable.part.7+0x63/0x1b0
PGD 7ea28067
PUD 7ea2b067
PMD 7ea2d067
PTE 0
[...]
Call Trace:
virt_efi_set_variable()
efi_delete_dummy_variable()
efi_enter_virtual_mode()
start_kernel()
x86_64_start_reservations()
x86_64_start_kernel()
start_cpu()
[ efi=old_map was never intended to work with kexec. The problem with
using efi=old_map is that the virtual addresses are assigned from the
memory region used by other kernel mappings; vmalloc() space.
Potentially there could be collisions when booting kexec if something
else is mapped at the virtual address we allocated for runtime service
regions in the initial boot - Matt Fleming ]
Since kexec was never intended to work with efi=old_map, disable
runtime services in kexec if booted with efi=old_map, so that we don't
panic.
Tested-by: Lee Chun-Yi <jlee@suse.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170526113652.21339-4-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
syszkaller fuzzer triggered a divide by zero, when set calibration
through ioctl().
To fix it, test 'bitrate' if it is negative or 0, just return -EINVAL.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Firo Yang <firogm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The binding documentation for the mv88e6xxx switch is missing the
eeprom-length property, which has been implemented since May 2016,
commit f8cd8753de ("dsa: mv88e6xxx: Handle eeprom-length property")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The overrun check for the size of submitted commands is off by one.
It should allow the offset plus the size to be equal to the
size of the memory object when the command stream is very tightly
constructed.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Amongst its other duties, msm_gem_new_impl adds the newly created
GEM object to the shared inactive list which may also be actively
modifiying the list during submission. All the paths to modify
the list are protected by the mutex except for the one through
msm_gem_import which can end up causing list corruption.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[add extra WARN_ON(!mutex_is_locked(&dev->struct_mutex))]
Signed-off-by: Rob Clark <robdclark@gmail.com>
struct irq_domain_ops is not modified, so it can be made const.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Otherwise if someone was using old bindings with "core_clk" instead of
"core" as the clock name, we'd never find it and gpu would be stuck at
27MHz (or whatever it's slowest rate is).
Fixes: 98db803 ("msm/drm: gpu: Dynamically locate the clocks from the device tree")
Signed-off-by: Rob Clark <robdclark@gmail.com>
Otherwise, if nothing else enabled selects it, dev_pm_opp_of_add_table()
will return -ENOTSUPP.
Fixes: e2af8b6 ("drm/msm: gpu: Use OPP tables if we can")
Signed-off-by: Rob Clark <robdclark@gmail.com>
Pull tty/serial fixes from Greg KH:
"Here are some serial and tty fixes for 4.12-rc3. They are a bit bigger
than normal, which is why I had them bake in linux-next for a few
weeks and didn't send them to you for -rc2.
They revert a few of the serdev patches from 4.12-rc1, and bring
things back to how they were in 4.11, to try to make things a bit more
stable there. Rob and Johan both agree that this is the way forward,
so this isn't people squabbling over semantics. Other than that, just
a few minor serial driver fixes that people have had problems with.
All of these have been in linux-next for a few weeks with no reported
issues"
* tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: altera_uart: call iounmap() at driver remove
serial: imx: ensure UCR3 and UFCR are setup correctly
MAINTAINERS/serial: Change maintainer of jsm driver
serial: enable serdev support
tty/serdev: add serdev registration interface
serdev: Restore serdev_device_write_buf for atomic context
serial: core: fix crash in uart_suspend_port
tty: fix port buffer locking
tty: ehv_bytechan: clean up init error handling
serial: ifx6x60: fix use-after-free on module unload
serial: altera_jtaguart: adding iounmap()
serial: exar: Fix stuck MSIs
serial: efm32: Fix parity management in 'efm32_uart_console_get_options()'
serdev: fix tty-port client deregistration
Revert "tty_port: register tty ports with serdev bus"
drivers/tty: 8250: only call fintek_8250_probe when doing port I/O
Pull powerpc fixes from Michael Ellerman:
"Fix running SPU programs on Cell, and a few other minor fixes.
Thanks to Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas
Piggin"
* tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Add PPC_FEATURE userspace bits for SCV and DARN instructions
powerpc/spufs: Fix hash faults for kernel regions
powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N
powerpc/powernv/npu-dma.c: Fix opal_npu_destroy_context() call
selftests/powerpc: Fix TM resched DSCR test with some compilers
Pull x86 fixes from Thomas Gleixner:
"A series of fixes for X86:
- The final fix for the end-of-stack issue in the unwinder
- Handle non PAT systems gracefully
- Prevent access to uninitiliazed memory
- Move early delay calaibration after basic init
- Fix Kconfig help text
- Fix a cross compile issue
- Unbreak older make versions"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/timers: Move simple_udelay_calibration past init_hypervisor_platform
x86/alternatives: Prevent uninitialized stack byte read in apply_alternatives()
x86/PAT: Fix Xorg regression on CPUs that don't support PAT
x86/watchdog: Fix Kconfig help text file path reference to lockup watchdog documentation
x86/build: Permit building with old make versions
x86/unwind: Add end-of-stack check for ftrace handlers
Revert "x86/entry: Fix the end of the stack for newly forked tasks"
x86/boot: Use CROSS_COMPILE prefix for readelf
Pull timer fixlet from Thomas Gleixner:
"Silence dmesg spam by making the posix cpu timer printks depend on
print_fatal_signals"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-timers: Make signal printks conditional
Pull RAS fixes from Thomas Gleixner:
"Two fixlets for RAS:
- Export memory_error() so the NFIT module can utilize it
- Handle memory errors in NFIT correctly"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
acpi, nfit: Fix the memory error check in nfit_handle_mce()
x86/MCE: Export memory_error()
Pull perf tooling fixes from Thomas Gleixner:
- Synchronization of tools and kernel headers
- A series of fixes for perf report addressing various failures:
* Handle invalid maps proper
* Plug a memory leak
* Handle frames and callchain order correctly
- Fixes for handling inlines and children mode
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/include: Sync kernel ABI headers with tooling headers
perf tools: Put caller above callee in --children mode
perf report: Do not drop last inlined frame
perf report: Always honor callchain order for inlined nodes
perf script: Add --inline option for debugging
perf report: Fix off-by-one for non-activation frames
perf report: Fix memory leak in addr2line when called by addr2inlines
perf report: Don't crash on invalid maps in `-g srcline` mode
Pull locking fix from Thomas Gleixner:
"A fix for a state leak which was introduced in the recent rework of
futex/rtmutex interaction"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()
Pull kthread fix from Thomas Gleixner:
"A single fix which prevents a use after free when kthread fork fails"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kthread: Fix use-after-free if kthread fork fails
Pull ftrace fixes from Steven Rostedt:
"There's been a few memory issues found with ftrace.
One was simply a memory leak where not all was being freed that should
have been in releasing a file pointer on set_graph_function.
Then Thomas found that the ftrace trampolines were marked for
read/write as well as execute. To shrink the possible attack surface,
he added calls to set them to ro. Which also uncovered some other
issues with freeing module allocated memory that had its permissions
changed.
Kprobes had a similar issue which is fixed and a selftest was added to
trigger that issue again"
* tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
x86/ftrace: Make sure that ftrace trampolines are not RWX
x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()
selftests/ftrace: Add a testcase for many kprobe events
kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
ftrace: Fix memory leak in ftrace_graph_release()
Currently VBUS is turned off while a usb device is detached, and turned
on again by the polling routine. This short period VBUS loss prevents
usb modem to switch mode.
VBUS should be constantly on for host-only mode, so this changes the
driver to not turn off VBUS for host-only mode.
Fixes: 2f3fd2c5bd ("usb: musb: Prepare dsps glue layer for PM runtime support")
Cc: stable@vger.kernel.org #v4.11
Reported-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ftrace use module_alloc() to allocate trampoline pages. The mapping of
module_alloc() is RWX, which makes sense as the memory is written to right
after allocation. But nothing makes these pages RO after writing to them.
Add proper set_memory_rw/ro() calls to protect the trampolines after
modification.
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705251056410.1862@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
With function tracing starting in early bootup and having its trampoline
pages being read only, a bug triggered with the following:
kernel BUG at arch/x86/mm/pageattr.c:189!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc2-test+ #3
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
task: ffffffffb4222500 task.stack: ffffffffb4200000
RIP: 0010:change_page_attr_set_clr+0x269/0x302
RSP: 0000:ffffffffb4203c88 EFLAGS: 00010046
RAX: 0000000000000046 RBX: 0000000000000000 RCX: 00000001b6000000
RDX: ffffffffb4203d40 RSI: 0000000000000000 RDI: ffffffffb4240d60
RBP: ffffffffb4203d18 R08: 00000001b6000000 R09: 0000000000000001
R10: ffffffffb4203aa8 R11: 0000000000000003 R12: ffffffffc029b000
R13: ffffffffb4203d40 R14: 0000000000000001 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff9a639ea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff9a636b384000 CR3: 00000001ea21d000 CR4: 00000000000406b0
Call Trace:
change_page_attr_clear+0x1f/0x21
set_memory_ro+0x1e/0x20
arch_ftrace_update_trampoline+0x207/0x21c
? ftrace_caller+0x64/0x64
? 0xffffffffc029b000
ftrace_startup+0xf4/0x198
register_ftrace_function+0x26/0x3c
function_trace_init+0x5e/0x73
tracer_init+0x1e/0x23
tracing_set_tracer+0x127/0x15a
register_tracer+0x19b/0x1bc
init_function_trace+0x90/0x92
early_trace_init+0x236/0x2b3
start_kernel+0x200/0x3f5
x86_64_start_reservations+0x29/0x2b
x86_64_start_kernel+0x17c/0x18f
secondary_startup_64+0x9f/0x9f
? secondary_startup_64+0x9f/0x9f
Interrupts should not be enabled at this early in the boot process. It is
also fine to leave interrupts enabled during this time as there's only one
CPU running, and on_each_cpu() means to only run on the current CPU.
If early_boot_irqs_disabled is set, it is safe to run cpu_flush_range() with
interrupts disabled. Don't trigger a BUG_ON() in that case.
Link: http://lkml.kernel.org/r/20170526093717.0be3b849@gandalf.local.home
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Add a testcase to test kprobes via ftrace interface
with many concurrent kprobe events.
This tries to add many kprobe events (up to 256) on
kernel functions. To avoid making ftrace-based
kprobes (kprobes on fentry), it skips first N bytes
(on x86 N=5, on ppc or arm N=4) of function entry.
After that, it enables all those events, disable it,
and remove it.
Since the unoptimization buffer reclaiming will
be delayed, after removing events, it will wait
enough time.
Link: http://lkml.kernel.org/r/149577388470.11702.11832460851769204511.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix kprobes to set(recover) RWX bits correctly on trampoline
buffer before releasing it. Releasing readonly page to
module_memfree() crash the kernel.
Without this fix, if kprobes user register a bunch of kprobes
in function body (since kprobes on function entry usually
use ftrace) and unregister it, kernel hits a BUG and crash.
Link: http://lkml.kernel.org/r/149570868652.3518.14120169373590420503.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: d0381c81c2 ("kprobes/x86: Set kprobes pages read-only")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull input layer fixes from Dmitry Torokhov:
"Just a few fixups to a couple of drivers"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - ignore signals when finishing updating firmware
Input: elan_i2c - clear INT before resetting controller
Input: atmel_mxt_ts - add T100 as a readable object
Input: edt-ft5x06 - increase allowed data range for threshold parameter
If TRIM_UNUSED_KSYMS is enabled, all unneeded exported symbols are made
unexported. Two-pass build of the kernel is done to find out which
symbols are needed based on a configuration. This effectively
complicates things for out-of-tree modules.
Livepatch exports functions to (un)register and enable/disable a live
patch. The only in-tree module which uses these functions is a sample in
samples/livepatch/. If the sample is disabled, the functions are
trimmed and out-of-tree live patches cannot be built.
Note that live patches are intended to be built out-of-tree.
Suggested-by: Michal Marek <mmarek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
mpage_submit_page() can race with another process growing i_size and
writing data via mmap to the written-back page. As mpage_submit_page()
samples i_size too early, it may happen that ext4_bio_write_page()
zeroes out too large tail of the page and thus corrupts user data.
Fix the problem by sampling i_size only after the page has been
write-protected in page tables by clear_page_dirty_for_io() call.
Reported-by: Michael Zimmer <michael@swarm64.com>
CC: stable@vger.kernel.org
Fixes: cb20d51883
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When ext4_map_blocks() is called with EXT4_GET_BLOCKS_ZERO to zero-out
allocated blocks and these blocks are actually converted from unwritten
extent the following race can happen:
CPU0 CPU1
page fault page fault
... ...
ext4_map_blocks()
ext4_ext_map_blocks()
ext4_ext_handle_unwritten_extents()
ext4_ext_convert_to_initialized()
- zero out converted extent
ext4_zeroout_es()
- inserts extent as initialized in status tree
ext4_map_blocks()
ext4_es_lookup_extent()
- finds initialized extent
write data
ext4_issue_zeroout()
- zeroes out new extent overwriting data
This problem can be reproduced by generic/340 for the fallocated case
for the last block in the file.
Fix the problem by avoiding zeroing out the area we are mapping with
ext4_map_blocks() in ext4_ext_convert_to_initialized(). It is pointless
to zero out this area in the first place as the caller asked us to
convert the area to initialized because he is just going to write data
there before the transaction finishes. To achieve this we delete the
special case of zeroing out full extent as that will be handled by the
cases below zeroing only the part of the extent that needs it. We also
instruct ext4_split_extent() that the middle of extent being split
contains data so that ext4_split_extent_at() cannot zero out full extent
in case of ENOSPC.
CC: stable@vger.kernel.org
Fixes: 12735f8819
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Callers normally treat the config space accessors as returning PCBIOS_*
error codes, not Linux error codes (or they don't look at them at all). We
have pcibios_err_to_errno() in case the error code needs to be translated.
Fixes: 4b10388347 ("PCI: Don't attempt config access to disconnected devices")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Pull LED fix from Jacek Anaszewski:
"A single LED fix for 4.12-rc3.
leds-pca955x driver uses only i2c_smbus API and thus it should pass
I2C_FUNC_SMBUS_BYTE_DATA flag to i2c_check_functionality"
* tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: pca955x: Correct I2C Functionality
Pull networking fixes from David Miller:
1) Fix state pruning in bpf verifier wrt. alignment, from Daniel
Borkmann.
2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide
Caratti.
3) Fix bit field definitions for rss_hash_type of descriptors in mlx5
driver, from Jesper Brouer.
4) Defer slave->link updates until bonding is ready to do a full commit
to the new settings, from Nithin Sujir.
5) Properly reference count ipv4 FIB metrics to avoid use after free
situations, from Eric Dumazet and several others including Cong Wang
and Julian Anastasov.
6) Fix races in llc_ui_bind(), from Lin Zhang.
7) Fix regression of ESP UDP encapsulation for TCP packets, from
Steffen Klassert.
8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap.
9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter
Dawson.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
ipv4: add reference counting to metrics
net: ethernet: ax88796: don't call free_irq without request_irq first
ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
sctp: fix ICMP processing if skb is non-linear
net: llc: add lock_sock in llc_ui_bind to avoid a race condition
bonding: Don't update slave->link until ready to commit
test_bpf: Add a couple of tests for BPF_JSGE.
bpf: add various verifier test cases
bpf: fix wrong exposure of map_flags into fdinfo for lpm
bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data
bpf: properly reset caller saved regs after helper call and ld_abs/ind
bpf: fix incorrect pruning decision when alignment must be tracked
arp: fixed -Wuninitialized compiler warning
tcp: avoid fastopen API to be used on AF_UNSPEC
net: move somaxconn init from sysctl code
net: fix potential null pointer dereference
geneve: fix fill_info when using collect_metadata
virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
be2net: Fix offload features for Q-in-Q packets
vlan: Fix tcp checksum offloads in Q-in-Q vlans
...
Pull XFS fixes from Darrick Wong:
"A few miscellaneous bug fixes & cleanups:
- Fix indlen block reservation accounting bug when splitting delalloc
extent
- Fix warnings about unused variables that appeared in -rc1.
- Don't spew errors when bmapping a local format directory
- Fix an off-by-one error in a delalloc eof assertion
- Make fsmap only return inode information for CAP_SYS_ADMIN
- Fix a potential mount time deadlock recovering cow extents
- Fix unaligned memory access in _btree_visit_blocks
- Fix various SEEK_HOLE/SEEK_DATA bugs"
* tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: Move handling of missing page into one place in xfs_find_get_desired_pgoff()
xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff()
xfs: Fix missed holes in SEEK_HOLE implementation
xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
xfs: fix unaligned access in xfs_btree_visit_blocks
xfs: avoid mount-time deadlock in CoW extent recovery
xfs: only return detailed fsmap info if the caller has CAP_SYS_ADMIN
xfs: bad assertion for delalloc an extent that start at i_size
xfs: fix warnings about unused stack variables
xfs: BMAPX shouldn't barf on inline-format directories
xfs: fix indlen accounting error on partial delalloc conversion
Andrey Konovalov reported crashes in ipv4_mtu()
I could reproduce the issue with KASAN kernels, between
10.246.7.151 and 10.246.7.152 :
1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 &
2) At the same time run following loop :
while :
do
ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
done
Cong Wang attempted to add back rt->fi in commit
82486aa6f1 ("ipv4: restore rt->fi for reference counting")
but this proved to add some issues that were complex to solve.
Instead, I suggested to add a refcount to the metrics themselves,
being a standalone object (in particular, no reference to other objects)
I tried to make this patch as small as possible to ease its backport,
instead of being super clean. Note that we believe that only ipv4 dst
need to take care of the metric refcount. But if this is wrong,
this patch adds the basic infrastructure to extend this to other
families.
Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang
for his efforts on this problem.
Fixes: 2860583fe8 ("ipv4: Kill rt->fi")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function ax_init_dev (which is called only from the driver's .probe
function) calls free_irq in the error path without having requested the
irq in the first place. So drop the free_irq call in the error path.
Fixes: 825a2ff189 ("AX88796 network driver")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fix addresses two problems in the way the DSCP field is formulated
on the encapsulating header of IPv6 tunnels.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195661
1) The IPv6 tunneling code was manipulating the DSCP field of the
encapsulating packet using the 32b flowlabel. Since the flowlabel is
only the lower 20b it was incorrect to assume that the upper 12b
containing the DSCP and ECN fields would remain intact when formulating
the encapsulating header. This fix handles the 'inherit' and
'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable.
2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was
incorrect and resulted in the DSCP value always being set to 0.
Commit 90427ef5d2 ("ipv6: fix flow labels when the traffic class
is non-0") caused the regression by masking out the flowlabel
which exposed the incorrect handling of the DSCP portion of the
flowlabel in ip6_tunnel and ip6_gre.
Fixes: 90427ef5d2 ("ipv6: fix flow labels when the traffic class is non-0")
Signed-off-by: Peter Dawson <peter.a.dawson@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sometimes ICMP replies to INIT chunks are ignored by the client, even if
the encapsulated SCTP headers match an open socket. This happens when the
ICMP packet is carried by a paged skb: use skb_header_pointer() to read
packet contents beyond the SCTP header, so that chunk header and initiate
tag are validated correctly.
v2:
- don't use skb_header_pointer() to read the transport header, since
icmp_socket_deliver() already puts these 8 bytes in the linear area.
- change commit message to make specific reference to INIT chunks.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a race condition in llc_ui_bind if two or more processes/threads
try to bind a same socket.
If more processes/threads bind a same socket success that will lead to
two problems, one is this action is not what we expected, another is
will lead to kernel in unstable status or oops(in my simple test case,
cause llc2.ko can't unload).
The current code is test SOCK_ZAPPED bit to avoid a process to
bind a same socket twice but that is can't avoid more processes/threads
try to bind a same socket at the same time.
So, add lock_sock in llc_ui_bind like others, such as llc_ui_connect.
Signed-off-by: Lin Zhang <xiaolou4617@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull block fixes from Jens Axboe:
"A collection of fixes that should go into this series. This contains:
- A set of NVMe fixes, pulled from Christoph. This includes a set of
fixes for the fiber channel bits from James Smart, rdma queue depth
fix from Marta, controller removal fixes from Ming, and some more
APST quirk updates from Andy.
- A blk-mq debugfs fix from Bart, fixing a problem with the
untangling of the sysfs and debugfs blk-mq bits that was added in
this series.
- Error code fix in add_partition() from Dan.
- A small series of fixes for the new blk-throttle code from Shaohua"
* 'for-linus' of git://git.kernel.dk/linux-block: (21 commits)
blk-mq: Only register debugfs attributes for blk-mq queues
nvme: Quirk APST on Intel 600P/P3100 devices
nvme: only setup block integrity if supported by the driver
nvme: replace is_flags field in nvme_ctrl_ops with a flags field
nvme-pci: consistencly use ctrl->device for logging
partitions/msdos: FreeBSD UFS2 file systems are not recognized
block: fix an error code in add_partition()
blk-throttle: force user to configure all settings for io.low
blk-throttle: respect 0 bps/iops settings for io.low
blk-throttle: output some debug info in trace
blk-throttle: add hierarchy support for latency target and idle time
nvme_fc: remove extra controller reference taken on reconnect
nvme_fc: correct nvme status set on abort
nvme_fc: set logging level on resets/deletes
nvme_fc: revise comment on teardown
nvme_fc: Support ctrl_loss_tmo
nvme_fc: get rid of local reconnect_delay
blk-mq: remove blk_mq_abort_requeue_list()
nvme: avoid to use blk_mq_abort_requeue_list()
nvme: use blk_mq_start_hw_queues() in nvme_kill_queues()
...
Pull PCI fixes from Bjorn Helgaas:
- fix PCI_ENDPOINT build error (merged for v4.12)
- fix Switchtec driver (merged for v4.12)
- fix imx6 config read timeouts, fallout from changing to non-postable
reads
- add PM "needs_resume" flag for i915 suspend issue
* tag 'pci-v4.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/PM: Add needs_resume flag to avoid suspend complete optimization
PCI: imx6: Fix config read timeout handling
switchtec: Fix minor bug with partition ID register
switchtec: Use new cdev_device_add() helper function
PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA
Pul ceph fixes from Ilya Dryomov:
"A bunch of make W=1 and static checker fixups, a RECONNECT_SEQ
messenger patch from Zheng and Luis' fallocate fix"
* tag 'ceph-for-4.12-rc3' of git://github.com/ceph/ceph-client:
ceph: check that the new inode size is within limits in ceph_fallocate()
libceph: cleanup old messages according to reconnect seq
libceph: NULL deref on crush_decode() error path
libceph: fix error handling in process_one_ticket()
libceph: validate blob_struct_v in process_one_ticket()
libceph: drop version variable from ceph_monmap_decode()
libceph: make ceph_msg_data_advance() return void
libceph: use kbasename() and kill ceph_file_part()
Pull MMC fixes from Ulf Hansson:
"This contains fixes to make the WiFi work again for the ARM64 Hikey
board.
Together with a couple of DTS updates for the Hikey board we have also
extended the mmc pwrseq_simple, to support a new power-off-delay-us DT
property, as that was required to enable a graceful power off sequence
for the WiFi chip"
* tag 'mmc-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
arm64: dts: hikey: Fix WiFi support
arm64: dts: hi6220: Move board data from the dwmmc nodes to hikey dts
arm64: dts: hikey: Add the SYS_5V and the VDD_3V3 regulators
arm64: dts: hi6220: Move the fixed_5v_hub regulator to the hikey dts
arm64: dts: hikey: Add clock for the pmic mfd
mfd: dts: hi655x: Add clock binding for the pmic
mmc: pwrseq_simple: Parse DTS for the power-off-delay-us property
mmc: dt: pwrseq-simple: Invent power-off-delay-us
Pull sound fixes from Takashi Iwai:
"This contains a few HD-audio device-specific quirks and an endianess
fix for USB-audio, as well as the update of quirk model list document.
All fixes are small and trivial.
The document update could have been postponed, but it's a good thing
for user and has absolutely zero risk of breakage, so included here"
* tag 'sound-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430
ALSA: hda - Update the list of quirk models
ALSA: hda - Provide dual-codecs model option for a few Realtek codecs
ALSA: hda - Apply dual-codec quirk for MSI Z270-Gaming mobo
ALSA: hda - No loopback on ALC299 codec
ALSA: usb-audio: fix Amanero Combo384 quirk on big-endian hosts
Intel SDM says, that at most one LAPIC should be configured with ExtINT
delivery. KVM configures all LAPICs this way. This causes pic_unlock()
to kick the first available vCPU from the internal KVM data structures.
If this vCPU is not the BSP, but some not-yet-booted AP, the BSP may
never realize that there is an interrupt.
Fix that by enabling ExtINT delivery only for the BSP.
This allows booting a Linux guest without a TSC in the above situation.
Otherwise the BSP gets stuck in calibrate_delay_converge().
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The decision whether or not to exit from L2 to L1 on an lmsw instruction is
based on bogus values: instead of using the information encoded within the
exit qualification, it uses the data also used for the mov-to-cr
instruction, which boils down to using whatever is in %eax at that point.
Use the correct values instead.
Without this fix, an L1 may not get notified when a 32-bit Linux L2
switches its secondary CPUs to protected mode; the L1 is only notified on
the next modification of CR0. This short time window poses a problem, when
there is some other reason to exit to L1 in between. Then, L2 will be
resumed in real mode and chaos ensues.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull drm fixes from Dave Airlie:
"Not a whole lot happening here, a set of amdgpu fixes and one core
deadlock fix, and some misc drivers fixes"
* tag 'drm-fixes-for-v4.12-rc3' of git://people.freedesktop.org/~airlied/linux:
drm/amdgpu: fix null point error when rmmod amdgpu.
drm/amd/powerplay: fix a signedness bugs
drm/amdgpu: fix NULL pointer panic of emit_gds_switch
drm/radeon: Unbreak HPD handling for r600+
drm/amd/powerplay/smu7: disable mclk switching for high refresh rates
drm/amd/powerplay/smu7: add vblank check for mclk switching (v2)
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu: fix fundamental suspend/resume issue
drm/gma500/psb: Actually use VBT mode when it is found
drm: Fix deadlock retry loop in page_flip_ioctl
drm: qxl: Delay entering atomic context during cursor update
drm/radeon: Fix oops upon driver load on PowerXpress laptops
Preemption can occur during cancel preemption timer, and there will be
inconsistent status in lapic, vmx and vmcs field.
CPU0 CPU1
preemption timer vmexit
handle_preemption_timer(vCPU0)
kvm_lapic_expired_hv_timer
vmx_cancel_hv_timer
vmx->hv_deadline_tsc = -1
vmcs_clear_bits
/* hv_timer_in_use still true */
sched_out
sched_in
kvm_arch_vcpu_load
vmx_set_hv_timer
write vmx->hv_deadline_tsc
vmcs_set_bits
/* back in kvm_lapic_expired_hv_timer */
hv_timer_in_use = false
...
vmx_vcpu_run
vmx_arm_hv_run
write preemption timer deadline
spurious preemption timer vmexit
handle_preemption_timer(vCPU0)
kvm_lapic_expired_hv_timer
WARN_ON(!apic->lapic_timer.hv_timer_in_use);
This can be reproduced sporadically during boot of L2 on a
preemptible L1, causing a splat on L1.
WARNING: CPU: 3 PID: 1952 at arch/x86/kvm/lapic.c:1529 kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm]
CPU: 3 PID: 1952 Comm: qemu-system-x86 Not tainted 4.12.0-rc1+ #24 RIP: 0010:kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm]
Call Trace:
handle_preemption_timer+0xe/0x20 [kvm_intel]
vmx_handle_exit+0xc9/0x15f0 [kvm_intel]
? lock_acquire+0xdb/0x250
? lock_acquire+0xdb/0x250
? kvm_arch_vcpu_ioctl_run+0xdf3/0x1ce0 [kvm]
kvm_arch_vcpu_ioctl_run+0xe55/0x1ce0 [kvm]
kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? __fget+0xf3/0x210
do_vfs_ioctl+0xa4/0x700
? __fget+0x114/0x210
SyS_ioctl+0x79/0x90
do_syscall_64+0x8f/0x750
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL64_slow_path+0x25/0x25
This patch fixes it by disabling preemption while cancelling
preemption timer. This way cancel_hv_timer is atomic with
respect to kvm_arch_vcpu_load.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We need to return an error for any call that asks for MSI / MSI-X
vectors only, so that non-trivial fallback logic can work properly.
Also valid dev->irq and use the "correct" errno value based on feedback
from Linus.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Fixes: aff17164 ("PCI: Provide sensible IRQ vector alloc/free routines")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We don't need to bitbang these pins anymore, instead we muxed these
pins as SPI, after this change, done in commit 6c69f726, we introduced
the following error:
pinctrl-single 44e10800.pinmux: pin PIN85 already requested \
by 44e10800.pinmux; cannot claim for 48030000.spi
pinctrl-single 44e10800.pinmux: pin-85 (48030000.spi) status -22
Fixes: 6c69f726 ("ARM: dts: am335x-sl50: Enable SPI0 interface and Flash Memory")
Cc: <stable@vger.kernel.org> # 4.11
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The second version of the hardware moved the card detect pin from gpio0_6
to gpio1_9, as we won't support the first hardware version fix the pinmux
configuration of this pin.
Fixes: 8584d4fc ("ARM: dts: am335x-sl50: Add Toby-Churchill SL50 board support.")
Cc: <stable@vger.kernel.org> # 4.11
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Christoph writes:
"A couple of fixes for the next rc on the nvme front. Various FC fixes
from James, controller removal fixes from Ming (including a block layer
patch), a APST related device quirk from Andy, a RDMA fix for small
queue depth device from Marta, as well as fixes for the lack of
metadata support in non-PCIe drivers and the printk logging format from
me."
The code in blk-mq-debugfs.c assumes that it is working on a blk-mq
queue and is not intended to work on a blk-sq queue. Hence only
register blk-mq debugfs attributes for blk-mq queues.
Fixes: commit 9c1051aacd ("blk-mq: untangle debugfs and sysfs")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
commit 25165f79ad ("ASoC: rsnd: enable clock-frequency for both
44.1kHz/48kHz") supported both 44.1kHz/48kHz for AUDIO_CLKOUTx,
but it didn't care its parent clock name.
This patch fixes it.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
vlv_display_irq_postinstall() enables the LPE audio interrupts
regardless of whether the LPE audio irq chip has masked/unmasked
them. Also the irqchip masking/unmasking doesn't consider the state
of the display power well or the device, and hence just leads to
dmesg spew when it tries to access the hardware while it's powered
down.
If the current way works, then we don't need to do anything in the
mask/unmask hooks. If it doesn't work, well, then we'd need to properly
track whether the irqchip has masked/unmasked the interrupts when
we enable display interrupts. And the mask/unmask hooks would need
to check whether display interrupts are even enabled before frobbing
with he registers.
So let's just assume the current way works and neuter the mask/unmask
hooks. Also clean up vlv_display_irq_postinstall() a bit and stop
it from trying to unmask/enable the LPE C interrupt on VLV since it
doesn't exist.
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170427160231.13337-4-ville.syrjala@linux.intel.com
Reviewed-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit ebf5f92147)
Reference: http://mid.mail-archive.com/874cf6d3-4e45-d4cf-e662-eb972490d2ce@redhat.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Ethernet networking on K2L has been broken since v4.11-rc1. This was
caused by commit 32a34441a9 ("ARM: keystone: dts: fix netcp clocks
and add names"). This commit inadvertently moves on-chip static RAM
clock to the end of list of clocks provided for netcp. Since keystone
PM domain support does not have a list of recognized con_ids, only the
first clock in the list comes under runtime PM management. This means
the OSR (On-chip Static RAM) clock remains disabled and that broke
networking on K2L.
The OSR is used by QMSS on K2L as an external linking RAM. However this
is a standalone RAM that can be used for non-QMSS usage (as well as from
DSP side). So add a SRAM device node for the same and add the OSR clock
to the node.
Remove the now redundant OSR clock node from netcp.
To manage all clocks defined for netCP's use by runtime PM needs keystone
generic power domain (genpd) driver support which is under works.
Meanwhile, this patch restores K2L networking and is correct irrespective
of any future genpd work since OSR is an independent module and not part
of NetCP anyway.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Acked-by: Tero Kristo <t-kristo@ti.com>
[nsekhar@ti.com: commit message updates, port to latest mainline]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Cc: stable@vger.kernel.org # for 4.11
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Currently only the PCIe driver supports metadata, so we should not claim
integrity support for the other drivers. This prevents nasty crashes
with targets that advertise metadata support on fabrics.
Also use the opportunity to factor out some code into a separate helper
that isn't even compiled if CONFIG_BLK_DEV_INTEGRITY is disabled.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
So that we can have more flags for transport-specific behavior.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
This is what most of the code already does and gives much more useful
prefixes than the device embedded in the pci_dev.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
A bunch of bug fixes:
- Fix display flickering on some chips at high refresh rates
- suspend/resume fix
- hotplug fix
- a couple of segfault fixes for certain cases
* 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: fix null point error when rmmod amdgpu.
drm/amd/powerplay: fix a signedness bugs
drm/amdgpu: fix NULL pointer panic of emit_gds_switch
drm/radeon: Unbreak HPD handling for r600+
drm/amd/powerplay/smu7: disable mclk switching for high refresh rates
drm/amd/powerplay/smu7: add vblank check for mclk switching (v2)
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu: fix fundamental suspend/resume issue
Core Changes:
- Don't drop vblank reference more than once in cases of ww retry (Daniel)
Driver Changes:
- radeon: Fix oops during radeon probe trying to reference wrong device (Lukas)
- qxl: Avoid sleeping while in atomic context on cursor update (Gabriel)
- gma500: Use VBT mode instead of pre-programmed mode for LVDS (Patrik)
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
* tag 'drm-misc-fixes-2017-05-25' of git://anongit.freedesktop.org/git/drm-misc:
drm/gma500/psb: Actually use VBT mode when it is found
drm: Fix deadlock retry loop in page_flip_ioctl
drm: qxl: Delay entering atomic context during cursor update
drm/radeon: Fix oops upon driver load on PowerXpress laptops
Update of maintainers entry for Samsung SoC.
* tag 'samsung-fixes-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
MAINTAINERS: Remove Javier Martinez Canillas as reviewer for Exynos
Signed-off-by: Olof Johansson <olof@lixom.net>
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for 4.12,
please pull the following:
- Phil provides a fix for the BCM283x (Raspberry Pi) by flagging the first
4KiB of physical memory as a reserved region in order to let the secondary
cores successfully spin until they are brought online
* tag 'arm-soc/for-4.12/devicetree-fixes-2' of http://github.com/Broadcom/stblinux:
ARM: dts: bcm283x: Reserve first page for firmware
Signed-off-by: Olof Johansson <olof@lixom.net>
Enable some very core config options used on 64bit Rockchip socs.
As built-in driver enable the Rockchip spi driver as well as the
cros-ec-spi and cros-ec keyboard driver, as this may be helpful
in case an initrd does not work as expected and drops the user
into a shell. Another built-in is the fan53555 regulator driver,
as it and its register-compatible cousins Silergy syr827 and syr828
are often used on Rockchip socs as cpu-supply next to regular pmic.
The rest can be enabled as modules and contains the pcie host
controller and its phy, the sucessive approximation adc (saradc)
that gets often used for additional buttons on Rockchip boards
as well as the adc-keys Keyboard driver for these keys.
The cros-ec-pwm also can be a module, as it is normally only used to
drive display backlights as well as the Rockchip thermal controller
that allows to read the cpu and gpu temperatures and affect frequency
scaling if necessary.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
These fix issues with power management initialization code
on DaVinci. Some resources were getting freed prematurely.
And there was an issue with resources not being on error.
* tag 'davinci-fixes-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: davinci: PM: Do not free useful resources in normal path in 'davinci_pm_init'
ARM: davinci: PM: Free resources in error handling path in 'davinci_pm_init'
Signed-off-by: Olof Johansson <olof@lixom.net>
In the loadbalance arp monitoring scheme, when a slave link change is
detected, the slave->link is immediately updated and slave_state_changed
is set. Later down the function, the rtnl_lock is acquired and the
changes are committed, updating the bond link state.
However, the acquisition of the rtnl_lock can fail. The next time the
monitor runs, since slave->link is already updated, it determines that
link is unchanged. This results in the bond link state permanently out
of sync with the slave link.
This patch modifies bond_loadbalance_arp_mon() to handle link changes
identical to bond_ab_arp_{inspect/commit}(). The new link state is
maintained in slave->new_link until we're ready to commit at which point
it's copied into slave->link.
NOTE: miimon_{inspect/commit}() has a more complex state machine
requiring the use of the bond_{propose,commit}_link_state() functions
which maintains the intermediate state in slave->link_new_state. The arp
monitors don't require that.
Testing: This bug is very easy to reproduce with the following steps.
1. In a loop, toggle a slave link of a bond slave interface.
2. In a separate loop, do ifconfig up/down of an unrelated interface to
create contention for rtnl_lock.
Within a few iterations, the bond link goes out of sync with the slave
link.
Signed-off-by: Nithin Nayak Sujir <nsujir@tintri.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Jay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some JITs can optimize comparisons with zero. Add a couple of
BPF_JSGE tests against immediate zero.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
Various BPF fixes
Follow-up to fix incorrect pruning when alignment tracking is
in use and to properly clear regs after call to not leave stale
data behind, also a fix that adds bpf_clone_redirect to the
bpf_helper_changes_pkt_data helper and exposes correct map_flags
for lpm map into fdinfo. For details, please see individual
patches.
v1 -> v2:
- Reworked first patch so that env->strict_alignment is the
final indicator on whether we have to deal with strict
alignment rather than having CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
checks on various locations, so only checking env->strict_alignment
is sufficient after that. Thanks for spotting, Dave!
- Added patch 3 and 4.
- Rest as is.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds various verifier test cases:
1) A test case for the pruning issue when tracking alignment
is used.
2) Various PTR_TO_MAP_VALUE_OR_NULL tests to make sure pointer
arithmetic turns such register into UNKNOWN_VALUE type.
3) Test cases for the special treatment of LD_ABS/LD_IND to
make sure verifier doesn't break calling convention here.
Latter is needed, since f.e. arm64 JIT uses r1 - r5 for
storing temporary data, so they really must be marked as
NOT_INIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via
attr->map_flags, since it does not support preallocation yet. We
check the flag, but we never copy the flag into trie->map.map_flags,
which is later on exposed into fdinfo and used by loaders such as
iproute2. Latter uses this in bpf_map_selfcheck_pinned() to test
whether a pinned map has the same spec as the one from the BPF obj
file and if not, bails out, which is currently the case for lpm
since it exposes always 0 as flags.
Also copy over flags in array_map_alloc() and stack_map_alloc().
They always have to be 0 right now, but we should make sure to not
miss to copy them over at a later point in time when we add actual
flags for them to use.
Fixes: b95a5c4db0 ("bpf: add a longest prefix match trie map implementation")
Reported-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bpf_clone_redirect() still needs to be listed in
bpf_helper_changes_pkt_data() since we call into
bpf_try_make_head_writable() from there, thus we need
to invalidate prior pkt regs as well.
Fixes: 36bbef52c7 ("bpf: direct packet write and access for helpers for clsact progs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, after performing helper calls, we clear all caller saved
registers, that is r0 - r5 and fill r0 depending on struct bpf_func_proto
specification. The way we reset these regs can affect pruning decisions
in later paths, since we only reset register's imm to 0 and type to
NOT_INIT. However, we leave out clearing of other variables such as id,
min_value, max_value, etc, which can later on lead to pruning mismatches
due to stale data.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when we enforce alignment tracking on direct packet access,
the verifier lets the following program pass despite doing a packet
write with unaligned access:
0: (61) r2 = *(u32 *)(r1 +76)
1: (61) r3 = *(u32 *)(r1 +80)
2: (61) r7 = *(u32 *)(r1 +8)
3: (bf) r0 = r2
4: (07) r0 += 14
5: (25) if r7 > 0x1 goto pc+4
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
6: (2d) if r0 > r3 goto pc+1
R0=pkt(id=0,off=14,r=14) R1=ctx R2=pkt(id=0,off=0,r=14)
R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
7: (63) *(u32 *)(r0 -4) = r0
8: (b7) r0 = 0
9: (95) exit
from 6 to 8:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
8: (b7) r0 = 0
9: (95) exit
from 5 to 10:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=2 R10=fp
10: (07) r0 += 1
11: (05) goto pc-6
6: safe <----- here, wrongly found safe
processed 15 insns
However, if we enforce a pruning mismatch by adding state into r8
which is then being mismatched in states_equal(), we find that for
the otherwise same program, the verifier detects a misaligned packet
access when actually walking that path:
0: (61) r2 = *(u32 *)(r1 +76)
1: (61) r3 = *(u32 *)(r1 +80)
2: (61) r7 = *(u32 *)(r1 +8)
3: (b7) r8 = 1
4: (bf) r0 = r2
5: (07) r0 += 14
6: (25) if r7 > 0x1 goto pc+4
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
7: (2d) if r0 > r3 goto pc+1
R0=pkt(id=0,off=14,r=14) R1=ctx R2=pkt(id=0,off=0,r=14)
R3=pkt_end R7=inv,min_value=0,max_value=1
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
8: (63) *(u32 *)(r0 -4) = r0
9: (b7) r0 = 0
10: (95) exit
from 7 to 9:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
9: (b7) r0 = 0
10: (95) exit
from 6 to 11:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=2
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
11: (07) r0 += 1
12: (b7) r8 = 0
13: (05) goto pc-7 <----- mismatch due to r8
7: (2d) if r0 > r3 goto pc+1
R0=pkt(id=0,off=15,r=15) R1=ctx R2=pkt(id=0,off=0,r=15)
R3=pkt_end R7=inv,min_value=2
R8=imm0,min_value=0,max_value=0,min_align=2147483648 R10=fp
8: (63) *(u32 *)(r0 -4) = r0
misaligned packet access off 2+15+-4 size 4
The reason why we fail to see it in states_equal() is that the
third test in compare_ptrs_to_packet() ...
if (old->off <= cur->off &&
old->off >= old->range && cur->off >= cur->range)
return true;
... will let the above pass. The situation we run into is that
old->off <= cur->off (14 <= 15), meaning that prior walked paths
went with smaller offset, which was later used in the packet
access after successful packet range check and found to be safe
already.
For example: Given is R0=pkt(id=0,off=0,r=0). Adding offset 14
as in above program to it, results in R0=pkt(id=0,off=14,r=0)
before the packet range test. Now, testing this against R3=pkt_end
with 'if r0 > r3 goto out' will transform R0 into R0=pkt(id=0,off=14,r=14)
for the case when we're within bounds. A write into the packet
at offset *(u32 *)(r0 -4), that is, 2 + 14 -4, is valid and
aligned (2 is for NET_IP_ALIGN). After processing this with
all fall-through paths, we later on check paths from branches.
When the above skb->mark test is true, then we jump near the
end of the program, perform r0 += 1, and jump back to the
'if r0 > r3 goto out' test we've visited earlier already. This
time, R0 is of type R0=pkt(id=0,off=15,r=0), and we'll prune
that part because this time we'll have a larger safe packet
range, and we already found that with off=14 all further insn
were already safe, so it's safe as well with a larger off.
However, the problem is that the subsequent write into the packet
with 2 + 15 -4 is then unaligned, and not caught by the alignment
tracking. Note that min_align, aux_off, and aux_off_align were
all 0 in this example.
Since we cannot tell at this time what kind of packet access was
performed in the prior walk and what minimal requirements it has
(we might do so in the future, but that requires more complexity),
fix it to disable this pruning case for strict alignment for now,
and let the verifier do check such paths instead. With that applied,
the test cases pass and reject the program due to misalignment.
Fixes: d117441674 ("bpf: Track alignment of register values in the verifier.")
Reference: http://patchwork.ozlabs.org/patch/761909/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7d472a59c0 ("arp: always override
existing neigh entries with gratuitous ARP") introduced a compiler
warning:
net/ipv4/arp.c:880:35: warning: 'addr_type' may be used uninitialized in
this function [-Wmaybe-uninitialized]
While the code logic seems to be correct and doesn't allow the variable
to be used uninitialized, and the warning is not consistently
reproducible, it's still worth fixing it for other people not to waste
time looking at the warning in case it pops up in the build environment.
Yes, compiler is probably at fault, but we will need to accommodate.
Fixes: 7d472a59c0 ("arp: always override existing neigh entries with gratuitous ARP")
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fastopen API should be used to perform fastopen operations on the TCP
socket. It does not make sense to use fastopen API to perform disconnect
by calling it with AF_UNSPEC. The fastopen data path is also prone to
race conditions and bugs when using with AF_UNSPEC.
One issue reported and analyzed by Vegard Nossum is as follows:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Thread A: Thread B:
------------------------------------------------------------------------
sendto()
- tcp_sendmsg()
- sk_stream_memory_free() = 0
- goto wait_for_sndbuf
- sk_stream_wait_memory()
- sk_wait_event() // sleep
| sendto(flags=MSG_FASTOPEN, dest_addr=AF_UNSPEC)
| - tcp_sendmsg()
| - tcp_sendmsg_fastopen()
| - __inet_stream_connect()
| - tcp_disconnect() //because of AF_UNSPEC
| - tcp_transmit_skb()// send RST
| - return 0; // no reconnect!
| - sk_stream_wait_connect()
| - sock_error()
| - xchg(&sk->sk_err, 0)
| - return -ECONNRESET
- ... // wake up, see sk->sk_err == 0
- skb_entail() on TCP_CLOSE socket
If the connection is reopened then we will send a brand new SYN packet
after thread A has already queued a buffer. At this point I think the
socket internal state (sequence numbers etc.) becomes messed up.
When the new connection is closed, the FIN-ACK is rejected because the
sequence number is outside the window. The other side tries to
retransmit,
but __tcp_retransmit_skb() calls tcp_trim_head() on an empty skb which
corrupts the skb data length and hits a BUG() in copy_and_csum_bits().
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Hence, this patch adds a check for AF_UNSPEC in the fastopen data path
and return EOPNOTSUPP to user if such case happens.
Fixes: cf60af03ca ("tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default value for somaxconn is set in sysctl_core_net_init(), but this
function is not called when kernel is configured without CONFIG_SYSCTL.
This results in the kernel not being able to accept TCP connections,
because the backlog has zero size. Usually, the user ends up with:
"TCP: request_sock_TCP: Possible SYN flooding on port 7. Dropping request. Check SNMP counters."
If SYN cookies are not enabled the connection is rejected.
Before ef547f2ac1 (tcp: remove max_qlen_log), the effects were less
severe, because the backlog was always at least eight slots long.
Signed-off-by: Roman Kapl <roman.kapl@sysgo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use wait_for_completion_timeout() instead of
wait_for_completion_interruptible_timeout() to avoid stray signals ruining
firmware update. Our timeout is only 300 msec so we are fine simply letting
it expire in case device misbehaves.
Signed-off-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Some old touchpad FWs need to have interrupt cleared before issuing reset
command after updating firmware. We clear interrupt by attempting to read
full report from the controller, and discarding any data read.
Signed-off-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Add null check to avoid a potential null pointer dereference.
Addresses-Coverity-ID: 1408831
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since 9b4437a5b8 ("geneve: Unify LWT and netdev handling.") fill_info
does not return UDP_ZERO_CSUM6_RX when using COLLECT_METADATA. This is
because it uses ip_tunnel_info_af() with the device level info, which is
not valid for COLLECT_METADATA.
Fix by checking for the presence of the actual sockets.
Fixes: 9b4437a5b8 ("geneve: Unify LWT and netdev handling.")
Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently several places in xfs_find_get_desired_pgoff() handle the case
of a missing page. Make them all handled in one place after the loop has
terminated.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
There is an off-by-one error in loop termination conditions in
xfs_find_get_desired_pgoff() since 'end' may index a page beyond end of
desired range if 'endoff' is page aligned. It doesn't have any visible
effects but still it is good to fix it.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
XFS SEEK_HOLE implementation could miss a hole in an unwritten extent as
can be seen by the following command:
xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
-c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (49.312 MiB/sec and 12623.9856 ops/sec)
wrote 8192/8192 bytes at offset 131072
8 KiB, 2 ops; 0.0000 sec (70.383 MiB/sec and 18018.0180 ops/sec)
Whence Result
HOLE 139264
Where we can see that hole at offset 56k was just ignored by SEEK_HOLE
implementation. The bug is in xfs_find_get_desired_pgoff() which does
not properly detect the case when pages are not contiguous.
Fix the problem by properly detecting when found page has larger offset
than expected.
CC: stable@vger.kernel.org
Fixes: d126d43f63
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
xfs_find_get_desired_pgoff() is used to search for offset of hole or
data in page range [index, end] (both inclusive), and the max number
of pages to search should be at least one, if end == index.
Otherwise the only page is missed and no hole or data is found,
which is not correct.
When block size is smaller than page size, this can be demonstrated
by preallocating a file with size smaller than page size and writing
data to the last block. E.g. run this xfs_io command on a 1k block
size XFS on x86_64 host.
# xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \
-c "seek -d 0" /mnt/xfs/testfile
wrote 1024/1024 bytes at offset 2048
1 KiB, 1 ops; 0.0000 sec (33.675 MiB/sec and 34482.7586 ops/sec)
Whence Result
DATA EOF
Data at offset 2k was missed, and lseek(2) returned ENXIO.
This is uncovered by generic/285 subtest 07 and 08 on ppc64 host,
where pagesize is 64k. Because a recent change to generic/285
reduced the preallocated file size to smaller than 64k.
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
This structure copy was throwing unaligned access warnings on sparc64:
Kernel unaligned access at TPC[1043c088] xfs_btree_visit_blocks+0x88/0xe0 [xfs]
xfs_btree_copy_ptrs does a memcpy, which avoids it.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
The function get_free_pipe_id_locked() is called from
goldfish_pipe_open() with a lock is held, so we should
use GFP_ATOMIC instead of GFP_KERNEL.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 093d24a204 ("arm64: PCI: Manage controller-specific data on
per-controller basis") added code to allocate ACPI PCI root_ops
dynamically on a per host bridge basis but failed to update the
corresponding memory allocation failure path in pci_acpi_scan_root()
leading to a potential memory leakage.
Fix it by adding the required kfree call.
Fixes: 093d24a204 ("arm64: PCI: Manage controller-specific data on per-controller basis")
Reviewed-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Timmy Li <lixiaoping3@huawei.com>
[lorenzo.pieralisi@arm.com: refactored code, rewrote commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
kobject_del() only unlinks kobject, we need to use kobject_put() to
make sure kobject will go away completely.
Fixes: 049a59db34 ("firmware: Google VPD sysfs driver")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We should not free info->key before we remove sysfs attribute that uses
this data as its name.
Fixes: 049a59db34 ("firmware: Google VPD sysfs driver")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We should only add section attribute to the list of section attributes
if we successfully created corresponding sysfs attribute.
Fixes: 049a59db34 ("firmware: Google VPD sysfs driver")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Providing "scv" support to userspace requires kernel support, so it
must be advertised as independently to the base ISA 3 instruction set.
The darn instruction relies on firmware enablement, so it has been
decided to split this out from the core ISA 3 feature as well.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit ac29c64089 ("powerpc/mm: Replace _PAGE_USER with
_PAGE_PRIVILEGED") swapped _PAGE_USER for _PAGE_PRIVILEGED, and
introduced check_pte_access() which denied kernel access to
non-_PAGE_PRIVILEGED pages.
However, it didn't add _PAGE_PRIVILEGED to the hash fault handler
for spufs' kernel accesses, so the DMAs required to establish SPE
memory no longer work.
This change adds _PAGE_PRIVILEGED to the hash fault handler for
kernel accesses.
Fixes: ac29c64089 ("powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Reported-by: Sombat Tragolgosol <sombat3960@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently if you disable CONFIG_PPC_RADIX_MMU you'll crash on boot on
a P9. This is because we still set MMU_FTR_TYPE_RADIX via
ibm,pa-features and MMU_FTR_TYPE_RADIX is what's used for code patching
in much of the asm code (ie. slb_miss_realmode)
This patch fixes the problem by stopping MMU_FTR_TYPE_RADIX from being
set from ibm.pa-features.
We may eventually end up removing the CONFIG_PPC_RADIX_MMU option
completely but until then this fixes the issue.
Fixes: 17a3dd2f5f ("powerpc/mm/radix: Use firmware feature to enable Radix MMU")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
opal_npu_destroy_context() should be called with the NPU PHB, not the
PCIe PHB.
Fixes: 1ab66d1fba ("powerpc/powernv: Introduce address translation services for Nvlink2")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
A rare randconfig build error shows up when we have CONFIG_CRYPTO=m
in combination with a built-in CCREE driver:
crypto/hmac.o: In function `hmac_update':
hmac.c:(.text.hmac_update+0x28): undefined reference to `crypto_shash_update'
crypto/hmac.o: In function `hmac_setkey':
hmac.c:(.text.hmac_setkey+0x90): undefined reference to `crypto_shash_digest'
hmac.c:(.text.hmac_setkey+0x154): undefined reference to `crypto_shash_update'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_blkcipher_setkey':
ssi_cipher.c:(.text.ssi_blkcipher_setkey+0x350): undefined reference to `crypto_shash_digest'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_blkcipher_exit':
ssi_cipher.c:(.text.ssi_blkcipher_exit+0xd4): undefined reference to `crypto_destroy_tfm'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_blkcipher_init':
ssi_cipher.c:(.text.ssi_blkcipher_init+0x1b0): undefined reference to `crypto_alloc_shash'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_ablkcipher_free':
ssi_cipher.c:(.text.ssi_ablkcipher_free+0x48): undefined reference to `crypto_unregister_alg'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_ablkcipher_alloc':
ssi_cipher.c:(.text.ssi_ablkcipher_alloc+0x138): undefined reference to `crypto_register_alg'
ssi_cipher.c:(.text.ssi_ablkcipher_alloc+0x274): undefined reference to `crypto_blkcipher_type'
We actually need to depend on both CRYPTO and CRYPTO_HW here to avoid the
problem, since CRYPTO_HW is a bool symbol and by itself that does not
force CCREE to be a loadable module when the core cryto support is modular.
Fixes: 50cfbbb7e6 ("staging: ccree: add ahash support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-By: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver calls ioremap() in the probe function but doesn't call
iounmap() in the remove function correspondingly. Do so now.
Follow commit 5c9d6abed9 ("serial: altera_jtaguart: adding iounmap()")
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit e61c38d85b ("serial: imx: setup DCEDTE early and ensure DCD and
RI irqs to be off") has a flaw: While UCR3 and UFCR were modified using
read-modify-write before it switched to write register values
independent of the previous state. That's a good idea in principle (and
that's why I did it) but needs more care.
This patch reinstates read-modify-write for UFCR and for UCR3 ensures
that RXDMUXSEL and ADNIMP are set for post imx1.
Fixes: e61c38d85b ("serial: imx: setup DCEDTE early and ensure DCD and RI irqs to be off")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Mika Penttilä <mika.penttila@nextfour.com>
Tested-by: Mika Penttilä <mika.penttila@nextfour.com>
Acked-by: Steve Twiss <stwiss.opensource@diasemi.com>
Tested-by: Steve Twiss <stwiss.opensource@diasemi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull SCSI fixes from James Bottomley:
"This is quite a big update because it includes a rework of the lpfc
driver to separate the NVMe part from the FC part.
The reason for doing this is because two separate trees (the nvme and
scsi trees respectively) want to update the individual components and
this separation will prevent a really nasty cross tree entanglement by
the time we reach the next merge window.
The rest of the fixes are the usual minor sort with no significant
security implications"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (25 commits)
scsi: zero per-cmd private driver data for each MQ I/O
scsi: csiostor: fix use after free in csio_hw_use_fwconfig()
scsi: ufs: Clean up some rpm/spm level SysFS nodes upon remove
scsi: lpfc: fix build issue if NVME_FC_TARGET is not defined
scsi: lpfc: Fix NULL pointer dereference during PCI error recovery
scsi: lpfc: update version to 11.2.0.14
scsi: lpfc: Add MDS Diagnostic support.
scsi: lpfc: Fix NVMEI's handling of NVMET's PRLI response attributes
scsi: lpfc: Cleanup entry_repost settings on SLI4 queues
scsi: lpfc: Fix debugfs root inode "lpfc" not getting deleted on driver unload.
scsi: lpfc: Fix NVME I+T not registering NVME as a supported FC4 type
scsi: lpfc: Added recovery logic for running out of NVMET IO context resources
scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context
scsi: lpfc: Separate NVMET data buffer pool fir ELS/CT.
scsi: lpfc: Fix NMI watchdog assertions when running nvmet IOPS tests
scsi: lpfc: Fix NVMEI driver not decrementing counter causing bad rport state.
scsi: lpfc: Fix nvmet RQ resource needs for large block writes.
scsi: lpfc: Adding additional stats counters for nvme.
scsi: lpfc: Fix system crash when port is reset.
scsi: lpfc: Fix used-RPI accounting problem.
...
Fixes following signature in the stack trace:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000374
IP: [<ffffffffa06ec8eb>] qla2x00_sp_free_dma+0xeb/0x2a0 [qla2xxx]
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
when driver is loaded with Multi Queue enabled, it was noticed that
there was one less queue pair created.
Following message would indicate this:
"No resources to create additional q pair."
The result of one less queue pair means that system can crash, if the
block mq layer thinks there is an extra hardware queue available, and
the driver will use a NULL ptr qpair in that instance.
Following stack trace is seen in one of the crash:
irq_create_affinity_masks+0x98/0x530
irq_create_affinity_masks+0x98/0x530
__pci_enable_msix+0x321/0x4e0
mutex_lock+0x12/0x40
pci_alloc_irq_vectors_affinity+0xb5/0x140
qla24xx_enable_msix+0x79/0x530 [qla2xxx]
qla2x00_request_irqs+0x61/0x2d0 [qla2xxx]
qla2x00_probe_one+0xc73/0x2390 [qla2xxx]
ida_simple_get+0x98/0x100
kernfs_next_descendant_post+0x40/0x50
local_pci_probe+0x45/0xa0
pci_device_probe+0xfc/0x140
driver_probe_device+0x2c5/0x470
__driver_attach+0xdd/0xe0
driver_probe_device+0x470/0x470
bus_for_each_dev+0x6c/0xc0
driver_attach+0x1e/0x20
bus_add_driver+0x45/0x270
driver_register+0x60/0xe0
__pci_register_driver+0x4c/0x50
qla2x00_module_init+0x1ce/0x21e [qla2xxx]
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This makes it possible, with appropriate filesystem support, for a
sysadmin to tell what is affected by the mismatch, and whether
it should be ignored (if it's inside a swap partition, for
instance).
We ratelimit to prevent log flooding: if there are so many
mismatches that ratelimiting is necessary, the individual messages
are relatively unlikely to be important (either the machine is
swapping like crazy or something is very wrong with the disk).
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Previously, the uuid debug statements were printed in little-endian
format, which wasn't consistent in machines that might not be in
little-endian byte order. With this change, the output will be
consistent for all machines with different byte-ordering.
Signed-off-by: Kyungchan Koh <kkc6196@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
ext4_xattr_block_set() calls dquot_alloc_block() to charge for an xattr
block when new references are made. However if dquot_initialize() hasn't
been called on an inode, request for charging is effectively ignored
because ext4_inode_info->i_dquot is not initialized yet.
Add dquot_initialize() to call paths that lead to ext4_xattr_block_set().
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Currently we don't allow direct I/O on encrypted regular files, so in
such cases we return 0 early in ext4_direct_IO(). There was also an
additional BUG_ON() check in ext4_direct_IO_write(), but it can never be
hit because of the earlier check for the exact same condition in
ext4_direct_IO(). There was also no matching check on the read path,
which made the write path specific check seem very ad-hoc.
Just remove the unnecessary BUG_ON().
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: David Gstir <david@sigma-star.at>
Reviewed-by: Jan Kara <jack@suse.cz>
Now that we are passing a struct ext4_filename, we do not need to pass
around the original struct qstr too.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
The 'lend' argument of filemap_write_and_wait_range() is inclusive, so
we need to subtract 1 from pos + count.
Note that 'count' is guaranteed to be nonzero since
ext4_file_read_iter() returns early when given a 0 count.
Fixes: 16c5468859 ("ext4: Allow parallel DIO reads")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
ext4_find_unwritten_pgoff() is used to search for offset of hole or
data in page range [index, end] (both inclusive), and the max number
of pages to search should be at least one, if end == index.
Otherwise the only page is missed and no hole or data is found,
which is not correct.
When block size is smaller than page size, this can be demonstrated
by preallocating a file with size smaller than page size and writing
data to the last block. E.g. run this xfs_io command on a 1k block
size ext4 on x86_64 host.
# xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \
-c "seek -d 0" /mnt/ext4/testfile
wrote 1024/1024 bytes at offset 2048
1 KiB, 1 ops; 0.0000 sec (42.459 MiB/sec and 43478.2609 ops/sec)
Whence Result
DATA EOF
Data at offset 2k was missed, and lseek(2) returned ENXIO.
This is unconvered by generic/285 subtest 07 and 08 on ppc64 host,
where pagesize is 64k. Because a recent change to generic/285
reduced the preallocated file size to smaller than 64k.
Signed-off-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
get_reg() can be reentered on architectures with prioritized interrupts
(m68k in this case), causing f->reg_index to be incremented after the
range check. Out of bounds memory access past the pt_regs struct results.
This will go mostly undetected unless access is beyond end of memory.
Prevent the race by disabling interrupts in get_reg().
Tested on m68k (Atari Falcon, and ARAnyM emulator).
Kudos to Geert Uytterhoeven for helping to trace this race.
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We end up reading the interrupt register for HPD5, and then writing it
to HPD6 which on systems without anything using HPD5 results in
permanently disabling hotplug on one of the display outputs after the
first time we acknowledge a hotplug interrupt from the GPU.
This code is really bad. But for now, let's just fix this. I will
hopefully have a large patch series to refactor all of this soon.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lyude <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Daniel Borkmann says:
====================
BPF pruning follow-up
Follow-up to fix incorrect pruning when alignment tracking is
in use and to properly clear regs after call to not leave stale
data behind. For details, please see individual patches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since virtio does not provide it's own ndo_features_check handler,
TSO, and now checksum offload, are disabled for stacked vlans.
Re-enable the support and let the host take care of it. This
restores/improves Guest-to-Guest performance over Q-in-Q vlans.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At least some of the be2net cards do not seem to be capabled
of performing checksum offload computions on Q-in-Q packets.
In these case, the recevied checksum on the remote is invalid
and TCP syn packets are dropped.
This patch adds a call to check disbled acceleration features
on Q-in-Q tagged traffic.
CC: Sathya Perla <sathya.perla@broadcom.com>
CC: Ajit Khaparde <ajit.khaparde@broadcom.com>
CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
CC: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It appears that TCP checksum offloading has been broken for
Q-in-Q vlans. The behavior was execerbated by the
series
commit afb0bc972b ("Merge branch 'stacked_vlan_tso'")
that that enabled accleleration features on stacked vlans.
However, event without that series, it is possible to trigger
this issue. It just requires a lot more specialized configuration.
The root cause is the interaction between how
netdev_intersect_features() works, the features actually set on
the vlan devices and HW having the ability to run checksum with
longer headers.
The issue starts when netdev_interesect_features() replaces
NETIF_F_HW_CSUM with a combination of NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM,
if the HW advertises IP|IPV6 specific checksums. This happens
for tagged and multi-tagged packets. However, HW that enables
IP|IPV6 checksum offloading doesn't gurantee that packets with
arbitrarily long headers can be checksummed.
This patch disables IP|IPV6 checksums on the packet for multi-tagged
packets.
CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
CC: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reinitializing the VM manager during suspend/resume is a very very bad
idea since all the VMs are still active and kicking.
This can lead to random VM faults after resume when new processes
become the same client ID assigned.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The 88m1101 has an errata when configuring autoneg. However, it was
being applied to many other Marvell PHYs as well. Limit its scope to
just the 88m1101.
Fixes: 76884679c6 ("phylib: Add support for Marvell 88e1111S and 88e1145")
Reported-by: Daniel Walker <danielwa@cisco.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Harini Katakam <harinik@xilinx.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build errors by making this driver depend on OF_MDIO, like
several other similar drivers do.
drivers/built-in.o: In function `octeon_mdiobus_remove':
mdio-octeon.c:(.text+0x196ee0): undefined reference to `mdiobus_unregister'
mdio-octeon.c:(.text+0x196ee8): undefined reference to `mdiobus_free'
drivers/built-in.o: In function `octeon_mdiobus_probe':
mdio-octeon.c:(.text+0x196f1d): undefined reference to `devm_mdiobus_alloc_size'
mdio-octeon.c:(.text+0x196ffe): undefined reference to `of_mdiobus_register'
mdio-octeon.c:(.text+0x197010): undefined reference to `mdiobus_free'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
mlx5-fixes-2017-05-23
Some TC offloads fixes from Or Gerlitz.
From Erez, mlx5 IPoIB RX fix to improve GRO.
From Mohamad, Command interface fix to improve mitigation against FW
commands timeouts.
From Tariq, Driver load Tolerance against affinity settings failures.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Just two fixes this time:
* fix the scheduled scan "BUG: scheduling while atomic"
* check mesh address extension flags more strictly
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Some PHY require to wait for a bit after the reset GPIO has been
toggled. This adds support for the DT property `phy-reset-post-delay`
which gives the delay in milliseconds to wait after reset.
If the DT property is not given, no delay is observed. Post reset delay
greater than 1000ms are invalid.
Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long says:
====================
sctp: a bunch of fixes for processing dupcookie
After introducing transport hashtable and per stream info into sctp,
some regressions were caused when processing dupcookie, this patchset
is to fix them.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After sctp changed to use transport hashtable, a transport would be
added into global hashtable when adding the peer to an asoc, then
the asoc can be got by searching the transport in the hashtbale.
The problem is when processing dupcookie in sctp_sf_do_5_2_4_dupcook,
a new asoc would be created. A peer with the same addr and port as
the one in the old asoc might be added into the new asoc, but fail
to be added into the hashtable, as they also belong to the same sk.
It causes that sctp's dupcookie processing can not really work.
Since the new asoc will be freed after copying it's information to
the old asoc, it's more like a temp asoc. So this patch is to fix
it by setting it as a temp asoc to avoid adding it's any transport
into the hashtable and also avoid allocing assoc_id.
An extra thing it has to do is to also alloc stream info for any
temp asoc, as sctp dupcookie process needs it to update old asoc.
But I don't think it would hurt something, as a temp asoc would
always be freed after finishing processing cookie echo packet.
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 3dbcc105d5 ("sctp: alloc stream info when initializing
asoc"), stream and stream.out info are always alloced when creating
an asoc.
So it's not correct to check !asoc->stream before updating stream
info when processing dupcookie, but would be better to check asoc
state instead.
Fixes: 3dbcc105d5 ("sctp: alloc stream info when initializing asoc")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If multiple tasks attempt to read the stats, it may happen that the
start_req_done completion is re-initialized while still being used by
another task, causing a list corruption.
This patch fixes the bug by adding a mutex to serialize the calls to
bnx2fc_get_host_stats().
WARNING: at lib/list_debug.c:48 list_del+0x6e/0xa0() (Not tainted)
Hardware name: PowerEdge R820
list_del corruption. prev->next should be ffff882035627d90, but was ffff884069541588
Pid: 40267, comm: perl Not tainted 2.6.32-642.3.1.el6.x86_64 #1
Call Trace:
[<ffffffff8107c691>] ? warn_slowpath_common+0x91/0xe0
[<ffffffff8107c796>] ? warn_slowpath_fmt+0x46/0x60
[<ffffffff812ad16e>] ? list_del+0x6e/0xa0
[<ffffffff81547eed>] ? wait_for_common+0x14d/0x180
[<ffffffff8106c4a0>] ? default_wake_function+0x0/0x20
[<ffffffff81547fd3>] ? wait_for_completion_timeout+0x13/0x20
[<ffffffffa05410b1>] ? bnx2fc_get_host_stats+0xa1/0x280 [bnx2fc]
[<ffffffffa04cf630>] ? fc_stat_show+0x90/0xc0 [scsi_transport_fc]
[<ffffffffa04cf8b6>] ? show_fcstat_tx_frames+0x16/0x20 [scsi_transport_fc]
[<ffffffff8137c647>] ? dev_attr_show+0x27/0x50
[<ffffffff8113b9be>] ? __get_free_pages+0xe/0x50
[<ffffffff812170e1>] ? sysfs_read_file+0x111/0x200
[<ffffffff8119a305>] ? vfs_read+0xb5/0x1a0
[<ffffffff8119b0b6>] ? fget_light_pos+0x16/0x50
[<ffffffff8119a651>] ? sys_read+0x51/0xb0
[<ffffffff810ee1fe>] ? __audit_syscall_exit+0x25e/0x290
[<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When pci_enable_device() or pci_enable_device_mem() fail in
qla2x00_probe_one() we bail out but do a call to
pci_disable_device(). This causes the dev_WARN_ON() in
pci_disable_device() to trigger, as the device wasn't enabled
previously.
So instead of taking the 'probe_out' error path we can directly return
*iff* one of the pci_enable_device() calls fails.
Additionally rename the 'probe_out' goto label's name to the more
descriptive 'disable_device'.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: e315cd28b9 ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
Cc: <stable@vger.kernel.org>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Element size in the manifest should be updated for each token, so that the
loop can parse all the string elements in the manifest. This was not
happening when more than two string elements appear consecutively, as it is
not updated with correct string element size. Fixed with this patch.
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Signed-off-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
In SKL+ platforms, all IPC commands are serialised, i.e. the driver sends
a new IPC to DSP, only after receiving a reply from the firmware for the
current IPC.
Hence it seems apparent that there is only a single modifier of the IPC RX
List. However, during an IPC timeout case in a multithreaded environment,
there is a possibility of the list element being deleted two times if not
properly protected.
So, use spin lock save/restore to prevent rx_list corruption.
Signed-off-by: Pardha Saradhi K <pardha.saradhi.kesapragada@intel.com>
Signed-off-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
commit 90431eb49b ("ASoC: rsnd: don't use PDTA bit for 24bit on SSI")
fixups 24bit mode data alignment, but PIO was not cared.
This patch fixes PIO mode 24bit data alignment
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
A somewhat overdue update of the address for sending patches on Wolfson
parts to since our acquision a couple of years ago by Cirrus Logic.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
soc_cleanup_card_resources() call snd_card_free() at the last of its
procedure. This turned out to lead to a use-after-free.
PCM runtimes have been already removed via soc_remove_pcm_runtimes(),
while it's dereferenced later in soc_pcm_free() called via
snd_card_free().
The fix is simple: just move the snd_card_free() call to the beginning
of the whole procedure. This also gives another benefit: it
guarantees that all operations have been shut down before actually
releasing the resources, which was racy until now.
Reported-and-tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: <stable@vger.kernel.org>
In most cases, a cgroup controller don't care about the liftimes of
cgroups. For the controller, a css becomes online when ->css_online()
is called on it and offline when ->css_offline() is called.
However, cpuset is special in that the user interface it exposes cares
whether certain cgroups exist or not. Combined with the RCU delay
between cgroup removal and css offlining, this can lead to user
visible behavior oddities where operations which should succeed after
cgroup removals fail for some time period. The effects of cgroup
removals are delayed when seen from userland.
This patch adds css_is_dying() which tests whether offline is pending
and updates is_cpuset_online() so that the function returns false also
while offline is pending. This gets rid of the userland visible
delays.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Link: http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Gabriel will no longer maintain this driver, so I'm adding
myself as maintainer.
Thanks for all your work on jsm driver Gabriel.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the ceph client doesn't respect the rlimit in fallocate. This
means that a user can allocate a file with size > RLIMIT_FSIZE. This
patch adds the call to inode_newsize_ok() to verify filesystem limits and
ulimits. This should make ceph successfully run xfstest generic/228.
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Pull ptrace fix from Eric Biederman:
"This fixes a brown paper bag bug. When I fixed the ptrace interaction
with user namespaces I added a new field ptracer_cred in struct_task
and I failed to properly initialize it on fork.
This dangling pointer wound up breaking runing setuid applications run
from the enlightenment window manager.
As this is the worst sort of bug. A regression breaking user space for
no good reason let's get this fixed"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ptrace: Properly initialize ptracer_cred on fork
Pull MMC fixes from Ulf Hansson:
"A couple of MMC host fixes intended for v4.12 rc3:
- sdhci-xenon: Don't free data for phy allocated by devm*
- sdhci-iproc: Suppress spurious interrupts
- cavium: Fix probing race with regulator
- cavium: Prevent crash with incomplete DT
- cavium-octeon: Use proper GPIO name for power control
- cavium-octeon: Fix interrupt enable code"
* tag 'mmc-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read
mmc: cavium: Fix probing race with regulator
of/platform: Make of_platform_device_destroy globally visible
mmc: cavium: Prevent crash with incomplete DT
mmc: cavium-octeon: Use proper GPIO name for power control
mmc: cavium-octeon: Fix interrupt enable code
mmc: sdhci-xenon: kill xenon_clean_phy()
The cryptographic engine nodes have an interrupt which is configured as
both edge and level, which makes no sense at all. Fix this by
configuring it the right way (level).
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
This reverts commit 368e5fbdfc.
devm_ioremap_resource() enforces that there are no overlapping
resources, where as devm_ioremap() does not. The sata phy driver needs
a subset of the sata IO address space, so maps some of the sata
address space. As a result, sata_mv now fails to probe, reporting it
cannot get its resources, and so we don't have any SATA disks.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Tejun Heo <tj@kernel.org>
In the current form of the code, if a->replacementlen is 0, the reference
to *insnbuf for comparison touches potentially garbage memory. While it
doesn't affect the execution flow due to the subsequent a->replacementlen
comparison, it is (rightly) detected as use of uninitialized memory by a
runtime instrumentation currently under my development, and could be
detected as such by other tools in the future, too (e.g. KMSAN).
Fix the "false-positive" by reordering the conditions to first check the
replacement instruction length before referencing specific opcode bytes.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/20170524135500.27223-1-mjurczyk@google.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It's possible and acceptable for NFS to attempt to add requests beyond the
range of the current pgio->pg_lseg, a case which should be caught and
limited by the pg_test operation. However, the current handling of this
case replaces pgio->pg_lseg with a new layout segment (after a WARN) within
that pg_test operation. That will cause all the previously added requests
to be submitted with this new layout segment, which may not be valid for
those requests.
Fix this problem by only returning zero for the number of bytes to coalesce
from pg_test for this case which allows any previously added requests to
complete on the current layout segment. The check for requests starting
out of range of the layout segment moves to pg_init, so that the
replacement of pgio->pg_lseg will be done when the next request is added.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If xdr_inline_decode() fails then we end up returning ERR_PTR(0). The
caller treats NULL returns as -ENOMEM so it doesn't really hurt runtime,
but obviously we intended to set an error code here.
Fixes: d67ae825a5 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
Fixes: b685d3d65a
CC: reiserfs-devel@vger.kernel.org
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
Fixes: b685d3d65a
CC: Steven Whitehouse <swhiteho@redhat.com>
CC: cluster-devel@redhat.com
CC: stable@vger.kernel.org
Acked-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
If nf_conntrack_htable_size was adjusted by the user during the ct
dump operation, we may invoke nf_ct_put twice for the same ct, i.e.
the "last" ct. This will cause the ct will be freed but still linked
in hash buckets.
It's very easy to reproduce the problem by the following commands:
# while : ; do
echo $RANDOM > /proc/sys/net/netfilter/nf_conntrack_buckets
done
# while : ; do
conntrack -L
done
# iperf -s 127.0.0.1 &
# iperf -c 127.0.0.1 -P 60 -t 36000
After a while, the system will hang like this:
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [bash:20184]
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [iperf:20382]
...
So at last if we find cb->args[1] is equal to "last", this means hash
resize happened, then we can set cb->args[1] to 0 to fix the above
issue.
Fixes: d205dc4079 ("[NETFILTER]: ctnetlink: fix deadlock in table dumping")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The hi6220_reset driver can be built as a standalone module
yet it cannot be loaded because it depends on GPL exported symbols.
Lets set the module license so that the module loads, and things like
the on-board kirin drm starts working.
Signed-off-by: Jeremy Linton <lintonrjeremy@gmail.com>
Reviewed-by: Xinliang Liu <xinliang.liu@linaro.org>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.
However, on AMD K6-3 CPUs, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.
The result of this bug is that a kernel warning is produced when attempting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers:
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
...
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
To fix the bug change pat_enabled() so that it returns true only if PAT
initialization was actually done.
Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT
feature on the processor that is being initialized.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1704181501450.26399@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We have been a little loose with our intermediate VMCR representation
where we had a 'ctlr' field, but we failed to differentiate between the
GICv2 GICC_CTLR and ICC_CTLR_EL1 layouts, and therefore ended up mapping
the wrong bits into the individual fields of the ICH_VMCR_EL2 when
emulating a GICv2 on a GICv3 system.
Fix this by using explicit fields for the VMCR bits instead.
Cc: Eric Auger <eric.auger@redhat.com>
Reported-by: wanghaibin <wanghaibin.wang@huawei.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Dave Jones and Steven Rostedt reported unwinder warnings like the
following:
WARNING: kernel stack frame pointer at ffff8800bda0ff30 in sshd:1090 has bad value 000055b32abf1fa8
In both cases, the unwinder was attempting to unwind from an ftrace
handler into entry code. The callchain was something like:
syscall entry code
C function
ftrace handler
save_stack_trace()
The problem is that the unwinder's end-of-stack logic gets confused by
the way ftrace lays out the stack frame (with fentry enabled).
I was able to recreate this warning with:
echo call_usermodehelper_exec_async:stacktrace > /sys/kernel/debug/tracing/set_ftrace_filter
(exit login session)
I considered fixing this by changing the ftrace code to rewrite the
stack to make the unwinder happy. But that seemed too intrusive after I
implemented it. Instead, just add another check to the unwinder's
end-of-stack logic to detect this special case.
Side note: We could probably get rid of these end-of-stack checks by
encoding the frame pointer for syscall entry just like we do for
interrupt entry. That would be simpler, but it would also be a lot more
intrusive since it would slightly affect the performance of every
syscall.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: live-patching@vger.kernel.org
Fixes: c32c47c68a ("x86/unwind: Warn on bad frame pointer")
Link: http://lkml.kernel.org/r/671ba22fbc0156b8f7e0cfa5ab2a795e08bc37e1.1495553739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Petr Mladek reported the following warning when loading the livepatch
sample module:
WARNING: CPU: 1 PID: 3699 at arch/x86/kernel/stacktrace.c:132 save_stack_trace_tsk_reliable+0x133/0x1a0
...
Call Trace:
__schedule+0x273/0x820
schedule+0x36/0x80
kthreadd+0x305/0x310
? kthread_create_on_cpu+0x80/0x80
? icmp_echo.part.32+0x50/0x50
ret_from_fork+0x2c/0x40
That warning means the end of the stack is no longer recognized as such
for newly forked tasks. The problem was introduced with the following
commit:
ff3f7e2475 ("x86/entry: Fix the end of the stack for newly forked tasks")
... which was completely misguided. It only partially fixed the
reported issue, and it introduced another bug in the process. None of
the other entry code saves the frame pointer before calling into C code,
so it doesn't make sense for ret_from_fork to do so either.
Contrary to what I originally thought, the original issue wasn't related
to newly forked tasks. It was actually related to ftrace. When entry
code calls into a function which then calls into an ftrace handler, the
stack frame looks different than normal.
The original issue will be fixed in the unwinder, in a subsequent patch.
Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: live-patching@vger.kernel.org
Fixes: ff3f7e2475 ("x86/entry: Fix the end of the stack for newly forked tasks")
Link: http://lkml.kernel.org/r/f350760f7e82f0750c8d1dd093456eb212751caa.1495553739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Sync (copy) the following v4.12 kernel headers to the tooling headers:
arch/x86/include/asm/disabled-features.h:
arch/x86/include/uapi/asm/kvm.h:
arch/powerpc/include/uapi/asm/kvm.h:
arch/s390/include/uapi/asm/kvm.h:
arch/arm/include/uapi/asm/kvm.h:
arch/arm64/include/uapi/asm/kvm.h:
- 'struct kvm_sync_regs' got changed in an ABI-incompatible way,
fortunately none of the (in-kernel) tooling relied on it
- new KVM_DEV calls added
arch/x86/include/asm/required-features.h:
- 5-level paging hardware ABI detail added
arch/x86/include/asm/cpufeatures.h:
- new CPU feature added
arch/x86/include/uapi/asm/vmx.h:
- new VMX exit conditions
None of the changes requires fixes in the tooling source code.
This addresses the following warnings:
Warning: include/uapi/linux/stat.h differs from kernel
Warning: arch/x86/include/asm/disabled-features.h differs from kernel
Warning: arch/x86/include/asm/required-features.h differs from kernel
Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524065721.j2mlch6bgk5klgbc@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The __hpp__sort_acc() sorts entries using callchain depth in order to
put callers above in children mode. But it assumed the callchain order
was callee-first. Now default (for children) is caller-first so the
order of entries is reverted.
For example, consider following case:
$ perf report --no-children
..l
# Overhead Command Shared Object Symbol
# ........ ....... ................... ..........................
#
99.44% a.out a.out [.] main
|
---main
__libc_start_main
_start
Then children mode should show 'start' above '__libc_start_main' since
it's the caller (parent) of the __libc_start_main. But it's reversed:
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............... .....................
#
99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main
99.61% 0.00% a.out a.out [.] _start
99.54% 99.44% a.out a.out [.] main
This patch fixes it.
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............... .....................
#
99.61% 0.00% a.out a.out [.] _start
99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main
99.54% 99.44% a.out a.out [.] main
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-8-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:
~~~~~~
$ perf script --inline
...
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
$ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
0x0000000000000a4a
std::__complex_abs(doublecomplex )
/usr/include/c++/6.3.1/complex:589
double std::abs<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:597
double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:654
double std::norm<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:664
main
/tmp/inlining.cpp:14
~~~~~
Note how `std::__complex_abs` is missing from the `perf script`
output. This is similarly showing up in `perf report`. The patch
here fixes this issue, and the output becomes:
~~~~~
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::__complex_abs
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As the documentation for dwfl_frame_pc says, frames that
are no activation frames need to have their program counter
decremented by one to properly find the function of the caller.
This fixes many cases where perf report currently attributes
the cost to the next line. I.e. I have code like this:
~~~~~~~~~~~~~~~
#include <thread>
#include <chrono>
using namespace std;
int main()
{
this_thread::sleep_for(chrono::milliseconds(1000));
this_thread::sleep_for(chrono::milliseconds(100));
this_thread::sleep_for(chrono::milliseconds(10));
return 0;
}
~~~~~~~~~~~~~~~
Now compile and record it:
~~~~~~~~~~~~~~~
g++ -std=c++11 -g -O2 test.cpp
echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
perf record \
--event sched:sched_stat_sleep \
--event sched:sched_process_exit \
--event sched:sched_switch --call-graph=dwarf \
--output perf.data.raw \
./a.out
echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~
Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:9
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:10
| __libc_start_main
| _start
|
--0.91%--main test.cpp:13
__libc_start_main
_start
~~~~~~~~~~~~~~~
With this patch here applied, the issue is fixed. The report becomes
much more usable:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:8
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:9
| __libc_start_main
| _start
|
--0.91%--main test.cpp:10
__libc_start_main
_start
~~~~~~~~~~~~~~~
Similarly it works for signal frames:
~~~~~~~~~~~~~~~
__noinline void bar(void)
{
volatile long cnt = 0;
for (cnt = 0; cnt < 100000000; cnt++);
}
__noinline void foo(void)
{
bar();
}
void sig_handler(int sig)
{
foo();
}
int main(void)
{
signal(SIGUSR1, sig_handler);
raise(SIGUSR1);
foo();
return 0;
}
~~~~~~~~~~~~~~~~
Before, the report wrongly points to `signal.c:29` after raise():
~~~~~~~~~~~~~~~~
$ perf report --stdio --no-children -g srcline -s srcline
...
100.00% signal.c:11
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:29
__libc_start_main
_start
~~~~~~~~~~~~~~~~
With this patch in, the issue is fixed and we instead get:
~~~~~~~~~~~~~~~~
100.00% signal signal [.] bar
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:27
__libc_start_main
_start
~~~~~~~~~~~~~~~~
Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via
unw_is_signal_frame() for any but the very first frame.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a filename was found in addr2line it was duplicated via strdup()
but never freed. Now we pass NULL and handle this gracefully in
addr2line.
Detected by Valgrind:
==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
==16331== at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16331== by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
==16331== by 0x52769F: addr2line (srcline.c:256)
==16331== by 0x52769F: addr2inlines (srcline.c:294)
==16331== by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
==16331== by 0x574D7A: inline__fprintf (hist.c:41)
==16331== by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
==16331== by 0x57518A: __callchain__fprintf_graph (hist.c:212)
==16331== by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
==16331== by 0x57738E: hist_entry__fprintf (hist.c:628)
==16331== by 0x57738E: hists__fprintf (hist.c:882)
==16331== by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
==16331== by 0x44A20F: report__browse_hists (builtin-report.c:491)
==16331== by 0x44A20F: __cmd_report (builtin-report.c:624)
==16331== by 0x44A20F: cmd_report (builtin-report.c:1054)
==16331== by 0x4A49CE: run_builtin (perf.c:296)
==16331== by 0x4A4CC0: handle_internal_command (perf.c:348)
==16331== by 0x434371: run_argv (perf.c:392)
==16331== by 0x434371: main (perf.c:530)
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:
==8359== Invalid read of size 8
==8359== at 0x3096D9: map__rip_2objdump (map.c:430)
==8359== by 0x2FC1A3: match_chain_srcline (callchain.c:645)
==8359== by 0x2FC1A3: match_chain (callchain.c:700)
==8359== by 0x2FC1A3: append_chain (callchain.c:895)
==8359== by 0x2FC1A3: append_chain_children (callchain.c:846)
==8359== by 0x2FF719: callchain_append (callchain.c:944)
==8359== by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
==8359== by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
==8359== by 0x33195C: hist_entry_iter__add (hist.c:1050)
==8359== by 0x258F65: process_sample_event (builtin-report.c:204)
==8359== by 0x30D60C: perf_session__deliver_event (session.c:1310)
==8359== by 0x30D60C: ordered_events__deliver_event (session.c:119)
==8359== by 0x310D12: __ordered_events__flush (ordered-events.c:210)
==8359== by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
==8359== by 0x30DD3C: perf_session__process_user_event (session.c:1349)
==8359== by 0x30DD3C: perf_session__process_event (session.c:1475)
==8359== by 0x30FC3C: __perf_session__process_events (session.c:1867)
==8359== by 0x30FC3C: perf_session__process_events (session.c:1921)
==8359== by 0x25A985: __cmd_report (builtin-report.c:575)
==8359== by 0x25A985: cmd_report (builtin-report.c:1054)
==8359== by 0x2B9A80: run_builtin (perf.c:296)
==8359== Address 0x70 is not stack'd, malloc'd or (recently) free'd
This patch fixes the issue.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[ Remove dependency from another change ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The current buffer is being reset to zero on device_free_chan_resources()
but not on device_terminate_all(). It could happen that HW is restarted and
expects BASE0 to be used, but the driver is not synchronized and will start
from BASE1. One solution is to reset the buffer explicitly in
m2p_hw_setup().
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Tweak the Kconfig description to mention support for NSP and make the
default on for iProc based platforms.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "devm_kcalloc".
This issue was detected by using the Coccinelle software.
Acked-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Making thermal_emergency_poweroff static fixes sparse warning:
drivers/thermal/thermal_core.c:6: warning: symbol
'thermal_emergency_poweroff' was not declared. Should it be static?
Fixes: ef1d87e06a ("thermal: core: Add a back up thermal shutdown mechanism")
Acked-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Building this driver with W=1 reports:
warning: variable 'trip' set but not used [-Wunused-but-set-variable]
The call for of_thermal_get_trip_points() is useless.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
We currently do
tcmu_free_device ->tcmu_netlink_event(TCMU_CMD_REMOVED_DEVICE) ->
uio_unregister_device -> kfree(tcmu_dev).
The problem is that the kernel does not wait for userspace to
do the close() on the uio device before freeing the tcmu_dev.
We can then hit a race where the kernel frees the tcmu_dev before
userspace does close() and so when close() -> release -> tcmu_release
is done, we try to access a freed tcmu_dev.
This patch made over the target-pending master branch moves the freeing
of the tcmu_dev to when the last reference has been dropped.
This also fixes a leak where if tcmu_configure_device was not called on a
device we did not free udev->name which was allocated at tcmu_alloc_device time.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
skb->data is assigned to task->hdr in cxgbi_conn_alloc_pdu(),
skb gets freed after tx but task->hdr is still dereferenced in
iscsi_tcp_task_xmit() to avoid this call skb_get() after allocating skb
and free the skb in cxgbi_cleanup_task() or before allocating new skb in
cxgbi_conn_alloc_pdu().
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
max_fin_rt is the maximum re-transmission of FIN packets
as part of the termination flow. After reaching this value
the FW will send a single RESET.
Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com>
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
munmap done by iscsiuio during a stop of the service triggers a "bad
pte" warning sometimes. munmap kernel path goes through the mmapped
pages and has a validation check for mapcount (in struct page) to be
zero or above. kzalloc, which we had used to allocate udev->ctrl, uses
slab allocations, which re-uses mapcount (union) for other purposes that
can make the mapcount look negative. Avoid all these trouble by invoking
one of the __get_free_pages wrappers to be used instead of kzalloc for
udev->ctrl.
BUG: Bad page map in process iscsiuio pte:80000000aa624067 pmd:3e6777067
page:ffffea0002a98900 count:2 mapcount:-2143289280
mapping: (null) index:0xffff8800aa624e00
page flags: 0x10075d00000090(dirty|slab)
page dumped because: bad pte
addr:00007fcba70a3000 vm_flags:0c0400fb anon_vma: (null)
mapping:ffff8803edf66e90 index:0
Call Trace:
dump_stack+0x19/0x1b
print_bad_pte+0x1af/0x250
unmap_page_range+0x7a7/0x8a0
unmap_single_vma+0x81/0xf0
unmap_vmas+0x49/0x90
unmap_region+0xbe/0x140
? vma_rb_erase+0x121/0x220
do_munmap+0x245/0x420
vm_munmap+0x41/0x60
SyS_munmap+0x22/0x30
tracesys+0xdd/0xe2
Signed-off-by: Arun Easi <arun.easi@cavium.com>
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A recent commit added extra printks for CPU/RT limits. This can result in
excessive spam in dmesg.
Make the printks conditional on print_fatal_signals.
Fixes: e7ea7c9806 ("rlimits: Print more information when CPU/RT limits are exceeded")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun Raghavan <arun@arunraghavan.net>
We need to clear the IPS_SRC_NAT_DONE_BIT to indicate that the ct has
been removed from nat_bysource table. But unfortunately, we use the
non-atomic bit operation: "ct->status &= ~IPS_NAT_DONE_MASK". So
there's a race condition that we may clear the _DYING_BIT set by
another CPU unexpectedly.
Since we don't care about the IPS_DST_NAT_DONE_BIT, so just using
clear_bit to clear the IPS_SRC_NAT_DONE_BIT is enough.
Also note, this is the last user which use the non-atomic bit operation
to update the confirmed ct->status.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The existing code selects no next branch to be inspected when
re-inserting an inactive element into the rb-tree, looping endlessly.
This patch restricts the check for active elements to the EEXIST case
only.
Fixes: e701001e7c ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates")
Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
sctp_compute_cksum() implementation assumes that at least the SCTP header
is in the linear part of skb: modify conntrack error callback to avoid
false CRC32c mismatch, if the transport header is partially/entirely paged.
Fixes: cf6e007eef ("netfilter: conntrack: validate SCTP crc32c in PREROUTING")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Some drivers - like i915 - may not support the system suspend direct
complete optimization due to differences in their runtime and system
suspend sequence. Add a flag that when set resumes the device before
calling the driver's system suspend handlers which effectively disables
the optimization.
Needed by a future patch fixing suspend/resume on i915.
Suggested by Rafael.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
If there is not enough space then ceph_decode_32_safe() does a goto bad.
We need to return an error code in that situation. The current code
returns ERR_PTR(0) which is NULL. The callers are not expecting that
and it results in a NULL dereference.
Fixes: f24e9980eb ("ceph: OSD client")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
None of these are validated in userspace, but since we do validate
reply_struct_v in ceph_x_proc_ticket_reply(), tkt_struct_v (first) and
CephXServiceTicket struct_v (second) in process_one_ticket(), validate
CephXTicketBlob struct_v as well.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
It's set but not used: CEPH_FEATURE_MONNAMES feature bit isn't
advertised, which guarantees a v1 MonMap.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
if we receive a compound such that:
- the sessionid, slot, and sequence number in the SEQUENCE op
match a cached succesful reply with N ops, and
- the Nth operation of the compound is a PUTFH, PUTPUBFH,
PUTROOTFH, or RESTOREFH,
then nfsd4_sequence will return 0 and set cstate->status to
nfserr_replay_cache. The current filehandle will not be set. This will
cause us to call check_nfsd_access with first argument NULL.
To nfsd4_compound it looks like we just succesfully executed an
operation that set a filehandle, but the current filehandle is not set.
Fix this by moving the nfserr_replay_cache earlier. There was never any
reason to have it after the encode_op label, since the only case where
he hit that is when opdesc->op_func sets it.
Note that there are two ways we could hit this case:
- a client is resending a previously sent compound that ended
with one of the four PUTFH-like operations, or
- a client is sending a *new* compound that (incorrectly) shares
sessionid, slot, and sequence number with a previously sent
compound, and the length of the previously sent compound
happens to match the position of a PUTFH-like operation in the
new compound.
The second is obviously incorrect client behavior. The first is also
very strange--the only purpose of a PUTFH-like operation is to set the
current filehandle to be used by the following operation, so there's no
point in having it as the last in a compound.
So it's likely this requires a buggy or malicious client to reproduce.
Reported-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@kernel.vger.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull i2c fixes from Wolfram Sang:
"Fix the i2c-designware regression of rc2.
Also, a DMA buffer fix for the tiny-usb driver where the USB core now
loudly complains about the non DMA-capable buffer"
[ I had cherry-picked the designware fix separately because it hit my
laptop, but here is the proper sync with the i2c tree - Linus ]
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: designware: Fix bogus sda_hold_time due to uninitialized vars
i2c: i2c-tiny-usb: fix buffer not being DMA capable
Commit 9da5ac236d ("ARM: soft-reboot into same mode that we entered
the kernel") added support to enter the new kernel in the same processor
mode as the previous one when we soft-reboot from one kernel into
another by pass a flag to cpu_reset() so it knows what to do exactly.
However it missed to make similar changes in MCPM code. Due to the
missing flag, the CPUs enter HYP mode which is not supported with MCPM.
MCPM works only in secure mode as it manages CCI.
This patch aligns the cpu_reset call in MCPM with other changes in the
above mentioned commit by making phys_reset_t to follow the prototype
of cpu_reset().
Fixes: 9da5ac236d ("ARM: soft-reboot into same mode that we entered the kernel")
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The SMD channel is not the primary WCNSS channel and must explicitly be
closed as the device is removed, or the channel will already by open on
a subsequent probe call in e.g. the case of reloading the kernel module.
This issue was introduced because I simplified the underlying SMD
implementation while the SMD adaptions of the driver sat on the mailing
list, but missed to update these patches. The patch does however only
apply back to the transition to rpmsg, hence the limited Fixes.
Fixes: 5052de8def ("soc: qcom: smd: Transition client drivers from smd to rpmsg")
Reported-by: Eyal Ilsar <c_eilsar@qti.qualcomm.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The code in block/partitions/msdos.c recognizes FreeBSD, OpenBSD
and NetBSD partitions and does a reasonable job picking out OpenBSD
and NetBSD UFS subpartitions.
But for FreeBSD the subpartitions are always "bad".
Kernel: <bsd:bad subpartition - ignored
Though all 3 of these BSD systems use UFS as a file system, only
FreeBSD uses relative start addresses in the subpartition
declarations.
The following patch fixes this for FreeBSD partitions and leaves
the code for OpenBSD and NetBSD intact:
Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Masks for extracting part of the Completion Queue Entry (CQE)
field rss_hash_type was swapped, namely CQE_RSS_HTYPE_IP and
CQE_RSS_HTYPE_L4.
The bug resulted in setting skb->l4_hash, even-though the
rss_hash_type indicated that hash was NOT computed over the
L4 (UDP or TCP) part of the packet.
Added comments from the datasheet, to make it more clear what
these masks are selecting.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices need their multicast filter reset but others are crashed by that.
So the methods need to be separated.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reported-by: "Ridgway, Keith" <kridgway@harris.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2017-05-23
1) Fix wrong header offset for esp4 udpencap packets.
2) Fix a stack access out of bounds when creating a bundle
with sub policies. From Sabrina Dubroca.
3) Fix slab-out-of-bounds in pfkey due to an incorrect
sadb_x_sec_len calculation.
4) We checked the wrong feature flags when taking down
an interface with IPsec offload enabled.
Fix from Ilan Tayari.
5) Copy the anti replay sequence numbers when doing a state
migration, otherwise we get out of sync with the sequence
numbers. Fix from Antony Antony.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't set an error code on this path. It means that we return NULL
instead of an error pointer and the caller does a NULL dereference.
Fixes: 6d1d8050b4 ("block, partition: add partition_meta_info to hd_struct")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add tolerance to failures of irq_set_affinity_hint().
Its role is to give hints that optimizes performance,
and should not block the driver load.
In non-SMP systems, functionality is not available as
there is a single core, and all these calls definitely
fail. Hence, do not call the function and avoid the
warning prints.
Fixes: db058a186f ("net/mlx5_core: Set irq affinity hints")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently when firmware command gets stuck or it takes long time to
complete, the driver command will get timeout and the command slot is
freed and can be used for new commands, and if the firmware receive new
command on the old busy slot its behavior is unexpected and this could
be harmful.
To fix this when the driver command gets timeout we return failure,
but we don't free the command slot and we wait for the firmware to
explicitly respond to that command.
Once all the entries are busy we will stop processing new firmware
commands.
Fixes: 9cba4ebcf3 ('net/mlx5: Fix potential deadlock in command mode change')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
IPoIB packet contains the pseudo header area, we need to pull it prior
to reset_mac_header in order to let the GRO work well.
In more details:
GRO checks the mac address of the new coming packet, it does that by
comparing the hard_header_len size of the current packet to the previous
one in that session, the comparison is over hard_header_len size.
Now, the driver prepares that area in the skb by allocating area from the
reserved part and resetting the correct mac header to it.
Fixes: 9d6bd752c6 ("net/mlx5e: IPoIB, RX handler")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The sparse tool emits these correct complaints:
drivers/net/ethernet/mellanox/mlx5/core//en_tc.c:1005:25: warning: cast to restricted __be32
drivers/net/ethernet/mellanox/mlx5/core//en_tc.c:1007:25: warning: cast to restricted __be16
The value is provided from user-space in network order, but there's
no way for them to realize that, avoid the warnings by casting to the
appropriate type.
Fixes: d79b6df6b1 ('net/mlx5e: Add parsing of TC pedit actions to HW format')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently we don't support partial header re-writes through TC pedit
action offloading. However, the code that enforces that wasn't err-ing
on cases where the first and last bits of the mask are set but there is
some zero bit between them, such as in the below example, fix that!
tc filter add dev enp1s0 protocol ip parent ffff: prio 10 flower
ip_proto udp dst_port 2001 skip_sw
action pedit munge ip src set 1.0.0.1 retain 0xff0000ff
Fixes: d79b6df6b1 ('net/mlx5e: Add parsing of TC pedit actions to HW format')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When offloading header re-writes, the HW re-calculates the relevant L3/L4
checksums. Hence, when upper layers (as done by OVS) ask for TC checksum action
offload together with pedit offload, don't err. This command now works:
tc filter add dev ens1f0 protocol ip parent ffff: prio 20 flower skip_sw
ip_proto tcp dst_port 9001
action pedit ex munge tcp dport set 0x1234 pipe
action csum tcp
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add the accessors for realizing if this is a csum action,
and for which fields checksum is needed.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We wrongly direcly invoke hlist_del_rcu() and not hash_del_rcu() which
does a slightly different call now and may change later, fix that.
Fixes: a54e20b4fc ('net/mlx5e: Add basic TC tunnel set action for SRIOV offloads')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When I introduced ptracer_cred I failed to consider the weirdness of
fork where the task_struct copies the old value by default. This
winds up leaving ptracer_cred set even when a process forks and
the child process does not wind up being ptraced.
Because ptracer_cred is not set on non-ptraced processes whose
parents were ptraced this has broken the ability of the enlightenment
window manager to start setuid children.
Fix this by properly initializing ptracer_cred in ptrace_init_task
This must be done with a little bit of care to preserve the current value
of ptracer_cred when ptrace carries through fork. Re-reading the
ptracer_cred from the ptracing process at this point is inconsistent
with how PT_PTRACE_CAP has been maintained all of these years.
Tested-by: Takashi Iwai <tiwai@suse.de>
Fixes: 64b875f7ac ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The description of the connection between the dwmmc (SDIO) controller and
the Wifi chip, which is attached to the SDIO bus is wrong. Currently the
SDIO card can't be detected and thus the Wifi doesn't work.
Let's fix this by assigning the correct vmmc supply, which is the always on
regulator VDD_3V3 and remove the WLAN enable regulator altogether. Then to
properly deal with the power on/off sequence, add a mmc-pwrseq node to
describe the resources needed to detect the SDIO card.
Except for the WLAN enable GPIO and its corresponding assert/de-assert
delays, the mmc-pwrseq node also contains a handle to a clock provided by
the hi655x pmic. This clock is also needed to be able to turn on the WiFi
chip.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Move the board specific descriptions for the dwmmc nodes in the hi6220 SoC
dtsi, into the hikey dts as it's there these belongs.
While changing this, let's take the opportunity to drop the use of the
"ti,non-removable" binding for one of the dwmmc device nodes, as it's not a
valid binding and not used. Drop also the unnecessary use of "num-slots =
<0x1>" for all of the dwmmc nodes, as there is no need to set this since
when default number of slots is one.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Add these regulators to better describe the HW, but also because those is
needed in following changes.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The regulator is a part of the hikey board, therefore let's move it from
the hi6220 SoC dtsi file into the hikey dts file . Let's also rename the
regulator according to the datasheet (5V_HUB) to better reflect the HW.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The hi655x PMIC provides the regulators but also a clock. The latter is
missing so let's add it. This clock is used by WiFi/Bluetooth chip, but
that connection is done in a separate change on top of this one.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
[Ulf: Split patch and updated changelog]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The hi655x PMIC provides the regulators but also a clock. The latter is
missing in the definition, so extend the documentation to include this as
well.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
[Ulf: Split patch and updated changelog]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
If the optional power-off-delay-us property is found, insert the
corresponding delay after asserting the GPIO during power off. This enables
a graceful shutdown sequence for some devices.
Cc: linux-mmc@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
During power off, after the GPIO pin has been asserted, some devices like
the Wifi chip from TI, Wl18xx, needs a delay before the host continues with
clock gating and turning off regulators as to follow a graceful shutdown
sequence.
Therefore invent an optional power-off-delay-us DT binding for
mmc-pwrseq-simple, to allow us to support this constraint.
Cc: devicetree@vger.kernel.org
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mmc@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
We use well known standard names for functions that have name, such as
I2C, SPI, SPDIF, etc..
Fix the function name of SPDIF, which was named OWA (One Wire Audio)
based on Allwinner datasheets.
Fixes: 4730f33f0d ("pinctrl: sunxi: add allwinner A83T PIO controller
support")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
To set the mux mode of a pin two bits must be set. Up to now this is
implemented using the following idiom:
writel(mask, reg + CLR);
writel(value, reg + SET);
. This however results in the mux mode being 0 between the two writes.
On my machine there is an IC's reset pin connected to LCD_D20. The
bootloader configures this pin as GPIO output-high (i.e. not holding the
IC in reset). When Linux reconfigures the pin to GPIO the short time
LCD_D20 is muxed as LCD_D20 instead of GPIO_1_20 is enough to confuse
the connected IC.
The same problem is present for the pin's drive strength setting which is
reset to low drive strength before using the right value.
So instead of relying on the hardware to modify the register setting
using two writes implement the bit toggling using read-modify-write.
Fixes: 17723111e6 ("pinctrl: add pinctrl-mxs support")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
I have not been able to dedicate time to the GPIO subsystem since quite
some time, and I don't see the situation improving in the near future.
Update the maintainers list to reflect this unfortunate fact.
Signed-off-by: Alexandre Courbot <gnurou@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
It turns out there are quite many Chromebooks out there that have the
same keyboard issue than Acer Chromebook. All of them are based on
Intel_Strago reference and report their DMI_PRODUCT_FAMILY as
"Intel_Strago" (Samsung Chromebook 3 and Cyan Chromebooks are exceptions
for which we add separate entries).
Instead of adding each machine to the quirk table, we use
DMI_PRODUCT_FAMILY of "Intel_Strago" that hopefully covers most of the
machines out there currently.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=194945
Suggested: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Sometimes it is more convenient to be able to match a whole family of
products, like in case of bunch of Chromebooks based on Intel_Strago to
apply a driver quirk instead of quirking each machine one-by-one.
This adds support for DMI_PRODUCT_FAMILY identification string and also
exports it to the userspace through sysfs attribute just like the
existing ones.
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The Crystal Cove PMIC has 16 real GPIOs but the ACPI code for devices
with this PMIC may address up to 95 GPIOs, these extra GPIOs are
called virtual GPIOs and are used by the ACPI code as a method of
accessing various non GPIO bits of PMIC.
Commit dcdc3018d6 ("gpio: crystalcove: support virtual GPIO") added
dummy support for these to avoid a bunch of ACPI errors, but instead of
ignoring writes / reads to them by doing:
if (gpio >= CRYSTALCOVE_GPIO_NUM)
return 0;
It accidentally introduced the following wrong check:
if (gpio > CRYSTALCOVE_VGPIO_NUM)
return 0;
Which means that attempts by the ACPI code to access these gpios
causes some arbitrary gpio to get touched through for example
GPIO1P0CTLO + gpionr % 8.
Since we do support input/output (but not interrupts) on the 0x5e
virtual GPIO, this commit makes to_reg return -ENOTSUPP for unsupported
virtual GPIOs so as to not have to check for (gpio >= CRYSTALCOVE_GPIO_NUM
&& gpio != 0x5e) everywhere and to make it easier to add support for more
virtual GPIOs in the future.
It then adds a check for to_reg returning an error to all callers where
this may happen fixing the ACPI code accessing virtual GPIOs accidentally
causing changes to real GPIOs.
Fixes: dcdc3018d6 ("gpio: crystalcove: support virtual GPIO")
Cc: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This model is actually called 92XXM2-8 in Windows driver. But since pin
configs for M22 and M28 are identical, just reuse M22 quirk.
Fixes external microphone (tested) and probably docking station ports
(not tested).
Signed-off-by: Alexander Tsoy <alexander@tsoy.me>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
With enabling this workaround, can observe GPU hang issue on Gen9. As
currently host side doesn't have this workaround, disable it from GVT
side.
v2:
- Fix indent error.(Zhenyu)
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
crypto_gcm_setkey() was using wait_for_completion_interruptible() to
wait for completion of async crypto op but if a signal occurs it
may return before DMA ops of HW crypto provider finish, thus
corrupting the data buffer that is kfree'ed in this case.
Resolve this by using wait_for_completion() instead.
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
CC: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
drbg_kcapi_sym_ctr() was using wait_for_completion_interruptible() to
wait for completion of async crypto op but if a signal occurs it
may return before DMA ops of HW crypto provider finish, thus
corrupting the output buffer.
Resolve this by using wait_for_completion() instead.
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
CC: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
public_key_verify_signature() was passing the CRYPTO_TFM_REQ_MAY_BACKLOG
flag to akcipher_request_set_callback() but was not handling correctly
the case where a -EBUSY error could be returned from the call to
crypto_akcipher_verify() if backlog was used, possibly casuing
data corruption due to use-after-free of buffers.
Resolve this by handling -EBUSY correctly.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
CC: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fix from Herbert Xu:
"This fixes a regression in the skcipher interface that allows bogus
key parameters to hit underlying implementations which can cause
crashes"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: skcipher - Add missing API setkey checks
Pull pstore fix from Kees Cook:
"Marta noticed another misbehavior in EFI pstore, which this fixes.
Hopefully this is the last of the v4.12 fixes for pstore!"
* tag 'pstore-v4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
efi-pstore: Fix write/erase id tracking
Pull ACPI fixes from Rafael Wysocki:
"These revert a 4.11 change that turned out to be problematic and add a
.gitignore file.
Specifics:
- Revert a 4.11 commit related to the ACPI-based handling of laptop
lids that made changes incompatible with existing user space stacks
and broke things there (Lv Zheng).
- Add .gitignore to the ACPI tools directory (Prarit Bhargava)"
* tag 'acpi-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI / button: Remove lid_init_state=method mode"
tools/power/acpi: Add .gitignore file
Pull power management fixes from Rafael Wysocki:
"These fix RTC wakeup from suspend-to-idle broken recently, fix CPU
idleness detection condition in the schedutil cpufreq governor, fix a
cpufreq driver build failure, fix an error code path in the power
capping framework, clean up the hibernate core and update the
intel_pstate documentation.
Specifics:
- Fix RTC wakeup from suspend-to-idle broken by the recent rework of
ACPI wakeup handling (Rafael Wysocki).
- Update intel_pstate driver documentation to reflect the current
code and explain how it works in more detail (Rafael Wysocki).
- Fix an issue related to CPU idleness detection on systems with
shared cpufreq policies in the schedutil governor (Juri Lelli).
- Fix a possible build issue in the dbx500 cpufreq driver (Arnd
Bergmann).
- Fix a function in the power capping framework core to return an
error code instead of 0 when there's an error (Dan Carpenter).
- Clean up variable definition in the hibernation core (Pushkar
Jambhlekar)"
* tag 'pm-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: dbx500: add a Kconfig symbol
PM / hibernate: Declare variables as static
PowerCap: Fix an error code in powercap_register_zone()
RTC: rtc-cmos: Fix wakeup from suspend-to-idle
PM / wakeup: Fix up wakeup_source_report_event()
cpufreq: intel_pstate: Document the current behavior and user interface
cpufreq: schedutil: use now as reference when aggregating shared policy requests
ci_role BUGs when the role is >= CI_ROLE_END.
This is the case while the role is changing.
Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
The datasheet and application note does not mention an allowed range for
the M09_REGISTER_THRESHOLD parameter. One of our customers needs to set
lower values than 20 and they seem to work just fine on EDT EP0xx0M09 with
T5x06 touch.
So, lacking a known lower limit, we increase the range for thresholds,
and set the lower limit to 0. The documentation is updated accordingly.
Signed-off-by: Schoefegger Stefan <stefan.schoefegger@ginzinger.com>
Signed-off-by: Manfred Schlaegl <manfred.schlaegl@ginzinger.com>
Signed-off-by: Martin Kepplinger <martin.kepplinger@ginzinger.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Prior to the pstore interface refactoring, the "id" generated during
a backend pstore_write() was only retained by the internal pstore
inode tracking list. Additionally the "part" was ignored, so EFI
would encode this in the id. This corrects the misunderstandings
and correctly sets "id" during pstore_write(), and uses "part"
directly during pstore_erase().
Reported-by: Marta Lofstedt <marta.lofstedt@intel.com>
Fixes: 76cc9580e3 ("pstore: Replace arguments for write() API")
Fixes: a61072aae6 ("pstore: Replace arguments for erase() API")
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>
Commit d224e93818 ("drivers/md/dm-ioctl.c: use kvmalloc rather than
opencoded variant") left out the __GFP_HIGH flag when converting from
__vmalloc to kvmalloc. This can cause the DM ioctl to fail in some low
memory situations where it wouldn't have failed earlier. Add __GFP_HIGH
back to avoid any potential regression.
Fixes: d224e93818 ("drivers/md/dm-ioctl.c: use kvmalloc rather than opencoded variant")
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Commit cc7b0d4955 ("PCI: designware: Update PCI config space remap
function") made PCI configuration requests non-posted, which means we now
get a synchronous abort when the CFG space read to probe for downstream
devices times out.
Synchronous aborts need to be handled differently from the async aborts we
were getting before, in particular the PC needs to be advanced when
resolving the abort. This is mostly a copy of what other PCI drivers do on
ARM to handle those aborts.
[bhelgaas: changelog, "Fixes"]
Fixes: cc7b0d4955 ("PCI: designware: Update PCI config space remap function")
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Richard Zhu <hongxing.zhu@nxp.com>
When a switch endpoint is configured without NTB, the mmio_ntb registers
will read all zeros. However, in corner case configurations where the
partition ID is not zero and NTB is not enabled, the code will have the
wrong partition ID and this causes the driver to use the wrong set of
drivers. To fix this we simply take the partition ID from the system info
region.
Reported-by: Dingbao Chen <dingbao.chen@microsemi.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Convert from "cdev_add() + device_add()" to cdev_device_add(), and from
"device_del() + cdev_del()" to cdev_device_del().
[bhelgaas: changelog]
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If NO_DMA=y:
drivers/built-in.o: In function `__pci_epc_create':
(.text+0xef4e): undefined reference to `bad_dma_ops'
drivers/built-in.o: In function `pci_epc_add_epf':
(.text+0xf676): undefined reference to `bad_dma_ops'
drivers/built-in.o: In function `pci_epf_alloc_space':
(.text+0xfa32): undefined reference to `bad_dma_ops'
drivers/built-in.o: In function `pci_epf_free_space':
(.text+0xfac4): undefined reference to `bad_dma_ops'
Add a dependency on HAS_DMA to fix this.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Default value of io.low limit is 0. If user doesn't configure the limit,
last patch makes cgroup be throttled to very tiny bps/iops, which could
stall the system. A cgroup with default settings of io.low limit really
means nothing, so we force user to configure all settings, otherwise
io.low limit doesn't take effect. With this stragety, default setting of
latency/idle isn't important, so just set them to very conservative and
safe value.
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
If a cgroup with low limit 0 for both bps/iops, the cgroup's low limit
is ignored and we throttle the cgroup with its max limit. In this way,
other cgroups with a low limit will not get protected. To fix this, we
don't do the exception any more. cgroup will be throttled to a limit 0
if it uese default setting. To avoid completed stall, we give such
cgroup tiny IO resources.
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
These info are important to understand what's happening and help debug.
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
For idle time, children's setting should not be bigger than parent's.
For latency target, children's setting should not be smaller than
parent's. The leaf nodes will adjust their settings according to the
hierarchy and compare their IO with the settings and do
upgrade/downgrade. parents nodes don't need to track their IO
latency/idle time.
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
If a kthread forks (e.g. usermodehelper since commit 1da5c46fa9) but
fails in copy_process() between calling dup_task_struct() and setting
p->set_child_tid, then the value of p->set_child_tid will be inherited
from the parent and get prematurely freed by free_kthread_struct().
kthread()
- worker_thread()
- process_one_work()
| - call_usermodehelper_exec_work()
| - kernel_thread()
| - _do_fork()
| - copy_process()
| - dup_task_struct()
| - arch_dup_task_struct()
| - tsk->set_child_tid = current->set_child_tid // implied
| - ...
| - goto bad_fork_*
| - ...
| - free_task(tsk)
| - free_kthread_struct(tsk)
| - kfree(tsk->set_child_tid)
- ...
- schedule()
- __schedule()
- wq_worker_sleeping()
- kthread_data(task)->flags // UAF
The problem started showing up with commit 1da5c46fa9 since it reused
->set_child_tid for the kthread worker data.
A better long-term solution might be to get rid of the ->set_child_tid
abuse. The comment in set_kthread_struct() also looks slightly wrong.
Debugged-by: Jamie Iles <jamie.iles@oracle.com>
Fixes: 1da5c46fa9 ("kthread: Make struct kthread kmalloc'ed")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jamie Iles <jamie.iles@oracle.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170509073959.17858-1-vegard.nossum@oracle.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Markus reported that the glibc/nptl/tst-robustpi8 test was failing after
commit:
cfafcd117d ("futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()")
The following trace shows the problem:
ld-linux-x86-64-2161 [019] .... 410.760971: SyS_futex: 00007ffbeb76b028: 80000875 op=FUTEX_LOCK_PI
ld-linux-x86-64-2161 [019] ...1 410.760972: lock_pi_update_atomic: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000875 ret=0
ld-linux-x86-64-2165 [011] .... 410.760978: SyS_futex: 00007ffbeb76b028: 80000875 op=FUTEX_UNLOCK_PI
ld-linux-x86-64-2165 [011] d..1 410.760979: do_futex: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000871 ret=0
ld-linux-x86-64-2165 [011] .... 410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=0000
ld-linux-x86-64-2161 [019] .... 410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=ETIMEDOUT
Task 2165 does an UNLOCK_PI, assigning the lock to the waiter task 2161
which then returns with -ETIMEDOUT. That wrecks the lock state, because now
the owner isn't aware it acquired the lock and removes the pending robust
list entry.
If 2161 is killed, the robust list will not clear out this futex and the
subsequent acquire on this futex will then (correctly) result in -ESRCH
which is unexpected by glibc, triggers an internal assertion and dies.
Task 2161 Task 2165
rt_mutex_wait_proxy_lock()
timeout();
/* T2161 is still queued in the waiter list */
return -ETIMEDOUT;
futex_unlock_pi()
spin_lock(hb->lock);
rtmutex_unlock()
remove_rtmutex_waiter(T2161);
mark_lock_available();
/* Make the next waiter owner of the user space side */
futex_uval = 2161;
spin_unlock(hb->lock);
spin_lock(hb->lock);
rt_mutex_cleanup_proxy_lock()
if (rtmutex_owner() !== current)
...
return FAIL;
....
return -ETIMEOUT;
This means that rt_mutex_cleanup_proxy_lock() needs to call
try_to_take_rt_mutex() so it can take over the rtmutex correctly which was
assigned by the waker. If the rtmutex is owned by some other task then this
call is harmless and just confirmes that the waiter is not able to acquire
it.
While there, fix what looks like a merge error which resulted in
rt_mutex_cleanup_proxy_lock() having two calls to
fixup_rt_mutex_waiters() and rt_mutex_wait_proxy_lock() not having any.
Both should have one, since both potentially touch the waiter list.
Fixes: 38d589f2fd ("futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()")
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Bug-Spotted-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Link: http://lkml.kernel.org/r/20170519154850.mlomgdsd26drq5j6@hirez.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Jonathan writes:
First set of IIO fixes in the 4.12 cycle.
Matt finally set up the lightning storm he needed to test the as3935.
* core
- Fix a null pointer deference in iio_trigger_write_current when changing
from a non existent trigger to another non existent trigger.
* a3935
- Recalibrate the RCO after resume.
- Fix interrupt mask so that we actually get some interrupts.
- Use iio_trigger_poll_chained as we aren't in interrupt context.
* am335x
- Fix wrong allocation size provided for private data to iio_device_alloc.
* bcm_iproc
- Swapped primary and secondary isr handlers.
* ltr501
- Fix swapped als/ps register fields when enabling interrupts
* max9611
- Wrong scale factor for the shunt_resistor attribute.
* sun4-gpadc
- Module autoloading fixes by adding the device table declarations.
- Fix parent device being used in devm functions.
Pull networking fixes from David Miller:
"Mostly netfilter bug fixes in here, but we have some bits elsewhere as
well.
1) Don't do SNAT replies for non-NATed connections in IPVS, from
Julian Anastasov.
2) Don't delete conntrack helpers while they are still in use, from
Liping Zhang.
3) Fix zero padding in xtables's xt_data_to_user(), from Willem de
Bruijn.
4) Add proper RCU protection to nf_tables_dump_set() because we
cannot guarantee that we hold the NFNL_SUBSYS_NFTABLES lock. From
Liping Zhang.
5) Initialize rcv_mss in tcp_disconnect(), from Wei Wang.
6) smsc95xx devices can't handle IPV6 checksums fully, so don't
advertise support for offloading them. From Nisar Sayed.
7) Fix out-of-bounds access in __ip6_append_data(), from Eric
Dumazet.
8) Make atl2_probe() propagate the error code properly on failures,
from Alexey Khoroshilov.
9) arp_target[] in bond_check_params() is used uninitialized. This
got changes from a global static to a local variable, which is how
this mistake happened. Fix from Jarod Wilson.
10) Fix fallout from unnecessary NULL check removal in cls_matchall,
from Jiri Pirko. This is definitely brown paper bag territory..."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
net: sched: cls_matchall: fix null pointer dereference
vsock: use new wait API for vsock_stream_sendmsg()
bonding: fix randomly populated arp target array
net: Make IP alignment calulations clearer.
bonding: fix accounting of active ports in 3ad
net: atheros: atl2: don't return zero on failure path in atl2_probe()
ipv6: fix out of bound writes in __ip6_append_data()
bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
smsc95xx: Support only IPv4 TCP/UDP csum offload
arp: always override existing neigh entries with gratuitous ARP
arp: postpone addr_type calculation to as late as possible
arp: decompose is_garp logic into a separate function
arp: fixed error in a comment
tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0
netfilter: xtables: fix build failure from COMPAT_XT_ALIGN outside CONFIG_COMPAT
ebtables: arpreply: Add the standard target sanity check
netfilter: nf_tables: revisit chain/object refcounting from elements
netfilter: nf_tables: missing sanitization in data from userspace
netfilter: nf_tables: can't assume lock is acquired when dumping set elems
netfilter: synproxy: fix conntrackd interaction
...
The driver checks an incorrect flag of functionality of adapter.
When a driver requires i2c_smbus_read_byte_data and
i2c_smbus_write_byte_data, it should check I2C_FUNC_SMBUS_BYTE_DATA
instead I2C_FUNC_I2C.
This patch fixes the problem.
Signed-off-by: Tin Huynh <tnhuynh@apm.com>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
fix extra controller reference taken on reconnect by moving
reference to initial controller create
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
correct nvme status set on abort. Patch that changed status to being actual
nvme status crossed in the night with the patch that added abort values.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Per the review by Sagi on:
http://lists.infradead.org/pipermail/linux-nvme/2017-April/009261.html
Looked at existing warn vs info vs err dev_xxx levels for the messages
printed on reconnects and deletes:
- Resets due to error and resets transitioned to deletes are dev_warn
- Other reset/disconnect messages are dev_info
- Removed chatty io queue related messages
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sync with Sagi's recent addition of ctrl_loss_tmo in the core fabrics
layer.
Remove local connect limits and connect_attempts variable.
Use fabrics new nr_connects variable and use of nvmf_should_reconnect()
Refactor duplicate reconnect failure code.
Addresses review comment by Sagi on controller reset support:
http://lists.infradead.org/pipermail/linux-nvme/2017-April/009261.html
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Remove the local copy of reconnect_delay.
Use the value in the controller options directly.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Since the head is guaranteed by the check above to be null, the call_rcu
would explode. Remove the previously logically dead code that was made
logically very much alive and kicking.
Fixes: 985538eee0 ("net/sched: remove redundant null check on head")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NVMe may add request into requeue list simply and not kick off the
requeue if hw queues are stopped. Then blk_mq_abort_requeue_list()
is called in both nvme_kill_queues() and nvme_ns_remove() for
dealing with this issue.
Unfortunately blk_mq_abort_requeue_list() is absolutely a
race maker, for example, one request may be requeued during
the aborting. So this patch just calls blk_mq_kick_requeue_list() in
nvme_kill_queues() to handle this issue like what nvme_start_queues()
does. Now all requests in requeue list when queues are stopped will be
handled by blk_mq_kick_requeue_list() when queues are restarted, either
in nvme_start_queues() or in nvme_kill_queues().
Cc: stable@vger.kernel.org
Reported-by: Zhang Yi <yizhan@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Inside nvme_kill_queues(), we have to start hw queues for
draining requests in sw queues, .dispatch list and requeue list,
so use blk_mq_start_hw_queues() instead of blk_mq_start_stopped_hw_queues()
which only run queues if queues are stopped, but the queues may have
been started already, for example nvme_start_queues() is called in reset work
function.
blk_mq_start_hw_queues() run hw queues in current context, instead
of running asynchronously like before. Given nvme_kill_queues() is
run from either remove context or reset worker context, both are fine
to run hw queue directly. And the mutex of namespaces_mutex isn't a
problem too becasue nvme_start_freeze() runs hw queue in this way
already.
Cc: stable@vger.kernel.org
Reported-by: Zhang Yi <yizhan@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In the case of small NVMe-oF queue size (<32) we may enter a deadlock
caused by the fact that the IB completions aren't sent waiting for 32
and the send queue will fill up.
The error is seen as (using mlx5):
[ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273):
[ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12
This patch changes the way the signaling is done so that it depends on
the queue depth now. The magic define has been removed completely.
Cc: stable@vger.kernel.org
Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
Signed-off-by: Samuel Jones <sjones@kalray.eu>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
As reported by Michal, vsock_stream_sendmsg() could still
sleep at vsock_stream_has_space() after prepare_to_wait():
vsock_stream_has_space
vmci_transport_stream_has_space
vmci_qpair_produce_free_space
qp_lock
qp_acquire_queue_mutex
mutex_lock
Just switch to the new wait API like we did for commit
d9dc8b0f8b ("net: fix sleeping for sk_wait_event()").
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit dc9c4d0fe0, the arp_target array moved from a static global
to a local variable. By the nature of static globals, the array used to
be initialized to all 0. At present, it's full of random data, which
that gets interpreted as arp_target values, when none have actually been
specified. Systems end up booting with spew along these lines:
[ 32.161783] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.168475] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.175089] 8021q: adding VLAN 0 to HW filter on device lacp0
[ 32.193091] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.204892] lacp0: Setting MII monitoring interval to 100
[ 32.211071] lacp0: Removing ARP target 216.124.228.17
[ 32.216824] lacp0: Removing ARP target 218.160.255.255
[ 32.222646] lacp0: Removing ARP target 185.170.136.184
[ 32.228496] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.236294] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.243987] lacp0: Removing ARP target 56.125.228.17
[ 32.249625] lacp0: Removing ARP target 218.160.255.255
[ 32.255432] lacp0: Removing ARP target 15.157.233.184
[ 32.261165] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.268939] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.276632] lacp0: Removing ARP target 16.0.0.0
[ 32.281755] lacp0: Removing ARP target 218.160.255.255
[ 32.287567] lacp0: Removing ARP target 72.125.228.17
[ 32.293165] lacp0: Removing ARP target 218.160.255.255
[ 32.298970] lacp0: Removing ARP target 8.125.228.17
[ 32.304458] lacp0: Removing ARP target 218.160.255.255
None of these were actually specified as ARP targets, and the driver does
seem to clean up the mess okay, but it's rather noisy and confusing, leaks
values to userspace, and the 255.255.255.255 spew shows up even when debug
prints are disabled.
The fix: just zero out arp_target at init time.
While we're in here, init arp_all_targets_value in the right place.
Fixes: dc9c4d0fe0 ("bonding: reduce scope of some global variables")
CC: Mahesh Bandewar <maheshb@google.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
CC: stable@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* intel_pstate:
cpufreq: intel_pstate: Document the current behavior and user interface
* pm-cpufreq:
cpufreq: dbx500: add a Kconfig symbol
* pm-cpufreq-sched:
cpufreq: schedutil: use now as reference when aggregating shared policy requests
DM-Verity has an (undocumented) mode where no salt is used. This was
never handled directly by the DM-Verity code, instead working due to the
fact that calling crypto_shash_update() with a zero length data is an
implicit noop.
This is no longer the case now that we have switched to
crypto_ahash_update(). Fix the issue by introducing explicit handling
of the no salt use case to DM-Verity.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Reported-by: Marian Csontos <mcsontos@redhat.com>
Fixes: d1ac3ff ("dm verity: switch to using asynchronous hash crypto API")
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The assignmnet:
ip_align = strict ? 2 : NET_IP_ALIGN;
in compare_pkt_ptr_alignment() trips up Coverity because we can only
get to this code when strict is true, therefore ip_align will always
be 2 regardless of NET_IP_ALIGN's value.
So just assign directly to '2' and explain the situation in the
comment above.
Reported-by: "Gustavo A. R. Silva" <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stingray SDHCI hardware supports ACMD12 and automatically
issues after multi block transfer completed.
If ACMD12 in SDHCI is disabled, spurious tx done interrupts are seen
on multi block read command with below error message:
Got data interrupt 0x00000002 even though no data
operation was in progress.
This patch uses SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 to enable
ACM12 support in SDHCI hardware and suppress spurious interrupt.
Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: b580c52d58 ("mmc: sdhci-iproc: add IPROC SDHCI driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
As of 7bb11dc9f5 and 0622cab034, bond slaves in a 3ad bond are not
removed from the aggregator when they are down, and the active slave count
is NOT equal to number of ports in the aggregator, but rather the number
of ports in the aggregator that are still enabled. The sysfs spew for
bonding_show_ad_num_ports() has a comment that says "Show number of active
802.3ad ports.", but it's currently showing total number of ports, both
active and inactive. Remedy it by using the same logic introduced in
0622cab034 in __bond_3ad_get_active_agg_info(), so sysfs, procfs and
netlink all report the number of active ports. Note that this means that
IFLA_BOND_AD_INFO_NUM_PORTS really means NUM_ACTIVE_PORTS instead of
NUM_PORTS, and thus perhaps should be renamed for clarity.
Lightly tested on a dual i40e lacp bond, simulating link downs with an ip
link set dev <slave2> down, was able to produce the state where I could
see both in the same aggregator, but a number of ports count of 1.
MII Status: up
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 2 <---
Slave Interface: ens10
MII Status: up <---
Aggregator ID: 1
Slave Interface: ens11
MII Status: up
Aggregator ID: 1
MII Status: up
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1 <---
Slave Interface: ens10
MII Status: down <---
Aggregator ID: 1
Slave Interface: ens11
MII Status: up
Aggregator ID: 1
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If dma mask checks fail in atl2_probe(), it breaks off initialization,
deallocates all resources, but returns zero.
The patch adds proper error code return value and
make error code setup unified.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the regulator probing is not yet finished this driver
might catch a -EPROBE_DEFER. Returning after this condition
did not remove the created platform device. On a repeated
call to the probe function the of_platform_device_create
fails.
Calling of_platform_device_destroy after EPROBE_DEFER resolves
this bug.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
of_platform_device_destroy is the counterpart to
of_platform_device_create which is a non-static function.
After creating a platform device it might be neccessary
to destroy it to deal with -EPROBE_DEFER where a
repeated of_platform_device_create call would fail otherwise.
Therefore also make of_platform_device_destroy globally visible.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In case the DT specifies neither a regulator nor a gpio
for the shared power the driver will crash accessing the regulator.
Prevent the crash by checking the regulator before use.
Use mmc_regulator_get_supply() instead of open coding the same
logic.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Andrey Konovalov and idaifish@gmail.com reported crashes caused by
one skb shared_info being overwritten from __ip6_append_data()
Andrey program lead to following state :
copy -4200 datalen 2000 fraglen 2040
maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200
The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen,
fraggap, 0); is overwriting skb->head and skb_shared_info
Since we apparently detect this rare condition too late, move the
code earlier to even avoid allocating skb and risking crashes.
Once again, many thanks to Andrey and syzkaller team.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: <idaifish@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andre Przywara <andre.przywara@arm.com> noticed that we can get the
following warning with -EPROBE_DEFER:
"WARNING: CPU: 1 PID: 89 at drivers/base/dd.c:349
driver_probe_device+0x2ac/0x2e8"
Let's fix the issue by removing the indices as suggested by
Tejun Heo <tj@kernel.org>. All we have to do here is kill the radix
tree.
I probably ended up with the indices after grepping for removal
of all entries using radix_tree_for_each_slot() and the first
match found was gmap_radix_tree_free(). Anyways, no need for
indices here, and we can just do remove all the entries using
radix_tree_for_each_slot() along how the item_kill_tree() test
case does.
Fixes: c7059c5ac7 ("pinctrl: core: Add generic pinctrl functions for managing groups")
Fixes: a76edc89b1 ("pinctrl: core: Add generic pinctrl functions for managing groups")
Reported-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
I've forgotten to sync the documentation with the actually available
options for some time. Now all updated.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Recently some laptops and mobos are equipped with the dual Realtek
codecs that require special quirks. For making the debugging easier,
add the model "dual-codecs" to be passed via module option.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
MSI Z270-Gamin mobo has also two ALC1220 codecs like Gigabyte AZ370-
Gaming mobo. Apply the same quirk to this one.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Make some symbols static to fix sparse warnings like:
drivers/s390/cio/vfio_ccw_ops.c:73:1: warning: symbol 'mdev_type_attr_name' was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The keyboard dock used with the Asus Transformer T100 series, uses
the same vendor-defined 0xff31 usage-page as some other Asus
keyboards. But with a small twist, it has a small descriptor bug which
needs to be fixed up for things to work.
This commit adds the USB-ID for this keyboard to the hid-asus driver
and makes asus_report_fixup fix the descriptor issue, fixing
various special function keys on this keyboard not working.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add stubs for gpiod_add_lookup_table() and gpiod_remove_lookup_table()
for the !GPIOLIB case to prevent build errors.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This reverts commit 8c58f1a7a4.
It turns out that applying these generic properties was
premature: the properties used in the driver using this
are of unclear electrical nature and the subject need to
be discussed.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
We warn the user at driver probe time that debouncing is disabled.
However, if they request debouncing later on we print a confusing error
message:
gpio_aspeed 1e780000.gpio: Failed to convert 5000us to cycles at 0Hz: -524
Instead bail out when the clock is not present.
Fixes: 5ae4cb94b3 (gpio: aspeed: Add debounce support)
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
We need to initializes those variables to 0 for platforms that do not
provide ACPI parameters. Otherwise, we set sda_hold_time to random
values, breaking e.g. Galileo and IOT2000 boards.
Fixes: 9d64084330 ("i2c: designware: don't infer timings described by ACPI from clock rate")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
According to Boris, some user-space tools expect MTD drivers to
update ecc_stats.corrected, and it's better to provide a lower
bound than to provide no information at all.
Fixes: 6956e2385a ("mtd: nand: add tango NAND flash controller support")
Cc: stable@vger.kernel.org
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The device table is required to load modules based on
modaliases. After adding MODULE_DEVICE_TABLE, below entries
for example will be added to module.alias:
alias: of:N*T*Csigma,smp8758-nandC*
alias: of:N*T*Csigma,smp8758-nand
Fixes: 6956e2385a ("mtd: nand: add tango NAND flash controller support")
Cc: stable@vger.kernel.org
Signed-off-by: Andres Galacho <andresgalacho@gmail.com>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
We don't handle cases larger than 7. We probably shouldn't pretend we
know the ECC step size in this case, and it's probably also good to
WARN() like we do in many other similar cases.
Fixes: 8fc82d456e ("mtd: nand: samsung: Retrieve ECC requirements from extended ID")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
If we fail any time after calling nand_detect(), then we don't call the
vendor-specific ->cleanup() callback, and we'll leak any resources the
vendor-specific code might have allocated.
Mark the "fix" against the first commit that started allocating anything
in ->init().
Fixes: 626994e074 ("mtd: nand: hynix: Add read-retry support for 1x nm MLC NANDs")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
mtk_hdmi_setup_vendor_specific_infoframe will return before handle
mtk_hdmi_hw_send_info_frame.Because hdmi_vendor_infoframe_pack
returns the number of bytes packed into the binary buffer or
a negative error code on failure.
So correct it.
Fixes: 8f83f26891 ("drm/mediatek: Add HDMI support")
Signed-off-by: Nickey Yang <nickey.yang@rock-chips.com>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
This code causes a static checker warning because it treats "i == 0" as
a timeout but, because it's a post-op, the loop actually ends with "i"
set to -1. Philipp Zabel points out that it would be cleaner to use
readl_poll_timeout() instead.
Fixes: 2189881683 ("drm/mediatek: add dsi transfer function")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
The add_new_disk returns with communication locked if
__sendmsg returns failure, fix it with call unlock_comm
before return.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
CC: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
I've got another report about breaking ext4 by ENOMEM error returned from
ext4_mb_load_buddy() caused by memory shortage in memory cgroup.
This time inside ext4_discard_preallocations().
This patch replaces ext4_error() with ext4_warning() where errors returned
from ext4_mb_load_buddy() are not fatal and handled by caller:
* ext4_mb_discard_group_preallocations() - called before generating ENOSPC,
we'll try to discard other group or return ENOSPC into user-space.
* ext4_trim_all_free() - just stop trimming and return ENOMEM from ioctl.
Some callers cannot handle errors, thus __GFP_NOFAIL is used for them:
* ext4_discard_preallocations()
* ext4_mb_discard_lg_preallocations()
Fixes: adb7ef600c ("ext4: use __GFP_NOFAIL in ext4_free_blocks()")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
There is an off-by-one error in loop termination conditions in
ext4_find_unwritten_pgoff() since 'end' may index a page beyond end of
desired range if 'endoff' is page aligned. It doesn't have any visible
effects but still it is good to fix it.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently, SEEK_HOLE implementation in ext4 may both return that there's
a hole at some offset although that offset already has data and skip
some holes during a search for the next hole. The first problem is
demostrated by:
xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (2.054 GiB/sec and 538461.5385 ops/sec)
Whence Result
HOLE 0
Where we can see that SEEK_HOLE wrongly returned offset 0 as containing
a hole although we have written data there. The second problem can be
demonstrated by:
xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
-c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (1.978 GiB/sec and 518518.5185 ops/sec)
wrote 8192/8192 bytes at offset 131072
8 KiB, 2 ops; 0.0000 sec (2 GiB/sec and 500000.0000 ops/sec)
Whence Result
HOLE 139264
Where we can see that hole at offsets 56k..128k has been ignored by the
SEEK_HOLE call.
The underlying problem is in the ext4_find_unwritten_pgoff() which is
just buggy. In some cases it fails to update returned offset when it
finds a hole (when no pages are found or when the first found page has
higher index than expected), in some cases conditions for detecting hole
are just missing (we fail to detect a situation where indices of
returned pages are not contiguous).
Fix ext4_find_unwritten_pgoff() to properly detect non-contiguous page
indices and also handle all cases where we got less pages then expected
in one place and handle it properly there.
CC: stable@vger.kernel.org
Fixes: c8c0df241c
CC: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When a transaction starts, start_this_handle() saves current
PF_MEMALLOC_NOFS value so that it can be restored at journal stop time.
Journal restart is a special case that calls start_this_handle() without
stopping the transaction. start_this_handle() isn't aware that the
original value is already stored so it overwrites it with current value.
For instance, a call sequence like below leaves PF_MEMALLOC_NOFS flag set
at the end:
jbd2_journal_start()
jbd2__journal_restart()
jbd2_journal_stop()
Make jbd2__journal_restart() restore the original value before calling
start_this_handle().
Fixes: 81378da64d ("jbd2: mark the transaction context with the scope GFP_NOFS context")
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Quota files have special ranking of i_data_sem lock. We inform lockdep
about it when turning on quotas however when turning quotas off, we
don't clear the lockdep subclass from i_data_sem lock and thus when the
inode gets later reused for a normal file or directory, lockdep gets
confused and complains about possible deadlocks. Fix the problem by
resetting lockdep subclass of i_data_sem on quota off.
Cc: stable@vger.kernel.org
Fixes: daf647d2dd
Reported-and-tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Export the function which checks whether an MCE is a memory error to
other users so that we can reuse the logic. Drop the boot_cpu_data use,
while at it, as mce.cpuvendor already has the CPU vendor in there.
Integrate a piece from a patch from Vishal Verma
<vishal.l.verma@intel.com> to export it for modules (nfit).
The main reason we're exporting it is that the nfit handler
nfit_handle_mce() needs to detect a memory error properly before doing
its recovery actions.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170519093915.15413-2-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Since commit 76b91c32dd ("bridge: stp: when using userspace stp stop
kernel hello and hold timers"), bridge would not start hello_timer if
stp_enabled is not KERNEL_STP when br_dev_open.
The problem is even if users set stp_enabled with KERNEL_STP later,
the timer will still not be started. It causes that KERNEL_STP can
not really work. Users have to re-ifup the bridge to avoid this.
This patch is to fix it by starting br->hello_timer when enabling
KERNEL_STP in br_stp_start.
As an improvement, it's also to start hello_timer again only when
br->stp_enabled is KERNEL_STP in br_hello_timer_expired, there is
no reason to start the timer again when it's NO_STP.
Fixes: 76b91c32dd ("bridge: stp: when using userspace stp stop kernel hello and hold timers")
Reported-by: Haidong Li <haili@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
When TX checksum offload is used, if the computed checksum is 0 the
LAN95xx device do not alter the checksum to 0xffff. In the case of ipv4
UDP checksum, it indicates to receiver that no checksum is calculated.
Under ipv6, UDP checksum yields a result of zero must be changed to
0xffff. Hence disabling checksum offload for ipv6 packets.
Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com>
Reported-by: popcorn mix <popcornmix@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ihar Hrachyshka says:
====================
arp: always override existing neigh entries with gratuitous ARP
This patchset is spurred by discussion started at
https://patchwork.ozlabs.org/patch/760372/ where we figured that there is no
real reason for enforcing override by gratuitous ARP packets only when
arp_accept is 1. Same should happen when it's 0 (the default value).
changelog v2: handled review comments by Julian Anastasov
- fixed a mistake in a comment;
- postponed addr_type calculation to as late as possible.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when arp_accept is 1, we always override existing neigh
entries with incoming gratuitous ARP replies. Otherwise, we override
them only if new replies satisfy _locktime_ conditional (packets arrive
not earlier than _locktime_ seconds since the last update to the neigh
entry).
The idea behind locktime is to pick the very first (=> close) reply
received in a unicast burst when ARP proxies are used. This helps to
avoid ARP thrashing where Linux would switch back and forth from one
proxy to another.
This logic has nothing to do with gratuitous ARP replies that are
generally not aligned in time when multiple IP address carriers send
them into network.
This patch enforces overriding of existing neigh entries by all incoming
gratuitous ARP packets, irrespective of their time of arrival. This will
make the kernel honour all incoming gratuitous ARP packets.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The addr_type retrieval can be costly, so it's worth trying to avoid its
calculation as much as possible. This patch makes it calculated only
for gratuitous ARP packets. This is especially important since later we
may want to move is_garp calculation outside of arp_accept block, at
which point the costly operation will be executed for all setups.
The patch is the result of a discussion in net-dev:
http://marc.info/?l=linux-netdev&m=149506354216994
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code is quite involving already to earn a separate function for
itself. If anything, it helps arp_process readability.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tcp_disconnect() is called, inet_csk_delack_init() sets
icsk->icsk_ack.rcv_mss to 0.
This could potentially cause tcp_recvmsg() => tcp_cleanup_rbuf() =>
__tcp_select_window() call path to have division by 0 issue.
So this patch initializes rcv_mss to TCP_MIN_MSS instead of 0.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:
1) When using IPVS in direct-routing mode, normal traffic from the LVS
host to a back-end server is sometimes incorrectly NATed on the way
back into the LVS host. Patch to fix this from Julian Anastasov.
2) Calm down clang compilation warning in ctnetlink due to type
mismatch, from Matthias Kaehlcke.
3) Do not re-setup NAT for conntracks that are already confirmed, this
is fixing a problem that was introduced in the previous nf-next batch.
Patch from Liping Zhang.
4) Do not allow conntrack helper removal from userspace cthelper
infrastructure if already in used. This comes with an initial patch
to introduce nf_conntrack_helper_put() that is required by this fix.
From Liping Zhang.
5) Zero the pad when copying data to userspace, otherwise iptables fails
to remove rules. This is a follow up on the patchset that sorts out
the internal match/target structure pointer leak to userspace. Patch
from the same author, Willem de Bruijn. This also comes with a build
failure when CONFIG_COMPAT is not on, coming in the last patch of
this series.
6) SYNPROXY crashes with conntrack entries that are created via
ctnetlink, more specifically via conntrackd state sync. Patch from
Eric Leblond.
7) RCU safe iteration on set element dumping in nf_tables, from
Liping Zhang.
8) Missing sanitization of immediate date for the bitwise and cmp
expressions in nf_tables.
9) Refcounting logic for chain and objects from set elements does not
integrate into the nf_tables 2-phase commit protocol.
10) Missing sanitization of target verdict in ebtables arpreply target,
from Gao Feng.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
For the sake of DT binding stability, this IIO driver is a child of an
MFD driver for Allwinner A10, A13 and A31 because there already exists a
DT binding for this IP. The MFD driver has a DT node but the IIO driver
does not.
The IIO device registers the temperature sensor in the thermal framework
using the DT node of the parent, the MFD device, so the thermal
framework could match the phandle to the MFD device in the DT and the
struct device used to register in the thermal framework.
devm_thermal_zone_of_sensor_register was previously used to register the
thermal sensor with the parent struct device of the IIO device,
representing the MFD device. By doing so, we registered actually the
parent in the devm routine and not the actual IIO device.
This lead to the devm unregister function not being called when the IIO
module driver is removed. It resulted in the thermal framework still
polling the get_temp function of the IIO module while the device doesn't
exist anymore, thus generated a kernel panic.
Use the non-devm function instead and do the unregister manually in the
remove function.
Fixes: d1caa99055 ("iio: adc: add support for Allwinner SoCs ADC")
Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The third argument of devm_request_threaded_irq() is the primary
handler. It is called in hardirq context and checks whether the
interrupt is relevant to the device. If the primary handler returns
IRQ_WAKE_THREAD, the secondary handler (a.k.a. handler thread) is
scheduled to run in process context.
bcm_iproc_adc.c uses the secondary handler as the primary one
and the other way around. So this patch fixes the same, along with
re-naming the secondary handler and primary handler names properly.
Tested on the BCM9583XX iProc SoC based boards.
Fixes: 4324c97ece ("iio: Add driver for Broadcom iproc-static-adc")
Reported-by: Pavel Roskin <plroskin@gmail.com>
Signed-off-by: Raveendra Padasalagi <raveendra.padasalagi@broadcom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The arm64 H5 and arm H3 SoCs share roughly the same base, and therefore
share a significant part of their device tree.
The approach we took was to add a symlink from the arm64 DTSI to the arm
DTSI.
Now that the arm DT folder is exposed in the include path, we can just use
it and remove our symlink.
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Initialize asoc_simple_card_init_mic with the correct struct
asoc_simple_jack.
Fixes: 9eac361877 ("ASoC: simple-card: add new asoc_simple_jack and use it")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
If SSI uses shared pin, some SSI will be used as parent SSI.
Then, normal SSI's remove and Parent SSI's remove
(these are same SSI) will be called when unbind or remove timing.
In this case, free_irq() will be called twice.
This patch solve this issue.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Reported-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
If a malicious user corrupts the refcount btree to cause a cycle between
different levels of the tree, the next mount attempt will deadlock in
the CoW recovery routine while grabbing buffer locks. We can use the
ability to re-grab a buffer that was previous locked to a transaction to
avoid deadlocks, so do that here.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
During xfrm migration copy replay and preplay sequence numbers
from the previous state.
Here is a tcpdump output showing the problem.
10.0.10.46 is running vanilla kernel, is the IKE/IPsec responder.
After the migration it sent wrong sequence number, reset to 1.
The migration is from 10.0.0.52 to 10.0.0.53.
IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7cf), length 136
IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7cf), length 136
IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d0), length 136
IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7d0), length 136
IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I]
IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R]
IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I]
IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R]
IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d1), length 136
NOTE: next sequence is wrong 0x1
IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x1), length 136
IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d2), length 136
IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x2), length 136
Signed-off-by: Antony Antony <antony@phenome.org>
Reviewed-by: Richard Guy Briggs <rgb@tricolour.ca>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The skb must be released in the receive handler since b91a2543b4
("batman-adv: Consume skb in receive handlers"). Just returning NET_RX_DROP
will no longer automatically free the memory. This results in memory leaks
when unicast packets from other backbones must be dropped because they
share a common backbone.
Fixes: 9e794b6bf4 ("batman-adv: drop unicast packets from other backbone gw")
Signed-off-by: Andreas Pape <apape@phoenixcontact.com>
[sven@narfation.org: adjust commit message]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The stats are generated by batadv_interface_stats and must not be stored
directly in the net_device stats member variable. The batadv_priv
bat_counters information is assembled when ndo_get_stats is called. The
stats previously stored in net_device::stats is then overwritten.
The batman-adv counters must therefore be increased when an ARP packet is
answered locally via the distributed arp table.
Fixes: c384ea3ec9 ("batman-adv: Distributed ARP Table - add snooping functions for ARP messages")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The tm-resched-dscr test has started failing sometimes, depending on
what compiler it's built with, eg:
test: tm_resched_dscr
Check DSCR TM context switch: tm-resched-dscr: tm-resched-dscr.c:76: test_body: Assertion `rv' failed.
!! child died by signal 6
When it fails we see that the compiler doesn't initialise rv to 1 before
entering the inline asm block. Although that's counter intuitive, it
is allowed because we tell the compiler that the inline asm will write
to rv (using "=r"), meaning the original value is irrelevant.
Marking it as a read/write parameter would presumably work, but it seems
simpler to fix it by setting the initial value of rv in the inline asm.
Fixes: 96d0161086 ("powerpc: Correct DSCR during TM context switch")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Michael Neuling <mikey@neuling.org>
In case of error, the function of_iomap() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check should
be replaced with NULL test.
Fixes: e78f3d15e1 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets")
Reviewed-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
When moving a merge dir or non-dir with copy up origin into a non-merge
upper dir (a.k.a pure upper dir), we are marking the target parent dir
"impure". ovl_iterate() iterates pure upper dirs directly, because there is
no need to filter out whiteouts and merge dir content with lower dir. But
for the case of an "impure" upper dir, ovl_iterate() will not be able to
iterate the real upper dir directly, because it will need to lookup the
origin inode and use it to fill d_ino.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
On failure to set opaque/redirect xattr on rename, skip setting xattr and
return -EXDEV.
On failure to set opaque xattr when creating a new directory, -EIO is
returned instead of -EOPNOTSUPP.
Any failure to set those xattr will be recorded in super block and
then setting any xattr on upper won't be attempted again.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The devm_gpiod_get_optional() function appends a "-gpios" to the
string passed to it, so if we want to find the "power-gpios" signal,
we must pass "power" to this function.
Fixes: 01d9584333 ("mmc: cavium: Add MMC support for Octeon SOCs.")
Signed-off-by: David Daney <david.daney@cavium.com>
[jglauber@cavium.com: removed point after subject line]
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
OCTEON SoCs with CIU3 do not have interrupt masking local to the MMC
bus interface. Unfortunately, some even have a diagnostic register at
the same address of the enable register, which causes the interrupts
to fire immediately if stored to, thus breaking the driver. The proper
action on these SoCs is not to touch this register.
Fixes: 01d9584333 ("mmc: cavium: Add MMC support for Octeon SOCs.")
Signed-off-by: David Daney <david.daney@cavium.com>
[jglauber@cavium.com: removed point after subject line]
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Currently, the xenon_clean_phy() is only used for freeing phy_params.
The phy_params is allocated by devm_kzalloc(), there's no need to free
is explicitly.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Acked-by: Hu Ziji <huziji@marvell.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
In lower layer driver's (LLD) scsi_host_template, the driver may
optionally ask SCSI to allocate its private driver memory for each
command, by specifying cmd_size. This memory is allocated at the end of
scsi_cmnd by SCSI. Later when SCSI queues a command, the LLD can use
scsi_cmd_priv to get to its private data.
Some LLD, e.g. hv_storvsc, doesn't clear its private data before use. In
this case, the LLD may get to stale or uninitialized data in its private
driver memory. This may result in unexpected driver and hardware
behavior.
Fix this problem by also zeroing the private driver memory before
passing them to LLD.
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: KY Srinivasan <kys@microsoft.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
CC: <stable@vger.kernel.org> # 4.11+
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
mbp pointer is passed to csio_hw_validate_caps() so call mempool_free()
after calling csio_hw_validate_caps().
Signed-off-by: Varun Prakash <varun@chelsio.com>
Fixes: 541c571fa2 ("csiostor:Use firmware version from cxgb4/t4fw_version.h")
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When reloading module these two attributes aren't cleaned up properly
and they persist causing warnings when trying to load module
again. Additionally they are not recreated properly due to that.
Signed-off-by: Michał Potomski <michalx.potomski@intel.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a new interface for registering a serdev controller and clients, and
a helper function to deregister serdev devices (or a tty device) that
were previously registered using the new interface.
Once every driver currently using the tty_port_register_device() helpers
have been vetted and converted to use the new serdev registration
interface (at least for deregistration), we can move serdev registration
to the current helpers and get rid of the serdev-specific functions.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Starting with commit 6fe729c4bd ("serdev: Add serdev_device_write
subroutine") the function serdev_device_write_buf cannot be used in
atomic context anymore (mutex_lock is sleeping). So restore the old
behavior.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 6fe729c4bd ("serdev: Add serdev_device_write subroutine")
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With serdev we might end up with serial ports that have no cdev exported
to userspace, as they are used as the bus interface to other devices. In
that case serial_match_port() won't be able to find a matching tty_dev.
Skip the irq wakeup enabling in that case, as serdev will make sure to
keep the port active, as long as there are devices depending on it.
Fixes: 8ee3fde047 (tty_port: register tty ports with serdev bus)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tty_insert_flip_string_fixed_flag() is racy against itself when called
from the ioctl(TCXONC, TCION/TCIOFF) path [1] and the flush_to_ldisc()
workqueue path [2].
The problem is that port->buf.tail->used is modified without consistent
locking; the ioctl path takes tty->atomic_write_lock, whereas the workqueue
path takes ldata->output_lock.
We cannot simply take ldata->output_lock, since that is specific to the
N_TTY line discipline.
It might seem natural to try to take port->buf.lock inside
tty_insert_flip_string_fixed_flag() and friends (where port->buf is
actually used/modified), but this creates problems for flush_to_ldisc()
which takes it before grabbing tty->ldisc_sem, o_tty->termios_rwsem,
and ldata->output_lock.
Therefore, the simplest solution for now seems to be to take
tty->atomic_write_lock inside tty_port_default_receive_buf(). This lock
is also used in the write path [3] with a consistent ordering.
[1]: Call Trace:
tty_insert_flip_string_fixed_flag
pty_write
tty_send_xchar // down_read(&o_tty->termios_rwsem)
// mutex_lock(&tty->atomic_write_lock)
n_tty_ioctl_helper
n_tty_ioctl
tty_ioctl // down_read(&tty->ldisc_sem)
do_vfs_ioctl
SyS_ioctl
[2]: Workqueue: events_unbound flush_to_ldisc
Call Trace:
tty_insert_flip_string_fixed_flag
pty_write
tty_put_char
__process_echoes
commit_echoes // mutex_lock(&ldata->output_lock)
n_tty_receive_buf_common
n_tty_receive_buf2
tty_ldisc_receive_buf // down_read(&o_tty->termios_rwsem)
tty_port_default_receive_buf // down_read(&tty->ldisc_sem)
flush_to_ldisc // mutex_lock(&port->buf.lock)
process_one_work
[3]: Call Trace:
tty_insert_flip_string_fixed_flag
pty_write
n_tty_write // mutex_lock(&ldata->output_lock)
// down_read(&tty->termios_rwsem)
do_tty_write (inline) // mutex_lock(&tty->atomic_write_lock)
tty_write // down_read(&tty->ldisc_sem)
__vfs_write
vfs_write
SyS_write
The bug can result in about a dozen different crashes depending on what
exactly gets corrupted when port->buf.tail->used points outside the
buffer.
The patch passes my LOCKDEP/PROVE_LOCKING testing but more testing is
always welcome.
Found using syzkaller.
Cc: <stable@vger.kernel.org>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Straighten out the initcall error handling to avoid deregistering a
never-registered tty driver (something which would lead to a
NULL-pointer dereference) in the most unlikely event that driver
registration fails (e.g. we've run out of major numbers).
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make sure to deregister the SPI driver before releasing the tty driver
to avoid use-after-free in the SPI remove callback where the tty
devices are deregistered.
Fixes: 72d4724ea5 ("serial: ifx6x60: Add modem power off function in the platform reboot process")
Cc: stable <stable@vger.kernel.org> # 3.8
Cc: Jun Chen <jun.d.chen@intel.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver does ioremap(port->mapbase, ALTERA_JTAGUART_SIZE),
but there is no any iounmap().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After migrating 8250_exar to MSI in 172c33cb61, we can get stuck
without further interrupts because of the special wake-up event these
chips send. They are only cleared by reading INT0. As we fail to do so
during startup and shutdown, we can leave the interrupt line asserted,
which is fatal with edge-triggered MSIs.
Add the required reading of INT0 to startup and shutdown. Also account
for the fact that a pending wake-up interrupt means we have to return 1
from exar_handle_irq. Drop the unneeded reading of INT1..3 along with
this - those never reset anything.
An alternative approach would have been disabling the wake-up interrupt.
Unfortunately, this feature (REGB[17] = 1) is not available on the
XR17D15X.
Fixes: 172c33cb61 ("serial: exar: Enable MSI support")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UARTn_FRAME_PARITY_ODD is 0x0300
UARTn_FRAME_PARITY_EVEN is 0x0200
So if the UART is configured for EVEN parity, it would be reported as ODD.
Fix it by correctly testing if the 2 bits are set.
Fixes: 3afbd89c96 ("serial/efm32: add new driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The port client data must be set when registering the serdev controller
or client deregistration will fail (and the serdev devices are left
registered and allocated) if the port was never opened in between.
Make sure to clear the port client data on any probe errors to avoid a
use-after-free when the client is later deregistered unconditionally
(e.g. in a tty-port deregistration helper).
Also move port client operation initialisation to registration. Note
that the client ops must be restored on failed probe.
Fixes: bed35c6dfa ("serdev: add a tty port controller driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 8ee3fde047.
The new serdev bus hooked into the tty layer in
tty_port_register_device() by registering a serdev controller instead of
a tty device whenever a serdev client is present, and by deregistering
the controller in the tty-port destructor. This is broken in several
ways:
Firstly, it leads to a NULL-pointer dereference whenever a tty driver
later deregisters its devices as no corresponding character device will
exist.
Secondly, far from every tty driver uses tty-port refcounting (e.g.
serial core) so the serdev devices might never be deregistered or
deallocated.
Thirdly, deregistering at tty-port destruction is too late as the
underlying device and structures may be long gone by then. A port is not
released before an open tty device is closed, something which a
registered serdev client can prevent from ever happening. A driver
callback while the device is gone typically also leads to crashes.
Many tty drivers even keep their ports around until the driver is
unloaded (e.g. serial core), something which even if a late callback
never happens, leads to leaks if a device is unbound from its driver and
is later rebound.
The right solution here is to add a new tty_port_unregister_device()
helper and to never call tty_device_unregister() whenever the port has
been claimed by serdev, but since this requires modifying just about
every tty driver (and multiple subsystems) it will need to be done
incrementally.
Reverting the offending patch is the first step in fixing the broken
lifetime assumptions. A follow-up patch will add a new pair of
tty-device registration helpers, which a vetted tty driver can use to
support serdev (initially serial core). When every tty driver uses the
serdev helpers (at least for deregistration), we can add serdev
registration to tty_port_register_device() again.
Note that this also fixes another issue with serdev, which currently
allocates and registers a serdev controller for every tty device
registered using tty_port_device_register() only to immediately
deregister and deallocate it when the corresponding OF node or serdev
child node is missing. This should be addressed before enabling serdev
for hot-pluggable buses.
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit fa01e2ca9f ("serial: 8250: Integrate Fintek into 8250_base")
modified the probing logic for PNP0501 devices, to remove a collision
between the generic 16550A driver and the Fintek driver, which reused
the same ACPI _HID.
The Fintek device probe is now incorporated into the common 8250 probe
path, and gets called for all discovered 16550A compatible devices,
including ones that are MMIO mapped rather than IO mapped. However,
the Fintek driver assumes the port base is a I/O address, and proceeds
to probe some arbitrary offsets above it.
This is generally a wrong thing to do, but on ARM systems (having no
native port I/O), this may result in faulting accesses of completely
unrelated MMIO regions in the PCI I/O space. Given that this is at
serial probe time, this results in hard to diagnose crashes at boot.
So let's restrict the Fintek probe to devices that we know are using
port I/O in the first place.
Fixes: fa01e2ca9f ("serial: 8250: Integrate Fintek into 8250_base")
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ricardo Ribalda <ricardo.ribalda@gmail.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xattr are needed by overlayfs for setting opaque dir, redirect dir
and copy up origin.
Check at mount time by trying to set the overlay.opaque xattr on the
workdir and if that fails issue a warning message.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The patch in the Fixes references COMPAT_XT_ALIGN in the definition
of XT_DATA_TO_USER, outside an #ifdef CONFIG_COMPAT block.
Split XT_DATA_TO_USER into separate compat and non compat variants and
define the first inside an CONFIG_COMPAT block.
This simplifies both variants by removing branches inside the macro.
Fixes: 324318f024 ("netfilter: xtables: zero padding in data_to_user")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The newly added PRCM CCU driver uses SUNXI_CCU_MP_WITH_MUX_GATE, which causes
a link error when no other driver enables SUNXI_CCU_MP:
drivers/clk/built-in.o:(.data+0x5c8c8): undefined reference to `ccu_mp_ops'
This adds an explicit 'select' statement for it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
The API setkey checks for key sizes and alignment went AWOL during the
skcipher conversion. This patch restores them.
Cc: <stable@vger.kernel.org>
Fixes: 4e6c3df4d7 ("crypto: skcipher - Add low-level skcipher...")
Reported-by: Baozeng <sploving1@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Recent commit on patchset "lpfc updates for 11.2.0.14" fixed an issue
about dereferencing a NULL pointer on port reset. The specific commit,
named "lpfc: Fix system crash when port is reset.", is missing a check
against NULL pointer on lpfc_els_flush_cmd() though.
Since we destroy the queues on adapter resets, like in PCI error
recovery path, we need the validation present on this patch in order to
avoid a NULL pointer dereference when trying to flush commands of ELS
wq, after it has been destroyed (which would lead to a kernel oops).
Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The kill_css() function may be called more than once under the condition
that the css was killed but not physically removed yet followed by the
removal of the cgroup that is hosting the css. This patch prevents any
harmm from being done when that happens.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v4.5+
Mesh forwarding path checks for address extension mode to fetch
appropriate proxied address and MPP address. Existing condition
that looks for 6 address format is not strict enough so that
frames with improper values are processed and invalid entries
are added into MPP table. Fix that by adding a stricter check before
processing the packet.
Per IEEE Std 802.11s-2011 spec. Table 7-6g1 lists address extension
mode 0x3 as reserved one. And also Table Table 9-13 does not specify
0x3 as valid address field.
Fixes: 9b395bc3be ("mac80211: verify that skb data is present")
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
For most cases a protection exception in the host (e.g. copy
on write or dirty tracking) on the sie instruction will indicate
an instruction length of 4. Turns out that there are some corner
cases (e.g. runtime instrumentation) where this is not necessarily
true and the ILC is unpredictable.
Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for
all possible ILCs.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
It is wrong to iounmap resources in the normal path of davinci_pm_init()
The 3 ioremap'ed fields of 'pm_config' can be accessed later on in other
functions, so we should return 'success' instead of unrolling everything.
Fixes: aa9aa1ec2d ("ARM: davinci: PM: rework init, remove platform device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[nsekhar@ti.com: commit message and minor style fixes]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
The PM functions used in this driver are the ones defined in
sounc/soc/soc-core.c.
When suspending (using snd_soc_suspend), the regcache is marked dirty
but is never synced on resume.
Sync regcache on resume of Atmel ClassD device.
Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Current SSI uses PDTA bit which indicates data that Input/Output
data are Right-Aligned. But, 24bit sound should be Left-Aligned
in this HW. Because Linux is using Right-Aligned data, and HW uses
Left-Aligned data, current 24bit data is missing lower 8bit.
To fix this issue, this patch removes PDTA bit, and shift 8bit
in necessary module
Reported-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Added code to support Cisco MDS loopback diagnostic. The diagnostics run
various loopbacks including one which loops-back frame through the
driver.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Code review of NVMEI's FC_PORT_ROLE_NVME_DISCOVERY looked wrong.
Discussions with storage architecture team clarified NVMEI's audit of
the PRLI response port roles. Following up discussion with code review
showed a few minor corrections were required - especially in
anticipation of NVME auto discovery.
During PRLI, NVMEI should sent prli_init - which it it does. NVMET
should send prli_tgt and prli_disc - which it does. When NVMEI receives
a PRLI Response now, it audits the incoming target bits and stores the
attributes in the corresponding NDLP. Later, when NVMEI registers the
NVME rport, it uses the stored ndlp attributes to set the rport
port_roles correctly.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Too many work items being processed in IRQ context take a lot of CPU
time and cause problems.
With a recent change, we get out of the ISR after hitting entry_repost
work items on a queue. However, the actual values for entry repost are
still high. EQ is 128 and CQ is 128, this could translate into
processing 128 * 128 (16384) work items under IRQ context.
Set entry_repost in the actual queue creation routine now. Limit EQ
repost to 8 and CQ repost to 64 to further limit the amount of time
spent in the IRQ.
Fix fof IRQ routines as well.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When unloading and reloading the driver, the driver fails to recreate
the lpfc root inode in the debugfs tree.
The driver is incorrectly removing the lpfc root inode in
lpfc_debugfs_terminate in the first driver instance that unloads and
then sets the lpfc_debugfs_root global parameter to NULL. When the
final driver instance unloads, the debugfs calls quietly ignore the
remove on a NULL pointer. The bug is that the debugfs_remove call
returns void so the driver doesn't know to correctly set the global
parameter to NULL.
Base the debugfs_remove of the lpfc_debugfs_root parameter on
lpfc_debugfs_hba_count because this parameter tracks the fnX instance
tracked per driver instance.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the driver send the RPA command, it does not send supported FC4
Type NVME to the management server.
Encode NVME (type x28) in the AttribEntry in the RPA command.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Previous logic would just drop the IO.
Added logic to queue the IO to wait for an IO context resource from an
IO thats already in progress.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently IO resources are mapped 1 to 1 with RQ buffers posted
Added logic to separate RQE buffers from IO op resources
(sgl/iocbq/context). During initialization, the driver will determine
how many SGLs it will allocate for NVMET (based on what the firmware
reports) and associate a NVMET IOCBq and NVMET context structure with
each one.
Now that hdr/data buffers are immediately reposted back to the RQ, 512
RQEs for each MRQ is sufficient. Also, since NVMET data buffers are now
128 bytes, lpfc_nvmet_mrq_post is not necessary anymore as we will
always post the max (512) buffers per NVMET MRQ.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
After running IOPS test for 30 second we get kernel:NMI watchdog:
Watchdog detected hard LOCKUP on cpu 0
The driver is speend too much time in its ISR.
In ISR EQ and CQ processing routines, if we hit the entry_repost numbers
of EQE/CQEs just break out of the routine as opposed to hitting the
doorbell with NOARM and continue processing.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During driver boot, a latency in the NVMET driver side causes the
incoming NVMEI PRLI to get rejected by the NVMET driver. When this
happens, the NVMEI driver runs out of PRLI retries. Bouncing the link
does not fix the situation.
If the NVMEI driver decides, on PRLI completion failures, to retry the
PRLI, always decrement the fc4_prli_sent counter. This allows the PRLI
completion to resolve to UNMAPPED when NVMET rejects the PRLI.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Large block writes to the nvme target were failing because the default
number of RQs posted was insufficient.
Expand the NVMET RQs to 2048 RQEs and ensure a minimum of 512 RQEs are
posted, no matter how many MRQs are configured.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
With 255 vports created a link trasition can casue a crash.
When going through discovery after a link bounce the driver is using
rpis before the cmd FCOE_POST_HDR_TEMPLATES completes. By doing that the
next rpi bumps the rpi range out of the boundary.
The fix it to increment the next_rpi only when the
FCOE_POST_HDR_TEMPLATE succeeds.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Previous assignment was causing the use of the uninitialized variable
_explan_ inside fc_seq_ls_rjt() function, which in this particular case
is being called by fc_seq_els_rsp_send().
[mkp: fixed typo]
Addresses-Coverity-ID: 1398125
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some external hard drives don't support the sync command even though the
hard drive has write cache enabled. In this case, upon suspend request,
sync cache failures are ignored if the error code in the sense header is
ILLEGAL_REQUEST. There's not much we can do for these drives, so we
shouldn't fail to suspend for this error case. The drive may stay
powered if that's the setup for the port it's plugged into.
Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This reverts commit 51f5677777 "nfsd: check for oversized NFSv2/v3
arguments", which breaks support for NFSv3 ACLs.
That patch was actually an earlier draft of a fix for the problem that
was eventually fixed by e6838a29ec "nfsd: check for oversized NFSv2/v3
arguments". But somehow I accidentally left this earlier draft in the
branch that was part of my 2.12 pull request.
Reported-by: Eryu Guan <eguan@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
There were a number of handwaving complaints that one could "possibly"
use inode numbers and extent maps to fingerprint a filesystem hosting
multiple containers and somehow use the information to guess at the
contents of other containers and attack them. Despite the total lack of
any demonstration that this is actually possible, it's easier to
restrict access now and broaden it later, so use the rmapbt fsmap
backends only if the caller has CAP_SYS_ADMIN. Unprivileged users will
just have to make do with only getting the free space and static
metadata placement information.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
By run fsstress long enough time enough in RHEL-7, I find an
assertion failure (harder to reproduce on linux-4.11, but problem
is still there):
XFS: Assertion failed: (iflags & BMV_IF_DELALLOC) != 0, file: fs/xfs/xfs_bmap_util.c
The assertion is in xfs_getbmap() funciton:
if (map[i].br_startblock == DELAYSTARTBLOCK &&
--> map[i].br_startoff <= XFS_B_TO_FSB(mp, XFS_ISIZE(ip)))
ASSERT((iflags & BMV_IF_DELALLOC) != 0);
When map[i].br_startoff == XFS_B_TO_FSB(mp, XFS_ISIZE(ip)), the
startoff is just at EOF. But we only need to make sure delalloc
extents that are within EOF, not include EOF.
Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reduce stack usage and get rid of compiler warnings by eliminating
unused variables.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
When we're fulfilling a BMAPX request, jump out early if the data fork
is in local format. This prevents us from hitting a debugging check in
bmapi_read and barfing errors back to userspace. The on-disk extent
count check later isn't sufficient for IF_DELALLOC mode because da
extents are in memory and not on disk.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The delalloc -> real block conversion path uses an incorrect
calculation in the case where the middle part of a delalloc extent
is being converted. This is documented as a rare situation because
XFS generally attempts to maximize contiguity by converting as much
of a delalloc extent as possible.
If this situation does occur, the indlen reservation for the two new
delalloc extents left behind by the conversion of the middle range
is calculated and compared with the original reservation. If more
blocks are required, the delta is allocated from the global block
pool. This delta value can be characterized as the difference
between the new total requirement (temp + temp2) and the currently
available reservation minus those blocks that have already been
allocated (startblockval(PREV.br_startblock) - allocated).
The problem is that the current code does not account for previously
allocated blocks correctly. It subtracts the current allocation
count from the (new - old) delta rather than the old indlen
reservation. This means that more indlen blocks than have been
allocated end up stashed in the remaining extents and free space
accounting is broken as a result.
Fix up the calculation to subtract the allocated block count from
the original extent indlen and thus correctly allocate the
reservation delta based on the difference between the new total
requirement and the unused blocks from the original reservation.
Also remove a bogus assert that contradicts the fact that the new
indlen reservation can be larger than the original indlen
reservation.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
When platform_get_irq() fails, it returns an error code, which
libahci_platform and replaces it by -EINVAL. This commit fixes that by
propagating the error code. It fixes the situation where
platform_get_irq() returns -EPROBE_DEFER because the interrupt
controller is not available yet, and generally looks like the right
thing to do.
We pay attention to not show the "no irq" message when we are in an
EPROBE_DEFER situation, because the driver probing will be retried
later on, once the interrupt controller becomes available to provide
the interrupt.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Here, Clock enable can failed. So adding an error check for
clk_prepare_enable.
tj: minor style updates
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
(Correction in this resend: fixed function name acer_sa5_271_workaround; fixed
the always-true condition in the function; fixed description.)
On the Acer Switch Alpha 12 (model number: SA5-271), the internal SSD may not
get detected because the port_map and CAP.nr_ports combination causes the driver
to skip the port that is actually connected to the SSD. More specifically,
either all SATA ports are identified as DUMMY, or all ports get ``link down''
and never get up again.
This problem occurs occasionally. When this problem occurs, CAP may hold a
value of 0xC734FF00 or 0xC734FF01 and port_map may hold a value of 0x00 or 0x01.
When this problem does not occur, CAP holds a value of 0xC734FF02 and port_map
may hold a value of 0x07. Overriding the CAP value to 0xC734FF02 and port_map to
0x7 significantly reduces the occurrence of this problem.
Link: https://bugzilla.kernel.org/attachment.cgi?id=253091
Signed-off-by: Sui Chen <suichen6@gmail.com>
Tested-by: Damian Ivanov <damianatorrpm@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
The setting of return code ret should be based on the error code
passed into function end_extent_writepage and not on ret. Thanks
to Liu Bo for spotting this mistake in the original fix I submitted.
Detected by CoverityScan, CID#1414312 ("Logically dead code")
Fixes: 5dca6eea91 ("Btrfs: mark mapping with error flag to report errors to userspace")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
CC: David Sterba <dsterba@suse.com>
CC: linux-btrfs@vger.kernel.org
Fixes: b685d3d65a
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Cycle mount btrfs can cause fiemap to return different result.
Like:
# mount /dev/vdb5 /mnt/btrfs
# dd if=/dev/zero bs=16K count=4 oflag=dsync of=/mnt/btrfs/file
# xfs_io -c "fiemap -v" /mnt/btrfs/file
/mnt/test/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 25088..25215 128 0x1
# umount /mnt/btrfs
# mount /dev/vdb5 /mnt/btrfs
# xfs_io -c "fiemap -v" /mnt/btrfs/file
/mnt/test/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..31]: 25088..25119 32 0x0
1: [32..63]: 25120..25151 32 0x0
2: [64..95]: 25152..25183 32 0x0
3: [96..127]: 25184..25215 32 0x1
But after above fiemap, we get correct merged result if we call fiemap
again.
# xfs_io -c "fiemap -v" /mnt/btrfs/file
/mnt/test/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 25088..25215 128 0x1
[REASON]
Btrfs will try to merge extent map when inserting new extent map.
btrfs_fiemap(start=0 len=(u64)-1)
|- extent_fiemap(start=0 len=(u64)-1)
|- get_extent_skip_holes(start=0 len=64k)
| |- btrfs_get_extent_fiemap(start=0 len=64k)
| |- btrfs_get_extent(start=0 len=64k)
| | Found on-disk (ino, EXTENT_DATA, 0)
| |- add_extent_mapping()
| |- Return (em->start=0, len=16k)
|
|- fiemap_fill_next_extent(logic=0 phys=X len=16k)
|
|- get_extent_skip_holes(start=0 len=64k)
| |- btrfs_get_extent_fiemap(start=0 len=64k)
| |- btrfs_get_extent(start=16k len=48k)
| | Found on-disk (ino, EXTENT_DATA, 16k)
| |- add_extent_mapping()
| | |- try_merge_map()
| | Merge with previous em start=0 len=16k
| | resulting em start=0 len=32k
| |- Return (em->start=0, len=32K) << Merged result
|- Stripe off the unrelated range (0~16K) of return em
|- fiemap_fill_next_extent(logic=16K phys=X+16K len=16K)
^^^ Causing split fiemap extent.
And since in add_extent_mapping(), em is already merged, in next
fiemap() call, we will get merged result.
[FIX]
Here we introduce a new structure, fiemap_cache, which records previous
fiemap extent.
And will always try to merge current fiemap_cache result before calling
fiemap_fill_next_extent().
Only when we failed to merge current fiemap extent with cached one, we
will call fiemap_fill_next_extent() to submit cached one.
So by this method, we can merge all fiemap extents.
It can also be done in fs/ioctl.c, however the problem is if
fieinfo->fi_extents_max == 0, we have no space to cache previous fiemap
extent.
So I choose to merge it in btrfs.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With CONFIG_RESET_CONTROLLER=n we see the following link error in the
meson gxbb clk driver:
drivers/built-in.o: In function 'gxbb_aoclkc_probe':
drivers/clk/meson/gxbb-aoclk.c:161: undefined reference to 'devm_reset_controller_register'
Fix this by selecting the reset controller subsystem.
Fixes: f8c11f7991 ("clk: meson: Add GXBB AO Clock and Reset controller driver")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
[narmstrong: Added fixes-by tag]
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
The info->target comes from userspace and it would be used directly.
So we need to add the sanity check to make sure it is a valid standard
target, although the ebtables tool has already checked it. Kernel needs
to validate anything coming from userspace.
If the target is set as an evil value, it would break the ebtables
and cause a panic. Because the non-standard target is treated as one
offset.
Now add one helper function ebt_invalid_target, and we would replace
the macro INVALID_TARGET later.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ALC299 has no loopback mixer, but the driver still tries to add a beep
control over the mixer NID which leads to the error at accessing it.
This patch fixes it by properly declaring mixer_nid=0 for this codec.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195775
Fixes: 28f1f9b26c ("ALSA: hda/realtek - Add new codec ID ALC299")
Cc: stable@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
During v4.3 when the overflow/underflow check was relaxed by
commit c72c525022:
commit c72c525022
Author: Roland Dreier <roland@purestorage.com>
Date: Wed Jul 22 15:08:18 2015 -0700
target: allow underflow/overflow for PR OUT etc. commands
to allow underflow/overflow for Windows compliance + FCP, a
consequence was to allow control CDBs to process overflow
data for iscsi-target with immediate data as well.
As per Roland's original change, continue to allow underflow
cases for control CDBs to make Windows compliance + FCP happy,
but until overflow for control CDBs is supported tree-wide,
explicitly reject all control WRITEs with overflow following
pre v4.3.y logic.
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The current code is not correctly calculating the req_lim_delta.
We want to make sure vscsi->credit is always incremented when
we do not send a response for the scsi op. Thus for the case where
there is a successfully aborted task we need to make sure the
vscsi->credit is incremented.
v2 - Moves the original location of the vscsi->credit increment
to a better spot. Since if we increment credit, the next command
we send back will have increased req_lim_delta. But we probably
shouldn't be doing that until the aborted cmd is actually released.
Otherwise the client will think that it can send a new command, and
we could find ourselves short of command elements. Not likely, but could
happen.
This patch depends on both:
commit 25e7853126 ("ibmvscsis: Do not send aborted task response")
commit 98883f1b54 ("ibmvscsis: Clear left-over abort_cmd pointers")
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-by: Michael Cyr <mikecyr@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
With the addition of ibmvscsis->abort_cmd pointer within
commit 25e7853126 ("ibmvscsis: Do not send aborted task response"),
make sure to explicitly NULL these pointers when clearing
DELAY_SEND flag.
Do this for two cases, when getting the new new ibmvscsis
descriptor in ibmvscsis_get_free_cmd() and before posting
the response completion in ibmvscsis_send_messages().
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-by: Michael Cyr <mikecyr@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The Raspberry Pi startup stub files for multi-core BCM283X processors
make the secondary CPUs spin until the corresponding mailbox is
written. These stubs are loaded at physical address 0x00000xxx (as seen
by the ARMs), but this page will be reused by the kernel unless it is
explicitly reserved, causing the waiting cores to execute random code.
Use the /memreserve/ Device Tree directive to mark the first page as
off-limits to the kernel.
See: https://github.com/raspberrypi/linux/issues/1989
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Andreas reports that the following incremental update using our commit
protocol doesn't work.
# nft -f incremental-update.nft
delete element ip filter client_to_any { 10.180.86.22 : goto CIn_1 }
delete chain ip filter CIn_1
... Error: Could not process rule: Device or resource busy
The existing code is not well-integrated into the commit phase protocol,
since element deletions do not result in refcount decrement from the
preparation phase. This results in bogus EBUSY errors like the one
above.
Two new functions come with this patch:
* nft_set_elem_activate() function is used from the abort path, to
restore the set element refcounting on objects that occurred from
the preparation phase.
* nft_set_elem_deactivate() that is called from nft_del_setelem() to
decrement set element refcounting on objects from the preparation
phase in the commit protocol.
The nft_data_uninit() has been renamed to nft_data_release() since this
function does not uninitialize any data store in the data register,
instead just releases the references to objects. Moreover, a new
function nft_data_hold() has been introduced to be used from
nft_set_elem_activate().
Reported-by: Andreas Schultz <aschultz@tpip.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Do not assume userspace always sends us NFT_DATA_VALUE for bitwise and
cmp expressions. Although NFT_DATA_VERDICT does not make any sense, it
is still possible to handcraft a netlink message using this incorrect
data type.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When dumping the elements related to a specified set, we may invoke the
nf_tables_dump_set with the NFNL_SUBSYS_NFTABLES lock not acquired. So
we should use the proper rcu operation to avoid race condition, just
like other nft dump operations.
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch fixes the creation of connection tracking entry from
netlink when synproxy is used. It was missing the addition of
the synproxy extension.
This was causing kernel crashes when a conntrack entry created by
conntrackd was used after the switch of traffic from active node
to the passive node.
Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When looking up an iptables rule, the iptables binary compares the
aligned match and target data (XT_ALIGN). In some cases this can
exceed the actual data size to include padding bytes.
Before commit f77bc5b23f ("iptables: use match, target and data
copy_to_user helpers") the malloc()ed bytes were overwritten by the
kernel with kzalloced contents, zeroing the padding and making the
comparison succeed. After this patch, the kernel copies and clears
only data, leaving the padding bytes undefined.
Extend the clear operation from data size to aligned data size to
include the padding bytes, if any.
Padding bytes can be observed in both match and target, and the bug
triggered, by issuing a rule with match icmp and target ACCEPT:
iptables -t mangle -A INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT
iptables -t mangle -D INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT
Fixes: f77bc5b23f ("iptables: use match, target and data copy_to_user helpers")
Reported-by: Paul Moore <pmoore@redhat.com>
Reported-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Simon Horman says:
====================
IPVS Fixes for v4.12
please consider this fix to IPVS for v4.12.
* It is a fix from Julian Anastasov to only SNAT SNAT packet replies only for
NATed connections
My understanding is that this fix is appropriate for 4.9.25, 4.10.13, 4.11
as well as the nf tree. Julian has separately posted backports for other
-stable kernels; please see:
* [PATCH 3.2.88,3.4.113 -stable 1/3] ipvs: SNAT packet replies only for
NATed connections
* [PATCH 3.10.105,3.12.73,3.16.43,4.1.39 -stable 2/3] ipvs: SNAT packet
replies only for NATed connections
* [PATCH 4.4.65 -stable 3/3] ipvs: SNAT packet replies only for NATed
connections
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We can still delete the ct helper even if it is in use, this will cause
a use-after-free error. In more detail, I mean:
# nfct helper add ssdp inet udp
# iptables -t raw -A OUTPUT -p udp -j CT --helper ssdp
# nfct helper delete ssdp //--> oops, succeed!
BUG: unable to handle kernel paging request at 000026ca
IP: 0x26ca
[...]
Call Trace:
? ipv4_helper+0x62/0x80 [nf_conntrack_ipv4]
nf_hook_slow+0x21/0xb0
ip_output+0xe9/0x100
? ip_fragment.constprop.54+0xc0/0xc0
ip_local_out+0x33/0x40
ip_send_skb+0x16/0x80
udp_send_skb+0x84/0x240
udp_sendmsg+0x35d/0xa50
So add reference count to fix this issue, if ct helper is used by
others, reject the delete request.
Apply this patch:
# nfct helper delete ssdp
nfct v1.4.3: netlink error: Device or resource busy
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
And convert module_put invocation to nf_conntrack_helper_put, this is
prepared for the followup patch, which will add a refcnt for cthelper,
so we can reject the deleting request when cthelper is in use.
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We cannot setup nat info if the ct has been confirmed already, else,
different cpu may race to handle the same ct. In extreme situation,
we may hit the "BUG_ON(nf_nat_initialized(ct, maniptype))" in the
nf_nat_setup_info.
Also running the following commands will easily hit NF_CT_ASSERT in
nf_conntrack_alter_reply:
# nft flush ruleset
# ping -c 2 -W 1 1.1.1.111 &
# nft add table t
# nft add chain t c {type nat hook postrouting priority 0 \;}
# nft add rule t c snat to 4.5.6.7
WARNING: CPU: 1 PID: 10065 at net/netfilter/nf_conntrack_core.c:1472
nf_conntrack_alter_reply+0x9a/0x1a0 [nf_conntrack]
[...]
Call Trace:
nf_nat_setup_info+0xad/0x840 [nf_nat]
? deactivate_slab+0x65d/0x6c0
nft_nat_eval+0xcd/0x100 [nft_nat]
nft_do_chain+0xff/0x5d0 [nf_tables]
? mark_held_locks+0x6f/0xa0
? __local_bh_enable_ip+0x70/0xa0
? trace_hardirqs_on_caller+0x11f/0x190
? ipt_do_table+0x310/0x610
? trace_hardirqs_on+0xd/0x10
? __local_bh_enable_ip+0x70/0xa0
? ipt_do_table+0x32b/0x610
? __lock_acquire+0x2ac/0x1580
? ipt_do_table+0x32b/0x610
nft_nat_do_chain+0x65/0x80 [nft_chain_nat_ipv4]
nf_nat_ipv4_fn+0x1ae/0x240 [nf_nat_ipv4]
nf_nat_ipv4_out+0x4a/0xf0 [nf_nat_ipv4]
nft_nat_ipv4_out+0x15/0x20 [nft_chain_nat_ipv4]
nf_hook_slow+0x2c/0xf0
ip_output+0x154/0x270
So for the confirmed ct, just ignore it and return NF_ACCEPT.
Fixes: 9a08ecfe74 ("netfilter: don't attach a nat extension by default")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Not all parameters passed to ctnetlink_parse_tuple() and
ctnetlink_exp_dump_tuple() match the enum type in the signatures of these
functions. Since this is intended change the argument type of to be an
unsigned integer value.
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We get a harmless warning without CONFIG_PM:
drivers/memory/atmel-ebi.c:584:12: error: 'atmel_ebi_resume' defined but not used [-Werror=unused-function]
Marking the function as __maybe_unused does the right thing here
and drops it silently when unused.
Fixes: a483fb10e5ea ("memory: atmel-ebi: Add PM ops")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Add missing endianness conversion when using the USB device-descriptor
bcdDevice field when applying the Amanero Combo384 (endianness!) quirk.
Fixes: 3eff682d76 ("ALSA: usb-audio: Support both DSD LE/BE Amanero firmware versions")
Cc: Jussi Laako <jussi@sonarnerd.net>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
We get a link error when EXPORTFS is not enabled:
ERROR: "exportfs_encode_fh" [fs/overlayfs/overlay.ko] undefined!
ERROR: "exportfs_decode_fh" [fs/overlayfs/overlay.ko] undefined!
This adds a Kconfig 'select' statement for overlayfs, the same way that
it is done for the other users of exportfs.
Fixes: 3a1e819b4e ("ovl: store file handle of lower inode on copy up")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The power and current "shunt-resistor" attribute's 'show' function
displays the resistor value in milli-Ohms, while the ABI description
specifies it should be displayed in Ohms. Fix it.
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
We should be allocating enough information for a tiadc_device struct
which is about 400 bytes but instead we allocate enough for a second
iio_dev struct which is over 2000 bytes.
Fixes: fea89e2dfc ("iio: adc: ti_am335x_adc: use variable names for sizeof() operator")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The current implementation of interrupt coalescing doesn't work, because
it doesn't configure the coalescing timer, which is needed to make sure
we get an interrupt at some point.
As a fix for stable, we simply remove the interrupt coalescing
functionality. It will be re-introduced properly in a future commit.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The mv_xor_v2_tx_submit() gets the next available HW descriptor by
calling mv_xor_v2_get_desq_write_ptr(), which reads a HW register
telling the next available HW descriptor. This was working fine when HW
descriptors were issued for processing directly in tx_submit().
However, as part of the review process of the driver, a change was
requested to move the actual kick-off of HW descriptors processing to
->issue_pending(). Due to this, reading the HW register to know the next
available HW descriptor no longer works.
So instead of using this HW register, we implemented a software index
pointing to the next available HW descriptor.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The engine was enabled prior to its configuration, which isn't
correct. This patch relocates the activation of the XOR engine, to be
after the configuration of the XOR engine.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hanna Hawa <hannah@marvell.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Descriptors that have not been acknowledged by the async_tx layer
should not be re-used, so this commit adjusts the implementation of
mv_xor_v2_prep_sw_desc() to skip descriptors for which
async_tx_test_ack() is false.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
mv_xor_v2_tasklet() is looping over completed HW descriptors. Before the
loop, it initializes 'next_pending_hw_desc' to the first HW descriptor
to handle, and then the loop simply increments this point, without
taking care of wrapping when we reach the last HW descriptor. The
'pending_ptr' index was being wrapped back to 0 at the end, but it
wasn't used in each iteration of the loop to calculate
next_pending_hw_desc.
This commit fixes that, and makes next_pending_hw_desc a variable local
to the loop itself.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The mv_xor_v2_prep_sw_desc() is called from a few different places in
the driver, but we never take into account the fact that it might
return NULL. This commit fixes that, ensuring that we don't panic if
there are no more descriptors available.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Moving the cooling code into the cpufreq driver caused a possible build failure
when the cpu_thermal helper code is a loadable module or disabled:
drivers/cpufreq/dbx500-cpufreq.o: In function `dbx500_cpufreq_ready':
dbx500-cpufreq.c:(.text.dbx500_cpufreq_ready+0x4): undefined reference to `cpufreq_cooling_register'
This adds the same dependency that we have in other cpufreq drivers,
forcing the driver to be disabled when we can't possibly link it.
Fixes: 19678ffb9f (cpufreq: dbx500: Manage cooling device from cpufreq driver)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the current code we accidentally return the successful result from
idr_alloc() instead of a negative error pointer. The caller is looking
for an error pointer and so it treats the returned value as a valid
pointer.
This one might be a bit serious because if it lets people get around the
kernel's protection for remapping NULL. I'm not sure.
Fixes: 75d2364ea0 (PowerCap: Add class driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the SRM lock check section of code the '&' bitwise operator is
used as part of checking lock status. Functionally the code works
as intended, but the conditional statement is a boolean comparison
so should really use '&&' logical operator instead. This commit
rectifies this discrepancy.
Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Thinkpad Helix 2 is a tablet PC, the audio is powered by Core M
broadwell-audio and rt286 codec. For all versions of Linux kernel,
the stereo output doesn't work properly when earphones are plugged
in, the sound was coming out from both channels even if the audio
contains only the left or right channel. Furthermore, if a music
recorded in stereo is played, the two channels cancle out each other
out, as a result, no voice but only distorted background music can be
heard, like a sound card with builtin a Karaoke sount effect.
Apparently this tablet uses a combo jack with polarity incorrectly
set by rt286 driver. This patch adds DMI information of Thinkpad Helix 2
to force_combo_jack_table[] and the issue is resolved. The microphone
input doesn't work regardless to the presence of this patch and still
needs help from other developers to investigate.
This is my first patch to LKML directly, sorry for CC-ing too many
people here.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=93841
Signed-off-by: Yifeng Li <tomli@tomli.me>
Signed-off-by: Mark Brown <broonie@kernel.org>
The i915 component framework expects the caller to be invoking
snd_hdac_i915_init() from a thread context. Otherwise it results in
lockups on drm side.
So move the registering of component interface and probing of codecs on
this bus to a worker thread.
init_failed in skl structure is not used currently, so renamed to
init_done and used to track the initialization done in worker thread.
Reported-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sodhi, VunnyX <vunnyx.sodhi@intel.com>
Signed-off-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The R_CCU of H3/H5 currently wrongly used A64 R_CCU compatible.
Fix it by changing it to the correct H3 compatible.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
The register offset for the lcd1-ch1 clock was incorrectly pointing to
the lcd0-ch1 clock. This resulted in the lcd0-ch1 clock being disabled
when the clk core disables unused clocks. This then stops the simplefb
HDMI output path.
Reported-by: Bob Ham <rah@settrans.net>
Fixes: c6e6c96d8f ("clk: sunxi-ng: Add A31/A31s clocks")
Cc: stable@vger.kernel.org # 4.9.x-
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Commit eed4d47efe (ACPI / sleep: Ignore spurious SCI wakeups from
suspend-to-idle) modified the core suspend-to-idle code to filter
out spurious SCI interrupts received while suspended, which requires
ACPI event source handlers to report wakeup events in a way that
will trigger a wakeup from suspend to idle (or abort system suspends
in progress, which is equivalent).
That needs to be done in the rtc-cmos driver too, which was overlooked
by the above commit, so do that now.
Fixes: eed4d47efe (ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle)
Reported-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 8a537ece3d (PM / wakeup: Integrate mechanism to abort
transitions in progress) modified wakeup_source_report_event()
and wakeup_source_activate() to make it possible to call
pm_system_wakeup() from the latter if so indicated by the
caller of the former (via a new function argument added by that
commit), but it overlooked the fact that in some situations
wakeup_source_report_event() is called to signal a "hard" event
(ie. such that should abort a system suspend in progress) after
pm_stay_awake() has been called for the same wakeup source object,
in which case the pm_system_wakeup() will not trigger.
To work around this issue, modify wakeup_source_activate() and
wakeup_source_report_event() again so that pm_system_wakeup() is
called by the latter directly (if its last argument is true), in
which case the additional argument does not need to be passed
to wakeup_source_activate() any more, so drop it from there.
Fixes: 8a537ece3d (PM / wakeup: Integrate mechanism to abort transitions in progress)
Reported-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a document describing the current behavior and user space
interface of the intel_pstate driver in the RST format and
drop the existing outdated intel_pstate.txt document.
Also update admin-guide/pm/cpufreq.rst with proper RST references
to the new intel_pstate.rst document.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit ecb10b694b.
The only expected ACPI control method lid device's usage model is
1. Listen to the lid notification,
2. Evaluate _LID after being notified by BIOS,
3. Suspend the system (if users configure to do so) after seeing "close".
It's not ensured that BIOS will notify OS after boot/resume, and
it's not ensured that BIOS will always generate "open" event upon
opening the lid.
But there are 2 wrong usage models:
1. When the lid device is responsible for suspend/resume the system,
userspace requires to see "open" event to be paired with "close" after
the system is resumed, or it will suspend the system again.
2. When an external monitor connects to the laptop attached docks,
userspace requires to see "close" event after the system is resumed so
that it can determine whether the internal display should remain dark
and the external display should be lit on.
After we made default kernel behavior to be suitable for usage model 1,
users of usage model 2 start to report regressions for such behavior
change.
Reversion of button.lid_init_state=method doesn't actually reverts to old
default behavior as doing so can enter a regression loop, but facilitates
users to work the reported regressions around with
button.lid_init_state=method.
Fixes: ecb10b694b (ACPI / button: Remove lid_init_state=method mode)
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195455
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1430259
Tested-by: Steffen Weber <steffen.weber@gmail.com>
Tested-by: Julian Wiedmann <julian.wiedmann@jwi.name>
Reported-by: Joachim Frieben <jfrieben@hotmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a .gitignore file so that git commands do not pick up the resulting
binaries and directories.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the list search in sg_get_rq_mark() fails to find a valid request, we
return a bogus element. This then can later lead to a GPF in
sg_remove_scat().
So don't return bogus Sg_requests in sg_get_rq_mark() but NULL in case
the list search doesn't find a valid request.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Doug Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Doug Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For a zoned block device, sd_zbc_complete() handles zone write unlock on
completion of a REQ_OP_WRITE_ZEROES command but the zone write locking
is missing from sd_setup_write_zeroes_cmnd(). This patch fixes this
problem by locking the target zone of a REQ_OP_WRITE_ZEROES request.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi_io_init() may fail, leaving a zone of a zoned block device locked.
Fix this by properly unlocking the write same request target zone if
scsi_io_init() fails.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The ELECOM DEFT trackballs report only five buttons, when the device
actually has 8. Change the descriptor so that the HID driver can see all of
them.
For completeness and future reference, I included a side-by-side diff of
the part of the descriptor that is being edited.
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: Diego Elio Pettenò <flameeyes@flameeyes.eu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "kmalloc_array".
This issue was detected by using the Coccinelle software.
* Replace the specification of a data type by a pointer dereference
to make the corresponding size determination a bit safer according to
the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
A string which did not contain a data format specification should be put
into a sequence. Thus use the corresponding function "seq_puts".
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
We do not check if packet from real server is for NAT
connection before performing SNAT. This causes problems
for setups that use DR/TUN and allow local clients to
access the real server directly, for example:
- local client in director creates IPVS-DR/TUN connection
CIP->VIP and the request packets are routed to RIP.
Talks are finished but IPVS connection is not expired yet.
- second local client creates non-IPVS connection CIP->RIP
with same reply tuple RIP->CIP and when replies are received
on LOCAL_IN we wrongly assign them for the first client
connection because RIP->CIP matches the reply direction.
As result, IPVS SNATs replies for non-IPVS connections.
The problem is more visible to local UDP clients but in rare
cases it can happen also for TCP or remote clients when the
real server sends the reply traffic via the director.
So, better to be more precise for the reply traffic.
As replies are not expected for DR/TUN connections, better
to not touch them.
Reported-by: Nick Moriarty <nick.moriarty@york.ac.uk>
Tested-by: Nick Moriarty <nick.moriarty@york.ac.uk>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Upon NETDEV_DOWN event, all xfrm_state objects which are bound to
the device are flushed.
The condition for this is wrong, though, testing dev->hw_features
instead of dev->features. If a device has non-user-modifiable
NETIF_F_HW_ESP, then its xfrm_state objects are not flushed,
causing a crash later on after the device is deleted.
Check dev->features instead of dev->hw_features.
Fixes: d77e38e612 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The sadb_x_sec_len is stored in the unit 'byte divided by eight'.
So we have to multiply this value by eight before we can do
size checks. Otherwise we may get a slab-out-of-bounds when
we memcpy the user sec_ctx.
Fixes: df71837d50 ("[LSM-IPSec]: Security association restriction.")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
I left Samsung and lost access to most Exynos hardware and documentation.
Also, I likely won't be able to keep an eye on the platform anymore in the
short term so remove myself as a reviewer for Exynos.
Signed-off-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
If the driver is built as a module, it won't be autloaded if the devices
are registered via OF code because the OF device table
entries are not exported as aliases
Before the patch:
$ modinfo drivers/iio/adc/sun4i-gpadc-iio.ko | grep alias
alias: platform:sun6i-a31-gpadc-iio
alias: platform:sun5i-a13-gpadc-iio
alias: platform:sun4i-a10-gpadc-iio
After the patch:
$ modinfo drivers/iio/adc/sun4i-gpadc-iio.ko | grep alias
alias: of:N*T*Callwinner,sun8i-a33-thsC*
alias: of:N*T*Callwinner,sun8i-a33-ths
alias: platform:sun6i-a31-gpadc-iio
alias: platform:sun5i-a13-gpadc-iio
alias: platform:sun4i-a10-gpadc-iio
Signed-off-by: Eduardo Molinas <edu.molinas@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
If the driver is built as a module, it won't be autloaded if the devices
are registered via PLATFORM code because the PLATFORM device table
entries are not exported as aliases
Before the patch:
$ modinfo drivers/iio/adc/sun4i-gpadc-iio.ko | grep alias
$
After the patch:
$ modinfo drivers/iio/adc/sun4i-gpadc-iio.ko | grep alias
alias: platform:sun6i-a31-gpadc-iio
alias: platform:sun5i-a13-gpadc-iio
alias: platform:sun4i-a10-gpadc-iio
Signed-off-by: Eduardo Molinas <edu.molinas@gmail.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Using iio_trigger_poll() can oops when multiple interrupts
happen before the first is handled.
Use iio_trigger_poll_chained() instead and use the timestamp
when processed, since it will be in theory be 2 ms max latency.
Fixes: 24ddb0e4bb ("iio: Add AS3935 lightning sensor support")
Cc: stable@vger.kernel.org
Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Currently, sugov_next_freq_shared() uses last_freq_update_time as a
reference to decide when to start considering CPU contributions as
stale.
However, since last_freq_update_time is set by the last CPU that issued
a frequency transition, this might cause problems in certain cases. In
practice, the detection of stale utilization values fails whenever the
CPU with such values was the last to update the policy. For example (and
please note again that the SCHED_CPUFREQ_RT flag is not the problem
here, but only the detection of after how much time that flag has to be
considered stale), suppose a policy with 2 CPUs:
CPU0 | CPU1
|
| RT task scheduled
| SCHED_CPUFREQ_RT is set
| CPU1->last_update = now
| freq transition to max
| last_freq_update_time = now
|
more than TICK_NSEC nsecs
|
a small CFS wakes up |
CPU0->last_update = now1 |
delta_ns(CPU0) < TICK_NSEC* |
CPU0's util is considered |
delta_ns(CPU1) = |
last_freq_update_time - |
CPU1->last_update = 0 |
< TICK_NSEC |
CPU1 is still considered |
CPU1->SCHED_CPUFREQ_RT is set |
we stay at max (until CPU1 |
exits from idle) |
* delta_ns is actually negative as now1 > last_freq_update_time
While last_freq_update_time is a sensible reference for rate limiting,
it doesn't seem to be useful for working around stale CPU states.
Fix the problem by always considering now (time) as the reference for
deciding when CPUs have stale contributions.
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The driver emits multi-touch events for Magic Trackpad as well as Magic
Mouse, but it does not set keybits that are related to multi-touch event
for Magic Mouse; so set these keybits.
The keybits that are not set cause trouble because user programs often
probe these keybits for self-configuration and thus they cannot operate
properly if the keybits are not set.
One of such troubles is that libevdev will not be able to emit correct
touch count, causing gestures library failed to do fling stop.
Signed-off-by: Che-Liang Chiou <clchiou@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The following Smatch complaint was generated in response to commit
2a6cdbd ("HID: wacom: Introduce new 'touch_input' device"):
drivers/hid/wacom_wac.c:1586 wacom_tpc_irq()
error: we previously assumed 'wacom->touch_input' could be null (see line 1577)
The 'touch_input' and 'pen_input' variables point to the 'struct input_dev'
used for relaying touch and pen events to userspace, respectively. If a
device does not have a touch interface or pen interface, the associated
input variable is NULL. The 'wacom_tpc_irq()' function is responsible for
forwarding input reports to a more-specific IRQ handler function. An
unknown report could theoretically be mistaken as e.g. a touch report
on a device which does not have a touch interface. This can be prevented
by only calling the pen/touch functions are called when the pen/touch
pointers are valid.
Fixes: 2a6cdbd ("HID: wacom: Introduce new 'touch_input' device")
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
When CONFIG_XFRM_SUB_POLICY=y, xfrm_dst stores a copy of the flowi for
that dst. Unfortunately, the code that allocates and fills this copy
doesn't care about what type of flowi (flowi, flowi4, flowi6) gets
passed. In multiple code paths (from raw_sendmsg, from TCP when
replying to a FIN, in vxlan, geneve, and gre), the flowi that gets
passed to xfrm is actually an on-stack flowi4, so we end up reading
stuff from the stack past the end of the flowi4 struct.
Since xfrm_dst->origin isn't used anywhere following commit
ca116922af ("xfrm: Eliminate "fl" and "pol" args to
xfrm_bundle_ok()."), just get rid of it. xfrm_dst->partner isn't used
either, so get rid of that too.
Fixes: 9d6ec93801 ("ipv4: Use flowi4 in public route lookup interfaces.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Locally generated TCP packets are usually cloned, so we
do skb_cow_data() on this packets. After that we need to
reload the pointer to the esp header. On udpencap this
header has an offset to skb_transport_header, so take this
offset into account.
Fixes: 67d349ed60 ("net/esp4: Fix invalid esph pointer crash")
Fixes: fca11ebde3 ("esp4: Reorganize esp_output")
Reported-by: Don Bowman <db@donbowman.ca>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
AS3935 interrupt mask has been incorrect so valid lightning events
would never trigger an buffer event. Also noise interrupt should be
BIT(0).
Fixes: 24ddb0e4bb ("iio: Add AS3935 lightning sensor support")
CC: stable@vger.kernel.org
Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Reading an ACPI table through the /sys/firmware/acpi/tables interface
more than 65,536 times leads to the following log message:
ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
(20170119/tbutils-423)
...and the table being unavailable until the next reboot. Add the
missing acpi_put_table() so the table ->validation_count is decremented
after each read.
Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
Fixes: 174cc7187e "ACPICA: Tables: Back port acpi_get_table_with_size() ..."
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to the datasheet the RCO must be recalibrated
on every power-on-reset. Also remove mutex locking in the
calibration function since callers other than the probe
function (which doesn't need it) will have a lock.
Fixes: 24ddb0e4bb ("iio: Add AS3935 lightning sensor support")
Cc: George McCollister <george.mccollister@gmail.com>
Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2017-04-26 07:06:31 +01:00
1379 changed files with 14785 additions and 8798 deletions
@@ -54,7 +54,7 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
returnieee754sp_nanxcpt(z);
caseIEEE754_CLASS_DNORM:
SPDNORMZ;
/* QNAN is handled separately below */
/* QNAN and ZERO cases are handled separately below */
}
switch(CLPAIR(xc,yc)){
@@ -203,6 +203,9 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
}
assert(rm&(SP_HIDDEN_BIT<<3));
if(zc==IEEE754_CLASS_ZERO)
returnieee754sp_format(rs,re,rm);
/* And now the addition */
assert(zm&SP_HIDDEN_BIT);
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.