Pull x86 fix from Thomas Gleixner:
"A single fix, which addresses boot failures on machines which do not
report EBDA correctly, which can place the trampoline into reserved
memory regions. Validating against E820 prevents that"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/compressed/64: Validate trampoline placement against E820
Pull timer fixes from Thomas Gleixner:
"Two oneliners addressing NOHZ failures:
- Use a bitmask to check for the pending timer softirq and not the
bit number. The existing code using the bit number checked for
the wrong bit, which caused timers to either expire late or stop
completely.
- Make the nohz evaluation on interrupt exit more robust. The
existing code did not re-arm the hardware when interrupting a
running softirq in task context (ksoftirqd or tail of
local_bh_enable()), which caused timers to either expire late
or stop completely"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
nohz: Fix missing tick reprogram when interrupting an inline softirq
nohz: Fix local_timer_softirq_pending()
Pull perf fixes from Thomas Gleixner:
"A set of fixes for perf:
Kernel side:
- Fix the hardcoded index of extra PCI devices on Broadwell which
caused a resource conflict and triggered warnings on CPU hotplug.
Tooling:
- Update the tools copy of several files, including perf_event.h,
powerpc's asm/unistd.h (new io_pgetevents syscall), bpf.h and x86's
memcpy_64.s (used in 'perf bench mem'), silencing the respective
warnings during the perf tools build.
- Fix the build on the alpine:edge distro"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Fix hardcoded index of Broadwell extra PCI devices
perf tools: Fix the build on the alpine:edge distro
tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy'
tools headers uapi: Refresh linux/bpf.h copy
tools headers powerpc: Update asm/unistd.h copy to pick new
tools headers uapi: Update tools's copy of linux/perf_event.h
Pull irq fix from Thomas Gleixner:
"A single bugfix for the irq core to prevent silent data corruption and
malfunction of threaded interrupts under certain conditions"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Make force irq threading setup more robust
Pull networking fixes from David Miller:
1) Handle frames in error situations properly in AF_XDP, from Jakub
Kicinski.
2) tcp_mmap test case only tests ipv6 due to a thinko, fix from
Maninder Singh.
3) Session refcnt fix in l2tp_ppp, from Guillaume Nault.
4) Fix regression in netlink bind handling of multicast gruops, from
Dmitry Safonov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
netlink: Don't shift on 64 for ngroups
net/smc: no cursor update send in state SMC_INIT
l2tp: fix missing refcount drop in pppol2tp_tunnel_ioctl()
mlxsw: core_acl_flex_actions: Remove redundant mirror resource destruction
mlxsw: core_acl_flex_actions: Remove redundant counter destruction
mlxsw: core_acl_flex_actions: Remove redundant resource destruction
mlxsw: core_acl_flex_actions: Return error for conflicting actions
selftests/bpf: update test_lwt_seg6local.sh according to iproute2
drivers: net: lmc: fix case value for target abort error
selftest/net: fix protocol family to work for IPv4.
net: xsk: don't return frames via the allocator on error
tools/bpftool: fix a percpu_array map dump problem
Pull usercopy whitelisting fix from Kees Cook:
"Bart Massey discovered that the usercopy whitelist for JFS was
incomplete: the inline inode data may intentionally "overflow" into
the neighboring "extended area", so the size of the whitelist needed
to be raised to include the neighboring field"
* tag 'usercopy-fix-v4.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
jfs: Fix usercopy whitelist for inline inode data
Pull xfs bugfix from Darrick Wong:
"One more patch for 4.18 to fix a coding error in the iomap_bmap()
function introduced in -rc1: fix incorrect shifting"
* tag 'xfs-4.18-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
fs: fix iomap_bmap position calculation
It turns out that commit 721c7fc701 ("block: fail op_is_write()
requests to read-only partitions"), while obviously correct, causes
problems for some older lvm2 installations.
The reason is that the lvm snapshotting will continue to write to the
snapshow COW volume, even after the volume has been marked read-only.
End result: snapshot failure.
This has actually been fixed in newer version of the lvm2 tool, but the
old tools still exist, and the breakage was reported both in the kernel
bugzilla and in the Debian bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=200439https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900442
The lvm2 fix is here
https://sourceware.org/git/?p=lvm2.git;a=commit;h=a6fdb9d9d70f51c49ad11a87ab4243344e6701a3
but until everybody has updated to recent versions, we'll have to weaken
the "never write to read-only partitions" check. It now allows the
write to happen, but causes a warning, something like this:
generic_make_request: Trying to write to read-only block-device dm-3 (partno X)
Modules linked in: nf_tables xt_cgroup xt_owner kvm_intel iwlmvm kvm irqbypass iwlwifi
CPU: 1 PID: 77 Comm: kworker/1:1 Not tainted 4.17.9-gentoo #3
Hardware name: LENOVO 20B6A019RT/20B6A019RT, BIOS GJET91WW (2.41 ) 09/21/2016
Workqueue: ksnaphd do_metadata
RIP: 0010:generic_make_request_checks+0x4ac/0x600
...
Call Trace:
generic_make_request+0x64/0x400
submit_bio+0x6c/0x140
dispatch_io+0x287/0x430
sync_io+0xc3/0x120
dm_io+0x1f8/0x220
do_metadata+0x1d/0x30
process_one_work+0x1b9/0x3e0
worker_thread+0x2b/0x3c0
kthread+0x113/0x130
ret_from_fork+0x35/0x40
Note that this is a "revert" in behavior only. I'm leaving alone the
actual code cleanups in commit 721c7fc701, but letting the previously
uncaught request go through with a warning instead of stopping it.
Fixes: 721c7fc701 ("block: fail op_is_write() requests to read-only partitions")
Reported-and-tested-by: WGH <wgh@torlan.ru>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's legal to have 64 groups for netlink_sock.
As user-supplied nladdr->nl_groups is __u32, it's possible to subscribe
only to first 32 groups.
The check for correctness of .bind() userspace supplied parameter
is done by applying mask made from ngroups shift. Which broke Android
as they have 64 groups and the shift for mask resulted in an overflow.
Fixes: 61f4b23769 ("netlink: Don't shift with UB on nlk->ngroups")
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Reported-and-Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2018-08-05
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix bpftool percpu_array dump by using correct roundup to next
multiple of 8 for the value size, from Yonghong.
2) Fix in AF_XDP's __xsk_rcv_zc() to not returning frames back to
allocator since driver will recycle frame anyway in case of an
error, from Jakub.
3) Fix up BPF test_lwt_seg6local test cases to final iproute2
syntax, from Mathieu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If a writer blocked condition is received without data, the current
consumer cursor is immediately sent. Servers could already receive this
condition in state SMC_INIT without finished tx-setup. This patch
avoids sending a consumer cursor update in this case.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bart Massey reported what turned out to be a usercopy whitelist false
positive in JFS when symlink contents exceeded 128 bytes. The inline
inode data (i_inline) is actually designed to overflow into the "extended
area" following it (i_inline_ea) when needed. So the whitelist needed to
be expanded to include both i_inline and i_inline_ea (the whole size
of which is calculated internally using IDATASIZE, 256, instead of
sizeof(i_inline), 128).
$ cd /mnt/jfs
$ touch $(perl -e 'print "B" x 250')
$ ln -s B* b
$ ls -l >/dev/null
[ 249.436410] Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'jfs_ip' (offset 616, size 250)!
Reported-by: Bart Massey <bart.massey@gmail.com>
Fixes: 8d2704d382 ("jfs: Define usercopy region in jfs_ip slab cache")
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: jfs-discussion@lists.sourceforge.net
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Pull KVM fixes from Paolo Bonzini:
"Two vmx bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: vmx: fix vpid leak
KVM: vmx: use local variable for current_vmptr when emulating VMPTRST
If 'session' is not NULL and is not a PPP pseudo-wire, then we fail to
drop the reference taken by l2tp_session_get().
Fixes: ecd012e45a ("l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel says:
====================
mlxsw: Fix ACL actions error condition handling
Nir says:
Two issues were lately noticed within mlxsw ACL actions error condition
handling. The first patch deals with conflicting actions such as:
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 flower skip_sw dst_ip 192.168.101.1 \
action goto chain 100 \
action mirred egress redirect dev swp4
The second action will never execute, however SW model allows this
configuration, while the mlxsw driver cannot allow for it as it
implements actions in sets of up to three actions per set with a single
termination marking. Conflicting actions create a contradiction over
this single marking and thus cannot be configured. The fix replaces a
misplaced warning with an error code to be returned.
Patches 2-4 fix a condition of duplicate destruction of resources. Some
actions require allocation of specific resource prior to setting the
action itself. On error condition this resource was destroyed twice,
leading to a crash when using mirror action, and to a redundant
destruction in other cases, since for error condition rule destruction
also takes care of resource destruction. In order to fix this state a
symmetry in behavior is added and resource destruction also takes care
of removing the resource from rule's resource list.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In previous patch mlxsw_afa_resource_del() was added to avoid a duplicate
resource detruction scenario.
For mirror actions, such duplicate destruction leads to a crash as in:
# tc qdisc add dev swp49 ingress
# tc filter add dev swp49 parent ffff: \
protocol ip chain 100 pref 10 \
flower skip_sw dst_ip 192.168.101.1 action drop
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 \
flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \
action mirred egress mirror dev swp4
Therefore add a call to mlxsw_afa_resource_del() in
mlxsw_afa_mirror_destroy() in order to clear that resource
from rule's resources.
Fixes: d0d13c1858 ("mlxsw: spectrum_acl: Add support for mirror action")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each tc flower rule uses a hidden count action. As counter resource may
not be available due to limited HW resources, update _counter_create()
and _counter_destroy() pair to follow previously introduced symmetric
error condition handling, add a call to mlxsw_afa_resource_del() as part
of the counter resource destruction.
Fixes: c18c1e186b ("mlxsw: core: Make counter index allocated inside the action append")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some ACL actions require the allocation of a separate resource
prior to applying the action itself. When facing an error condition
during the setup phase of the action, resource should be destroyed.
For such actions the destruction was done twice which is dangerous
and lead to a potential crash.
The destruction took place first upon error on action setup phase
and then as the rule was destroyed.
The following sequence generated a crash:
# tc qdisc add dev swp49 ingress
# tc filter add dev swp49 parent ffff: \
protocol ip chain 100 pref 10 \
flower skip_sw dst_ip 192.168.101.1 action drop
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 \
flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \
action mirred egress mirror dev swp4
Therefore add mlxsw_afa_resource_del() as a complement of
mlxsw_afa_resource_add() to add symmetry to resource_list membership
handling. Call this from mlxsw_afa_fwd_entry_ref_destroy() to make the
_fwd_entry_ref_create() and _fwd_entry_ref_destroy() pair of calls a
NOP.
Fixes: 140ce42121 ("mlxsw: core: Convert fwd_entry_ref list to be generic per-block resource list")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Spectrum switch ACL action set is built in groups of three actions
which may point to additional actions. A group holds a single record
which can be set as goto record for pointing at a following group
or can be set to mark the termination of the lookup. This is perfectly
adequate for handling a series of actions to be executed on a packet.
While the SW model allows configuration of conflicting actions
where it is clear that some actions will never execute, the mlxsw
driver must block such configurations as it creates a conflict
over the single terminate/goto record value.
For a conflicting actions configuration such as:
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 \
flower skip_sw dst_ip 192.168.101.1 \
action goto chain 100 \
action mirred egress mirror dev swp4
Where it is clear that the last action will never execute, the
mlxsw driver was issuing a warning instead of returning an error.
Therefore replace that warning with an error for this specific
case.
Fixes: 4cda7d8d70 ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull rdma fix from Jason Gunthorpe:
"One bug for missing user input validation: refuse invalid port numbers
in the modify_qp system call"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/uverbs: Expand primary and alt AV port checks
Pull block fix from Jens Axboe:
"Just a single fix, from Ming, fixing a regression in this cycle where
the busy tag iteration was changed to only calling the callback
function for requests that are started. We really want all non-free
requests.
This fixes a boot regression on certain VM setups"
* tag 'for-linus-20180803' of git://git.kernel.dk/linux-block:
blk-mq: fix blk_mq_tagset_busy_iter
Pull powerpc fixes from Michael Ellerman:
"One fix for a regression in a recent TLB flush optimisation, which
caused us to incorrectly not send TLB invalidations to coprocessors.
Thanks to Frederic Barrat, Nicholas Piggin, Vaibhav Jain"
* tag 'powerpc-4.18-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s/radix: Fix missing global invalidations when removing copro
Pull drm fixes from Dave Airlie:
"Nothing too major at this late stage:
- adv7511: reset fix
- vc4: scaling fix
- two atomic core fixes
- one legacy core error handling fix
I had a bunch of driver fixes from hdlcd but I think I'll leave them
for -next at this point"
* tag 'drm-fixes-2018-08-03' of git://anongit.freedesktop.org/drm/drm:
drm/vc4: Reset ->{x, y}_scaling[1] when dealing with uniplanar formats
drm/atomic: Initialize variables in drm_atomic_helper_async_check() to make gcc happy
drm/atomic: Check old_plane_state->crtc in drm_atomic_helper_async_check()
drm: re-enable error handling
drm/bridge: adv7511: Reset registers on hotplug
Pull crypto fix from Herbert Xu:
"Fix a memory corruption in the padlock-aes driver"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: padlock-aes - Fix Nano workaround data corruption
The full nohz tick is reprogrammed in irq_exit() only if the exit is not in
a nesting interrupt. This stands as an optimization: whether a hardirq or a
softirq is interrupted, the tick is going to be reprogrammed when necessary
at the end of the inner interrupt, with even potential new updates on the
timer queue.
When soft interrupts are interrupted, it's assumed that they are executing
on the tail of an interrupt return. In that case tick_nohz_irq_exit() is
called after softirq processing to take care of the tick reprogramming.
But the assumption is wrong: softirqs can be processed inline as well, ie:
outside of an interrupt, like in a call to local_bh_enable() or from
ksoftirqd.
Inline softirqs don't reprogram the tick once they are done, as opposed to
interrupt tail softirq processing. So if a tick interrupts an inline
softirq processing, the next timer will neither be reprogrammed from the
interrupting tick's irq_exit() nor after the interrupted softirq
processing. This situation may leave the tick unprogrammed while timers are
armed.
To fix this, simply keep reprogramming the tick even if a softirq has been
interrupted. That can be optimized further, but for now correctness is more
important.
Note that new timers enqueued in nohz_full mode after a softirq gets
interrupted will still be handled just fine through self-IPIs triggered by
the timer code.
Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: stable@vger.kernel.org # 4.14+
Link: https://lkml.kernel.org/r/1533303094-15855-1-git-send-email-frederic@kernel.org
The support of force threading interrupts which are set up with both a
primary and a threaded handler wreckaged the setup of regular requested
threaded interrupts (primary handler == NULL).
The reason is that it does not check whether the primary handler is set to
the default handler which wakes the handler thread. Instead it replaces the
thread handler with the primary handler as it would do with force threaded
interrupts which have been requested via request_irq(). So both the primary
and the thread handler become the same which then triggers the warnon that
the thread handler tries to wakeup a not configured secondary thread.
Fortunately this only happens when the driver omits the IRQF_ONESHOT flag
when requesting the threaded interrupt, which is normaly caught by the
sanity checks when force irq threading is disabled.
Fix it by skipping the force threading setup when a regular threaded
interrupt is requested. As a consequence the interrupt request which lacks
the IRQ_ONESHOT flag is rejected correctly instead of silently wreckaging
it.
Fixes: 2a1d3ab898 ("genirq: Handle force threading of irqs with primary and thread handler")
Reported-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de>
Cc: stable@vger.kernel.org
The shell file for test_lwt_seg6local contains an early iproute2 syntax
for installing a seg6local End.BPF route. iproute2 support for this
feature has recently been upstreamed, but with an additional keyword
required. This patch updates test_lwt_seg6local.sh to the definitive
iproute2 syntax
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Pull media fixes from Mauro Carvalho Chehab:
- a deadlock regression at vsp1 driver
- some Remote Controller fixes related to the new BPF filter logic
added on it for Kernel 4.18.
* tag 'media/v4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: v4l: vsp1: Fix deadlock in VSPDL DRM pipelines
media: rc: read out of bounds if bpf reports high protocol number
media: bpf: ensure bpf program is freed on detach
media: rc: be less noisy when driver misbehaves
Merge misc fixes from Andrew Morton:
"3 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK fails
ipc/shm.c add ->pagesize function to shm_vm_ops
memcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure
Commit 05ea88608d ("mm, hugetlbfs: introduce ->pagesize() to
vm_operations_struct") adds a new ->pagesize() function to
hugetlb_vm_ops, intended to cover all hugetlbfs backed files.
With System V shared memory model, if "huge page" is specified, the
"shared memory" is backed by hugetlbfs files, but the mappings initiated
via shmget/shmat have their original vm_ops overwritten with shm_vm_ops,
so we need to add a ->pagesize function to shm_vm_ops. Otherwise,
vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs backed vma,
result in below BUG:
fs/hugetlbfs/inode.c
443 if (unlikely(page_mapped(page))) {
444 BUG_ON(truncate_op);
resulting in
hugetlbfs: oracle (4592): Using mlock ulimits for SHM_HUGETLB is deprecated
------------[ cut here ]------------
kernel BUG at fs/hugetlbfs/inode.c:444!
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 ...
CPU: 35 PID: 5583 Comm: oracle_5583_sbt Not tainted 4.14.35-1829.el7uek.x86_64 #2
RIP: 0010:remove_inode_hugepages+0x3db/0x3e2
....
Call Trace:
hugetlbfs_evict_inode+0x1e/0x3e
evict+0xdb/0x1af
iput+0x1a2/0x1f7
dentry_unlink_inode+0xc6/0xf0
__dentry_kill+0xd8/0x18d
dput+0x1b5/0x1ed
__fput+0x18b/0x216
____fput+0xe/0x10
task_work_run+0x90/0xa7
exit_to_usermode_loop+0xdd/0x116
do_syscall_64+0x187/0x1ae
entry_SYSCALL_64_after_hwframe+0x150/0x0
[jane.chu@oracle.com: relocate comment]
Link: http://lkml.kernel.org/r/20180731044831.26036-1-jane.chu@oracle.com
Link: http://lkml.kernel.org/r/20180727211727.5020-1-jane.chu@oracle.com
Fixes: 05ea88608d ("mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct")
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current value for a target abort error is 0x010, however, this value
should in fact be 0x002. As it stands, the range of error is 0..7 so
it is currently never being detected. This bug has been in the driver
since the early 2.6.12 days (or before).
Detected by CoverityScan, CID#744290 ("Logically dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The position calculation in iomap_bmap() shifts bno the wrong way,
so we don't progress properly and end up re-mapping block zero
over and over, yielding an unchanging physical block range as the
logical block advances:
# filefrag -Be file
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 0: 21.. 21: 1: merged
1: 1.. 1: 21.. 21: 1: 22: merged
Discontinuity: Block 1 is at 21 (was 22)
2: 2.. 2: 21.. 21: 1: 22: merged
Discontinuity: Block 2 is at 21 (was 22)
3: 3.. 3: 21.. 21: 1: 22: merged
This breaks the FIBMAP interface for anyone using it (XFS), which
in turn breaks LILO, zipl, etc.
Bug-actually-spotted-by: Darrick J. Wong <darrick.wong@oracle.com>
Fixes: 89eb1906a9 ("iomap: add an iomap-based bmap implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
use actual protocol family passed by user rather than hardcoded
AF_INTE6 to cerate sockets.
current code is not working for IPv4.
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull arm64 regression fix from Will Deacon:
"Ard found a nasty arm64 regression in 4.18 where the AES ghash/gcm
code doesn't notify the kernel about its use of the vector registers,
therefore potentially corrupting live user state.
The fix is straightforward and Herbert agreed for it to go via arm64"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
crypto/arm64: aes-ce-gcm - add missing kernel_neon_begin/end pair
Pull networking fixes from David Miller:
"Fixes keep trickling in:
1) Various IP fragmentation memory limit hardening changes from Eric
Dumazet.
2) Revert ipv6 metrics leak change, it causes more problems than it
fixes for now.
3) Fix WoL regression in stmmac driver, from Jose Abreu.
4) Netlink socket spectre v1 gadget fix, from Jeremy Cline"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
Revert "net/ipv6: fix metrics leak"
rxrpc: Fix user call ID check in rxrpc_service_prealloc_one
net: dsa: Do not suspend/resume closed slave_dev
netlink: Fix spectre v1 gadget in netlink_create()
Documentation: dpaa2: Use correct heading adornment
net: stmmac: Fix WoL for PCI-based setups
bonding: avoid lockdep confusion in bond_get_stats()
enic: do not call enic_change_mtu in enic_probe
ipv4: frags: handle possible skb truesize change
inet: frag: enforce memory limits earlier
net/mlx5e: IPoIB, Set the netdevice sw mtu in ipoib enhanced flow
net/mlx5e: Fix null pointer access when setting MTU of vport representor
net/mlx5e: Set port trust mode to PCP as default
net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager
net: dsa: mv88e6xxx: Fix SERDES support on 88E6141/6341
brcmfmac: fix regression in parsing NVRAM for multiple devices
iwlwifi: add more card IDs for 9000 series
Previously in squashfs_readpage() when copying data into the page
cache, it used the length of the datablock read from the filesystem
(after decompression). However, if the filesystem has been corrupted
this data block may be short, which will leave pages unfilled.
The fix for this is to compute the expected number of bytes to copy
from the inode size, and use this to detect if the block is short.
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Tested-by: Willy Tarreau <w@1wt.eu>
Cc: Анатолий Тросиненко <anatoly.trosinenko@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The squashfs fragment reading code doesn't actually verify that the
fragment is inside the fragment table. The end result _is_ verified to
be inside the image when actually reading the fragment data, but before
that is done, we may end up taking a page fault because the fragment
table itself might not even exist.
Another report from Anatoly and his endless squashfs image fuzzing.
Reported-by: Анатолий Тросиненко <anatoly.trosinenko@gmail.com>
Acked-by:: Phillip Lougher <phillip.lougher@gmail.com>,
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit df18b50448.
This change causes other problems and use-after-free situations as
found by syzbot.
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch to fix the case where a lock request was interrupted ended up
changing default handling of errors such as NFS4ERR_DENIED and caused the
client to immediately resend the lock request. Let's do a partial revert
of that request so that the default is now to exit, but change the way
we handle resends to take into account the fact that the user may have
interrupted the request.
Reported-by: Kenneth Johansson <ken@kenjo.org>
Fixes: a3cf9bca2a ("NFSv4: Don't add a new lock on an interrupted wait..")
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Pull ARM fix from Russell King:
"Just a single fix this time around for recent binutils causing build
problems when generating Thumb-2 code"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8781/1: Fix Thumb-2 syscall return for binutils 2.29+
Commit 2c4541e24c ("mm: use vma_init() to initialize VMAs on stack and
data segments") tried to initialize various left-over ad-hoc vma's
"properly", but actually made things worse for the temporary vma's used
for TLB flushing.
vma_init() doesn't actually initialize all of the vma, just a few
fields, so doing something like
- struct vm_area_struct vma = { .vm_mm = tlb->mm, };
+ struct vm_area_struct vma;
+
+ vma_init(&vma, tlb->mm);
was actually very bad: instead of having a nicely initialized vma with
every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma
with only a couple of fields initialized. And they weren't even fields
that the code in question mostly cared about.
The flush_tlb_range() function takes a "struct vma" rather than a
"struct mm_struct", because a few architectures actually care about what
kind of range it is - being able to only do an ITLB flush if it's a
range that doesn't have data accesses enabled, for example. And all the
normal users already have the vma for doing the range invalidation.
But a few people want to call flush_tlb_range() with a range they just
made up, so they also end up using a made-up vma. x86 just has a
special "flush_tlb_mm_range()" function for this, but other
architectures (arm and ia64) do the "use fake vma" thing instead, and
thus got caught up in the vma_init() changes.
At the same time, the TLB flushing code really doesn't care about most
other fields in the vma, so vma_init() is just unnecessary and
pointless.
This fixes things by having an explicit "this is just an initializer for
the TLB flush" initializer macro, which is used by the arm/arm64/ia64
people who mis-use this interface with just a dummy vma.
Fixes: 2c4541e24c ("mm: use vma_init() to initialize VMAs on stack and data segments")
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Delete the old VM_BUG_ON_VMA() from zap_pmd_range(), which asserted
that mmap_sem must be held when splitting an "anonymous" vma there.
Whether that's still strictly true nowadays is not entirely clear,
but the danger of sometimes crashing on the BUG is now fairly clear.
Even with the new stricter rules for anonymous vma marking, the
condition it checks for can possible trigger. Commit 44960f2a7b
("staging: ashmem: Fix SIGBUS crash when traversing mmaped ashmem
pages") is good, and originally I thought it was safe from that
VM_BUG_ON_VMA(), because the /dev/ashmem fd exposed to the user is
disconnected from the vm_file in the vma, and madvise(,,MADV_REMOVE)
insists on VM_SHARED.
But after I read John's earlier mail, drawing attention to the
vfs_fallocate() in there: I may be wrong, and I don't know if Android
has THP in the config anyway, but it looks to me like an
unmap_mapping_range() from ashmem's vfs_fallocate() could hit precisely
the VM_BUG_ON_VMA(), once it's vma_is_anonymous().
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There just check the user call ID isn't already in use, hence should
compare user_call_ID with xcall->user_call_ID, which is current
node's user_call_ID.
Fixes: 540b1c48c3 ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull power management fixes from Rafael Wysocki:
"These fix the scope of a recent intel_pstate driver optimization used
incorrectly on some systems due to processor identification ambiguity
and fix a few issues in the turbostat utility, including three recent
regressions.
Specifics:
- Use ACPI FADT preferred PM Profile to distinguish Skylake desktop
processors from some server ones with the same model number in
order to limit the scope of the recent IO-wait boost optimization
to servers, as intended (Srinivas Pandruvada).
- Fix several issues in the turbostat utility:
* Fix the -S option on 1-CPU systems (Len Brown).
* Fix computations using incorrect processor core counts (Artem
Bityutskiy).
* Fix the x2apic debug message (Len Brown).
* Fix logical node enumeration to allow for non-sequential
physical nodes (Prarit Bhargava).
* Fix reported family on modern AMD processors (Calvin Walton).
* Clarify the RAPL column information in the man page (Len Brown)"
* tag 'pm-urgent-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Limit the scope of HWP dynamic boost platforms
tools/power turbostat: version 18.07.27
tools/power turbostat: Read extended processor family from CPUID
tools/power turbostat: Fix logical node enumeration to allow for non-sequential physical nodes
tools/power turbostat: fix x2apic debug message output file
tools/power turbostat: fix bogus summary values
tools/power turbostat: fix -S on UP systems
tools/power turbostat: Update turbostat(8) RAPL throttling column description
Anatoly continues to find issues with fuzzed squashfs images.
This time, corrupt, missing, or undersized data for the page filling
wasn't checked for, because the squashfs_{copy,read}_cache() functions
did the squashfs_copy_data() call without checking the resulting data
size.
Which could result in the page cache pages being incompletely filled in,
and no error indication to the user space reading garbage data.
So make a helper function for the "fill in pages" case, because the
exact same incomplete sequence existed in two places.
[ I should have made a squashfs branch for these things, but I didn't
intend to start doing them in the first place.
My historical connection through cramfs is why I got into looking at
these issues at all, and every time I (continue to) think it's a
one-off.
Because _this_ time is always the last time. Right? - Linus ]
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Amit Pundir and Youling in parallel reported crashes with recent
mainline kernels running Android:
F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
F DEBUG : Build fingerprint: 'Android/db410c32_only/db410c32_only:Q/OC-MR1/102:userdebug/test-key
F DEBUG : Revision: '0'
F DEBUG : ABI: 'arm'
F DEBUG : pid: 2261, tid: 2261, name: zygote >>> zygote <<<
F DEBUG : signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0xec00008
... <snip> ...
F DEBUG : backtrace:
F DEBUG : #00 pc 00001c04 /system/lib/libc.so (memset+48)
F DEBUG : #01 pc 0010c513 /system/lib/libart.so (create_mspace_with_base+82)
F DEBUG : #02 pc 0015c601 /system/lib/libart.so (art::gc::space::DlMallocSpace::CreateMspace(void*, unsigned int, unsigned int)+40)
F DEBUG : #03 pc 0015c3ed /system/lib/libart.so (art::gc::space::DlMallocSpace::CreateFromMemMap(art::MemMap*, std::__1::basic_string<char, std::__ 1::char_traits<char>, std::__1::allocator<char>> const&, unsigned int, unsigned int, unsigned int, unsigned int, bool)+36)
...
This was bisected back to commit bfd40eaff5 ("mm: fix
vma_is_anonymous() false-positives").
create_mspace_with_base() in the trace above, utilizes ashmem, and with
ashmem, for shared mappings we use shmem_zero_setup(), which sets the
vma->vm_ops to &shmem_vm_ops. But for private ashmem mappings nothing
sets the vma->vm_ops.
Looking at the problematic patch, it seems to add a requirement that one
call vma_set_anonymous() on a vma, otherwise the dummy_vm_ops will be
used. Using the dummy_vm_ops seem to triggger SIGBUS when traversing
unmapped pages.
Thus, this patch adds a call to vma_set_anonymous() for ashmem private
mappings and seems to avoid the reported problem.
Fixes: bfd40eaff5 ("mm: fix vma_is_anonymous() false-positives")
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Colin Cross <ccross@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Reported-by: Youling 257 <youling257@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit bfd40eaff5 ("mm: fix vma_is_anonymous() false-positives") made
newly allocated vma's have a dummy vm_ops field so that they wouldn't be
mistaken for anonymous mappings, and if you wanted an anonymous vma you
had to explicitly say so by calling "vma_set_anonymous()" on it.
However, it missed the two special vmas that ia64 processes have: the
register backing store and the NaT page. So they wouldn't actually act
like anonymous ranges, and page faults on them caused a SIGBUS rather
than the creation of a new anon page in them.
That obviously will make any ia64 binary very unhappy indeed, and the
boot fails early.
Fixes: bfd40eaff5 ("mm: fix vma_is_anonymous() false-positives")
Reported-by: Tony Luck <tony.luck@intel.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a DSA slave network device was previously disabled, there is no need
to suspend or resume it.
Fixes: 2446254915 ("net: dsa: allow switch drivers to implement suspend/resume hooks")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
'protocol' is a user-controlled value, so sanitize it after the bounds
check to avoid using it for speculative out-of-bounds access to arrays
indexed by it.
This addresses the following accesses detected with the help of smatch:
* net/netlink/af_netlink.c:654 __netlink_create() warn: potential
spectre issue 'nlk_cb_mutex_keys' [w]
* net/netlink/af_netlink.c:654 __netlink_create() warn: potential
spectre issue 'nlk_cb_mutex_key_strings' [w]
* net/netlink/af_netlink.c:685 netlink_create() warn: potential spectre
issue 'nl_table' [w] (local cap)
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add overline heading adornment to document title in order to comply
with kernel doc requirements.
Fixes: 60b9131 staging: fsl-mc: Convert documentation to rst format
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WoL won't work in PCI-based setups because we are not saving the PCI EP
state before entering suspend state and not allowing D3 wake.
Fix this by using a wrapper around stmmac_{suspend/resume} which
correctly sets the PCI EP state.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the optimizations for TLB invalidation from commit 0cef77c779
("powerpc/64s/radix: flush remote CPUs out of single-threaded
mm_cpumask"), the scope of a TLBI (global vs. local) can now be
influenced by the value of the 'copros' counter of the memory context.
When calling mm_context_remove_copro(), the 'copros' counter is
decremented first before flushing. It may have the unintended side
effect of sending local TLBIs when we explicitly need global
invalidations in this case. Thus breaking any nMMU user in a bad and
unpredictable way.
Fix it by flushing first, before updating the 'copros' counter, so
that invalidations will be global.
Fixes: 0cef77c779 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Merge turbostat utility fixes for final 4.18:
- Fix the -S option on 1-CPU systems.
- Fix computations using incorrect processor core counts.
- Fix the x2apic debug message.
- Fix logical node enumeration to allow for non-sequential physical nodes.
- Fix reported family on modern AMD processors.
- Clarify the RAPL column information in the man page.
* pm-tools:
tools/power turbostat: version 18.07.27
tools/power turbostat: Read extended processor family from CPUID
tools/power turbostat: Fix logical node enumeration to allow for non-sequential physical nodes
tools/power turbostat: fix x2apic debug message output file
tools/power turbostat: fix bogus summary values
tools/power turbostat: fix -S on UP systems
tools/power turbostat: Update turbostat(8) RAPL throttling column description
In commit ab123fe071 ("enic: handle mtu change for vf properly")
ASSERT_RTNL() is added to _enic_change_mtu() to prevent it from being
called without rtnl held. enic_probe() calls enic_change_mtu()
without rtnl held. At this point netdev is not registered yet.
Remove call to enic_change_mtu and assign the mtu to netdev->mtu.
Fixes: ab123fe071 ("enic: handle mtu change for vf properly")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip_frag_queue() might call pskb_pull() on one skb that
is already in the fragment queue.
We need to take care of possible truesize change, or we
might have an imbalance of the netns frags memory usage.
IPv6 is immune to this bug, because RFC5722, Section 4,
amended by Errata ID 3089 states :
When reassembling an IPv6 datagram, if
one or more its constituent fragments is determined to be an
overlapping fragment, the entire datagram (and any constituent
fragments) MUST be silently discarded.
Fixes: 158f323b98 ("net: adjust skb->truesize in pskb_expand_head()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently check current frags memory usage only when
a new frag queue is created. This allows attackers to first
consume the memory budget (default : 4 MB) creating thousands
of frag queues, then sending tiny skbs to exceed high_thresh
limit by 2 to 3 order of magnitude.
Note that before commit 648700f76b ("inet: frags: use rhashtables
for reassembly units"), work queue could be starved under DOS,
getting no cpu cycles.
After commit 648700f76b, only the per frag queue timer can eventually
remove an incomplete frag queue and its skbs.
Fixes: b13d3cbfb8 ("inet: frag: move eviction of queues to work queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jann Horn <jannh@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Peter Oskolkov <posk@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-07-31
The following series includes four mlx5 fixes.
Please pull and let me know if there's any problem.
For -stable v4.14
net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager
For -stable v4.16
net/mlx5e: Set port trust mode to PCP as default
For -stable v4.17
net/mlx5e: IPoIB, Set the netdevice sw mtu in ipoib enhanced flow
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull audit fix from Paul Moore:
"A single small audit fix to guard against memory allocation failures
when logging information about a kernel module load.
It's small, easy to understand, and self-contained; while nothing is
zero risk, this should be pretty low"
* tag 'audit-pr-20180731' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: fix potential null dereference 'context->module.name'
After introduction of the cited commit, mlx5e_build_nic_params
receives the netdevice mtu in order to set the sw_mtu of mlx5e_params.
For enhanced IPoIB, the netdevice mtu is not set in this stage,
therefore, the initial sw_mtu equals zero. As a result, the hw_mtu
of the receive queue will be calculated incorrectly causing traffic
issues.
To fix this issue, query for port mtu before building the nic params.
Fixes: 472a1e44b3 ("net/mlx5e: Save MTU in channels params")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
MTU helper function is used by both conventional mlx5e
instances (PF/VF) and the eswitch representors. The representor
shouldn't change the nic vport context MTU, the VF is responsible for
that. Therefore set_mtu_cb has a null value when changing the
representor MTU.
Fixes: 250a42b6a7 ("net/mlx5e: Support configurable MTU for vport representors")
Signed-off-by: Adi Nissim <adin@mellanox.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The hairpin offload code has dependency on the trust mode being PCP.
Hence we should set PCP as the default for handling cases where we are
disallowed to read the trust mode from the FW, or failed to initialize it.
Fixes: 106be53b6b ('net/mlx5e: Set per priority hairpin pairs')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Execute mlx5_eswitch_init() only if we have MLX5_ESWITCH_MANAGER
capabilities.
Do the same for mlx5_eswitch_cleanup().
Fixes: a9f7705ffd ("net/mlx5: Unify vport manager capability check")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Version 1 of the patch adding SERDES support to the 88E6141/6341
correctly added the ops to the 88E6141/6341. However, by the time
version 3 was committed, the ops had moved to the 88E6085/6175. Put
them back where they belong.
Fixes: 5bafeb6e7e ("net: dsa: mv88e6xxx: 88E6141/6341 SERDES support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kalle Valo says:
====================
wireless-drivers fixes for 4.18
Last set of fixes before 4.18 is released
iwlwifi
* add new IDs for cards already available on the market
brcmfmac
* fix a regression introduced in v4.17
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull SCSI fixes from James Bottomley:
"Nine fixes, five in the qla2xxx driver, the most serious of which is
the uninitialized list head crash which can be observed in most
systems under a sufficiently loaded low memory environment.
The two sg fixes are minor but obvious and two target ones which seem
reasonable but not high impact"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Return error when TMF returns
scsi: qla2xxx: Fix ISP recovery on unload
scsi: qla2xxx: Fix driver unload by shutting down chip
scsi: qla2xxx: Fix NPIV deletion by calling wait_for_sess_deletion
scsi: qla2xxx: Fix unintialized List head crash
scsi: sg: update comment for blk_get_request()
scsi: sg: fix minor memory leak in error path
scsi: libiscsi: fix possible NULL pointer dereference in case of TMF
scsi: target: iscsi: cxgbit: fix max iso npdu calculation
Pull virtio fixes from Michael Tsirkin:
"Some bugfixes that seem important and safe enough to merge at the last
minute"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: fix another race between migration and ballooning
tools/virtio: add kmalloc_array stub
tools/virtio: add dma barrier stubs
Pull ACPI fixes from Rafael Wysocki:
"These fix a recent ACPICA regression affecting control method
execution at the table level and an earlier hibernation regression in
the ACPI driver for Intel SoCs (LPSS) that was missed by a previous
fix in this cycle.
Specifics:
- Fix a recent ACPICA regression introduced by a previous fix that
caused control method execution at the table level to be mishandled
by mistake (Erik Schmauss).
- Fix a hibernation regression from the 4.15 cycle in the ACPI driver
for Intel SoCs (LPSS) that caused the platform firmware to be
confused during resume from hibernation by the driver's PM quirks
which was fixed for system-wide suspend/resume (ACPI S3) earlier in
this cycle, but that previous fix missed the hibernation (ACPI S4)
case (Rafael Wysocki)"
* tag 'acpi-urgent-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: AML Parser: ignore control method status in module-level code
ACPI / LPSS: Avoid PM quirks on suspend and resume from hibernation
When a PCI device is detected, pdev->is_added is set to 1 and proc and
sysfs entries are created.
When the device is removed, pdev->is_added is checked for one and then
device is detached with clearing of proc and sys entries and at end,
pdev->is_added is set to 0.
is_added and is_busmaster are bit fields in pci_dev structure sharing same
memory location.
A strange issue was observed with multiple removal and rescan of a PCIe
NVMe device using sysfs commands where is_added flag was observed as zero
instead of one while removing device and proc,sys entries are not cleared.
This causes issue in later device addition with warning message
"proc_dir_entry" already registered.
Debugging revealed a race condition between the PCI core setting the
is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue
setting the is_busmaster bit in pci_set_master(). As these fields are not
handled atomically, that clears the is_added bit.
Move the is_added bit to a separate private flag variable and use atomic
functions to set and retrieve the device addition state. This avoids the
race because is_added no longer shares a memory location with is_busmaster.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283
Signed-off-by: Hari Vyas <hari.vyas@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Calling pmull_gcm_encrypt_block() requires kernel_neon_begin() and
kernel_neon_end() to be used since the routine touches the NEON
register file. Add the missing calls.
Also, since NEON register contents are not preserved outside of
a kernel mode NEON region, pass the key schedule array again.
Fixes: 7c50136a8a ("crypto: arm64/aes-ghash - yield NEON after every ...")
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Dynamic boosting of HWP performance on IO wake showed significant
improvement to IO workloads. This series was intended for Skylake Xeon
platforms only and feature was enabled by default based on CPU model
number.
But some Xeon platforms reused the Skylake desktop CPU model number. This
caused some undesirable side effects to some graphics workloads. Since
they are heavily IO bound, the increase in CPU performance decreased the
power available for GPU to do its computing and hence decrease in graphics
benchmark performance.
For example on a Skylake desktop, GpuTest benchmark showed average FPS
reduction from 529 to 506.
This change makes sure that HWP boost feature is only enabled for Skylake
server platforms by using ACPI FADT preferred PM Profile. If some desktop
users wants to get benefit of boost, they can still enable boost from
intel_pstate sysfs attribute "hwp_dynamic_boost".
Fixes: 41ab43c9c8 (cpufreq: intel_pstate: enable boost for Skylake Xeon)
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107410
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge a fix for hibernation regression in the ACPI driver for Intel
SoCs (LPSS).
* acpi-soc:
ACPI / LPSS: Avoid PM quirks on suspend and resume from hibernation
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Update the tools copy of several files, including perf_event.h,
powerpc's asm/unistd.h (new io_pgetevents syscall), bpf.h and
x86's memcpy_64.s (used in 'perf bench mem'), silencing the
respective warnings during the perf tools build.
- Fix the build on the alpine:edge distro.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Masayoshi Mizuma reported that a warning message is shown while a CPU is
hot-removed on Broadwell servers:
WARNING: CPU: 126 PID: 6 at arch/x86/events/intel/uncore.c:988
uncore_pci_remove+0x10b/0x150
Call Trace:
pci_device_remove+0x42/0xd0
device_release_driver_internal+0x148/0x220
pci_stop_bus_device+0x76/0xa0
pci_stop_root_bus+0x44/0x60
acpi_pci_root_remove+0x1f/0x80
acpi_bus_trim+0x57/0x90
acpi_bus_trim+0x2e/0x90
acpi_device_hotplug+0x2bc/0x4b0
acpi_hotplug_work_fn+0x1a/0x30
process_one_work+0x174/0x3a0
worker_thread+0x4c/0x3d0
kthread+0xf8/0x130
This bug was introduced by:
commit 15a3e845b0 ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")
The index of "QPI Port 2 filter" was hardcode to 2, but this conflicts with the
index of "PCU.3" which is "HSWEP_PCI_PCU_3", which equals to 2 as well.
To fix the conflict, the hardcoded index needs to be cleaned up:
- introduce a new enumerator "BDX_PCI_QPI_PORT2_FILTER" for "QPI Port 2
filter" on Broadwell,
- increase UNCORE_EXTRA_PCI_DEV_MAX by one,
- clean up the hardcoded index.
Debugged-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: msys.mizuma@gmail.com
Cc: stable@vger.kernel.org
Fixes: 15a3e845b0 ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")
Link: http://lkml.kernel.org/r/1532953688-15008-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
"Several smallish fixes, I don't think any of this requires another -rc
but I'll leave that up to you:
1) Don't leak uninitialzed bytes to userspace in xfrm_user, from Eric
Dumazet.
2) Route leak in xfrm_lookup_route(), from Tommi Rantala.
3) Premature poll() returns in AF_XDP, from Björn Töpel.
4) devlink leak in netdevsim, from Jakub Kicinski.
5) Don't BUG_ON in fib_compute_spec_dst, the condition can
legitimately happen. From Lorenzo Bianconi.
6) Fix some spectre v1 gadgets in generic socket code, from Jeremy
Cline.
7) Don't allow user to bind to out of range multicast groups, from
Dmitry Safonov with a follow-up by Dmitry Safonov.
8) Fix metrics leak in fib6_drop_pcpu_from(), from Sabrina Dubroca"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
netlink: Don't shift with UB on nlk->ngroups
net/ipv6: fix metrics leak
xen-netfront: wait xenbus state change when load module manually
can: ems_usb: Fix memory leak on ems_usb_disconnect()
openvswitch: meter: Fix setting meter id for new entries
netlink: Do not subscribe to non-existent groups
NET: stmmac: align DMA stuff to largest cache line length
tcp_bbr: fix bw probing to raise in-flight data for very small BDPs
net: socket: Fix potential spectre v1 gadget in sock_is_registered
net: socket: fix potential spectre v1 gadget in socketcall
net: mdio-mux: bcm-iproc: fix wrong getter and setter pair
ipv4: remove BUG_ON() from fib_compute_spec_dst
enic: handle mtu change for vf properly
net: lan78xx: fix rx handling before first packet is send
nfp: flower: fix port metadata conversion bug
bpf: use GFP_ATOMIC instead of GFP_KERNEL in bpf_parse_prog()
bpf: fix bpf_skb_load_bytes_relative pkt length check
perf build: Build error in libbpf missing initialization
net: ena: Fix use of uninitialized DMA address bits field
bpf: btf: Use exact btf value_size match in map_check_btf()
...
Pull sparc fixes from David Miller:
"Some small __init annotation and build fixes from Stephen Rostedt and
Thomas Petazzoni"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: use asm-generic version of msi.h
sparc: move MSI related definitions to where they are used
sparc/time: Add missing __init to init_tick_ops()
Anatoly reports another squashfs fuzzing issue, where the decompression
parameters themselves are in a compressed block.
This causes squashfs_read_data() to be called in order to read the
decompression options before the decompression stream having been set
up, making squashfs go sideways.
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Acked-by: Phillip Lougher <phillip.lougher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xdp_return_buff() is used when frame has been successfully
handled (transmitted) or if an error occurred during delayed
processing and there is no way to report it back to
xdp_do_redirect().
In case of __xsk_rcv_zc() error is propagated all the way
back to the driver, so there is no need to call
xdp_return_buff(). Driver will recycle the frame anyway
after seeing that error happened.
Fixes: 173d3adb6f ("xsk: add zero-copy support for Rx")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
I hit the following problem when I tried to use bpftool
to dump a percpu array.
$ sudo ./bpftool map show
61: percpu_array name stub flags 0x0
key 4B value 4B max_entries 1 memlock 4096B
...
$ sudo ./bpftool map dump id 61
bpftool: malloc.c:2406: sysmalloc: Assertion
`(old_top == initial_top (av) && old_size == 0) || \
((unsigned long) (old_size) >= MINSIZE && \
prev_inuse (old_top) && \
((unsigned long) old_end & (pagesize - 1)) == 0)'
failed.
Aborted
Further debugging revealed that this is due to
miscommunication between bpftool and kernel.
For example, for the above percpu_array with value size of 4B.
The map info returned to user space has value size of 4B.
In bpftool, the values array for lookup is allocated like:
info->value_size * get_possible_cpus() = 4 * get_possible_cpus()
In kernel (kernel/bpf/syscall.c), the values array size is
rounded up to multiple of 8.
round_up(map->value_size, 8) * num_possible_cpus()
= 8 * num_possible_cpus()
So when kernel copies the values to user buffer, the kernel will
overwrite beyond user buffer boundary.
This patch fixed the issue by allocating and stepping through
percpu map value array properly in bpftool.
Fixes: 71bb428fe2 ("tools: bpf: add bpftool")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The variable 'context->module.name' may be null pointer when
kmalloc return null, so it's better to check it before using
to avoid null dereference.
Another one more thing this patch does is using kstrdup instead
of (kmalloc + strcpy), and signal a lost record via audit_log_lost.
Cc: stable@vger.kernel.org # 4.11
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
This is necessary to be able to include <linux/msi.h> when
CONFIG_GENERIC_MSI_IRQ_DOMAIN is enabled. Without this, a build with
CONFIG_GENERIC_MSI_IRQ_DOMAIN fails with:
In file included from drivers//ata/ahci.c:45:0:
>> include/linux/msi.h:226:10: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
msi_alloc_info_t *arg);
^~~~~~~~~~~~~~~~
sg_alloc_fn
include/linux/msi.h:230:9: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
msi_alloc_info_t *arg);
^~~~~~~~~~~~~~~~
sg_alloc_fn
include/linux/msi.h:239:12: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
msi_alloc_info_t *arg);
^~~~~~~~~~~~~~~~
sg_alloc_fn
include/linux/msi.h:240:22: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
void (*msi_finish)(msi_alloc_info_t *arg, int retval);
^~~~~~~~~~~~~~~~
sg_alloc_fn
include/linux/msi.h:241:20: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
void (*set_desc)(msi_alloc_info_t *arg,
^~~~~~~~~~~~~~~~
sg_alloc_fn
include/linux/msi.h:316:18: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
int nvec, msi_alloc_info_t *args);
^~~~~~~~~~~~~~~~
sg_alloc_fn
include/linux/msi.h:318:29: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
int virq, int nvec, msi_alloc_info_t *args);
^~~~~~~~~~~~~~~~
sg_alloc_fn
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The definitions in arch/sparc/include/asm/msi.h are only used in
arch/sparc/mm/srmmu.c, so it makes sense to have them in the C file
directly.
In addition, having a custom arch/sparc/include/asm/msi.h prevents
from using the asm-generic version of this header, which is necessary
to be able to include <linux/msi.h> when CONFIG_GENERIC_MSI_IRQ_DOMAIN
is enabled.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Code that was added to force gcc not to inline any function that isn't
explicitly declared as inline uncovered that init_tick_ops() isn't
marked as "__init". It is only called by __init functions and more
importantly it too calls an __init function which would require it to be
__init as well.
Link: http://lkml.kernel.org/r/201806060444.hdHcKOBy%fengguang.wu@intel.com
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On i386 nlk->ngroups might be 32 or 0. Which leads to UB, resulting in
hang during boot.
Check for 0 ngroups and use (unsigned long long) as a type to shift.
Fixes: 7acf9d4237 ("netlink: Do not subscribe to non-existent groups").
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Kleine-Budde says:
====================
pull-request: can 2018-07-30
this is a pull request of one patch for net/master.
The patch by Anton Vasilyev and the Linux Driver Verification project
fixes a memory leak in the ems_usb driver's disconnect function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull scheduler fixes from Ingo Molnar:
"Misc fixes:
- a deadline scheduler related bug fix which triggered a kernel
warning
- an RT_RUNTIME_SHARE fix
- a stop_machine preemption fix
- a potential NULL dereference fix in sched_domain_debug_one()"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/rt: Restore rt_runtime after disabling RT_RUNTIME_SHARE
sched/deadline: Update rq_clock of later_rq when pushing a task
stop_machine: Disable preemption after queueing stopper threads
sched/topology: Check variable group before dereferencing it
Fix type warnings in arch/arc/mm/cache.c.
../arch/arc/mm/cache.c: In function 'flush_anon_page':
../arch/arc/mm/cache.c:1062:55: warning: passing argument 2 of '__flush_dcache_page' makes integer from pointer without a cast [-Wint-conversion]
__flush_dcache_page((phys_addr_t)page_address(page), page_address(page));
^~~~~~~~~~~~~~~~~~
../arch/arc/mm/cache.c:1013:59: note: expected 'long unsigned int' but argument is of type 'void *'
void __flush_dcache_page(phys_addr_t paddr, unsigned long vaddr)
~~~~~~~~~~~~~~^~~~~
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Elad Kanfi <eladkan@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Ofer Levi <oferle@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Fix build errors in arch/arc/'s delay.h:
- add "extern unsigned long loops_per_jiffy;"
- add <asm-generic/types.h> for "u64"
In file included from ../drivers/infiniband/hw/cxgb3/cxio_hal.c:32:
../arch/arc/include/asm/delay.h: In function '__udelay':
../arch/arc/include/asm/delay.h:61:12: error: 'u64' undeclared (first use in this function)
loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32;
^~~
In file included from ../drivers/infiniband/hw/cxgb3/cxio_hal.c:32:
../arch/arc/include/asm/delay.h: In function '__udelay':
../arch/arc/include/asm/delay.h:63:37: error: 'loops_per_jiffy' undeclared (first use in this function)
loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32;
^~~~~~~~~~~~~~~
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Elad Kanfi <eladkan@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Ofer Levi <oferle@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Fix printk format warning in arch/arc/plat-eznps/mtm.c:
In file included from ../include/linux/printk.h:7,
from ../include/linux/kernel.h:14,
from ../include/linux/list.h:9,
from ../include/linux/smp.h:12,
from ../arch/arc/plat-eznps/mtm.c:17:
../arch/arc/plat-eznps/mtm.c: In function 'set_mtm_hs_ctr':
../include/linux/kern_levels.h:5:18: warning: format '%d' expects argument of type 'int', but argument 2 has type 'long int' [-Wformat=]
#define KERN_SOH "\001" /* ASCII Start Of Header */
^~~~~~
../include/linux/kern_levels.h:11:18: note: in expansion of macro 'KERN_SOH'
#define KERN_ERR KERN_SOH "3" /* error conditions */
^~~~~~~~
../include/linux/printk.h:308:9: note: in expansion of macro 'KERN_ERR'
printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~
../arch/arc/plat-eznps/mtm.c:166:3: note: in expansion of macro 'pr_err'
pr_err("** Invalid @nps_mtm_hs_ctr [%d] needs to be [%d:%d] (incl)\n",
^~~~~~
../arch/arc/plat-eznps/mtm.c:166:40: note: format string is defined here
pr_err("** Invalid @nps_mtm_hs_ctr [%d] needs to be [%d:%d] (incl)\n",
~^
%ld
The hs_ctr variable can just be int instead of long, so also change
kstrtol() to kstrtoint() and leave the format string as %d.
Also add 2 header files since they are used in mtm.c and we prefer
not to depend on accidental/indirect #includes.
Cc: linux-snps-arc@lists.infradead.org
Cc: Ofer Levi <oferle@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Add <linux/types.h> to fix build errors.
Both ctop.h and <soc/nps/common.h> use u32 types and cause many
errors.
Examples:
../include/soc/nps/common.h:71:4: error: unknown type name 'u32'
u32 __reserved:20, cluster:4, core:4, thread:4;
../include/soc/nps/common.h:76:3: error: unknown type name 'u32'
u32 value;
../include/soc/nps/common.h:124:4: error: unknown type name 'u32'
u32 base:8, cl_x:4, cl_y:4,
../include/soc/nps/common.h:127:3: error: unknown type name 'u32'
u32 value;
../arch/arc/plat-eznps/include/plat/ctop.h:83:4: error: unknown type name 'u32'
u32 gen:1, gdis:1, clk_gate_dis:1, asb:1,
../arch/arc/plat-eznps/include/plat/ctop.h:86:3: error: unknown type name 'u32'
u32 value;
../arch/arc/plat-eznps/include/plat/ctop.h:93:4: error: unknown type name 'u32'
u32 csa:22, dmsid:6, __reserved:3, cs:1;
../arch/arc/plat-eznps/include/plat/ctop.h:95:3: error: unknown type name 'u32'
u32 value;
Cc: linux-snps-arc@lists.infradead.org
Cc: Ofer Levi <oferle@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Pull locking fixes from Ingo Molnar:
"A paravirt UP-patching fix, and an I2C MUX driver lockdep warning fix"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/pvqspinlock/x86: Use LOCK_PREFIX in __pv_queued_spin_unlock() assembly code
i2c/mux, locking/core: Annotate the nested rt_mutex usage
locking/rtmutex: Allow specifying a subclass for nested locking
Pull EFI fix from Ingo Molnar:
"An UEFI variables fix for SEV guests"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Access EFI MMIO data as unencrypted when SEV is active
Check that SMP_CACHE_BYTES (and hence ARCH_DMA_MINALIGN) is larger
or equal to any cache line length by comparing it with values
previously read from ARC cache BCR registers.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Since commit d4ead6b34b ("net/ipv6: move metrics from dst to
rt6_info"), ipv6 metrics are shared and refcounted. rt6_set_from()
assigns the rt->from pointer and increases the refcount on from's
metrics. This reference is never released.
Introduce the fib6_metrics_release() helper and use it to release the
metrics.
Fixes: d4ead6b34b ("net/ipv6: move metrics from dst to rt6_info")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When loading module manually, after call xenbus_switch_state to initializes
the state of the netfront device, the driver state did not change so fast
that may lead no dev created in latest kernel. This patch adds wait to make
sure xenbus knows the driver is not in closed/unknown state.
Current state:
[vm]# ethtool eth0
Settings for eth0:
Link detected: yes
[vm]# modprobe -r xen_netfront
[vm]# modprobe xen_netfront
[vm]# ethtool eth0
Settings for eth0:
Cannot get device settings: No such device
Cannot get wake-on-lan settings: No such device
Cannot get message level: No such device
Cannot get link status: No such device
No data available
With the patch installed.
[vm]# ethtool eth0
Settings for eth0:
Link detected: yes
[vm]# modprobe -r xen_netfront
[vm]# modprobe xen_netfront
[vm]# ethtool eth0
Settings for eth0:
Link detected: yes
Signed-off-by: Xiao Liang <xiliang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The UAPI file byteorder/little_endian.h uses the __always_inline define
without including the header where it is defined, linux/stddef.h, this
ends up working in all the other distros because that file gets included
seemingly by luck from one of the files included from little_endian.h.
But not on Alpine:edge, that fails for all files where perf_event.h is
included but linux/stddef.h isn't include before that.
Adding the missing linux/stddef.h file where it breaks on Alpine:edge to
fix that, in all other distros, that is just a very small header anyway.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-9r1pifftxvuxms8l7ir73p5l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To cope with the changes in:
12c89130a5 ("x86/asm/memcpy_mcsafe: Add write-protection-fault handling")
60622d6822 ("x86/asm/memcpy_mcsafe: Return bytes remaining")
bd131544aa ("x86/asm/memcpy_mcsafe: Add labels for __memcpy_mcsafe() write fault handling")
da7bc9c57e ("x86/asm/memcpy_mcsafe: Remove loop unrolling")
This needed introducing a file with a copy of the mcsafe_handle_tail()
function, that is used in the new memcpy_64.S file, as well as a dummy
mcsafe_test.h header.
Testing it:
$ nm ~/bin/perf | grep mcsafe
0000000000484130 T mcsafe_handle_tail
0000000000484300 T __memcpy_mcsafe
$
$ perf bench mem memcpy
# Running 'mem/memcpy' benchmark:
# function 'default' (Default memcpy() provided by glibc)
# Copying 1MB bytes ...
44.389205 GB/sec
# function 'x86-64-unrolled' (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB bytes ...
22.710756 GB/sec
# function 'x86-64-movsq' (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB bytes ...
42.459239 GB/sec
# function 'x86-64-movsb' (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB bytes ...
42.459239 GB/sec
$
This silences this perf tools build warning:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-igdpciheradk3gb3qqal52d0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kernel panic when with high memory pressure, calltrace looks like,
PID: 21439 TASK: ffff881be3afedd0 CPU: 16 COMMAND: "java"
#0 [ffff881ec7ed7630] machine_kexec at ffffffff81059beb
#1 [ffff881ec7ed7690] __crash_kexec at ffffffff81105942
#2 [ffff881ec7ed7760] crash_kexec at ffffffff81105a30
#3 [ffff881ec7ed7778] oops_end at ffffffff816902c8
#4 [ffff881ec7ed77a0] no_context at ffffffff8167ff46
#5 [ffff881ec7ed77f0] __bad_area_nosemaphore at ffffffff8167ffdc
#6 [ffff881ec7ed7838] __node_set at ffffffff81680300
#7 [ffff881ec7ed7860] __do_page_fault at ffffffff8169320f
#8 [ffff881ec7ed78c0] do_page_fault at ffffffff816932b5
#9 [ffff881ec7ed78f0] page_fault at ffffffff8168f4c8
[exception RIP: _raw_spin_lock_irqsave+47]
RIP: ffffffff8168edef RSP: ffff881ec7ed79a8 RFLAGS: 00010046
RAX: 0000000000000246 RBX: ffffea0019740d00 RCX: ffff881ec7ed7fd8
RDX: 0000000000020000 RSI: 0000000000000016 RDI: 0000000000000008
RBP: ffff881ec7ed79a8 R8: 0000000000000246 R9: 000000000001a098
R10: ffff88107ffda000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000008 R14: ffff881ec7ed7a80 R15: ffff881be3afedd0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
It happens in the pagefault and results in double pagefault
during compacting pages when memory allocation fails.
Analysed the vmcore, the page leads to second pagefault is corrupted
with _mapcount=-256, but private=0.
It's caused by the race between migration and ballooning, and lock
missing in virtballoon_migratepage() of virtio_balloon driver.
This patch fix the bug.
Fixes: e22504296d ("virtio_balloon: introduce migration primitives to balloon pages")
Cc: stable@vger.kernel.org
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Huang Chong <huang.chong@zte.com.cn>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The VSP uses a lock to protect the BRU and BRS assignment when
configuring pipelines. The lock is taken in vsp1_du_atomic_begin() and
released in vsp1_du_atomic_flush(), as well as taken and released in
vsp1_du_setup_lif(). This guards against multiple pipelines trying to
assign the same BRU and BRS at the same time.
The DRM framework calls the .atomic_begin() operations in a loop over
all CRTCs included in an atomic commit. On a VSPDL (the only VSP type
where this matters), a single VSP instance handles two CRTCs, with a
single lock. This results in a deadlock when the .atomic_begin()
operation is called on the second CRTC.
The DRM framework serializes atomic commits that affect the same CRTCs,
but doesn't know about two CRTCs sharing the same VSPDL. Two commits
affecting the VSPDL LIF0 and LIF1 respectively can thus race each other,
hence the need for a lock.
This could be fixed on the DRM side by forcing serialization of commits
affecting CRTCs backed by the same VSPDL, but that would negatively
affect performances, as the locking is only needed when the BRU and BRS
need to be reassigned, which is an uncommon case.
The lock protects the whole .atomic_begin() to .atomic_flush() sequence.
The only operation that can occur in-between is vsp1_du_atomic_update(),
which doesn't touch the BRU and BRS, and thus doesn't need to be
protected by the lock. We can thus only take the lock around the
pipeline setup calls in vsp1_du_atomic_flush(), which fixes the
deadlock.
Fixes: f81f9adc4e ("media: v4l: vsp1: Assign BRU and BRS to pipelines dynamically")
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
The repeat period is read from a static array. If a keydown event is
reported from bpf with a high protocol number, we read out of bounds. This
is unlikely to end up with a reasonable repeat period at the best of times,
in which case no timely key up event is generated.
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
When building the kernel as Thumb-2 with binutils 2.29 or newer, if the
assembler has seen the .type directive (via ENDPROC()) for a symbol, it
automatically handles the setting of the lowest bit when the symbol is
used with ADR. The badr macro on the other hand handles this lowest bit
manually. This leads to a jump to a wrong address in the wrong state
in the syscall return path:
Internal error: Oops - undefined instruction: 0 [#2] SMP THUMB2
Modules linked in:
CPU: 0 PID: 652 Comm: modprobe Tainted: G D 4.18.0-rc3+ #8
PC is at ret_fast_syscall+0x4/0x62
LR is at sys_brk+0x109/0x128
pc : [<80101004>] lr : [<801c8a35>] psr: 60000013
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 50c5387d Table: 9e82006a DAC: 00000051
Process modprobe (pid: 652, stack limit = 0x(ptrval))
80101000 <ret_fast_syscall>:
80101000: b672 cpsid i
80101002: f8d9 2008 ldr.w r2, [r9, #8]
80101006: f1b2 4ffe cmp.w r2, #2130706432 ; 0x7f000000
80101184 <local_restart>:
80101184: f8d9 a000 ldr.w sl, [r9]
80101188: e92d 0030 stmdb sp!, {r4, r5}
8010118c: f01a 0ff0 tst.w sl, #240 ; 0xf0
80101190: d117 bne.n 801011c2 <__sys_trace>
80101192: 46ba mov sl, r7
80101194: f5ba 7fc8 cmp.w sl, #400 ; 0x190
80101198: bf28 it cs
8010119a: f04f 0a00 movcs.w sl, #0
8010119e: f3af 8014 nop.w {20}
801011a2: f2af 1ea2 subw lr, pc, #418 ; 0x1a2
To fix this, add a new symbol name which doesn't have ENDPROC used on it
and use that with badr. We can't remove the badr usage since that would
would cause breakage with older binutils.
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
ems_usb_probe() allocates memory for dev->tx_msg_buffer, but there
is no its deallocation in ems_usb_disconnect().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The meter code would create an entry for each new meter. However, it
would not set the meter id in the new entry, so every meter would appear
to have a meter id of zero. This commit properly sets the meter id when
adding the entry.
Fixes: 96fbc13d7e ("openvswitch: Add meter infrastructure")
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Cc: Andy Zhou <azhou@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ext4 fixes from Ted Ts'o:
"Some miscellaneous ext4 fixes for 4.18; one fix is for a regression
introduced in 4.18-rc4.
Sorry for the late-breaking pull. I was originally going to wait for
the next merge window, but Eric Whitney found a regression introduced
in 4.18-rc4, so I decided to push out the regression plus the other
fixes now. (The other commits have been baking in linux-next since
early July)"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix check to prevent initializing reserved inodes
ext4: check for allocation block validity with block group locked
ext4: fix inline data updates with checksums enabled
ext4: clear mmp sequence number when remounting read-only
ext4: fix false negatives *and* false positives in ext4_check_descriptors()
Anatoly Trosinenko reports that a corrupted squashfs image can cause a
kernel oops. It turns out that squashfs can end up being confused about
negative fragment lengths.
The regular squashfs_read_data() does check for negative lengths, but
squashfs_read_metadata() did not, and the fragment size code just
blindly trusted the on-disk value. Fix both the fragment parsing and
the metadata reading code.
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 8844618d8a: "ext4: only look at the bg_flags field if it is
valid" will complain if block group zero does not have the
EXT4_BG_INODE_ZEROED flag set. Unfortunately, this is not correct,
since a freshly created file system has this flag cleared. It gets
almost immediately after the file system is mounted read-write --- but
the following somewhat unlikely sequence will end up triggering a
false positive report of a corrupted file system:
mkfs.ext4 /dev/vdc
mount -o ro /dev/vdc /vdc
mount -o remount,rw /dev/vdc
Instead, when initializing the inode table for block group zero, test
to make sure that itable_unused count is not too large, since that is
the case that will result in some or all of the reserved inodes
getting cleared.
This fixes the failures reported by Eric Whiteney when running
generic/230 and generic/231 in the the nojournal test case.
Fixes: 8844618d8a ("ext4: only look at the bg_flags field if it is valid")
Reported-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
As for today STMMAC_ALIGN macro (which is used to align DMA stuff)
relies on L1 line length (L1_CACHE_BYTES).
This isn't correct in case of system with several cache levels
which might have L1 cache line length smaller than L2 line. This
can lead to sharing one cache line between DMA buffer and other
data, so we can lose this data while invalidate DMA buffer before
DMA transaction.
Fix that by using SMP_CACHE_BYTES instead of L1_CACHE_BYTES for
aligning.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull turbostat utility fixes for 4.18 from Len Brown:
"Three of them are for regressions since Linux-4.17"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 18.07.27
tools/power turbostat: Read extended processor family from CPUID
tools/power turbostat: Fix logical node enumeration to allow for non-sequential physical nodes
tools/power turbostat: fix x2apic debug message output file
tools/power turbostat: fix bogus summary values
tools/power turbostat: fix -S on UP systems
tools/power turbostat: Update turbostat(8) RAPL throttling column description
Previous change in the AML parser code blindly set all non-successful
dispatcher statuses to AE_OK. That approach is incorrect, though,
because successful control method invocations from module-level
return AE_CTRL_TRANSFER. Overwriting AE_OK to this status causes the
AML parser to think that there was no return value from the control
method invocation.
Fixes: 92c0f4af386 (ACPICA: AML Parser: ignore dispatcher error status during table load)
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
For some very small BDPs (with just a few packets) there was a
quantization effect where the target number of packets in flight
during the super-unity-gain (1.25x) phase of gain cycling was
implicitly truncated to a number of packets no larger than the normal
unity-gain (1.0x) phase of gain cycling. This meant that in multi-flow
scenarios some flows could get stuck with a lower bandwidth, because
they did not push enough packets inflight to discover that there was
more bandwidth available. This was really only an issue in multi-flow
LAN scenarios, where RTTs and BDPs are low enough for this to be an
issue.
This fix ensures that gain cycling can raise inflight for small BDPs
by ensuring that in PROBE_BW mode target inflight values with a
super-unity gain are always greater than inflight values with a gain
<= 1. Importantly, this applies whether the inflight value is
calculated for use as a cwnd value, or as a target inflight value for
the end of the super-unity phase in bbr_is_next_cycle_phase() (both
need to be bigger to ensure we can probe with more packets in flight
reliably).
This is a candidate fix for stable releases.
Fixes: 0f8782ea14 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Cline says:
====================
net: socket: Fix potential spectre v1 gadgets
This fixes a pair of potential spectre v1 gadgets.
Note that because the speculation window is large, the policy is to stop
the speculative out-of-bounds load and not worry if the attack can be
completed with a dependent load or store[0].
[0] https://marc.info/?l=linux-kernel&m=152449131114778
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
'call' is a user-controlled value, so sanitize the array index after the
bounds check to avoid speculating past the bounds of the 'nargs' array.
Found with the help of Smatch:
net/socket.c:2508 __do_sys_socketcall() warn: potential spectre issue
'nargs' [r] (local cap)
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2018-07-28
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) API fixes for libbpf's BTF mapping of map key/value types in order
to make them compatible with iproute2's BPF_ANNOTATE_KV_PAIR()
markings, from Martin.
2) Fix AF_XDP to not report POLLIN prematurely by using the non-cached
consumer pointer of the RX queue, from Björn.
3) Fix __xdp_return() to check for NULL pointer after the rhashtable
lookup that retrieves the allocator object, from Taehee.
4) Fix x86-32 JIT to adjust ebp register in prologue and epilogue
by 4 bytes which got removed from overall stack usage, from Wang.
5) Fix bpf_skb_load_bytes_relative() length check to use actual
packet length, from Daniel.
6) Fix uninitialized return code in libbpf bpf_perf_event_read_simple()
handler, from Thomas.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull random fixes from Ted Ts'o:
"In reaction to the fixes to address CVE-2018-1108, some Linux
distributions that have certain systemd versions in some cases
combined with patches to libcrypt for FIPS/FEDRAMP compliance, have
led to boot-time stalls for some hardware.
The reaction by some distros and Linux sysadmins has been to install
packages that try to do complicated things with the CPU and hope that
leads to randomness.
To mitigate this, if RDRAND is available, mix it into entropy provided
by userspace. It won't hurt, and it will probably help"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: mix rdrand with entropy sent in from userspace
mdio_mux_iproc_probe() uses platform_set_drvdata() to store md pointer
in device, whereas mdio_mux_iproc_remove() restores md pointer by
dev_get_platdata(&pdev->dev). This leads to wrong resources release.
The patch replaces getter to platform_get_drvdata.
Fixes: 98bc865a1e ("net: mdio-mux: Add MDIO mux driver for iProc SoCs")
Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove BUG_ON() from fib_compute_spec_dst routine and check
in_dev pointer during flowi4 data structure initialization.
fib_compute_spec_dst routine can be run concurrently with device removal
where ip_ptr net_device pointer is set to NULL. This can happen
if userspace enables pkt info on UDP rx socket and the device
is removed while traffic is flowing
Fixes: 35ebf65e85 ("ipv4: Create and use fib_compute_spec_dst() helper")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When driver gets notification for mtu change, driver does not handle it for
all RQs. It handles only RQ[0].
Fix is to use enic_change_mtu() interface to change mtu for vf.
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull GPIO fixes from Linus Walleij:
"Just a smallish OF fix and a driver fix:
- OF flag fix for special regulator flags
- fix up the Uniphier IRQ callback"
* tag 'gpio-v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: uniphier: set legitimate irq trigger type in .to_irq hook
gpio: of: Handle fixed regulator flags properly
As long the bh tasklet isn't scheduled once, no packet from the rx path
will be handled. Since the tx path also schedule the same tasklet
this situation only persits until the first packet transmission.
So fix this issue by scheduling the tasklet after link reset.
Link: https://github.com/raspberrypi/linux/issues/2617
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet")
Suggested-by: Floris Bos <bos@je-eigen-domein.nl>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Function nfp_flower_repr_get_type_and_port expects an enum nfp_repr_type
return value but, if the repr type is unknown, returns a value of type
enum nfp_flower_cmsg_port_type. This means that if FW encodes the port
ID in a way the driver does not understand instead of dropping the frame
driver may attribute it to a physical port (uplink) provided the port
number is less than physical port count.
Fix this and ensure a net_device of NULL is returned if the repr can not
be determined.
Fixes: 1025351a88 ("nfp: add flower app")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MIPS fix from Paul Burton:
"Here's one more MIPS fix, reverting an errata workaround that was
merged for v4.18-rc2 but has since been found to cause system hangs on
some BCM4718A1-based systems by the OpenWRT project"
* tag 'mips_fixes_4.18_5' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
Revert "MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum"
The len > skb_headlen(skb) cannot be used as a maximum upper bound
for the packet length since it does not have any relation to the full
linear packet length when filtering is used from upper layers (e.g.
in case of reuseport BPF programs) as by then skb->data, skb->len
already got mangled through __skb_pull() and others.
Fixes: 4e1ec56cdc ("bpf: add skb_load_bytes_relative helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
In linux-next tree compiling the perf tool with additional make flags
EXTRA_CFLAGS="-Wp,-D_FORTIFY_SOURCE=2 -O2" causes a compiler error.
It is the warning 'variable may be used uninitialized' which is treated
as error: I compile it using a FEDORA 28 installation, my gcc compiler
version: gcc (GCC) 8.0.1 20180324 (Red Hat 8.0.1-0.20). The file that
causes the error is tools/lib/bpf/libbpf.c.
[root@p23lp27] # make V=1 EXTRA_CFLAGS="-Wp,-D_FORTIFY_SOURCE=2 -O2"
[...]
Makefile.config:849: No openjdk development package found, please
install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel
Warning: Kernel ABI header at 'tools/include/uapi/linux/if_link.h'
differs from latest version at 'include/uapi/linux/if_link.h'
CC libbpf.o
libbpf.c: In function ‘bpf_perf_event_read_simple’:
libbpf.c:2342:6: error: ‘ret’ may be used uninitialized in this
function [-Werror=maybe-uninitialized]
int ret;
^
cc1: all warnings being treated as errors
mv: cannot stat './.libbpf.o.tmp': No such file or directory
/home6/tmricht/linux-next/tools/build/Makefile.build:96: recipe for target 'libbpf.o' failed
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Pull i2c fixes from Wolfram Sang:
"Some driver bugfixes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx: use open drain for recovery GPIO
i2c: rcar: handle RXDMA HW behaviour on Gen3
i2c: imx: Fix reinit_completion() use
i2c: davinci: Avoid zero value of CLKH
As for today we don't setup SMP_CACHE_BYTES and cache_line_size for
ARC, so they are set to L1_CACHE_BYTES by default. L1 line length
(L1_CACHE_BYTES) might be easily smaller than L2 line (which is
usually the case BTW). This breaks code.
For example this breaks ethernet infrastructure on HSDK/AXS103 boards
with IOC disabled, involving manual cache flushes
Functions which alloc and manage sk_buff packet data area rely on
SMP_CACHE_BYTES define. In the result we can share last L2 cache
line in sk_buff linear packet data area between DMA buffer and
some useful data in other structure. So we can lose this data when
we invalidate DMA buffer.
sk_buff linear packet data area
|
|
| skb->end skb->tail
V | |
V V
----------------------------------------------.
packet data | <tail padding> | <useful data in other struct>
----------------------------------------------.
---------------------.--------------------------------------------------.
SLC line | SLC (L2 cache) line (128B) |
---------------------.--------------------------------------------------.
^ ^
| |
These cache lines will be invalidated when we invalidate skb
linear packet data area before DMA transaction starting.
This leads to issues painful to debug as it reproduces only if
(sk_buff->end - sk_buff->tail) < SLC_LINE_SIZE and
if we have some useful data right after sk_buff->end.
Fix that by hardcode SMP_CACHE_BYTES to max line length we may have.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC backend for dma_sync_single_for_(device|cpu) was broken as it was
not honoring the @dir argument and simply forcing it based on the call:
- arc_dma_sync_single_for_device(dir) assumed DMA_TO_DEVICE (cache wback)
- arc_dma_sync_single_for_cpu(dir) assumed DMA_FROM_DEVICE (cache inv)
This is not true given the DMA API programming model and has been
discussed here [1] in some detail.
Interestingly while the deficiency has been there forever, it only started
showing up after 4.17 dma common ops rework, commit a8eb92d02d
("arc: fix arc_dma_{map,unmap}_page") which wired up these calls under the
more commonly used dma_map_page API triggering the issue.
[1]: https://lkml.org/lkml/2018/5/18/979
Fixes: commit a8eb92d02d ("arc: fix arc_dma_{map,unmap}_page")
Cc: stable@kernel.org # v4.17+
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: reworked changelog]
Pull block fixes from Jens Axboe:
"Bigger than usual at this time, mostly due to the O_DIRECT corruption
issue and the fact that I was on vacation last week. This contains:
- NVMe pull request with two fixes for the FC code, and two target
fixes (Christoph)
- a DIF bio reset iteration fix (Greg Edwards)
- two nbd reply and requeue fixes (Josef)
- SCSI timeout fixup (Keith)
- a small series that fixes an issue with bio_iov_iter_get_pages(),
which ended up causing corruption for larger sized O_DIRECT writes
that ended up racing with buffered writes (Martin Wilck)"
* tag 'for-linus-20180727' of git://git.kernel.dk/linux-block:
block: reset bi_iter.bi_done after splitting bio
block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs
blkdev: __blkdev_direct_IO_simple: fix leak in error case
block: bio_iov_iter_get_pages: fix size of last iovec
nvmet: only check for filebacking on -ENOTBLK
nvmet: fixup crash on NULL device path
scsi: set timed out out mq requests to complete
blk-mq: export setting request completion state
nvme: if_ready checks to fail io to deleting controller
nvmet-fc: fix target sgl list on large transfers
nbd: handle unexpected replies better
nbd: don't requeue the same request twice.
Merge misc fixes from Andrew Morton:
"11 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kvm, mm: account shadow page tables to kmemcg
zswap: re-check zswap_is_full() after do zswap_shrink()
include/linux/eventfd.h: include linux/errno.h
mm: fix vma_is_anonymous() false-positives
mm: use vma_init() to initialize VMAs on stack and data segments
mm: introduce vma_init()
mm: fix exports that inadvertently make put_page() EXPORT_SYMBOL_GPL
ipc/sem.c: prevent queue.status tearing in semop
mm: disallow mappings that conflict for devm_memremap_pages()
kasan: only select SLUB_DEBUG with SYSFS=y
delayacct: fix crash in delayacct_blkio_end() after delayacct init failure
Pull PCI fix from Bjorn Helgaas:
"Fix a use-after-free error in fatal error recovery (Thomas Tai)"
* tag 'pci-v4.18-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/AER: Work around use-after-free in pcie_do_fatal_recovery()
Pull arm64 fixes from Will Deacon:
"Inevitably, after saying that I hoped we would be done on the fixes
front, a couple of issues have cropped up over the last week. Next
time I'll stay schtum.
We've fixed an over-eager BUILD_BUG_ON() which Arnd ran into with
arndconfig, as well as ensuring that KPTI really is disabled on
Thunder-X1, where the cure is worse than the disease (this regressed
when we reworked the heterogeneous CPU feature checking).
Summary:
- Fix disabling of kpti on Thunder-X machines
- Fix premature BUILD_BUG_ON() found with randconfig"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: fix vmemmap BUILD_BUG_ON() triggering on !vmemmap setups
arm64: Check for errata before evaluating cpu features
This reverts commit 2a027b47db ("MIPS: BCM47XX: Enable 74K Core
ExternalSync for PCIe erratum").
Enabling ExternalSync caused a regression for BCM4718A1 (used e.g. in
Netgear E3000 and ASUS RT-N16): it simply hangs during PCIe
initialization. It's likely that BCM4717A1 is also affected.
I didn't notice that earlier as the only BCM47XX devices with PCIe I
own are:
1) BCM4706 with 2 x 14e4:4331
2) BCM4706 with 14e4:4360 and 14e4:4331
it appears that BCM4706 is unaffected.
While BCM5300X-ES300-RDS.pdf seems to document that erratum and its
workarounds (according to quotes provided by Tokunori) it seems not even
Broadcom follows them.
According to the provided info Broadcom should define CONF7_ES in their
SDK's mipsinc.h and implement workaround in the si_mips_init(). Checking
both didn't reveal such code. It *could* mean Broadcom also had some
problems with the given workaround.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Michael Marley <michael@michaelmarley.com>
Patchwork: https://patchwork.linux-mips.org/patch/20032/
URL: https://bugs.openwrt.org/index.php?do=details&task_id=1688
Cc: Tokunori Ikegami <ikegami@allied-telesis.co.jp>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Pull drm fixes from Dave Airlie:
"Not much happening this week which is good: two imx display fixes and
one i915 quirk addition"
* tag 'drm-fixes-2018-07-27' of git://anongit.freedesktop.org/drm/drm:
drm/i915/glk: Add Quirk for GLK NUC HDMI port issues.
gpu: ipu-csi: Check for field type alternate
drm/imx: imx-ldb: check if channel is enabled before printing warning
drm/imx: imx-ldb: disable LDB on driver bind
This fixes the reported family on modern AMD processors (e.g. Ryzen,
which is family 0x17). Previously these processors all showed up as
family 0xf.
See the document
https://support.amd.com/TechDocs/56255_OSRR.pdf
section CPUID_Fn00000001_EAX for how to calculate the family
from the BaseFamily and ExtFamily values.
This matches the code in arch/x86/lib/cpu.c
Signed-off-by: Calvin Walton <calvin.walton@kepstin.ca>
Signed-off-by: Len Brown <len.brown@intel.com>
Pull input fixes from Dmitry Torokhov:
- a couple of new device IDs added to Elan i2c touchpad controller
driver
- another entry in i8042 reset quirk list
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: i8042 - add Lenovo LaVie Z to the i8042 reset list
Input: elan_i2c - add another ACPI ID for Lenovo Ideapad 330-15AST
MAINTAINERS: Add file patterns for serio device tree bindings
Input: elan_i2c - add ACPI ID for lenovo ideapad 330
Pull tracing fixes from Steven Rostedt:
"Various fixes to the tracing infrastructure:
- Fix double free when the reg() call fails in
event_trigger_callback()
- Fix anomoly of snapshot causing tracing_on flag to change
- Add selftest to test snapshot and tracing_on affecting each other
- Fix setting of tracepoint flag on error that prevents probes from
being deleted.
- Fix another possible double free that is similar to
event_trigger_callback()
- Quiet a gcc warning of a false positive unused variable
- Fix crash of partial exposed task->comm to trace events"
* tag 'trace-v4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
kthread, tracing: Don't expose half-written comm when creating kthreads
tracing: Quiet gcc warning about maybe unused link variable
tracing: Fix possible double free in event_enable_trigger_func()
tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure
selftests/ftrace: Add snapshot and tracing_on test case
ring_buffer: tracing: Inherit the tracing setting to next ring buffer
tracing: Fix double free of event_trigger_data
Pull xfs fixes from Darrick Wong:
- Fix some uninitialized variable errors
- Fix an incorrect check in metadata verifiers
* tag 'xfs-4.18-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: properly handle free inodes in extent hint validators
xfs: Initialize variables in xfs_alloc_get_rec before using them
Steffen Klassert says:
====================
pull request (net): ipsec 2018-07-27
1) Fix PMTU handling of vti6. We update the PMTU on
the xfrm dst_entry which is not cached anymore
after the flowchache removal. So update the
PMTU of the original dst_entry instead.
From Eyal Birger.
2) Fix a leak of kernel memory to userspace.
From Eric Dumazet.
3) Fix a possible dst_entry memleak in xfrm_lookup_route.
From Tommi Rantala.
4) Fix a skb leak in case we can't call nlmsg_multicast
from xfrm_nlmsg_multicast. From Florian Westphal.
5) Fix a leak of a temporary buffer in the error path of
esp6_input. From Zhen Lei.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After the bio has been updated to represent the remaining sectors, reset
bi_done so bio_rewind_iter() does not rewind further than it should.
This resolves a bio_integrity_process() failure on reads where the
original request was split.
Fixes: 63573e359d ("bio-integrity: Restore original iterator on verify stage")
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit a09c591306 (ACPI / LPSS: Avoid PM quirks on suspend and
resume from S3) modified the ACPI driver for Intel SoCs (LPSS) to
avoid applying PM quirks on suspend and resume from S3 to address
system-wide suspend and resume problems on some systems, but it is
reported that the same issue also affects hibernation, so extend
the approach used by that commit to cover hibernation as well.
Fixes: a09c591306 (ACPI / LPSS: Avoid PM quirks on suspend and resume from S3)
Link: https://bugs.launchpad.net/bugs/1774950
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
UBSAN triggers the following undefined behaviour warnings:
[...]
[ 13.236124] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:468:22
[ 13.240043] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]
[ 13.744769] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:373:4
[ 13.748694] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]
When splitting the address to high and low, GENMASK_ULL is used to generate
a bitmask with dma_addr_bits field from io_sq (in ena_com_prepare_tx and
ena_com_add_single_rx_desc).
The problem is that dma_addr_bits is not initialized with a proper value
(besides being cleared in ena_com_create_io_queue).
Assign dma_addr_bits the correct value that is stored in ena_dev when
initializing the SQ.
Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Gal Pressman <pressmangal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
/sys/../zswap/stored_pages keeps rising in a zswap test with
"zswap.max_pool_percent=0" parameter. But it should not compress or
store pages any more since there is no space in the compressed pool.
Reproduce steps:
1. Boot kernel with "zswap.enabled=1"
2. Set the max_pool_percent to 0
# echo 0 > /sys/module/zswap/parameters/max_pool_percent
3. Do memory stress test to see if some pages have been compressed
# stress --vm 1 --vm-bytes $mem_available"M" --timeout 60s
4. Watching the 'stored_pages' number increasing or not
The root cause is:
When zswap_max_pool_percent is set to 0 via kernel parameter,
zswap_is_full() will always return true due to zswap_shrink(). But if
the shinking is able to reclain a page successfully the code then
proceeds to compressing/storing another page, so the value of
stored_pages will keep changing.
To solve the issue, this patch adds a zswap_is_full() check again after
zswap_shrink() to make sure it's now under the max_pool_percent, and to
not compress/store if we reached the limit.
Link: http://lkml.kernel.org/r/20180530103936.17812-1-liwang@redhat.com
Signed-off-by: Li Wang <liwang@redhat.com>
Acked-by: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Huang Ying <huang.ying.caritas@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The new gasket staging driver ran into a randconfig build failure when
CONFIG_EVENTFD is disabled:
In file included from drivers/staging/gasket/gasket_interrupt.h:11,
from drivers/staging/gasket/gasket_interrupt.c:4:
include/linux/eventfd.h: In function 'eventfd_ctx_fdget':
include/linux/eventfd.h:51:9: error: implicit declaration of function 'ERR_PTR' [-Werror=implicit-function-declaration]
I can't see anything wrong with including eventfd.h before err.h, so the
easiest fix is to make it possible to do this by including the file
where it is needed.
Link: http://lkml.kernel.org/r/20180724110737.3985088-1-arnd@arndb.de
Fixes: 9a69f5087c ("drivers/staging: Gasket driver framework + Apex driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When pmem namespaces created are smaller than section size, this can
cause an issue during removal and gpf was observed:
general protection fault: 0000 1 SMP PTI
CPU: 36 PID: 3941 Comm: ndctl Tainted: G W 4.14.28-1.el7uek.x86_64 #2
task: ffff88acda150000 task.stack: ffffc900233a4000
RIP: 0010:__put_page+0x56/0x79
Call Trace:
devm_memremap_pages_release+0x155/0x23a
release_nodes+0x21e/0x260
devres_release_all+0x3c/0x48
device_release_driver_internal+0x15c/0x207
device_release_driver+0x12/0x14
unbind_store+0xba/0xd8
drv_attr_store+0x27/0x31
sysfs_kf_write+0x3f/0x46
kernfs_fop_write+0x10f/0x18b
__vfs_write+0x3a/0x16d
vfs_write+0xb2/0x1a1
SyS_write+0x55/0xb9
do_syscall_64+0x79/0x1ae
entry_SYSCALL_64_after_hwframe+0x3d/0x0
Add code to check whether we have a mapping already in the same section
and prevent additional mappings from being created if that is the case.
Link: http://lkml.kernel.org/r/152909478401.50143.312364396244072931.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Building with KASAN and SLUB but without sysfs now results in a
build-time error:
WARNING: unmet direct dependencies detected for SLUB_DEBUG
Depends on [n]: SLUB [=y] && SYSFS [=n]
Selected by [y]:
- KASAN [=y] && HAVE_ARCH_KASAN [=y] && (SLUB [=y] || SLAB [=n] && !DEBUG_SLAB [=n]) && SLUB [=y]
mm/slub.c:4565:12: error: 'list_locations' defined but not used [-Werror=unused-function]
static int list_locations(struct kmem_cache *s, char *buf,
^~~~~~~~~~~~~~
mm/slub.c:4406:13: error: 'validate_slab_cache' defined but not used [-Werror=unused-function]
static long validate_slab_cache(struct kmem_cache *s)
This disallows that broken configuration in Kconfig.
Link: http://lkml.kernel.org/r/20180709154019.1693026-1-arnd@arndb.de
Fixes: dd275caf4a ("kasan: depend on CONFIG_SLUB_DEBUG")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While forking, if delayacct init fails due to memory shortage, it
continues expecting all delayacct users to check task->delays pointer
against NULL before dereferencing it, which all of them used to do.
Commit c96f5471ce ("delayacct: Account blkio completion on the correct
task"), while updating delayacct_blkio_end() to take the target task
instead of always using %current, made the function test NULL on
%current->delays and then continue to operated on @p->delays. If
%current succeeded init while @p didn't, it leads to the following
crash.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: __delayacct_blkio_end+0xc/0x40
PGD 8000001fd07e1067 P4D 8000001fd07e1067 PUD 1fcffbb067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 4 PID: 25774 Comm: QIOThread0 Not tainted 4.16.0-9_fbk1_rc2_1180_g6b593215b4d7 #9
RIP: 0010:__delayacct_blkio_end+0xc/0x40
Call Trace:
try_to_wake_up+0x2c0/0x600
autoremove_wake_function+0xe/0x30
__wake_up_common+0x74/0x120
wake_up_page_bit+0x9c/0xe0
mpage_end_io+0x27/0x70
blk_update_request+0x78/0x2c0
scsi_end_request+0x2c/0x1e0
scsi_io_completion+0x20b/0x5f0
blk_mq_complete_request+0xa2/0x100
ata_scsi_qc_complete+0x79/0x400
ata_qc_complete_multiple+0x86/0xd0
ahci_handle_port_interrupt+0xc9/0x5c0
ahci_handle_port_intr+0x54/0xb0
ahci_single_level_irq_intr+0x3b/0x60
__handle_irq_event_percpu+0x43/0x190
handle_irq_event_percpu+0x20/0x50
handle_irq_event+0x2a/0x50
handle_edge_irq+0x80/0x1c0
handle_irq+0xaf/0x120
do_IRQ+0x41/0xc0
common_interrupt+0xf/0xf
Fix it by updating delayacct_blkio_end() check @p->delays instead.
Link: http://lkml.kernel.org/r/20180724175542.GP1934745@devbig577.frc2.facebook.com
Fixes: c96f5471ce ("delayacct: Account blkio completion on the correct task")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dave Jones <dsj@fb.com>
Debugged-by: Dave Jones <dsj@fb.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Josh Snyder <joshs@netflix.com>
Cc: <stable@vger.kernel.org> [4.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drm/imx: imx-drm ldb and ipu-v3 csi fixes
- Disable the LVDS Display Bridge (LDB) on driver bind. This is
necessary to guarantee correct LVDS signals in case the bootloader
left the LVDS output active.
- Remove false positive warning about disabled second LVDS channel in
dual-channel mode. In this mode, the second LVDS channel can not be
used separately. If the second channel is correctly described as
disabled in the device tree, the driver warned about this anyway.
- Fix the CSI confiuration to not only enable interlaced capture mode
for V4L2_FIELD_SEQ_BT and V4L2_FIELD_SEQ_TB, but also for the
V4L2_FIELD_ALTERNATE interlacing mode. Before, it incorrectly tried
to capture progressive frames in that case.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1532100423.3438.8.camel@pengutronix.de
The current map_check_btf() in BPF_MAP_TYPE_ARRAY rejects
'> map->value_size' to ensure map_seq_show_elem() will not
access things beyond an array element.
Yonghong suggested that using '!=' is a more correct
check. The 8 bytes round_up on value_size is stored
in array->elem_size. Hence, using '!=' on map->value_size
is a proper check.
This patch also adds new tests to check the btf array
key type and value type. Two of these new tests verify
the btf's value_size (the change in this patch).
It also fixes two existing tests that wrongly encoded
a btf's type size (pprint_test) and the value_type_id (in one
of the raw_tests[]). However, that do not affect these two
BTF verification tests before or after this test changes.
These two tests mainly failed at array creation time after
this patch.
Fixes: a26ca7c982 ("bpf: btf: Add pretty print support to the basic arraymap")
Suggested-by: Yonghong Song <yhs@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
rhashtable_lookup() can return NULL. so that NULL pointer
check routine should be added.
Fixes: 02b55e5657 ("xdp: add MEM_TYPE_ZERO_COPY")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fix dev_change_tx_queue_len so it rolls back original value
upon a failure in dev_qdisc_change_tx_queue_len.
This is already done for notifirers' failures, share the code.
In case of failure in dev_qdisc_change_tx_queue_len, some tx queues
would still be of the new length, while they should be reverted.
Currently, the revert is not done, and is marked with a TODO label
in dev_qdisc_change_tx_queue_len, and should find some nice solution
to do it.
Yet it is still better to not apply the newly requested value.
Fixes: 48bfd55e7e ("net_sched: plug in qdisc ops change_tx_queue_len")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reported-by: Ran Rozenstein <ranro@mellanox.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
turbostat fails on some multi-package topologies because the logical node
enumeration assumes that the nodes are sequentially numbered,
which causes the logical numa nodes to not be enumerated, or enumerated incorrectly.
Use a more robust enumeration algorithm which allows for non-seqential physical nodes.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This patch fixes a regression introduced in
commit 8cb48b32a5 ("tools/power turbostat: track thread ID in cpu_topology")
Turbostat uses incorrect cores number ('topo.num_cores') - its value is count
of logical CPUs, instead of count of physical cores. So it is twice as large as
it should be on a typical Intel system. For example, on a 6 core Xeon system
'topo.num_cores' is 12, and on a 52 core Xeon system 'topo.num_cores' is 104.
And interestingly, on a 68-core Knights Landing Intel system 'topo.num_cores'
is 272, because this system has 4 logical CPUs per core.
As a result, some of the turbostat calculations are incorrect. For example,
on idle 52-core Xeon system when all cores are ~99% in Core C6 (CPU%c6), the
summary (very first) line shows ~48% Core C6, while it should be ~99%.
This patch fixes the problem by fixing 'topo.num_cores' calculation.
Was:
1. Init 'thread_id' for all CPUs to -1
2. Run 'get_thread_siblings()' which sets it to 0 or 1
3. Increment 'topo.num_cores' when thread_id != -1 (bug!)
Now:
1. Init 'thread_id' for all CPUs to -1
2. Run 'get_thread_siblings()' which sets it to 0 or 1
3. Increment 'topo.num_cores' when thread_id is not 0
I did not have a chance to test this on an AMD machine, and only tested on a
couple of Intel Xeons (6 and 52 cores).
Reported-by: Vladislav Govtva <vladislav.govtva@intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
bio_iov_iter_get_pages() currently only adds pages for the next non-zero
segment from the iov_iter to the bio. That's suboptimal for callers,
which typically try to pin as many pages as fit into the bio. This patch
converts the current bio_iov_iter_get_pages() into a static helper, and
introduces a new helper that allocates as many pages as
1) fit into the bio,
2) are present in the iov_iter,
3) and can be pinned by MM.
Error is returned only if zero pages could be pinned. Because of 3), a
zero return value doesn't necessarily mean all pages have been pinned.
Callers that have to pin every page in the iov_iter must still call this
function in a loop (this is currently the case).
This change matters most for __blkdev_direct_IO_simple(), which calls
bio_iov_iter_get_pages() only once. If it obtains less pages than
requested, it returns a "short write" or "short read", and
__generic_file_write_iter() falls back to buffered writes, which may
lead to data corruption.
Fixes: 72ecad22d9 ("block: support a full bio worth of IO for simplified bdev direct-io")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If the last page of the bio is not "full", the length of the last
vector slot needs to be corrected. This slot has the index
(bio->bi_vcnt - 1), but only in bio->bi_io_vec. In the "bv" helper
array, which is shifted by the value of bio->bi_vcnt at function
invocation, the correct index is (nr_pages - 1).
v2: improved readability following suggestions from Ming Lei.
v3: followed a formatting suggestion from Christoph Hellwig.
Fixes: 2cefe4dbaa ("block: add bio_iov_iter_get_pages()")
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull NVMe fixes from Christoph:
"Two small fixes each for the FC code and the target."
* 'nvme-4.18' of git://git.infradead.org/nvme:
nvmet: only check for filebacking on -ENOTBLK
nvmet: fixup crash on NULL device path
nvme: if_ready checks to fail io to deleting controller
nvmet-fc: fix target sgl list on large transfers
When an fatal error is received by a non-bridge device, the device is
removed, and pci_stop_and_remove_bus_device() deallocates the device
structure. The freed device structure is used by subsequent code to send
uevents and print messages.
Hold a reference on the device until we're finished using it. This is not
an ideal fix because pcie_do_fatal_recovery() should not use the device at
all after removing it, but that's too big a project for right now.
Fixes: 7e9084b367 ("PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices")
Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
[bhelgaas: changelog, reduce get/put coverage]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If we enable or disable xgbe flow-control by ethtool ,
it does't work.Because the parameter is not properly
assigned,so we need to adjust the assignment order
of the parameters.
Fixes: c1ce2f7736 ("amd-xgbe: Fix flow control setting logic")
Signed-off-by: tangpengpeng <tangpengpeng@higon.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull USB fixes from Greg KH:
"Here are a number of USB fixes and new device ids for 4.18-rc7.
The largest number are a bunch of gadget driver fixes that got delayed
in being submitted earlier due to vacation schedules, but nothing
really huge is present in them. There are some new device ids and some
PHY driver fixes that were connected to some USB ones. Full details
are in the shortlog.
All have been in linux-next for a while with no reported issues"
* tag 'usb-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
usb: core: handle hub C_PORT_OVER_CURRENT condition
usb: xhci: Fix memory leak in xhci_endpoint_reset()
usb: typec: tcpm: Fix sink PDO starting index for PPS APDO selection
usb: gadget: f_fs: Only return delayed status when len is 0
usb: gadget: f_uac2: fix endianness of 'struct cntrl_*_lay3'
usb: dwc2: Fix inefficient copy of unaligned buffers
usb: dwc2: Fix DMA alignment to start at allocated boundary
usb: dwc3: rockchip: Fix PHY documentation links.
tools: usb: ffs-test: Fix build on big endian systems
usb: gadget: aspeed: Workaround memory ordering issue
usb: dwc3: gadget: remove redundant variable maxpacket
usb: dwc2: avoid NULL dereferences
usb/phy: fix PPC64 build errors in phy-fsl-usb.c
usb: dwc2: host: do not delay retries for CONTROL IN transfers
usb: gadget: u_audio: protect stream runtime fields with stream spinlock
usb: gadget: u_audio: remove cached period bytes value
usb: gadget: u_audio: remove caching of stream buffer parameters
usb: gadget: u_audio: update hw_ptr in iso_complete after data copied
usb: gadget: u_audio: fix pcm/card naming in g_audio_setup()
usb: gadget: f_uac2: fix error handling in afunc_bind (again)
...
Pull staging driver fixes from Greg KH:
"Here are three small staging driver fixes for 4.18-rc7.
One is a revert of an earlier patch that turned out to be incorrect,
one is a fix for the speakup drivers, and the last a fix for the
ks7010 driver to resolve a regression.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: speakup: fix wraparound in uaccess length check
staging: ks7010: call 'hostif_mib_set_request_int' instead of 'hostif_mib_set_request_bool'
Revert "staging:r8188eu: Use lib80211 to support TKIP"
Pull driver core fix from Greg KH:
"This is a single driver core fix for 4.18-rc7. It partially reverts a
previous commit to resolve some reported issues.
It has been in linux-next for a while now with no reported issues"
* tag 'driver-core-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Partially revert "driver core: correct device's shutdown order"
Pull ACPI fix from Rafael Wysocki:
"Fix a recent ACPICA regression causing the AML parser to get confused
and fail in some situations involving incorrect AML in an ACPI table
(Erik Schmauss)"
* tag 'acpi-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: AML Parser: ignore dispatcher error status during table load
Pull power management fix from Rafael Wysocki:
"Fix up the recently introduced cpufreq driver for Qualcomm Kryo
processors by adding a terminating NULL entry to its table of device
IDs (YueHaibing)"
* tag 'pm-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: qcom-kryo: add NULL entry to the end of_device_id array
There is a window for racing when printing directly to task->comm,
allowing other threads to see a non-terminated string. The vsnprintf
function fills the buffer, counts the truncated chars, then finally
writes the \0 at the end.
creator other
vsnprintf:
fill (not terminated)
count the rest trace_sched_waking(p):
... memcpy(comm, p->comm, TASK_COMM_LEN)
write \0
The consequences depend on how 'other' uses the string. In our case,
it was copied into the tracing system's saved cmdlines, a buffer of
adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):
crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk'
0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12"
...and a strcpy out of there would cause stack corruption:
[224761.522292] Kernel panic - not syncing: stack-protector:
Kernel stack is corrupted in: ffffff9bf9783c78
crash-arm64> kbt | grep 'comm\|trace_print_context'
#6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
comm (char [16]) = "irq/497-pwr_even"
crash-arm64> rd 0xffffffd4d0e17d14 8
ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_
ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16:
ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`..
ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`..........
The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was
likely needed because of this same bug.
Solved by vsnprintf:ing to a local buffer, then using set_task_comm().
This way, there won't be a window where comm is not terminated.
Link: http://lkml.kernel.org/r/20180726071539.188015-1-snild@sony.com
Cc: stable@vger.kernel.org
Fixes: bc0c38d139 ("ftrace: latency tracer infrastructure")
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Snild Dolkow <snild@sony.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Currently we are leaking bpf programs when they are detached from the
lirc device; the refcount never reaches zero.
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Break statements were missing for Geneve case in
ndo_udp_tunnel_{add/del}, thereby raw mac matchall
entries were not getting added.
Fixes: c746fc0e8b2d("cxgb4: add geneve offload support for T6")
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 57ea2a34ad ("tracing/kprobes: Fix trace_probe flags on
enable_trace_kprobe() failure") added an if statement that depends on another
if statement that gcc doesn't see will initialize the "link" variable and
gives the warning:
"warning: 'link' may be used uninitialized in this function"
It is really a false positive, but to quiet the warning, and also to make
sure that it never actually is used uninitialized, initialize the "link"
variable to NULL and add an if (!WARN_ON_ONCE(!link)) where the compiler
thinks it could be used uninitialized.
Cc: stable@vger.kernel.org
Fixes: 57ea2a34ad ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
There was a case that triggered a double free in event_trigger_callback()
due to the called reg() function freeing the trigger_data and then it
getting freed again by the error return by the caller. The solution there
was to up the trigger_data ref count.
Code inspection found that event_enable_trigger_func() has the same issue,
but is not as easy to trigger (requires harder to trigger failures). It
needs to be solved slightly different as it needs more to clean up when the
reg() function fails.
Link: http://lkml.kernel.org/r/20180725124008.7008e586@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: 7862ad1846 ("tracing: Add 'enable_event' and 'disable_event' event trigger commands")
Reivewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Polling for the ingress queues relies on reading the producer/consumer
pointers of the Rx queue.
Prior this commit, a cached consumer pointer could be used, instead of
the actual consumer pointer and therefore report POLLIN prematurely.
This patch makes sure that the non-cached consumer pointer is used
instead.
Reported-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Qi Zhang <qi.z.zhang@intel.com>
Fixes: c497176cb2 ("xsk: add Rx receive functions and poll support")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Commit 24dea04767 ("bpf, x32: remove ld_abs/ld_ind")
removed the 4 /* Extra space for skb_copy_bits buffer */
from _STACK_SIZE, but it didn't fix the concerned code
in emit_prologue and emit_epilogue, and this error will
bring very strange kernel runtime errors. This patch
fixes it.
Fixes: 24dea04767 ("bpf, x32: remove ld_abs/ld_ind")
Reported-by: Meelis Roos <mroos@linux.ee>
Bisected-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes the following sparse warnings:
net/ipv4/igmp.c:1391:6: warning:
symbol '__ip_mc_inc_group' was not declared. Should it be static?
Fixes: 6e2059b53f ("ipv4/igmp: init group mode as INCLUDE when join source group")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On GLK NUC platforms the HDMI retiming buffer needs additional disabled
time to correctly sync to a faster incoming signal.
When measured on a scope the highspeed lines of the HDMI clock turn off
for ~400uS during a normal resolution change. The HDMI retimer on the
GLK NUC appears to require at least a full frame of quiet time before a
new faster clock can be correctly sync'd. Wait 100ms due to msleep
inaccuracies while waiting for a completed frame. Add a quirk to the
driver for GLK boards that use ITE66317 HDMI retimers.
V2: Add more devices to the quirk list
V3: Delay increased to 100ms, check to confirm crtc type is HDMI.
V4: crtc type check extended to include _DDI and whitespace fixes
v5: Fix white spaces, remove the macro for delay. Revert the crtc type
check introduced in v4.
Cc: Imre Deak <imre.deak@intel.com>
Cc: <stable@vger.kernel.org> # v4.14+
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105887
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Tested-by: Daniel Scheller <d.scheller.oss@gmail.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710200205.1478-1-radhakrishna.sripada@intel.com
(cherry picked from commit 90c3e21987)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Otherwise interfaces get exposed under /sys/devices/virtual, which
doesn't give udev the context it needs for PCI-based predictable
interface names.
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When received packets are dropped in virtio_net driver, received packets
counter is incremented but bytes counter is not.
As a result, for instance if we drop all packets by XDP, only received
is counted and bytes stays 0, which looks inconsistent.
IMHO received packets/bytes should be counted if packets are produced by
the hypervisor, like what common NICs on physical machines are doing.
So fix the bytes counter.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
drm_atomic_helper_async_check() declares the plane, old_plane_state and
new_plane_state variables to iterate over all planes of the atomic
state and make sure only one plane is enabled.
Unfortunately gcc is not smart enough to figure out that the check on
n_planes is enough to guarantee that plane, new_plane_state and
old_plane_state are initialized.
Explicitly initialize those variables to NULL to make gcc happy.
Fixes: fef9df8b59 ("drm/atomic: initial support for asynchronous plane update")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180724133300.32023-1-boris.brezillon@bootlin.com
Pull clk fixes from Stephen Boyd:
"One more round of updates for problems seen this -rc series. Drivers
fixes are:
- Amlogic Meson audio divider fix and CPU clk critical marking
- Qualcomm multimedia GDSC marked as 'always on' to keep display
working
- Aspeed fixes for critical clks, resets causing clks to stay
disabled, and an incorrect HPLL frequency calculation
- Marvell Armada 3700 cpu clks would undervolt when switching from
low frequencies to high frequencies because the voltage didn't
stabilize in time so now we switch to an intermediate frequency
Plus we have a core framework thinko that messed up the debugfs flag
printing logic to make it not very useful"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: aspeed: Support HPLL strapping on ast2400
clk: mvebu: armada-37xx-periph: Fix switching CPU rate from 300Mhz to 1.2GHz
clk: aspeed: Mark bclk (PCIe) and dclk (VGA) as critical
clk/mmcc-msm8996: Make mmagic_bimc_gdsc ALWAYS_ON
clk: aspeed: Treat a gate in reset as disabled
clk: Really show symbolic clock flags in debugfs
clk: qcom: gcc-msm8996: Disable halt check on UFS tx clock
clk: meson: audio-divider is one based
clk: meson-gxbb: set fclk_div2 as CLK_IS_CRITICAL
Pull fscache/cachefiles fixes from David Howells:
- Allow cancelled operations to be queued so they can be cleaned up.
- Fix a refcounting bug in the monitoring of reads on backend files
whereby a race can occur between monitor objects being listed for
work, the work processing being queued and the work processor running
and destroying the monitor objects.
- Fix a ref overput in object attachment, whereby a tentatively
considered object is put in error handling without first being 'got'.
- Fix a missing clear of the CACHEFILES_OBJECT_ACTIVE flag whereby an
assertion occurs when we retry because it seems the object is now
active.
- Wait rather BUG'ing on an object collision in the depths of
cachefiles as the active object should be being cleaned up - also
depends on the one above.
* tag 'fscache-fixes-20180725' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
cachefiles: Wait rather than BUG'ing on "Unexpected object collision"
cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag
fscache: Fix reference overput in fscache_attach_object() error handling
cachefiles: Fix refcounting bug in backing-file read monitoring
fscache: Allow cancelled operations to be enqueued
Maintain the tracing on/off setting of the ring_buffer when switching
to the trace buffer snapshot.
Taking a snapshot is done by swapping the backup ring buffer
(max_tr_buffer). But since the tracing on/off setting is defined
by the ring buffer, when swapping it, the tracing on/off setting
can also be changed. This causes a strange result like below:
/sys/kernel/debug/tracing # cat tracing_on
1
/sys/kernel/debug/tracing # echo 0 > tracing_on
/sys/kernel/debug/tracing # cat tracing_on
0
/sys/kernel/debug/tracing # echo 1 > snapshot
/sys/kernel/debug/tracing # cat tracing_on
1
/sys/kernel/debug/tracing # echo 1 > snapshot
/sys/kernel/debug/tracing # cat tracing_on
0
We don't touch tracing_on, but snapshot changes tracing_on
setting each time. This is an anomaly, because user doesn't know
that each "ring_buffer" stores its own tracing-enable state and
the snapshot is done by swapping ring buffers.
Link: http://lkml.kernel.org/r/153149929558.11274.11730609978254724394.stgit@devbox
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp>
Cc: stable@vger.kernel.org
Fixes: debdd57f51 ("tracing: Make a snapshot feature available from userspace")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
[ Updated commit log and comment in the code ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Running the following:
# cd /sys/kernel/debug/tracing
# echo 500000 > buffer_size_kb
[ Or some other number that takes up most of memory ]
# echo snapshot > events/sched/sched_switch/trigger
Triggers the following bug:
------------[ cut here ]------------
kernel BUG at mm/slub.c:296!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
CPU: 6 PID: 6878 Comm: bash Not tainted 4.18.0-rc6-test+ #1066
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
RIP: 0010:kfree+0x16c/0x180
Code: 05 41 0f b6 72 51 5b 5d 41 5c 4c 89 d7 e9 ac b3 f8 ff 48 89 d9 48 89 da 41 b8 01 00 00 00 5b 5d 41 5c 4c 89 d6 e9 f4 f3 ff ff <0f> 0b 0f 0b 48 8b 3d d9 d8 f9 00 e9 c1 fe ff ff 0f 1f 40 00 0f 1f
RSP: 0018:ffffb654436d3d88 EFLAGS: 00010246
RAX: ffff91a9d50f3d80 RBX: ffff91a9d50f3d80 RCX: ffff91a9d50f3d80
RDX: 00000000000006a4 RSI: ffff91a9de5a60e0 RDI: ffff91a9d9803500
RBP: ffffffff8d267c80 R08: 00000000000260e0 R09: ffffffff8c1a56be
R10: fffff0d404543cc0 R11: 0000000000000389 R12: ffffffff8c1a56be
R13: ffff91a9d9930e18 R14: ffff91a98c0c2890 R15: ffffffff8d267d00
FS: 00007f363ea64700(0000) GS:ffff91a9de580000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055c1cacc8e10 CR3: 00000000d9b46003 CR4: 00000000001606e0
Call Trace:
event_trigger_callback+0xee/0x1d0
event_trigger_write+0xfc/0x1a0
__vfs_write+0x33/0x190
? handle_mm_fault+0x115/0x230
? _cond_resched+0x16/0x40
vfs_write+0xb0/0x190
ksys_write+0x52/0xc0
do_syscall_64+0x5a/0x160
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f363e16ab50
Code: 73 01 c3 48 8b 0d 38 83 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 79 db 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e e3 01 00 48 89 04 24
RSP: 002b:00007fff9a4c6378 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f363e16ab50
RDX: 0000000000000009 RSI: 000055c1cacc8e10 RDI: 0000000000000001
RBP: 000055c1cacc8e10 R08: 00007f363e435740 R09: 00007f363ea64700
R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000009
R13: 0000000000000001 R14: 00007f363e4345e0 R15: 00007f363e4303c0
Modules linked in: ip6table_filter ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device i915 snd_pcm snd_timer i2c_i801 snd soundcore i2c_algo_bit drm_kms_helper
86_pkg_temp_thermal video kvm_intel kvm irqbypass wmi e1000e
---[ end trace d301afa879ddfa25 ]---
The cause is because the register_snapshot_trigger() call failed to
allocate the snapshot buffer, and then called unregister_trigger()
which freed the data that was passed to it. Then on return to the
function that called register_snapshot_trigger(), as it sees it
failed to register, it frees the trigger_data again and causes
a double free.
By calling event_trigger_init() on the trigger_data (which only ups
the reference counter for it), and then event_trigger_free() afterward,
the trigger_data would not get freed by the registering trigger function
as it would only up and lower the ref count for it. If the register
trigger function fails, then the event_trigger_free() called after it
will free the trigger data normally.
Link: http://lkml.kernel.org/r/20180724191331.738eb819@gandalf.local.home
Cc: stable@vger.kerne.org
Fixes: 93e31ffbf4 ("tracing: Add 'snapshot' event trigger command")
Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
If we meet a conflicting object that is marked FSCACHE_OBJECT_IS_LIVE in
the active object tree, we have been emitting a BUG after logging
information about it and the new object.
Instead, we should wait for the CACHEFILES_OBJECT_ACTIVE flag to be cleared
on the old object (or return an error). The ACTIVE flag should be cleared
after it has been removed from the active object tree. A timeout of 60s is
used in the wait, so we shouldn't be able to get stuck there.
Fixes: 9ae326a690 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
In cachefiles_mark_object_active(), the new object is marked active and
then we try to add it to the active object tree. If a conflicting object
is already present, we want to wait for that to go away. After the wait,
we go round again and try to re-mark the object as being active - but it's
already marked active from the first time we went through and a BUG is
issued.
Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.
Analysis from Kiran Kumar Modukuri:
[Impact]
Oops during heavy NFS + FSCache + Cachefiles
CacheFiles: Error: Overlong wait for old active object to go away.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
CacheFiles: Error: Object already active kernel BUG at
fs/cachefiles/namei.c:163!
[Cause]
In a heavily loaded system with big files being read and truncated, an
fscache object for a cookie is being dropped and a new object being
looked. The new object being looked for has to wait for the old object
to go away before the new object is moved to active state.
[Fix]
Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when
retrying the object lookup.
[Testcase]
Have run ~100 hours of NFS stress tests and have not seen this bug recur.
[Regression Potential]
- Limited to fscache/cachefiles.
Fixes: 9ae326a690 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
When a cookie is allocated that causes fscache_object structs to be
allocated, those objects are initialised with the cookie pointer, but
aren't blessed with a ref on that cookie unless the attachment is
successfully completed in fscache_attach_object().
If attachment fails because the parent object was dying or there was a
collision, fscache_attach_object() returns without incrementing the cookie
counter - but upon failure of this function, the object is released which
then puts the cookie, whether or not a ref was taken on the cookie.
Fix this by taking a ref on the cookie when it is assigned in
fscache_object_init(), even when we're creating a root object.
Analysis from Kiran Kumar:
This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel
BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277
fscache cookie ref count updated incorrectly during fscache object
allocation resulting in following Oops.
kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
[Cause]
Two threads are trying to do operate on a cookie and two objects.
(1) One thread tries to unmount the filesystem and in process goes over a
huge list of objects marking them dead and deleting the objects.
cookie->usage is also decremented in following path:
nfs_fscache_release_super_cookie
-> __fscache_relinquish_cookie
->__fscache_cookie_put
->BUG_ON(atomic_read(&cookie->usage) <= 0);
(2) A second thread tries to lookup an object for reading data in following
path:
fscache_alloc_object
1) cachefiles_alloc_object
-> fscache_object_init
-> assign cookie, but usage not bumped.
2) fscache_attach_object -> fails in cant_attach_object because the
cookie's backing object or cookie's->parent object are going away
3) fscache_put_object
-> cachefiles_put_object
->fscache_object_destroy
->fscache_cookie_put
->BUG_ON(atomic_read(&cookie->usage) <= 0);
[NOTE from dhowells] It's unclear as to the circumstances in which (2) can
take place, given that thread (1) is in nfs_kill_super(), however a
conflicting NFS mount with slightly different parameters that creates a
different superblock would do it. A backtrace from Kiran seems to show
that this is a possibility:
kernel BUG at/build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
...
RIP: __fscache_cookie_put+0x3a/0x40 [fscache]
Call Trace:
__fscache_relinquish_cookie+0x87/0x120 [fscache]
nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs]
nfs_kill_super+0x29/0x40 [nfs]
deactivate_locked_super+0x48/0x80
deactivate_super+0x5c/0x60
cleanup_mnt+0x3f/0x90
__cleanup_mnt+0x12/0x20
task_work_run+0x86/0xb0
exit_to_usermode_loop+0xc2/0xd0
syscall_return_slowpath+0x4e/0x60
int_ret_from_sys_call+0x25/0x9f
[Fix] Bump up the cookie usage in fscache_object_init, when it is first
being assigned a cookie atomically such that the cookie is added and bumped
up if its refcount is not zero. Remove the assignment in
fscache_attach_object().
[Testcase]
I have run ~100 hours of NFS stress tests and not seen this bug recur.
[Regression Potential]
- Limited to fscache/cachefiles.
Fixes: ccc4fc3d11 ("FS-Cache: Implement the cookie management part of the netfs API")
Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cachefiles_read_waiter() has the right to access a 'monitor' object by
virtue of being called under the waitqueue lock for one of the pages in its
purview. However, it has no ref on that monitor object or on the
associated operation.
What it is allowed to do is to move the monitor object to the operation's
to_do list, but once it drops the work_lock, it's actually no longer
permitted to access that object. However, it is trying to enqueue the
retrieval operation for processing - but it can only do this via a pointer
in the monitor object, something it shouldn't be doing.
If it doesn't enqueue the operation, the operation may not get processed.
If the order is flipped so that the enqueue is first, then it's possible
for the work processor to look at the to_do list before the monitor is
enqueued upon it.
Fix this by getting a ref on the operation so that we can trust that it
will still be there once we've added the monitor to the to_do list and
dropped the work_lock. The op can then be enqueued after the lock is
dropped.
The bug can manifest in one of a couple of ways. The first manifestation
looks like:
FS-Cache:
FS-Cache: Assertion failed
FS-Cache: 6 == 5 is false
------------[ cut here ]------------
kernel BUG at fs/fscache/operation.c:494!
RIP: 0010:fscache_put_operation+0x1e3/0x1f0
...
fscache_op_work_func+0x26/0x50
process_one_work+0x131/0x290
worker_thread+0x45/0x360
kthread+0xf8/0x130
? create_worker+0x190/0x190
? kthread_cancel_work_sync+0x10/0x10
ret_from_fork+0x1f/0x30
This is due to the operation being in the DEAD state (6) rather than
INITIALISED, COMPLETE or CANCELLED (5) because it's already passed through
fscache_put_operation().
The bug can also manifest like the following:
kernel BUG at fs/fscache/operation.c:69!
...
[exception RIP: fscache_enqueue_operation+246]
...
#7 [ffff883fff083c10] fscache_enqueue_operation at ffffffffa0b793c6
#8 [ffff883fff083c28] cachefiles_read_waiter at ffffffffa0b15a48
#9 [ffff883fff083c48] __wake_up_common at ffffffff810af028
I'm not entirely certain as to which is line 69 in Lei's kernel, so I'm not
entirely clear which assertion failed.
Fixes: 9ae326a690 ("CacheFiles: A cache that backs onto a mounted filesystem")
Reported-by: Lei Xue <carmark.dlut@gmail.com>
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Reported-by: Anthony DeRobertis <aderobertis@metrics.net>
Reported-by: NeilBrown <neilb@suse.com>
Reported-by: Daniel Axtens <dja@axtens.net>
Reported-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Alter the state-check assertion in fscache_enqueue_operation() to allow
cancelled operations to be given processing time so they can be cleaned up.
Also fix a debugging statement that was requiring such operations to have
an object assigned.
Fixes: 9ae326a690 ("CacheFiles: A cache that backs onto a mounted filesystem")
Reported-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Arnd reports the following arm64 randconfig build error with the PSI
patches that add another page flag:
/git/arm-soc/arch/arm64/mm/init.c: In function 'mem_init':
/git/arm-soc/include/linux/compiler.h:357:38: error: call to
'__compiletime_assert_618' declared with attribute error: BUILD_BUG_ON
failed: sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT)
The additional page flag causes other information stored in
page->flags to get bumped into their own struct page member:
#if SECTIONS_WIDTH+ZONES_WIDTH+NODES_SHIFT+LAST_CPUPID_SHIFT <=
BITS_PER_LONG - NR_PAGEFLAGS
#define LAST_CPUPID_WIDTH LAST_CPUPID_SHIFT
#else
#define LAST_CPUPID_WIDTH 0
#endif
#if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0
#define LAST_CPUPID_NOT_IN_PAGE_FLAGS
#endif
which in turn causes the struct page size to exceed the size set in
STRUCT_PAGE_MAX_SHIFT. This value is an an estimate used to size the
VMEMMAP page array according to address space and struct page size.
However, the check is performed - and triggers here - on a !VMEMMAP
config, which consumes an additional 22 page bits for the sparse
section id. When VMEMMAP is enabled, those bits are returned, cpupid
doesn't need its own member, and the page passes the VMEMMAP check.
Restrict that check to the situation it was meant to check: that we
are sizing the VMEMMAP page array correctly.
Says Arnd:
Further experiments show that the build error already existed before,
but was only triggered with larger values of CONFIG_NR_CPU and/or
CONFIG_NODES_SHIFT that might be used in actual configurations but
not in randconfig builds.
With longer CPU and node masks, I could recreate the problem with
kernels as old as linux-4.7 when arm64 NUMA support got added.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Cc: stable@vger.kernel.org
Fixes: 1a2db30034 ("arm64, numa: Add NUMA support for arm64 platforms.")
Fixes: 3e1907d5bf ("arm64: mm: move vmemmap region right below the linear region")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since commit d3aec8a28b ("arm64: capabilities: Restrict KPTI
detection to boot-time CPUs") we rely on errata flags being already
populated during feature enumeration. The order of errata and
features was flipped as part of commit ed478b3f9e ("arm64:
capabilities: Group handling of features and errata workarounds").
Return to the orginal order of errata and feature evaluation to
ensure errata flags are present during feature evaluation.
Fixes: ed478b3f9e ("arm64: capabilities: Group handling of
features and errata workarounds")
CC: Suzuki K Poulose <suzuki.poulose@arm.com>
CC: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Dirk Mueller <dmueller@suse.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We only need to check for a file-backed namespace if
nvmet_bdev_ns_enable() returns -ENOTBLK. For any other error
it's pointless as the open() error will remain the same.
Fixes: d5eff33e ("nvmet: add simple file backed ns support")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When writing an empty string into the device_path attribute the kernel
will crash with
nvmet: failed to open block device (null): (-22)
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
This patch sanitizes the error handling for invalid device path settings.
Fixes: a07b4970 ("nvmet: add a generic NVMe target")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Dirk Gouders reported that two consecutive "make" invocations on an
already compiled tree will show alternating behaviors:
$ make
CALL scripts/checksyscalls.sh
DESCEND objtool
CHK include/generated/compile.h
DATAREL arch/x86/boot/compressed/vmlinux
Kernel: arch/x86/boot/bzImage is ready (#48)
Building modules, stage 2.
MODPOST 165 modules
$ make
CALL scripts/checksyscalls.sh
DESCEND objtool
CHK include/generated/compile.h
LD arch/x86/boot/compressed/vmlinux
ZOFFSET arch/x86/boot/zoffset.h
AS arch/x86/boot/header.o
LD arch/x86/boot/setup.elf
OBJCOPY arch/x86/boot/setup.bin
OBJCOPY arch/x86/boot/vmlinux.bin
BUILD arch/x86/boot/bzImage
Setup is 15644 bytes (padded to 15872 bytes).
System is 6663 kB
CRC 3eb90f40
Kernel: arch/x86/boot/bzImage is ready (#48)
Building modules, stage 2.
MODPOST 165 modules
He bisected it back to:
commit 98f7852537 ("x86/boot: Refuse to build with data relocations")
The root cause was the use of the "if_changed" kbuild function multiple
times for the same target. It was designed to only be used once per
target, otherwise it will effectively always trigger, flipping back and
forth between the two commands getting recorded by "if_changed". Instead,
this patch merges the two commands into a single function to get stable
build artifacts (i.e. .vmlinux.cmd), and a single build behavior.
Bisected-and-Reported-by: Dirk Gouders <dirk@gouders.net>
Fix-Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180724230827.GA37823@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In function perf_event_parse_addr_filter(), the path::dentry of each struct
perf_addr_filter is left unassigned (as it should be) when the pattern
being parsed is related to kernel space. But in function
perf_addr_filter_match() the same dentries are given to d_inode() where
the value is not expected to be NULL, resulting in the following splat:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
pc : perf_event_mmap+0x2fc/0x5a0
lr : perf_event_mmap+0x2c8/0x5a0
Process uname (pid: 2860, stack limit = 0x000000001cbcca37)
Call trace:
perf_event_mmap+0x2fc/0x5a0
mmap_region+0x124/0x570
do_mmap+0x344/0x4f8
vm_mmap_pgoff+0xe4/0x110
vm_mmap+0x2c/0x40
elf_map+0x60/0x108
load_elf_binary+0x450/0x12c4
search_binary_handler+0x90/0x290
__do_execve_file.isra.13+0x6e4/0x858
sys_execve+0x3c/0x50
el0_svc_naked+0x30/0x34
This patch is fixing the problem by introducing a new check in function
perf_addr_filter_match() to see if the filter's dentry is NULL.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: miklos@szeredi.hu
Cc: namhyung@kernel.org
Cc: songliubraving@fb.com
Fixes: 9511bce9fe ("perf/core: Fix bad use of igrab()")
Link: http://lkml.kernel.org/r/1531782831-1186-1-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vince reported the perf_fuzzer giving various unwinder warnings and
Josh reported:
> Deja vu. Most of these are related to perf PEBS, similar to the
> following issue:
>
> b8000586c9 ("perf/x86/intel: Cure bogus unwind from PEBS entries")
>
> This is basically the ORC version of that. setup_pebs_sample_data() is
> assembling a franken-pt_regs which ORC isn't happy about. RIP is
> inconsistent with some of the other registers (like RSP and RBP).
And where the previous unwinder only needed BP,SP ORC also requires
IP. But we cannot spoof IP because then the sample will get displaced,
entirely negating the point of PEBS.
So cure the whole thing differently by doing the unwind early; this
does however require a means to communicate we did the unwind early.
We (ab)use an unused sample_type bit for this, which we set on events
that fill out the data->callchain before the normal
perf_prepare_sample().
Debugged-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
NO_RT_RUNTIME_SHARE feature is used to prevent a CPU borrow enough
runtime with a spin-rt-task.
However, if RT_RUNTIME_SHARE feature is enabled and rt_rq has borrowd
enough rt_runtime at the beginning, rt_runtime can't be restored to
its initial bandwidth rt_runtime after we disable RT_RUNTIME_SHARE.
E.g. on my PC with 4 cores, procedure to reproduce:
1) Make sure RT_RUNTIME_SHARE is enabled
cat /sys/kernel/debug/sched_features
GENTLE_FAIR_SLEEPERS START_DEBIT NO_NEXT_BUDDY LAST_BUDDY
CACHE_HOT_BUDDY WAKEUP_PREEMPTION NO_HRTICK NO_DOUBLE_TICK
LB_BIAS NONTASK_CAPACITY TTWU_QUEUE NO_SIS_AVG_CPU SIS_PROP
NO_WARN_DOUBLE_CLOCK RT_PUSH_IPI RT_RUNTIME_SHARE NO_LB_MIN
ATTACH_AGE_LOAD WA_IDLE WA_WEIGHT WA_BIAS
2) Start a spin-rt-task
./loop_rr &
3) set affinity to the last cpu
taskset -p 8 $pid_of_loop_rr
4) Observe that last cpu have borrowed enough runtime.
cat /proc/sched_debug | grep rt_runtime
.rt_runtime : 950.000000
.rt_runtime : 900.000000
.rt_runtime : 950.000000
.rt_runtime : 1000.000000
5) Disable RT_RUNTIME_SHARE
echo NO_RT_RUNTIME_SHARE > /sys/kernel/debug/sched_features
6) Observe that rt_runtime can not been restored
cat /proc/sched_debug | grep rt_runtime
.rt_runtime : 950.000000
.rt_runtime : 900.000000
.rt_runtime : 950.000000
.rt_runtime : 1000.000000
This patch help to restore rt_runtime after we disable
RT_RUNTIME_SHARE.
Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: zhong.weidong@zte.com.cn
Link: http://lkml.kernel.org/r/1531874815-39357-1-git-send-email-liu.hailong6@zte.com.cn
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit:
9fb8d5dc4b ("stop_machine, Disable preemption when waking two stopper threads")
does not fully address the race condition that can occur
as follows:
On one CPU, call it CPU 3, thread 1 invokes
cpu_stop_queue_two_works(2, 3,...), and the execution is such
that thread 1 queues the works for migration/2 and migration/3,
and is preempted after releasing the locks for migration/2 and
migration/3, but before waking the threads.
Then, On CPU 2, a kworker, call it thread 2, is running,
and it invokes cpu_stop_queue_two_works(1, 2,...), such that
thread 2 queues the works for migration/1 and migration/2.
Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
migration/2 and migration/3. This means that when CPU 2
releases the locks for migration/1 and migration/2, but before
it wakes those threads, it can be preempted by migration/2.
If thread 2 is preempted by migration/2, then migration/2 will
execute the first work item successfully, since migration/3
was woken up by CPU 3, but when it goes to execute the second
work item, it disables preemption, calls multi_cpu_stop(),
and thus, CPU 2 will wait forever for migration/1, which should
have been woken up by thread 2. However migration/1 cannot be
woken up by thread 2, since it is a kworker, so it is affine to
CPU 2, but CPU 2 is running migration/2 with preemption
disabled, so thread 2 will never run.
Disable preemption after queueing works for stopper threads
to ensure that the operation of queueing the works and waking
the stopper threads is atomic.
Co-Developed-by: Prasad Sodagudi <psodagud@codeaurora.org>
Co-Developed-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: gregkh@linuxfoundation.org
Cc: matt@codeblueprint.co.uk
Fixes: 9fb8d5dc4b ("stop_machine, Disable preemption when waking two stopper threads")
Link: http://lkml.kernel.org/r/1531856129-9871-1-git-send-email-isaacm@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If an i2c topology has instances of nested muxes, then a lockdep splat
is produced when when i2c_parent_lock_bus() is called. Here is an
example:
============================================
WARNING: possible recursive locking detected
--------------------------------------------
insmod/68159 is trying to acquire lock:
(i2c_register_adapter#2){+.+.}, at: i2c_parent_lock_bus+0x32/0x50 [i2c_mux]
but task is already holding lock:
(i2c_register_adapter#2){+.+.}, at: i2c_parent_lock_bus+0x32/0x50 [i2c_mux]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(i2c_register_adapter#2);
lock(i2c_register_adapter#2);
*** DEADLOCK ***
May be due to missing lock nesting notation
1 lock held by insmod/68159:
#0: (i2c_register_adapter#2){+.+.}, at: i2c_parent_lock_bus+0x32/0x50 [i2c_mux]
stack backtrace:
CPU: 13 PID: 68159 Comm: insmod Tainted: G O
Call Trace:
dump_stack+0x67/0x98
__lock_acquire+0x162e/0x1780
lock_acquire+0xba/0x200
rt_mutex_lock+0x44/0x60
i2c_parent_lock_bus+0x32/0x50 [i2c_mux]
i2c_parent_lock_bus+0x3e/0x50 [i2c_mux]
i2c_smbus_xfer+0xf0/0x700
i2c_smbus_read_byte+0x42/0x70
my2c_init+0xa2/0x1000 [my2c]
do_one_initcall+0x51/0x192
do_init_module+0x62/0x216
load_module+0x20f9/0x2b50
SYSC_init_module+0x19a/0x1c0
SyS_init_module+0xe/0x10
do_syscall_64+0x6c/0x1a0
entry_SYSCALL_64_after_hwframe+0x42/0xb7
Reported-by: John Sperbeck <jsperbeck@google.com>
Tested-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Deepa Dinamani <deepadinamani@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Chang <dpf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Link: http://lkml.kernel.org/r/20180720083914.1950-3-peda@axentia.se
Signed-off-by: Ingo Molnar <mingo@kernel.org>
NVRAM is designed to work with Broadcom's SDK Linux kernel which fakes
PCI domain 0 for all internal MMIO devices. Since official Linux kernel
uses platform devices for that purpose there is a mismatch in numbering
PCI domains.
There used to be a fix for that problem but it was accidentally dropped
during the last firmware loading rework. That resulted in brcmfmac not
being able to extract device specific NVRAM content and all kind of
calibration problems.
Reported-by: Aditya Xavier <adityaxavier@gmail.com>
Fixes: 2baa3aaee2 ("brcmfmac: introduce brcmf_fw_alloc_request() function")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Martin KaFai Lau says:
====================
The series allows the BPF loader to figure out the btf_key_id
and btf_value_id from a map's name by using BPF_ANNOTATE_KV_PAIR()
similarly as in iproute2 commit f823f36012fb ("bpf: implement
btf handling and map annotation").
It also removes the old 'typedef' way which requires two separate
typedefs (one for the key and one for the value).
By doing this, iproute2 and libbpf have one consistent way to
figure out the btf_key_type_id and btf_value_type_id for a map.
The first two patches are some prep/cleanup works. The last patch
introduces BPF_ANNOTATE_KV_PAIR.
v3:
- Replace some more *int*_t and u* usages with the
equivalent __[su]* in btf.c
v2:
- Fix the incorrect '&&' check on container_type
in bpf_map_find_btf_info().
- Expose the existing static btf_type_by_id() instead of
creating a new one.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch introduces BPF_ANNOTATE_KV_PAIR to signal the
bpf loader about the btf key_type and value_type of a bpf map.
Please refer to the changes in test_btf_haskv.c for its usage.
Both iproute2 and libbpf loader will then have the same
convention to find out the map's btf_key_type_id and
btf_value_type_id from a map's name.
Fixes: 8a138aed4a ("bpf: btf: Add BTF support to libbpf")
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch replaces [u]int32_t and [u]int64_t usage with
__[su]32 and __[su]64. The same change goes for [u]int16_t
and [u]int8_t.
Fixes: 8a138aed4a ("bpf: btf: Add BTF support to libbpf")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch sync the uapi btf.h to tools/
Fixes: 36fc3c8c28 bpf: btf: Clean up BTF_INT_BITS() in uapi btf.h
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Pull MIPS fixes from Paul Burton:
"A couple more MIPS fixes for 4.18:
- Fix an off-by-one in reporting PCI resource sizes to userland which
regressed in v3.12.
- Fix writes to DDR controller registers used to flush write buffers,
which regressed with some refactoring in v4.2"
* tag 'mips_fixes_4.18_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: ath79: fix register address in ath79_ddr_wb_flush()
MIPS: Fix off-by-one in pci_resource_to_user()
Pull networking fixes from David Miller:
1) Handle stations tied to AP_VLANs properly during mac80211 hw
reconfig. From Manikanta Pubbisetty.
2) Fix jump stack depth validation in nf_tables, from Taehee Yoo.
3) Fix quota handling in aRFS flow expiration of mlx5 driver, from Eran
Ben Elisha.
4) Exit path handling fix in powerpc64 BPF JIT, from Daniel Borkmann.
5) Use ptr_ring_consume_bh() in page pool code, from Tariq Toukan.
6) Fix cached netdev name leak in nf_tables, from Florian Westphal.
7) Fix memory leaks on chain rename, also from Florian Westphal.
8) Several fixes to DCTCP congestion control ACK handling, from Yuchunk
Cheng.
9) Missing rcu_read_unlock() in CAIF protocol code, from Yue Haibing.
10) Fix link local address handling with VRF, from David Ahern.
11) Don't clobber 'err' on a successful call to __skb_linearize() in
skb_segment(). From Eric Dumazet.
12) Fix vxlan fdb notification races, from Roopa Prabhu.
13) Hash UDP fragments consistently, from Paolo Abeni.
14) If TCP receives lots of out of order tiny packets, we do really
silly stuff. Make the out-of-order queue ending more robust to this
kind of behavior, from Eric Dumazet.
15) Don't leak netlink dump state in nf_tables, from Florian Westphal.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
net: axienet: Fix double deregister of mdio
qmi_wwan: fix interface number for DW5821e production firmware
ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull
bnx2x: Fix invalid memory access in rss hash config path.
net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper
r8169: restore previous behavior to accept BIOS WoL settings
cfg80211: never ignore user regulatory hint
sock: fix sg page frag coalescing in sk_alloc_sg
netfilter: nf_tables: move dumper state allocation into ->start
tcp: add tcp_ooo_try_coalesce() helper
tcp: call tcp_drop() from tcp_data_queue_ofo()
tcp: detect malicious patterns in tcp_collapse_ofo_queue()
tcp: avoid collapses in tcp_prune_queue() if possible
tcp: free batches of packets in tcp_prune_ofo_queue()
ip: hash fragments consistently
ipv6: use fib6_info_hold_safe() when necessary
can: xilinx_can: fix power management handling
can: xilinx_can: fix incorrect clear of non-processed interrupts
can: xilinx_can: fix RX overflow interrupt not being enabled
can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting
...
Syzbot reported a read beyond the end of the skb head when returning
IPV6_ORIGDSTADDR:
BUG: KMSAN: kernel-infoleak in put_cmsg+0x5ef/0x860 net/core/scm.c:242
CPU: 0 PID: 4501 Comm: syz-executor128 Not tainted 4.17.0+ #9
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:113
kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1125
kmsan_internal_check_memory+0x138/0x1f0 mm/kmsan/kmsan.c:1219
kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1261
copy_to_user include/linux/uaccess.h:184 [inline]
put_cmsg+0x5ef/0x860 net/core/scm.c:242
ip6_datagram_recv_specific_ctl+0x1cf3/0x1eb0 net/ipv6/datagram.c:719
ip6_datagram_recv_ctl+0x41c/0x450 net/ipv6/datagram.c:733
rawv6_recvmsg+0x10fb/0x1460 net/ipv6/raw.c:521
[..]
This logic and its ipv4 counterpart read the destination port from
the packet at skb_transport_offset(skb) + 4.
With MSG_MORE and a local SOCK_RAW sender, syzbot was able to cook a
packet that stores headers exactly up to skb_transport_offset(skb) in
the head and the remainder in a frag.
Call pskb_may_pull before accessing the pointer to ensure that it lies
in skb head.
Link: http://lkml.kernel.org/r/CAF=yD-LEJwZj5a1-bAAj2Oy_hKmGygV6rsJ_WOrAYnv-fnayiQ@mail.gmail.com
Reported-by: syzbot+9adb4b567003cac781f0@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rx hash/filter table configuration uses rss_conf_obj to configure filters
in the hardware. This object is initialized only when the interface is
brought up.
This patch adds driver changes to configure rss params only when the device
is in opened state. In port disabled case, the config will be cached in the
driver structure which will be applied in the successive load path.
Please consider applying it to 'net' branch.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Function mlx4_RST2INIT_QP_wrapper saved the qp number passed in the qp
context, rather than the one passed in the input modifier.
However, the qp number in the qp context is not defined as a
required parameter by the FW. Therefore, drivers may choose to not
specify the qp number in the qp context for the reset-to-init transition.
Thus, we must save the qp number passed in the command input modifier --
which is always present. (This saved qp number is used as the input
modifier for command 2RST_QP when a slave's qp's are destroyed).
Fixes: c82e9aa0a8 ("mlx4_core: resource tracking for HCA resources used by guests")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit cited below checked that the port numbers provided in the
primary and alt AVs are legal.
That is sufficient to prevent a kernel panic. However, it is not
sufficient for correct operation.
In Linux, AVs (both primary and alt) must be completely self-described.
We do not accept an AV from userspace without an embedded port number.
(This has been the case since kernel 3.14 commit dbf727de74
("IB/core: Use GID table in AH creation and dmac resolution")).
For the primary AV, this embedded port number must match the port number
specified with IB_QP_PORT.
We also expect the port number embedded in the alt AV to match the
alt_port_num value passed by the userspace driver in the modify_qp command
base structure.
Add these checks to modify_qp.
Cc: <stable@vger.kernel.org> # 4.16
Fixes: 5d4c05c3ee ("RDMA/uverbs: Sanitize user entered port numbers prior to access it")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Commit 7edf6d314c tried to resolve an inconsistency (BIOS WoL
settings are accepted, but device isn't wakeup-enabled) resulting
from a previous broken-BIOS workaround by making disabled WoL the
default.
This however had some side effects, most likely due to a broken BIOS
some systems don't properly resume from suspend when the MagicPacket
WoL bit isn't set in the chip, see
https://bugzilla.kernel.org/show_bug.cgi?id=200195
Therefore restore the WoL behavior from 4.16.
Reported-by: Albert Astals Cid <aacid@kde.org>
Fixes: 7edf6d314c ("r8169: disable WOL per default")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The scsi block layer requires requests claimed by the error handling be
completed by the error handler. A previous commit allowed completions
to proceed for blk-mq, breaking that assumption.
This patch prevents completions that may race with the timeout handler
by marking the state to complete, restoring the previous behavior.
Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This is preparing for drivers that want to directly alter the state of
their requests. No functional change here.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When inodes are freed in xfs_ifree(), di_flags is cleared (so extent size
hints are removed) but the actual extent size fields are left intact.
This causes the extent hint validators to fail on freed inodes which once
had extent size hints.
This can be observed (for example) by running xfs/229 twice on a
non-crc xfs filesystem, or presumably on V5 with ikeep.
Fixes: 7d71a67 ("xfs: verify extent size hint is valid in inode verifier")
Fixes: 02a0fda ("xfs: verify COW extent size hint is valid in inode verifier")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Pull s390 fix from Martin Schwidefsky.
Guenter Roeck reports that the s390 allmodconfig build fails because of
a gcc plugin problem. The fix won't be in-tree until 4.19, so for now
disable the gcc plugins on s390.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: disable gcc plugins
Including asm/cacheflush.h first results in the following build error
when trying to build sparc32:allmodconfig, because 'struct page' has not
been declared, and the function declaration ends up creating a separate
(private) declaration of struct page (as a result of function arguments
being in the scope of the function declaration and definition, not in
global scope).
The C scoping rules do not just affect variable visibility, they also
affect type declaration visibility.
The end result is that when the actual call site is seen in
<linux/highmem.h>, the 'struct page' type in the caller is not the same
'struct page' that the function was declared with, resulting in:
In file included from arch/sparc/include/asm/page.h:10:0,
...
from drivers/staging/media/omap4iss/iss_video.c:15:
include/linux/highmem.h: In function 'clear_user_highpage':
include/linux/highmem.h:137:31: error:
passing argument 1 of 'sparc_flush_page_to_ram' from incompatible
pointer type
Include generic includes files first to fix the problem.
Fixes: fc96d58c10 ("[media] v4l: omap4iss: Add support for OMAP4 camera interface - Video devices")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[ Added explanation of C scope rules - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Make sure we don't go over the maximum jump stack boundary,
from Taehee Yoo.
2) Missing rcu_barrier() in hash and rbtree sets, also from Taehee.
3) Missing check to nul-node in rbtree timeout routine, from Taehee.
4) Use dev->name from flowtable to fix a memleak, from Florian.
5) Oneliner to free flowtable object on removal, from Florian.
6) Memleak in chain rename transaction, again from Florian.
7) Don't allow two chains to use the same name in the same
transaction, from Florian.
8) handle DCCP SYNC/SYNCACK as invalid, this triggers an
uninitialized timer in conntrack reported by syzbot, from Florian.
9) Fix leak in case netlink_dump_start() fails, from Florian.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Only a few fixes:
* always keep regulatory user hint
* add missing break statement in station flags parsing
* fix non-linear SKBs in port-control-over-nl80211
* reconfigure VLAN stations during HW restart
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
I2C is open drain, so request the GPIO accordingly, even if pinmux did
set it up correctly for in-kernel users in this case.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
On Gen3, we can only do RXDMA once per transfer reliably. For that, we
must reset the device, then we can have RXDMA once. This patch
implements this. When there is no reset controller or the reset fails,
RXDMA will be blocked completely. Otherwise, it will be disabled after
the first RXDMA transfer. Based on a commit from the BSP by Hiromitsu
Yamasaki, yet completely refactored to handle multiple read messages
within one transfer.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
The revised if_ready checks skipped over the case of returning error when
the controller is being deleted. Instead it was returning BUSY, which
caused the ios to retry, which caused the ns delete to hang waiting for
the ios to drain.
Stack trace of hang looks like:
kworker/u64:2 D 0 74 2 0x80000000
Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core]
Call Trace:
? __schedule+0x26d/0x820
schedule+0x32/0x80
blk_mq_freeze_queue_wait+0x36/0x80
? remove_wait_queue+0x60/0x60
blk_cleanup_queue+0x72/0x160
nvme_ns_remove+0x106/0x140 [nvme_core]
nvme_remove_namespaces+0x7e/0xa0 [nvme_core]
nvme_delete_ctrl_work+0x4d/0x80 [nvme_core]
process_one_work+0x160/0x350
worker_thread+0x1c3/0x3d0
kthread+0xf5/0x130
? process_one_work+0x350/0x350
? kthread_bind+0x10/0x10
ret_from_fork+0x1f/0x30
Extend nvmf_fail_nonready_command() to supply the controller pointer so
that the controller state can be looked at. Fail any io to a controller
that is deleting.
Fixes: 3bc32bb118 ("nvme-fabrics: refactor queue ready check")
Fixes: 35897b920c ("nvme-fabrics: fix and refine state checks in __nvmf_check_ready")
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
The existing code to carve up the sg list expected an sg element-per-page
which can be very incorrect with iommu's remapping multiple memory pages
to fewer bus addresses. To hit this error required a large io payload
(greater than 256k) and a system that maps on a per-page basis. It's
possible that large ios could get by fine if the system condensed the
sgl list into the first 64 elements.
This patch corrects the sg list handling by specifically walking the
sg list element by element and attempting to divide the transfer up
on a per-sg element boundary. While doing so, it still tries to keep
sequences under 256k, but will exceed that rule if a single sg element
is larger than 256k.
Fixes: 48fa362b6c ("nvmet-fc: simplify sg list handling")
Cc: <stable@vger.kernel.org> # 4.14
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
error_entry and error_exit communicate the user vs. kernel status of
the frame using %ebx. This is unnecessary -- the information is in
regs->cs. Just use regs->cs.
This makes error_entry simpler and makes error_exit more robust.
It also fixes a nasty bug. Before all the Spectre nonsense, the
xen_failsafe_callback entry point returned like this:
ALLOC_PT_GPREGS_ON_STACK
SAVE_C_REGS
SAVE_EXTRA_REGS
ENCODE_FRAME_POINTER
jmp error_exit
And it did not go through error_entry. This was bogus: RBX
contained garbage, and error_exit expected a flag in RBX.
Fortunately, it generally contained *nonzero* garbage, so the
correct code path was used. As part of the Spectre fixes, code was
added to clear RBX to mitigate certain speculation attacks. Now,
depending on kernel configuration, RBX got zeroed and, when running
some Wine workloads, the kernel crashes. This was introduced by:
commit 3ac6d8c787 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
With this patch applied, RBX is no longer needed as a flag, and the
problem goes away.
I suspect that malicious userspace could use this bug to crash the
kernel even without the offending patch applied, though.
[ Historical note: I wrote this patch as a cleanup before I was aware
of the bug it fixed. ]
[ Note to stable maintainers: this should probably get applied to all
kernels. If you're nervous about that, a more conservative fix to
add xorl %ebx,%ebx; incl %ebx before the jump to error_exit should
also fix the problem. ]
Reported-and-tested-by: M. Vefa Bicakci <m.v.b@runbox.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Fixes: 3ac6d8c787 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
Link: http://lkml.kernel.org/r/b5010a090d3586b2d6e06c7ad3ec5542d1241c45.1532282627.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently user regulatory hint is ignored if all wiphys
in the system are self managed. But the hint is not ignored
if there is no wiphy in the system. This affects the global
regulatory setting. Global regulatory setting needs to be
maintained so that it can be applied to a new wiphy entering
the system. Therefore, do not ignore user regulatory setting
even if all wiphys in the system are self managed.
Signed-off-by: Amar Singhal <asinghal@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The s390 build currently fails with the latent entropy plugin:
arch/s390/kernel/als.o: In function `verify_facilities':
als.c:(.init.text+0x24): undefined reference to `latent_entropy'
als.c:(.init.text+0xae): undefined reference to `latent_entropy'
make[3]: *** [arch/s390/boot/compressed/vmlinux] Error 1
make[2]: *** [arch/s390/boot/compressed/vmlinux] Error 2
make[1]: *** [bzImage] Error 2
This will be fixed with the early boot rework from Vasily, which
is planned for the 4.19 merge window.
For 4.18 the simplest solution is to disable the gcc plugins and
reenable them after the early boot rework is upstream.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Current sg coalescing logic in sk_alloc_sg() (latter is used by tls and
sockmap) is not quite correct in that we do fetch the previous sg entry,
however the subsequent check whether the refilled page frag from the
socket is still the same as from the last entry with prior offset and
length matching the start of the current buffer is comparing always the
first sg list entry instead of the prior one.
Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch ensures the member->offset of a struct
is in the correct order (i.e the later member's offset cannot
go backward).
The current "pahole -J" BTF encoder does not generate something
like this. However, checking this can ensure future encoder
will not violate this.
Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Shaochun Chen points out we leak dumper filter state allocations
stored in dump_control->data in case there is an error before netlink sets
cb_running (after which ->done will be called at some point).
In order to fix this, add .start functions and do the allocations
there.
->done is going to clean up, and in case error occurs before
->start invocation no cleanups need to be done anymore.
Reported-by: shaochun chen <cscnull@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If a GPIO chip is a part of a hierarchy IRQ domain, there is no
way to specify the trigger type when gpio(d)_to_irq() allocates an
interrupt on-the-fly.
Currently, uniphier_gpio_to_irq() sets IRQ_TYPE_NONE, but it causes
an error in the .alloc() hook of the parent domain.
(drivers/irq/irq-uniphier-aidet.c)
Even if we change irq-uniphier-aidet.c to accept the NONE type,
GIC complains about it since commit 83a86fbb5b ("irqchip/gic:
Loudly complain about the use of IRQ_TYPE_NONE").
Instead, use IRQ_TYPE_LEVEL_HIGH as a temporary value when an irq
is allocated. irq_set_irq_type() will override it when the irq is
really requested.
Fixes: dbe776c2ca ("gpio: uniphier: add UniPhier GPIO controller driver")
Reported-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This fixes up the handling of fixed regulator polarity
inversion flags: while I remembered to fix it for the
undocumented "reg-fixed-voltage" I forgot about the
official "regulator-fixed" binding, there are two ways
to do a fixed regulator.
The error was noticed and fixed.
Fixes: a603a2b8d8 ("gpio: of: Add special quirk to parse regulator flags")
Cc: Mark Brown <broonie@kernel.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Reported-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Eric Dumazet says:
====================
Juha-Matti Tilli reported that malicious peers could inject tiny
packets in out_of_order_queue, forcing very expensive calls
to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for
every incoming packet.
With tcp_rmem[2] default of 6MB, the ooo queue could
contain ~7000 nodes.
This patch series makes sure we cut cpu cycles enough to
render the attack not critical.
We might in the future go further, like disconnecting
or black-holing proven malicious flows.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In case skb in out_or_order_queue is the result of
multiple skbs coalescing, we would like to get a proper gso_segs
counter tracking, so that future tcp_drop() can report an accurate
number.
I chose to not implement this tracking for skbs in receive queue,
since they are not dropped, unless socket is disconnected.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to be able to give better diagnostics and detect
malicious traffic, we need to have better sk->sk_drops tracking.
Fixes: 9f5afeae51 ("tcp: use an RB tree for ooo receive queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case an attacker feeds tiny packets completely out of order,
tcp_collapse_ofo_queue() might scan the whole rb-tree, performing
expensive copies, but not changing socket memory usage at all.
1) Do not attempt to collapse tiny skbs.
2) Add logic to exit early when too many tiny skbs are detected.
We prefer not doing aggressive collapsing (which copies packets)
for pathological flows, and revert to tcp_prune_ofo_queue() which
will be less expensive.
In the future, we might add the possibility of terminating flows
that are proven to be malicious.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right after a TCP flow is created, receiving tiny out of order
packets allways hit the condition :
if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
tcp_clamp_window(sk);
tcp_clamp_window() increases sk_rcvbuf to match sk_rmem_alloc
(guarded by tcp_rmem[2])
Calling tcp_collapse_ofo_queue() in this case is not useful,
and offers a O(N^2) surface attack to malicious peers.
Better not attempt anything before full queue capacity is reached,
forcing attacker to spend lots of resource and allow us to more
easily detect the abuse.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Juha-Matti Tilli reported that malicious peers could inject tiny
packets in out_of_order_queue, forcing very expensive calls
to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for
every incoming packet. out_of_order_queue rb-tree can contain
thousands of nodes, iterating over all of them is not nice.
Before linux-4.9, we would have pruned all packets in ofo_queue
in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skbs
truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB.
Since we plan to increase tcp_rmem[2] in the future to cope with
modern BDP, can not revert to the old behavior, without great pain.
Strategy taken in this patch is to purge ~12.5 % of the queue capacity.
Fixes: 36a6503fed ("tcp: refine tcp_prune_ofo_queue() to not drop all packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb hash for locally generated ip[v6] fragments belonging
to the same datagram can vary in several circumstances:
* for connected UDP[v6] sockets, the first fragment get its hash
via set_owner_w()/skb_set_hash_from_sk()
* for unconnected IPv6 UDPv6 sockets, the first fragment can get
its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if
auto_flowlabel is enabled
For the following frags the hash is usually computed via
skb_get_hash().
The above can cause OoO for unconnected IPv6 UDPv6 socket: in that
scenario the egress tx queue can be selected on a per packet basis
via the skb hash.
It may also fool flow-oriented schedulers to place fragments belonging
to the same datagram in different flows.
Fix the issue by copying the skb hash from the head frag into
the others at fragmentation time.
Before this commit:
perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8"
netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n &
perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1
perf script
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0
After this commit:
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
Fixes: b73c3d0e4f ("net: Save TX flow hash in sock and set in skbuf on xmit")
Fixes: 67800f9b1f ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to call reinit_completion() before dma is started to avoid race
condition where reinit_completion() is called after complete() and before
wait_for_completion_timeout().
Signed-off-by: Esben Haabendal <eha@deif.com>
Fixes: ce1a78840f ("i2c: imx: add DMA support for freescale i2c driver")
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
If CLKH is set to 0 I2C clock is not generated at all, so avoid this value
and stretch the clock in this case.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Marc Kleine-Budde says:
====================
pull-request: can 2018-07-23
this is a pull request of 12 patches for net/master.
The patch by Stephane Grosjean for the peak_canfd CAN driver fixes a problem
with older firmware. The next patch is by Roman Fietze and fixes the setup of
the CCCR register in the m_can driver. Nicholas Mc Guire's patch for the
mpc5xxx_can driver adds missing error checking. The two patches by Faiz Abbas
fix the runtime resume and clean up the probe function in the m_can driver. The
last 7 patches by Anssi Hannula fix several problem in the xilinx_can driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several issues with the suspend/resume handling code of the
driver:
- The device is attached and detached in the runtime_suspend() and
runtime_resume() callbacks if the interface is running. However,
during xcan_chip_start() the interface is considered running,
causing the resume handler to incorrectly call netif_start_queue()
at the beginning of xcan_chip_start(), and on xcan_chip_start() error
return the suspend handler detaches the device leaving the user
unable to bring-up the device anymore.
- The device is not brought properly up on system resume. A reset is
done and the code tries to determine the bus state after that.
However, after reset the device is always in Configuration mode
(down), so the state checking code does not make sense and
communication will also not work.
- The suspend callback tries to set the device to sleep mode (low-power
mode which monitors the bus and brings the device back to normal mode
on activity), but then immediately disables the clocks (possibly
before the device reaches the sleep mode), which does not make sense
to me. If a clean shutdown is wanted before disabling clocks, we can
just bring it down completely instead of only sleep mode.
Reorganize the PM code so that only the clock logic remains in the
runtime PM callbacks and the system PM callbacks contain the device
bring-up/down logic. This makes calling the runtime PM callbacks during
e.g. xcan_chip_start() safe.
The system PM callbacks now simply call common code to start/stop the
HW if the interface was running, replacing the broken code from before.
xcan_chip_stop() is updated to use the common reset code so that it will
wait for the reset to complete. Reset also disables all interrupts so do
not do that separately.
Also, the device_may_wakeup() checks are removed as the driver does not
have wakeup support.
Tested on Zynq-7000 integrated CAN.
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
xcan_interrupt() clears ERROR|RXOFLV|BSOFF|ARBLST interrupts if any of
them is asserted. This does not take into account that some of them
could have been asserted between interrupt status read and interrupt
clear, therefore clearing them without handling them.
Fix the code to only clear those interrupts that it knows are asserted
and therefore going to be processed in xcan_err_interrupt().
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
RX overflow interrupt (RXOFLW) is disabled even though xcan_interrupt()
processes it. This means that an RX overflow interrupt will only be
processed when another interrupt gets asserted (e.g. for RX/TX).
Fix that by enabling the RXOFLW interrupt.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The xilinx_can driver assumes that the TXOK interrupt only clears after
it has been acknowledged as many times as there have been successfully
sent frames.
However, the documentation does not mention such behavior, instead
saying just that the interrupt is cleared when the clear bit is set.
Similarly, testing seems to also suggest that it is immediately cleared
regardless of the amount of frames having been sent. Performing some
heavy TX load and then going back to idle has the tx_head drifting
further away from tx_tail over time, steadily reducing the amount of
frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
interrupt always frees up space for 1 frame from the driver's
perspective, so frames continue to be sent) and delaying the local echo
frames.
The TX FIFO tracking is also otherwise buggy as it does not account for
TX FIFO being cleared after software resets, causing
BUG!, TX FIFO full when queue awake!
messages to be output.
There does not seem to be any way to accurately track the state of the
TX FIFO for local echo support while using the full TX FIFO.
The Zynq version of the HW (but not the soft-AXI version) has watermark
programming support and with it an additional TX-FIFO-empty interrupt
bit.
Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI
and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used
to detect whether 1 or 2 frames have been sent at interrupt processing
time.
Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode
was also tested.
An alternative way to solve this would be to drop local echo support but
keep using the full TX FIFO.
v2: Add FIFO space check before TX queue wake with locking to
synchronize with queue stop. This avoids waking the queue when xmit()
had just filled it.
v3: Keep local echo support and reduce the amount of frames in FIFO
instead as suggested by Marc Kleine-Budde.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The xilinx_can driver contains no mechanism for propagating recovery
from CAN_STATE_ERROR_WARNING and CAN_STATE_ERROR_PASSIVE.
Add such a mechanism by factoring the handling of
XCAN_STATE_ERROR_PASSIVE and XCAN_STATE_ERROR_WARNING out of
xcan_err_interrupt and checking for recovery after RX and TX if the
interface is in one of those states.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
If the device gets into a state where RXNEMP (RX FIFO not empty)
interrupt is asserted without RXOK (new frame received successfully)
interrupt being asserted, xcan_rx_poll() will continue to try to clear
RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is
not empty, the interrupt will not be cleared and napi_schedule() will
just be called again.
This situation can occur when:
(a) xcan_rx() returns without reading RX FIFO due to an error condition.
The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear
due to a frame still being in the FIFO. The frame will never be read
from the FIFO as RXOK is no longer set.
(b) A frame is received between xcan_rx_poll() reading interrupt status
and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain
set as the new message is still in the FIFO.
I'm able to trigger case (b) by flooding the bus with frames under load.
There does not seem to be any benefit in using both RXNEMP and RXOK in
the way the driver does, and the polling example in the reference manual
(UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either
RXOK or RXNEMP can be used for detecting incoming messages.
Fix the issue and simplify the RX processing by only using RXNEMP
without RXOK.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The xilinx_can driver performs a software reset when an RX overrun is
detected. This causes the device to enter Configuration mode where no
messages are received or transmitted.
The documentation does not mention any need to perform a reset on an RX
overrun, and testing by inducing an RX overflow also indicated that the
device continues to work just fine without a reset.
Remove the software reset.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
MCAN message ram should only be accessed once clocks are enabled.
Therefore, move the call to parse/init the message ram to after
clocks are enabled.
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
pm_runtime_get_sync() returns a 1 if the state of the device is already
'active'. This is not a failure case and should return a success.
Therefore fix error handling for pm_runtime_get_sync() call such that
it returns success when the value is 1.
Also cleanup the TODO for using runtime PM for sleep mode as that is
implemented.
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Cc: <stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
of_iomap() can return NULL so that return needs to be checked and NULL
treated as failure. While at it also take care of the missing
of_node_put() in the error path.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Fixes: commit afa17a500a ("net/can: add driver for mscan family & mpc52xx_mscan")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Inside m_can_chip_config(), when setting up the new value of the CCCR,
the CCCR_NISO bit is not cleared like the others, CCCR_TEST, CCCR_MON,
CCCR_BRSE and CCCR_FDOE, before checking the can.ctrlmode bits for
CAN_CTRLMODE_FD_NON_ISO.
This way once the controller was configured for CAN_CTRLMODE_FD_NON_ISO,
this mode could never be cleared again.
This fix is only relevant for controllers with version 3.1.x or 3.2.x.
Older versions do not support NISO.
Signed-off-by: Roman Fietze <roman.fietze@telemotive.de>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The DMA logic in firmwares < v3.3.0 embedded in the PCAN-PCIe FD cards
family is not capable of handling a mix of 32-bit and 64-bit logical
addresses. If the board is equipped with 2 or 4 CAN ports, then such a
situation might lead to a PCIe Bus Error "Malformed TLP" packet
as well as "irq xx: nobody cared" issue.
This patch adds a workaround that requests only 32-bit DMA addresses
when these might be allocated outside of the 4 GB area.
This issue has been fixed in firmware v3.3.0 and next.
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The dispatcher and the executer process the parse nodes During table
load. Error status from the evaluation confuses the AML parser. This
results in the parser failing to complete parsing of the current
scope op which becomes problematic. For the incorrect AML below, _ADR
never gets created.
definition_block(...)
{
Scope (\_SB)
{
Device (PCI0){...}
Name (OBJ1, 0x0)
OBJ1 = PCI0 + 5 // Results in an operand error.
} // \_SB not closed
// parser looks for \_SB._SB.PCI0, results in AE_NOT_FOUND error
// Entire scope block gets skipped.
Scope (\_SB.PCI0)
{
Name (_ADR, 0x0)
}
}
Fix the above error by properly completing the initial \_SB scope
after an error by clearing errors that occur during table load. In
the above case, this means that OBJ1 = PIC0 + 5 is skipped.
Fixes: 5088814a6e (ACPICA: AML parser: attempt to continue loading table after error)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200363
Tested-by: Bastien Nocera <hadess@hadess.net>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull NVMe fixes from Christoph Hellwig:
- fix a regression in 4.18 that causes a memory leak on probe failure
(Keith Bush)
- fix a deadlock in the passthrough ioctl code (Scott Bauer)
- don't enable AENs if not supported (Weiping Zhang)
- fix an old regression in metadata handling in the passthrough ioctl
code (Roland Dreier)
* tag 'nvme-for-4.18' of git://git.infradead.org/nvme:
nvme: fix handling of metadata_len for NVME_IOCTL_IO_CMD
nvme: don't enable AEN if not supported
nvme: ensure forward progress during Admin passthru
nvme-pci: fix memory leak on probe failure
Pull vfs fixes from Al Viro:
"Fix several places that screw up cleanups after failures halfway
through opening a file (one open-coding filp_clone_open() and getting
it wrong, two misusing alloc_file()). That part is -stable fodder from
the 'work.open' branch.
And Christoph's regression fix for uapi breakage in aio series;
include/uapi/linux/aio_abi.h shouldn't be pulling in the kernel
definition of sigset_t, the reason for doing so in the first place had
been bogus - there's no need to expose struct __aio_sigset in
aio_abi.h at all"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
aio: don't expose __aio_sigset in uapi
ocxlflash_getfile(): fix double-iput() on alloc_file() failures
cxl_getfile(): fix double-iput() on alloc_file() failures
drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()
kernel_wait4() expects a userland address for status - it's only
rusage that goes as a kernel one (and needs a copyout afterwards)
[ Also, fix the prototype of kernel_wait4() to have that __user
annotation - Linus ]
Fixes: 92ebce5ac5 ("osf_wait4: switch to kernel_wait4()")
Cc: stable@kernel.org # v4.13+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Prevent drivers from building on PPC32 if they use isa_bus_to_virt(),
isa_virt_to_bus(), or isa_page_to_bus(), which are not available and
thus cause build errors.
../drivers/net/ethernet/3com/3c515.c: In function 'corkscrew_open':
../drivers/net/ethernet/3com/3c515.c:824:9: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]
../drivers/net/ethernet/amd/lance.c: In function 'lance_rx':
../drivers/net/ethernet/amd/lance.c:1203:23: error: implicit declaration of function 'isa_bus_to_virt'; did you mean 'bus_to_virt'? [-Werror=implicit-function-declaration]
../drivers/net/ethernet/amd/ni65.c: In function 'ni65_init_lance':
../drivers/net/ethernet/amd/ni65.c:585:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]
../drivers/net/ethernet/cirrus/cs89x0.c: In function 'net_open':
../drivers/net/ethernet/cirrus/cs89x0.c:897:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously only the neighbour state was checked to decide if an offloaded
entry should be removed. However, there can be situations when the entry
is dead but still marked as valid. This can lead to dead entries not
being removed from fw tables or even incorrect data being added.
Check the entry dead bit before deciding if it should be added to or
removed from fw neighbour tables.
Fixes: 8e6a9046b6 ("nfp: flower vxlan neighbour offload")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu says:
====================
vxlan: fix default fdb entry user-space notify ordering/race
Problem:
In vxlan_newlink, a default fdb entry is added before register_netdev.
The default fdb creation function notifies user-space of the
fdb entry on the vxlan device which user-space does not know about yet.
(RTM_NEWNEIGH goes before RTM_NEWLINK for the same ifindex).
This series fixes the user-space netlink notification ordering issue
with the following changes:
- decouple fdb notify from fdb create.
- Move fdb notify after register_netdev.
- modify rtnl_configure_link to allow configuring a link early.
- Call rtnl_configure_link in vxlan newlink handler to notify
userspace about the newlink before fdb notify and
hence avoiding the user-space race.
====================
Fixes: afbd8bae9c ("vxlan: add implicit fdb entry for default destination")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Problem:
In vxlan_newlink, a default fdb entry is added before register_netdev.
The default fdb creation function also notifies user-space of the
fdb entry on the vxlan device which user-space does not know about yet.
(RTM_NEWNEIGH goes before RTM_NEWLINK for the same ifindex).
This patch fixes the user-space netlink notification ordering issue
with the following changes:
- decouple fdb notify from fdb create.
- Move fdb notify after register_netdev.
- Call rtnl_configure_link in vxlan newlink handler to notify
userspace about the newlink before fdb notify and
hence avoiding the user-space race.
Fixes: afbd8bae9c ("vxlan: add implicit fdb entry for default destination")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new option do_notify to vxlan_fdb_destroy to make
sending netlink notify optional. Used by a later patch.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Add new vxlan_fdb_alloc helper
- rename existing vxlan_fdb_create into vxlan_fdb_update:
because it really creates or updates an existing
fdb entry
- move new fdb creation into a separate vxlan_fdb_create
Main motivation for this change is to introduce the ability
to decouple vxlan fdb creation and notify, used in a later patch.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rtnl_configure_link sets dev->rtnl_link_state to
RTNL_LINK_INITIALIZED and unconditionally calls
__dev_notify_flags to notify user-space of dev flags.
current call sequence for rtnl_configure_link
rtnetlink_newlink
rtnl_link_ops->newlink
rtnl_configure_link (unconditionally notifies userspace of
default and new dev flags)
If a newlink handler wants to call rtnl_configure_link
early, we will end up with duplicate notifications to
user-space.
This patch fixes rtnl_configure_link to check rtnl_link_state
and call __dev_notify_flags with gchanges = 0 if already
RTNL_LINK_INITIALIZED.
Later in the series, this patch will help the following sequence
where a driver implementing newlink can call rtnl_configure_link
to initialize the link early.
makes the following call sequence work:
rtnetlink_newlink
rtnl_link_ops->newlink (vxlan) -> rtnl_configure_link (initializes
link and notifies
user-space of default
dev flags)
rtnl_configure_link (updates dev flags if requested by user ifm
and notifies user-space of new dev flags)
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Got crash report with following backtrace:
BUG: unable to handle kernel paging request at ffff8801869daffe
RIP: 0010:[<ffffffff816429c4>] [<ffffffff816429c4>] ip6_finish_output2+0x394/0x4c0
RSP: 0018:ffff880186c83a98 EFLAGS: 00010283
RAX: ffff8801869db00e ...
[<ffffffff81644cdc>] ip6_finish_output+0x8c/0xf0
[<ffffffff81644d97>] ip6_output+0x57/0x100
[<ffffffff81643dc9>] ip6_forward+0x4b9/0x840
[<ffffffff81645566>] ip6_rcv_finish+0x66/0xc0
[<ffffffff81645db9>] ipv6_rcv+0x319/0x530
[<ffffffff815892ac>] netif_receive_skb+0x1c/0x70
[<ffffffffc0060bec>] atl1c_clean+0x1ec/0x310 [atl1c]
...
The bad access is in neigh_hh_output(), at skb->data - 16 (HH_DATA_MOD).
atl1c driver provided skb with no headroom, so 14 bytes (ethernet
header) got pulled, but then 16 are copied.
Reserve NET_SKB_PAD bytes headroom, like netdev_alloc_skb().
Compile tested only; I lack hardware.
Fixes: 7b70176421 ("atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two scenarios that we will restore deleted records. The first is
when device down and up(or unmap/remap). In this scenario the new filter
mode is same with previous one. Because we get it from in_dev->mc_list and
we do not touch it during device down and up.
The other scenario is when a new socket join a group which was just delete
and not finish sending status reports. In this scenario, we should use the
current filter mode instead of restore old one. Here are 4 cases in total.
old_socket new_socket before_fix after_fix
IN(A) IN(A) ALLOW(A) ALLOW(A)
IN(A) EX( ) TO_IN( ) TO_EX( )
EX( ) IN(A) TO_EX( ) ALLOW(A)
EX( ) EX( ) TO_EX( ) TO_EX( )
Fixes: 24803f38a5 (igmp: do not remove igmp souce list info when set link down)
Fixes: 1666d49e1d (mld: do not remove mld souce list info when set link down)
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
free_irq() waits until all handlers for this IRQ have completed. As the
relevant handler (mv88e6xxx_g1_irq_thread_fn()) takes the chip's reg_lock
it might never return if the thread calling free_irq() holds this lock.
For the same reason kthread_cancel_delayed_work_sync() in the polling case
must not hold this lock.
Also first free the irq (or stop the worker respectively) such that
mv88e6xxx_g1_irq_thread_work() isn't called any more before the irq
mappings are dropped in mv88e6xxx_g1_irq_free_common() to prevent the
worker thread to call handle_nested_irq(0) which results in a NULL-pointer
exception.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Example setup:
host: ip -6 addr add dev eth1 2001:db8:104::4
where eth1 is enslaved to a VRF
switch: ip -6 ro add 2001:db8:104::4/128 dev br1
where br1 only has an LLA
ping6 2001:db8:104::4
ssh 2001:db8:104::4
(NOTE: UDP works fine if the PKTINFO has the address set to the global
address and ifindex is set to the index of eth1 with a destination an
LLA).
For ICMP, icmp6_iif needs to be updated to check if skb->dev is an
L3 master. If it is then return the ifindex from rt6i_idev similar
to what is done for loopback.
For TCP, restore the original tcp_v6_iif definition which is needed in
most places and add a new tcp_v6_iif_l3_slave that considers the
l3_slave variability. This latter check is only needed for socket
lookups.
Fixes: 9ff7438460 ("net: vrf: Handle ipv6 multicast and link-local addresses")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ARM SoC fixes from Olof Johansson:
- Fix interrupt type on ethernet switch for i.MX-based RDU2
- GPC on i.MX exposed too large a register window which resulted in
userspace being able to crash the machine.
- Fixup of bad merge resolution moving GPIO DT nodes under pinctrl on
droid4.
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switch
soc: imx: gpc: restrict register range for regmap access
ARM: dts: omap4-droid4: fix dts w.r.t. pwm
Pull x86 fix from Ingo Molnar:
"A single fix for a MCE-polling regression, which prevented the
disabling of polling"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/MCE: Remove min interval polling limitation
Pull x86 pti fixes from Ingo Molnar:
"An APM fix, and a BTS hardware-tracing fix related to PTI changes"
* 'x86-pti-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apm: Don't access __preempt_count with zeroed fs
x86/events/intel/ds: Fix bts_interrupt_threshold alignment
Pull scheduler fixes from Ingo Molnar:
"Two fixes: a stop-machine preemption fix and a SCHED_DEADLINE fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/deadline: Fix switched_from_dl() warning
stop_machine: Disable preemption when waking two stopper threads
Pull core kernel fixes from Ingo Molnar:
"This is mostly the copy_to_user_mcsafe() related fixes from Dan
Williams, and an ORC fix for Clang"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm/memcpy_mcsafe: Fix copy_to_user_mcsafe() exception handling
lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe()
lib/iov_iter: Document _copy_to_iter_flushcache()
lib/iov_iter: Document _copy_to_iter_mcsafe()
objtool: Use '.strtab' if '.shstrtab' doesn't exist, to support ORC tables on Clang
Pull powerpc fixes from Michael Ellerman:
"Two regression fixes, one for xmon disassembly formatting and the
other to fix the E500 build.
Two commits to fix a potential security issue in the VFIO code under
obscure circumstances.
And finally a fix to the Power9 idle code to restore SPRG3, which is
user visible and used for sched_getcpu().
Thanks to: Alexey Kardashevskiy, David Gibson. Gautham R. Shenoy,
James Clarke"
* tag 'powerpc-4.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv: Fix save/restore of SPRG3 on entry/exit from stop (idle)
powerpc/Makefile: Assemble with -me500 when building for E500
KVM: PPC: Check if IOMMU page is contained in the pinned physical page
vfio/spapr: Use IOMMU pageshift rather than pagesize
powerpc/xmon: Fix disassembly since printf changes
Pull btrfs fix from David Sterba:
"A fix of a corruption regarding fsync and clone, under some very
specific conditions explained in the patch.
The fix is marked for stable 3.16+ so I'd like to get it merged now
given the impact"
* tag 'for-4.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix file data corruption after cloning a range and fsync
Fix following warning:
net/ipv4/bpfilter/sockopt.c:28:5: error: symbol 'bpfilter_ip_set_sockopt' redeclared with different type
net/ipv4/bpfilter/sockopt.c:34:5: error: symbol 'bpfilter_ip_get_sockopt' redeclared with different type
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The situation described in the comment can occur also with
PHY_IGNORE_INTERRUPT, therefore change the condition to include it.
Fixes: f555f34fdc ("net: phy: fix auto-negotiation stall due to unavailable interrupt")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru says:
====================
qed: Fix series II.
The patch series fixes few issues in the qed driver.
Please consider applying it to 'net' branch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
FW hsi contains 256 approximation buckets which are split in ramrod into
eight u32 values, but driver is using eight 'unsigned long' variables.
This patch fixes the mcast logic by making the API utilize u32.
Fixes: 83aeb933 ("qed*: Trivial modifications")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a possible race where driver can read link status in mid-transition
and see that virtual-link is up yet speed is 0. Since in this
mid-transition we're guaranteed to see a mailbox from MFW soon, we can
afford to treat this as link down.
Fixes: cc875c2e ("qed: Add link support")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Apparently, MFW publishes EEE capabilities even for Fiber-boards that don't
support them, and later since qed internally sets adv_caps it would cause
link-flap avoidance (LFA) to fail when driver would initiate the link.
This in turn delays the link, causing traffic to fail.
Driver has been modified to not to ask MFW for any EEE config if EEE isn't
to be enabled.
Fixes: 645874e5 ("qed: Add support for Energy efficient ethernet.")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a missing rcu_read_unlock in the error path
Fixes: c95567c803 ("caif: added check for potential null return")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Like vm_area_dup(), it initializes the anon_vma_chain head, and the
basic mm pointer.
The rest of the fields end up being different for different users,
although the plan is to also initialize the 'vm_ops' field to a dummy
entry.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
.. and re-initialize th eanon_vma_chain head.
This removes some boiler-plate from the users, and also makes it clear
why it didn't need use the 'zalloc()' version.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The vm_area_struct is one of the most fundamental memory management
objects, but the management of it is entirely open-coded evertwhere,
ranging from allocation and freeing (using kmem_cache_[z]alloc and
kmem_cache_free) to initializing all the fields.
We want to unify this in order to end up having some unified
initialization of the vmas, and the first step to this is to at least
have basic allocation functions.
Right now those functions are literally just wrappers around the
kmem_cache_*() calls. This is a purely mechanical conversion:
# new vma:
kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc()
# copy old vma
kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old)
# free vma
kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma)
to the point where the old vma passed in to the vm_area_dup() function
isn't even used yet (because I've left all the old manual initialization
alone).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge fixes from Andrew Morton:
"5 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: memcg: fix use after free in mem_cgroup_iter()
mm/huge_memory.c: fix data loss when splitting a file pmd
fat: fix memory allocation failure handling of match_strdup()
MAINTAINERS: Peter has moved
mm/memblock: add missing include <linux/bootmem.h>
It was reported that a kernel crash happened in mem_cgroup_iter(), which
can be triggered if the legacy cgroup-v1 non-hierarchical mode is used.
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
......
Call trace:
mem_cgroup_iter+0x2e0/0x6d4
shrink_zone+0x8c/0x324
balance_pgdat+0x450/0x640
kswapd+0x130/0x4b8
kthread+0xe8/0xfc
ret_from_fork+0x10/0x20
mem_cgroup_iter():
......
if (css_tryget(css)) <-- crash here
break;
......
The crashing reason is that mem_cgroup_iter() uses the memcg object whose
pointer is stored in iter->position, which has been freed before and
filled with POISON_FREE(0x6b).
And the root cause of the use-after-free issue is that
invalidate_reclaim_iterators() fails to reset the value of iter->position
to NULL when the css of the memcg is released in non- hierarchical mode.
Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.com
Fixes: 6df38689e0 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
Signed-off-by: Jing Xia <jing.xia.mail@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <chunyan.zhang@unisoc.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__split_huge_pmd_locked() must check if the cleared huge pmd was dirty,
and propagate that to PageDirty: otherwise, data may be lost when a huge
tmpfs page is modified then split then reclaimed.
How has this taken so long to be noticed? Because there was no problem
when the huge page is written by a write system call (shmem_write_end()
calls set_page_dirty()), nor when the page is allocated for a write fault
(fault_dirty_shared_page() calls set_page_dirty()); but when allocated for
a read fault (which MAP_POPULATE simulates), no set_page_dirty().
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1807111741430.1106@eggly.anvils
Fixes: d21b9e57c7 ("thp: handle file pages in split_huge_pmd()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Ashwin Chaugule <ashwinch@google.com>
Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 26f09e9b3a ("mm/memblock: add memblock memory allocation apis")
introduced two new function definitions:
memblock_virt_alloc_try_nid_nopanic()
memblock_virt_alloc_try_nid()
and commit ea1f5f3712 ("mm: define memblock_virt_alloc_try_nid_raw")
introduced the following function definition:
memblock_virt_alloc_try_nid_raw()
This commit adds an include of header file <linux/bootmem.h> to provide
the missing function prototypes. This silences the following gcc warning
(W=1):
mm/memblock.c:1334:15: warning: no previous prototype for `memblock_virt_alloc_try_nid_raw' [-Wmissing-prototypes]
mm/memblock.c:1371:15: warning: no previous prototype for `memblock_virt_alloc_try_nid_nopanic' [-Wmissing-prototypes]
mm/memblock.c:1407:15: warning: no previous prototype for `memblock_virt_alloc_try_nid' [-Wmissing-prototypes]
Also adds #ifdef blockers to prevent compilation failure on mips/ia64
where CONFIG_NO_BOOTMEM=n as could be seen in commit commit 6cc22dc08a
("revert "mm/memblock: add missing include <linux/bootmem.h>"").
Because Makefile already does:
obj-$(CONFIG_HAVE_MEMBLOCK) += memblock.o
The #ifdef has been simplified from:
#if defined(CONFIG_HAVE_MEMBLOCK) && defined(CONFIG_NO_BOOTMEM)
to simply:
#if defined(CONFIG_NO_BOOTMEM)
Link: http://lkml.kernel.org/r/20180626184422.24974-1-malat@debian.org
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Suggested-by: Tony Luck <tony.luck@intel.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For some time now, if you load the bonding driver and configure bond
parameters via sysfs using minimal config options, such as specifying
nothing but the mode, relying on defaults for everything else, modes
that cannot use arp monitoring (802.3ad, balance-tlb, balance-alb) all
wind up with both arp_interval=0 (as it should be) and miimon=0, which
means the miimon monitor thread never actually runs. This is particularly
problematic for 802.3ad.
For example, from an LNST recipe I've set up:
$ modprobe bonding max_bonds=0"
$ echo "+t_bond0" > /sys/class/net/bonding_masters"
$ ip link set t_bond0 down"
$ echo "802.3ad" > /sys/class/net/t_bond0/bonding/mode"
$ ip link set ens1f1 down"
$ echo "+ens1f1" > /sys/class/net/t_bond0/bonding/slaves"
$ ip link set ens1f0 down"
$ echo "+ens1f0" > /sys/class/net/t_bond0/bonding/slaves"
$ ethtool -i t_bond0"
$ ip link set ens1f1 up"
$ ip link set ens1f0 up"
$ ip link set t_bond0 up"
$ ip addr add 192.168.9.1/24 dev t_bond0"
$ ip addr add 2002::1/64 dev t_bond0"
This bond comes up okay, but things look slightly suspect in
/proc/net/bonding/t_bond0 output:
$ grep -i mii /proc/net/bonding/t_bond0
MII Status: up
MII Polling Interval (ms): 0
MII Status: up
MII Status: up
Now, pull a cable on one of the ports in the bond, then reconnect it, and
you'll see:
Slave Interface: ens1f0
MII Status: down
Speed: 1000 Mbps
Duplex: full
I believe this became a major issue as of commit 4d2c0cda07, which for
802.3ad bonds, sets slave->link = BOND_LINK_DOWN, with a comment about
relying on link monitoring via miimon to set it correctly, but since the
miimon work queue never runs, the link just stays marked down.
If we simply tweak bond_option_mode_set() slightly, we can check for the
non-arp modes having no miimon value set, and insert BOND_DEFAULT_MIIMON,
which gets things back in full working order. This problem exists as far
back as 4.14, and might be worth fixing in all stable trees since, though
the work-around is to simply specify an miimon value yourself.
Reported-by: Bob Ball <ball@umich.edu>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-07-18
The following series provides fixes to mlx5 core and net device driver.
Please pull and let me know if there's any problem.
For -stable v4.7
net/mlx5e: Don't allow aRFS for encapsulated packets
net/mlx5e: Fix quota counting in aRFS expire flow
For -stable v4.15
net/mlx5e: Only allow offloading decap egress (egdev) flows
net/mlx5e: Refine ets validation function
net/mlx5: Adjust clock overflow work period
For -stable v4.17
net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_mode
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2018-07-20
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix in BPF Makefile to detect llvm-objcopy in a more robust way which is
needed for pahole's BTF converter and minor UAPI tweaks in BTF_INT_BITS()
to shrink the mask before eventual UAPI freeze, from Martin.
2) Fix a segfault in bpftool when prog pin id has no further arguments such
as id value or file specified, from Taeung.
3) Fix powerpc JIT handling of XADD which has jumps to exit path that would
potentially bypass verifier expectations e.g. with subprog calls. Also add
a test case to make sure XADD is not mangling src/dst register, from Daniel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on USB2.0 Spec Section 11.12.5,
"If a hub has per-port power switching and per-port current limiting,
an over-current on one port may still cause the power on another port
to fall below specific minimums. In this case, the affected port is
placed in the Power-Off state and C_PORT_OVER_CURRENT is set for the
port, but PORT_OVER_CURRENT is not set."
so let's check C_PORT_OVER_CURRENT too for over current condition.
Fixes: 08d1dec6f4 ("usb:hub set hub->change_bits when over-current happens")
Cc: <stable@vger.kernel.org>
Tested-by: Alessandro Antenucci <antenucci@korg.it>
Signed-off-by: Bin Liu <b-liu@ti.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current code does not check sk->sk_shutdown & RCV_SHUTDOWN.
tls_sw_recvmsg may return a positive value in the case where bytes have
already been copied when the socket is shutdown. sk->sk_err has been
cleared, causing the tls_wait_data to hang forever on a subsequent
invocation. Checking sk->sk_shutdown & RCV_SHUTDOWN, as in tcp_recvmsg,
fixes this problem.
Fixes: c46234ebb4 ("tls: RX path for ktls")
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng says:
====================
fix DCTCP ECE Ack series
This patch set address that the existing DCTCP implementation does not
fully implement the ACK policy specified in the RFC. This improves
the responsiveness of CE status change particularly on flows with
small inflight.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Per DCTCP RFC8257 (Section 3.2) the ACK reflecting the CE status change
has to be sent immediately so the sender can respond quickly:
""" When receiving packets, the CE codepoint MUST be processed as follows:
1. If the CE codepoint is set and DCTCP.CE is false, set DCTCP.CE to
true and send an immediate ACK.
2. If the CE codepoint is not set and DCTCP.CE is true, set DCTCP.CE
to false and send an immediate ACK.
"""
Previously DCTCP implementation may continue to delay the ACK. This
patch fixes that to implement the RFC by forcing an immediate ACK.
Tested with this packetdrill script provided by Larry Brakmo
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_SOCKET, SO_DEBUG, [1], 4) = 0
0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect01] . 1:1(0) ack 1001
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 1:2(1) ack 1001
0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
+0.005 < [ce] . 2001:3001(1000) ack 2 win 257
+0.000 > [ect01] . 2:2(0) ack 2001
// Previously the ACK below would be delayed by 40ms
+0.000 > [ect01] E. 2:2(0) ack 3001
+0.500 < F. 9501:9501(0) ack 4 win 257
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when a DCTCP receiver delays an ACK and receive a
data packet with a different CE mark from the previous one's, it
sends two immediate ACKs acking previous and latest sequences
respectly (for ECN accounting).
Previously sending the first ACK may mark off the delayed ACK timer
(tcp_event_ack_sent). This may subsequently prevent sending the
second ACK to acknowledge the latest sequence (tcp_ack_snd_check).
The culprit is that tcp_send_ack() assumes it always acknowleges
the latest sequence, which is not true for the first special ACK.
The fix is to not make the assumption in tcp_send_ack and check the
actual ack sequence before cancelling the delayed ACK. Further it's
safer to pass the ack sequence number as a local variable into
tcp_send_ack routine, instead of intercepting tp->rcv_nxt to avoid
future bugs like this.
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull VFIO fix from Alex Williamson:
"Harden potential Spectre v1 issue (Gustavo A. R. Silva)"
* tag 'vfio-v4.18-rc6' of git://github.com/awilliam/linux-vfio:
vfio/pci: Fix potential Spectre v1
Pull device mapper fix from Mike Snitzer:
"Fix DM writecache target to allow an optional offset to the start of
the data and metadata area.
This allows userspace tools (e.g. LVM2) to place a header and metadata
at the front of the writecache device for its use"
* tag 'for-4.18/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm writecache: support optional offset for start of device
i.MX fixes for 4.18, round 4:
- A fix for i.MX6 RDU2 board on the wrong IRQ type of Marvell switch,
which might result in a race condition in the interrupt handler and
cause the OS to miss all future events.
* tag 'imx-fixes-4.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switch
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull SCSI fixes from James Bottomley:
"A set of 8 obvious fixes.
Three (2 qla2xxx and the cxlflash oopses) are regressions, two from
4.17 and one from the merge window. The hpsa change is user visible,
but it fixes an error users have complained about"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: cxlflash: fix assignment of the backend operations
scsi: qedi: Send driver state to MFW
scsi: qedf: Send the driver state to MFW
scsi: hpsa: correct enclosure sas address
scsi: sd_zbc: Fix variable type and bogus comment
scsi: qla2xxx: Fix NULL pointer dereference for fcport search
scsi: qla2xxx: Fix kernel crash due to late workqueue allocation
scsi: qla2xxx: Fix inconsistent DMA mem alloc/free
Pull IOMMU fix from Joerg Roedel:
"Only one revert, for an an Intel VT-d patch that caused issues with
the i915 GPU driver"
* tag 'iommu-fixes-v4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
Revert "iommu/vt-d: Clean up pasid quirk for pre-production devices"
Pull x86 platform driver fixes from Andy Shevchenko:
"The Dell laptop ACPI video brightness control is now back after fixing
a regression brought by SMM refactoring"
* tag 'platform-drivers-x86-v4.18-2' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: dell-laptop: Fix backlight detection
Pull nds32 updates from Greentime Hu:
"Bug fixes and build ixes for nds32"
* tag 'nds32-for-linus-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux:
nds32: fix build error "relocation truncated to fit: R_NDS32_25_PCREL_RELA" when make allyesconfig
nds32: To simplify the implementation of update_mmu_cache()
nds32: Fix the dts pointer is not passed correctly issue.
nds32: To implement these icache invalidation APIs since nds32 cores don't snoop data cache. This issue is found by Guo Ren. Based on the Documentation/core-api/cachetlb.rst and it says:
nds32: Fix build error caused by configuration flag rename
nds32: define __NDS32_E[BL]__ for sparse
Pull power management fix from Rafael Wysocki:
"Fix a relatively old initialization issue in intel_pstate causing the
pcc-cpufreq driver to be used instead of it on some HP Proliant
systems.
This turned into a functional regression during the 4.17 cycle,
because pcc-cpufreq is a scalability disaster and that was amplified
by the idle loop rework done at that time (Rafael Wysocki).
* tag 'pm-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Register when ACPI PCCH is present
Pull ACPI fix from Rafael Wysocki:
"Extend the recently added suspend-to-idle quirk for Thinkpad X1 Carbon
6th to other systems from that familiy which turned out to need it too
(Robin Johnson)"
* tag 'acpi-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / EC: Use ec_no_wakeup on more Thinkpad X1 Carbon 6th systems
VPID for the nested vcpu is allocated at vmx_create_vcpu whenever nested
vmx is turned on with the module parameter.
However, it's only freed if the L1 guest has executed VMXON which is not
a given.
As a result, on a system with nested==on every creation+deletion of an
L1 vcpu without running an L2 guest results in leaking one vpid. Since
the total number of vpids is limited to 64k, they can eventually get
exhausted, preventing L2 from starting.
Delay allocation of the L2 vpid until VMXON emulation, thus matching its
freeing.
Fixes: 5c614b3583
Cc: stable@vger.kernel.org
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not expose the address of vmx->nested.current_vmptr to
kvm_write_guest_virt_system() as the resulting __copy_to_user()
call will trigger a WARN when CONFIG_HARDENED_USERCOPY is
enabled.
Opportunistically clean up variable names in handle_vmptrst()
to improve readability, e.g. vmcs_gva is misleading as the
memory operand of VMPTRST is plain memory, not a VMCS.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Tested-by: Peter Shier <pshier@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The old code in nvme_user_cmd() passed the userspace virtual address
from nvme_passthru_cmd.metadata as the length of the metadata buffer
as well as the address to nvme_submit_user_cmd().
Fixes: 63263d60 ("nvme: Use metadata for passthrough commands")
Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There is a bug in the sink PDO search code when trying to select
a PPS APDO. The current code actually sets the starting index for
searching to whatever value 'i' is, rather than choosing index 1
to avoid the first PDO (always 5V fixed). As a result, for sources
which support PPS but whose PPS APDO index does not match with the
supporting sink PPS APDO index for the platform, no valid PPS APDO
will be found so this feature will not be permitted.
Sadly in testing, both Source and Sink capabilities matched up and
this was missed. Code is now updated to correctly set the start
index to 1, and testing with additional PPS capable sources show
this to work as expected.
Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Fixes: 2eadc33f40 ("typec: tcpm: Add core support for sink side PPS")
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 1b9ba000 ("Allow function drivers to pause control
transfers") states that USB_GADGET_DELAYED_STATUS is only
supported if data phase is 0 bytes.
It seems that when the length is not 0 bytes, there is no
need to explicitly delay the data stage since the transfer
is not completed until the user responds. However, when the
length is 0, there is no data stage and the transfer is
finished once setup() returns, hence there is a need to
explicitly delay completion.
This manifests as the following bugs:
Prior to 946ef68ad4 ('Let setup() return
USB_GADGET_DELAYED_STATUS'), when setup is 0 bytes, ffs
would require user to queue a 0 byte request in order to
clear setup state. However, that 0 byte request was actually
not needed and would hang and cause errors in other setup
requests.
After the above commit, 0 byte setups work since the gadget
now accepts empty queues to ep0 to clear the delay, but all
other setups hang.
Fixes: 946ef68ad4 ("Let setup() return USB_GADGET_DELAYED_STATUS")
Signed-off-by: Jerry Zhang <zhangjerry@google.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When first DCCP packet is SYNC or SYNCACK, we insert a new conntrack
that has an un-initialized timeout value, i.e. such entry could be
reaped at any time.
Mark them as INVALID and only ignore SYNC/SYNCACK when connection had
an old state.
Reported-by: syzbot+6f18401420df260e37ed@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Its possible to rename two chains to the same name in one
transaction:
nft add chain t c1
nft add chain t c2
nft 'rename chain t c1 c3;rename chain t c2 c3'
This creates two chains named 'c3'.
Appears to be harmless, both chains can still be deleted both
by name or handle, but, nevertheless, its a bug.
Walk transaction log and also compare vs. the pending renames.
Both chains can still be deleted, but nevertheless it is a bug as
we don't allow to create chains with identical names, so we should
prevent this from happening-by-rename too.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The new name is stored in the transaction metadata, on commit,
the pointers to the old and new names are swapped.
Therefore in abort and commit case we have to free the
pointer in the chain_trans container.
In commit case, the pointer can be used by another cpu that
is currently dumping the renamed chain, thus kfree needs to
happen after waiting for rcu readers to complete.
Fixes: b7263e071a ("netfilter: nf_tables: Allow chain name of up to 255 chars")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
no need to store the name in separate area.
Furthermore, it uses kmalloc but not kfree and most accesses seem to treat
it as char[IFNAMSIZ] not char *.
Remove this and use dev->name instead.
In case event zeroed dev, just omit the name in the dump.
Fixes: d92191aa84 ("netfilter: nf_tables: cache device name in flowtable object")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fix return code check for "max brightness" ACPI call.
The Dell laptop ACPI video brightness control is not present on dell
laptops anymore, but was present in older kernel versions.
The code that checks the return value is incorrect since the SMM
refactoring.
The old code was:
if (buffer->output[0] == 0)
Which was changed to:
ret = dell_send_request(...)
if (ret)
However, dell_send_request() will return 0 if buffer->output[0] == 0,
so we must change the check to:
if (ret == 0)
This issue was found on a Dell M4800 laptop, and the fix tested on it
as well.
Fixes: 549b4930f0 ("dell-smbios: Introduce dispatcher for SMM calls")
Signed-off-by: Damien Thébault <damien@dtbo.net>
Tested-by: Damien Thébault <damien@dtbo.net>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
I noticed the "--version" option of the llvm-objcopy command has recently
disappeared from the master llvm branch. It is currently used as a BTF
support test in tools/testing/selftests/bpf/Makefile.
This patch replaces it with "--help" which should be
less error prone in the future.
Fixes: c0fa1b6c3e ("bpf: btf: Add BTF tests")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch shrinks the BTF_INT_BITS() mask. The current
btf_int_check_meta() ensures the nr_bits of an integer
cannot exceed 64. Hence, it is mostly an uapi cleanup.
The actual btf usage (i.e. seq_show()) is also modified
to use u8 instead of u16. The verification (e.g. btf_int_check_meta())
path stays as is to deal with invalid BTF situation.
Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Arguments of 'pin' subcommand should be checked
at the very beginning of do_pin_any().
Otherwise segfault errors can occur when using
'map pin' or 'prog pin' commands, so fix it.
# bpftool prog pin id
Segmentation fault
Fixes: 71bb428fe2 ("tools: bpf: add bpftool")
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The calculation of "wqe_size" is not correct when the tx queue is busy in
hinic_xmit_frame().
When there are no free WQEs, the tx flow will unmap the skb buffer, then
ring the doobell for the pending packets. But the "wqe_size" which used
to calculate the doorbell address is not correct. The wqe size should be
cleared to 0, otherwise, it will cause a doorbell error.
This patch fixes the problem.
Reported-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This was detected by the self-test thanks to Ard's chunking patch.
I finally got around to testing this out on my ancient Via box. It
turns out that the workaround got the assembly wrong and we end up
doing count + initial cycles of the loop instead of just count.
This obviously causes corruption, either by overwriting the source
that is yet to be processed, or writing over the end of the buffer.
On CPUs that don't require the workaround only ECB is affected.
On Nano CPUs both ECB and CBC are affected.
This patch fixes it by doing the subtraction prior to the assembly.
Fixes: a76c1c23d0 ("crypto: padlock-aes - work around Nano CPU...")
Cc: <stable@vger.kernel.org>
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull drm fixes from Dave Airlie:
"Just two sets of driver fixes this week to follow up on the set from
earlier in the week and hopefully get me realigned schedule wise.
amdgpu:
- ACP fix for boards with multiple I2S instances
- DP fix for CZ, vega
- hybrid laptop fixes
- Resume regression fix
nouveau:
- large memory systems and Pascal fix
- MST race fixes
- runtime PM fix"
* tag 'drm-fixes-2018-07-20' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau/fb/gp100-: disable address remapper
drm/amd/amdgpu: creating two I2S instances for stoney/cz (v2)
drm/amdgpu: add another ATPX quirk for TOPAZ
drm/amd/display: Fix DP HBR2 Eye Diagram Pattern on Carrizo
drm/amdgpu: Make sure IB tests flushed after IP resume
drm/nouveau: Set DRIVER_ATOMIC cap earlier to fix debugfs
drm/nouveau: Remove bogus crtc check in pmops_runtime_idle
drm/nouveau/drm/nouveau: Fix runtime PM leak in nv50_disp_atomic_commit()
drm/nouveau: Avoid looping through fake MST connectors
drm/nouveau: Use drm_connector_list_iter_* for iterating connectors
drm/nouveau/gem: off by one bugs in nouveau_gem_pushbuf_reloc_apply()
drm/nouveau/kms/nv50-: ensure window updates are submitted when flushing mst disables
The Marvell switches report their interrupts in a level sensitive way.
When using edge sensitive detection a race condition in the interrupt
handler of the swich might result in the OS to miss all future events
which might make the switch non-functional.
The problem is that both mv88e6xxx_g2_irq_thread_fn() and
mv88e6xxx_g1_irq_thread_work() sample the irq cause register
(MV88E6XXX_G2_INT_SRC and MV88E6XXX_G1_STS respectively) once and then
handle the observed sources. If after sampling but before all observed
irq sources are handled a new irq source gets active this is not noticed
by the handler which returns unsuspecting, but the interrupt line stays
active which prevents the edge detector to kick in.
All device trees but imx6qdl-zii-rdu2 get this right (most of them by
not specifying an interrupt parent). So fix imx6qdl-zii-rdu2
accordingly.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: f64992d1a9 ("ARM: dts: imx6: RDU2: Add Switch interrupts")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
During unload process, the chip can encounter problem where a FW dump would
be captured. For this case, the full reset sequence will be skip to bring
the chip back to full operational state.
Fixes: e315cd28b9 ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use chip shutdown at the start of unload to stop all DMA + traffic and
bring down the laser. This prevents any link activities from triggering the
driver to be re-engaged.
Fixes: 4b60c82736 ("scsi: qla2xxx: Add fw_started flags to qpair")
Cc: <stable@vger.kernel.org> #4.16
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In case of IOCB Queue full or system where memory is low and driver
receives large number of RSCN storm, the stale sp pointer can stay on
gpnid_list resulting in page_fault.
This patch fixes this issue by initializing the sp->elem list head and
removing sp->elem before memory is freed.
Following stack trace is seen
9 [ffff987b37d1bc60] page_fault at ffffffffad516768 [exception RIP: qla24xx_async_gpnid+496]
10 [ffff987b37d1bd10] qla24xx_async_gpnid at ffffffffc039866d [qla2xxx]
11 [ffff987b37d1bd80] qla2x00_do_work at ffffffffc036169c [qla2xxx]
12 [ffff987b37d1be38] qla2x00_do_dpc_all_vps at ffffffffc03adfed [qla2xxx]
13 [ffff987b37d1be78] qla2x00_do_dpc at ffffffffc036458a [qla2xxx]
14 [ffff987b37d1bec8] kthread at ffffffffacebae31
Fixes: 2d73ac6102 ("scsi: qla2xxx: Serialize GPNID for multiple RSCN")
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes for 4.18. The ACP patch is a bit bigger than I would like
at this point, but it should have gone in long ago, it just fell
through the cracks. The others are pretty small and straight-forward.
- ACP fix for boards with 2 I2S instances
- DP fix for CZ, vega
- Fix for a hybrid graphics laptop
- Fix a resume regression
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180718162603.2747-1-alexander.deucher@amd.com
Daniel Borkmann says:
====================
This set adds a ppc64 JIT fix for xadd as well as a missing test
case for verifying whether xadd messes with src/dst reg. Thanks!
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We currently do not have such a test case in test_verifier selftests
but it's important to test under bpf_jit_enable=1 to make sure JIT
implementations do not mistakenly mess with src/dst reg for xadd/{w,dw}.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
None of the JITs is allowed to implement exit paths from the BPF
insn mappings other than BPF_JMP | BPF_EXIT. In the BPF core code
we have a couple of rewrites in eBPF (e.g. LD_ABS / LD_IND) and
in eBPF to cBPF translation to retain old existing behavior where
exceptions may occur; they are also tightly controlled by the
verifier where it disallows some of the features such as BPF to
BPF calls when legacy LD_ABS / LD_IND ops are present in the BPF
program. During recent review of all BPF_XADD JIT implementations
I noticed that the ppc64 one is buggy in that it contains two
jumps to exit paths. This is problematic as this can bypass verifier
expectations e.g. pointed out in commit f6b1b3bf0d ("bpf: fix
subprog verifier bypass by div/mod by 0 exception"). The first
exit path is obsoleted by the fix in ca36960211 ("bpf: allow xadd
only on aligned memory") anyway, and for the second one we need to
do a fetch, add and store loop if the reservation from lwarx/ldarx
was lost in the meantime.
Fixes: 156d0e290e ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF")
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
i.MX fixes for 4.18, round 3:
- Restrict GPC driver on register range that is accessible by regmap,
so that we can avoid user space from triggering imprecise external
abort exception.
* tag 'imx-fixes-4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
soc: imx: gpc: restrict register range for regmap access
Signed-off-by: Olof Johansson <olof@lixom.net>
One omap dts mismerge fix
The dts patch for droid4 PWM vibrator has added gpio6 entries to the wrong
node. Let's fix it with a note that there seems to be also other GPIO PWM
issues to fix still to get the PWM vibrator working. So this can wait for
v4.19 merge cycle if necessary.
* tag 'omap-for-v4.18/fixes-rc5-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap4-droid4: fix dts w.r.t. pwm
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull PCI fixes from Bjorn Helgaas:
- Fix crashes that happen when PHY drivers are left disabled in the V3
Semiconductor, MediaTek, Faraday, Aardvark, DesignWare, Versatile,
and X-Gene host controller drivers (Sergei Shtylyov)
- Fix a NULL pointer dereference in the endpoint library configfs
support (Kishon Vijay Abraham I)
- Fix a race condition in Hyper-V IRQ handling (Dexuan Cui)
* tag 'pci-v4.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: v3-semi: Fix I/O space page leak
PCI: mediatek: Fix I/O space page leak
PCI: faraday: Fix I/O space page leak
PCI: aardvark: Fix I/O space page leak
PCI: designware: Fix I/O space page leak
PCI: versatile: Fix I/O space page leak
PCI: xgene: Fix I/O space page leak
PCI: OF: Fix I/O space page leak
PCI: endpoint: Fix NULL pointer dereference error when CONFIGFS is disabled
PCI: hv: Disable/enable IRQs rather than BH in hv_compose_msi_msg()
This manifsted as strace segfaulting on HSDK because gcc was targetting
the accumulator registers as GPRs, which kernek was not saving/restoring
by default.
Cc: stable@vger.kernel.org #4.14+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Pull sound fixes from Takashi Iwai:
"A rawmidi race fix and three trivial HD-audio quirks"
* tag 'sound-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - Yet another Clevo P950 quirk entry
ALSA: rawmidi: Change resized buffers atomically
ALSA: hda/realtek - Add Panasonic CF-SZ6 headset jack quirk
ALSA: hda: add mute led support for HP ProBook 455 G5
The ec_no_wakeup matcher added for Thinkpad X1 Carbon 6th gen systems
beyond matched only a single DMI model (20KGS3JF01), that didn't cover
my laptop (20KH002JUS). Change to match based on DMI product family to
cover all X1 6th gen systems.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull crypto fix from Herbert Xu:
"This fixes an allocation error-path bug in af_alg discovered by
syzkaller"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: af_alg - Initialize sg_num_bytes in error code path
When we clone a range into a file we can end up dropping existing
extent maps (or trimming them) and replacing them with new ones if the
range to be cloned overlaps with a range in the destination inode.
When that happens we add the new extent maps to the list of modified
extents in the inode's extent map tree, so that a "fast" fsync (the flag
BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps
and log corresponding extent items. However, at the end of range cloning
operation we do truncate all the pages in the affected range (in order to
ensure future reads will not get stale data). Sometimes this truncation
will release the corresponding extent maps besides the pages from the page
cache. If this happens, then a "fast" fsync operation will miss logging
some extent items, because it relies exclusively on the extent maps being
present in the inode's extent tree, leading to data loss/corruption if
the fsync ends up using the same transaction used by the clone operation
(that transaction was not committed in the meanwhile). An extent map is
released through the callback btrfs_invalidatepage(), which gets called by
truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The
later ends up calling try_release_extent_mapping() which will release the
extent map if some conditions are met, like the file size being greater
than 16Mb, gfp flags allow blocking and the range not being locked (which
is the case during the clone operation) nor being the extent map flagged
as pinned (also the case for cloning).
The following example, turned into a test for fstests, reproduces the
issue:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo
$ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
# reflink destination offset corresponds to the size of file bar,
# 2728Kb minus 4Kb.
$ xfs_io -c ""reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
$ md5sum /mnt/bar
95a95813a8c2abc9aa75a6c2914a077e /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ md5sum /mnt/bar
207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar
# digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the
# power failure
In the above example, the destination offset of the clone operation
corresponds to the size of the "bar" file minus 4Kb. So during the clone
operation, the extent map covering the range from 2572Kb to 2728Kb gets
trimmed so that it ends at offset 2724Kb, and a new extent map covering
the range from 2724Kb to 11724Kb is created. So at the end of the clone
operation when we ask to truncate the pages in the range from 2724Kb to
2724Kb + 15908Kb, the page invalidation callback ends up removing the new
extent map (through try_release_extent_mapping()) when the page at offset
2724Kb is passed to that callback.
Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent
map is removed at try_release_extent_mapping(), forcing the next fsync to
search for modified extents in the fs/subvolume tree instead of relying on
the presence of extent maps in memory. This way we can continue doing a
"fast" fsync if the destination range of a clone operation does not
overlap with an existing range or if any of the criteria necessary to
remove an extent map at try_release_extent_mapping() is not met (file
size not bigger then 16Mb or gfp flags do not allow blocking).
CC: stable@vger.kernel.org # 3.16+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This was causing problems on a system with a large amount of RAM, where
display push buffers were being fetched incorrectly when placed in high
system memory addresses.
While this commit will resolve the issue on that particular system, the
issue will be avoided completely with another patch to more fully solve
problems with display and large amounts of system memory on Pascal.
It's still probably a good idea to disable this to prevent weird issues
in the future.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pull networking fixes from David Miller:
"Lots of fixes, here goes:
1) NULL deref in qtnfmac, from Gustavo A. R. Silva.
2) Kernel oops when fw download fails in rtlwifi, from Ping-Ke Shih.
3) Lost completion messages in AF_XDP, from Magnus Karlsson.
4) Correct bogus self-assignment in rhashtable, from Rishabh
Bhatnagar.
5) Fix regression in ipv6 route append handling, from David Ahern.
6) Fix masking in __set_phy_supported(), from Heiner Kallweit.
7) Missing module owner set in x_tables icmp, from Florian Westphal.
8) liquidio's timeouts are HZ dependent, fix from Nicholas Mc Guire.
9) Link setting fixes for sh_eth and ravb, from Vladimir Zapolskiy.
10) Fix NULL deref when using chains in act_csum, from Davide Caratti.
11) XDP_REDIRECT needs to check if the interface is up and whether the
MTU is sufficient. From Toshiaki Makita.
12) Net diag can do a double free when killing TCP_NEW_SYN_RECV
connections, from Lorenzo Colitti.
13) nf_defrag in ipv6 can unnecessarily hold onto dst entries for a
full minute, delaying device unregister. From Eric Dumazet.
14) Update MAC entries in the correct order in ixgbe, from Alexander
Duyck.
15) Don't leave partial mangles bpf program in jit_subprogs, from
Daniel Borkmann.
16) Fix pfmemalloc SKB state propagation, from Stefano Brivio.
17) Fix ACK handling in DCTCP congestion control, from Yuchung Cheng.
18) Use after free in tun XDP_TX, from Toshiaki Makita.
19) Stale ipv6 header pointer in ipv6 gre code, from Prashant Bhole.
20) Don't reuse remainder of RX page when XDP is set in mlx4, from
Saeed Mahameed.
21) Fix window probe handling of TCP rapair sockets, from Stefan
Baranoff.
22) Missing socket locking in smc_ioctl(), from Ursula Braun.
23) IPV6_ILA needs DST_CACHE, from Arnd Bergmann.
24) Spectre v1 fix in cxgb3, from Gustavo A. R. Silva.
25) Two spots in ipv6 do a rol32() on a hash value but ignore the
result. Fixes from Colin Ian King"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (176 commits)
tcp: identify cryptic messages as TCP seq # bugs
ptp: fix missing break in switch
hv_netvsc: Fix napi reschedule while receive completion is busy
MAINTAINERS: Drop inactive Vitaly Bordug's email
net: cavium: Add fine-granular dependencies on PCI
net: qca_spi: Fix log level if probe fails
net: qca_spi: Make sure the QCA7000 reset is triggered
net: qca_spi: Avoid packet drop during initial sync
ipv6: fix useless rol32 call on hash
ipv6: sr: fix useless rol32 call on hash
net: sched: Using NULL instead of plain integer
net: usb: asix: replace mii_nway_restart in resume path
net: cxgb3_main: fix potential Spectre v1
lib/rhashtable: consider param->min_size when setting initial table size
net/smc: reset recv timeout after clc handshake
net/smc: add error handling for get_user()
net/smc: optimize consumer cursor updates
net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
ipv6: ila: select CONFIG_DST_CACHE
net: usb: rtl8150: demote allmulti message to dev_dbg()
...
We get egress rules through the egdev mechanism when the ingress device
is not supporting offload, with the expected use-case of tunnel decap
ingress rule set on shared tunnel device.
Make sure to offload egress/egdev rules only if decap action (tunnel key
unset) exists there and err otherwise.
Fixes: 717503b9cf ("net: sched: convert cls_flower->egress_dev users to tc_setup_cb_egdev infra")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Fix bad alignment of SQ buffer in fragmented QP allocation.
It should start directly after RQ buffer ends.
Take special care of the end case where the RQ buffer does not occupy
a whole page. RQ size is a power of two, so would be the case only for
small RQ sizes (RQ size < PAGE_SIZE).
Fix wrong assignments for sqb->size (mistakenly assigned RQ size),
and for npages value of RQ and SQ.
Fixes: 3a2f703312 ("net/mlx5: Use order-0 allocations for all WQ types")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The flow counters binding support commit introduced a code change where
none NULL 'rule_dest' is always passed to mlx5_add_flow_rules, this breaks
'DON'T_TRAP' rules insertion.
The fix uses the equivalent 'dest_num' value instead of dest pointer
at the failed check.
fixes: 3b3233fbf0 ('IB/mlx5: Add flow counters binding support')
Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Driver is yet to support aRFS for encapsulated packets, return early
error in such case.
Fixes: 18c908e477 ("net/mlx5e: Add accelerated RFS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Quota should follow the amount of rules which do expire, and not the
number of rules that were examined, fixed that.
Fixes: 18c908e477 ("net/mlx5e: Add accelerated RFS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When driver converts HW timestamp to wall clock time it subtracts
the last saved cycle counter from the HW timestamp and converts the
difference to nanoseconds.
The conversion is done by multiplying the cycles difference with the
clock multiplier value as a first step and therefore the cycles
difference should be small enough so that the multiplication product
doesn't exceed 64bit.
The overflow handling routine is in charge of updating the last saved
cycle counter in driver and it is called periodically using kernel
delayed workqueue.
The delay period for this work is calculated using the max HW cycle
counter value (a 41 bit mask) as a base which doesn't take the 64bit
limit into account so the delay period may be incorrect and too
long to prevent a large difference between the HW counter and the last
saved counter in SW.
This change adjusts the work period for the HW clock overflow work by
taking the minimum between the previous value and the quotient of max
u64 value and the clock multiplier value.
Fixes: ef9814deaf ("net/mlx5e: Add HW timestamping (TS) support")
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Removed an error message received when configuring ETS total
bandwidth to be zero.
Our hardware doesn't support such configuration, so we shall
reject it in the driver. Nevertheless, we removed the error message
in order to eliminate error messages caused by old userspace tools
who try to pass such configuration.
Fixes: ff0891915c ("net/mlx5e: Fix ETS BW check")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Pull DeviceTree fixes from Rob Herring:
- Fix phandle cache to work with overlays
- Correct the default clock-frequency for QCom geni-i2c
- Binding doc quote and spelling fixes
* tag 'devicetree-fixes-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: overlay: update phandle cache on overlay apply and remove
dt-bindings: Fix unbalanced quotation marks
dt-bindings: soc: qcom: Fix default clock-freq for qcom,geni-i2c
dt-bindings: w1-gpio: Remove unneeded unit address
Documentation: devicetree: tilcdc: fix spelling mistake "suppors" -> "supports"
Attempt to make cryptic TCP seq number error messages clearer by
(1) identifying the source of the message as "TCP", (2) identifying the
errors as "seq # bug", and (3) grouping the field identifiers and values
by separating them with commas.
E.g., the following message is changed from:
recvmsg bug 2: copied 73BCB6CD seq 70F17CBE rcvnxt 73BCB9AA fl 0
WARNING: CPU: 2 PID: 1501 at /linux/net/ipv4/tcp.c:1881 tcp_recvmsg+0x649/0xb90
to:
TCP recvmsg seq # bug 2: copied 73BCB6CD, seq 70F17CBE, rcvnxt 73BCB9AA, fl 0
WARNING: CPU: 2 PID: 1501 at /linux/net/ipv4/tcp.c:2011 tcp_recvmsg+0x694/0xba0
Suggested-by: 積丹尼 Dan Jacobson <jidanni@jidanni.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It seems that a *break* is missing in order to avoid falling through
to the default case. Otherwise, checking *chan* makes no sense.
Fixes: 72df7a7244 ("ptp: Allow reassigning calibration pin function")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If out ring is full temporarily and receive completion cannot go out,
we may still need to reschedule napi if certain conditions are met.
Otherwise the napi poll might be stopped forever, and cause network
disconnect.
Fixes: 7426b1a518 ("netvsc: optimize receive completions")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add dependencies on PCI where necessary.
Fixes: 7e2bc7fb65 ("net: cavium: Drop dependency of NET_VENDOR_CAVIUM on PCI")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren says:
====================
net: qca_spi: Minor bugfixes
This patch series contains some minor bugfixes for
the qca_spi driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In cases the probing fails the log level of the messages should
be an error.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case the SPI thread is not running, a simple reset of sync
state won't fix the transmit timeout. We also need to wake up the kernel
thread.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: ed7d42e24e ("net: qca_spi: fix transmit queue timeout handling")
Signed-off-by: David S. Miller <davem@davemloft.net>
As long as the synchronization with the QCA7000 isn't finished, we
cannot accept packets from the upper layers. So let the SPI thread
enable the TX queue after sync and avoid unwanted packet drop.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 291ab06ecf ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
Signed-off-by: David S. Miller <davem@davemloft.net>
The rol32 call is currently rotating hash but the rol'd value is
being discarded. I believe the current code is incorrect and hash
should be assigned the rotated value returned from rol32.
Thanks to David Lebrun for spotting this.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rol32 call is currently rotating hash but the rol'd value is
being discarded. I believe the current code is incorrect and hash
should be assigned the rotated value returned from rol32.
Detected by CoverityScan, CID#1468411 ("Useless call")
Fixes: b5facfdba1 ("ipv6: sr: Compute flowlabel for outer IPv6 header of seg6 encap mode")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: dlebrun@google.com
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Wunderlich says:
====================
Here are some batman-adv fixes:
- Fix gateway refcounting in BATMAN IV and V, by Sven Eckelmann (2 patches)
- Fix debugfs paths when renaming interfaces, by Sven Eckelmann (2 patches)
- Fix TT flag issues, by Linus Luessing (2 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warnings:
net/sched/cls_api.c:1101:43: warning: Using plain integer as NULL pointer
net/sched/cls_api.c:1492:75: warning: Using plain integer as NULL pointer
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mii_nway_restart is not pm aware which results in a rtnl deadlock.
Implement mii_nway_restart manual by setting BMCR_ANRESTART if
BMCR_ANENABLE is set.
To reproduce:
* plug an asix based usb network interface
* wait until the device enters PM (~5 sec)
* `ip link set eth1 up` will never return
Fixes: d9fe64e511 ("net: asix: Add in_pm parameter")
Signed-off-by: Alexander Couzens <lynxis@fe80.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
t.qset_idx can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:2286 cxgb_extension_ioctl()
warn: potential spectre issue 'adapter->msix_info'
Fix this by sanitizing t.qset_idx before using it to index
adapter->msix_info
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rhashtable_init() currently does not take into account the user-passed
min_size parameter unless param->nelem_hint is set as well. As such,
the default size (number of buckets) will always be HASH_DEFAULT_SIZE
even if the smallest allowed size is larger than that. Remediate this
by unconditionally calling into rounded_hashtable_size() and handling
things accordingly.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
info.index can be indirectly controlled by user-space, hence leading
to a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/vfio/pci/vfio_pci.c:734 vfio_pci_ioctl()
warn: potential spectre issue 'vdev->region'
Fix this by sanitizing info.index before indirectly using it to index
vdev->region
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pull btrfs fixes from David Sterba:
"Three regression fixes. They're few-liners and fixing some corner
cases missed in the origial patches"
* tag 'for-4.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: scrub: Don't use inode page cache in scrub_handle_errored_block()
btrfs: fix use-after-free of cmp workspace pages
btrfs: restore uuid_mutex in btrfs_open_devices
Pull kvm fixes from Paolo Bonzini:
"Miscellaneous bugfixes, plus a small patchlet related to Spectre v2"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvmclock: fix TSC calibration for nested guests
KVM: VMX: Mark VMXArea with revision_id of physical CPU even when eVMCS enabled
KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumer
KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel.
x86/kvmclock: set pvti_cpu0_va after enabling kvmclock
x86/kvm/Kconfig: Ensure CRYPTO_DEV_CCP_DD state at minimum matches KVM_AMD
kvm: nVMX: Restore exit qual for VM-entry failure due to MSR loading
x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks
KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR
Ursula Braun says:
====================
net/smc: fixes 2018-07-18
here are small fixes for SMC: The first patch speeds up unidirectional
traffic, the second patch increases security, and the third patch
fixes a problem for fallback cases.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
During clc handshake the receive timeout is set to CLC_WAIT_TIME.
Remember and reset the original timeout value after the receive calls,
and remove a duplicate assignment of CLC_WAIT_TIME.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For security reasons the return code of get_user() should always be
checked.
Fixes: 01d2f7e2cd ("net/smc: sockopts TCP_NODELAY and TCP_CORK")
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SMC protocol requires to send a separate consumer cursor update,
if it cannot be piggybacked to updates of the producer cursor.
Currently the decision to send a separate consumer cursor update
just considers the amount of data already received by the socket
program. It does not consider the amount of data already arrived, but
not yet consumed by the receiver. Basing the decision on the
difference between already confirmed and already arrived data
(instead of difference between already confirmed and already consumed
data), may lead to a somewhat earlier consumer cursor update send in
fast unidirectional traffic scenarios, and thus to better throughput.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Lenovo LaVie Z laptop requires i8042 to be reset in order to
consistently detect its Elantech touchpad. The nomux and kbdreset
quirks are not sufficient.
It's possible the other LaVie Z models from NEC require this as well.
Cc: stable@vger.kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
My randconfig builds came across an old missing dependency for ILA:
ERROR: "dst_cache_set_ip6" [net/ipv6/ila/ila.ko] undefined!
ERROR: "dst_cache_get" [net/ipv6/ila/ila.ko] undefined!
ERROR: "dst_cache_init" [net/ipv6/ila/ila.ko] undefined!
ERROR: "dst_cache_destroy" [net/ipv6/ila/ila.ko] undefined!
We almost never run into this by accident because randconfig builds
end up selecting DST_CACHE from some other tunnel protocol, and this
one appears to be the only one missing the explicit 'select'.
>From all I can tell, this problem first appeared in linux-4.9
when dst_cache support got added to ILA.
Fixes: 79ff2fc31e ("ila: Cache a route to translated address")
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
Fixes: fbbcdc0744 (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann@suse.com>
Reviewed-by: Andreas Herrmann <aherrmann@suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann@suse.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.
SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored. As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 (Power9 only) and above.
Thus in this situation, on POWER9, any call from sched_getcpu() always
returns zero, as on powerpc, we call __kernel_getcpu() which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.
Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.
Fixes: e1c1cfed54 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some of the assembly files use instructions specific to BookE or E500,
which are rejected with the now-default -mcpu=powerpc, so we must pass
-me500 to the assembler just as we pass -me200 for E200.
Fixes: 4bf4f42a2f ("powerpc/kbuild: Set default generic machine type for 32-bit compile")
Signed-off-by: James Clarke <jrtc27@jrtc27.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Inside a nested guest, access to hardware can be slow enough that
tsc_read_refs always return ULLONG_MAX, causing tsc_refine_calibration_work
to be called periodically and the nested guest to spend a lot of time
reading the ACPI timer.
However, if the TSC frequency is available from the pvclock page,
we can just set X86_FEATURE_TSC_KNOWN_FREQ and avoid the recalibration.
'refine' operation.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
[Commit message rewritten. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When eVMCS is enabled, all VMCS allocated to be used by KVM are marked
with revision_id of KVM_EVMCS_VERSION instead of revision_id reported
by MSR_IA32_VMX_BASIC.
However, even though not explictly documented by TLFS, VMXArea passed
as VMXON argument should still be marked with revision_id reported by
physical CPU.
This issue was found by the following setup:
* L0 = KVM which expose eVMCS to it's L1 guest.
* L1 = KVM which consume eVMCS reported by L0.
This setup caused the following to occur:
1) L1 execute hardware_enable().
2) hardware_enable() calls kvm_cpu_vmxon() to execute VMXON.
3) L0 intercept L1 VMXON and execute handle_vmon() which notes
vmxarea->revision_id != VMCS12_REVISION and therefore fails with
nested_vmx_failInvalid() which sets RFLAGS.CF.
4) L1 kvm_cpu_vmxon() don't check RFLAGS.CF for failure and therefore
hardware_enable() continues as usual.
5) L1 hardware_enable() then calls ept_sync_global() which executes
INVEPT.
6) L0 intercept INVEPT and execute handle_invept() which notes
!vmx->nested.vmxon and thus raise a #UD to L1.
7) Raised #UD caused L1 to panic.
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: stable@vger.kernel.org
Fixes: 773e8a0425
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A comment warning against this bug is there, but the code is not doing what
the comment says. Therefore it is possible that an EPOLLHUP races against
irq_bypass_register_consumer. The EPOLLHUP handler schedules irqfd_shutdown,
and if that runs soon enough, you get a use-after-free.
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Syzbot reports crashes in kvm_irqfd_assign(), caused by use-after-free
when kvm_irqfd_assign() and kvm_irqfd_deassign() run in parallel
for one specific eventfd. When the assign path hasn't finished but irqfd
has been added to kvm->irqfds.items list, another thead may deassign the
eventfd and free struct kvm_kernel_irqfd(). The assign path then uses
the struct kvm_kernel_irqfd that has been freed by deassign path. To avoid
such issue, keep irqfd under kvm->irq_srcu protection after the irqfd
has been added to kvm->irqfds.items list, and call synchronize_srcu()
in irq_shutdown() to make sure that irqfd has been fully initialized in
the assign path.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Tianyu Lan <tianyu.lan@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A VM which has:
- a DMA capable device passed through to it (eg. network card);
- running a malicious kernel that ignores H_PUT_TCE failure;
- capability of using IOMMU pages bigger that physical pages
can create an IOMMU mapping that exposes (for example) 16MB of
the host physical memory to the device when only 64K was allocated to the VM.
The remaining 16MB - 64K will be some other content of host memory, possibly
including pages of the VM, but also pages of host kernel memory, host
programs or other VMs.
The attacking VM does not control the location of the page it can map,
and is only allowed to map as many pages as it has pages of RAM.
We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
an IOMMU page is contained in the physical page so the PCI hardware won't
get access to unassigned host memory; however this check is missing in
the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
did not hit this yet as the very first time when the mapping happens
we do not have tbl::it_userspace allocated yet and fall back to
the userspace which in turn calls VFIO IOMMU driver, this fails and
the guest does not retry,
This stores the smallest preregistered page size in the preregistered
region descriptor and changes the mm_iommu_xxx API to check this against
the IOMMU page size.
This calculates maximum page size as a minimum of the natural region
alignment and compound page size. For the page shift this uses the shift
returned by find_linux_pte() which indicates how the page is mapped to
the current userspace - if the page is huge and this is not a zero, then
it is a leaf pte and the page is mapped within the range.
Fixes: 121f80ba68 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The size is always equal to 1 page so let's use this. Later on this will
be used for other checks which use page shifts to check the granularity
of access.
This should cause no behavioral change.
Cc: stable@vger.kernel.org # v4.12+
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This driver can spam the kernel log with multiple messages of:
net eth0: eth0: allmulti set
Usually 4 or 8 at a time (probably because of using ConnMan).
This message doesn't seem useful, so let's demote it from dev_info()
to dev_dbg().
Signed-off-by: David Lechner <david@lechnology.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
octeon_mgmt driver doesn't drop RX frames that are 1-4 bytes bigger than
MTU set for the corresponding interface. The problem is in the
AGL_GMX_RX0/1_FRM_MAX register setting, which should not account for VLAN
tagging.
According to Octeon HW manual:
"For tagged frames, MAX increases by four bytes for each VLAN found up to a
maximum of two VLANs, or MAX + 8 bytes."
OCTEON_FRAME_HEADER_LEN "define" is fine for ring buffer management, but
should not be used for AGL_GMX_RX0/1_FRM_MAX.
The problem could be easily reproduced using "ping" command. If affected
system has default MTU 1500, other host (having MTU >= 1504) can
successfully "ping" the affected system with payload size 1473-1476,
resulting in IP packets of size 1501-1504 accepted by the mgmt driver.
Fixed system still accepts IP packets of 1500 bytes even with VLAN tagging,
because the limits are lifted in HW as expected, for every VLAN tag.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
glibc uses a different defintion of sigset_t than the kernel does,
and the current version would pull in both. To fix this just do not
expose the type at all - this somewhat mirrors pselect() where we
do not even have a type for the magic sigmask argument, but just
use pointer arithmetics.
Fixes: 7a074e96 ("aio: implement io_pgetevents")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Adrian Reber <adrian@lisas.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fedora has integrated the jitter entropy daemon to work around slow
boot problems, especially on VM's that don't support virtio-rng:
https://bugzilla.redhat.com/show_bug.cgi?id=1572944
It's understandable why they did this, but the Jitter entropy daemon
works fundamentally on the principle: "the CPU microarchitecture is
**so** complicated and we can't figure it out, so it *must* be
random". Yes, it uses statistical tests to "prove" it is secure, but
AES_ENCRYPT(NSA_KEY, COUNTER++) will also pass statistical tests with
flying colors.
So if RDRAND is available, mix it into entropy submitted from
userspace. It can't hurt, and if you believe the NSA has backdoored
RDRAND, then they probably have enough details about the Intel
microarchitecture that they can reverse engineer how the Jitter
entropy daemon affects the microarchitecture, and attack its output
stream. And if RDRAND is in fact an honest DRNG, it will immeasurably
improve on what the Jitter entropy daemon might produce.
This also provides some protection against someone who is able to read
or set the entropy seed file.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
[why] dp hbr2 eye diagram pattern for raven asic is not stabled.
workaround is to use tp4 pattern. But this should not be
applied to asic before raven.
[how] add new bool varilable in asic caps. for raven asic,
use the workaround. for carrizo, vega, do not use workaround.
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Way back in 4.9, we committed 4cd13c21b2 ("softirq: Let ksoftirqd do
its job"), and ever since we've had small nagging issues with it. For
example, we've had:
1ff688209e ("watchdog: core: make sure the watchdog_worker is not deferred")
8d5755b3f7 ("watchdog: softdog: fire watchdog even if softirqs do not get to run")
217f697436 ("net: busy-poll: allow preemption in sk_busy_loop()")
all of which worked around some of the effects of that commit.
The DVB people have also complained that the commit causes excessive USB
URB latencies, which seems to be due to the USB code using tasklets to
schedule USB traffic. This seems to be an issue mainly when already
living on the edge, but waiting for ksoftirqd to handle it really does
seem to cause excessive latencies.
Now Hanna Hawa reports that this issue isn't just limited to USB URB and
DVB, but also causes timeout problems for the Marvell SoC team:
"I'm facing kernel panic issue while running raid 5 on sata disks
connected to Macchiatobin (Marvell community board with Armada-8040
SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine
and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2)
uses a tasklet to clean the done descriptors from the queue"
The latency problem causes a panic:
mv_xor_v2 f0400000.xor: dma_sync_wait: timeout!
Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction
We've discussed simply just reverting the original commit entirely, and
also much more involved solutions (with per-softirq threads etc). This
patch is intentionally stupid and fairly limited, because the issue
still remains, and the other solutions either got sidetracked or had
other issues.
We should probably also consider the timer softirqs to be synchronous
and not be delayed to ksoftirqd (since they were the issue with the
earlier watchdog problems), but that should be done as a separate patch.
This does only the tasklet cases.
Reported-and-tested-by: Hanna Hawa <hannah@marvell.com>
Reported-and-tested-by: Josef Griebichler <griebichler.josef@gmx.at>
Reported-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The SNDRV_RAWMIDI_IOCTL_PARAMS ioctl may resize the buffers and the
current code is racy. For example, the sequencer client may write to
buffer while it being resized.
As a simple workaround, let's switch to the resized buffer inside the
stream runtime lock.
Reported-by: syzbot+52f83f0ea8df16932f7f@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Avoid excuting set_feature command if there is no supported bit in
Optional Asynchronous Events Supported (OAES).
Fixes: c0561f82 ("nvme: submit AEN event configuration on startup")
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Weiping Zhang <zhangweiping@didichuxing.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In commit ac0b4145d6 ("btrfs: scrub: Don't use inode pages for device
replace") we removed the branch of copy_nocow_pages() to avoid
corruption for compressed nodatasum extents.
However above commit only solves the problem in scrub_extent(), if
during scrub_pages() we failed to read some pages,
sctx->no_io_error_seen will be non-zero and we go to fixup function
scrub_handle_errored_block().
In scrub_handle_errored_block(), for sctx without csum (no matter if
we're doing replace or scrub) we go to scrub_fixup_nodatasum() routine,
which does the similar thing with copy_nocow_pages(), but does it
without the extra check in copy_nocow_pages() routine.
So for test cases like btrfs/100, where we emulate read errors during
replace/scrub, we could corrupt compressed extent data again.
This patch will fix it just by avoiding any "optimization" for
nodatasum, just falls back to the normal fixup routine by try read from
any good copy.
This also solves WARN_ON() or dead lock caused by lame backref iteration
in scrub_fixup_nodatasum() routine.
The deadlock or WARN_ON() won't be triggered before commit ac0b4145d6
("btrfs: scrub: Don't use inode pages for device replace") since
copy_nocow_pages() have better locking and extra check for data extent,
and it's already doing the fixup work by try to read data from any good
copy, so it won't go scrub_fixup_nodatasum() anyway.
This patch disables the faulty code and will be removed completely in a
followup patch.
Fixes: ac0b4145d6 ("btrfs: scrub: Don't use inode pages for device replace")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The recent change to add printf annotations to xmon inadvertently made
the disassembly output ugly, eg:
c00000002001e058 7ee00026 mfcr r23
c00000002001e05c fffffffffae101a0 std r23,416(r1)
c00000002001e060 fffffffff8230000 std r1,0(r3)
The problem being that negative 32-bit values are being displayed in
full 64-bits.
The printf conversion was actually correct, we are passing unsigned
long so it should use "lx". But powerpc instructions are only 4 bytes
and the code only reads 4 bytes, so inst should really just be
unsigned int, and that also fixes the printing to look the way we
want:
c00000002001e058 7ee00026 mfcr r23
c00000002001e05c fae101a0 std r23,416(r1)
c00000002001e060 f8230000 std r1,0(r3)
Fixes: e70d8f5526 ("powerpc/xmon: Add __printf annotation to xmon_printf()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Felipe writes:
usb: fixes for v4.18-rc5
With a total of 20 non-merge commits, we have accumulated quite a few
fixes. These include lot's of fixes the our audio gadget interface, a
build error fix for PPC64 builds for the frescale PHY driver,
sleep-while-atomic fixes on the r8a66597 UDC driver, 3-stage SETUP fix
for the aspeed-vhub UDC and some other misc fixes.
The list [1] of commits doing endianness fixes in USB subsystem is long
due to below quote from USB spec Revision 2.0 from April 27, 2000:
------------
8.1 Byte/Bit Ordering
Multiple byte fields in standard descriptors, requests, and responses
are interpreted as and moved over the bus in little-endian order, i.e.
LSB to MSB.
------------
This commit belongs to the same family.
[1] Example of endianness fixes in USB subsystem:
commit 14e1d56cbe ("usb: gadget: f_uac2: endianness fixes.")
commit 42370b8211 ("usb: gadget: f_uac1: endianness fixes.")
commit 63afd5cc78 ("USB: chaoskey: fix Alea quirk on big-endian hosts")
commit 74098c4ac7 ("usb: gadget: acm: fix endianness in notifications")
commit cdd7928df0 ("ACM gadget: fix endianness in notifications")
commit 323ece54e0 ("cdc-wdm: fix endianness bug in debug statements")
commit e102609f10 ("usb: gadget: uvc: Fix endianness mismatches")
list goes on
Fixes: 132fcb4608 ("usb: gadget: Add Audio Class 2.0 Driver")
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Reviewed-by: Ruslan Bilovol <ruslan.bilovol@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Make sure only to copy any actual data rather than the whole buffer,
when releasing the temporary buffer used for unaligned non-isochronous
transfers.
Taken directly from commit 0efd937e27 ("USB: ehci-tegra: fix inefficient
copy of unaligned buffers")
Tested with Lantiq xRX200 (MIPS) and RPi Model B Rev 2 (ARM)
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Antti Seppälä <a.seppala@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The commit 3bc04e28a0 ("usb: dwc2: host: Get aligned DMA in a more
supported way") introduced a common way to align DMA allocations.
The code in the commit aligns the struct dma_aligned_buffer but the
actual DMA address pointed by data[0] gets aligned to an offset from
the allocated boundary by the kmalloc_ptr and the old_xfer_buffer
pointers.
This is against the recommendation in Documentation/DMA-API.txt which
states:
Therefore, it is recommended that driver writers who don't take
special care to determine the cache line size at run time only map
virtual regions that begin and end on page boundaries (which are
guaranteed also to be cache line boundaries).
The effect of this is that architectures with non-coherent DMA caches
may run into memory corruption or kernel crashes with Unhandled
kernel unaligned accesses exceptions.
Fix the alignment by positioning the DMA area in front of the allocation
and use memory at the end of the area for storing the orginal
transfer_buffer pointer. This may have the added benefit of increased
performance as the DMA area is now fully aligned on all architectures.
Tested with Lantiq xRX200 (MIPS) and RPi Model B Rev 2 (ARM).
Fixes: 3bc04e28a0 ("usb: dwc2: host: Get aligned DMA in a more supported way")
Cc: <stable@vger.kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Antti Seppälä <a.seppala@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Commit 34962fb807 ("docs: Fix more broken references") replaced the
broken reference to rockchip,dwc3-usb-phy.txt binding for the Qualcomm
DWC3 binding (qcom-dwc3-usb-phy.txt). That's wrong, so replace that
reference for the correct ones.
Fixes: 34962fb807 ("docs: Fix more broken references")
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The tools/usb/ffs-test.c file defines cpu_to_le16/32 by using the C
library htole16/32 function calls. However, cpu_to_le16/32 are used when
initializing structures, i.e in a context where a function call is not
allowed.
It works fine on little endian systems because htole16/32 are defined by
the C library as no-ops. But on big-endian systems, they are actually
doing something, which might involve calling a function, causing build
failures, such as:
ffs-test.c:48:25: error: initializer element is not constant
#define cpu_to_le32(x) htole32(x)
^~~~~~~
ffs-test.c:128:12: note: in expansion of macro ‘cpu_to_le32’
.magic = cpu_to_le32(FUNCTIONFS_DESCRIPTORS_MAGIC_V2),
^~~~~~~~~~~
To solve this, we code cpu_to_le16/32 in a way that allows them to be
used when initializing structures. This fix was imported from
meta-openembedded/android-tools/fix-big-endian-build.patch written by
Thomas Petazzoni <thomas.petazzoni@free-electrons.com>.
CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The Aspeed SoC has a memory ordering issue that (thankfully)
only affects the USB gadget device. A read back is necessary
after writing to memory and before letting the device DMA
from it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Variable maxpacket is being assigned but is never used hence it is
redundant and can be removed.
Cleans up clang warning:
warning: variable 'maxpacket' set but not used [-Wunused-but-set-variable]
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
For unidirectional endpoints, the endpoint pointer will be NULL for the
unused direction. Check that the endpoint is active before
dereferencing this pointer.
Fixes: 1b4977c793 ("usb: dwc2: Update dwc2_handle_incomplete_isoc_in() function")
Fixes: 689efb2619 ("usb: dwc2: Update dwc2_handle_incomplete_isoc_out() function")
Fixes: d84845522d ("usb: dwc2: Update GINTSTS_GOUTNAKEFF interrupt handling")
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Fix build errors when built for PPC64:
These variables are only used on PPC32 so they don't need to be
initialized for PPC64.
../drivers/usb/phy/phy-fsl-usb.c: In function 'usb_otg_start':
../drivers/usb/phy/phy-fsl-usb.c:865:3: error: '_fsl_readl' undeclared (first use in this function); did you mean 'fsl_readl'?
_fsl_readl = _fsl_readl_be;
../drivers/usb/phy/phy-fsl-usb.c:865:16: error: '_fsl_readl_be' undeclared (first use in this function); did you mean 'fsl_readl'?
_fsl_readl = _fsl_readl_be;
../drivers/usb/phy/phy-fsl-usb.c:866:3: error: '_fsl_writel' undeclared (first use in this function); did you mean 'fsl_writel'?
_fsl_writel = _fsl_writel_be;
../drivers/usb/phy/phy-fsl-usb.c:866:17: error: '_fsl_writel_be' undeclared (first use in this function); did you mean 'fsl_writel'?
_fsl_writel = _fsl_writel_be;
../drivers/usb/phy/phy-fsl-usb.c:868:16: error: '_fsl_readl_le' undeclared (first use in this function); did you mean 'fsl_readl'?
_fsl_readl = _fsl_readl_le;
../drivers/usb/phy/phy-fsl-usb.c:869:17: error: '_fsl_writel_le' undeclared (first use in this function); did you mean 'fsl_writel'?
_fsl_writel = _fsl_writel_le;
and the sysfs "show" function return type should be ssize_t, not int:
../drivers/usb/phy/phy-fsl-usb.c:1042:49: error: initialization of 'ssize_t (*)(struct device *, struct device_attribute *, char *)' {aka 'long int (*)(struct device *, struct device_attribute *, char *)'} from incompatible pointer type 'int (*)(struct device *, struct device_attribute *, char *)' [-Werror=incompatible-pointer-types]
static DEVICE_ATTR(fsl_usb2_otg_state, S_IRUGO, show_fsl_usb2_otg_state, NULL);
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
When handling split transactions we will try to delay retry after
getting a NAK from the device. This works well for BULK transfers that
can be polled for essentially forever. Unfortunately, on slower systems
at boot time, when the kernel is busy enumerating all the devices (USB
or not), we issue a bunch of control requests (reading device
descriptors, etc). If we get a NAK for the IN part of the control
request and delay retry for too long (because the system is busy), we
may confuse the device when we finally get to reissue SSPLIT/CSPLIT IN
and the device will respond with STALL. As a result we end up with
failure to get device descriptor and will fail to enumerate the device:
[ 3.428801] usb 2-1.2.1: new full-speed USB device number 9 using dwc2
[ 3.508576] usb 2-1.2.1: device descriptor read/8, error -32
[ 3.699150] usb 2-1.2.1: device descriptor read/8, error -32
[ 3.891653] usb 2-1.2.1: new full-speed USB device number 10 using dwc2
[ 3.968859] usb 2-1.2.1: device descriptor read/8, error -32
...
Let's not delay retries of split CONTROL IN transfers, as this allows us
to reliably enumerate devices at boot time.
Fixes: 38d2b5fb75 ("usb: dwc2: host: Don't retry NAKed transactions right away")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Substream period size potentially can be changed in runtime, however
this is not accounted in the data copying routine, the change replaces
the cached value with an actual value from substream runtime.
As a side effect the change also removes a potential division by zero
in u_audio_iso_complete() function, if there is a race with
uac_pcm_hw_free(), which sets prm->period_size to 0.
Fixes: 132fcb4608 ("usb: gadget: Add Audio Class 2.0 Driver")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
There is no necessity to copy PCM stream ring buffer area and size
properties to UAC private data structure, these values can be got
from substream itself.
The change gives more control on substream and avoid stale caching.
Fixes: 132fcb4608 ("usb: gadget: Add Audio Class 2.0 Driver")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In u_audio_iso_complete, the runtime hw_ptr is updated before the
data is actually copied over to/from the buffer/dma area. When
ALSA uses this hw_ptr, the data may not actually be available to
be used. This causes trash/stale audio to play/record. This
patch updates the hw_ptr after the data has been copied to avoid
this.
Fixes: 132fcb4608 ("usb: gadget: Add Audio Class 2.0 Driver")
Signed-off-by: Joshua Frkuska <joshua_frkuska@mentor.com>
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Fix below smatch (v0.5.0-4443-g69e9094e11c1) warnings:
drivers/usb/gadget/function/u_audio.c:607 g_audio_setup() warn: strcpy() 'pcm_name' of unknown size might be too large for 'pcm->name'
drivers/usb/gadget/function/u_audio.c:614 g_audio_setup() warn: strcpy() 'card_name' of unknown size might be too large for 'card->driver'
drivers/usb/gadget/function/u_audio.c:615 g_audio_setup() warn: strcpy() 'card_name' of unknown size might be too large for 'card->shortname'
Below commits performed a similar 's/strcpy/strlcpy/' rework:
* v2.6.31 commit 8372d4980f ("ALSA: ctxfi - Fix PCM device naming")
* v4.14 commit 003d3e70db ("ALSA: ad1848: fix format string overflow warning")
* v4.14 commit 6d8b04de87 ("ALSA: cs423x: fix format string overflow warning")
Fixes: eb9fecb9e6 ("usb: gadget: f_uac2: split out audio core")
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The driver may sleep in an interrupt handler.
The function call path (from bottom to top) in Linux-4.16.7 is:
[FUNC] r8a66597_queue(GFP_KERNEL)
drivers/usb/gadget/udc/r8a66597-udc.c, 1193:
r8a66597_queue in get_status
drivers/usb/gadget/udc/r8a66597-udc.c, 1301:
get_status in setup_packet
drivers/usb/gadget/udc/r8a66597-udc.c, 1381:
setup_packet in irq_control_stage
drivers/usb/gadget/udc/r8a66597-udc.c, 1508:
irq_control_stage in r8a66597_irq (interrupt handler)
To fix this bug, GFP_KERNEL is replaced with GFP_ATOMIC.
This bug is found by my static analysis tool (DSAC-2) and checked by
my code review.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The driver may sleep with holding a spinlock.
The function call paths (from bottom to top) in Linux-4.16.7 are:
[FUNC] msleep
drivers/usb/gadget/udc/r8a66597-udc.c, 839:
msleep in init_controller
drivers/usb/gadget/udc/r8a66597-udc.c, 96:
init_controller in r8a66597_usb_disconnect
drivers/usb/gadget/udc/r8a66597-udc.c, 93:
spin_lock in r8a66597_usb_disconnect
[FUNC] msleep
drivers/usb/gadget/udc/r8a66597-udc.c, 835:
msleep in init_controller
drivers/usb/gadget/udc/r8a66597-udc.c, 96:
init_controller in r8a66597_usb_disconnect
drivers/usb/gadget/udc/r8a66597-udc.c, 93:
spin_lock in r8a66597_usb_disconnect
To fix these bugs, msleep() is replaced with mdelay().
This bug is found by my static analysis tool (DSAC-2) and checked by
my code review.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The current code is broken as it re-defines "req" inside the
if block, then goto out of it. Thus the request that ends
up being sent is not the one that was populated by the
code in question.
This fixes RNDIS driver autodetect by Windows 10 for me.
The bug was introduced by Chris rework to remove the local
queuing inside the if { } block of the redefined request.
Fixes: 636ba13aec ("usb: gadget: composite: remove duplicated code in OS desc handling")
Cc: <stable@vger.kernel.org> # v4.17
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
A couple of bugs in the driver are preventing SETUP packets
with an OUT data phase from working properly.
Interestingly those are incredibly rare (RNDIS typically
uses them and thus is broken without this fix).
The main problem was an incorrect register offset being
applied for arming RX on EP0. The other problem relates
to stalling such a packet before the data phase, in which
case we don't get an ACK cycle, and get the next SETUP
packet directly, so we shouldn't reject it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
pwm node should not be under gpio6 node in the device tree.
This fixes detection of the pwm on Droid 4.
Fixes: 6d7bdd328d ("ARM: dts: omap4-droid4: update touchscreen")
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
[tony@atomide.com: added fixes tag]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Siva Reddy Kallam says:
====================
tg3: Update copyright and fix for tx timeout with 5762
First patch:
Update copyright
Second patch:
Add higher cpu clock for 5762
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Testing has uncovered a failure case that is not handled properly. In the
event that a login fails and we are not able to recover on the spot, we
return 0 from do_reset, preventing any error recovery code from being
triggered. Additionally, the state is set to "probed" meaning that when we
are able to trigger the error recovery, the driver always comes up in the
probed state. To handle the case properly, we need to return a failure code
here and set the adapter state to the state that we entered the reset in
indicating the state that we would like to come out of the recovery reset
in.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb size calculation in lan78xx_tx_bh is in race with the start_xmit,
which could lead to rare kernel oopses. So protect the whole skb walk with
a spin lock. As a benefit we can unlink the skb directly.
This patch was tested on Raspberry Pi 3B+
Link: https://github.com/raspberrypi/linux/issues/2608
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Floris Bos <bos@je-eigen-domein.nl>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric reported that reverting the patch that fixed and simplified IPv6
multipath routes means reverting back to invalid userspace notifications.
eg.,
$ ip -6 route add 2001:db8:1::/64 nexthop dev eth0 nexthop dev eth1
only generates a single notification:
2001:db8:1::/64 dev eth0 metric 1024 pref medium
While working on a fix for this problem I found another case that is just
broken completely - a multipath route with a gateway followed by device
followed by gateway:
$ ip -6 ro add 2001:db8:103::/64
nexthop via 2001:db8:1::64
nexthop dev dummy2
nexthop via 2001:db8:3::64
In this case the device only route is dropped completely - no notification
to userpsace but no addition to the FIB either:
$ ip -6 ro ls
2001:db8:1::/64 dev dummy1 proto kernel metric 256 pref medium
2001:db8:2::/64 dev dummy2 proto kernel metric 256 pref medium
2001:db8:3::/64 dev dummy3 proto kernel metric 256 pref medium
2001:db8:103::/64 metric 1024
nexthop via 2001:db8:1::64 dev dummy1 weight 1
nexthop via 2001:db8:3::64 dev dummy3 weight 1 pref medium
fe80::/64 dev dummy1 proto kernel metric 256 pref medium
fe80::/64 dev dummy2 proto kernel metric 256 pref medium
fe80::/64 dev dummy3 proto kernel metric 256 pref medium
Really, IPv6 multipath is just FUBAR'ed beyond repair when it comes to
device only routes, so do not allow it all.
This change will break any scripts relying on the mpath api for insert,
but I don't see any other way to handle the permutations. Besides, since
the routes are added to the FIB as standalone (non-multipath) routes the
kernel is not doing what the user requested, so it might as well tell the
user that.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Correct previous bad attempt at allowing sockets to come out of TCP
repair without sending window probes. To avoid changing size of
the repair variable in struct tcp_sock, this lets the decision for
sending probes or not to be made when coming out of repair by
introducing two ways to turn it off.
v2:
* Remove erroneous comment; defines now make behavior clear
Fixes: 70b7ff1302 ("tcp: allow user to create repair socket without window probes")
Signed-off-by: Stefan Baranoff <sbaranoff@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a new rx packet arrives, the rx path will decide whether to reuse
the remainder of the page or not according to one of the below conditions:
1. frag_info->frag_stride == PAGE_SIZE / 2
2. frags->page_offset + frag_info->frag_size > PAGE_SIZE;
The first condition is no met for when XDP is set.
For XDP, page_offset is always set to priv->rx_headroom which is
XDP_PACKET_HEADROOM and frag_info->frag_size is around mtu size + some
padding, still the 2nd release condition will hold since
XDP_PACKET_HEADROOM + 1536 < PAGE_SIZE, as a result the page will not
be released and will be _wrongly_ reused for next free rx descriptor.
In XDP there is an assumption to have a page per packet and reuse can
break such assumption and might cause packet data corruptions.
Fix this by adding an extra condition (!priv->rx_headroom) to the 2nd
case to avoid page reuse when XDP is set, since rx_headroom is set to 0
for non XDP setup and set to XDP_PACKET_HEADROOM for XDP setup.
No additional cache line is required for the new condition.
Fixes: 34db548bfb ("mlx4: add page recycling in receive path")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Suggested-by: Martin KaFai Lau <kafai@fb.com>
CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC [M] drivers/net/ethernet/freescale/fman/fman.o
In file included from ../drivers/net/ethernet/freescale/fman/fman.c:35:
../include/linux/fsl/guts.h: In function 'guts_set_dmacr':
../include/linux/fsl/guts.h:165:2: error: implicit declaration of function 'clrsetbits_be32' [-Werror=implicit-function-declaration]
clrsetbits_be32(&guts->dmacr, 3 << shift, device << shift);
^~~~~~~~~~~~~~~
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Madalin Bucur <madalin.bucur@nxp.com>
Cc: netdev@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: David S. Miller <davem@davemloft.net>
The netvsc device may need to fallback to running in single queue
mode if host side only wants to support single queue.
Recent change for handling mtu broke this in setup logic.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 3ffe64f1a6 ("hv_netvsc: split sub-channel setup into async and sync")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During a device failover, there may be latency between the loss
of the current backing device and a notification from firmware that
a failover has occurred. This latency can result in a large amount of
error printouts as firmware returns outgoing traffic with a generic
error code. These are not necessarily errors in this case as the
firmware is busy swapping in a new backing adapter and is not ready
to send packets yet. This patch reclassifies those error codes as
warnings with an explanation that a failover may be pending. All
other return codes will be considered errors.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit adc176c547 ("ipv6 addrconf: Implemented enhanced DAD (RFC7527)")
added enhanced DAD with a nonce length of 6 bytes. However, RFC7527
doesn't specify the length of the nonce, other than being 6 + 8*k bytes,
with integer k >= 0 (RFC3971 5.3.2). The current implementation simply
assumes that the nonce will always be 6 bytes, but others systems are
free to choose different sizes.
If another system sends a nonce of different length but with the same 6
bytes prefix, it shouldn't be considered as the same nonce. Thus, check
that the length of the received nonce is the same as the length we sent.
Ugly scapy test script running on veth0:
def loop():
pkt=sniff(iface="veth0", filter="icmp6", count=1)
pkt = pkt[0]
b = bytearray(pkt[Raw].load)
b[1] += 1
b += b'\xde\xad\xbe\xef\xde\xad\xbe\xef'
pkt[Raw].load = bytes(b)
pkt[IPv6].plen += 8
# fixup checksum after modifying the payload
pkt[IPv6].payload.cksum -= 0x3b44
if pkt[IPv6].payload.cksum < 0:
pkt[IPv6].payload.cksum += 0xffff
sendp(pkt, iface="veth0")
This should result in DAD failure for any address added to veth0's peer,
but is currently ignored.
Fixes: adc176c547 ("ipv6 addrconf: Implemented enhanced DAD (RFC7527)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch remove the following documentation warning
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:103: warning: Excess function parameter 'priv' description in 'stmmac_axi_setup'
It was introduced in commit afea03656a ("stmmac: rework DMA bus setting and introduce new platform AXI structure")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A KASAN:use-after-free bug was found related to ip6-erspan
while running selftests/net/ip6_gre_headroom.sh
It happens because of following sequence:
- ipv6hdr pointer is obtained from skb
- skb_cow_head() is called, skb->head memory is reallocated
- old data is accessed using ipv6hdr pointer
skb_cow_head() call was added in e41c7c68ea ("ip6erspan: make sure
enough headroom at xmit."), but looking at the history there was a
chance of similar bug because gre_handle_offloads() and pskb_trim()
can also reallocate skb->head memory. Fixes tag points to commit
which introduced possibility of this bug.
This patch moves ipv6hdr pointer assignment after skb_cow_head() call.
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On XDP_TX we need to free up the frame only when tun_xdp_tx() returns a
negative value. A positive value indicates that the packet is
successfully enqueued to the ptr_ring, so freeing the page causes
use-after-free.
Fixes: 735fc4054b ("xdp: change ndo_xdp_xmit API to support bulking")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the zerocopy sendmsg() path, there are error checks to revert
the zerocopy if we get any error code. syzkaller has discovered
that tls_push_record can return -ECONNRESET, which is fatal, and
happens after the point at which it is safe to revert the iter,
as we've already passed the memory to do_tcp_sendpages.
Previously this code could return -ENOMEM and we would want to
revert the iter, but AFAIK this no longer returns ENOMEM after
a447da7d00 ("tls: fix waitall behavior in tls_sw_recvmsg"),
so we fail for all error codes.
Reported-by: syzbot+c226690f7b3126c5ee04@syzkaller.appspotmail.com
Reported-by: syzbot+709f2810a6a05f11d4d3@syzkaller.appspotmail.com
Signed-off-by: Dave Watson <davejwatson@fb.com>
Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: David S. Miller <davem@davemloft.net>
My recent fix for dns_resolver_preparse() printing very long strings was
incomplete, as shown by syzbot which still managed to hit the
WARN_ONCE() in set_precision() by adding a crafted "dns_resolver" key:
precision 50001 too large
WARNING: CPU: 7 PID: 864 at lib/vsprintf.c:2164 vsnprintf+0x48a/0x5a0
The bug this time isn't just a printing bug, but also a logical error
when multiple options ("#"-separated strings) are given in the key
payload. Specifically, when separating an option string into name and
value, if there is no value then the name is incorrectly considered to
end at the end of the key payload, rather than the end of the current
option. This bypasses validation of the option length, and also means
that specifying multiple options is broken -- which presumably has gone
unnoticed as there is currently only one valid option anyway.
A similar problem also applied to option values, as the kstrtoul() when
parsing the "dnserror" option will read past the end of the current
option and into the next option.
Fix these bugs by correctly computing the length of the option name and
by copying the option value, null-terminated, into a temporary buffer.
Reproducer for the WARN_ONCE() that syzbot hit:
perl -e 'print "#A#", "\0" x 50000' | keyctl padd dns_resolver desc @s
Reproducer for "dnserror" option being parsed incorrectly (expected
behavior is to fail when seeing the unknown option "foo", actual
behavior was to read the dnserror value as "1#foo" and fail there):
perl -e 'print "#dnserror=1#foo\0"' | keyctl padd dns_resolver desc @s
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 4a2d789267 ("DNS: If the DNS server returns an error, allow that to be cached [ver #2]")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu says:
====================
multicast: init as INCLUDE when join SSM INCLUDE group
Based on RFC3376 5.1 and RFC3810 6.1, we should init as INCLUDE when join SSM
INCLUDE group. In my first version I only clear the group change record. But
this is not enough as when a new group join, it will init as EXCLUDE and
trigger an filter mode change in ip/ip6_mc_add_src(), which will clear all
source addresses' sf_crcount. This will prevent early joined address sending
state change records if multi source addresses joined at the same time.
In this v2 patchset, I fixed it by directly initializing the mode to INCLUDE
for SSM JOIN_SOURCE_GROUP. I also split the original patch into two separated
patches for IPv4 and IPv6.
Test: test by myself and customer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This an IPv6 version patch of "ipv4/igmp: init group mode as INCLUDE when
join source group". From RFC3810, part 6.1:
If no per-interface state existed for that
multicast address before the change (i.e., the change consisted of
creating a new per-interface record), or if no state exists after the
change (i.e., the change consisted of deleting a per-interface
record), then the "non-existent" state is considered to have an
INCLUDE filter mode and an empty source list.
Which means a new multicast group should start with state IN(). Currently,
for MLDv2 SSM JOIN_SOURCE_GROUP mode, we first call ipv6_sock_mc_join(),
then ip6_mc_source(), which will trigger a TO_IN() message instead of
ALLOW().
The issue was exposed by commit a052517a8f ("net/multicast: should not
send source list records when have filter mode change"). Before this change,
we sent both ALLOW(A) and TO_IN(A). Now, we only send TO_IN(A).
Fix it by adding a new parameter to init group mode. Also add some wrapper
functions to avoid changing too much code.
v1 -> v2:
In the first version I only cleared the group change record. But this is not
enough. Because when a new group join, it will init as EXCLUDE and trigger
a filter mode change in ip/ip6_mc_add_src(), which will clear all source
addresses sf_crcount. This will prevent early joined address sending state
change records if multi source addressed joined at the same time.
In v2 patch, I fixed it by directly initializing the mode to INCLUDE for SSM
JOIN_SOURCE_GROUP. I also split the original patch into two separated patches
for IPv4 and IPv6.
There is also a difference between v4 and v6 version. For IPv6, when the
interface goes down and up, we will send correct state change record with
unspecified IPv6 address (::) with function ipv6_mc_up(). But after DAD is
completed, we resend the change record TO_IN() in mld_send_initial_cr().
Fix it by sending ALLOW() for INCLUDE mode in mld_send_initial_cr().
Fixes: a052517a8f ("net/multicast: should not send source list records when have filter mode change")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on RFC3376 5.1
If no interface
state existed for that multicast address before the change (i.e., the
change consisted of creating a new per-interface record), or if no
state exists after the change (i.e., the change consisted of deleting
a per-interface record), then the "non-existent" state is considered
to have a filter mode of INCLUDE and an empty source list.
Which means a new multicast group should start with state IN().
Function ip_mc_join_group() works correctly for IGMP ASM(Any-Source Multicast)
mode. It adds a group with state EX() and inits crcount to mc_qrv,
so the kernel will send a TO_EX() report message after adding group.
But for IGMPv3 SSM(Source-specific multicast) JOIN_SOURCE_GROUP mode, we
split the group joining into two steps. First we join the group like ASM,
i.e. via ip_mc_join_group(). So the state changes from IN() to EX().
Then we add the source-specific address with INCLUDE mode. So the state
changes from EX() to IN(A).
Before the first step sends a group change record, we finished the second
step. So we will only send the second change record. i.e. TO_IN(A).
Regarding the RFC stands, we should actually send an ALLOW(A) message for
SSM JOIN_SOURCE_GROUP as the state should mimic the 'IN() to IN(A)'
transition.
The issue was exposed by commit a052517a8f ("net/multicast: should not
send source list records when have filter mode change"). Before this change,
we used to send both ALLOW(A) and TO_IN(A). After this change we only send
TO_IN(A).
Fix it by adding a new parameter to init group mode. Also add new wrapper
functions so we don't need to change too much code.
v1 -> v2:
In my first version I only cleared the group change record. But this is not
enough. Because when a new group join, it will init as EXCLUDE and trigger
an filter mode change in ip/ip6_mc_add_src(), which will clear all source
addresses' sf_crcount. This will prevent early joined address sending state
change records if multi source addressed joined at the same time.
In v2 patch, I fixed it by directly initializing the mode to INCLUDE for SSM
JOIN_SOURCE_GROUP. I also split the original patch into two separated patches
for IPv4 and IPv6.
Fixes: a052517a8f ("net/multicast: should not send source list records when have filter mode change")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull pin control fixes from Linus Walleij:
- A slew of driver fixes for Mediatek mt7622
- Fix a direction inversion bug in the Ingenic driver
- Fix unsupported drive strength setting on the PFC r8a77970
- Off by one and NULL dereference fixes in the NSP driver
* tag 'pinctrl-v4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: nsp: Fix potential NULL dereference
pinctrl: nsp: off by ones in nsp_pinmux_enable()
pinctrl: sh-pfc: r8a77970: remove SH_PFC_PIN_CFG_DRIVE_STRENGTH flag
pinctrl: ingenic: Fix inverted direction for < JZ4770
pinctrl: mt7622: fix a kernel panic when gpio-hog is being applied
pinctrl: mt7622: stop using the deprecated pinctrl_add_gpio_range
pinctrl: mt7622: fix that pinctrl_claim_hogs cannot work
pinctrl: mt7622: fix initialization sequence between eint and gpiochip
pinctrl: mt7622: fix error path on failing at groups building
Pull drm fixes from Dave Airlie:
- two AGP fixes in here
- a bunch of mostly amdgpu fixes
- sun4i build fix
- two armada fixes
- some tegra fixes
- one i915 core and one i915 gvt fix
* tag 'drm-fixes-2018-07-16-1' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu/pp/smu7: use a local variable for toc indexing
amd/dc/dce100: On dce100, set clocks to 0 on suspend
drm/amd/display: Convert 10kHz clks from PPLib into kHz for Vega
drm/amdgpu: Verify root PD is mapped into kernel address space (v4)
drm/amd/display: fix invalid function table override
drm/amdgpu: Reserve VM root shared fence slot for command submission (v3)
Revert "drm/amd/display: Don't return ddc result and read_bytes in same return value"
char: amd64-agp: Use 64-bit arithmetic instead of 32-bit
char: agp: Change return type to vm_fault_t
drm/i915: Fix hotplug irq ack on i965/g4x
drm/armada: fix irq handling
drm/armada: fix colorkey mode property
drm/tegra: Fix comparison operator for buffer size
gpu: host1x: Check whether size of unpin isn't 0
gpu: host1x: Skip IOMMU initialization if firewall is enabled
drm/sun4i: link in front-end code if needed
drm/i915/gvt: update vreg on inhibit context lri command
Moving zero_resv_unavail before memmap_init_zone(), caused a regression on
x86-32.
The cause is that we access struct pages before they are allocated when
CONFIG_FLAT_NODE_MEM_MAP is used.
free_area_init_nodes()
zero_resv_unavail()
mm_zero_struct_page(pfn_to_page(pfn)); <- struct page is not alloced
free_area_init_node()
if CONFIG_FLAT_NODE_MEM_MAP
alloc_node_mem_map()
memblock_virt_alloc_node_nopanic() <- struct page alloced here
On the other hand memblock_virt_alloc_node_nopanic() zeroes all the memory
that it returns, so we do not need to do zero_resv_unavail() here.
Fixes: e181ae0c5d ("mm: zero unavailable pages before memmap init")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Tested-by: Matt Hart <matt@mattface.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the server or network is misbehaving and we get an unexpected reply
we can sometimes miss the request not being started and wait on a
request and never get a response, or even double complete the same
request. Fix this by replacing the send_complete completion with just a
per command lock. Add a per command cookie as well so that we can know
if we're getting a double completion for a previous event. Also check
to make sure we dont have REQUEUED set as that means we raced with the
timeout handler and need to just let the retry occur.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We can race with the snd timeout and the per-request timeout and end up
requeuing the same request twice. We can't use the send_complete
completion to tell if everything is ok because we hold the tx_lock
during send, so the timeout stuff will block waiting to mark the socket
dead, and we could be marked complete and still requeue. Instead add a
flag to the socket so we know whether we've been requeued yet.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The MIPS implementation of pci_resource_to_user() introduced in v3.12 by
commit 4c2924b725 ("MIPS: PCI: Use pci_resource_to_user to map pci
memory space properly") incorrectly sets *end to the address of the
byte after the resource, rather than the last byte of the resource.
This results in userland seeing resources as a byte larger than they
actually are, for example a 32 byte BAR will be reported by a tool such
as lspci as being 33 bytes in size:
Region 2: I/O ports at 1000 [disabled] [size=33]
Correct this by subtracting one from the calculated end address,
reporting the correct address to userland.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Rui Wang <rui.wang@windriver.com>
Fixes: 4c2924b725 ("MIPS: PCI: Use pci_resource_to_user to map pci memory space properly")
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v3.12+
Patchwork: https://patchwork.linux-mips.org/patch/19829/
When the CSI is receiving from a bt.656 bus, include a check for
field type 'alternate' when determining whether to set CSI clock
mode to CCIR656_INTERLACED or CCIR656_PROGRESSIVE.
Signed-off-by: Steve Longerbeam <steve_longerbeam@mentor.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
If the second LVDS channel has been disabled in the DT when using dual-channel
mode we should not print a warning.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
The LVDS signal integrity is only guaranteed when the correct enable
sequence (first IPU DI, then LDB) is used. If the LDB display output was
active before the imx-drm driver is loaded (like when a bootsplash was
active) the DI will be disabled by the full IPU reset we do when loading
the driver. The LDB control registers are not part of the IPU range and
thus will remain unchanged.
This leads to the LDB still being active when the DI is getting enabled,
effectively reversing the required enable sequence. Fix this by also
disabling the LDB on driver bind.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
A comment in the review of the patch adding the phandle cache said that
the cache would have to be updated when modules are applied and removed.
This patch implements the cache updates.
Fixes: 0b3ce78e90 ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Reported-by: Alan Tull <atull@kernel.org>
Suggested-by: Alan Tull <atull@kernel.org>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
drm_legacy_ctxbitmap_next() returns idr_alloc() which can return
-ENOMEM, -EINVAL or -ENOSPC none of which are -1 . but the call sites
of drm_legacy_ctxbitmap_next() seem to be assuming that the error case
would be -1 (original return of drm_ctxbitmap_next() prior to 2.6.23
was actually -1). Thus reenable error handling by checking for < 0.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Fixes: 62968144e6 ("drm: convert drm context code to use Linux idr")
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1531571532-22733-1-git-send-email-hofrat@osadl.org
Kishon writes:
phy: for 4.18-rc
*) Fix to get xhci working after disable<->enable cycle
*) Fix wrong enum used for status lines (also fixes a compilation
warning).
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
If softsynthx_read() is called with `count < 3`, `count - 3` wraps, causing
the loop to copy as much data as available to the provided buffer. If
softsynthx_read() is invoked through sys_splice(), this causes an
unbounded kernel write; but even when userspace just reads from it
normally, a small size could cause userspace crashes.
Fixes: 425e586cf9 ("speakup: add unicode variant of /dev/softsynth")
Cc: stable@vger.kernel.org
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
'hostif_mib_set_request_bool' function receives a bool as value and
send the received value with MIB_VALUE_TYPE_BOOL type. There is
one case where the value passed is not a boolean one but
'MCAST_FILTER_PROMISC' which is '2'. Call hostif_mib_set_request_int
instead for related multicast enumeration. This changes original
code behaviour but seems to be the right way to do this.
Fixes: 8ce76bff0e ("staging: ks7010: add new helpers to achieve mib set request and simplify code")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit b83b8b1881 ("staging:r8188eu: Use lib80211 to support TKIP")
is causing 2 problems for me:
1) One boot the wifi on a laptop with a r8188eu wifi device would not
connect and dmesg contained an oops about scheduling while atomic
pointing to the tkip code. This went away after reverting the commit.
2) I reverted the revert to try and get the oops from 1. again to be able
to add it to this commit message. But now the system did connect to the
wifi only to print a whole bunch of oopses, followed by a hardfreeze a
few seconds later. Subsequent reboots also all lead to scenario 2. Until
I reverted the commit again.
Revert the commit fixes both issues making the laptop usable again.
Fixes: b83b8b1881 ("staging:r8188eu: Use lib80211 to support TKIP")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Ivan Safonov <insafonov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently nouveau doesn't actually expose the state debugfs file that's
usually provided for any modesetting driver that supports atomic, even
if nouveau is loaded with atomic=1. This is due to the fact that the
standard debugfs files that DRM creates for atomic drivers is called
when drm_get_pci_dev() is called from nouveau_drm.c. This happens well
before we've initialized the display core, which is currently
responsible for setting the DRIVER_ATOMIC cap.
So, move the atomic option into nouveau_drm.c and just add the
DRIVER_ATOMIC cap whenever it's enabled on the kernel commandline. This
shouldn't cause any actual issues, as the atomic ioctl will still fail
as expected even if the display core doesn't disable it until later in
the init sequence. This also provides the added benefit of being able to
use the state debugfs file to check the current display state even if
clients aren't allowed to modify it through anything other than the
legacy ioctls.
Additionally, disable the DRIVER_ATOMIC cap in nv04's display core, as
this was already disabled there previously.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This both uses the legacy modesetting structures in a racy manner, and
additionally also doesn't even check the right variable (enabled != the
CRTC is actually turned on for atomic).
This fixes issues on my P50 regarding the dedicated GPU not entering
runtime suspend.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
A CRTC being enabled doesn't mean it's on! It doesn't even necessarily
mean it's being used. This fixes runtime PM leaks on the P50 I've got
next to me.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When MST and atomic were introduced to nouveau, another structure that
could contain a drm_connector embedded within it was introduced; struct
nv50_mstc. This meant that we no longer would be able to simply loop
through our connector list and assume that nouveau_connector() would
return a proper pointer for each connector, since the assertion that
all connectors coming from nouveau have a full nouveau_connector struct
became invalid.
Unfortunately, none of the actual code that looped through connectors
ever got updated, which means that we've been causing invalid memory
accesses for quite a while now.
An example that was caught by KASAN:
[ 201.038698] ==================================================================
[ 201.038792] BUG: KASAN: slab-out-of-bounds in nvif_notify_get+0x190/0x1a0 [nouveau]
[ 201.038797] Read of size 4 at addr ffff88076738c650 by task kworker/0:3/718
[ 201.038800]
[ 201.038822] CPU: 0 PID: 718 Comm: kworker/0:3 Tainted: G O 4.18.0-rc4Lyude-Test+ #1
[ 201.038825] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
[ 201.038882] Workqueue: events nouveau_display_hpd_work [nouveau]
[ 201.038887] Call Trace:
[ 201.038894] dump_stack+0xa4/0xfd
[ 201.038900] print_address_description+0x71/0x239
[ 201.038929] ? nvif_notify_get+0x190/0x1a0 [nouveau]
[ 201.038935] kasan_report.cold.6+0x242/0x2fe
[ 201.038942] __asan_report_load4_noabort+0x19/0x20
[ 201.038970] nvif_notify_get+0x190/0x1a0 [nouveau]
[ 201.038998] ? nvif_notify_put+0x1f0/0x1f0 [nouveau]
[ 201.039003] ? kmsg_dump_rewind_nolock+0xe4/0xe4
[ 201.039049] nouveau_display_init.cold.12+0x34/0x39 [nouveau]
[ 201.039089] ? nouveau_user_framebuffer_create+0x120/0x120 [nouveau]
[ 201.039133] nouveau_display_resume+0x5c0/0x810 [nouveau]
[ 201.039173] ? nvkm_client_ioctl+0x20/0x20 [nouveau]
[ 201.039215] nouveau_do_resume+0x19f/0x570 [nouveau]
[ 201.039256] nouveau_pmops_runtime_resume+0xd8/0x2a0 [nouveau]
[ 201.039264] pci_pm_runtime_resume+0x130/0x250
[ 201.039269] ? pci_restore_standard_config+0x70/0x70
[ 201.039275] __rpm_callback+0x1f2/0x5d0
[ 201.039279] ? rpm_resume+0x560/0x18a0
[ 201.039283] ? pci_restore_standard_config+0x70/0x70
[ 201.039287] ? pci_restore_standard_config+0x70/0x70
[ 201.039291] ? pci_restore_standard_config+0x70/0x70
[ 201.039296] rpm_callback+0x175/0x210
[ 201.039300] ? pci_restore_standard_config+0x70/0x70
[ 201.039305] rpm_resume+0xcc3/0x18a0
[ 201.039312] ? rpm_callback+0x210/0x210
[ 201.039317] ? __pm_runtime_resume+0x9e/0x100
[ 201.039322] ? kasan_check_write+0x14/0x20
[ 201.039326] ? do_raw_spin_lock+0xc2/0x1c0
[ 201.039333] __pm_runtime_resume+0xac/0x100
[ 201.039374] nouveau_display_hpd_work+0x67/0x1f0 [nouveau]
[ 201.039380] process_one_work+0x7a0/0x14d0
[ 201.039388] ? cancel_delayed_work_sync+0x20/0x20
[ 201.039392] ? lock_acquire+0x113/0x310
[ 201.039398] ? kasan_check_write+0x14/0x20
[ 201.039402] ? do_raw_spin_lock+0xc2/0x1c0
[ 201.039409] worker_thread+0x86/0xb50
[ 201.039418] kthread+0x2e9/0x3a0
[ 201.039422] ? process_one_work+0x14d0/0x14d0
[ 201.039426] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 201.039431] ret_from_fork+0x3a/0x50
[ 201.039441]
[ 201.039444] Allocated by task 79:
[ 201.039449] save_stack+0x43/0xd0
[ 201.039452] kasan_kmalloc+0xc4/0xe0
[ 201.039456] kmem_cache_alloc_trace+0x10a/0x260
[ 201.039494] nv50_mstm_add_connector+0x9a/0x340 [nouveau]
[ 201.039504] drm_dp_add_port+0xff5/0x1fc0 [drm_kms_helper]
[ 201.039511] drm_dp_send_link_address+0x4a7/0x740 [drm_kms_helper]
[ 201.039518] drm_dp_check_and_send_link_address+0x1a7/0x210 [drm_kms_helper]
[ 201.039525] drm_dp_mst_link_probe_work+0x71/0xb0 [drm_kms_helper]
[ 201.039529] process_one_work+0x7a0/0x14d0
[ 201.039533] worker_thread+0x86/0xb50
[ 201.039537] kthread+0x2e9/0x3a0
[ 201.039541] ret_from_fork+0x3a/0x50
[ 201.039543]
[ 201.039546] Freed by task 0:
[ 201.039549] (stack is not available)
[ 201.039551]
[ 201.039555] The buggy address belongs to the object at ffff88076738c1a8
which belongs to the cache kmalloc-2048 of size 2048
[ 201.039559] The buggy address is located 1192 bytes inside of
2048-byte region [ffff88076738c1a8, ffff88076738c9a8)
[ 201.039563] The buggy address belongs to the page:
[ 201.039567] page:ffffea001d9ce200 count:1 mapcount:0 mapping:ffff88084000d0c0 index:0x0 compound_mapcount: 0
[ 201.039573] flags: 0x8000000000008100(slab|head)
[ 201.039578] raw: 8000000000008100 ffffea001da3be08 ffffea001da25a08 ffff88084000d0c0
[ 201.039582] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
[ 201.039585] page dumped because: kasan: bad access detected
[ 201.039588]
[ 201.039591] Memory state around the buggy address:
[ 201.039594] ffff88076738c500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 201.039598] ffff88076738c580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 201.039601] >ffff88076738c600: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
[ 201.039604] ^
[ 201.039607] ffff88076738c680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 201.039611] ffff88076738c700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 201.039613] ==================================================================
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Every codepath in nouveau that loops through the connector list
currently does so using the old method, which is prone to race
conditions from MST connectors being created and destroyed. This has
been causing a multitude of problems, including memory corruption from
trying to access connectors that have already been freed!
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The bo array has req->nr_buffers elements so the > should be >= so we
don't read beyond the end of the array.
Fixes: a1606a9596 ("drm/nouveau: new gem pushbuf interface, bump to 0.0.16")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
It was possible for this to be skipped when shutting down MST streams, and
leaving the core channel interlocked with a wndw channel update that never
happens - leading to a hung display.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tested-By: Lyude Paul <lyude@redhat.com>
All copy_to_user() implementations need to be prepared to handle faults
accessing userspace. The __memcpy_mcsafe() implementation handles both
mmu-faults on the user destination and machine-check-exceptions on the
source buffer. However, the memcpy_mcsafe() wrapper may silently
fallback to memcpy() depending on build options and cpu-capabilities.
Force copy_to_user_mcsafe() to always use __memcpy_mcsafe() when
available, and otherwise disable all of the copy_to_user_mcsafe()
infrastructure when __memcpy_mcsafe() is not available, i.e.
CONFIG_X86_MCE=n.
This fixes crashes of the form:
run fstests generic/323 at 2018-07-02 12:46:23
BUG: unable to handle kernel paging request at 00007f0d50001000
RIP: 0010:__memcpy+0x12/0x20
[..]
Call Trace:
copyout_mcsafe+0x3a/0x50
_copy_to_iter_mcsafe+0xa1/0x4a0
? dax_alive+0x30/0x50
dax_iomap_actor+0x1f9/0x280
? dax_iomap_rw+0x100/0x100
iomap_apply+0xba/0x130
? dax_iomap_rw+0x100/0x100
dax_iomap_rw+0x95/0x100
? dax_iomap_rw+0x100/0x100
xfs_file_dax_read+0x7b/0x1d0 [xfs]
xfs_file_read_iter+0xa7/0xc0 [xfs]
aio_read+0x11c/0x1a0
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Fixes: 8780356ef6 ("x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()")
Link: http://lkml.kernel.org/r/153108277790.37979.1486841789275803399.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mark noticed that syzkaller is able to reliably trigger the following warning:
dl_rq->running_bw > dl_rq->this_bw
WARNING: CPU: 1 PID: 153 at kernel/sched/deadline.c:124 switched_from_dl+0x454/0x608
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 153 Comm: syz-executor253 Not tainted 4.18.0-rc3+ #29
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x458
show_stack+0x20/0x30
dump_stack+0x180/0x250
panic+0x2dc/0x4ec
__warn_printk+0x0/0x150
report_bug+0x228/0x2d8
bug_handler+0xa0/0x1a0
brk_handler+0x2f0/0x568
do_debug_exception+0x1bc/0x5d0
el1_dbg+0x18/0x78
switched_from_dl+0x454/0x608
__sched_setscheduler+0x8cc/0x2018
sys_sched_setattr+0x340/0x758
el0_svc_naked+0x30/0x34
syzkaller reproducer runs a bunch of threads that constantly switch
between DEADLINE and NORMAL classes while interacting through futexes.
The splat above is caused by the fact that if a DEADLINE task is setattr
back to NORMAL while in non_contending state (blocked on a futex -
inactive timer armed), its contribution to running_bw is not removed
before sub_rq_bw() gets called (!task_on_rq_queued() branch) and the
latter sees running_bw > this_bw.
Fix it by removing a task contribution from running_bw if the task is
not queued and in non_contending state while switched to a different
class.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Reviewed-by: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: claudio@evidence.eu.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20180711072948.27061-1-juri.lelli@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull ARM SoC fixes from Olof Johansson:
- A fix for OMAP5 and DRA7 to make the branch predictor hardening
settings take proper effect on secondary cores
- Disable USB OTG on am3517 since current driver isn't working
- Fix thermal sensor register settings on Armada 38x
- Fix suspend/resume IRQs on pxa3xx
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: am3517.dtsi: Disable reference to OMAP3 OTG controller
ARM: DRA7/OMAP5: Enable ACTLR[0] (Enable invalidates of BTB) for secondary cores
ARM: pxa: irq: fix handling of ICMR registers in suspend/resume
ARM: dts: armada-38x: use the new thermal binding
pvti_cpu0_va is the address of shared kvmclock data structure.
pvti_cpu0_va is currently kept unset (1) on 32 bit systems, (2) when
kvmclock vsyscall is disabled, and (3) if kvmclock is not stable.
This poses a problem, because kvm_ptp needs pvti_cpu0_va, but (1) can
work on 32 bit, (2) has little relation to the vsyscall, and (3) does
not need stable kvmclock (although kvmclock won't be used for system
clock if it's not stable, so kvm_ptp is pointless in that case).
Expose pvti_cpu0_va whenever kvmclock is enabled to allow all users to
work with it.
This fixes a regression found on Gentoo: https://bugs.gentoo.org/658544.
Fixes: 9f08890ab9 ("x86/pvclock: add setter for pvclock_pvti_cpu0_va")
Cc: stable@vger.kernel.org
Reported-by: Andreas Steinmetz <ast@domdv.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prevent a config where KVM_AMD=y and CRYPTO_DEV_CCP_DD=m thereby ensuring
that AMD Secure Processor device driver will be built-in when KVM_AMD is
also built-in.
v1->v2:
* Removed usage of 'imply' Kconfig option.
* Change patch commit message.
Fixes: 505c9e94d8 ("KVM: x86: prefer "depends on" to "select" for SEV")
Cc: <stable@vger.kernel.org> # 4.16.x
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This exit qualification was inadvertently dropped when the two
VM-entry failure blocks were coalesced.
Fixes: e79f245dde ("X86/KVM: Properly update 'tsc_offset' to represent the running guest")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When we switched from doing rdmsr() to reading FS/GS base values from
current->thread we completely forgot about legacy 32-bit userspaces which
we still support in KVM (why?). task->thread.{fsbase,gsbase} are only
synced for 64-bit processes, calling save_fsgs_for_kvm() and using
its result from current is illegal for legacy processes.
There's no ARCH_SET_FS/GS prctls for legacy applications. Base MSRs are,
however, not always equal to zero. Intel's manual says (3.4.4 Segment
Loading Instructions in IA-32e Mode):
"In order to set up compatibility mode for an application, segment-load
instructions (MOV to Sreg, POP Sreg) work normally in 64-bit mode. An
entry is read from the system descriptor table (GDT or LDT) and is loaded
in the hidden portion of the segment register.
...
The hidden descriptor register fields for FS.base and GS.base are
physically mapped to MSRs in order to load all address bits supported by
a 64-bit implementation.
"
The issue was found by strace test suite where 32-bit ioctl_kvm_run test
started segfaulting.
Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Bisected-by: Masatake YAMATO <yamato@redhat.com>
Fixes: 42b933b597 ("x86/kvm/vmx: read MSR_{FS,KERNEL_GS}_BASE from current->thread")
Cc: stable@vger.kernel.org
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This lets userspace read the MSR_IA32_ARCH_CAPABILITIES and check that all
requested features are available on the host.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When cpu_stop_queue_two_works() begins to wake the stopper threads, it does
so without preemption disabled, which leads to the following race
condition:
The source CPU calls cpu_stop_queue_two_works(), with cpu1 as the source
CPU, and cpu2 as the destination CPU. When adding the stopper threads to
the wake queue used in this function, the source CPU stopper thread is
added first, and the destination CPU stopper thread is added last.
When wake_up_q() is invoked to wake the stopper threads, the threads are
woken up in the order that they are queued in, so the source CPU's stopper
thread is woken up first, and it preempts the thread running on the source
CPU.
The stopper thread will then execute on the source CPU, disable preemption,
and begin executing multi_cpu_stop(), and wait for an ack from the
destination CPU's stopper thread, with preemption still disabled. Since the
worker thread that woke up the stopper thread on the source CPU is affine
to the source CPU, and preemption is disabled on the source CPU, that
thread will never run to dequeue the destination CPU's stopper thread from
the wake queue, and thus, the destination CPU's stopper thread will never
run, causing the source CPU's stopper thread to wait forever, and stall.
Disable preemption when waking the stopper threads in
cpu_stop_queue_two_works().
Fixes: 0b26351b91 ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Co-Developed-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: matt@codeblueprint.co.uk
Cc: bigeasy@linutronix.de
Cc: gregkh@linuxfoundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1530655334-4601-1-git-send-email-isaacm@codeaurora.org
Markus reported that BTS is sporadically missing the tail of the trace
in the perf_event data buffer: [decode error (1): instruction overflow]
shown in GDB; and bisected it to the conversion of debug_store to PTI.
A little "optimization" crept into alloc_bts_buffer(), which mistakenly
placed bts_interrupt_threshold away from the 24-byte record boundary.
Intel SDM Vol 3B 17.4.9 says "This address must point to an offset from
the BTS buffer base that is a multiple of the BTS record size."
Revert "max" from a byte count to a record count, to calculate the
bts_interrupt_threshold correctly: which turns out to fix problem seen.
Fixes: c1961a4631 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
Reported-and-tested-by: Markus T Metzger <markus.t.metzger@intel.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: stable@vger.kernel.org # v4.14+
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.1807141248290.1614@eggly.anvils
Pull RTC fixes from Alexandre Belloni:
"Two fixes for 4.18:
- an important core fix for RTCs using the core offsetting only one
driver is affected
- a fix for the error path of mrst"
* tag 'rtc-4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: fix alarm read and set offset
rtc: mrst: fix error code in probe()
Two omap fixes for v4.18-rc cycle
Turns out the recent patches for ARM branch predictor hardening are
not working on omap5 and dra7 as planned because the secondary CPU
is parked to the bootrom code. We can't configure it in the bootloader.
So we must enable invalidates of BTB for omap5 and dra7 secondary
core in the kernel.
And there's a fix for reserved register access for am3517. The
usb otg module on am3517 is not the same as for other omap3.
* tag 'omap-for-v4.18/fixes-rc4-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am3517.dtsi: Disable reference to OMAP3 OTG controller
ARM: DRA7/OMAP5: Enable ACTLR[0] (Enable invalidates of BTB) for secondary cores
Signed-off-by: Olof Johansson <olof@lixom.net>
mvebu fixes for 4.18 (part 1)
Use the new thermal binding on Armada 38x allowing to use a driver fix
which is already part of the kernel.
* tag 'mvebu-fixes-4.18-1' of git://git.infradead.org/linux-mvebu:
ARM: dts: armada-38x: use the new thermal binding
Signed-off-by: Olof Johansson <olof@lixom.net>
This is the fixes set for v4.18 cycle.
This is a fix for suspending all pxa3xx platforms, where high
number interrupts are not reenabled.
* tag 'pxa-fixes-4.18' of https://github.com/rjarzmik/linux:
ARM: pxa: irq: fix handling of ICMR registers in suspend/resume
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull xen fixes from Juergen Gross:
"Two related fixes for a boot failure of Xen PV guests"
* tag 'for-linus-4.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: setup pv irq ops vector earlier
xen: remove global bit from __default_kernel_pte_mask for pv guests
Pull block fix from Jens Axboe:
"Just a single regression fix (from 4.17) for bsg, fixing an EINVAL
return on non-data commands"
* tag 'for-linus-20180713' of git://git.kernel.dk/linux-block:
bsg: fix bogus EINVAL on non-data commands
Merge misc fixes from Andrew Morton:
"11 fixes"
* emailed patches form Andrew Morton <akpm@linux-foundation.org>:
reiserfs: fix buffer overflow with long warning messages
checkpatch: fix duplicate invalid vsprintf pointer extension '%p<foo>' messages
mm: do not bug_on on incorrect length in __mm_populate()
mm/memblock.c: do not complain about top-down allocations for !MEMORY_HOTREMOVE
fs, elf: make sure to page align bss in load_elf_library
x86/purgatory: add missing FORCE to Makefile target
net/9p/client.c: put refcount of trans_mod in error case in parse_opts()
mm: allow arch to supply p??_free_tlb functions
autofs: fix slab out of bounds read in getname_kernel()
fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
mm: do not drop unused pages when userfaultd is running
Multiline statements with invalid %p<foo> uses produce multiple
warnings. Fix that.
e.g.:
$ cat t_block.c
void foo(void)
{
MY_DEBUG(drv->foo,
"%pk",
foo->boo);
}
$ ./scripts/checkpatch.pl -f t_block.c
WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
#1: FILE: t_block.c:1:
+void foo(void)
WARNING: Invalid vsprintf pointer extension '%pk'
#3: FILE: t_block.c:3:
+ MY_DEBUG(drv->foo,
+ "%pk",
+ foo->boo);
WARNING: Invalid vsprintf pointer extension '%pk'
#3: FILE: t_block.c:3:
+ MY_DEBUG(drv->foo,
+ "%pk",
+ foo->boo);
total: 0 errors, 3 warnings, 6 lines checked
NOTE: For some of the reported defects, checkpatch may be able to
mechanically convert to the typical style using --fix or --fix-inplace.
t_block.c has style problems, please review.
NOTE: If any of the errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Link: http://lkml.kernel.org/r/9e8341bbe4c9877d159cb512bb701043cbfbb10b.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: "Tobin C. Harding" <me@tobin.cc>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
syzbot has noticed that a specially crafted library can easily hit
VM_BUG_ON in __mm_populate
kernel BUG at mm/gup.c:1242!
invalid opcode: 0000 [#1] SMP
CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
RIP: 0010:__mm_populate+0x1e2/0x1f0
Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
Call Trace:
vm_brk_flags+0xc3/0x100
vm_brk+0x1f/0x30
load_elf_library+0x281/0x2e0
__ia32_sys_uselib+0x170/0x1e0
do_fast_syscall_32+0xca/0x420
entry_SYSENTER_compat+0x70/0x7f
The reason is that the length of the new brk is not page aligned when we
try to populate the it. There is no reason to bug on that though.
do_brk_flags already aligns the length properly so the mapping is
expanded as it should. All we need is to tell mm_populate about it.
Besides that there is absolutely no reason to to bug_on in the first
place. The worst thing that could happen is that the last page wouldn't
get populated and that is far from putting system into an inconsistent
state.
Fix the issue by moving the length sanitization code from do_brk_flags
up to vm_brk_flags. The only other caller of do_brk_flags is brk
syscall entry and it makes sure to provide the proper length so t here
is no need for sanitation and so we can use do_brk_flags without it.
Also remove the bogus BUG_ONs.
[osalvador@techadventures.net: fix up vm_brk_flags s@request@len@]
Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: syzbot <syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport is converting architectures from bootmem to nobootmem
allocator. While doing so for m68k Geert has noticed that he gets a
scary looking warning:
WARNING: CPU: 0 PID: 0 at mm/memblock.c:230
memblock_find_in_range_node+0x11c/0x1be
memblock: bottom-up allocation failed, memory hotunplug may be affected
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted
4.18.0-rc3-atari-01343-gf2fb5f2e09a97a3c-dirty #7
Call Trace: __warn+0xa8/0xc2
kernel_pg_dir+0x0/0x1000
netdev_lower_get_next+0x2/0x22
warn_slowpath_fmt+0x2e/0x36
memblock_find_in_range_node+0x11c/0x1be
memblock_find_in_range_node+0x11c/0x1be
memblock_find_in_range_node+0x0/0x1be
vprintk_func+0x66/0x6e
memblock_virt_alloc_internal+0xd0/0x156
netdev_lower_get_next+0x2/0x22
netdev_lower_get_next+0x2/0x22
kernel_pg_dir+0x0/0x1000
memblock_virt_alloc_try_nid_nopanic+0x58/0x7a
netdev_lower_get_next+0x2/0x22
kernel_pg_dir+0x0/0x1000
kernel_pg_dir+0x0/0x1000
EXPTBL+0x234/0x400
EXPTBL+0x234/0x400
alloc_node_mem_map+0x4a/0x66
netdev_lower_get_next+0x2/0x22
free_area_init_node+0xe2/0x29e
EXPTBL+0x234/0x400
paging_init+0x430/0x462
kernel_pg_dir+0x0/0x1000
printk+0x0/0x1a
EXPTBL+0x234/0x400
setup_arch+0x1b8/0x22c
start_kernel+0x4a/0x40a
_sinittext+0x344/0x9e8
The warning is basically saying that a top-down allocation can break
memory hotremove because memblock allocation is not movable. But m68k
doesn't even support MEMORY_HOTREMOVE so there is no point to warn about
it.
Make the warning conditional only to configurations that care.
Link: http://lkml.kernel.org/r/20180706061750.GH32658@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Sam Creasey <sammy@sammy.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Build the kernel without the fix
- Add some flag to the purgatories KBUILD_CFLAGS,I used
-fno-asynchronous-unwind-tables
- Re-build the kernel
When you look at makes output you see that sha256.o is not re-build in the
last step. Also readelf -S still shows the .eh_frame section for
sha256.o.
With the fix sha256.o is rebuilt in the last step.
Without FORCE make does not detect changes only made to the command line
options. So object files might not be re-built even when they should be.
Fix this by adding FORCE where it is missing.
Link: http://lkml.kernel.org/r/20180704110044.29279-2-prudo@linux.ibm.com
Fixes: df6f2801f5 ("kernel/kexec_file.c: move purgatories sha256 to common code")
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In my testing, the second mount will fail after umounting successfully.
The reason is that we put refcount of trans_mod in the correct case
rather than the error case in parse_opts() at last. That will cause the
refcount decrease to -1, and when we try to get trans_mod again in
try_module_get(), we could only increase refcount to 0 which will cause
failure as follows:
parse_opts
v9fs_get_trans_by_name
try_module_get : return NULL to caller which cause error
So we should put refcount of trans_mod in error case.
Link: http://lkml.kernel.org/r/5B3F39A0.2030509@huawei.com
Fixes: 9421c3e641 ("net/9p/client.c: fix potential refcnt problem of trans module")
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Dominique Martinet <dominique.martinet@cea.fr>
Tested-by: Dominique Martinet <dominique.martinet@cea.fr>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The mmu_gather APIs keep track of the invalidated address range
including the span covered by invalidated page table pages. Ranges
covered by page tables but not ptes (and therefore no TLBs) still need
to be invalidated because some architectures (x86) can cache
intermediate page table entries, and invalidate those with normal TLB
invalidation instructions to be almost-backward-compatible.
Architectures which don't cache intermediate page table entries, or
which invalidate these caches separately from TLB invalidation, do not
require TLB invalidation range expanded over page tables.
Allow architectures to supply their own p??_free_tlb functions, which
can avoid the __tlb_adjust_range.
Link: http://lkml.kernel.org/r/20180703013131.2807-1-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas reports:
"While looking around in /proc on my v4.14.52 system I noticed that all
processes got a lot of "Locked" memory in /proc/*/smaps. A lot more
memory than a regular user can usually lock with mlock().
Commit 493b0e9d94 (in v4.14-rc1) seems to have changed the behavior
of "Locked".
Before that commit the code was like this. Notice the VM_LOCKED check.
(vma->vm_flags & VM_LOCKED) ?
(unsigned long)(mss.pss >> (10 + PSS_SHIFT)) : 0);
After that commit Locked is now the same as Pss:
(unsigned long)(mss->pss >> (10 + PSS_SHIFT)));
This looks like a mistake."
Indeed, the commit has added mss->pss_locked with the correct value that
depends on VM_LOCKED, but forgot to actually use it. Fix it.
Link: http://lkml.kernel.org/r/ebf6c7fb-fec3-6a26-544f-710ed193c154@suse.cz
Fixes: 493b0e9d94 ("mm: add /proc/pid/smaps_rollup")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Daniel Colascione <dancol@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KVM guests on s390 can notify the host of unused pages. This can result
in pte_unused callbacks to be true for KVM guest memory.
If a page is unused (checked with pte_unused) we might drop this page
instead of paging it. This can have side-effects on userfaultd, when
the page in question was already migrated:
The next access of that page will trigger a fault and a user fault
instead of faulting in a new and empty zero page. As QEMU does not
expect a userfault on an already migrated page this migration will fail.
The most straightforward solution is to ignore the pte_unused hint if a
userfault context is active for this VMA.
Link: http://lkml.kernel.org/r/20180703171854.63981-1-borntraeger@de.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We must zero struct pages for memory that is not backed by physical
memory, or kernel does not have access to.
Recently, there was a change which zeroed all memmap for all holes in
e820. Unfortunately, it introduced a bug that is discussed here:
https://www.spinics.net/lists/linux-mm/msg156764.html
Linus, also saw this bug on his machine, and confirmed that reverting
commit 124049decb ("x86/e820: put !E820_TYPE_RAM regions into
memblock.reserved") fixes the issue.
The problem is that we incorrectly zero some struct pages after they
were setup.
The fix is to zero unavailable struct pages prior to initializing of
struct pages.
A more detailed fix should come later that would avoid double zeroing
cases: one in __init_single_page(), the other one in
zero_resv_unavail().
Fixes: 124049decb ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
platform_get_resource() may fail and return NULL, so we should
better check it's return value to avoid a NULL pointer dereference
a bit later in the code.
This is detected by Coccinelle semantic patch.
@@
expression pdev, res, n, t, e, e1, e2;
@@
res = platform_get_resource(pdev, t, n);
+ if (!res)
+ return -EINVAL;
... when != res == NULL
e = devm_ioremap_nocache(e1, res->start, e2);
Fixes: cc4fa83f66 ("pinctrl: nsp: add pinmux driver support for Broadcom NSP SoC")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The > comparisons should be >= or else we read beyond the end of the
pinctrl->functions[] array.
Fixes: cc4fa83f66 ("pinctrl: nsp: add pinmux driver support for Broadcom NSP SoC")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The datasheet does not document any registers to control drive strength,
and no drive strength registers are for this reason described for this
SoC. The flags indicating that drive strength can be controlled are
however set for some pins in the driver.
This leads to a NULL pointer dereference when the sh-pfc core tries to
access the struct describing the drive strength registers, for example
when reading the sysfs file pinconf-pins.
Fix this by removing the SH_PFC_PIN_CFG_DRIVE_STRENGTH from all pins.
Fixes: b92ac66a18 ("pinctrl: sh-pfc: Add R8A77970 PFC support")
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The .gpio_set_direction() callback was setting inverted direction
for SoCs older than the JZ4770, this restores the correct behaviour.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
If the pinctrl node has the gpio-ranges property, the range will be added
by the gpio core and doesn't need to be added by the pinctrl driver.
But for keeping backward compatibility, an explicit pinctrl_add_gpio_range
is still needed to be called when there is a missing gpio-ranges in pinctrl
node in old dts files.
Cc: stable@vger.kernel.org
Fixes: d6ed935513 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
To allow claiming hogs by pinctrl, we cannot enable pinctrl until all
groups and functions are being added done. Also, it's necessary that
the corresponding gpiochip is being added when the pinctrl device is
enabled.
Cc: stable@vger.kernel.org
Fixes: d6ed935513 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Because gpichip applied in the driver must depend on mtk eint to implement
the input data debouncing and the translation between gpio and irq, it's
better to keep logic consistent with mtk eint being built prior to gpiochip
being added.
Cc: stable@vger.kernel.org
Fixes: e6dabd38d8 ("pinctrl: mediatek: add EINT support to MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Yuchung Cheng says:
====================
fix DCTCP delayed ACK
This patch series addresses the issue that sometimes DCTCP
fail to acknowledge the latest sequence and result in sender timeout
if inflight is small.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, when a data segment was sent an ACK was piggybacked
on the data segment without generating a CA_EVENT_NON_DELAYED_ACK
event to notify congestion control modules. So the DCTCP
ca->delayed_ack_reserved flag could incorrectly stay set when
in fact there were no delayed ACKs being reserved. This could result
in sending a special ECN notification ACK that carries an older
ACK sequence, when in fact there was no need for such an ACK.
DCTCP keeps track of the delayed ACK status with its own separate
state ca->delayed_ack_reserved. Previously it may accidentally cancel
the delayed ACK without updating this field upon sending a special
ACK that carries a older ACK sequence. This inconsistency would
lead to DCTCP receiver never acknowledging the latest data until the
sender times out and retry in some cases.
Packetdrill script (provided by Larry Brakmo)
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect01] . 1:1(0) ack 1001
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 1:2(1) ack 1001
0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 2:3(1) ack 2001
0.200 < [ect0] . 2001:3001(1000) ack 3 win 257
0.200 < [ect0] . 3001:4001(1000) ack 3 win 257
0.200 > [ect01] . 3:3(0) ack 4001
0.210 < [ce] P. 4001:4501(500) ack 3 win 257
+0.001 read(4, ..., 4500) = 4500
+0 write(4, ..., 1) = 1
+0 > [ect01] PE. 3:4(1) ack 4501
+0.010 < [ect0] W. 4501:5501(1000) ack 4 win 257
// Previously the ACK sequence below would be 4501, causing a long RTO
+0.040~+0.045 > [ect01] . 4:4(0) ack 5501 // delayed ack
+0.311 < [ect0] . 5501:6501(1000) ack 4 win 257 // More data
+0 > [ect01] . 4:4(0) ack 6501 // now acks everything
+0.500 < F. 9501:9501(0) ack 4 win 257
Reported-by: Larry Brakmo <brakmo@fb.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We accidentally left out the error handling for kstrtoul().
Fixes: a520030e32 ("qlcnic: Implement flash sysfs callback for 83xx adapter")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull i2c fixes from Wolfram Sang:
- I2C core bugfix regarding bus recovery
- driver bugfix for the tegra driver
- typo correction
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: recovery: if possible send STOP with recovery pulses
i2c: tegra: Fix NACK error handling
i2c: stu300: use non-archaic spelling of failes
Daniel Borkmann says:
====================
pull-request: bpf 2018-07-13
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix AF_XDP TX error reporting before final kernel release such that it
becomes consistent between copy mode and zero-copy, from Magnus.
2) Fix three different syzkaller reported issues: oob due to ld_abs
rewrite with too large offset, another oob in l3 based skb test run
and a bug leaving mangled prog in subprog JITing error path, from Daniel.
3) Fix BTF handling for bitfield extraction on big endian, from Okash.
4) Fix a missing linux/errno.h include in cgroup/BPF found by kbuild bot,
from Roman.
5) Fix xdp2skb_meta.sh sample by using just command names instead of
absolute paths for tc and ip and allow them to be redefined, from Taeung.
6) Fix availability probing for BPF seg6 helpers before final kernel ships
so they can be detected at prog load time, from Mathieu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 8b7008620b ("net: Don't copy pfmemalloc flag in
__copy_skb_header()") introduced a different handling for the
pfmemalloc flag in copy and clone paths.
In __skb_clone(), now, the flag is set only if it was set in the
original skb, but not cleared if it wasn't. This is wrong and
might lead to socket buffers being flagged with pfmemalloc even
if the skb data wasn't allocated from pfmemalloc reserves. Copy
the flag instead of ORing it.
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Fixes: 8b7008620b ("net: Don't copy pfmemalloc flag in __copy_skb_header()")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull timer fixes from Ingo Molnar:
"A clocksource driver fix and a revert"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: arm_arch_timer: Set arch_mem_timer cpumask to cpu_possible_mask
Revert "tick: Prefer a lower rating device only if it's CPU local device"
Pull perf tool fixes from Ingo Molnar:
"Misc tooling fixes: python3 related fixes, gcc8 fix, bashism fixes and
some other smaller fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Use python-config --includes rather than --cflags
perf script python: Fix dict reference counting
perf stat: Fix --interval_clear option
perf tools: Fix compilation errors on gcc8
perf test shell: Prevent temporary editor files from being considered test scripts
perf llvm-utils: Remove bashism from kernel include fetch script
perf test shell: Make perf's inet_pton test more portable
perf test shell: Replace '|&' with '2>&1 |' to work with more shells
perf scripts python: Add Python 3 support to EventClass.py
perf scripts python: Add Python 3 support to sched-migration.py
perf scripts python: Add Python 3 support to Util.py
perf scripts python: Add Python 3 support to SchedGui.py
perf scripts python: Add Python 3 support to Core.py
perf tools: Generate a Python script compatible with Python 2 and 3
Pull rseq fixes from Ingo Molnar:
"Various rseq ABI fixes and cleanups: use get_user()/put_user(),
validate parameters and use proper uapi types, etc"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq/selftests: cleanup: Update comment above rseq_prepare_unload
rseq: Remove unused types_32_64.h uapi header
rseq: uapi: Declare rseq_cs field as union, update includes
rseq: uapi: Update uapi comments
rseq: Use get_user/put_user rather than __get_user/__put_user
rseq: Use __u64 for rseq_cs fields, validate user inputs
Pull rdma fixes from Jason Gunthorpe:
"Things have been quite slow, only 6 RC patches have been sent to the
list. Regression, user visible bugs, and crashing fixes:
- cxgb4 could wrongly fail MR creation due to a typo
- various crashes if the wrong QP type is mixed in with APIs that
expect other types
- syzkaller oops
- using ERR_PTR and NULL together cases HFI1 to crash in some cases
- mlx5 memory leak in error unwind"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx5: Fix memory leak in mlx5_ib_create_srq() error path
RDMA/uverbs: Don't fail in creation of multiple flows
IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values
RDMA/uverbs: Fix slab-out-of-bounds in ib_uverbs_ex_create_flow
RDMA/uverbs: Protect from attempts to create flows on unsupported QP
iw_cxgb4: correctly enforce the max reg_mr depth
Pull VFIO fix from Alex Williamson:
"Fix deadlock in mbochs sample driver (Alexey Khoroshilov)"
* tag 'vfio-v4.18-rc5' of git://github.com/awilliam/linux-vfio:
sample: vfio-mdev: avoid deadlock in mdev_access()
Pull Kbuild fixes from Masahiro Yamada:
- update Kbuild and Kconfig documents
- sanitize -I compiler option handling
- update extract-vmlinux script to recognize LZ4 and ZSTD
- fix tools Makefiles
- update tags.sh to handle __ro_after_init
- suppress warnings in case getconf does not recognize LFS_* parameters
* tag 'kbuild-fixes-v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: suppress warnings from 'getconf LFS_*'
scripts/tags.sh: add __ro_after_init
tools: build: Use HOSTLDFLAGS with fixdep
tools: build: Fixup host c flags
tools build: fix # escaping in .cmd files for future Make
scripts: teach extract-vmlinux about LZ4 and ZSTD
kbuild: remove duplicated comments about PHONY
kbuild: .PHONY is not a variable, but PHONY is
kbuild: do not drop -I without parameter
kbuild: document the KBUILD_KCONFIG env. variable
kconfig: update user kconfig tools doc.
kbuild: delete INSTALL_FW_PATH from kbuild documentation
kbuild: update ARCH alias info for sparc
kbuild: update ARCH alias info for sh
Pull arm64 fixes from Will Deacon:
"Catalin's out enjoying the sunshine, so I'm sending the fixes for a
couple of weeks (although there hopefully won't be any more!).
We've got a revert of a previous fix because it broke the build with
some distro toolchains and a preemption fix when detemining whether or
not the SIMD unit is in use.
Summary:
- Revert back to the 'linux' target for LD, as 'elf' breaks some
distributions
- Fix preemption race when testing whether the vector unit is in use
or not"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: neon: Fix function may_use_simd() return error status
Revert "arm64: Use aarch64elf and aarch64elfb emulation mode variants"
Pull ARM fixes from Russell King:
"A couple of small fixes this time around from Steven for an
interaction between ftrace and kernel read-only protection, and
Vladimir for nommu"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8780/1: ftrace: Only set kernel memory back to read-only after boot
ARM: 8775/1: NOMMU: Use instr_sync instead of plain isb in common code
Pull tracing fixlet from Steven Rostedt:
"Joel Fernandes asked to add a feature in tracing that Android had its
own patch internally for. I took it back in 4.13. Now he realizes that
he had a mistake, and swapped the values from what Android had. This
means that the old Android tools will break when using a new kernel
that has the new feature on it.
The options are:
1. To swap it back to what Android wants.
2. Add a command line option or something to do the swap
3. Just let Android carry a patch that swaps it back
Since it requires setting a tracing option to enable this anyway, I
doubt there are other users of this than Android. Thus, I've decided
to take option 1. If someone else is actually depending on the order
that is in the kernel, then we will have to revert this change and go
to option 2 or 3"
* tag 'trace-v4.18-rc3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Reorder display of TGID to be after PID
Pull sound fixes from Takashi Iwai:
"Just a few HD-auio fixes: one fix for a possible mutex deadlock at
HDMI hotplug handling is somewhat subtle and delicate, while the rest
are usual device-specific quirks"
* tag 'sound-4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/ca0132: Update a pci quirk device name
ALSA: hda/ca0132: Add Recon3Di quirk for Gigabyte G1.Sniper Z97
ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
ALSA: hda - Handle pm failure during hotplug
Pull libnvdimm fixes from Dave Jiang:
- ensure that a variable passed in by reference to acpi_nfit_ctl is
always set to a value. An incremental patch is provided due to notice
from testing in -next. The rest of the commits did not exhibit
issues.
- fix a return path in nsio_rw_bytes() that was not returning "bytes
remain" as expected for the function.
- address an issue where applications polling on scrub-completion for
the NVDIMM may falsely wakeup and read the wrong state value and
cause hang.
- change the test unit persistent capability attribute to fix up a
broken assumption in the unit test infrastructure wrt the
'write_cache' attribute
- ratelimit dev_info() in the dax device check_vma() function since
this is easily triggered from userspace
* tag 'libnvdimm-fixes-4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nfit: fix unchecked dereference in acpi_nfit_ctl
acpi, nfit: Fix scrub idle detection
tools/testing/nvdimm: advertise a write cache for nfit_test
acpi/nfit: fix cmd_rc for acpi_nfit_ctl to always return a value
dev-dax: check_vma: ratelimit dev_info-s
libnvdimm, pmem: Fix memcpy_mcsafe() return code handling in nsio_rw_bytes()
btrfs_cmp_data_free() puts cmp's src_pages and dst_pages, but leaves
their page address intact. Now, if you hit "goto again" in
btrfs_extent_same_range() and hit some error in
btrfs_cmp_data_prepare(), you'll try to unlock/put already put pages.
This is simple fix to reset the address to avoid use-after-free.
Fixes: 67b07bd4be ("Btrfs: reuse cmp workspace in EXTENT_SAME ioctl")
Signed-off-by: Naohiro Aota <naota@elisp.net>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Magnus Karlsson says:
====================
This patch set adjusts the AF_XDP TX error reporting so that it becomes
consistent between copy mode and zero-copy. First some background:
Copy-mode for TX uses the SKB path in which the action of sending the
packet is performed from process context using the sendmsg
syscall. Completions are usually done asynchronously from NAPI mode by
using a TX interrupt. In this mode, send errors can be returned back
through the syscall.
In zero-copy mode both the sending of the packet and the completions
are done asynchronously from NAPI mode for performance reasons. In
this mode, the sendmsg syscall only makes sure that the TX NAPI loop
will be run that performs both the actions of sending and
completing. In this mode it is therefore not possible to return errors
through the sendmsg syscall as the sending is done from the NAPI
loop. Note that it is possible to implement a synchronous send with
our API, but in our benchmarks that made the TX performance drop by
nearly half due to synchronization requirements and cache line
bouncing. But for some netdevs this might be preferable so let us
leave it up to the implementation to decide.
The problem is that the current code base returns some errors in
copy-mode that are not possible to return in zero-copy mode. This
patch set aligns them so that the two modes always return the same
error code. We achieve this by removing some of the errors returned by
sendmsg in copy-mode (and in one case adding an error message for
zero-copy mode) and offering alternative error detection methods that
are consistent between the two modes.
The structure of the patch set is as follows:
Patch 1: removes the ENXIO return code from copy-mode when someone has
forcefully changed the number of queues on the device so that the
queue bound to the socket is no longer available. Just silently stop
sending anything as in zero-copy mode.
Patch 2: stop returning EAGAIN in copy mode when the completion queue
is full as zero-copy does not do this. Instead this situation can be
detected by comparing the head and tail pointers of the completion
queue in both modes. In any case, EAGAIN was not the correct error code
here since no amount of calling sendmsg will solve the problem. Only
consuming one or more messages on the completion queue will fix this.
Patch 3: Always return ENOBUFS from sendmsg if there is no TX queue
configured. This was not the case for zero-copy mode.
Patch 4: stop returning EMSGSIZE when the size of the packet is larger
than the MTU. Just send it to the device so that it will drop it as in
zero-copy mode.
Note that copy-mode can still return EAGAIN in certain circumstances,
but as these conditions cannot occur in zero-copy mode it is fine for
copy-mode to return them.
====================
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch stops returning EMSGSIZE from sendmsg in copy mode when the
size of the packet is larger than the MTU. Just send it to the device
so that it will drop it as in zero-copy mode. This makes the error
reporting consistent between copy mode and zero-copy mode.
Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch makes sure ENOBUFS is always returned from sendmsg if there
is no TX queue configured. This was not the case for zero-copy
mode. With this patch this error reporting is consistent between copy
mode and zero-copy mode.
Fixes: ac98d8aab6 ("xsk: wire upp Tx zero-copy functions")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch stops returning EAGAIN in TX copy mode when the completion
queue is full as zero-copy does not do this. Instead this situation
can be detected by comparing the head and tail pointers of the
completion queue in both modes. In any case, EAGAIN was not the
correct error code here since no amount of calling sendmsg will solve
the problem. Only consuming one or more messages on the completion
queue will fix this.
With this patch, the error reporting becomes consistent between copy
mode and zero-copy mode.
Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch removes the ENXIO return code from TX copy-mode when
someone has forcefully changed the number of queues on the device so
that the queue bound to the socket is no longer available. Just
silently stop sending anything as in zero-copy mode so the error
reporting gets consistent between the two modes.
Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Commit 542c5908ab ("btrfs: replace uuid_mutex by
device_list_mutex in btrfs_open_devices") switched to device_list_mutex
as we need that for the device list traversal, but we also need
uuid_mutex to protect access to fs_devices::opened to be consistent with
other users of that.
Fixes: 542c5908ab ("btrfs: replace uuid_mutex by device_list_mutex in btrfs_open_devices")
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The offset needs to be added after reading the alarm value.
It also needs to be subtracted after the now < alarm test.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Setting pv_irq_ops for Xen PV domains should be done as early as
possible in order to support e.g. very early printk() usage.
The same applies to xen_vcpu_info_reset(0), as it is needed for the
pv irq ops.
Move the call of xen_setup_machphys_mapping() after initializing the
pv functions as it contains a WARN_ON(), too.
Remove the no longer necessary conditional in xen_init_irq_ops()
from PVH V1 times to make clear this is a PV only function.
Cc: <stable@vger.kernel.org> # 4.14
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
The calling convention of blk_get_request() has changed in lk 4.18; update
the comment in sg.c to match.
Fixes: ff005a0662 ("block: sanitize blk_get_request calling conventions")
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In iscsi_check_tmf_restrictions() task->hdr is dereferenced to print the
opcode, it is possible that task->hdr is NULL.
There are two cases based on opcode argument:
1. ISCSI_OP_SCSI_CMD - In this case alloc_pdu() is called
after iscsi_check_tmf_restrictions()
iscsi_prep_scsi_cmd_pdu() -> iscsi_check_tmf_restrictions() -> alloc_pdu().
Transport drivers allocate memory for iSCSI hdr in alloc_pdu() and assign
it to task->hdr. In case of TMF task->hdr will be NULL resulting in NULL
pointer dereference.
2. ISCSI_OP_SCSI_DATA_OUT - In this case transport driver can free the
memory for iSCSI hdr after transmitting the pdu so task->hdr can be NULL or
invalid.
This patch fixes this issue by removing task->hdr->opcode from the printk
statement.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
- rounddown CXGBIT_MAX_ISO_PAYLOAD by csk->emss before calculating
max_iso_npdu to get max TCP payload in multiple of mss.
- call cxgbit_set_digest() before cxgbit_set_iso_npdu() to set
csk->submode, it is used in calculating number of iso pdus.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The udpgso benchmark compares various configurations of UDP and TCP.
Including one that is not upstream, udp zerocopy. This is a leftover
from the earlier RFC patchset.
The test is part of kselftests and run in continuous spinners. Remove
the failing case to make the test start passing.
Fixes: 3a687bef14 ("selftests: udp gso benchmark")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently ftrace displays data in trace output like so:
_-----=> irqs-off
/ _----=> need-resched
| / _---=> hardirq/softirq
|| / _--=> preempt-depth
||| / delay
TASK-PID CPU TGID |||| TIMESTAMP FUNCTION
| | | | |||| | |
bash-1091 [000] ( 1091) d..2 28.313544: sched_switch:
However Android's trace visualization tools expect a slightly different
format due to an out-of-tree patch patch that was been carried for a
decade, notice that the TGID and CPU fields are reversed:
_-----=> irqs-off
/ _----=> need-resched
| / _---=> hardirq/softirq
|| / _--=> preempt-depth
||| / delay
TASK-PID TGID CPU |||| TIMESTAMP FUNCTION
| | | | |||| | |
bash-1091 ( 1091) [002] d..2 64.965177: sched_switch:
From kernel v4.13 onwards, during which TGID was introduced, tracing
with systrace on all Android kernels will break (most Android kernels
have been on 4.9 with Android patches, so this issues hasn't been seen
yet). From v4.13 onwards things will break.
The chrome browser's tracing tools also embed the systrace viewer which
uses the legacy TGID format and updates to that are known to be
difficult to make.
Considering this, I suggest we make this change to the upstream kernel
and backport it to all Android kernels. I believe this feature is merged
recently enough into the upstream kernel that it shouldn't be a problem.
Also logically, IMO it makes more sense to group the TGID with the
TASK-PID and the CPU after these.
Link: http://lkml.kernel.org/r/20180626000822.113931-1-joel@joelfernandes.org
Cc: jreck@google.com
Cc: tkjos@google.com
Cc: stable@vger.kernel.org
Fixes: 441dae8f2f ("tracing: Add support for display of tgid in trace output")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
If variable length link layer headers result in a packet shorter
than dev->hard_header_len, reset the network header offset. Else
skb->mac_len may exceed skb->len after skb_mac_reset_len.
packet_sendmsg_spkt already has similar logic.
Fixes: b84bbaf7a6 ("packet: in packet_snd start writing at link layer allocation")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With commit 044e6e3d74: "ext4: don't update checksum of new
initialized bitmaps" the buffer valid bit will get set without
actually setting up the checksum for the allocation bitmap, since the
checksum will get calculated once we actually allocate an inode or
block.
If we are doing this, then we need to (re-)check the verified bit
after we take the block group lock. Otherwise, we could race with
another process reading and verifying the bitmap, which would then
complain about the checksum being invalid.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1780137
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
The pfmemalloc flag indicates that the skb was allocated from
the PFMEMALLOC reserves, and the flag is currently copied on skb
copy and clone.
However, an skb copied from an skb flagged with pfmemalloc
wasn't necessarily allocated from PFMEMALLOC reserves, and on
the other hand an skb allocated that way might be copied from an
skb that wasn't.
So we should not copy the flag on skb copy, and rather decide
whether to allow an skb to be associated with sockets unrelated
to page reclaim depending only on how it was allocated.
Move the pfmemalloc flag before headers_start[0] using an
existing 1-bit hole, so that __copy_skb_header() doesn't copy
it.
When cloning, we'll now take care of this flag explicitly,
contravening to the warning comment of __skb_clone().
While at it, restore the newline usage introduced by commit
b193722731 ("net: reorganize sk_buff for faster
__copy_skb_header()") to visually separate bytes used in
bitfields after headers_start[0], that was gone after commit
a9e419dc7b ("netfilter: merge ctinfo into nfct pointer storage
area"), and describe the pfmemalloc flag in the kernel-doc
structure comment.
This doesn't change the size of sk_buff or cacheline boundaries,
but consolidates the 15 bits hole before tc_index into a 2 bytes
hole before csum, that could now be filled more easily.
Reported-by: Patrick Talbert <ptalbert@redhat.com>
Fixes: c93bdd0e03 ("netvm: allow skb allocation to use PFMEMALLOC reserves")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bert Kenward says:
====================
sfc: filter locking fixes
Two fixes for sfc ef10 filter table locking. Initially spotted
by lockdep, but one issue has also been seen in normal use.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We should take and release the filter_sem consistently during the
reset process, in the same manner as the mac_lock and reset_lock.
For lockdep consistency we also take the filter_sem for write around
other calls to efx->type->init().
Fixes: c2bebe37c6 ("sfc: give ef10 its own rwsem in the filter table instead of filter_lock")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some situations we may end up calling down_read while already
holding the semaphore for write, thus hanging. This has been seen
when setting the MAC address for the interface. The hung task log
in this situation includes this stack:
down_read
efx_ef10_filter_insert
efx_ef10_filter_insert_addr_list
efx_ef10_filter_vlan_sync_rx_mode
efx_ef10_filter_add_vlan
efx_ef10_filter_table_probe
efx_ef10_set_mac_address
efx_set_mac_address
dev_set_mac_address
In addition, lockdep rightly points out that nested calling of
down_read is incorrect.
Fixes: c2bebe37c6 ("sfc: give ef10 its own rwsem in the filter table instead of filter_lock")
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SYSTEMPORT Lite reversed the logic compared to SYSTEMPORT, the
GIB_FCS_STRIP bit is set when the Ethernet FCS is stripped, and that bit
is not set by default. Fix the logic such that we properly check whether
that bit is set or not and we don't forward an extra 4 bytes to the
network stack.
Fixes: 44a4524c54 ("net: systemport: Add support for SYSTEMPORT Lite")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I2C clients may misunderstand recovery pulses if they can't read SDA to
bail out early. In the worst case, as a write operation. To avoid that
and if we can write SDA, try to send STOP to avoid the
misinterpretation.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Under rare conditions where repair code may be used it is possible that
window probes are either unnecessary or undesired. If the user knows that
window probes are not wanted or needed this change allows them to skip
sending them when a socket comes out of repair.
Signed-off-by: Stefan Baranoff <sbaranoff@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a bug where the sequence numbers of a socket created using
TCP repair functionality are lower than set after connect is called.
This occurs when the repair socket overlaps with a TIME-WAIT socket and
triggers the re-use code. The amount lower is equal to the number of times
that a particular IP/port set is re-used and then put back into TIME-WAIT.
Re-using the first time the sequence number is 1 lower, closing that socket
and then re-opening (with repair) a new socket with the same addresses/ports
puts the sequence number 2 lower than set via setsockopt. The third time is
3 lower, etc. I have not tested what the limit of this acrewal is, if any.
The fix is, if a socket is in repair mode, to respect the already set
sequence number and timestamp when it would have already re-used the
TIME-WAIT socket.
Signed-off-by: Stefan Baranoff <sbaranoff@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzkaller managed to trigger the following bug through fault injection:
[...]
[ 141.043668] verifier bug. No program starts at insn 3
[ 141.044648] WARNING: CPU: 3 PID: 4072 at kernel/bpf/verifier.c:1613
get_callee_stack_depth kernel/bpf/verifier.c:1612 [inline]
[ 141.044648] WARNING: CPU: 3 PID: 4072 at kernel/bpf/verifier.c:1613
fixup_call_args kernel/bpf/verifier.c:5587 [inline]
[ 141.044648] WARNING: CPU: 3 PID: 4072 at kernel/bpf/verifier.c:1613
bpf_check+0x525e/0x5e60 kernel/bpf/verifier.c:5952
[ 141.047355] CPU: 3 PID: 4072 Comm: a.out Not tainted 4.18.0-rc4+ #51
[ 141.048446] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),BIOS 1.10.2-1 04/01/2014
[ 141.049877] Call Trace:
[ 141.050324] __dump_stack lib/dump_stack.c:77 [inline]
[ 141.050324] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
[ 141.050950] ? dump_stack_print_info.cold.2+0x52/0x52 lib/dump_stack.c:60
[ 141.051837] panic+0x238/0x4e7 kernel/panic.c:184
[ 141.052386] ? add_taint.cold.5+0x16/0x16 kernel/panic.c:385
[ 141.053101] ? __warn.cold.8+0x148/0x1ba kernel/panic.c:537
[ 141.053814] ? __warn.cold.8+0x117/0x1ba kernel/panic.c:530
[ 141.054506] ? get_callee_stack_depth kernel/bpf/verifier.c:1612 [inline]
[ 141.054506] ? fixup_call_args kernel/bpf/verifier.c:5587 [inline]
[ 141.054506] ? bpf_check+0x525e/0x5e60 kernel/bpf/verifier.c:5952
[ 141.055163] __warn.cold.8+0x163/0x1ba kernel/panic.c:538
[ 141.055820] ? get_callee_stack_depth kernel/bpf/verifier.c:1612 [inline]
[ 141.055820] ? fixup_call_args kernel/bpf/verifier.c:5587 [inline]
[ 141.055820] ? bpf_check+0x525e/0x5e60 kernel/bpf/verifier.c:5952
[...]
What happens in jit_subprogs() is that kcalloc() for the subprog func
buffer is failing with NULL where we then bail out. Latter is a plain
return -ENOMEM, and this is definitely not okay since earlier in the
loop we are walking all subprogs and temporarily rewrite insn->off to
remember the subprog id as well as insn->imm to temporarily point the
call to __bpf_call_base + 1 for the initial JIT pass. Thus, bailing
out in such state and handing this over to the interpreter is troublesome
since later/subsequent e.g. find_subprog() lookups are based on wrong
insn->imm.
Therefore, once we hit this point, we need to jump to out_free path
where we undo all changes from earlier loop, so that interpreter can
work on unmodified insn->{off,imm}.
Another point is that should find_subprog() fail in jit_subprogs() due
to a verifier bug, then we also should not simply defer the program to
the interpreter since also here we did partial modifications. Instead
we should just bail out entirely and return an error to the user who is
trying to load the program.
Fixes: 1c2a088a66 ("bpf: x64: add JIT support for multi-function programs")
Reported-by: syzbot+7d427828b2ea6e592804@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2018-07-12
This series contains updates to ixgbe and e100/e1000 kernel documentation.
Alex fixes ixgbe to ensure that we are more explicit about the ordering
of updates to the receive address register (RAR) table.
Dan Carpenter fixes an issue where we were reading one element beyond
the end of the array.
Mauro Carvalho Chehab fixes formatting issues in the e100.rst and
e1000.rst that were causing errors during 'make htmldocs'.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MTD fix from Boris Brezillon:
"A SPI NOR fix to fix a timeout in the cadence QSPI controller driver"
* tag 'mtd/fixes-for-4.18-rc5' of git://git.infradead.org/linux-mtd:
mtd: spi-nor: cadence-quadspi: Fix direct mode write timeouts
Suppress warnings for systems that do not recognize LFS_*.
getconf: no such configuration parameter `LFS_CFLAGS'
getconf: no such configuration parameter `LFS_LDFLAGS'
getconf: no such configuration parameter `LFS_LIBS'
Fixes: d7f14c66c2 ("kbuild: Enable Large File Support for hostprogs")
Reported-by: Chen Feng <puck.chen@hisilicon.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Commit 0c3b7e4261 ("tools build: Add support for host programs format")
introduced host_c_flags which referenced CHOSTFLAGS. The actual name of the
variable is HOSTCFLAGS. Fix this up.
Fixes: 0c3b7e4261 ("tools build: Add support for host programs format")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
In 2016 GNU Make made a backwards incompatible change to the way '#'
characters were handled in Makefiles when used inside functions or
macros:
http://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57
Due to this change, when attempting to run `make prepare' I get a
spurious make syntax error:
/home/earnest/linux/tools/objtool/.fixdep.o.cmd:1: *** missing separator. Stop.
When inspecting `.fixdep.o.cmd' it includes two lines which use
unescaped comment characters at the top:
\# cannot find fixdep (/home/earnest/linux/tools/objtool//fixdep)
\# using basic dep data
This is because `tools/build/Build.include' prints these '\#'
characters:
printf '\# cannot find fixdep (%s)\n' $(fixdep) > $(dot-target).cmd; \
printf '\# using basic dep data\n\n' >> $(dot-target).cmd; \
This completes commit 9564a8cf42 ("Kbuild: fix # escaping in .cmd files
for future Make").
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197847
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Documentation/networking/e1000.rst:83: ERROR: Unexpected indentation.
Documentation/networking/e1000.rst:84: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/networking/e1000.rst:173: WARNING: Definition list ends without a blank line; unexpected unindent.
Documentation/networking/e1000.rst:236: WARNING: Definition list ends without a blank line; unexpected unindent.
While here, fix highlights and mark a table as such.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
[Why]
When a dce100 asic was suspended, the clocks were not set to 0.
Upon resume, the new clock was compared to the existing clock,
they were found to be the same, and so the clock was not set.
This resulted in a pernicious blackscreen.
[How]
In atomic commit, check to see if there are any active pipes.
If no, set clocks to 0
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The driver is expecting clock frequency in kHz, while SMU returns
the values in 10kHz, which causes the bandwidth validation to fail
4.18 has the faulty clock assignment in pp_to_dc_clock_levels_with_latency
only, which is only used by Vega. Make sure we multiply these values
by 10 here, as we do for other ASICs as powerplay assigned them
wrong. 4.19 has the proper fix in powerplay.
v2: Add Fixes tag
v3: Fixes -> Bugzilla, with simplified link
Bugzilla: https://bugs.freedesktop.org/107082
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This change makes it so that we are much more explicit about the ordering
of updates to the receive address register (RAR) table. Prior to this patch
I believe we may have been updating the table while entries were still
active, or possibly allowing for reordering of things since we weren't
explicitly flushing writes to either the lower or upper portion of the
register prior to accessing the other half.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The AM3517 has a different OTG controller location than the OMAP3,
which is included from omap3.dtsi. This results in a hwmod error.
Since the AM3517 has a different OTG controller address, this patch
disabes one that is isn't available.
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
python interface fixes:
- Make 'perf script -g python' generate scripts that are compatible
with both python 2 and 3 (Jeremy Cline)
- Fix python dictionary reference counting (Janne Huttunen)
- Add python3 support for various python scripts (Jeremy Cline)
- Use python-config --includes rather than --cflags, fixing the build
on Fedora, where the python 3.7 started adding -flto to what
perf stat fixes:
- Remove needless extra header line in --interval_clear (Jiri Olsa)
python-config --cflags generate, breaking the perf build (Jeremy Cline)
Build fixes:
- Fix compilation errors on gcc8 (Jiri Olsa)
perf llvm-utils fixes:
- Remove bashism from kernel include fetch script (Kim Phillips)
perf test fixes: (Kim Phillips)
- Replace '|&' with '2>&1 |' to work with more shells
- Make perf's inet_pton test more portable
- Prevent temporary editor files from being considered test scripts
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Call secure services to enable ACTLR[0] (Enable invalidates of BTB with
ICIALLU) when branch hardening is enabled for kernel.
On GP devices OMAP5/DRA7, there is no possibility to update secure
side since "secure world" is ROM and there are no override mechanisms
possible. On HS devices, appropriate PPA should do the workarounds as
well.
However, the configuration is only done for secondary core, since it is
expected that firmware/bootloader will have enabled the required
configuration for the primary boot core (note: bootloaders typically
will NOT enable secondary processors, since it has no need to do so).
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
When removing the global bit from __supported_pte_mask do the same for
__default_kernel_pte_mask in order to avoid the WARN_ONCE() in
check_pgprot() when setting a kernel pte before having called
init_mem_mapping().
Cc: <stable@vger.kernel.org> # 4.17
Reported-by: Michael Young <m.a.young@durham.ac.uk>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Stefan Schmidt says:
====================
pull-request: ieee802154 for net 2018-07-11
An update from ieee802154 for your *net* tree.
Build system fix for a missing include from Arnd Bergmann.
Setting the IFLA_LINK for the lowpan parent from Lubomir Rintel.
Fixes for some RX corner cases in adf7242 driver by Michael Hennerich.
And some small patches to cleanup our BUG_ON vs WARN_ON usage.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The PCI subsystem in question for this quirk rule has been
identified as a Gigabyte GA-Z170X-Gaming 7 motherboard. Set the
device name appropriately.
Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Reviewed-by: Connor McAdams <conmanx360@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
These motherboards have Sound Core3D and apparently "support"
Recon3Di. Added to the quirk list as QUIRK_R3DI.
Issue report, PCI Subsystem ID, and testing by a contributor on
IRC who wished to remain anonymous.
Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Reviewed-by: Connor McAdams <conmanx360@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Trivial fix to spelling mistake in qed_probe message.
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nvme driver specific structures need to be initialized prior to
enabling the generic controller so we can unwind on failure with out
using the reference counting callbacks so that 'probe' and 'remove'
can be symmetric.
The newly added iod_mempool is the only resource that was being
allocated out of order, and a failure there would leak the generic
controller memory. This patch just moves that allocation above the
controller initialization.
Fixes: 943e942e62 ("nvme-pci: limit max IO size and segments to avoid high order allocations")
Reported-by: Weiping Zhang <zwp10758@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
It was been observed that with a particular order of initialisation,
the netdev can be up, but the SFP module still has its TX_DISABLE
signal asserted. This occurs when the network device brought up before
the SFP kernel module has been inserted by userspace.
This occurs because sfp-bus layer does not hear about the change in
network device state, and so assumes that it is still down. Set
netdev->sfp when the upstream is registered to work around this problem.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We fail to correctly clean up after a bus registration failure, which
can lead to an incorrect assumption about the registration state of
the upstream or sfp cage.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
sykzaller triggered several panics similar to the below:
[...]
[ 248.851531] BUG: KASAN: use-after-free in _copy_to_user+0x5c/0x90
[ 248.857656] Read of size 985 at addr ffff8808017ffff2 by task a.out/1425
[...]
[ 248.865902] CPU: 1 PID: 1425 Comm: a.out Not tainted 4.18.0-rc4+ #13
[ 248.865903] Hardware name: Supermicro SYS-5039MS-H12TRF/X11SSE-F, BIOS 2.1a 03/08/2018
[ 248.865905] Call Trace:
[ 248.865910] dump_stack+0xd6/0x185
[ 248.865911] ? show_regs_print_info+0xb/0xb
[ 248.865913] ? printk+0x9c/0xc3
[ 248.865915] ? kmsg_dump_rewind_nolock+0xe4/0xe4
[ 248.865919] print_address_description+0x6f/0x270
[ 248.865920] kasan_report+0x25b/0x380
[ 248.865922] ? _copy_to_user+0x5c/0x90
[ 248.865924] check_memory_region+0x137/0x190
[ 248.865925] kasan_check_read+0x11/0x20
[ 248.865927] _copy_to_user+0x5c/0x90
[ 248.865930] bpf_test_finish.isra.8+0x4f/0xc0
[ 248.865932] bpf_prog_test_run_skb+0x6a0/0xba0
[...]
After scrubbing the BPF prog a bit from the noise, turns out it called
bpf_skb_change_head() for the lwt_xmit prog with headroom of 2. Nothing
wrong in that, however, this was run with repeat >> 0 in bpf_prog_test_run_skb()
and the same skb thus keeps changing until the pskb_expand_head() called
from skb_cow() keeps bailing out in atomic alloc context with -ENOMEM.
So upon return we'll basically have 0 headroom left yet blindly do the
__skb_push() of 14 bytes and keep copying data from there in bpf_test_finish()
out of bounds. Fix to check if we have enough headroom and if pskb_expand_head()
fails, bail out with error.
Another bug independent of this fix (but related in triggering above) is
that BPF_PROG_TEST_RUN should be reworked to reset the skb/xdp buffer to
it's original state from input as otherwise repeating the same test in a
loop won't work for benchmarking when underlying input buffer is getting
changed by the prog each time and reused for the next run leading to
unexpected results.
Fixes: 1cf1cae963 ("bpf: introduce BPF_PROG_TEST_RUN command")
Reported-by: syzbot+709412e651e55ed96498@syzkaller.appspotmail.com
Reported-by: syzbot+54f39d6ab58f39720a55@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Dynamic ftrace requires modifying the code segments that are usually
set to read-only. To do this, a per arch function is called both before
and after the ftrace modifications are performed. The "before" function
will set kernel code text to read-write to allow for ftrace to make the
modifications, and the "after" function will set the kernel code text
back to "read-only" to keep the kernel code text protected.
The issue happens when dynamic ftrace is tested at boot up. The test is
done before the kernel code text has been set to read-only. But the
"before" and "after" calls are still performed. The "after" call will
change the kernel code text to read-only prematurely, and other boot
code that expects this code to be read-write will fail.
The solution is to add a variable that is set when the kernel code text
is expected to be converted to read-only, and make the ftrace "before"
and "after" calls do nothing if that variable is not yet set. This is
similar to the x86 solution from commit 1623963097 ("ftrace, x86:
make kernel text writable only for conversions").
Link: http://lkml.kernel.org/r/20180620212906.24b7b66e@vmware.local.home
Reported-by: Stefan Agner <stefan@agner.ch>
Tested-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
When extracting bitfield from a number, btf_int_bits_seq_show() builds
a mask and accesses least significant byte of the number in a way
specific to little-endian. This patch fixes that by checking endianness
of the machine and then shifting left and right the unneeded bits.
Thanks to Martin Lau for the help in navigating potential pitfalls when
dealing with endianess and for the final solution.
Fixes: b00b8daec8 ("bpf: btf: Add pretty print capability for data with BTF type info")
Signed-off-by: Okash Khawaja <osk@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
bpf_lwt_seg6_* helpers require CONFIG_IPV6_SEG6_BPF, and currently
return -EOPNOTSUPP to indicate unavailability. This patch forces the
BPF verifier to reject programs using these helpers when
!CONFIG_IPV6_SEG6_BPF, allowing users to more easily probe if they are
available or not.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fix memory leak in the error path of mlx5_ib_create_srq() by making sure
to free the allocated srq.
Fixes: c2b37f7648 ("IB/mlx5: Fix integer overflows in mlx5_ib_create_srq")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Pull kprobe fix from Steven Rostedt:
"This fixes a memory leak in the kprobe code"
* tag 'trace-v4.18-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobe: Release kprobe print_fmt properly
Pull libata fixes from Tejun Heo:
- Jens's patches to expand the usable command depth from 31 to 32 broke
sata_fsl due to a subtle command iteration bug. Fixed by introducing
explicit iteration helpers and using the correct variant.
- On some laptops, enabling LPM by default reportedly led to occasional
hard hangs. Blacklist the affected cases.
- Other misc fixes / changes.
* 'for-4.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: Remove depends on HAS_DMA in case of platform dependency
ata: Fix ZBC_OUT all bit handling
ata: Fix ZBC_OUT command block check
ahci: Add Intel Ice Lake LP PCI ID
ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
sata_nv: remove redundant pointers sdev0 and sdev1
sata_fsl: remove dead code in tag retrieval
sata_fsl: convert to command iterator
libata: convert eh to command iterators
libata: add command iterator helpers
ata: ahci_mvebu: ahci_mvebu_stop_engine() can be static
libahci: Fix possible Spectre-v1 pmp indexing in ahci_led_store()
mdev_access() calls mbochs_get_page() with mdev_state->ops_lock held,
while mbochs_get_page() locks the mutex by itself.
It leads to unavoidable deadlock.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
mprotect(EXEC) was failing for stack mappings as default vm flags was
missing MAYEXEC.
This was triggered by glibc test suite nptl/tst-execstack testcase
What is surprising is that despite running LTP for years on, we didn't
catch this issue as it lacks a directed test case.
gcc dejagnu tests with nested functions also requiring exec stack work
fine though because they rely on the GNU_STACK segment spit out by
compiler and handled in kernel elf loader.
This glibc case is different as the stack is non exec to begin with and
a dlopen of shared lib with GNU_STACK segment triggers the exec stack
proceedings using a mprotect(PROT_EXEC) which was broken.
CC: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Incremental patch to fix the unchecked dereference in acpi_nfit_ctl.
Reported by Dan Carpenter:
"acpi/nfit: fix cmd_rc for acpi_nfit_ctl to
always return a value" from Jun 28, 2018, leads to the following
Smatch complaint:
drivers/acpi/nfit/core.c:578 acpi_nfit_ctl()
warn: variable dereferenced before check 'cmd_rc' (see line 411)
drivers/acpi/nfit/core.c
410
411 *cmd_rc = -EINVAL;
^^^^^^^^^^^^^^^^^^
Patch adds unchecked dereference.
Fixes: c1985cefd8 ("acpi/nfit: fix cmd_rc for acpi_nfit_ctl to always return a value")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Pull char/misc fixes from Greg KH:
"Here are a few char/misc driver fixes for 4.18-rc5.
The "largest" stuff here is fixes for the UIO changes in 4.18-rc1 that
caused breakages for some people. Thanks to Xiubo Li for fixing them
quickly. Other than that, minor fixes for thunderbolt, vmw_balloon,
nvmem, mei, ibmasm, and mei drivers. There's also a MAINTAINERS update
where Rafael is offering to help out with reviewing driver core
patches.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
nvmem: Don't let a NULL cell_id for nvmem_cell_get() crash us
thunderbolt: Notify userspace when boot_acl is changed
uio: fix crash after the device is unregistered
uio: change to use the mutex lock instead of the spin lock
uio: use request_threaded_irq instead
fpga: altera-cvp: Fix an error handling path in 'altera_cvp_probe()'
ibmasm: don't write out of bounds in read handler
MAINTAINERS: Add myself as driver core changes reviewer
mei: discard messages from not connected client during power down.
vmw_balloon: fix inflation with batching
Pull staging fixes from Greg KH:
"Here are two tiny staging driver fixes for reported issues for
4.18-rc5.
One fixes the r8822be driver to properly work on lots of new laptops,
the other is for the rtl8723bs driver to fix an underflow error.
Both have been in linux-next for a while with no reported issues"
* tag 'staging-4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: r8822be: Fix RTL8822be can't find any wireless AP
staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data().
Pull USB fixes from Greg KH:
"Here are a number of small USB fixes for 4.18-rc5.
Nothing major here, just the normal set of new device ids, xhci fixes,
and some typec fixes. The typec fix required some tiny changes in an
i2c driver, which that maintainer acked to come through my tree.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: yurex: fix out-of-bounds uaccess in read handler
usb: quirks: add delay quirks for Corsair Strafe
xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
usb/gadget: aspeed-vhub: add USB_LIBCOMPOSITE dependency
docs: kernel-parameters.txt: document xhci-hcd.quirks parameter
USB: serial: mos7840: fix status-register error handling
USB: serial: keyspan_pda: fix modem-status error handling
USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
USB: serial: ch341: fix type promotion bug in ch341_control_in()
i2c-cht-wc: Fix bq24190 supplier
typec: tcpm: Correctly report power_supply current and voltage for non pd supply
usb: xhci: dbc: Don't decrement runtime PM counter if DBC is not started
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fixup devname in /proc/interrupts for card detect GPIO
MMC host:
- sdhci-esdhc-imx: Allow 1.8V speed-modes without 100/200MHz pinctrls
- sunxi: Disable IRQ in low power state to prevent IRQ storm
- dw_mmc: Fix card threshold control configuration
- renesas_sdhi_internal_dmac: Fixup DMA error paths"
* tag 'mmc-v4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states
mmc: sunxi: Disable irq during pm_suspend
mmc: dw_mmc: fix card threshold control configuration
mmc: core: cd_label must be last entry of mmc_gpio struct
mmc: renesas_sdhi_internal_dmac: Cannot clear the RX_IN_USE in abort
mmc: renesas_sdhi_internal_dmac: Fix missing unmap in error patch
Pull ACPI fix from Rafael Wysocki:
"Address a regression in ACPICA that ceased to clear the status of GPEs
and fixed events before entering the ACPI S5 (off) system state during
the 4.17 cycle which caused some systems to power up immediately after
they had been turned off"
* tag 'acpi-4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Clear status of all events when entering S5
The HPLL can be configured through a register (SCU24), however some
platforms chose to configure it through the strapping settings and do
not use the register. This was not noticed as the logic for bit 18 in
SCU24 was confused: set means programmed, but the driver read it as set
means strapped.
This gives us the correct HPLL value on Palmetto systems, from which
most of the peripheral clocks are generated.
Fixes: 5eda5d79e4 ("clk: Add clock driver for ASPEED BMC SoCs")
Cc: stable@vger.kernel.org # v4.15
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
It does not matter if the caller of may_use_simd() migrates to
another cpu after the call, but it is still important that the
kernel_neon_busy percpu instance that is read matches the cpu the
task is running on at the time of the read.
This means that raw_cpu_read() is not sufficient. kernel_neon_busy
may appear true if the caller migrates during the execution of
raw_cpu_read() and the next task to be scheduled in on the initial
cpu calls kernel_neon_begin().
This patch replaces raw_cpu_read() with this_cpu_read() to protect
against this race.
Cc: <stable@vger.kernel.org>
Fixes: cb84d11e16 ("arm64: neon: Remove support for nested or hardirq kernel-mode NEON")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Yandong Zhao <yandong77520@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Fix a regression introduced in Linux kernel 4.17 where sending a SCSI
command that does not transfer data (such as TEST UNIT READY) via
/dev/bsg/* results in EINVAL.
Fixes: 17cb960f29 ("bsg: split handling of SCSI CDBs vs transport requeues")
Cc: <stable@vger.kernel.org> # 4.17+
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Builds started failing in Fedora on Python 3.7 with:
`.gnu.debuglto_.debug_macro' referenced in section
`.gnu.debuglto_.debug_macro' of
util/scripting-engines/trace-event-python.o: defined in discarded
section
In Fedora, Python 3.7 added -flto to the list of --cflags and since it
was only applied to util/scripting-engines/trace-event-python.c and
scripts/python/Perf-Trace-Util/Context.c, linking failed.
It's not the first time the addition of flags has broken builds: commit
c6707fdef7 ("perf tools: Fix up build in hardnened environments")
appears to have fixed a similar problem. "python-config --includes"
provides the proper -I flags and doesn't introduce additional CFLAGS.
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180710154612.6285-1-jcline@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The dictionaries are attached to the parameter tuple that steals the
references and takes care of releasing them when appropriate. The code
should not decrement the reference counts explicitly. E.g. if libpython
has been built with reference debugging enabled, the superfluous DECREFs
will trigger this error when running perf script:
Fatal Python error: Objects/tupleobject.c:238 object at
0x7f10f2041b40 has negative ref count -1
Aborted (core dumped)
If the reference debugging is not enabled, the superfluous DECREFs might
cause the dict objects to be silently released while they are still in
use. This may trigger various other assertions or just cause perf
crashes and/or weird and unexpected data changes in the stored Python
objects.
Signed-off-by: Janne Huttunen <janne.huttunen@nokia.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jaroslav Skarvada <jskarvad@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1531133990-17485-1-git-send-email-janne.huttunen@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We are getting following warnings on gcc8 that break compilation:
$ make
CC jvmti/jvmti_agent.o
jvmti/jvmti_agent.c: In function ‘jvmti_open’:
jvmti/jvmti_agent.c:252:35: error: ‘/jit-’ directive output may be truncated \
writing 5 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
snprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
There's no point in checking the result of snprintf call in
jvmti_open, the following open call will fail in case the
name is mangled or too long.
Using tools/lib/ function scnprintf that touches the return value from
the snprintf() calls and thus get rid of those warnings.
$ make DEBUG=1
CC arch/x86/util/perf_regs.o
arch/x86/util/perf_regs.c: In function ‘arch_sdt_arg_parse_op’:
arch/x86/util/perf_regs.c:229:4: error: ‘strncpy’ output truncated before terminating nul
copying 2 bytes from a string of the same length [-Werror=stringop-truncation]
strncpy(prefix, "+0", 2);
^~~~~~~~~~~~~~~~~~~~~~~~
Using scnprintf instead of the strncpy (which we know is safe in here)
to get rid of that warning.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180702134202.17745-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Debian based systems such as Ubuntu have dash as their default shell.
Even if the normal or root user's shell is bash, certain scripts still
call /bin/sh, which points to dash, so we fix this perf test by
rewriting it in a more portable way.
BEFORE:
$ sudo perf test -v 64
64: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 31942
./tests/shell/record+probe_libc_inet_pton.sh: 18: ./tests/shell/record+probe_libc_inet_pton.sh: expected[0]=ping[][0-9 \.:]+probe_libc:inet_pton: \([[:xdigit:]]+\): not found
./tests/shell/record+probe_libc_inet_pton.sh: 19: ./tests/shell/record+probe_libc_inet_pton.sh: expected[1]=.*inet_pton\+0x[[:xdigit:]]+[[:space:]]\(/lib/x86_64-linux-gnu/libc-2.27.so|inlined\)$: not found
./tests/shell/record+probe_libc_inet_pton.sh: 29: ./tests/shell/record+probe_libc_inet_pton.sh: expected[2]=getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\(/lib/x86_64-linux-gnu/libc-2.27.so\)$: not found
./tests/shell/record+probe_libc_inet_pton.sh: 30: ./tests/shell/record+probe_libc_inet_pton.sh: expected[3]=.*\+0x[[:xdigit:]]+[[:space:]]\(.*/bin/ping.*\)$: not found
ping 31963 [004] 83577.670613: probe_libc:inet_pton: (7fe15f87f4b0)
./tests/shell/record+probe_libc_inet_pton.sh: 39: ./tests/shell/record+probe_libc_inet_pton.sh: Bad substitution
./tests/shell/record+probe_libc_inet_pton.sh: 41: ./tests/shell/record+probe_libc_inet_pton.sh: Bad substitution
test child finished with -2
---- end ----
probe libc's inet_pton & backtrace it with ping: Skip
AFTER:
$ sudo perf test -v 64
64: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 32277
ping 32295 [001] 83679.690020: probe_libc:inet_pton: (7ff244f504b0)
7ff244f504b0 __GI___inet_pton+0x0 (/lib/x86_64-linux-gnu/libc-2.27.so)
7ff244f14ce4 getaddrinfo+0x124 (/lib/x86_64-linux-gnu/libc-2.27.so)
556ac036b57d _init+0xb75 (/bin/ping)
test child finished with 0
---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180629124643.2089b3ce59960eba34e87b27@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since we do not specify bash (and/or zsh) as a requirement, use the
standard error redirection that is more widely supported.
BEFORE:
$ sudo perf test -v 62
62: Check open filename arg using perf trace + vfs_getname:
--- start ---
test child forked, pid 27305
./tests/shell/trace+probe_vfs_getname.sh: 20: ./tests/shell/trace+probe_vfs_getname.sh: Syntax error: "&" unexpected
test child finished with -2
---- end ----
Check open filename arg using perf trace + vfs_getname: Skip
AFTER:
$ sudo perf test -v 62
64: Check open filename arg using perf trace + vfs_getname :
--- start ---
test child forked, pid 23008
Added new event:
probe:vfs_getname (on getname_flags:72 with pathname=result->name:string)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_getname -aR sleep 1
0.361 ( 0.008 ms): touch/23032 openat(dfd: CWD, filename: /tmp/temporary_file.VEh0n, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 4
test child finished with 0
---- end ----
Check open filename arg using perf trace + vfs_getname: Ok
Similar to commit 35435cd060, with the same title.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180629124633.0a9f4bea54b8d2c28f265de2@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Hans de Goede reported that his mixed EFI mode Bay Trail tablet
would not boot at all any more, but enter a reboot loop without
any logs printed by the kernel.
Unbreak 64-bit Linux/x86 on 32-bit UEFI:
When it was first introduced, the EFI stub code that copies the
contents of PCI option ROMs originally only intended to do so if
the EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM attribute was *not* set.
The reason was that the UEFI spec permits PCI option ROM images
to be provided by the platform directly, rather than via the ROM
BAR, and in this case, the OS can only access them at runtime if
they are preserved at boot time by copying them from the areas
described by PciIo->RomImage and PciIo->RomSize.
However, it implemented this check erroneously, as can be seen in
commit:
dd5fc854de ("EFI: Stash ROMs if they're not in the PCI BAR")
which introduced:
if (!attributes & EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM)
continue;
and given that the numeric value of EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM
is 0x4000, this condition never becomes true, and so the option ROMs
were copied unconditionally.
This was spotted and 'fixed' by commit:
886d751a2e ("x86, efi: correct precedence of operators in setup_efi_pci")
but inadvertently inverted the logic at the same time, defeating
the purpose of the code, since it now only preserves option ROM
images that can be read from the ROM BAR as well.
Unsurprisingly, this broke some systems, and so the check was removed
entirely in the following commit:
739701888f ("x86, efi: remove attribute check from setup_efi_pci")
It is debatable whether this check should have been included in the
first place, since the option ROM image provided to the UEFI driver by
the firmware may be different from the one that is actually present in
the card's flash ROM, and so whatever PciIo->RomImage points at should
be preferred regardless of whether the attribute is set.
As this was the only use of the attributes field, we can remove
the call to PciIo->Attributes() entirely, which is especially
nice because its prototype involves uint64_t type by-value
arguments which the EFI mixed mode has trouble dealing with.
Any mixed mode system with PCI is likely to be affected.
Tested-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180711090235.9327-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Greg reported that commit 3c24121039 ("ARM: 8756/1: NOMMU: Postpone
MPU activation till __after_proc_init") is causing breakage for the
old Versatile platform in no-MMU mode (with out-of-tree patches):
AS arch/arm/kernel/head-nommu.o
arch/arm/kernel/head-nommu.S: Assembler messages:
arch/arm/kernel/head-nommu.S:180: Error: selected processor does not support `isb' in ARM mode
scripts/Makefile.build:417: recipe for target 'arch/arm/kernel/head-nommu.o' failed
make[2]: *** [arch/arm/kernel/head-nommu.o] Error 1
Makefile:1034: recipe for target 'arch/arm/kernel' failed
make[1]: *** [arch/arm/kernel] Error 2
Since the code is common for all NOMMU builds usage of the isb was a
bad idea (please, note that isb also used in MPU related code which is
fine because MPU has dependency on CPU_V7/CPU_V7M), instead use more
robust instr_sync assembler macro.
Fixes: 3c24121039 ("ARM: 8756/1: NOMMU: Postpone MPU activation till __after_proc_init")
Reported-by: Greg Ungerer <gerg@kernel.org>
Tested-by: Greg Ungerer <gerg@kernel.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
commit cd43c221bb ("scsi: cxlflash: Isolate external module
dependencies") introduced the use of ifdefs to avoid compilation errors
when one of the possible backend driver, CXL or OCXL, is not compiled.
Unfortunately, the wrong defines are used and the backend ops are never
assigned, leading to a kernel crash in any case when the cxlflash module is
loaded.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In case of iSCSI offload BFS environment, MFW requires to mark virtual
link based upon qedi load status.
Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The original complaint was the lsscsi -t showed the same SAS address of the
two enclosures (SEP devices). In fact the SAS address was being set to the
Enclosure Logical Identifier (ELI).
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the description of sd_zbc_check_zone_size() to correctly explain that
the returned value is a number of device blocks, not bytes. Additionally,
the 32 bits "ret" variable used in this function may truncate the 64 bits
zone_blocks variable value upon return. To fix this, change "ret" type to
s64.
Fixes: ccce20fc79 ("sd_zbc: Avoid that resetting a zone fails sporadically")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: stable@kernel.org
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
GPNFT command allocates 2 buffer for switch query. On completion, the same
buffers were freed using different size, instead of using original size at
the time of allocation.
This patch saves the size of the request and response buffers and uses that
to free them.
Following stack trace can be seen when using debug kernel
dump_stack+0x19/0x1b
__warn+0xd8/0x100
warn_slowpath_fmt+0x5f/0x80
check_unmap+0xfb/0xa20
debug_dma_free_coherent+0x110/0x160
qla24xx_sp_unmap+0x131/0x1e0 [qla2xxx]
qla24xx_async_gnnft_done+0xb6/0x550 [qla2xxx]
qla2x00_do_work+0x1ec/0x9f0 [qla2xxx]
Cc: <stable@vger.kernel.org> # v4.17+
Fixes: 33b28357dd ("scsi: qla2xxx: Fix Async GPN_FT for FCP and FC-NVMe scan")
Reported-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull MIPS fixes from Paul Burton:
"A couple more MIPS fixes for 4.18:
- Use async IPIs for arch_trigger_cpumask_backtrace() in order to
avoid warnings & deadlocks, fixing a problem introduced in v3.19
with the fix trivial to backport as far as v4.9.
- Fix ioremap()'s MMU/TLB backed path to avoid spuriously rejecting
valid requests due to an incorrect belief that the memory region is
backed by potentially-in-use RAM. This fixes a regression in v4.2"
* tag 'mips_fixes_4.18_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Fix ioremap() RAM check
MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
MIPS: Call dump_stack() from show_regs()
Problem: When PD/PT update made by CPU root PD was not yet mapped causing
page fault.
Fix: Verify root PD is mapped into CPU address space.
v2:
Make sure that we add the root PD to the relocated list
since then it's get mapped into CPU address space bt default
in amdgpu_vm_update_directories.
v3:
Drop change to not move kernel type BOs to evicted list.
v4:
Remove redundant bo move to relocated list.
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be
fine. However, cpu_possible_mask is more accurate and if there are other
clockevent source in the system which are set to cpu_possible_mask, then
having cpu_all_mask may result in issue.
E.g. on a platform with arm,sp804 timer with rating 300 and
cpu_possible_mask and this arch_mem_timer timer with rating 400 and
cpu_all_mask, tick_check_preferred may choose both preferred as the
cpumasks are not equal though they must be.
This issue was root caused incorrectly initially and a fix was merged as
commit 1332a90558 ("tick: Prefer a lower rating device only if it's CPU
local device").
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/1531151136-18297-2-git-send-email-sudeep.holla@arm.com
This reverts commit 018d82e5f0.
This breaks DDC in certain cases. Revert for 4.18 and previous kernels.
For 4.19, this is fixed with the following more extensive patches:
drm/amd/display: Serialize is_dp_sink_present
drm/amd/display: Break out function to simply read aux reply
drm/amd/display: Return aux replies directly to DRM
drm/amd/display: Right shift AUX reply value sooner than later
drm/amd/display: Read AUX channel even if only status byte is returned
Link: https://lists.freedesktop.org/archives/amd-gfx/2018-July/023788.html
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Pull drm fixes from Dave Airlie:
"This just contains some etnaviv fixes and a MAINTAINERS update for the
new drm tree locations"
* tag 'drm-fixes-2018-07-10' of git://anongit.freedesktop.org/drm/drm:
MAINTAINERS: update drm tree
drm/etnaviv: bring back progress check in job timeout handler
drm/etnaviv: Fix driver unregistering
drm/etnaviv: Check for platform_device_register_simple() failure
Commit 52cdbdd498 (driver core: correct device's shutdown order)
introduced a regression by breaking device shutdown on some systems.
Namely, the devices_kset_move_last() call in really_probe() added by
that commit is a mistake as it may cause parents to follow children
in the devices_kset list which then causes shutdown to fail. For
example, if a device has children before really_probe() is called
for it (which is not uncommon), that call will cause it to be
reordered after the children in the devices_kset list and the
ordering of that list will not reflect the correct device shutdown
order any more.
Also it causes the devices_kset list to be constantly reordered
until all drivers have been probed which is totally pointless
overhead in the majority of cases and it only covered an issue
with system shutdown, while system-wide suspend/resume potentially
had the same issue on the affected platforms (which was not covered).
Moreover, the shutdown issue originally addressed by the change in
really_probe() made by commit 52cdbdd498 is not present in 4.18-rc
any more, since dra7 started to use the sdhci-omap driver which
doesn't disable any regulators during shutdown, so the really_probe()
part of commit 52cdbdd498 can be safely reverted. [The original
issue was related to the omap_hsmmc driver used by dra7 previously.]
For the above reasons, revert the really_probe() modifications made
by commit 52cdbdd498.
The other code changes made by commit 52cdbdd498 are useful and
they need not be reverted.
Fixes: 52cdbdd498 (driver core: correct device's shutdown order)
Link: https://lore.kernel.org/lkml/CAFgQCTt7VfqM=UyCnvNFxrSw8Z6cUtAi3HUwR4_xPAc03SgHjQ@mail.gmail.com/
Reported-by: Pingfan Liu <kernelfans@gmail.com>
Tested-by: Pingfan Liu <kernelfans@gmail.com>
Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mark reported that syzkaller triggered a KASAN detected slab-out-of-bounds
bug in ___bpf_prog_run() with a BPF_LD | BPF_ABS word load at offset 0x8001.
After further investigation it became clear that the issue was the
BPF_LDX_MEM() which takes offset as an argument whereas it cannot encode
larger than S16_MAX offsets into it. For this synthetical case we need to
move the full address into tmp register instead and do the LDX without
immediate value.
Fixes: e0cea7ce98 ("bpf: implement ld_abs/ld_ind in native bpf")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This reverts commit 38fc424867.
Distributions such as Fedora and Debian do not package the ELF linker
scripts with their toolchains, resulting in kernel build failures such
as:
| CHK include/generated/compile.h
| LD [M] arch/arm64/crypto/sha512-ce.o
| aarch64-linux-gnu-ld: cannot open linker script file ldscripts/aarch64elf.xr: No such file or directory
| make[1]: *** [scripts/Makefile.build:530: arch/arm64/crypto/sha512-ce.o] Error 1
| make: *** [Makefile:1029: arch/arm64/crypto] Error 2
Revert back to the linux targets for now, adding a comment to the Makefile
so we don't accidentally break this in the future.
Cc: Paul Kocialkowski <contact@paulk.fr>
Cc: <stable@vger.kernel.org>
Fixes: 38fc424867 ("arm64: Use aarch64elf and aarch64elfb emulation mode variants")
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The below path error can occur:
# ./xdp2skb_meta.sh --dev eth0 --list
./xdp2skb_meta.sh: line 61: /usr/sbin/tc: No such file or directory
So just use command names instead of absolute paths of tc and ip.
In addition, it allow callers to redefine $TC and $IP paths
Fixes: 36e04a2d78 ("samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB")
Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The inline data code was updating the raw inode directly; this is
problematic since if metadata checksums are enabled,
ext4_mark_inode_dirty() must be called to update the inode's checksum.
In addition, the jbd2 layer requires that get_write_access() be called
before the metadata buffer is modified. Fix both of these problems.
https://bugzilla.kernel.org/show_bug.cgi?id=200443
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Cast *tmp* and *nb_base* to u64 in order to give the compiler
complete information about the proper arithmetic to use.
Notice that such variables are used in contexts that expect
expressions of type u64 (64 bits, unsigned) and the following
expressions are currently being evaluated using 32-bit arithmetic:
tmp << 25
nb_base << 25
Addresses-Coverity-ID: 200586 ("Unintentional integer overflow")
Addresses-Coverity-ID: 200587 ("Unintentional integer overflow")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Use new return type vm_fault_t for fault handler. For now,
this is just documenting that the function returns a
VM_FAULT value rather than an errno. Once all instances are
converted, vm_fault_t will become a distinct type.
Ref-> commit 1c8f422059 ("mm: change return type to
vm_fault_t") was added in 4.17-rc1 to introduce the new
typedef vm_fault_t. Currently we are making change to all
drivers to return vm_fault_t for page fault handlers. As
part of that char/agp driver is also getting changed to
return vm_fault_t type from fault handler.
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Pull HID fixes from Jiri Kosina:
- spectrev1 pattern fix in hiddev from Gustavo A. R. Silva
- bounds check fix for hid-debug from Daniel Rosenberg
- regression fix for HID autobinding from Benjamin Tissoires
- removal of excessive logging from i2c-hid driver from Jason Andryuk
- fix specific to 2nd generation of Wacom Intuos devices from Jason
Gerecke
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: hiddev: fix potential Spectre v1
HID: i2c-hid: Fix "incomplete report" noise
HID: wacom: Correct touch maximum XY of 2nd-gen Intuos
HID: debug: check length before copy_to_user()
HID: core: allow concurrent registration of drivers
Update my TDA998x HDMI encoder MAINTAINERS entry to include the
dt-bindings header, and a keyword pattern to catch patches containing
the DT compatible. Also change the status to "maintained" rather than
"supported".
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Chan says:
====================
bnxt_en: Bug fixes.
These are bug fixes in error code paths, TC Flower VLAN TCI flow
checking bug fix, proper filtering of Broadcast packets if IFF_BROADCAST
is not set, and a bug fix in bnxt_get_max_rings() to return 0 ring
parameters when the return value is -ENOMEM.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix bug in the error code path when bnxt_request_irq() returns failure.
bnxt_disable_napi() should not be called in this error path because
NAPI has not been enabled yet.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling bnxt_set_max_func_irqs() to modify the max IRQ count requested or
freed by the RDMA driver is flawed. The max IRQ count is checked when
re-initializing the IRQ vectors and this can happen multiple times
during ifup or ethtool -L. If the max IRQ is reduced and the RDMA
driver is operational, we may not initailize IRQs correctly. This
problem shows up on VFs with very small number of MSIX.
There is no other logic that relies on the IRQ count excluding the ones
used by RDMA. So we fix it by just removing the call to subtract or
add the IRQs used by RDMA.
Fixes: a588e4580a ("bnxt_en: Add interface to support RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the driver assumes IFF_BROADCAST is always set and always sets
the broadcast filter. Modify the code to set or clear the broadcast
filter according to the IFF_BROADCAST flag.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code returns -ENOMEM and does not bother to set the output
parameters to 0 when no rings are available. Some callers, such as
bnxt_get_channels() will display garbage ring numbers when that happens.
Fix it by always setting the output parameters.
Fixes: 6e6c5a57fb ("bnxt_en: Modify bnxt_get_max_rings() to support shared or non shared rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there aren't enough RX rings available, the driver will attempt to
use a single RX ring without the aggregation ring. If that also
fails, the BNXT_FLAG_AGG_RINGS flag is cleared but the other ring
parameters are not set consistently to reflect that. If more RX
rings become available at the next open, the RX rings will be in
an inconsistent state and may crash when freeing the RX rings.
Fix it by restoring the BNXT_FLAG_AGG_RINGS if not enough RX rings are
available to run without aggregation rings.
Fixes: bdbd1eb59c ("bnxt_en: Handle no aggregation ring gracefully.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible that OVS may set don’t care for DEI/CFI bit in
vlan_tci mask. Hence, checking for vlan_tci exact match will endup
in a vlan flow rejection.
This patch fixes the problem by checking for vlan_pcp and vid
separately, instead of checking for the entire vlan_tci.
Fixes: e85a9be93c (bnxt_en: do not allow wildcard matches for L2 flows)
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On Tegra30 Cardhu the PCA9546 I2C mux is not ACK'ing I2C commands on
resume from suspend (which is caused by the reset signal for the I2C
mux not being configured correctl). However, this NACK is causing the
Tegra30 to hang on resuming from suspend which is not expected as we
detect NACKs and handle them. The hang observed appears to occur when
resetting the I2C controller to recover from the NACK.
Commit 77821b4678 ("i2c: tegra: proper handling of error cases") added
additional error handling for some error cases including NACK, however,
it appears that this change conflicts with an early fix by commit
f70893d083 ("i2c: tegra: Add delay before resetting the controller
after NACK"). After commit 77821b4678 was made we now disable 'packet
mode' before the delay from commit f70893d083 happens. Testing shows
that moving the delay to before disabling 'packet mode' fixes the hang
observed on Tegra30. The delay was added to give the I2C controller
chance to send a stop condition and so it makes sense to move this to
before we disable packet mode. Please note that packet mode is always
enabled for Tegra.
Fixes: 77821b4678 ("i2c: tegra: proper handling of error cases")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@vger.kernel.org
Just like with PIPESTAT, the edge triggered IIR on i965/g4x
also causes problems for hotplug interrupts. To make sure
we don't get the IIR port interrupt bit stuck low with the
ISR bit high we must force an edge in ISR. Unfortunately
we can't borrow the PIPESTAT trick and toggle the enable
bits in PORT_HOTPLUG_EN as that act itself generates hotplug
interrupts. Instead we just have to loop until we've cleared
PORT_HOTPLUG_STAT, or we just give up and WARN.
v2: Don't frob with PORT_HOTPLUG_EN
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614175625.1615-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
(cherry picked from commit 0ba7c51a6f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree:
1) Missing module autoloadfor icmp and icmpv6 x_tables matches,
from Florian Westphal.
2) Possible non-linear access to TCP header from tproxy, from
Mate Eckl.
3) Do not allow rbtree to be used for single elements, this patch
moves all set backend into one single module since such thing
can only happen if hashtable module is explicitly blacklisted,
which should not ever be done.
4) Reject error and standard targets from nft_compat for sanity
reasons, they are never used from there.
5) Don't crash on double hashsize module parameter, from Andrey
Ryabinin.
6) Drop dst on skb before placing it in the fragmentation
reassembly queue, from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
commit ef1433f717 ("PCI: endpoint: Create configfs entry for each
pci_epf_device_id table entry") while adding configfs entry for each
pci_epf_device_id table entry introduced a NULL pointer dereference error
when CONFIG_PCI_ENDPOINT_CONFIGFS is not enabled.
Fix it here.
Fixes: ef1433f717 ("PCI: endpoint: Create configfs entry for each
pci_epf_device_id table entry")
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
[lorenzo.pieralisi: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
swap was broken on ARC due to silly copy-paste issue.
We encode offset from swapcache page in __swp_entry() as (off << 13) but
were not decoding back in __swp_offset() as (off >> 13) - it was still
(off << 13).
This finally fixes swap usage on ARC.
| # mkswap /dev/sda2
|
| # swapon -a -e /dev/sda2
| Adding 500728k swap on /dev/sda2. Priority:-2 extents:1 across:500728k
|
| # free
| total used free shared buffers cached
| Mem: 765104 13456 751648 4736 8 4736
| -/+ buffers/cache: 8712 756392
| Swap: 500728 0 500728
Cc: stable@vger.kernel.org
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
We used to have pre-set CONFIG_INITRAMFS_SOURCE with local path
to intramfs in ARC defconfigs. This was quite convenient for
in-house development but not that convenient for newcomers
who obviusly don't have folders like "arc_initramfs" next to
the Linux source tree. Which leads to quite surprising failure
of defconfig building:
------------------------------->8-----------------------------
../scripts/gen_initramfs_list.sh: Cannot open '../../arc_initramfs_hs/'
../usr/Makefile:57: recipe for target 'usr/initramfs_data.cpio.gz' failed
make[2]: *** [usr/initramfs_data.cpio.gz] Error 1
------------------------------->8-----------------------------
So now when more and more people start to deal with our defconfigs
let's make their life easier with removal of CONFIG_INITRAMFS_SOURCE.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Since commit eedf265aa0 ("devpts: Make each mount of devpts an
independent filesystem.") CONFIG_DEVPTS_MULTIPLE_INSTANCES isn't needed
in the defconfig anymore.
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This is used in configs lacking hardware atomics to emulate atomic r-m-w
for user space, implemented by disabling preemption in kernel.
However there are issues in current implementation:
1. Process not terminated if invalid user pointer passed:
i.e. __get_user() failed.
2. The reason for this patch was __put_user() failure not being handled
either, specifically for the COW break scenario.
The zero page is initially wired up and read from __get_user()
succeeds. A subsequent write by __put_user() induces a
Protection Violation, but COW can't finish as Linux page fault
handler is disabled due to preempt disable.
And what's worse is we silently return the stale value to user space.
Fix this specific case by re-enabling preemption and explicitly
fixing up the fault and retrying the whole sequence over.
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote the changelog]
In case of HSDK we have intermediate INTC in for of DW APB GPIO controller
which is used as a de-bounce logic for interrupt wires that come from
outside the board.
We cannot use existing "irq-dw-apb-ictl" driver here because all input
lines are routed to corresponding output lines but not muxed into one
line (this is configured in RTL and we cannot change this in software).
But even if we add such a feature to "irq-dw-apb-ictl" driver that won't
benefit us as higher-level INTC (in case of HSDK it is IDU) anyways has
per-input control so adding fully-controller intermediate INTC will only
bring some overhead on interrupt processing but no other benefits.
Thus we just do one-time configuration of DW APB GPIO controller and
forget about it.
Based on implementation available on arch/arc/plat-axs10x/axs10x.c file.
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Commit de0aa7b2f9 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
uses local_bh_disable()/enable(), because hv_pci_onchannelcallback() can
also run in tasklet context as the channel event callback, so bottom halves
should be disabled to prevent a race condition.
With CONFIG_PROVE_LOCKING=y in the recent mainline, or old kernels that
don't have commit f71b74bca6 ("irq/softirqs: Use lockdep to assert IRQs
are disabled/enabled"), when the upper layer IRQ code calls
hv_compose_msi_msg() with local IRQs disabled, we'll see a warning at the
beginning of __local_bh_enable_ip():
IRQs not enabled as expected
WARNING: CPU: 0 PID: 408 at kernel/softirq.c:162 __local_bh_enable_ip
The warning exposes an issue in de0aa7b2f9: local_bh_enable() can
potentially call do_softirq(), which is not supposed to run when local IRQs
are disabled. Let's fix this by using local_irq_save()/restore() instead.
Note: hv_pci_onchannelcallback() is not a hot path because it's only called
when the PCI device is hot added and removed, which is infrequent.
Fixes: de0aa7b2f9 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: stable@vger.kernel.org
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Switching the CPU from the L2 or L3 frequencies (300 and 200 Mhz
respectively) to L0 frequency (1.2 Ghz) requires a significant amount
of time to let VDD stabilize to the appropriate voltage. This amount of
time is large enough that it cannot be covered by the hardware
countdown register. Due to this, the CPU might start operating at L0
before the voltage is stabilized, leading to CPU stalls.
To work around this problem, we prevent switching directly from the
L2/L3 frequencies to the L0 frequency, and instead switch to the L1
frequency in-between. The sequence therefore becomes:
1. First switch from L2/L3(200/300MHz) to L1(600MHZ)
2. Sleep 20ms for stabling VDD voltage
3. Then switch from L1(600MHZ) to L0(1200Mhz).
It is based on the work done by Ken Ma <make@marvell.com>
Cc: stable@vger.kernel.org
Fixes: 2089dc33ea ("clk: mvebu: armada-37xx-periph: add DVFS support for cpu clocks")
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Eric Dumazet reports:
Here is a reproducer of an annoying bug detected by syzkaller on our production kernel
[..]
./b78305423 enable_conntrack
Then :
sleep 60
dmesg | tail -10
[ 171.599093] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 181.631024] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 191.687076] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 201.703037] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 211.711072] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 221.959070] unregister_netdevice: waiting for lo to become free. Usage count = 2
Reproducer sends ipv6 fragment that hits nfct defrag via LOCAL_OUT hook.
skb gets queued until frag timer expiry -- 1 minute.
Normally nf_conntrack_reasm gets called during prerouting, so skb has
no dst yet which might explain why this wasn't spotted earlier.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Loading the nf_conntrack module with doubled hashsize parameter, i.e.
modprobe nf_conntrack hashsize=12345 hashsize=12345
causes NULL-ptr deref.
If 'hashsize' specified twice, the nf_conntrack_set_hashsize() function
will be called also twice.
The first nf_conntrack_set_hashsize() call will set the
'nf_conntrack_htable_size' variable:
nf_conntrack_set_hashsize()
...
/* On boot, we can set this without any fancy locking. */
if (!nf_conntrack_htable_size)
return param_set_uint(val, kp);
But on the second invocation, the nf_conntrack_htable_size is already set,
so the nf_conntrack_set_hashsize() will take a different path and call
the nf_conntrack_hash_resize() function. Which will crash on the attempt
to dereference 'nf_conntrack_hash' pointer:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
RIP: 0010:nf_conntrack_hash_resize+0x255/0x490 [nf_conntrack]
Call Trace:
nf_conntrack_set_hashsize+0xcd/0x100 [nf_conntrack]
parse_args+0x1f9/0x5a0
load_module+0x1281/0x1a50
__se_sys_finit_module+0xbe/0xf0
do_syscall_64+0x7c/0x390
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fix this, by checking !nf_conntrack_hash instead of
!nf_conntrack_htable_size. nf_conntrack_hash will be initialized only
after the module loaded, so the second invocation of the
nf_conntrack_set_hashsize() won't crash, it will just reinitialize
nf_conntrack_htable_size again.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
iptables-nft never requests these, but make this explicitly illegal.
If it were quested, kernel could oops as ->eval is NULL, furthermore,
the builtin targets have no owning module so its possible to rmmod
eb/ip/ip6_tables module even if they would be loaded.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
uref->field_index, uref->usage_index, finfo.field_index and cinfo.index can be
indirectly controlled by user-space, hence leading to a potential exploitation
of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/hid/usbhid/hiddev.c:473 hiddev_ioctl_usage() warn: potential spectre issue 'report->field' (local cap)
drivers/hid/usbhid/hiddev.c:477 hiddev_ioctl_usage() warn: potential spectre issue 'field->usage' (local cap)
drivers/hid/usbhid/hiddev.c:757 hiddev_ioctl() warn: potential spectre issue 'report->field' (local cap)
drivers/hid/usbhid/hiddev.c:801 hiddev_ioctl() warn: potential spectre issue 'hid->collection' (local cap)
Fix this by sanitizing such structure fields before using them to index
report->field, field->usage and hid->collection
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Commit ac75a04104 ("HID: i2c-hid: fix size check and type usage") started
writing messages when the ret_size is <= 2 from i2c_master_recv. However, my
device i2c-DLL07D1 returns 2 for a short period of time (~0.5s) after I stop
moving the pointing stick or touchpad. It varies, but you get ~50 messages
each time which spams the log hard.
[ 95.925055] i2c_hid i2c-DLL07D1:01: i2c_hid_get_input: incomplete report (83/2)
This has also been observed with a i2c-ALP0017.
[ 1781.266353] i2c_hid i2c-ALP0017:00: i2c_hid_get_input: incomplete report (30/2)
Only print the message when ret_size is totally invalid and less than 2 to cut
down on the log spam.
Fixes: ac75a04104 ("HID: i2c-hid: fix size check and type usage")
Reported-by: John Smith <john-s-84@gmx.net>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add the missing locks to the IRQ enable/disable paths, and fix a comment
in the interrupt handler: reading the ISR clears down the status bits,
but does not reset the interrupt so it can signal again. That seems to
require a write.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The colorkey mode property was not correctly disabling the colorkeying
when "disabled" mode was selected. Arrange for this to work as one
would expect.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
If pinctrl nodes for 100/200MHz are missing, the controller should
not select any mode which need signal frequencies 100MHz or higher.
To prevent such speed modes the driver currently uses the quirk flag
SDHCI_QUIRK2_NO_1_8_V. This works nicely for SD cards since 1.8V
signaling is required for all faster modes and slower modes use 3.3V
signaling only.
However, there are eMMC modes which use 1.8V signaling and run below
100MHz, e.g. DDR52 at 1.8V. With using SDHCI_QUIRK2_NO_1_8_V this
mode is prevented. When using a fixed 1.8V regulator as vqmmc-supply
the stack has no valid mode to use. In this tenuous situation the
kernel continuously prints voltage switching errors:
mmc1: Switching to 3.3V signalling voltage failed
Avoid using SDHCI_QUIRK2_NO_1_8_V and prevent faster modes by
altering the SDHCI capability register. With that the stack is able
to select 1.8V modes even if no faster pinctrl states are available:
# cat /sys/kernel/debug/mmc1/ios
...
timing spec: 8 (mmc DDR52)
signal voltage: 1 (1.80 V)
...
Link: http://lkml.kernel.org/r/20180628081331.13051-1-stefan@agner.ch
Signed-off-by: Stefan Agner <stefan@agner.ch>
Fixes: ad93220de7 ("mmc: sdhci-esdhc-imx: change pinctrl state according
to uhs mode")
Cc: <stable@vger.kernel.org> # v4.13+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
After commit 18996f2db9 (ACPICA: Events: Stop unconditionally
clearing ACPI IRQs during suspend/resume) the status of ACPI events
is not cleared any more when entering the ACPI S5 system state (power
off) which causes some systems to power up immediately after turing
off power in certain situations.
That is a functional regression, so address it by making the code
clear the status of all ACPI events again when entering S5 (for
system-wide suspend or hibernation the clearing of the status of all
events is not desirable, as it might cause the kernel to miss wakeup
events sometimes).
Fixes: 18996f2db9 (ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume)
Reported-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Thomas Hänig <haenig@cosifan.de>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit fdb5c4531c ("bpf: fix attach type BPF_LIRC_MODE2 dependency
wrt CONFIG_CGROUP_BPF") caused some build issues, detected by 0-DAY
kernel test infrastructure.
The problem is that cgroup_bpf_prog_attach/detach/query() functions
can return -EINVAL error code, which is not defined. Fix this adding
errno.h to includes.
Fixes: fdb5c4531c ("bpf: fix attach type BPF_LIRC_MODE2 dependency wrt CONFIG_CGROUP_BPF")
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Sean Young <sean@mess.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Here we are checking for the buffer length, not an offset for writing
to, so using > is correct. The current code incorrectly rejects a
command buffer ending at the memory buffer's end.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Only gather pins are mapped by the Host1x driver, regular BO relocations
are not. Check whether size of unpin isn't 0, otherwise IOVA allocation at
0x0 could be erroneously released.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Host1x's CDMA can't access the command buffers if IOMMU and Host1x
firewall are enabled in the kernels config because firewall doesn't map
the copied buffer into IOVA space. Fix this by skipping IOMMU
initialization if firewall is enabled as firewall merges sparse cmdbufs
into a single contiguous buffer and hence IOMMU isn't needed in this case.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The check is valid but it does not warrant to crash the kernel. A
WARN_ON() is good enough here.
Found by checkpatch.
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Instead of having the function name hard-coded (it might change and we
forgot to update them in the debug output) we can use __func__ instead
and also shorter the line so we do not need to break it. Also fix an
extra blank line while being here.
Found by checkpatch.
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
The check is valid but it does not warrant to crash the kernel. A
WARN_ON() is good enough here.
Found by checkpatch.
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
When the base sun4i DRM driver is built-in but the back-end is
a loadable module, we run into a link error:
drivers/gpu/drm/sun4i/sun4i_drv.o: In function `sun4i_drv_probe':
sun4i_drv.c:(.text+0x60c): undefined reference to `sun4i_frontend_of_table'
The dependency is a bit tricky, the best workaround I have come up
with is to use a Makefile hack to to interpret both
CONFIG_DRM_SUN4I_BACKEND=m and CONFIG_DRM_SUN4I_BACKEND=y
as a directive to build the front-end the same way as the main module.
Fixes: dd0421f475 ("drm/sun4i: Add a driver for the display frontend")
Link: https://lore.kernel.org/lkml/20180301091908.zcptz3ezqr2c6ly5@flea/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706142847.2032381-1-arnd@arndb.de
I was looking at usually suppressed gcc warnings,
[-Wimplicit-fallthrough=] in this case:
The code definitely looks like a break is missing here.
However I am not able to test the NL80211_IFTYPE_MESH_POINT,
nor do I actually know what might be :)
So please use this patch with caution and only if you are
able to do some testing.
Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
[johannes: looks obvious enough to apply as is, interesting
though that it never seems to have been a problem]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Previously, when an MMP-protected file system is remounted read-only,
the kmmpd thread would exit the next time it woke up (a few seconds
later), without resetting the MMP sequence number back to
EXT4_MMP_SEQ_CLEAN.
Fix this by explicitly killing the MMP thread when the file system is
remounted read-only.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger@dilger.ca>
Ext4_check_descriptors() was getting called before s_gdb_count was
initialized. So for file systems w/o the meta_bg feature, allocation
bitmaps could overlap the block group descriptors and ext4 wouldn't
notice.
For file systems with the meta_bg feature enabled, there was a
fencepost error which would cause the ext4_check_descriptors() to
incorrectly believe that the block allocation bitmap overlaps with the
block group descriptor blocks, and it would reject the mount.
Fix both of these problems.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Pull ARM SoC fixes from Olof Johansson:
"A small collection of fixes, sort of the usual at this point, all for
i.MX or OMAP:
- Enable ULPI drivers on i.MX to avoid a hang
- Pinctrl fix for touchscreen on i.MX51 ZII RDU1
- Fixes for ethernet clock references on am3517
- mmc0 write protect detection fix for am335x
- kzalloc->kcalloc conversion in an OMAP driver
- USB metastability fix for USB on dra7
- Fix touchscreen wakeup on am437x"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: imx_v4_v5_defconfig: Select ULPI support
ARM: imx_v6_v7_defconfig: Select ULPI support
ARM: dts: omap3: Fix am3517 mdio and emac clock references
ARM: dts: am335x-bone-common: Fix mmc0 Write Protect
bus: ti-sysc: Use 2-factor allocator arguments
ARM: dts: dra7: Disable metastability workaround for USB2
ARM: dts: imx51-zii-rdu1: fix touchscreen pinctrl
ARM: dts: am437x: make edt-ft5x06 a wakeup source
Pull x86/pti updates from Thomas Gleixner:
"Two small fixes correcting the handling of SSB mitigations on AMD
processors"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR
x86/bugs: Update when to check for the LS_CFG SSBD mitigation
Pull x86 fixes from Thomas Gleixner:
- Prevent an out-of-bounds access in mtrr_write()
- Break a circular dependency in the new hyperv IPI acceleration code
- Address the build breakage related to inline functions by enforcing
gnu_inline and explicitly bringing native_save_fl() out of line,
which also adds a set of _ARM_ARG macros which provide 32/64bit
safety.
- Initialize the shadow CR4 per cpu variable before using it.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mtrr: Don't copy out-of-bounds data in mtrr_write
x86/hyper-v: Fix the circular dependency in IPI enlightenment
x86/paravirt: Make native_save_fl() extern inline
x86/asm: Add _ASM_ARG* constants for argument registers to <asm/asm.h>
compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline declarations
x86/mm/32: Initialize the CR4 shadow before __flush_tlb_all()
Pull scheduler fixes from Thomas Gleixner:
- The hopefully final fix for the reported race problems in
kthread_parkme(). The previous attempt still left a hole and was
partially wrong.
- Plug a race in the remote tick mechanism which triggers a warning
about updates not being done correctly. That's a false positive if
the race condition is hit as the remote CPU is idle. Plug it by
checking the condition again when holding run queue lock.
- Fix a bug in the utilization estimation of a run queue which causes
the estimation to be 0 when a run queue is throttled.
- Advance the global expiration of the period timer when the timer is
restarted after a idle period. Otherwise the expiry time is stale and
the timer fires prematurely.
- Cure the drift between the bandwidth timer and the runqueue
accounting, which leads to bogus throttling of runqueues
- Place the call to cpufreq_update_util() correctly so the function
will observe the correct number of running RT tasks and not a stale
one.
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kthread, sched/core: Fix kthread_parkme() (again...)
sched/util_est: Fix util_est_dequeue() for throttled cfs_rq
sched/fair: Advance global expiration when period timer is restarted
sched/fair: Fix bandwidth timer clock drift condition
sched/rt: Fix call to cpufreq_update_util()
sched/nohz: Skip remote tick on idle task entirely
Pull objtool fix from Thomas Gleixner:
"A single fix for objtool to address a bug in handling the cold
subfunction detection for aliased functions which was added recently.
The bug causes objtool to enter an infinite loop"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Support GCC 8 '-fnoreorder-functions'
Pull crypto fixes from Herbert Xu:
- add missing RETs in x86 aegis/morus
- fix build error in arm speck
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86 - Add missing RETs
crypto: arm/speck - fix building in Thumb2 mode
Pull ext4 bugfixes from Ted Ts'o:
"Bug fixes for ext4; most of which relate to vulnerabilities where a
maliciously crafted file system image can result in a kernel OOPS or
hang.
At least one fix addresses an inline data bug could be triggered by
userspace without the need of a crafted file system (although it does
require that the inline data feature be enabled)"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: check superblock mapped prior to committing
ext4: add more mount time checks of the superblock
ext4: add more inode number paranoia checks
ext4: avoid running out of journal credits when appending to an inline file
jbd2: don't mark block as modified if the handle is out of credits
ext4: never move the system.data xattr out of the inode body
ext4: clear i_data in ext4_inode_info when removing inline data
ext4: include the illegal physical block in the bad map ext4_error msg
ext4: verify the depth of extent tree in ext4_find_extent()
ext4: only look at the bg_flags field if it is valid
ext4: make sure bitmaps and the inode table don't overlap with bg descriptors
ext4: always check block group bounds in ext4_init_block_bitmap()
ext4: always verify the magic number in xattr blocks
ext4: add corruption check in ext4_xattr_set_entry()
ext4: add warn_on_error mount option
Pull PCI fixes from Bjorn Helgaas:
- Fix a use-after-free in the endpoint code (Dan Carpenter)
- Stop defaulting CONFIG_PCIE_DW_PLAT_HOST to yes (Geert Uytterhoeven)
- Fix an nfp regression caused by a change in how we limit the number
of VFs we can enable (Jakub Kicinski)
- Fix failure path cleanup issues in the new R-Car gen3 PHY support
(Marek Vasut)
- Fix leaks of OF nodes in faraday, xilinx-nwl, xilinx (Nicholas Mc
Guire)
* tag 'pci-v4.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
nfp: stop limiting VFs to 0
PCI/IOV: Reset total_VFs limit after detaching PF driver
PCI: faraday: Add missing of_node_put()
PCI: xilinx-nwl: Add missing of_node_put()
PCI: xilinx: Add missing of_node_put()
PCI: endpoint: Use after free in pci_epf_unregister_driver()
PCI: controller: dwc: Do not let PCIE_DW_PLAT_HOST default to yes
PCI: rcar: Clean up PHY init on failure
PCI: rcar: Shut the PHY down in failpath
tcp_zerocopy_receive() relies on tcp_inq() to limit number of bytes
requested by user.
syzbot found that after tcp_disconnect(), tcp_inq() was returning
a stale value (number of bytes in queue before the disconnect).
Note that after this patch, ioctl(fd, SIOCINQ, &val) is also fixed
and returns 0, so this might be a candidate for all known linux kernels.
While we are at this, we probably also should clear urg_data to
avoid other syzkaller reports after it discovers how to deal with
urgent data.
syzkaller repro :
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(20000), sin_addr=inet_addr("224.0.0.1")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(20000), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
send(3, ..., 4096, 0) = 4096
connect(3, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 128) = 0
getsockopt(3, SOL_TCP, TCP_ZEROCOPY_RECEIVE, ..., [16]) = 0 // CRASH
Fixes: 05255b823a ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf 2018-07-07
The following pull-request contains BPF updates for your *net* tree.
Plenty of fixes for different components:
1) A set of critical fixes for sockmap and sockhash, from John Fastabend.
2) fixes for several race conditions in af_xdp, from Magnus Karlsson.
3) hash map refcnt fix, from Mauricio Vasquez.
4) samples/bpf fixes, from Taeung Song.
5) ifup+mtu check for xdp_redirect, from Toshiaki Makita.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting the low threshold to 0 has no effect on frags allocation,
we need to clear high_thresh instead.
The code was pre-existent to commit 648700f76b ("inet: frags:
use rhashtables for reassembly units"), but before the above,
such assignment had a different role: prevent concurrent eviction
from the worker and the netns cleanup helper.
Fixes: 648700f76b ("inet: frags: use rhashtables for reassembly units")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first
frees it by calling inet_csk_reqsk_queue_drop_and_and_put in
tcp_abort, and then frees it again by calling sock_gen_put.
Since tcp_abort only has one caller, and all the other codepaths
in tcp_abort don't free the socket, just remove the free in that
function.
Cc: David Ahern <dsa@cumulusnetworks.com>
Tested: passes Android sock_diag_test.py, which exercises this codepath
Fixes: d7226c7a4d ("net: diag: Fix refcnt leak in error path destroying socket")
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin reported that icmp replies may not use the address on the device the
echo request is received if the destination address is broadcast. Instead
a route lookup is done without considering VRF context. Fix by setting
oif in flow struct to the master device if it is enslaved. That directs
the lookup to the VRF table. If the device is not enslaved, oif is still
0 so no affect.
Fixes: cd2fbe1b6b ("net: Use VRF device index for lookups on RX")
Reported-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull cifs fixes from Steve French:
"Five smb3/cifs fixes for stable (including for some leaks and memory
overwrites) and also a few fixes for recent regressions in packet
signing.
Additional testing at the recent SMB3 test event, and some good work
by Paulo and others spotted the issues fixed here. In addition to my
xfstest runs on these, Aurelien and Stefano did additional test runs
to verify this set"
* tag '4.18-rc3-smb3fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix stack out-of-bounds in smb{2,3}_create_lease_buf()
cifs: Fix infinite loop when using hard mount option
cifs: Fix slab-out-of-bounds in send_set_info() on SMB2 ACE setting
cifs: Fix memory leak in smb2_set_ea()
cifs: fix SMB1 breakage
cifs: Fix validation of signed data in smb2
cifs: Fix validation of signed data in smb3+
cifs: Fix use after free of a mid_q_entry
Pull dma-mapping fix from Christoph Hellwig:
"Revert an incorrect dma-mapping commit for 4.18-rc"
* tag 'dma-mapping-4.18-3' of git://git.infradead.org/users/hch/dma-mapping:
Revert "iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent()"
Note that the LZ4 signature is different than that of modern LZ4 as we
use the "legacy" format which suffers from some downsides like inability
to disable compression.
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Pull dmaengine fixes from Vinod Koul:
"We have few odd driver fixes and one email update change for you this
time:
- Driver fixes for k3dma (off by one), pl330 (burst residue
granularity) and omap-dma (incorrect residue_granularity)
- Sinan's email update"
* tag 'dmaengine-fix-4.18-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: k3dma: Off by one in k3_of_dma_simple_xlate()
dmaengine: pl330: report BURST residue granularity
MAINTAINERS: Update email-id of Sinan Kaya
dmaengine: ti: omap-dma: Fix OMAP1510 incorrect residue_granularity
Pull IPMI fixes from Corey Minyard:
"A couple of small fixes: one to the BMC side of things that fixes an
interrupt issue, and one oops fix if init fails in a certain way on
the client driver"
* tag 'for-linus-4.18-2' of git://github.com/cminyard/linux-ipmi:
ipmi: kcs_bmc: fix IRQ exception if the channel is not open
ipmi: Cleanup oops on initialization failure
Otherwise we end up with attempting to send packets from down devices
or to send oversized packets, which may cause unexpected driver/device
behaviour. Generic XDP has already done this check, so reuse the logic
in native XDP.
Fixes: 814abfabef ("xdp: add bpf_redirect helper function")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
John Fastabend says:
====================
First three patches resolve issues found while testing sockhash and
reviewing code. Syzbot also found them about the same time as I was
working on fixes. The main issue is in the sockhash path we reduced
the scope of sk_callback lock but this meant we could get update and
close running in parallel so fix that here.
Then testing sk_msg and sk_skb programs together found that skb->dev
is not always assigned and some of the helpers were depending on this
to lookup max mtu. Fix this by using SKB_MAX_ALLOC when no MTU is
available.
Finally, Martin spotted that the sockmap code was still using the
qdisc skb cb structure. But I was sure we had fixed this long ago.
Looks like we missed it in a merge conflict resolution and then by
chance data_end offset was the same in both structures so everything
sort of continued to work even though it could break at any moment
if the structs ever change. So redo the conversion and this time
also convert the helpers.
v2: fix '0 files changed' issue in patches
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In commit
'bpf: bpf_compute_data uses incorrect cb structure' (8108a77515)
we added the routine bpf_compute_data_end_sk_skb() to compute the
correct data_end values, but this has since been lost. In kernel
v4.14 this was correct and the above patch was applied in it
entirety. Then when v4.14 was merged into v4.15-rc1 net-next tree
we lost the piece that renamed bpf_compute_data_pointers to the
new function bpf_compute_data_end_sk_skb. This was done here,
e1ea2f9856 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
When it conflicted with the following rename patch,
6aaae2b6c4 ("bpf: rename bpf_compute_data_end into bpf_compute_data_pointers")
Finally, after a refactor I thought even the function
bpf_compute_data_end_sk_skb() was no longer needed and it was
erroneously removed.
However, we never reverted the sk_skb_convert_ctx_access() usage of
tcp_skb_cb which had been committed and survived the merge conflict.
Here we fix this by adding back the helper and *_data_end_sk_skb()
usage. Using the bpf_skc_data_end mapping is not correct because it
expects a qdisc_skb_cb object but at the sock layer this is not the
case. Even though it happens to work here because we don't overwrite
any data in-use at the socket layer and the cb structure is cleared
later this has potential to create some subtle issues. But, even
more concretely the filter.c access check uses tcp_skb_cb.
And by some act of chance though,
struct bpf_skb_data_end {
struct qdisc_skb_cb qdisc_cb; /* 0 28 */
/* XXX 4 bytes hole, try to pack */
void * data_meta; /* 32 8 */
void * data_end; /* 40 8 */
/* size: 48, cachelines: 1, members: 3 */
/* sum members: 44, holes: 1, sum holes: 4 */
/* last cacheline: 48 bytes */
};
and then tcp_skb_cb,
struct tcp_skb_cb {
[...]
struct {
__u32 flags; /* 24 4 */
struct sock * sk_redir; /* 32 8 */
void * data_end; /* 40 8 */
} bpf; /* 24 */
};
So when we use offset_of() to track down the byte offset we get 40 in
either case and everything continues to work. Fix this mess and use
correct structures its unclear how long this might actually work for
until someone moves the structs around.
Reported-by: Martin KaFai Lau <kafai@fb.com>
Fixes: e1ea2f9856 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Fixes: 6aaae2b6c4 ("bpf: rename bpf_compute_data_end into bpf_compute_data_pointers")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently, when a sock is closed and the bpf_tcp_close() callback is
used we remove memory but do not free the skb. Call consume_skb() if
the skb is attached to the buffer.
Reported-by: syzbot+d464d2c20c717ef5a6a8@syzkaller.appspotmail.com
Fixes: 1aa12bdf1b ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
After latest lock updates there is no longer anything preventing a
close and recvmsg call running in parallel. Additionally, we can
race update with close if we close a socket and simultaneously update
if via the BPF userspace API (note the cgroup ops are already run
with sock_lock held).
To resolve this take sock_lock in close and update paths.
Reported-by: syzbot+b680e42077a0d7c9a0c4@syzkaller.appspotmail.com
Fixes: e9db4ef6bf ("bpf: sockhash fix omitted bucket lock in sock_close")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Multiple BPF helpers in use by sk_skb programs calculate the max
skb length using the __bpf_skb_max_len function. However, this
calculates the max length using the skb->dev pointer which can be
NULL when an sk_skb program is paired with an sk_msg program.
To force this a sk_msg program needs to redirect into the ingress
path of a sock with an attach sk_skb program. Then the the sk_skb
program would need to call one of the helpers that adjust the skb
size.
To fix the null ptr dereference use SKB_MAX_ALLOC size if no dev
is available.
Fixes: 8934ce2fd0 ("bpf: sockmap redirect ingress support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
John Fastabend says:
====================
I missed fixing the error path in the sockhash code to align with
supporting socks in multiple maps. Simply checking if the psock is
present does not mean we can decrement the reference count because
it could be part of another map. Fix this by cleaning up the error
path so this situation does not happen.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This removes locking from readers of RCU hash table. Its not
necessary.
Fixes: 8111038444 ("bpf: sockmap, add hash map support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The current code, in the error path of sock_hash_ctx_update_elem,
checks if the sock has a psock in the user data and if so decrements
the reference count of the psock. However, if the error happens early
in the error path we may have never incremented the psock reference
count and if the psock exists because the sock is in another map then
we may inadvertently decrement the reference count.
Fix this by making the error path only call smap_release_sock if the
error happens after the increment.
Reported-by: syzbot+d464d2c20c717ef5a6a8@syzkaller.appspotmail.com
Fixes: 8111038444 ("bpf: sockmap, add hash map support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull arm64 LDFLAGS clean-up from Catalin Marinas:
- use aarch64elf instead of aarch64linux
- move endianness options to LDFLAGS instead from LD
- remove no-op '-p' linker flag
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: remove no-op -p linker flag
arm64: add endianness option to LDFLAGS instead of LD
arm64: Use aarch64elf and aarch64elfb emulation mode variants
In commit ca04d9d3e1 ("phy: qcom-qusb2: New driver for QUSB2 PHY on
Qcom chips") you can see a call like:
devm_nvmem_cell_get(dev, NULL);
Note that the cell ID passed to the function is NULL. This is because
the qcom-qusb2 driver is expected to work only on systems where the
PHY node is hooked up via device-tree and is nameless.
This works OK for the most part. The first thing nvmem_cell_get()
does is to call of_nvmem_cell_get() and there it's documented that a
NULL name is fine. The problem happens when the call to
of_nvmem_cell_get() returns -EINVAL. In such a case we'll fall back
to nvmem_cell_get_from_list() and eventually might (if nvmem_cells
isn't an empty list) crash with something that looks like:
strcmp
nvmem_find_cell
__nvmem_device_get
nvmem_cell_get_from_list
nvmem_cell_get
devm_nvmem_cell_get
qusb2_phy_probe
There are several different ways we could fix this problem:
One could argue that perhaps the qcom-qusb2 driver should be changed
to use of_nvmem_cell_get() which is allowed to have a NULL name. In
that case, we'd need to add a patche to introduce
devm_of_nvmem_cell_get() since the qcom-qusb2 driver is using devm
managed resources.
One could also argue that perhaps we could just add a name to
qcom-qusb2. That would be OK but I believe it effectively changes the
device tree bindings, so maybe it's a no-go.
In this patch I have chosen to fix the problem by simply not crashing
when a NULL cell_id is passed to nvmem_cell_get().
NOTE: that for the qcom-qusb2 driver the "nvmem-cells" property is
defined to be optional and thus it's expected to be a common case that
we would hit this crash and this is more than just a theoretical fix.
Fixes: ca04d9d3e1 ("phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit 9aaa3b8b4c ("thunderbolt: Add support for preboot ACL")
introduced boot_acl attribute but missed the fact that now userspace
needs to poll the attribute constantly to find out whether it has
changed or not. Fix this by sending notification to the userspace
whenever the boot_acl attribute is changed.
Fixes: 9aaa3b8b4c ("thunderbolt: Add support for preboot ACL")
Reported-and-tested-by: Christian Kellner <christian@kellner.me>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Christian Kellner <christian@kellner.me>
Acked-by: Yehezkel Bernat <yehezkelshb@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We are hitting a regression with the following commit:
commit a93e7b3315
Author: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Date: Mon May 14 13:32:23 2018 +1200
uio: Prevent device destruction while fds are open
The problem is the addition of spin_lock_irqsave in uio_write. This
leads to hitting uio_write -> copy_from_user -> _copy_from_user ->
might_fault and the logs filling up with sleeping warnings.
I also noticed some uio drivers allocate memory, sleep, grab mutexes
from callouts like open() and release and uio is now doing
spin_lock_irqsave while calling them.
Reported-by: Mike Christie <mchristi@redhat.com>
CC: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Reviewed-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If 'fpga_mgr_create()' fails, we should release some resources, as done
in the other error handling path of the function.
Fixes: 7085e2a94f ("fpga: manager: change api, don't use drvdata")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davide Caratti says:
====================
net/sched: fix NULL dereference in 'goto chain' control action
in a couple of TC actions (i.e. csum and tunnel_key), the control action
is stored together with the action-specific configuration data.
This avoids a race condition (see [1]), but it causes a crash when 'goto
chain' is used with the above actions. Since this race condition is
tolerated on the other TC actions (it's present even on actions where the
spinlock is still used), storing the control action in the common area
should be acceptable for tunnel_key and csum as well.
[1] https://www.spinics.net/lists/netdev/msg472047.html
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
GEM version in ZynqMP and most versions greater than r1p07 supports
TX and RX BD prefetch. The number of BDs that can be prefetched is a
HW configurable parameter. For ZynqMP, this parameter is 4.
When GEM DMA is accessing the last BD in the ring, even before the
BD is processed and the WRAP bit is noticed, it will have prefetched
BDs outside the BD ring. These will not be processed but it is
necessary to have accessible memory after the last BD. Especially
in cases where SMMU is used, memory locations immediately after the
last BD may not have translation tables triggering HRESP errors. Hence
always allocate extra BDs to accommodate for prefetch.
The value of tx/rx bd prefetch for any given SoC version is:
2 ^ (corresponding field in design config 10 register).
(value of this field >= 1)
Added a capability flag so that older IP versions that do not have
DCFG10 or this prefetch capability are not affected.
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PXA3xx platforms have 56 interrupts that are stored in two ICMR
registers. The code in pxa_irq_suspend() and pxa_irq_resume() however
does a simple division by 32 which only leads to one register being
saved at suspend and restored at resume time. The NAND interrupt
setting, for instance, is lost.
Fix this by using DIV_ROUND_UP() instead.
Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
smc_release() calls a sock_put() for smc fallback sockets to cover
the passive closing sock_hold() in __smc_connect() and
smc_tcp_listen_work(). This does not make sense for sockets in state
SMC_LISTEN and SMC_INIT.
An SMC socket stays in state SMC_INIT if connect fails. The sock_put
in smc_connect_abort() does not cover all failures. Move it into
smc_connect_decline_fallback().
Fixes: ee9dfbef02 ("net/smc: handle sockopts forcing fallback")
Reported-by: syzbot+3a0748c8f2f210c0ef9b@syzkaller.appspotmail.com
Reported-by: syzbot+9e60d2428a42049a592a@syzkaller.appspotmail.com
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These two functions return the regular -EINVAL failure in the normal
code path, but return a nonstandard '-1' error otherwise, which gets
interpreted as -EPERM.
Let's change it to -EINVAL for the dummy functions as well.
Fixes: 4d4fd36126 ("net: bridge: Publish bridge accessor functions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
t4_get_flash_params() fails in a fatal fashion if the FLASH part isn't
one of the recognized parts. But this leads to desperate efforts to update
drivers when various FLASH parts which we are using suddenly become
unavailable and we need to substitute new FLASH parts. This has lead to
more than one Customer Field Emergency when a Customer has an old driver
and suddenly can't use newly shipped adapters.
This commit fixes this by simply assuming that the FLASH part is 4MB in
size if it can't be identified. Note that all Chelsio adapters will have
flash parts which are at least 4MB in size.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy says:
====================
tipc: fixes in duplicate address discovery function
commit 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address
hash values") introduced new functionality that has turned out to
contain several bugs and weaknesses.
We address those in this series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The setting of the node address is not thread safe, meaning that
two discoverers may decide to set it simultanously, with a duplicate
entry in the name table as result. We fix that with this commit.
Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The duplicate address discovery protocol is not safe against two
discoverers running in parallel. The one executing first after the
trial period is over will set the node address and change its own
message type to DSC_REQ_MSG. The one executing last may find that the
node address is already set, and never change message type, with the
result that its links may never be established.
In this commmit we ensure that the message type always is set correctly
after the trial period is over.
Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the duplicate address discovery protocol for tipc nodes addresses
we introduced a one second trial period before a node is allocated a
hash number to use as address.
Unfortunately, we miss to handle the case when a regular LINK REQUEST/
RESPONSE arrives from a cluster node during the trial period. Such
messages are not ignored as they should be, leading to links setup
attempts while the node still has no address.
Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function for checking if there is an node address conflict is
supposed to return a suggestion for a new address if it finds a
conflict, and zero otherwise. But in case the peer being checked
is previously unknown it does instead return a "suggestion" for
the checked address itself. This results in a DSC_TRIAL_FAIL_MSG
being sent unecessarily to the peer, and sometimes makes the trial
period starting over again.
Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This read handler had a lot of custom logic and wrote outside the bounds of
the provided buffer. This could lead to kernel and userspace memory
corruption. Just use simple_read_from_buffer() with a stack buffer.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull SCSI fixes from James Bottomley:
"This is two minor bug fixes (aacraid, target) and a fix for a
potential exploit in the way sg handles teardown"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sg: mitigate read/write abuse
scsi: aacraid: Fix PD performance regression over incorrect qd being set
scsi: target: Fix truncated PR-in ReadKeys response
Pull block fixes from Jens Axboe:
"Two minor fixes for this series:
- add LOOP_SET_BLOCK_SIZE as compat ioctl (Evan Green)
- drbd use-after-free fix (Lars Ellenberg)"
* tag 'for-linus-20180706' of git://git.kernel.dk/linux-block:
loop: Add LOOP_SET_BLOCK_SIZE in compat ioctl
drbd: fix access after free
Vladimir Zapolskiy says:
====================
ravb/sh_eth: fix sleep in atomic by reusing shared ethtool handlers
For ages trivial changes to RAVB and SuperH ethernet links by means of
standard 'ethtool' trigger a 'sleeping function called from invalid
context' bug, to visualize it on r8a7795 ULCB:
% ethtool -r eth0
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
in_atomic(): 1, irqs_disabled(): 128, pid: 554, name: ethtool
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] (null)
hardirqs last disabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
softirqs last enabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
softirqs last disabled at (0): [<0000000000000000>] (null)
CPU: 5 PID: 554 Comm: ethtool Not tainted 4.17.0-rc4-arm64-renesas+ #33
Hardware name: Renesas H3ULCB board based on r8a7795 ES2.0+ (DT)
Call trace:
dump_backtrace+0x0/0x198
show_stack+0x24/0x30
dump_stack+0xb8/0xf4
___might_sleep+0x1c8/0x1f8
__might_sleep+0x58/0x90
__mutex_lock+0x50/0x890
mutex_lock_nested+0x3c/0x50
phy_start_aneg_priv+0x38/0x180
phy_start_aneg+0x24/0x30
ravb_nway_reset+0x3c/0x68
dev_ethtool+0x3dc/0x2338
dev_ioctl+0x19c/0x490
sock_do_ioctl+0xe0/0x238
sock_ioctl+0x254/0x460
do_vfs_ioctl+0xb0/0x918
ksys_ioctl+0x50/0x80
sys_ioctl+0x34/0x48
__sys_trace_return+0x0/0x4
The root cause is that an attempt to modify ECMR and GECMR registers
only when RX/TX function is disabled was too overcomplicated in its
original implementation, also processing of an optional Link Change
interrupt added even more complexity, as a result the implementation
was error prone.
The new locking scheme is confirmed to be correct by dumping driver
specific and generic PHY framework function calls with aid of ftrace
while running more or less advanced tests.
Please note that sh_eth patches from the series were built-tested only.
On purpose I do not add Fixes tags, the reused PHY handlers were added
way later than the fixed problems were firstly found in the drivers.
Changes from v1 to v2:
* the original patches are split to bugfixes and enhancements only,
both v1 and v2 series are absolutely equal in total, thus I omit
description of changes in individual patches,
* the latter implies that there should be no strict need for retesting,
but because formally two series are different, I have to drop the tags
given by Geert and Andrew, please send your tags again.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
phy_ethtool_ksettings_get() call does not modify device state or device
driver state, hence there is no need to utilize a driver specific
spinlock.
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to call a heavyweight phy_start_aneg() for phy
auto-negotiation by ethtool, the phy is already initialized and
link auto-negotiation is started by calling phy_start() from
ravb_phy_start() when a network device is opened.
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The change fixes sleep in atomic context bug, which is encountered
every time when link settings are changed by ethtool.
Since commit 35b5f6b1a8 ("PHYLIB: Locking fixes for PHY I/O
potentially sleeping") phy_start_aneg() function utilizes a mutex
to serialize changes to phy state, however that helper function is
called in atomic context under a grabbed spinlock, because
phy_start_aneg() is called by phy_ethtool_ksettings_set() and by
replaced phy_ethtool_sset() helpers from phylib.
Now duplex mode setting is enforced in ravb_adjust_link() only, also
now RX/TX is disabled when link is put down or modifications to E-MAC
registers ECMR and GECMR are expected for both cases of checked and
ignored link status pin state from E-MAC interrupt handler.
Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 35b5f6b1a8 ("PHYLIB: Locking fixes for PHY I/O
potentially sleeping") phy_start_aneg() function utilizes a mutex
to serialize changes to phy state, however the helper function is
called in atomic context.
The bug can be reproduced by running "ethtool -r" command, the bug
is reported if CONFIG_DEBUG_ATOMIC_SLEEP build option is enabled.
Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
phy_ethtool_ksettings_get() call does not modify device state or device
driver state, hence there is no need to utilize a driver specific
spinlock.
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to call a heavyweight phy_start_aneg() for phy
auto-negotiation by ethtool, the phy is already initialized and
link auto-negotiation is started by calling phy_start() from
sh_eth_phy_start() when a network device is opened.
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The change fixes sleep in atomic context bug, which is encountered
every time when link settings are changed by ethtool.
Since commit 35b5f6b1a8 ("PHYLIB: Locking fixes for PHY I/O
potentially sleeping") phy_start_aneg() function utilizes a mutex
to serialize changes to phy state, however that helper function is
called in atomic context under a grabbed spinlock, because
phy_start_aneg() is called by phy_ethtool_ksettings_set() and by
replaced phy_ethtool_sset() helpers from phylib.
Now duplex mode setting is enforced in sh_eth_adjust_link() only,
also now RX/TX is disabled when link is put down or modifications
to E-MAC registers ECMR and GECMR are expected for both cases of
checked and ignored link status pin state from E-MAC interrupt handler.
For reference the change is a partial rework of commit 1e1b812bbe
("sh_eth: fix handling of no LINK signal").
Fixes: dc19e4e5e0 ("sh: sh_eth: Add support ethtool")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 35b5f6b1a8 ("PHYLIB: Locking fixes for PHY I/O
potentially sleeping") phy_start_aneg() function utilizes a mutex
to serialize changes to phy state, however the helper function is
called in atomic context.
The bug can be reproduced by running "ethtool -r" command, the bug
is reported if CONFIG_DEBUG_ATOMIC_SLEEP build option is enabled.
Fixes: dc19e4e5e0 ("sh: sh_eth: Add support ethtool")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is used by the host to talk to the BMC's PCIe slave device. The BMC
is not involved, but the clock needs to be enabled so the host can use
the device.
Fixes: 15ed8ce5f8 ("clk: aspeed: Register gated clocks")
Cc: stable@vger.kernel.org # 4.15
Acked-by: Andrew Jeffery <andrew@aj.id.au>
Tested-by: Lei YU <mine260309@gmail.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Pull clk fixes from Stephen Boyd:
"The usual collection of driver fixlets:
- build cleanup/fix for the sunxi makefile that tried to save size
but failed and prevented dead code elimination from working
- two Davinci clk driver fixes for a typo causing build failures in
different configurations and an error check that checks the wrong
variable.
- undo the DT ABI breaking imx6ul binding header shuffle that got
merged this cycle"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
dt-bindings: clock: imx6ul: Do not change the clock definition order
clk: davinci: fix a typo (which leads to build failures)
clk: davinci: cfgchip: testing the wrong variable
clk: sunxi-ng: replace lib-y with obj-y
Pull VFIO fixes from Alex Williamson:
- Make vfio-pci IGD extensions optional via Kconfig (Alex Williamson)
- Remove unused and soon to be removed map_atomic callback from mbochs
sample driver, add unmap callback to avoid dmabuf leaks (Gerd
Hoffmann)
- Fix usage of get_user_pages_longterm() (Jason Gunthorpe)
- Fix sample mbochs driver vm_operations_struct.fault return type
(Souptick Joarder)
* tag 'vfio-v4.18-rc4' of git://github.com/awilliam/linux-vfio:
sample/vfio-mdev: Change return type to vm_fault_t
vfio: Use get_user_pages_longterm correctly
sample/mdev/mbochs: add mbochs_kunmap_dmabuf
sample/mdev/mbochs: remove mbochs_kmap_atomic_dmabuf
vfio/pci: Make IGD support a configurable option
Patch (7705bb7176 clk: qcom: mmcc-msm8996: leave all mmagic gdscs
and clocks always enabled") makes all mmgaic gdscs ALWAYS_ON.
The mmagic_bimc_gdsc is also needed to be turned on to get display
working on 8x96.
Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Cc: Rajendra Nayak <rnayak@codeaurora.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Fixes: 7705bb7176 ("clk: qcom: mmcc-msm8996: leave all mmagic gdscs and clocks always enabled")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Pull Amlogic clk driver fixes from Jerome Brunet:
These are two simple fixes, yet the first one is quite important as it
solves boots hangs we've been having when FDIV2 gets disabled. This did
not show up before because this particular clock is heavily used and
only gets disabled for a very short period of time before modules (such
as ethernet or emmc) probe.
- fix boot issue with gxbb and gxl platforms
- fix racalculation error in the clk_audio_divider
* tag 'meson-clk-fixes-4.18-1' of https://github.com/BayLibre/clk-meson:
clk: meson: audio-divider is one based
clk: meson-gxbb: set fclk_div2 as CLK_IS_CRITICAL
On some systems, we come out of the bootloader with some
gates set with the clock "enabled" but the reset also
asserted.
Since 8a53fc511c "clk: aspeed: Prevent reset if clock is enabled"
we check that enabled bit in aspeed_clk_enabled(), and do
nothing if already set.
This breaks when the above scenario occurs, as the clock
is enabled, but the reset still needs to be lifted.
This patch fixes it by also checking the reset bit (if any)
and treating a gate in "reset" as being disabled.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fixes: 8a53fc511c "clk: aspeed: Prevent reset if clock is enabled"
Cc: Eddie James <eajames@linux.vnet.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
The last-minute fold-in of the ENTRY() macro did change behavior:
instead of printing the symbolic name (e.g. "CLK_IS_BASIC"), it prints
the expansion of it (e.g. "(1UL << (5))").
Use "#" instead of __stringify() to fix this.
Fixes: a6059ab981 ("clk: Show symbolic clock flags in debugfs")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
This patch disallows rbtree with single elements, which is causing
problems with the recent timeout support. Before this patch, you
could opt out individual set representations per module, which is
just adding extra complexity.
Fixes: 8d8540c4f5e0("netfilter: nft_set_rbtree: add timeout support")
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull s390 fixes from Martin Schwidefsky:
"A few more changes for v4.18:
- wire up the two new system calls io_pgetevents and rseq
- fix a register corruption in the expolines code for machines
without EXRL
- drastically reduce the memory utilization of the dasd driver
- fix reference counting for KVM page table pages"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: wire up rseq system call
s390: wire up io_pgetevents system call
s390/mm: fix refcount usage for 4K pgste
s390/dasd: reduce the default queue depth and nr of hardware queues
s390: Correct register corruption in critical section cleanup
RTL8822be can't bring up properly on ASUS X530UN, and dmesg says:
[ 8.591333] r8822be: module is from the staging directory, the quality
is unknown, you have been warned.
[ 8.593122] r8822be 0000:02:00.0: enabling device (0000 -> 0003)
[ 8.669163] r8822be: Using firmware rtlwifi/rtl8822befw.bin
[ 9.289939] r8822be: rtlwifi: wireless switch is on
[ 10.056426] r8822be 0000:02:00.0 wlp2s0: renamed from wlan0
...
[ 11.952534] r8822be: halmac_init_hal failed
[ 11.955933] r8822be: halmac_init_hal failed
[ 11.956227] r8822be: halmac_init_hal failed
[ 22.007942] r8822be: halmac_init_hal failed
Jian-Hong reported it works if turn off ASPM with module parameter aspm=0.
In order to fix this problem kindly, this commit don't turn off aspm but
enlarge ASPM L1 latency to 7.
Reported-by: Jian-Hong Pan <jian-hong@endlessm.com>
Tested-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In general, accessing userspace memory beyond the length of the supplied
buffer in VFS read/write handlers can lead to both kernel memory corruption
(via kernel_read()/kernel_write(), which can e.g. be triggered via
sys_splice()) and privilege escalation inside userspace.
Fix it by using simple_read_from_buffer() instead of custom logic.
Fixes: 6bc235a2e2 ("USB: add driver for Meywa-Denki & Kayac YUREX")
Signed-off-by: Jann Horn <jannh@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 2f28e4c24b (thermal: armada: Clarify control registers
accesses) introduced the new thermal binding. The new binding extends
the second registers field size to 8. Switch to the new binding to fix
thermal reading values. Without this change the fix for errata #132698
introduced in commit 8c0b888f66 (thermal: armada: Change sensors trim
default value) has no effect.
Cc: stable@vger.kernel.org # v4.16+
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Corsair Strafe appears to suffer from the same issues
as the Corsair Strafe RGB.
Apply the same quirks (control message delay and init delay)
that the RGB version has to 1b1c:1b15.
With these quirks in place the keyboard works correctly upon
booting the system, and no longer requires reattaching the device.
Signed-off-by: Nico Sneck <snecknico@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The > should be >= here so that we don't read one element beyond the end
of the ep->stream_info->stream_rings[] array.
Fixes: e9df17eb14 ("USB: xhci: Correct assumptions about number of rings per endpoint.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Without that option, we run into a link failure:
drivers/usb/gadget/udc/aspeed-vhub/hub.o: In function `ast_vhub_std_hub_request':
hub.c:(.text+0x5b0): undefined reference to `usb_gadget_get_string'
Fixes: 7ecca2a408 ("usb/gadget: Add driver for Aspeed SoC virtual hub")
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This parameter introduced several years ago in the XHCI host controller
driver was somehow left undocumented. Add a few lines in the kernel
parameters text.
Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
USB-serial fixes for v4.18-rc4
Here are three fixes for broken control-transfer error handling, which
could lead to uninitialised slab data leaking to user space.
Included is also a new device id for cp210x.
All but the final two patches have been in linux-next with no reported
issues.
Signed-off-by: Johan Hovold <johan@kernel.org>
The comment is the same as in the top-level Makefile.
Also, the comments contain typos:
- the .PHONY variable -> the PHONY variable
- se we can ... -> so we can ...
Instead of fixing the typos, just remove the duplicated comments.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The comment line for addtree says "skip if -I has no parameter".
What it actually does is "drop if -I has no parameter". For example,
if you have the compiler flag '-I foo' (a space between), it will be
converted to 'foo'. This completely changes the meaning.
What we want is, "do nothing" for -I without parameter so that
'-I foo' is kept as-is.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Update Documentation/kbuild/kconfig.txt, which mostly contains
user help for using the kernel config tools.
- Add mention of 'nconfig' embedded help text.
- Make the section on new config symbols readable.
- Correct how to find menuconfig search help.
- Add section on 'nconfig' usage.
- Mention that gconfig has multiple viewing modes/options.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Removed Kbuild documentation for INSTALL_FW_PATH.
The kbuild symbol INSTALL_FW_PATH was removed from Kbuild tools in
September 2017 (for 4.14) but the symbol was not deleted from
the kbuild documentation, so do that now.
Fixes: 5620a0d1aa ("firmware: delete in-kernel firmware")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: stable@vger.kernel.org # 4.14+
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The supported alias for building sparc 32-bit is "sparc32",
not "sparc", so update the alias documentation for that.
Just using "sparc" produces a 64-bit config file.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
In Kbuild documentation, add alias for 64-bit sh ARCH ("sh64")
to the list of ARCH aliases.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The current implementation of cfg80211_rx_control_port assumed that the
caller could provide a contiguous region of memory for the control port
frame to be sent up to userspace. Unfortunately, many drivers produce
non-linear skbs, especially for data frames. This resulted in userspace
getting notified of control port frames with correct metadata (from
address, port, etc) yet garbage / nonsense contents, resulting in bad
handshakes, disconnections, etc.
mac80211 linearizes skbs containing management frames. But it didn't
seem worthwhile to do this for control port frames. Thus the signature
of cfg80211_rx_control_port was changed to take the skb directly.
nl80211 then takes care of obtaining control port frame data directly
from the (linear | non-linear) skb.
The caller is still responsible for freeing the skb,
cfg80211_rx_control_port does not take ownership of it.
Fixes: 6a671a50f8 ("nl80211: Add CMD_CONTROL_PORT_FRAME API")
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
[fix some kernel-doc formatting, add fixes tag]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch fixes a silent out-of-bound read possibility that was present
because of the misuse of this function.
Mostly it was called with a struct udphdr *hp which had only the udphdr
part linearized by the skb_header_pointer, however
nf_tproxy_get_sock_v{4,6} uses it as a tcphdr pointer, so some reads for
tcp specific attributes may be invalid.
Fixes: a583636a83 ("inet: refactor inet[6]_lookup functions to take skb")
Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We have two new lenovo desktop models which need to apply the fixup of
ALC294_FIXUP_LENOVO_MIC_LOCATION, and they have the same pin cfg as
the machine with subsystem id:0x17aa3136, now use the pincfg table
to apply the fixup for them.
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Add missing transfer-length sanity check to the status-register
completion handler to avoid leaking bits of uninitialised slab data to
user space.
Fixes: 3f5429746d ("USB: Moschip 7840 USB-Serial Driver")
Cc: stable <stable@vger.kernel.org> # 2.6.19
Signed-off-by: Johan Hovold <johan@kernel.org>
Fix broken modem-status error handling which could lead to bits of slab
data leaking to user space.
Fixes: 3b36a8fd67 ("usb: fix uninitialized variable warning in keyspan_pda")
Cc: stable <stable@vger.kernel.org> # 2.6.27
Signed-off-by: Johan Hovold <johan@kernel.org>
The low and high values of the net.ipv4.ping_group_range sysctl were
being silently forced to the default disabled state when a write to the
sysctl contained GIDs that didn't map to the associated user namespace.
Confusingly, the sysctl's write operation would return success and then
a subsequent read of the sysctl would indicate that the low and high
values are the overflowgid.
This patch changes the behavior by clearly returning an error when the
sysctl write operation receives a GID range that doesn't map to the
associated user namespace. In such a situation, the previous value of
the sysctl is preserved and that range will be returned in a subsequent
read of the sysctl.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull drm fixes from Dave Airlie:
"This is the drm fixes for rc4.
It's a bit larger than I'd like but the exynos cleanups are pretty
mechanical, and I'd rather have them in sooner rather than later so we
can avoid too much conflicts around them. The non-mechanincal exynos
changes are mostly fixes for new feature recently introduced.
Apart from the exynos updates, we have:
i915:
- GVT and GGTT mapping fixes
amdgpu:
- fix HDMI2.0 4K@60 Hz regression
- Hotplug fixes for dual-GPU laptops to make power management better
- misc vega12 bios fixes, a race fix and some typos.
sii8620 bridge:
- small fixes around mode setting
core:
- use kvzalloc to allocate blob property memory"
* tag 'drm-fixes-2018-07-06' of git://anongit.freedesktop.org/drm/drm: (34 commits)
drm/amd/display: add a check for display depth validity
drm/amd/display: adding ycbcr420 pixel encoding for hdmi
drm/udl: fix display corruption of the last line
drm/bridge/sii8620: Fix link mode selection
drm/bridge/sii8620: Fix display of packed pixel modes
drm/bridge/sii8620: Send AVI infoframe in all MHL versions
drm/amdgpu: fix user fence write race condition
drm/i915: Try GGTT mmapping whole object as partial
drm/amdgpu/pm: fix display count in non-DC path
drm/amdgpu: fix swapped emit_ib_size in vce3
drm: Use kvzalloc for allocating blob property memory
drm/i915/gvt: changed DDI mode emulation type
drm/i915/gvt: fix a bug of partially write ggtt enties
drm/exynos: Replace drm_dev_unref with drm_dev_put
drm/exynos: Replace drm_gem_object_unreference_unlocked with put function
drm/exynos: Replace drm_framebuffer_{un/reference} with put,get functions
drm/exynos: ipp: use correct enum type
drm/exynos: decon5433: Fix WINCONx reset value
drm/exynos: decon5433: Fix per-plane global alpha for XRGB modes
drm/exynos: fimc: Use real buffer width for configuring the hardware
...
The notification of scrub completion happens within the scrub workqueue.
That can clearly race someone running scrub_show() and work_busy()
before the workqueue has a chance to flush the recently completed work.
Add a flag to reliably indicate the idle vs busy state. Without this
change applications using poll(2) to wait for scrub-completion may
falsely wakeup and read ARS as being busy even though the thread is
going idle and then hang indefinitely.
Fixes: bc6ba80858 ("nfit, address-range-scrub: rework and simplify ARS...")
Cc: <stable@vger.kernel.org>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Reported-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Pull tracing fixes and cleanups from Steven Rostedt:
"While cleaning out my INBOX, I found a few patches that were lost in
the noise. These are minor bug fixes and clean ups. Those include:
- avoid a string overflow
- code that didn't match the comment (but should)
- a small code optimization (use of a conditional)
- quiet printf warnings
- nuke unused code
- fix function graph interrupt annotation"
* tag 'trace-v4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix missing return symbol in function_graph output
ftrace: Nuke clear_ftrace_function
tracing: Use __printf markup to silence compiler
tracing: Optimize trace_buffer_iter() logic
tracing: Make create_filter() code match the comments
tracing: Avoid string overflow
Setting up macvlan/macvtap networks over atlantic NIC results
in no traffic over these networks because ndo_set_rx_mode did
not listed UC MACs as registered in unicast filter.
Here we fix that taking into account maximum number of UC
filters supported by hardware. If more than MAX addresses were
registered, we just enable promisc and/or allmulti to pass
the traffic in.
We also remove MULTICAST_ADDRESS_MAX constant from aq_cfg since
thats not a configurable parameter at all.
Fixes: b21f502 ("net:ethernet:aquantia: Fix for multicast filter handling.")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mail server hosting the old address is going to fade out.
Time to update to an address I control directly.
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
schedule_timeout_* takes a timeout in jiffies but the code currently is
passing in a constant which makes this timeout HZ dependent. So define
a constant with (hopefully) meaningful name and pass it through
msecs_to_jiffies() to fix the HZ dependency.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
commit f21fb3ed36 ("Add support of Cavium Liquidio ethernet adapters")
Acked-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixups
- Fix several problems to IPPv2 merged to mainline recentely.
. An align problem of width size that IPP driver incorrectly
calculated the real buffer size.
. Horizontal and vertical flip problem.
. Per-plane global alpha for XRGB modes.
. Incorrect variant of the YUV modes.
- Fix plane overlapping problem.
. The stange order of overlapping planes on XRGB modes
by setting global alpha value to maximum value.
Cleanup
- Rename a enum type, drm_ipp_size_id, to one specific to Exynos,
drm_exynos_ipp_limit_type.
- Replace {un/reference} with {put,get} functions.
. it replaces several reference/unreference functions with Linux
kernel nameing standard.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1530512041-21392-1-git-send-email-inki.dae@samsung.com
Fixes for omap for v4.18-rc cycle
Few dts fixes for regressions for various SoCs and
devices for touchscreen wake, dra7 USB quirk, pinmux
for beaglebone mmc, and emac clock.
Also included is a change for ti-sysc to use kcalloc
that Kees wanted to get into v4.18 as that's the last
one he wanted to fix for improved defense against
allocation overflows.
* tag 'omap-for-v4.18/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap3: Fix am3517 mdio and emac clock references
ARM: dts: am335x-bone-common: Fix mmc0 Write Protect
bus: ti-sysc: Use 2-factor allocator arguments
ARM: dts: dra7: Disable metastability workaround for USB2
ARM: dts: am437x: make edt-ft5x06 a wakeup source
Signed-off-by: Olof Johansson <olof@lixom.net>
We currently attempt to check whether a physical address range provided
to __ioremap() may be in use by the page allocator by examining the
value of PageReserved for each page in the region - lowmem pages not
marked reserved are presumed to be in use by the page allocator, and
requests to ioremap them fail.
The way we check this has been broken since commit 92923ca3aa ("mm:
meminit: only set page reserved in the memblock region"), because
memblock will typically not have any knowledge of non-RAM pages and
therefore those pages will not have the PageReserved flag set. Thus when
we attempt to ioremap a region outside of RAM we incorrectly fail
believing that the region is RAM that may be in use.
In most cases ioremap() on MIPS will take a fast-path to use the
unmapped kseg1 or xkphys virtual address spaces and never hit this path,
so the only way to hit it is for a MIPS32 system to attempt to ioremap()
an address range in lowmem with flags other than _CACHE_UNCACHED.
Perhaps the most straightforward way to do this is using
ioremap_uncached_accelerated(), which is how the problem was discovered.
Fix this by making use of walk_system_ram_range() to test the address
range provided to __ioremap() against only RAM pages, rather than all
lowmem pages. This means that if we have a lowmem I/O region, which is
very common for MIPS systems, we're free to ioremap() address ranges
within it. A nice bonus is that the test is no longer limited to lowmem.
The approach here matches the way x86 performed the same test after
commit c81c8a1eee ("x86, ioremap: Speed up check for RAM pages") until
x86 moved towards a slightly more complicated check using walk_mem_res()
for unrelated reasons with commit 0e4c12b45a ("x86/mm, resource: Use
PAGE_KERNEL protection for ioremap of memory pages").
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Fixes: 92923ca3aa ("mm: meminit: only set page reserved in the memblock region")
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.2+
Patchwork: https://patchwork.linux-mips.org/patch/19786/
sgid directories have special semantics, making newly created files in
the directory belong to the group of the directory, and newly created
subdirectories will also become sgid. This is historically used for
group-shared directories.
But group directories writable by non-group members should not imply
that such non-group members can magically join the group, so make sure
to clear the sgid bit on non-directories for non-members (but remember
that sgid without group execute means "mandatory locking", just to
confuse things even more).
Reported-by: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit may cause a less than required dma mask to be used for
some allocations, which apparently leads to module load failures for
iwlwifi sometimes.
This reverts commit d657c5c73c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Fabio Coatti <fabio.coatti@gmail.com>
Tested-by: Fabio Coatti <fabio.coatti@gmail.com>
For every request we send, whether it is SMB1 or SMB2+, we attempt to
reconnect tcon (cifs_reconnect_tcon or smb2_reconnect) before carrying
out the request.
So, while server->tcpStatus != CifsNeedReconnect, we wait for the
reconnection to succeed on wait_event_interruptible_timeout(). If it
returns, that means that either the condition was evaluated to true, or
timeout elapsed, or it was interrupted by a signal.
Since we're not handling the case where the process woke up due to a
received signal (-ERESTARTSYS), the next call to
wait_event_interruptible_timeout() will _always_ fail and we end up
looping forever inside either cifs_reconnect_tcon() or smb2_reconnect().
Here's an example of how to trigger that:
$ mount.cifs //foo/share /mnt/test -o
username=foo,password=foo,vers=1.0,hard
(break connection to server before executing bellow cmd)
$ stat -f /mnt/test & sleep 140
[1] 2511
$ ps -aux -q 2511
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2511 0.0 0.0 12892 1008 pts/0 S 12:24 0:00 stat -f
/mnt/test
$ kill -9 2511
(wait for a while; process is stuck in the kernel)
$ ps -aux -q 2511
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2511 83.2 0.0 12892 1008 pts/0 R 12:24 30:01 stat -f
/mnt/test
By using 'hard' mount point means that cifs.ko will keep retrying
indefinitely, however we must allow the process to be killed otherwise
it would hang the system.
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Cc: stable@vger.kernel.org
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
SMB1 mounting broke in commit 35e2cc1ba7
("cifs: Use correct packet length in SMB2_TRANSFORM header")
Fix it and also rename smb2_rqst_len to smb_rqst_len
to make it less unobvious that the function is also called from
CIFS/SMB1
Good job by Paulo reviewing and cleaning up Ronnie's original patch.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Paulo Alcantara <palcantara@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Fixes: c713c8770f ("cifs: push rfc1002 generation down the stack")
We failed to validate signed data returned by the server because
__cifs_calc_signature() now expects to sign the actual data in iov but
we were also passing down the rfc1002 length.
Fix smb3_calc_signature() to calculate signature of rfc1002 length prior
to passing only the actual data iov[1-N] to __cifs_calc_signature(). In
addition, there are a few cases where no rfc1002 length is passed so we
make sure there's one (iov_len == 4).
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Fixes: c713c8770f ("cifs: push rfc1002 generation down the stack")
We failed to validate signed data returned by the server because
__cifs_calc_signature() now expects to sign the actual data in iov but
we were also passing down the rfc1002 length.
Fix smb3_calc_signature() to calculate signature of rfc1002 length prior
to passing only the actual data iov[1-N] to __cifs_calc_signature(). In
addition, there are a few cases where no rfc1002 length is passed so we
make sure there's one (iov_len == 4).
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
With protocol version 2.0 mounts we have seen crashes with corrupt mid
entries. Either the server->pending_mid_q list becomes corrupt with a
cyclic reference in one element or a mid object fetched by the
demultiplexer thread becomes overwritten during use.
Code review identified a race between the demultiplexer thread and the
request issuing thread. The demultiplexer thread seems to be written
with the assumption that it is the sole user of the mid object until
it calls the mid callback which either wakes the issuer task or
deletes the mid.
This assumption is not true because the issuer task can be woken up
earlier by a signal. If the demultiplexer thread has proceeded as far
as setting the mid_state to MID_RESPONSE_RECEIVED then the issuer
thread will happily end up calling cifs_delete_mid while the
demultiplexer thread still is using the mid object.
Inserting a delay in the cifs demultiplexer thread widens the race
window and makes reproduction of the race very easy:
if (server->large_buf)
buf = server->bigbuf;
+ usleep_range(500, 4000);
server->lstrp = jiffies;
To resolve this I think the proper solution involves putting a
reference count on the mid object. This patch makes sure that the
demultiplexer thread holds a reference until it has finished
processing the transaction.
Cc: stable@vger.kernel.org
Signed-off-by: Lars Persson <larper@axis.com>
Acked-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
It turns out that systemd has a bug: it wants to load the autofs module
early because of some initialization ordering with udev, and it doesn't
do that correctly. Everywhere else it does the proper "look up module
name" that does the proper alias resolution, but in that early code, it
just uses a hardcoded "autofs4" for the module name.
The result of that is that as of commit a2225d931f ("autofs: remove
left-over autofs4 stubs"), you get
systemd[1]: Failed to insert module 'autofs4': No such file or directory
in the system logs, and a lack of module loading. All this despite the
fact that we had very clearly marked 'autofs4' as an alias for this
module.
What's so ridiculous about this is that literally everything else does
the module alias handling correctly, including really old versions of
systemd (that just used 'modprobe' to do this), and even all the other
systemd module loading code.
Only that special systemd early module load code is broken, hardcoding
the module names for not just 'autofs4', but also "ipv6", "unix",
"ip_tables" and "virtio_rng". Very annoying.
Instead of creating an _additional_ separate compatibility 'autofs4'
module, just rely on the fact that everybody else gets this right, and
just call the module 'autofs4' for compatibility reasons, with 'autofs'
as the alias name.
That will allow the systemd people to fix their bugs, adding the proper
alias handling, and maybe even fix the name of the module to be just
"autofs" (so that they can _test_ the alias handling). And eventually,
we can revert this silly compatibility hack.
See also
https://github.com/systemd/systemd/issues/9501https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=902946
for the systemd bug reports upstream and in the Debian bug tracker
respectively.
Fixes: a2225d931f ("autofs: remove left-over autofs4 stubs")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Reported-by: Michael Biebl <biebl@debian.org>
Cc: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linking the ARM64 defconfig kernel with LLVM lld fails with the error:
ld.lld: error: unknown argument: -p
Makefile:1015: recipe for target 'vmlinux' failed
Without this flag, the ARM64 defconfig kernel successfully links with
lld and boots on Dragonboard 410c.
After digging through binutils source and changelogs, it turns out that
-p is only relevant to ancient binutils installations targeting 32-bit
ARM. binutils accepts -p for AArch64 too, but it's always been
undocumented and silently ignored. A comment in
ld/emultempl/aarch64elf.em explains that it's "Only here for backwards
compatibility".
Since this flag is a no-op on ARM64, we can safely drop it.
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull ACPI fixes from Rafael Wysocki:
"These fix a recent ACPICA regression, fix a battery driver regression
introduced during the 4.17 cycle and fix up the recently added support
for the PPTT ACPI table.
Specifics:
- Revert part of a recent ACPICA regression fix that added leading
newlines to ACPICA error messages and made the kernel log look
broken (Rafael Wysocki).
- Fix an ACPI battery driver regression introduced during the 4.17
cycle due to incorrect error handling that made Thinkpad 13 laptops
crash on boot (Jouke Witteveen).
- Fix up the recently added PPTT ACPI table support by covering the
case when a PPTT structure represents a processors group correctly
(Sudeep Holla)"
* tag 'acpi-4.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / battery: Safe unregistering of hooks
ACPI / PPTT: use ACPI ID whenever ACPI_PPTT_ACPI_PROCESSOR_ID_VALID is set
ACPICA: Drop leading newlines from error messages
Pull power management fixes from Rafael Wysocki:
"These fix a PCI power management regression introduced during the 4.17
cycle and fix up the recently added support for devices in multiple
power domains.
Specifics:
- Resume parallel PCI (non-PCIe) bridges on suspend-to-RAM (ACP S3)
to avoid confusing the platform firmware which started to happen
after a core power management regression fix that went in during
the 4.17 cycle (Rafael Wysocki).
- Fix up the recently added support for devices in multiple power
domains by avoiding to power up the entire domain unnecessarily
when attaching a device to it (Ulf Hansson)"
* tag 'pm-4.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Domains: Don't power on at attach for the multi PM domain case
PCI / ACPI / PM: Resume bridges w/o drivers on suspend-to-RAM
Pull RISC-V fixes from Palmer Dabbelt:
"This contains a handful of fixes for the RISC-V port:
- A fix to R_RISCV_ADD32/R_RISCV_SUB32 relocations that allows
modules that use these to load correctly.
- The removal of of_platform_populate(), which is obselete.
- The removal of irq-riscv-intc.h, which is obselete.
- A fix to PTRACE_SETREGSET.
- Fixes that allow the RV32I kernel to build (at least for Zong, I've
got another patch on the mailing list that's necessary on my setup :)).
I've just given these a defconfig build test"
* tag 'riscv-for-linus-4.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
RISC-V: Fix PTRACE_SETREGSET bug.
RISC-V: Don't include irq-riscv-intc.h
riscv: remove unnecessary of_platform_populate call
RISC-V: fix R_RISCV_ADD32/R_RISCV_SUB32 relocations
RISC-V: Change variable type for 32-bit compatible
RISC-V: Add definiion of extract symbol's index and type for 32-bit
RISC-V: Select GENERIC_UCMPDI2 on RV32I
RISC-V: Add conditional macro for zone of DMA32
Pull m68knommu fix from Greg Ungerer:
"A single fix for breakage introduced in this merge window"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: fix "bad page state" oops on ColdFire boot
[why]
HDMI 2.0 fails to validate 4K@60 timing with 10 bpc
[how]
Adding a helper function that would verify if the display depth
assigned would pass a bandwidth validation.
Drop the display depth by one level till calculated pixel clk
is lower than maximum TMDS clk.
Bugzilla: https://bugs.freedesktop.org/106959
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
I really need to look at driver core changes before they are applied
due to PM dependencies and they sometimes get lost in the LKML
traffic, so add myself as an official driver core reviewer.
Signed-off-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[why]
HDMI EDID's VSDB contains spectial timings for specifically
YCbCr 4:2:0 colour space. In those cases we need to verify
if the mode provided is one of the special ones has to use
YCbCr 4:2:0 pixel encoding for display info.
[how]
Verify if the mode is using specific ycbcr420 colour space with
the help of DRM helper function and assign the mode to use
ycbcr420 pixel encoding.
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When the hangcheck handler was replaced by the DRM scheduler timeout
handling we dropped the forward progress check, as this might allow
clients to hog the GPU for a long time with a big job.
It turns out that even reasonably well behaved clients like the
Armada Xorg driver occasionally trip over the 500ms timeout. Bring
back the forward progress check to get rid of the userspace regression.
We would still like to fix userspace to submit smaller batches
if possible, but that is for another day.
Cc: <stable@vger.kernel.org>
Fixes: 6d7a20c077 (drm/etnaviv: replace hangcheck with scheduler timeout)
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
When mmc host controller enters suspend state, the clocks are
disabled, but irqs are not. For some reason the irqchip emits
false interrupts, which causes system lock loop.
Debug log is:
...
sunxi-mmc 1c11000.mmc: setting clk to 52000000, rounded 51200000
sunxi-mmc 1c11000.mmc: enabling the clock
sunxi-mmc 1c11000.mmc: cmd 13(8000014d) arg 10000 ie 0x0000bbc6 len 0
sunxi-mmc 1c11000.mmc: irq: rq (ptrval) mi 00000004 idi 00000000
sunxi-mmc 1c11000.mmc: cmd 6(80000146) arg 3210101 ie 0x0000bbc6 len 0
sunxi-mmc 1c11000.mmc: irq: rq (ptrval) mi 00000004 idi 00000000
sunxi-mmc 1c11000.mmc: cmd 13(8000014d) arg 10000 ie 0x0000bbc6 len 0
sunxi-mmc 1c11000.mmc: irq: rq (ptrval) mi 00000004 idi 00000000
mmc1: new DDR MMC card at address 0001
mmcblk1: mmc1:0001 AGND3R 14.6 GiB
mmcblk1boot0: mmc1:0001 AGND3R partition 1 4.00 MiB
mmcblk1boot1: mmc1:0001 AGND3R partition 2 4.00 MiB
sunxi-mmc 1c11000.mmc: cmd 18(80003352) arg 0 ie 0x0000fbc2 len 409
sunxi-mmc 1c11000.mmc: irq: rq (ptrval) mi 00004000 idi 00000002
mmcblk1: p1
sunxi-mmc 1c11000.mmc: irq: rq (null) mi 00000000 idi 00000000
sunxi-mmc 1c11000.mmc: irq: rq (null) mi 00000000 idi 00000000
sunxi-mmc 1c11000.mmc: irq: rq (null) mi 00000000 idi 00000000
sunxi-mmc 1c11000.mmc: irq: rq (null) mi 00000000 idi 00000000
and so on...
This issue apears on eMMC cards, routed on MMC2 slot. The patch is
tested with A20-OLinuXino-MICRO/LIME/LIME2 boards.
Fixes: 9a8e1e8cc2 ("mmc: sunxi: Add runtime_pm support")
Signed-off-by: Stefan Mavrodiev <stefan@olimex.com>
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This adds the USB id of LTE modem Quectel EG91. It requires the
same quirk as other Quectel modems to make it work.
Signed-off-by: Matevz Vucnik <vucnikm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arun Kumar Neelakantam says:
====================
net: qrtr: Broadcasting control messages
Allow messages only from control port to broadcast to avoid unnecessary
messages and reset the node to local router NODE ID in control messages
otherwise remote routers consider the packets as invalid and Drops it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
All the control messages broadcast to remote routers are using
QRTR_NODE_BCAST instead of using local router NODE ID which cause
the packets to be dropped on remote router due to invalid NODE ID.
Signed-off-by: Arun Kumar Neelakantam <aneela@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The broadcast node id should only be sent with the control port id.
Signed-off-by: Arun Kumar Neelakantam <aneela@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
At present the ipv6_renew_options_kern() function ends up calling into
access_ok() which is problematic if done from inside an interrupt as
access_ok() calls WARN_ON_IN_IRQ() on some (all?) architectures
(x86-64 is affected). Example warning/backtrace is shown below:
WARNING: CPU: 1 PID: 3144 at lib/usercopy.c:11 _copy_from_user+0x85/0x90
...
Call Trace:
<IRQ>
ipv6_renew_option+0xb2/0xf0
ipv6_renew_options+0x26a/0x340
ipv6_renew_options_kern+0x2c/0x40
calipso_req_setattr+0x72/0xe0
netlbl_req_setattr+0x126/0x1b0
selinux_netlbl_inet_conn_request+0x80/0x100
selinux_inet_conn_request+0x6d/0xb0
security_inet_conn_request+0x32/0x50
tcp_conn_request+0x35f/0xe00
? __lock_acquire+0x250/0x16c0
? selinux_socket_sock_rcv_skb+0x1ae/0x210
? tcp_rcv_state_process+0x289/0x106b
tcp_rcv_state_process+0x289/0x106b
? tcp_v6_do_rcv+0x1a7/0x3c0
tcp_v6_do_rcv+0x1a7/0x3c0
tcp_v6_rcv+0xc82/0xcf0
ip6_input_finish+0x10d/0x690
ip6_input+0x45/0x1e0
? ip6_rcv_finish+0x1d0/0x1d0
ipv6_rcv+0x32b/0x880
? ip6_make_skb+0x1e0/0x1e0
__netif_receive_skb_core+0x6f2/0xdf0
? process_backlog+0x85/0x250
? process_backlog+0x85/0x250
? process_backlog+0xec/0x250
process_backlog+0xec/0x250
net_rx_action+0x153/0x480
__do_softirq+0xd9/0x4f7
do_softirq_own_stack+0x2a/0x40
</IRQ>
...
While not present in the backtrace, ipv6_renew_option() ends up calling
access_ok() via the following chain:
access_ok()
_copy_from_user()
copy_from_user()
ipv6_renew_option()
The fix presented in this patch is to perform the userspace copy
earlier in the call chain such that it is only called when the option
data is actually coming from userspace; that place is
do_ipv6_setsockopt(). Not only does this solve the problem seen in
the backtrace above, it also allows us to simplify the code quite a
bit by removing ipv6_renew_options_kern() completely. We also take
this opportunity to cleanup ipv6_renew_options()/ipv6_renew_option()
a small amount as well.
This patch is heavily based on a rough patch by Al Viro. I've taken
his original patch, converted a kmemdup() call in do_ipv6_setsockopt()
to a memdup_user() call, made better use of the e_inval jump target in
the same function, and cleaned up the use ipv6_renew_option() by
ipv6_renew_options().
CC: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If format_idx == s_mcp_trace_meta.formats_num then we read one element
beyond the end of the s_mcp_trace_meta.formats[] array.
Fixes: 50bc60cb15 ("qed*: Utilize FW 8.33.11.0")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge ACPICA regression fix and a fix for the recently added PPTT
support.
* acpi-tables:
ACPI / PPTT: use ACPI ID whenever ACPI_PPTT_ACPI_PROCESSOR_ID_VALID is set
* acpica:
ACPICA: Drop leading newlines from error messages
nft_compat relies on xt_request_find_match to increment
refcount of the module that provides the match/target.
The (builtin) icmp matches did't set the module owner so it
was possible to rmmod ip(6)tables while icmp extensions were still in use.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Otherwise NetworkManager (and iproute alike) is not able to identify the
parent IEEE 802.15.4 interface of a 6LoWPAN link.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Acked-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Without CONFIG_GPIOLIB, some headers are not included implicitly,
leading to a build failure:
drivers/net/ieee802154/mcr20a.c: In function 'mcr20a_probe':
drivers/net/ieee802154/mcr20a.c:1347:13: error: implicit declaration of function 'irq_get_trigger_type'; did you mean 'irq_get_irqchip_state'? [-Werror=implicit-function-declaration]
This includes gpio/consumer.h and irq.h directly rather through the
gpiolib header.
Fixes: 8c6ad9cc51 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Xue Liu <liuxuenetmail@gmail.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
make allyesconfig
It will cause a linking error because the jump assembly code were using j,
however it may be not enough to jump to the destination. We have to change it
with pseudo instruction b. In that way, assembler will generate a set of safe
assembly codes to make sure the destination is able to jump.
Toolchain:
https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/x86_64-gcc-8.1.0-nolibc-nds32le-linux.tar.gz
Command:
PATH=/NOBACKUP/atcsqa06/greentime/os/toolchain-kernel.org/gcc-8.1.0-nolibc/nds32le-linux/bin:$PATH ARCH=nds32 CROSS_COMPILE=nds32le-linux- make allyesconfig
PATH=/NOBACKUP/atcsqa06/greentime/os/toolchain-kernel.org/gcc-8.1.0-nolibc/nds32le-linux/bin:$PATH ARCH=nds32 CROSS_COMPILE=nds32le-linux- make -j8
MODPOST vmlinux.o
WARNING: EXPORT symbol "copy_page" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "clear_page" [vmlinux] version generation failed, symbol will not be versioned.
nds32le-linux-ld: kernel/futex.o:(.fixup+0x4): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text'
nds32le-linux-ld: kernel/futex.o:(.fixup+0xaa): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text'
nds32le-linux-ld: kernel/futex.o:(.fixup+0xb0): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text'
nds32le-linux-ld: kernel/futex.o:(.fixup+0xb6): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text'
nds32le-linux-ld: kernel/futex.o:(.fixup+0xbc): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text'
nds32le-linux-ld: kernel/futex.o:(.fixup+0xc4): relocation truncated to fit: R_NDS32_25_PCREL_RELA against `.text'
Makefile:1010: recipe for target 'vmlinux' failed
make: *** [vmlinux] Error 1
Signed-off-by: Greentime Hu <greentime@andestech.com>
For untracked executables of samples/bpf, add this.
Untracked files:
(use "git add <file>..." to include in what will be committed)
samples/bpf/cpustat
samples/bpf/fds_example
samples/bpf/lathist
samples/bpf/load_sock_ops
...
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
To avoid the below build warning message,
use new generate_load() checking the return value.
ignoring return value of ‘system’, declared with attribute warn_unused_result
And it also refactors the duplicate code of both
test_perf_event_all_cpu() and test_perf_event_task()
Cc: Teng Qin <qinteng@fb.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This fixes build error regarding redefinition:
CLANG-bpf samples/bpf/parse_varlen.o
samples/bpf/parse_varlen.c:111:8: error: redefinition of 'vlan_hdr'
struct vlan_hdr {
^
./include/linux/if_vlan.h:38:8: note: previous definition is here
So remove duplicate 'struct vlan_hdr' in sample code and include if_vlan.h
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Commit cd7e 61b9"init mmio by lri command in vgpu inhibit context"
initializes registers saved/restored in context with its vreg value
through lri command in ring buffer. It relies on vreg got updated
on every guest access. There is a case found that Linux guest uses
lri command in inhibit-ctx to update the register. This patch adds
vreg update on this case.
v2: move mmio_attribute functions to gvt.h (Zhenyu)
v3: use mask_mmio_write in vreg update
v4: refine codes and add more comments (Zhenyu)
Fixes: cd7e61b9("drm/i915/gvt: init mmio by lri command in vgpu inhibit context")
Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The displaylink hardware has such a peculiarity that it doesn't render a
command until next command is received. This produces occasional
corruption, such as when setting 22x11 font on the console, only the first
line of the cursor will be blinking if the cursor is located at some
specific columns.
When we end up with a repeating pixel, the driver has a bug that it leaves
one uninitialized byte after the command (and this byte is enough to flush
the command and render it - thus it fixes the screen corruption), however
whe we end up with a non-repeating pixel, there is no byte appended and
this results in temporary screen corruption.
This patch fixes the screen corruption by always appending a byte 0xAF at
the end of URB. It also removes the uninitialized byte.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Commit 03e6275ae3 ("usb: chipidea: Fix ULPI on imx51") causes a kernel
hang on imx51 systems that use the ULPI interface and do not select the
CONFIG_USB_CHIPIDEA_ULPI option.
In order to avoid such potential misuse, let's always build the
chipidea ULPI code into the final ci_hdrc object.
Tested on a imx51-babbage board.
Fixes: 03e6275ae3 ("usb: chipidea: Fix ULPI on imx51")
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Configuration information read at driver load can become stale after it is
updated. Mark information as not valid and re-populate when this happens.
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently also the pause flags are removed from phydev->supported because
they're not included in PHY_DEFAULT_FEATURES. I don't think this is
intended, especially when considering that this function can be called
via phy_set_max_speed() anywhere in a driver. Change the masking to mask
out only the values we're going to change. In addition remove the
misleading comment, job of this small function is just to adjust the
supported and advertised speeds.
Fixes: f3a6bd393c ("phylib: Add phy_set_max_speed helper")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the recent syntax extension, Kconfig is now able to evaluate the
compiler / toolchain capability.
However, accumulating flags to 'LD' is not compatible with the way
it works; 'LD' must be passed to Kconfig to call $(ld-option,...)
from Kconfig files. If you tweak 'LD' in arch Makefile depending on
CONFIG_CPU_BIG_ENDIAN, this would end up with circular dependency
between Makefile and Kconfig.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
These patches for building 32-bit RISC-V kernel.
- Fix the compile errors and warnings on RV32I.
- Fix some incompatible problem on RV32I.
- Add format.h for compatible of print format.
The fixed width integer types format for Elf_Addr will move to
generic header by another patch. For now, there are some warning
about unexpected argument of type on RV32I.
Change in v1:
- Fix some error in v1
- Remove implementation of fixed width integer types format for Elf_Addr.
In riscv_gpr_set, pass regs instead of ®s to user_regset_copyin to fix
gdb segfault.
Signed-off-by: Jim Wilson <jimw@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
This file has never existed in the upstream kernel, but it's guarded by
an #ifdef that's also never existed in the upstream kernel. As a part
of our interrupt controller refactoring this header is no longer
necessary, but this reference managed to sneak in anyway.
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
The DT core will call of_platform_default_populate, so it is not
necessary for arch specific code to call it unless there are custom
match entries, auxdata or parent device. Neither of those apply here, so
remove the call.
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: linux-riscv@lists.infradead.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
The R_RISCV_ADD32/R_RISCV_SUB32 relocations should add/subtract the
address of the symbol (without overflow check), not its contents.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
On 32-bit, it need to use __ucmpdi2, otherwise, it can't find the __ucmpdi2
symbol.
Signed-off-by: Zong Li <zong@andestech.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
The aarch64linux and aarch64linuxb emulation modes are not supported by
bare-metal toolchains and Linux using them forbids building the kernel
with these toolchains.
Since there is apparently no reason to target these emulation modes, the
more generic elf modes are used instead, allowing to build on bare-metal
toolchains as well as the already-supported ones.
Fixes: 3d6a7b99e3 ("arm64: ensure the kernel is compiled for LP64")
Cc: stable@vger.kernel.org
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Current implementation does not guarantee packed pixel modes working
with every dongle. There are some dongles, which require selecting
the output mode explicitly.
Write proper values to registers in packed_pixel mode, based on how it
is done in vendor's code. Select output color space: RGB
(no packed pixel) or YCBCR422 (packed pixel).
This reverts commit e8b92efa62
("drm/bridge/sii8620: fix display of packed pixel modes in MHL2").
Signed-off-by: Maciej Purski <m.purski@samsung.com>
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1530204243-6370-3-git-send-email-m.purski@samsung.com
Currently AVI infoframe is sent only in MHL3. However, some MHL2 dongles
need AVI infoframe to work correctly in either packed pixel mode or
non-packed pixel mode.
Send AVI infoframe in set_infoframes() in every case. Create an
infoframe using drm_hdmi_infoframe_from_display_mode() instead of
manually filling each infoframe structure's field.
Signed-off-by: Maciej Purski <m.purski@samsung.com>
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1530204243-6370-2-git-send-email-m.purski@samsung.com
There are two versions of the Qivicon Zigbee stick in circulation. This
adds the second USB ID to the cp210x driver.
Signed-off-by: Olli Salonen <olli.salonen@iki.fi>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
The "r" variable is an int and "bufsize" is an unsigned int so the
comparison is type promoted to unsigned. If usb_control_msg() returns a
negative that is treated as a high positive value and the error handling
doesn't work.
Fixes: 2d5a9c72d0 ("USB: serial: ch341: fix control-message error handling")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
In certain conditions, the device may not be able to link in gigabit mode. This software workaround ensures that the device will not enter the failure state.
Fixes: d0cad87170 ("SMSC75XX USB 2.0 Gigabit Ethernet Devices")
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit b6c5734db0 ("sctp: fix the handling of ICMP Frag Needed
for too small MTUs"), sctp_transport_update_pmtu would refetch pathmtu
from the dst and set it to transport's pathmtu without any check.
The new pathmtu may be lower than MINSEGMENT if the dst is obsolete and
updated by .get_dst() in sctp_transport_update_pmtu. In this case, it
could have a smaller MTU as well, and thus we should validate it
against MINSEGMENT instead.
Syzbot reported a warning in sctp_mtu_payload caused by this.
This patch refetches the pathmtu by calling sctp_dst_mtu where it does
the check against MINSEGMENT.
v1->v2:
- refetch the pathmtu by calling sctp_dst_mtu instead as Marcelo's
suggestion.
Fixes: b6c5734db0 ("sctp: fix the handling of ICMP Frag Needed for too small MTUs")
Reported-by: syzbot+f0d9d7cba052f9344b03@syzkaller.appspotmail.com
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A hooking API was implemented for 4.17 in fa93854f7a followed
by hooks for Thinkpad laptops in 2801b9683f. The Thinkpad
drivers did not support the Thinkpad 13 and the hooking API crashes
on unsupported batteries by altering a list of hooks during unsafe
iteration. Thus, Thinkpad 13 laptops could no longer boot.
Additionally, a lock was kept in place and debugging information was
printed out of order.
Fixes: fa93854f7a (battery: Add the battery hooking API)
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Jouke Witteveen <j.witteveen@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The buffer object backing the user fence is reserved using the non-user
fence, i.e., as soon as the non-user fence is signaled, the user fence
buffer object can be moved or even destroyed.
Therefore, emit the user fence first.
Both fences have the same cache invalidation behavior, so this should
have no user-visible effect.
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
NetworkManager likes to manage linklocal prefix routes and does so with
the NLM_F_APPEND flag, breaking attempts to simplify the IPv6 route
code and by extension enable multipath routes with device only nexthops.
Revert f34436a430 and these followup patches:
6eba08c362 ("ipv6: Only emit append events for appended routes").
ce45bded64 ("mlxsw: spectrum_router: Align with new route replace logic")
53b562df8c ("mlxsw: spectrum_router: Allow appending to dev-only routes")
Update the fib_tests cases to reflect the old behavior.
Fixes: f34436a430 ("net/ipv6: Simplify route replace and appending into multipath route")
Signed-off-by: David Ahern <dsahern@gmail.com>
The gen_stats facility will add a header for the toplevel nlattr of type
TCA_STATS2 that contains all stats added by qdisc callbacks. A reference
to this header is stored in the gnet_dump struct, and when all the
per-qdisc callbacks have finished adding their stats, the length of the
containing header will be adjusted to the right value.
However, on architectures that need padding (i.e., that don't set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS), the padding nlattr is added
before the stats, which means that the stored pointer will point to the
padding, and so when the header is fixed up, the result is just a very
big padding nlattr. Because most qdiscs also supply the legacy TCA_STATS
struct, this problem has been mostly invisible, but we exposed it with
the netlink attribute-based statistics in CAKE.
Fix the issue by fixing up the stored pointer if it points to a padding
nlattr.
Tested-by: Pete Heist <pete@heistp.net>
Tested-by: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The m88e1121 LED default configuration does not apply m88e151x.
So add a function to relpace m88e1121 LED configuration.
Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If struct page is poisoned, and uninitialized access is detected via
PF_POISONED_CHECK(page) dump_page() is called to output the page. But,
the dump_page() itself accesses struct page to determine how to print
it, and therefore gets into a recursive loop.
For example:
dump_page()
__dump_page()
PageSlab(page)
PF_POISONED_CHECK(page)
VM_BUG_ON_PGFLAGS(PagePoisoned(page), page)
dump_page() recursion loop.
Link: http://lkml.kernel.org/r/20180702180536.2552-1-pasha.tatashin@oracle.com
Fixes: f165b378bb ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ARM trusted foundations code is currently broken in linux-next when
CONFIG_KCOV_INSTRUMENT_ALL is set:
/tmp/ccHdQsCI.s: Assembler messages:
/tmp/ccHdQsCI.s:37: Error: .err encountered
/tmp/ccHdQsCI.s:38: Error: .err encountered
/tmp/ccHdQsCI.s:39: Error: .err encountered
scripts/Makefile.build:311: recipe for target 'arch/arm/firmware/trusted_foundations.o' failed
I could not find a function attribute that lets me disable
-fsanitize-coverage=trace-pc for just one function, so this turns it off
for the entire file instead.
Link: http://lkml.kernel.org/r/20180529103636.1535457-1-arnd@arndb.de
Fixes: 758517202b ("arm: port KCOV to arm")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Olof Johansson <olof@lixom.net>
Tested-by: Olof Johansson <olof@lixom.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a special case that the size is "(N << KASAN_SHADOW_SCALE_SHIFT)
Pages plus X", the value of X is [1, KASAN_SHADOW_SCALE_SIZE-1]. The
operation "size >> KASAN_SHADOW_SCALE_SHIFT" will drop X, and the
roundup operation can not retrieve the missed one page. For example:
size=0x28006, PAGE_SIZE=0x1000, KASAN_SHADOW_SCALE_SHIFT=3, we will get
shadow_size=0x5000, but actually we need 6 pages.
shadow_size = round_up(size >> KASAN_SHADOW_SCALE_SHIFT, PAGE_SIZE);
This can lead to a kernel crash when kasan is enabled and the value of
mod->core_layout.size or mod->init_layout.size is like above. Because
the shadow memory of X has not been allocated and mapped.
move_module:
ptr = module_alloc(mod->core_layout.size);
...
memset(ptr, 0, mod->core_layout.size); //crashed
Unable to handle kernel paging request at virtual address ffff0fffff97b000
......
Call trace:
__asan_storeN+0x174/0x1a8
memset+0x24/0x48
layout_and_allocate+0xcd8/0x1800
load_module+0x190/0x23e8
SyS_finit_module+0x148/0x180
Link: http://lkml.kernel.org/r/1529659626-12660-1-git-send-email-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Dmitriy Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Libin <huawei.libin@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When booting with very large numbers of gigantic (i.e. 1G) pages, the
operations in the loop of gather_bootmem_prealloc, and specifically
prep_compound_gigantic_page, takes a very long time, and can cause a
softlockup if enough pages are requested at boot.
For example booting with 3844 1G pages requires prepping
(set_compound_head, init the count) over 1 billion 4K tail pages, which
takes considerable time.
Add a cond_resched() to the outer loop in gather_bootmem_prealloc() to
prevent this lockup.
Tested: Booted with softlockup_panic=1 hugepagesz=1G hugepages=3844 and
no softlockup is reported, and the hugepages are reported as
successfully setup.
Link: http://lkml.kernel.org/r/20180627214447.260804-1-cannonmatthews@google.com
Signed-off-by: Cannon Matthews <cannonmatthews@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Silence warnings (triggered at W=1) by adding relevant __printf attributes.
CC kernel/trace/trace.o
kernel/trace/trace.c: In function ‘__trace_array_vprintk’:
kernel/trace/trace.c:2979:2: warning: function might be possible candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
len = vscnprintf(tbuffer, TRACE_BUF_SIZE, fmt, args);
^~~
AR kernel/trace/built-in.o
Link: http://lkml.kernel.org/r/20180308205843.27447-1-malat@debian.org
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The comment in create_filter() states that the passed in filter pointer
(filterp) will either be NULL or contain an error message stating why the
filter failed. But it also expects the filter pointer to point to NULL when
passed in. If it is not, the function create_filter_start() will warn and
return an error message without updating the filter pointer. This is not
what the comment states.
As we always expect the pointer to point to NULL, if it is not, trigger a
WARN_ON(), set it to NULL, and then continue the path as the rest will work
as the comment states. Also update the comment to state it must point to
NULL.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
'err' is used as a NUL-terminated string, but using strncpy() with the length
equal to the buffer size may result in lack of the termination:
kernel/trace/trace_events_hist.c: In function 'hist_err_event':
kernel/trace/trace_events_hist.c:396:3: error: 'strncpy' specified bound 256 equals destination size [-Werror=stringop-truncation]
strncpy(err, var, MAX_FILTER_STR_VAL);
This changes it to use the safer strscpy() instead.
Link: http://lkml.kernel.org/r/20180328140920.2842153-1-arnd@arndb.de
Cc: stable@vger.kernel.org
Fixes: f404da6e1d ("tracing: Add 'last error' error facility for hist triggers")
Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Decrement the number of elements in the map in case the allocation
of a new node fails.
Fixes: 6c90598174 ("bpf: pre-allocate hash map elements")
Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The conversion from offsetof() calculations to sizeof()
wrongly behaved for missed exact size and in scenario with
more than one flow.
In such scenario we got "create flow failed, flow 10: 8 bytes
left from uverb cmd" error, which is wrong because the size of
kern_spec is exactly 8 bytes, and we were not supposed to fail.
Cc: <stable@vger.kernel.org> # 3.12
Fixes: 4fae7f1704 ("RDMA/uverbs: Fix slab-out-of-bounds in ib_uverbs_ex_create_flow")
Reported-by: Ran Rozenstein <ranro@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
My networking merge (commit 4e33d7d479: "Pull networking fixes from
David Miller") got the poll() handling conflict wrong for af_smc.
The conflict between my a11e1d432b ("Revert changes to convert to
->poll_mask() and aio IOCB_CMD_POLL") and Ursula Braun's 24ac3a08e6
("net/smc: rebuild nonblocking connect") should have left the call to
sock_poll_wait() in place, just without the socket lock release/retake.
And I really should have realized that. But happily, I at least asked
Ursula to double-check the merge, and she set me right.
This also fixes an incidental whitespace issue nearby that annoyed me
while looking at this.
Pointed-out-by: Ursula Braun <ubraun@linux.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
i.MX fixes for 4.18, round 2:
- A couple of imx defconfig updates selecting USB ULPI support to fix
a regression seen with USB driver, which is caused by commit
03e6275ae3 ("usb: chipidea: Fix ULPI on imx51").
- A fix on imx51-zii-rdu1 board touchscreen pinctrl setting, which
causes an interrupt storm.
* tag 'imx-fixes-4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: imx_v4_v5_defconfig: Select ULPI support
ARM: imx_v6_v7_defconfig: Select ULPI support
ARM: dts: imx51-zii-rdu1: fix touchscreen pinctrl
Signed-off-by: Olof Johansson <olof@lixom.net>
There are no legacy behavior in drivers to consider while attaching a
device to genpd - for the multiple PM domain case.
For that reason, let's instead require the driver to runtime resume the
device, via calling pm_runtime_get_sync() for example, when it needs to
power on the corresponding PM domain.
This allows us to improve the situation during attach. Instead of always
power on the PM domain, which may be unnecessary, let's leave it in its
current state. Additionally, to avoid the PM domain to stay powered on,
let's schedule a power off work.
Fixes: 3c095f32a9 (PM / Domains: Add support for multi PM domains ...)
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Kalle Valo says:
====================
wireless-drivers fixes for 4.18
First set of fixes for 4.18 and for numerous drivers. Something to mention
about is the wcn36xx fix which makes it possible to compile with gcc older than
4.4 (though I'm not sure if we even support those anymore).
qtnfmac
* coverity fix for a new commit in v4.18-rc1
rtlwifi
* fix kernel oops during driver removal
* fix firmware image corruption for rtl8821ae
brcmfmac
* fix crash if there's no firmware image
mwifiex
* a revert and a better fix for a new commit v4.18-rc1
mt7601u
* fix a recent regression about unnecessary warning about avg_rssi
wcn36xx
* convert testmode.c to plain ASCII
ath10k
* fix a firmware crash during bandwidth change
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Network core refuses to change mac address because flag
IFF_LIVE_ADDR_CHANGE isn't set. Set this missing flag.
Fixes: 1f7aa2bc26 ("r8169: simplify rtl_set_mac_address")
Reported-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code does not inspect the return value of skb_to_sgvec. This
can cause a nullptr kernel panic when the malformed sgvec is passed into
the crypto request.
Checking the return value of skb_to_sgvec and skipping decryption if it
is negative fixes this problem.
Fixes: c46234ebb4 ("tls: RX path for ktls")
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In file lib/rhashtable.c line 777, skip variable is assigned to
itself. The following error was observed:
lib/rhashtable.c:777:41: warning: explicitly assigning value of
variable of type 'int' to itself [-Wself-assign] error, forbidden
warning: rhashtable.c:777
This error was found when compiling with Clang 6.0. Change it to iter->skip.
Signed-off-by: Rishabh Bhatnagar <rishabhb@codeaurora.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change adds LOOP_SET_BLOCK_SIZE as one of the supported ioctls
in lo_compat_ioctl. It only takes an unsigned long argument, and
in practice a 32-bit value works fine.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Select CONFIG_USB_CHIPIDEA_ULPI and CONFIG_USB_ULPI_BUS so that
USB ULPI can be functional on some boards like that use ULPI
interface.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Select CONFIG_USB_CHIPIDEA_ULPI and CONFIG_USB_ULPI_BUS so that
USB ULPI can be functional on some boards like imx51-babbge.
This fixes a kernel hang in 4.18-rc1 on i.mx51-babbage, caused by commit
03e6275ae3 ("usb: chipidea: Fix ULPI on imx51").
Suggested-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
This fixes regression introduced by
commit 8d52af6795 ("mei: speed up the power down flow")
In power down or suspend flow a message can still be received
from the FW because the clients fake disconnection.
In normal case we interpret messages w/o destination as corrupted
and link reset is performed in order to clean the channel,
but during power down link reset is already in progress resulting
in endless loop. To resolve the issue under power down flow we
discard messages silently.
Cc: <stable@vger.kernel.org> 4.16+
Fixes: 8d52af6795 ("mei: speed up the power down flow")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199541
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Embarrassingly, the recent fix introduced worse problem than it solved,
causing the balloon not to inflate. The VM informed the hypervisor that
the pages for lock/unlock are sitting in the wrong address, as it used
the page that is used the uninitialized page variable.
Fixes: b23220fe05 ("vmw_balloon: fixing double free when batching mode is off")
Cc: stable@vger.kernel.org
Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The touch sensors on the 2nd-gen Intuos tablets don't use a 4096x4096
sensor like other similar tablets (3rd-gen Bamboo, Intuos5, etc.).
The incorrect maximum XY values don't normally affect userspace since
touch input from these devices is typically relative rather than
absolute. It does, however, cause problems when absolute distances
need to be measured, e.g. for gesture recognition. Since the resolution
of the touch sensor on these devices is 10 units / mm (versus 100 for
the pen sensor), the proper maximum values can be calculated by simply
dividing by 10.
Fixes: b5fd2a3e92 ("Input: wacom - add support for three new Intuos devices")
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
As part of hw reconfig, only stations linked to AP interfaces are added
back to the driver ignoring those which are tied to AP_VLAN interfaces.
It is true that there could be stations tied to the AP_VLAN interface while
serving 4addr clients or when using AP_VLAN for VLAN operations; we should
be adding these stations back to the driver as part of hw reconfig, failing
to do so can cause functional issues.
In the case of ath10k driver, the following errors were observed.
ath10k_pci : failed to install key for non-existent peer XX:XX:XX:XX:XX:XX
Workqueue: events_freezable ieee80211_restart_work [mac80211]
(unwind_backtrace) from (show_stack+0x10/0x14)
(show_stack) (dump_stack+0x80/0xa0)
(dump_stack) (warn_slowpath_common+0x68/0x8c)
(warn_slowpath_common) (warn_slowpath_null+0x18/0x20)
(warn_slowpath_null) (ieee80211_enable_keys+0x88/0x154 [mac80211])
(ieee80211_enable_keys) (ieee80211_reconfig+0xc90/0x19c8 [mac80211])
(ieee80211_reconfig]) (ieee80211_restart_work+0x8c/0xa0 [mac80211])
(ieee80211_restart_work) (process_one_work+0x284/0x488)
(process_one_work) (worker_thread+0x228/0x360)
(worker_thread) (kthread+0xd8/0xec)
(kthread) (ret_from_fork+0x14/0x24)
Also while bringing down the AP VAP, WARN_ONs and errors related to peer
removal were observed.
ath10k_pci : failed to clear all peer wep keys for vdev 0: -2
ath10k_pci : failed to disassociate station: 8c:fd:f0:0a:8c:f5 vdev 0: -2
(unwind_backtrace) (show_stack+0x10/0x14)
(show_stack) (dump_stack+0x80/0xa0)
(dump_stack) (warn_slowpath_common+0x68/0x8c)
(warn_slowpath_common) (warn_slowpath_null+0x18/0x20)
(warn_slowpath_null) (sta_set_sinfo+0xb98/0xc9c [mac80211])
(sta_set_sinfo [mac80211]) (__sta_info_flush+0xf0/0x134 [mac80211])
(__sta_info_flush [mac80211]) (ieee80211_stop_ap+0xe8/0x390 [mac80211])
(ieee80211_stop_ap [mac80211]) (__cfg80211_stop_ap+0xe0/0x3dc [cfg80211])
(__cfg80211_stop_ap [cfg80211]) (cfg80211_stop_ap+0x30/0x44 [cfg80211])
(cfg80211_stop_ap [cfg80211]) (genl_rcv_msg+0x274/0x30c)
(genl_rcv_msg) (netlink_rcv_skb+0x58/0xac)
(netlink_rcv_skb) (genl_rcv+0x20/0x34)
(genl_rcv) (netlink_unicast+0x11c/0x204)
(netlink_unicast) (netlink_sendmsg+0x30c/0x370)
(netlink_sendmsg) (sock_sendmsg+0x70/0x84)
(sock_sendmsg) (___sys_sendmsg.part.3+0x188/0x228)
(___sys_sendmsg.part.3) (__sys_sendmsg+0x4c/0x70)
(__sys_sendmsg) (ret_fast_syscall+0x0/0x44)
These issues got fixed by adding the stations which are
tied to AP_VLANs back to the driver.
Signed-off-by: Manikanta Pubbisetty <mpubbise@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Card write threshold control is supposed to be set since controller
version 2.80a for data write in HS400 mode and data read in
HS200/HS400/SDR104 mode. However the current code returns without
configuring it in the case of data writing in HS400 mode.
Meanwhile the patch fixes that the current code goes to
'disable' when doing data reading in HS400 mode.
Fixes: 7e4bf1bc95 ("mmc: dw_mmc: add the card write threshold for HS400 mode")
Signed-off-by: Qing Xia <xiaqing17@hisilicon.com>
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Sometimes when writing large size files to flash in direct/memory mapped
mode, it is seen that flash write enable command times out with error:
[ 503.146293] cadence-qspi 47040000.ospi: Flash command execution timed out.
This is because, we need to make sure previous direct write operation
is complete by polling for IDLE bit in CONFIG_REG before starting the
next operation.
Fix this by polling for IDLE bit after memory mapped write.
Fixes: a27f2eaf2b ("mtd: spi-nor: cadence-quadspi: Add support for direct access mode")
Cc: stable@vger.kernel.org
Signed-off-by: Vignesh R <vigneshr@ti.com>
Reviewed-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
A previous patch removed OMAP clock aliases that were perceived
to be unnecessary. Unfortunately, it broke the ethernet on the
am3517-evm. This patch enables the MDIO clock and EMAC clock.
Fixes: 0ed266d7ae ("clk: ti: omap3: cleanup unnecessary clock aliases")
Cc: stable@vger.kernel.org #4.16+
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Gaurav reports that commit:
85f1abe001 ("kthread, sched/wait: Fix kthread_parkme() completion issue")
isn't working for him. Because of the following race:
> controller Thread CPUHP Thread
> takedown_cpu
> kthread_park
> kthread_parkme
> Set KTHREAD_SHOULD_PARK
> smpboot_thread_fn
> set Task interruptible
>
>
> wake_up_process
> if (!(p->state & state))
> goto out;
>
> Kthread_parkme
> SET TASK_PARKED
> schedule
> raw_spin_lock(&rq->lock)
> ttwu_remote
> waiting for __task_rq_lock
> context_switch
>
> finish_lock_switch
>
>
>
> Case TASK_PARKED
> kthread_park_complete
>
>
> SET Running
Furthermore, Oleg noticed that the whole scheduler TASK_PARKED
handling is buggered because the TASK_DEAD thing is done with
preemption disabled, the current code can still complete early on
preemption :/
So basically revert that earlier fix and go with a variant of the
alternative mentioned in the commit. Promote TASK_PARKED to special
state to avoid the store-store issue on task->state leading to the
WARN in kthread_unpark() -> __kthread_bind().
But in addition, add wait_task_inactive() to kthread_park() to ensure
the task really is PARKED when we return from kthread_park(). This
avoids the whole kthread still gets migrated nonsense -- although it
would be really good to get this done differently.
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 85f1abe001 ("kthread, sched/wait: Fix kthread_parkme() completion issue")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a cfs_rq is throttled, parent cfs_rq->nr_running is decreased and
everything happens at cfs_rq level. Currently util_est stays unchanged
in such case and it keeps accounting the utilization of throttled tasks.
This can somewhat make sense as we don't dequeue tasks but only throttled
cfs_rq.
If a task of another group is enqueued/dequeued and root cfs_rq becomes
idle during the dequeue, util_est will be cleared whereas it was
accounting util_est of throttled tasks before. So the behavior of util_est
is not always the same regarding throttled tasks and depends of side
activity. Furthermore, util_est will not be updated when the cfs_rq is
unthrottled as everything happens at cfs_rq level. Main results is that
util_est will stay null whereas we now have running tasks. We have to wait
for the next dequeue/enqueue of the previously throttled tasks to get an
up to date util_est.
Remove the assumption that cfs_rq's estimated utilization of a CPU is 0
if there is no running task so the util_est of a task remains until the
latter is dequeued even if its cfs_rq has been throttled.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 7f65ea42eb ("sched/fair: Add util_est on top of PELT")
Link: http://lkml.kernel.org/r/1528972380-16268-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I noticed that cgroup task groups constantly get throttled even
if they have low CPU usage, this causes some jitters on the response
time to some of our business containers when enabling CPU quotas.
It's very simple to reproduce:
mkdir /sys/fs/cgroup/cpu/test
cd /sys/fs/cgroup/cpu/test
echo 100000 > cpu.cfs_quota_us
echo $$ > tasks
then repeat:
cat cpu.stat | grep nr_throttled # nr_throttled will increase steadily
After some analysis, we found that cfs_rq::runtime_remaining will
be cleared by expire_cfs_rq_runtime() due to two equal but stale
"cfs_{b|q}->runtime_expires" after period timer is re-armed.
The current condition to judge clock drift in expire_cfs_rq_runtime()
is wrong, the two runtime_expires are actually the same when clock
drift happens, so this condtion can never hit. The orginal design was
correctly done by this commit:
a9cf55b286 ("sched: Expire invalid runtime")
... but was changed to be the current implementation due to its locking bug.
This patch introduces another way, it adds a new field in both structures
cfs_rq and cfs_bandwidth to record the expiration update sequence, and
uses them to figure out if clock drift happens (true if they are equal).
Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ben Segall <bsegall@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 51f2176d74 ("sched/fair: Fix unlocked reads of some cfs_b->quota/period")
Link: http://lkml.kernel.org/r/20180620101834.24455-1-xlpang@linux.alibaba.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With commit:
8f111bc357 ("cpufreq/schedutil: Rewrite CPUFREQ_RT support")
the schedutil governor uses rq->rt.rt_nr_running to detect whether an
RT task is currently running on the CPU and to set frequency to max
if necessary.
cpufreq_update_util() is called in enqueue/dequeue_top_rt_rq() but
rq->rt.rt_nr_running has not been updated yet when dequeue_top_rt_rq() is
called so schedutil still considers that an RT task is running when the
last task is dequeued. The update of rq->rt.rt_nr_running happens later
in dequeue_rt_stack().
In fact, we can take advantage of the sequence that the dequeue then
re-enqueue rt entities when a rt task is enqueued or dequeued;
As a result enqueue_top_rt_rq() is always called when a task is
enqueued or dequeued and also when groups are throttled or unthrottled.
The only place that not use enqueue_top_rt_rq() is when root rt_rq is
throttled.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: efault@gmx.de
Cc: juri.lelli@redhat.com
Cc: patrick.bellasi@arm.com
Cc: viresh.kumar@linaro.org
Fixes: 8f111bc357 ('cpufreq/schedutil: Rewrite CPUFREQ_RT support')
Link: http://lkml.kernel.org/r/1530021202-21695-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some people have reported that the warning in sched_tick_remote()
occasionally triggers, especially in favour of some RCU-Torture
pressure:
WARNING: CPU: 11 PID: 906 at kernel/sched/core.c:3138 sched_tick_remote+0xb6/0xc0
Modules linked in:
CPU: 11 PID: 906 Comm: kworker/u32:3 Not tainted 4.18.0-rc2+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Workqueue: events_unbound sched_tick_remote
RIP: 0010:sched_tick_remote+0xb6/0xc0
Code: e8 0f 06 b8 00 c6 03 00 fb eb 9d 8b 43 04 85 c0 75 8d 48 8b 83 e0 0a 00 00 48 85 c0 75 81 eb 88 48 89 df e8 bc fe ff ff eb aa <0f> 0b eb
+c5 66 0f 1f 44 00 00 bf 17 00 00 00 e8 b6 2e fe ff 0f b6
Call Trace:
process_one_work+0x1df/0x3b0
worker_thread+0x44/0x3d0
kthread+0xf3/0x130
? set_worker_desc+0xb0/0xb0
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x35/0x40
This happens when the remote tick applies on an idle task. Usually the
idle_cpu() check avoids that, but it is performed before we lock the
runqueue and it is therefore racy. It was intended to be that way in
order to prevent from useless runqueue locks since idle task tick
callback is a no-op.
Now if the racy check slips out of our hands and we end up remotely
ticking an idle task, the empty task_tick_idle() is harmless. Still
it won't pass the WARN_ON_ONCE() test that ensures rq_clock_task() is
not too far from curr->se.exec_start because update_curr_idle() doesn't
update the exec_start value like other scheduler policies. Hence the
reported false positive.
So let's have another check, while the rq is locked, to make sure we
don't remote tick on an idle task. The lockless idle_cpu() still applies
to avoid unecessary rq lock contention.
Reported-by: Jacek Tomaka <jacekt@dug.com>
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1530203381-31234-1-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mainline Commit b74c2b21e1 added the pinmux
settings for mmc1, however this pin (0x9a0) is routed to P9_42 on the cape
header. Thus any BeagleBone cape that utilizes P9_42 triggers mmc0's Write
Protect.
Fixes: b74c2b21e1 ("ARM: dts: am33xx: Add pinmux data for mmc1 in
am335x-evm, evmsk and beaglebone")
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
CC: Faiz Abbas <faiz_abbas@ti.com>
CC: Tony Lindgren <tony@atomide.com>
CC: Jason Kridner <jkridner@beagleboard.org>
CC: Drew Fustini <drew@beagleboard.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The checking code is done in kmap_atomic() so that we don't need to
check it in update_mmu_cache() again. There is no need to implement
it for cache aliasing or cache non-aliasing versions. We can just
implement one version for both.
Signed-off-by: Greentime Hu <greentime@andestech.com>
commit bfd694d5e2 ("mmc: core: Add tunable delay
before detecting card after card is inserted") adds
"u32 cd_debounce_delay_ms" to the last of mmc_gpio
struct and cause "char cd_label[0]" NOT work as string
pointer of card detect label, when "cat /proc/interrupts",
the devname for card detect gpio is incorrect as below:
144: 0 gpio-mxc 22 Edge ▒
161: 0 gpio-mxc 7 Edge ▒
Move the cd_label field down to fix this, and drop the
zero from the array size to prevent future similar bugs,
the result is correct as below:
144: 0 gpio-mxc 22 Edge 2198000.mmc cd
161: 0 gpio-mxc 7 Edge 2190000.mmc cd
Fixes: bfd694d5e2 ("mmc: core: Add tunable delay before detecting card after card is inserted")
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
We found that the original implementation will only use the built-in dtb
pointer instead of the pointer pass from bootloader. This bug is fixed
by this patch.
Signed-off-by: Greentime Hu <greentime@andestech.com>
data cache.
This issue is found by Guo Ren. Based on the Documentation/core-api/cachetlb.rst
and it says:
"Any necessary cache flushing or other coherency operations
that need to occur should happen here. If the processor's
instruction cache does not snoop cpu stores, it is very
likely that you will need to flush the instruction cache
for copy_to_user_page()."
"If the icache does not snoop stores then this
routine(flush_icache_range) will need to flush it."
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Signed-off-by: Greentime Hu <greentime@andestech.com>
Magnus Karlsson says:
====================
This patch set fixes three bugs in the SKB TX path of AF_XDP.
Details in the individual commits.
The structure of the patch set is as follows:
Patch 1: Fix for lost completion message
Patch 2-3: Fix for possible multiple completions of single packet
Patch 4: Fix potential race during error
Changes from v1:
* Added explanation of race in commit message of patch 4.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
There is a potential race in the TX completion code for the SKB
case. One process enters the sendmsg code of an AF_XDP socket in order
to send a frame. The execution eventually trickles down to the driver
that is told to send the packet. However, it decides to drop the
packet due to some error condition (e.g., rings full) and frees the
SKB. This will trigger the SKB destructor and a completion will be
sent to the AF_XDP user space through its
single-producer/single-consumer queues.
At the same time a TX interrupt has fired on another core and it
dispatches the TX completion code in the driver. It does its HW
specific things and ends up freeing the SKB associated with the
transmitted packet. This will trigger the SKB destructor and a
completion will be sent to the AF_XDP user space through its
single-producer/single-consumer queues. With a pseudo call stack, it
would look like this:
Core 1:
sendmsg() being called in the application
netdev_start_xmit()
Driver entered through ndo_start_xmit
Driver decides to free the SKB for some reason (e.g., rings full)
Destructor of SKB called
xskq_produce_addr() is called to signal completion to user space
Core 2:
TX completion irq
NAPI loop
Driver irq handler for TX completions
Frees the SKB
Destructor of SKB called
xskq_produce_addr() is called to signal completion to user space
We now have a violation of the single-producer/single-consumer
principle for our queues as there are two threads trying to produce at
the same time on the same queue.
Fixed by introducing a spin_lock in the destructor. In regards to the
performance, I get around 1.74 Mpps for txonly before and after the
introduction of the spinlock. There is of course some impact due to
the spin lock but it is in the less significant digits that are too
noisy for me to measure. But let us say that the version without the
spin lock got 1.745 Mpps in the best case and the version with 1.735
Mpps in the worst case, then that would mean a maximum drop in
performance of 0.5%.
Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Sendmsg in the SKB path of AF_XDP can now return EBUSY when a packet
was discarded and completed by the driver. Just ignore this message
in the sample application.
Fixes: b4b8faa1de ("samples/bpf: sample application and documentation for AF_XDP sockets")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: Pavel Odintsov <pavel@fastnetmon.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fixed a bug in which a frame could be completed more than once
when an error was returned from dev_direct_xmit(). The code
erroneously retried sending the message leading to multiple
calls to the SKB destructor and therefore multiple completions
of the same buffer to user space.
The error code in this case has been changed from EAGAIN to EBUSY
in order to tell user space that the sending of the packet failed
and the buffer has been return to user space through the completion
queue.
Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: Pavel Odintsov <pavel@fastnetmon.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The code in xskq_produce_addr erroneously checked if there
was up to LAZY_UPDATE_THRESHOLD amount of space in the completion
queue. It only needs to check if there is one slot left in the
queue. This bug could under some circumstances lead to a WARN_ON_ONCE
being triggered and the completion message to user space being lost.
Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: Pavel Odintsov <pavel@fastnetmon.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch attempts to close a hole leading to a BUG seen with hot
removals during writes [1].
A block device (NVME namespace in this test case) is formatted to EXT4
without partitions. It's mounted and write I/O is run to a file, then
the device is hot removed from the slot. The superblock attempts to be
written to the drive which is no longer present.
The typical chain of events leading to the BUG:
ext4_commit_super()
__sync_dirty_buffer()
submit_bh()
submit_bh_wbc()
BUG_ON(!buffer_mapped(bh));
This fix checks for the superblock's buffer head being mapped prior to
syncing.
[1] https://www.spinics.net/lists/linux-ext4/msg56527.html
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Currently, we use the ACPI processor ID only for the leaf/processor nodes
as the specification states it must match the value of the ACPI processor
ID field in the processor’s entry in the MADT.
However, if a PPTT structure represents a processors group, it
matches a processor container UID in the namespace and the
ACPI_PPTT_ACPI_PROCESSOR_ID_VALID flag indicates whether the
ACPI processor ID is valid.
Let's use UID whenever ACPI_PPTT_ACPI_PROCESSOR_ID_VALID is set to be
consistent instead of using table offset as it's currently done for
non-leaf nodes.
Fixes: 2bd00bcd73 (ACPI/PPTT: Add Processor Properties Topology Table parsing)
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Jeremy Linton <jeremy.linton@arm.com>
[ rjw: Changelog (minor) ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.
Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.
This simplifies the dependencies, and allows to improve compile-testing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Add an optional parameter "start_sector" to allow the start of the
device to be offset by the specified number of 512-byte sectors. The
sectors below this offset are not used by the writecache device and are
left to be used for disk labels and/or userspace metadata (e.g. lvm).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Pull MD fixes from Shaohua Li:
"Two small fixes for MD:
- an error handling fix from me
- a recover bug fix for raid10 from BingJing"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md/raid10: fix that replacement cannot complete recovery after reassemble
MD: cleanup resources in failure
Pull OpenRISC fixes from Stafford Horne:
"Two fixes for issues which were breaking OpenRISC boot:
- Fix bug in __pte_free_tlb() exposed in 4.18 by Matthew Wilcox's
page table flag addition.
- Fix issue booting on real hardware if delay slot detection
emulation is disabled"
* tag 'for-linus' of git://github.com/stffrdhrn/linux:
openrisc: entry: Fix delay slot exception detection
openrisc: Call destructor during __pte_free_tlb
new_active_crtcs is a bitmask, new_active_crtc_count is the
actual count.
Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull networking fixes from David Miller:
1) Verify netlink attributes properly in nf_queue, from Eric Dumazet.
2) Need to bump memory lock rlimit for test_sockmap bpf test, from
Yonghong Song.
3) Fix VLAN handling in lan78xx driver, from Dave Stevenson.
4) Fix uninitialized read in nf_log, from Jann Horn.
5) Fix raw command length parsing in mlx5, from Alex Vesker.
6) Cleanup loopback RDS connections upon netns deletion, from Sowmini
Varadhan.
7) Fix regressions in FIB rule matching during create, from Jason A.
Donenfeld and Roopa Prabhu.
8) Fix mpls ether type detection in nfp, from Pieter Jansen van Vuuren.
9) More bpfilter build fixes/adjustments from Masahiro Yamada.
10) Fix XDP_{TX,REDIRECT} flushing in various drivers, from Jesper
Dangaard Brouer.
11) fib_tests.sh file permissions were broken, from Shuah Khan.
12) Make sure BH/preemption is disabled in data path of mac80211, from
Denis Kenzior.
13) Don't ignore nla_parse_nested() return values in nl80211, from
Johannes berg.
14) Properly account sock objects ot kmemcg, from Shakeel Butt.
15) Adjustments to setting bpf program permissions to read-only, from
Daniel Borkmann.
16) TCP Fast Open key endianness was broken, it always took on the host
endiannness. Whoops. Explicitly make it little endian. From Yuching
Cheng.
17) Fix prefix route setting for link local addresses in ipv6, from
David Ahern.
18) Potential Spectre v1 in zatm driver, from Gustavo A. R. Silva.
19) Various bpf sockmap fixes, from John Fastabend.
20) Use after free for GRO with ESP, from Sabrina Dubroca.
21) Passing bogus flags to crypto_alloc_shash() in ipv6 SR code, from
Eric Biggers.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
qede: Adverstise software timestamp caps when PHC is not available.
qed: Fix use of incorrect size in memcpy call.
qed: Fix setting of incorrect eswitch mode.
qed: Limit msix vectors in kdump kernel to the minimum required count.
ipvlan: call dev_change_flags when ipvlan mode is reset
ipv6: sr: fix passing wrong flags to crypto_alloc_shash()
net: fix use-after-free in GRO with ESP
tcp: prevent bogus FRTO undos with non-SACK flows
bpf: sockhash, add release routine
bpf: sockhash fix omitted bucket lock in sock_close
bpf: sockmap, fix smap_list_map_remove when psock is in many maps
bpf: sockmap, fix crash when ipv6 sock is added
net: fib_rules: bring back rule_exists to match rule during add
hv_netvsc: split sub-channel setup into async and sync
net: use dev_change_tx_queue_len() for SIOCSIFTXQLEN
atm: zatm: Fix potential Spectre v1
s390/qeth: consistently re-enable device features
s390/qeth: don't clobber buffer on async TX completion
s390/qeth: avoid using is_multicast_ether_addr_64bits on (u8 *)[6]
s390/qeth: fix race when setting MAC address
...
The block (LBA) specified must not exceed the last addressable LBA,
which is dev->nr_sectors - 1. So fix the correct check is
"if (block >= dev->n_sectors)" and not "if (block > dev->n_sectords)".
Additionally, the asc/ascq to return for an LBA that is not a zone start
LBA should be ILLEGAL REQUEST, regardless if the bad LBA is out of
range.
Reported-by: David Butterfield <david.butterfield@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Commit f2a8aa053c ("typec: tcpm: Represent source supply through
power_supply") moved the code to register a power_supply representing
the device supplying power to the type-C connector, from the fusb302
code to the generic tcpm code.
This has caused the power-supply registered by the fusb302 driver,
which determines how much current the bq24190 can draw, to change name
from "fusb302-typec-source" to "tcpm-source-psy-i2c-fusb302".
Fixes: f2a8aa053c ("typec: tcpm: Represent source supply through...")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit f2a8aa053c ("typec: tcpm: Represent source supply through
power_supply") moved the code to register a power_supply representing
the device supplying power to the type-C connector, from the fusb302
code to the generic tcpm code so that we have a psy reporting the
supply voltage and current for all tcpm devices.
This broke the reporting of current and voltage through the psy interface
when supplied by a a non pd supply (5V, current as reported by
get_current_limit). The cause of this breakage is port->supply_voltage
and port->current_limit not being set in that case.
This commit fixes this by setting port->supply_voltage and
port->current_limit from tcpm_set_current_limit().
This commit also removes setting supply_voltage and current_limit
from tcpm_reset_port() as that calls tcpm_set_current_limit(0, 0)
which now already sets these to 0.
Fixes: f2a8aa053c ("typec: tcpm: Represent source supply through...")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pm_runtime_put_sync() gets called everytime in xhci_dbc_stop().
If dbc is not started, this makes the runtime PM counter incorrectly
becomes 0, and calls autosuspend function. Then we'll keep seeing this:
[54664.762220] xhci_hcd 0000:00:14.0: Root hub is not suspended
So only calls pm_runtime_put_sync() when dbc was started.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There have been several reports of LPM related hard freezes about once
a day on multiple Lenovo 50 series models. Strange enough these reports
where not disk model specific as LPM issues usually are and some users
with the exact same disk + laptop where seeing them while other users
where not seeing these issues.
It turns out that enabling LPM triggers a firmware bug somewhere, which
has been fixed in later BIOS versions.
This commit adds a new ahci_broken_lpm() function and a new ATA_FLAG_NO_LPM
for dealing with this.
The ahci_broken_lpm() function contains DMI match info for the 4 models
which are known to be affected by this and the DMI BIOS date field for
known good BIOS versions. If the BIOS date is older then the one in the
table LPM will be disabled and a warning will be printed.
Note the BIOS dates are for known good versions, some older versions may
work too, but we don't know for sure, the table is using dates from BIOS
versions for which users have confirmed that upgrading to that version
makes the problem go away.
Unfortunately I've been unable to get hold of the reporter who reported
that BIOS version 2.35 fixed the problems on the W541 for him. I've been
able to verify the DMI_SYS_VENDOR and DMI_PRODUCT_VERSION from an older
dmidecode, but I don't know the exact BIOS date as reported in the DMI.
Lenovo keeps a changelog with dates in their release notes, but the
dates there are the release dates not the build dates which are in DMI.
So I've chosen to set the date to which we compare to one day past the
release date of the 2.34 BIOS. I plan to fix this with a follow up
commit once I've the necessary info.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Pointers sdev0 and sdev1 are being assigned but are never used hence they
are redundant and can be removed.
Cleans up clang warnings:
warning: variable 'sdev0' set but not used [-Wunused-but-set-variable]
warning: variable 'sdev1' set but not used [-Wunused-but-set-variable]
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
We have
struct drbd_requests { ... struct bio *private_bio; ... }
to hold a bio clone for local submission.
On local IO completion, we put that bio, and in case we want to use the
result later, we overload that member to hold the ERR_PTR() of the
completion result,
Which, before v4.3, used to be the passed in "int error",
so we could first bio_put(), then assign.
v4.3-rc1~100^2~21 4246a0b63b block: add a bi_error field to struct bio
changed that:
bio_put(req->private_bio);
- req->private_bio = ERR_PTR(error);
+ req->private_bio = ERR_PTR(bio->bi_error);
Which introduces an access after free,
because it was non obvious that req->private_bio == bio.
Impact of that was mostly unnoticable, because we only use that value
in a multiple-failure case, and even then map any "unexpected" error
code to EIO, so worst case we could potentially mask a more specific
error with EIO in a multiple failure case.
Unless the pointed to memory region was unmapped, as is the case with
CONFIG_DEBUG_PAGEALLOC, in which case this results in
BUG: unable to handle kernel paging request
v4.13-rc1~70^2~75 4e4cbee93d block: switch bios to blk_status_t
changes it further to
bio_put(req->private_bio);
req->private_bio = ERR_PTR(blk_status_to_errno(bio->bi_status));
And blk_status_to_errno() now contains a WARN_ON_ONCE() for unexpected
values, which catches this "sometimes", if the memory has been reused
quickly enough for other things.
Should also go into stable since 4.3, with the trivial change around 4.13.
Cc: stable@vger.kernel.org
Fixes: 4246a0b63b block: add a bi_error field to struct bio
Reported-by: Sarah Newman <srn@prgmr.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Obtaining the runtime pm wakeref can fail, especially in a hotplug
scenario where i915.ko has been unloaded. If we do not catch the
failure, we end up with an unbalanced pm.
v2 additions by tiwai:
hdmi_present_sense() checks the return value and handle only a
negative error case and bails out only if it's really still suspended.
Also, snd_hda_power_down() is called at the error path so that the
refcount is balanced.
Along with it, the spec->pcm_lock is taken outside
hdmi_present_sense() in the caller side, so that it won't cause
deadlock at reentrace via runtime resume.
v3 fix by tiwai:
Missing linux/pm_runtime.h is included.
References: 222bde0388 ("ALSA: hda - Fix mutex deadlock at HDMI/DP hotplug")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This patch is fixes an issue that the SDHI_INTERNAL_DMAC_RX_IN_USE
flag cannot be cleared because tmio_mmc_core sets the host->data
to NULL before the tmio_mmc_core calls tmio_mmc_abort_dma().
So, this patch clears the SDHI_INTERNAL_DMAC_RX_IN_USE in
the renesas_sdhi_internal_dmac_abort_dma() anyway. This doesn't
cause any side effects.
Fixes: 0cbc94daa5 ("mmc: renesas_sdhi_internal_dmac: limit DMA RX for old SoCs")
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Sudarsana Reddy Kalluru says:
====================
qed*: Fix series.
The patch series addresses few issues in the qed* drivers.
Please consider applying it to 'net' branch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When ptp clock is not available for a PF (e.g., higher PFs in NPAR mode),
get-tsinfo() callback should return the software timestamp capabilities
instead of returning the error.
Fixes: 4c55215c ("qede: Add driver support for PTP")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By default, driver sets the eswitch mode incorrectly as VEB (virtual
Ethernet bridging).
Need to set VEB eswitch mode only when sriov is enabled, and it should be
to set NONE by default. The patch incorporates this change.
Fixes: 0fefbfbaa ("qed*: Management firmware - notifications and defaults")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Memory size is limited in the kdump kernel environment. Allocation of more
msix-vectors (or queues) consumes few tens of MBs of memory, which might
lead to the kdump kernel failure.
This patch adds changes to limit the number of MSI-X vectors in kdump
kernel to minimum required value (i.e., 2 per engine).
Fixes: fe56b9e6a ("qed: Add module with basic common support")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After we change the ipvlan mode from l3 to l2, or vice versa, we only
reset IFF_NOARP flag, but don't flush the ARP table cache, which will
cause eth->h_dest to be equal to eth->h_source in ipvlan_xmit_mode_l2().
Then the message will not come out of host.
Here is the reproducer on local host:
ip link set eth1 up
ip addr add 192.168.1.1/24 dev eth1
ip link add link eth1 ipvlan1 type ipvlan mode l3
ip netns add net1
ip link set ipvlan1 netns net1
ip netns exec net1 ip link set ipvlan1 up
ip netns exec net1 ip addr add 192.168.2.1/24 dev ipvlan1
ip route add 192.168.2.0/24 via 192.168.1.2
ping 192.168.2.2 -c 2
ip netns exec net1 ip link set ipvlan1 type ipvlan mode l2
ping 192.168.2.2 -c 2
Add the same configuration on remote host. After we set the mode to l2,
we could find that the src/dst MAC addresses are the same on eth1:
21:26:06.648565 00:b7:13:ad:d3:05 > 00:b7:13:ad:d3:05, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 58356, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.2.1 > 192.168.2.2: ICMP echo request, id 22686, seq 1, length 64
Fix this by calling dev_change_flags(), which will call netdevice notifier
with flag change info.
v2:
a) As pointed out by Wang Cong, check return value for dev_change_flags() when
change dev flags.
b) As suggested by Stefano and Sabrina, move flags setting before l3mdev_ops.
So we don't need to redo ipvlan_{, un}register_nf_hook() again in err path.
Reported-by: Jianlin Shi <jishi@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Fixes: 2ad7bf3638 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'mask' argument to crypto_alloc_shash() uses the CRYPTO_ALG_* flags,
not 'gfp_t'. So don't pass GFP_KERNEL to it.
Fixes: bf355b8d2c ("ipv6: sr: add core files for SR HMAC support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the addition of GRO for ESP, gro_receive can consume the skb and
return -EINPROGRESS. In that case, the lower layer GRO handler cannot
touch the skb anymore.
Commit 5f114163f2 ("net: Add a skb_gro_flush_final helper.") converted
some of the gro_receive handlers that can lead to ESP's gro_receive so
that they wouldn't access the skb when -EINPROGRESS is returned, but
missed other spots, mainly in tunneling protocols.
This patch finishes the conversion to using skb_gro_flush_final(), and
adds a new helper, skb_gro_flush_final_remcsum(), used in VXLAN and
GUE.
Fixes: 5f114163f2 ("net: Add a skb_gro_flush_final helper.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adjusts the allocator calls to use 2-factor argument call style, as
done treewide already for improved defense against allocation overflows.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Disable the metastability workaround for USB2. The original
patch disabled the workaround on the wrong USB port.
Fixes: b8c9c6fa20 ("ARM: dts: dra7: Disable USB metastability workaround for USB2")
Cc: <stable@vger.kernel.org> [4.16+]
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
s390 no longer uses the _mapcount field in struct page to identify
the page table format being used. While the code was diligent in handling
the different mappings, it neglected to turn "off" the map bits when
alloc_pgste was being used. This resulted in bits remaining "on" in the
_refcount field, and thus an artifically huge "in use" count that prevents
the pages from actually being released by __free_page.
There's opportunity for improvement in the "1 vs 3" vs "1U vs 3U" vs
"0x1 vs 0x11" etc. variations for all these calls, I am just keeping
things simple compared to neighboring code.
Fixes: 620b4e9031 ("s390: use _refcount for pgtables")
Reported-by: Halil Pasic <pasic@linux.ibm.com>
Bisected-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reduce the default values for the number of hardware queues and queue depth
to significantly reduce the memory footprint of a DASD device.
The memory consumption per DASD device reduces from approximately 40MB to
approximately 1.5MB.
This is necessary to build systems with a large number of DASD devices and
a reasonable amount of memory.
Performance measurements showed that good performance results are possible
with the new default values even on systems with lots of CPUs and lots of
alias devices.
Fixes: e443343e50 ("s390/dasd: blk-mq conversion")
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Since the following commit:
cd77849a69 ("objtool: Fix GCC 8 cold subfunction detection for aliased functions")
... if the kernel is built with EXTRA_CFLAGS='-fno-reorder-functions',
objtool can get stuck in an infinite loop.
That flag causes the new GCC 8 cold subfunctions to be placed in .text
instead of .text.unlikely. But it also has an unfortunate quirk: in the
symbol table, the subfunction (e.g., nmi_panic.cold.7) is nested inside
the parent (nmi_panic).
That function overlap confuses objtool, and causes it to get into an
infinite loop in next_insn_same_func(). Here's Allan's description of
the loop:
"Objtool iterates through the instructions in nmi_panic using
next_insn_same_func. Once it reaches the end of nmi_panic at 0x534 it
jumps to 0x528 as that's the start of nmi_panic.cold.7. However, since
the instructions starting at 0x528 are still associated with nmi_panic
objtool will get stuck in a loop, continually jumping back to 0x528
after reaching 0x534."
Fix it by shortening the length of the parent function so that the
functions no longer overlap.
Reported-and-analyzed-by: Allan Xavier <allan.x.xavier@oracle.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Allan Xavier <allan.x.xavier@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9e704c52bee651129b036be14feda317ae5606ae.1530136978.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
changed gvt display transcode DDI mode from DP_SST to
DVI to address below calltrace issue during guest booting
up which is caused by zero dotclock initial value with DP_SST
mode. transcode DVI mode emulation also align with native with DP
connection.
[drm:drm_calc_timestamping_constants]
ERROR crtc 41: Can't calculate constants, dotclock = 0!
WARNING: at drivers/gpu/drm/drm_vblank.c:620
drm_calc_vbltimestamp_from_scanoutpos
Call Trace:
? drm_calc_timestamping_constants+0x144/0x150 [drm]
drm_get_last_vbltimestamp+0x54/0x90 [drm]
drm_reset_vblank_timestamp+0x59/0xd0 [drm]
drm_crtc_vblank_on+0x7b/0xd0 [drm]
intel_modeset_setup_hw_state+0xb67/0xfd0 [i915]
? gen2_read32+0x110/0x110 [i915]
? drm_modeset_lock+0x30/0xa0 [drm]
intel_modeset_init+0x794/0x19d0 [i915]
? intel_setup_gmbus+0x232/0x2e0 [i915]
i915_driver_load+0xb4a/0xf40 [i915]
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
when guest writes ggtt entries, it could write 8 bytes a time if
gtt_entry_size is 8. But, qemu could split the 8 bytes into 2 consecutive
4-byte writes.
If each 4-byte partial write could trigger a host ggtt write, it is very
possible that a wrong combination is written to the host ggtt. E.g.
the higher 4 bytes is the old value, but the lower 4 bytes is the new
value, and this 8-byte combination is wrong but written to the ggtt, thus
causing bugs.
To handle this condition, we just record the first 4-byte write, then wait
until the second 4-byte write comes and write the combined 64-bit data to
host ggtt table.
To save memory space and to spot partial write as early as possible, we
don't keep this information for every ggtt index. Instread, we just record
the last ggtt write position, and assume the two 4-byte writes come in
consecutively for each vgpu.
This assumption is right based on the characteristic of ggtt entry which
stores memory address. When gtt_entry_size is 8, the guest memory physical
address should be 64 bits, so any sane guest driver should write 8-byte
long data at a time, so 2 consecutive 4-byte writes at the same ggtt index
should be trapped in gvt.
v2:
when incomplete ggtt entry write is located, e.g.
1. guest only writes 4 bytes at a ggtt offset and no long writes the
rest 4 bytes.
2. guest writes 4 bytes of a ggtt offset, then write at other ggtt
offsets, then return back to write the left 4 bytes of the first
ggtt offset.
add error handling logic to remap host entry to scratch page, and mark
guest virtual ggtt entry as not present. (zhenyu wang)
Signed-off-by: Zhao Yan <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch unifies the naming of DRM functions for reference counting
of struct drm_device. The resulting code is more aligned with the rest
of the Linux kernel interfaces.
Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
This patch unifies the naming of DRM functions for reference counting
of struct drm_gem_object. The resulting code is more aligned with the
rest of the Linux kernel interfaces.
Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
This patch unifies the naming of DRM functions for reference counting
of struct drm_framebuffer. The resulting code is more aligned with the
rest of the Linux kernel interfaces.
Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Booting a ColdFire m68k core with MMU enabled causes a "bad page state"
oops since commit 1d40a5ea01 ("mm: mark pages in use for page tables"):
BUG: Bad page state in process sh pfn:01ce2
page:004fefc8 count:0 mapcount:-1024 mapping:00000000 index:0x0
flags: 0x0()
raw: 00000000 00000000 00000000 fffffbff 00000000 00000100 00000200 00000000
raw: 039c4000
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 22 Comm: sh Not tainted 4.17.0-07461-g1d40a5ea01d5 #13
Fix by calling pgtable_page_dtor() in our __pte_free_tlb() code path,
so that the PG_table flag is cleared before we free the pte page.
Note that I had to change the type of pte_free() to be static from
extern. Otherwise you get a lot of warnings like this:
./arch/m68k/include/asm/mcf_pgalloc.h:80:2: warning: ‘pgtable_page_dtor’ is static but used in inline function ‘pte_free’ which is not static
pgtable_page_dtor(page);
^
And making it static is consistent with our use of this in the other
m68k pgalloc definitions of pte_free().
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
CC: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Pull btrfs fixes from David Sterba:
"We have a few regression fixes for qgroup rescan status tracking and
the vm_fault_t conversion that mixed up the error values"
* tag 'for-4.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix mount failure when qgroup rescan is in progress
Btrfs: fix regression in btrfs_page_mkwrite() from vm_fault_t conversion
btrfs: quota: Set rescan progress to (u64)-1 if we hit last leaf
Pull vfs fix from Al Viro:
"Followup to procfs-seq_file series this window"
This fixes a memory leak by making sure that proc seq files release any
private data on close. The 'proc_seq_open' has to be properly paired
with 'proc_seq_release' that releases the extra private data.
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
proc: add proc_seq_release
Pull staging/IIO fixes from Greg KH:
"Here are a few small staging and IIO driver fixes for 4.18-rc3.
Nothing major or big, all just fixes for reported problems since
4.18-rc1. All of these have been in linux-next this week with no
reported problems"
* tag 'staging-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: android: ion: Return an ERR_PTR in ion_map_kernel
staging: comedi: quatech_daqp_cs: fix no-op loop daqp_ao_insn_write()
iio: imu: inv_mpu6050: Fix probe() failure on older ACPI based machines
iio: buffer: fix the function signature to match implementation
iio: mma8452: Fix ignoring MMA8452_INT_DRDY
iio: tsl2x7x/tsl2772: avoid potential division by zero
iio: pressure: bmp280: fix relative humidity unit
Pull tty/serial fixes from Greg KH:
"Here are five fixes for the tty core and some serial drivers.
The tty core ones fix some security and other issues reported by the
syzbot that I have taken too long in responding to (sorry Tetsuo!).
The 8350 serial driver fix resolves an issue of devices that used to
work properly stopping working as they shouldn't have been added to a
blacklist.
All of these have been in linux-next for a few days with no reported
issues"
* tag 'tty-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt: prevent leaking uninitialized data to userspace via /dev/vcs*
serdev: fix memleak on module unload
serial: 8250_pci: Remove stalled entries in blacklist
n_tty: Access echo_* variables carefully.
n_tty: Fix stall at n_tty_receive_char_special().
Pull USB fixes from Greg KH:
"Here is a number of USB gadget and other driver fixes for 4.18-rc3.
There's a bunch of them here, most of them being gadget driver and
xhci host controller fixes for reported issues (as normal), but there
are also some new device ids, and some fixes for the typec code.
There is an acpi core patch in here that was acked by the acpi
maintainer as it is needed for the typec fixes in order to properly
solve a problem in that driver.
All of these have been in linux-next this week with no reported
issues"
* tag 'usb-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits)
usb: chipidea: host: fix disconnection detect issue
usb: typec: tcpm: fix logbuffer index is wrong if _tcpm_log is re-entered
typec: tcpm: Fix a msecs vs jiffies bug
NFC: pn533: Fix wrong GFP flag usage
usb: cdc_acm: Add quirk for Uniden UBC125 scanner
staging/typec: fix tcpci_rt1711h build errors
usb: typec: ucsi: Fix for incorrect status data issue
usb: typec: ucsi: acpi: Workaround for cache mode issue
acpi: Add helper for deactivating memory region
usb: xhci: increase CRS timeout value
usb: xhci: tegra: fix runtime PM error handling
usb: xhci: remove the code build warning
xhci: Fix kernel oops in trace_xhci_free_virt_device
xhci: Fix perceived dead host due to runtime suspend race with event handler
dwc2: gadget: Fix ISOC IN DDMA PID bitfield value calculation
usb: gadget: dwc2: fix memory leak in gadget_init()
usb: gadget: composite: fix delayed_status race condition when set_interface
usb: dwc2: fix isoc split in transfer with no data
usb: dwc2: alloc dma aligned buffer for isoc split in
usb: dwc2: fix the incorrect bitmaps for the ports of multi_tt hub
...
Pull dma mapping fixlet from Christoph Hellwig:
"Add a missing export required by riscv and unicore"
* tag 'dma-mapping-4.18-2' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: export swiotlb_dma_ops
Add explicit RETs to the tail calls of AEGIS and MORUS crypto algorithms
otherwise they run into INT3 padding due to
("x86/asm: Pad assembly functions with INT3 instructions")
leading to spurious debug exceptions.
Mike Galbraith <efault@gmx.de> took care of all the remaining callsites.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Building the kernel with CONFIG_THUMB2_KERNEL=y and
CONFIG_CRYPTO_SPECK_NEON set fails with the following errors:
arch/arm/crypto/speck-neon-core.S: Assembler messages:
arch/arm/crypto/speck-neon-core.S:419: Error: r13 not allowed here -- `bic sp,#0xf'
arch/arm/crypto/speck-neon-core.S:423: Error: r13 not allowed here -- `bic sp,#0xf'
arch/arm/crypto/speck-neon-core.S:427: Error: r13 not allowed here -- `bic sp,#0xf'
arch/arm/crypto/speck-neon-core.S:431: Error: r13 not allowed here -- `bic sp,#0xf'
The problem is that the 'bic' instruction can't operate on the 'sp'
register in Thumb2 mode. Fix it by using a temporary register. This
isn't in the main loop, so the performance difference is negligible.
This also matches what aes-neonbs-core.S does.
Reported-by: Stefan Agner <stefan@agner.ch>
Fixes: ede9622162 ("crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If SACK is not enabled and the first cumulative ACK after the RTO
retransmission covers more than the retransmitted skb, a spurious
FRTO undo will trigger (assuming FRTO is enabled for that RTO).
The reason is that any non-retransmitted segment acknowledged will
set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is
no indication that it would have been delivered for real (the
scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK
case so the check for that bit won't help like it does with SACK).
Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo
in tcp_process_loss.
We need to use more strict condition for non-SACK case and check
that none of the cumulatively ACKed segments were retransmitted
to prove that progress is due to original transmissions. Only then
keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in
non-SACK case.
(FLAG_ORIG_SACK_ACKED is planned to be renamed to FLAG_ORIG_PROGRESS
to better indicate its purpose but to keep this change minimal, it
will be done in another patch).
Besides burstiness and congestion control violations, this problem
can result in RTO loop: When the loss recovery is prematurely
undoed, only new data will be transmitted (if available) and
the next retransmission can occur only after a new RTO which in case
of multiple losses (that are not for consecutive packets) requires
one RTO per loss to recover.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Originally in patch e6d20c55a4 ("openrisc: entry: Fix delay slot
detection") I fixed delay slot detection, but only for QEMU. We missed
that hardware delay slot detection using delay slot exception flag (DSX)
was still broken. This was because QEMU set the DSX flag in both
pre-exception supervision register (ESR) and supervision register (SR)
register, but on real hardware the DSX flag is only set on the SR
register during exceptions.
Fix this by carrying the DSX flag into the SR register during exception.
We also update the DSX flag read locations to read the value from the SR
register not the pt_regs SR register which represents ESR. The ESR
should never have the DSX flag set.
In the process I updated/removed a few comments to match the current
state. Including removing a comment saying that the DSX detection logic
was inefficient and needed to be rewritten.
I have tested this on QEMU with a patch ensuring it matches the hardware
specification.
Link: https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg00000.html
Fixes: e6d20c55a4 ("openrisc: entry: Fix delay slot detection")
Signed-off-by: Stafford Horne <shorne@gmail.com>
The pinctrl settings were incorrect for the touchscreen interrupt line, causing
an interrupt storm. This change has been tested with both the atmel_mxt_ts and
RMI4 drivers on the RDU1 units.
The value 0x4 comes from the value of register IOMUXC_SW_PAD_CTL_PAD_CSI1_D8
from the old vendor kernel.
Signed-off-by: Nick Dyer <nick@shmanahar.org>
Fixes: ceef0396f3 ("ARM: dts: imx: add ZII RDU1 board")
Cc: <stable@vger.kernel.org> # 4.15+
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf 2018-07-01
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) A bpf_fib_lookup() helper fix to change the API before freeze to
return an encoding of the FIB lookup result and return the nexthop
device index in the params struct (instead of device index as return
code that we had before), from David.
2) Various BPF JIT fixes to address syzkaller fallout, that is, do not
reject progs when set_memory_*() fails since it could still be RO.
Also arm32 JIT was not using bpf_jit_binary_lock_ro() API which was
an issue, and a memory leak in s390 JIT found during review, from
Daniel.
3) Multiple fixes for sockmap/hash to address most of the syzkaller
triggered bugs. Usage with IPv6 was crashing, a GPF in bpf_tcp_close(),
a missing sock_map_release() routine to hook up to callbacks, and a
fix for an omitted bucket lock in sock_close(), from John.
4) Two bpftool fixes to remove duplicated error message on program load,
and another one to close the libbpf object after program load. One
additional fix for nfp driver's BPF offload to avoid stopping offload
completely if replace of program failed, from Jakub.
5) Couple of BPF selftest fixes that bail out in some of the test
scripts if the user does not have the right privileges, from Jeffrin.
6) Fixes in test_bpf for s390 when CONFIG_BPF_JIT_ALWAYS_ON is set
where we need to set the flag that some of the test cases are expected
to fail, from Kleber.
7) Fix to detangle BPF_LIRC_MODE2 dependency from CONFIG_CGROUP_BPF
since it has no relation to it and lirc2 users often have configs
without cgroups enabled and thus would not be able to use it, from Sean.
8) Fix a selftest failure in sockmap by removing a useless setrlimit()
call that would set a too low limit where at the same time we are
already including bpf_rlimit.h that does the job, from Yonghong.
9) Fix BPF selftest config with missing missing NET_SCHED, from Anders.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend says:
====================
This addresses two syzbot issues that lead to identifying (by Eric and
Wei) a class of bugs where we don't correctly check for IPv4/v6
sockets and their associated state. The second issue was a locking
omission in sockhash.
The first patch addresses IPv6 socks and fixing an error where
sockhash would overwrite the prot pointer with IPv4 prot. To fix
this build similar solution to TLS ULP. Although we continue to
allow socks in all states not just ESTABLISH in this patch set
because as Martin points out there should be no issue with this
on the sockmap ULP because we don't use the ctx in this code. Once
multiple ULPs coexist we may need to revisit this. However we
can do this in *next trees.
The other issue syzbot found that the tcp_close() handler missed
locking the hash bucket lock which could result in corrupting the
sockhash bucket list if delete and close ran at the same time.
And also the smap_list_remove() routine was not working correctly
at all. This was not caught in my testing because in general my
tests (to date at least lets add some more robust selftest in
bpf-next) do things in the "expected" order, create map, add socks,
delete socks, then tear down maps. The tests we have that do the
ops out of this order where only working on single maps not multi-
maps so we never saw the issue. Thanks syzbot. The fix is to
restructure the tcp_close() lock handling. And fix the obvious
bug in smap_list_remove().
Finally, during review I noticed the release handler was omitted
from the upstream code (patch 4) due to an incorrect merge conflict
fix when I ported the code to latest bpf-next before submitting.
This would leave references to the map around if the user never
closes the map.
v3: rework patches, dropping ESTABLISH check and adding rcu
annotation along with the smap_list_remove fix
v4: missed one more case where maps was being accessed without
the sk_callback_lock, spoted by Martin as well.
v5: changed to use a specific lock for maps and reduced callback
lock so that it is only used to gaurd sk callbacks. I think
this makes the logic a bit cleaner and avoids confusion
ovoer what each lock is doing.
Also big thanks to Martin for thorough review he caught at least
one case where I missed a rcu_call().
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add map_release_uref pointer to hashmap ops. This was dropped when
original sockhash code was ported into bpf-next before initial
commit.
Fixes: 8111038444 ("bpf: sockmap, add hash map support")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
First the sk_callback_lock() was being used to protect both the
sock callback hooks and the psock->maps list. This got overly
convoluted after the addition of sockhash (in sockmap it made
some sense because masp and callbacks were tightly coupled) so
lets split out a specific lock for maps and only use the callback
lock for its intended purpose. This fixes a couple cases where
we missed using maps lock when it was in fact needed. Also this
makes it easier to follow the code because now we can put the
locking closer to the actual code its serializing.
Next, in sock_hash_delete_elem() the pattern was as follows,
sock_hash_delete_elem()
[...]
spin_lock(bucket_lock)
l = lookup_elem_raw()
if (l)
hlist_del_rcu()
write_lock(sk_callback_lock)
.... destroy psock ...
write_unlock(sk_callback_lock)
spin_unlock(bucket_lock)
The ordering is necessary because we only know the {p}sock after
dereferencing the hash table which we can't do unless we have the
bucket lock held. Once we have the bucket lock and the psock element
it is deleted from the hashmap to ensure any other path doing a lookup
will fail. Finally, the refcnt is decremented and if zero the psock
is destroyed.
In parallel with the above (or free'ing the map) a tcp close event
may trigger tcp_close(). Which at the moment omits the bucket lock
altogether (oops!) where the flow looks like this,
bpf_tcp_close()
[...]
write_lock(sk_callback_lock)
for each psock->maps // list of maps this sock is part of
hlist_del_rcu(ref_hash_node);
.... destroy psock ...
write_unlock(sk_callback_lock)
Obviously, and demonstrated by syzbot, this is broken because
we can have multiple threads deleting entries via hlist_del_rcu().
To fix this we might be tempted to wrap the hlist operation in a
bucket lock but that would create a lock inversion problem. In
summary to follow locking rules the psocks maps list needs the
sk_callback_lock (after this patch maps_lock) but we need the bucket
lock to do the hlist_del_rcu.
To resolve the lock inversion problem pop the head of the maps list
repeatedly and remove the reference until no more are left. If a
delete happens in parallel from the BPF API that is OK as well because
it will do a similar action, lookup the lock in the map/hash, delete
it from the map/hash, and dec the refcnt. We check for this case
before doing a destroy on the psock to ensure we don't have two
threads tearing down a psock. The new logic is as follows,
bpf_tcp_close()
e = psock_map_pop(psock->maps) // done with map lock
bucket_lock() // lock hash list bucket
l = lookup_elem_raw(head, hash, key, key_size);
if (l) {
//only get here if elmnt was not already removed
hlist_del_rcu()
... destroy psock...
}
bucket_unlock()
And finally for all the above to work add missing locking around map
operations per above. Then add RCU annotations and use
rcu_dereference/rcu_assign_pointer to manage values relying on RCU so
that the object is not free'd from sock_hash_free() while it is being
referenced in bpf_tcp_close().
Reported-by: syzbot+0ce137753c78f7b6acc1@syzkaller.appspotmail.com
Fixes: 8111038444 ("bpf: sockmap, add hash map support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
If a hashmap is free'd with open socks it removes the reference to
the hash entry from the psock. If that is the last reference to the
psock then it will also be free'd by the reference counting logic.
However the current logic that removes the hash reference from the
list of references is broken. In smap_list_remove() we first check
if the sockmap entry matches and then check if the hashmap entry
matches. But, the sockmap entry sill always match because its NULL in
this case which causes the first entry to be removed from the list.
If this is always the "right" entry (because the user adds/removes
entries in order) then everything is OK but otherwise a subsequent
bpf_tcp_close() may reference a free'd object.
To fix this create two list handlers one for sockmap and one for
sockhash.
Reported-by: syzbot+0ce137753c78f7b6acc1@syzkaller.appspotmail.com
Fixes: 8111038444 ("bpf: sockmap, add hash map support")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This fixes a crash where we assign tcp_prot to IPv6 sockets instead
of tcpv6_prot.
Previously we overwrote the sk->prot field with tcp_prot even in the
AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
are used.
Tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
crashing case here.
Fixes: 174a79ff95 ("bpf: sockmap with sk redirect support")
Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Commit 5088814a6e (ACPICA: AML parser: attempt to continue loading
table after error) unintentionally added leading newlines to error
messages emitted by ACPICA which caused unexpected things to be
printed to the kernel log. Drop these newlines (which effectively
reverts the part of commit 5088814a6e adding them).
Fixes: 5088814a6e (ACPICA: AML parser: attempt to continue loading table after error)
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It is reported that commit c62ec4610c (PM / core: Fix direct_complete
handling for devices with no callbacks) introduced a system suspend
regression on Samsung 305V4A by allowing a PCI bridge (not a PCIe
port) to stay in D3 over suspend-to-RAM, which is a side effect of
setting power.direct_complete for the children of that bridge that
have no PM callbacks.
On the majority of systems PCI bridges are not allowed to be
runtime-suspended (the power/control sysfs attribute is set to "on"
for them by default), but user space can change that setting and if
it does so and a given bridge has no children with PM callbacks, the
direct_complete optimization will be applied to it and it will stay
in suspend over system suspend. Apparently, that confuses the
platform firmware on the affected machine and that may very well
happen elsewhere, so avoid the direct_complete optimization for
PCI bridges with no drivers (if there is a driver, it should take
care of the PM handling) on suspend-to-RAM altogether (that should
not matter for suspend-to-idle as platform firmware is not involved
in it).
Fixes: c62ec4610c (PM / core: Fix direct_complete handling for devices with no callbacks)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199941
Reported-by: n0000b.n000b@gmail.com
Tested-by: n0000b.n000b@gmail.com
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull parisc fixes and cleanups from Helge Deller:
"Nothing exiting in this patchset, just
- small cleanups of header files
- default to 4 CPUs when building a SMP kernel
- mark 16kB and 64kB page sizes broken
- addition of the new io_pgetevents syscall"
* 'parisc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Build kernel without -ffunction-sections
parisc: Reduce debug output in unwind code
parisc: Wire up io_pgetevents syscall
parisc: Default to 4 SMP CPUs
parisc: Convert printk(KERN_LEVEL) to pr_lvl()
parisc: Mark 16kB and 64kB page sizes BROKEN
parisc: Drop struct sigaction from not exported header file
Pull ARM SoC fixes from Olof Johansson:
"A smaller batch for the end of the week (let's see if I can keep the
weekly cadence going for once).
All medium-grade fixes here, nothing worrisome:
- Fixes for some fairly old bugs around SD card write-protect
detection and GPIO interrupt assignments on Davinci.
- Wifi module suspend fix for Hikey.
- Minor DT tweaks to fix inaccuracies for Amlogic platforms, one
of which solves booting with third-party u-boot"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: dts: hikey960: Define wl1837 power capabilities
arm64: dts: hikey: Define wl1835 power capabilities
ARM64: dts: meson-gxl: fix Mali GPU compatible string
ARM64: dts: meson-axg: fix ethernet stability issue
ARM64: dts: meson-gx: fix ATF reserved memory region
ARM64: dts: meson-gxl-s905x-p212: Add phy-supply for usb0
ARM64: dts: meson: fix register ranges for SD/eMMC
ARM64: dts: meson: disable sd-uhs modes on the libretech-cc
ARM: dts: da850: Fix interrups property for gpio
ARM: davinci: board-da850-evm: fix WP pin polarity for MMC/SD
Pull Kbuild fixes from Masahiro Yamada:
- introduce __diag_* macros and suppress -Wattribute-alias warnings
from GCC 8
- fix stack protector test script for x86_64
- fix line number handling in Kconfig
- document that '#' starts a comment in Kconfig
- handle P_SYMBOL property in dump debugging of Kconfig
- correct help message of LD_DEAD_CODE_DATA_ELIMINATION
- fix occasional segmentation faults in Kconfig
* tag 'kbuild-fixes-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: loop boundary condition fix
kbuild: reword help of LD_DEAD_CODE_DATA_ELIMINATION
kconfig: handle P_SYMBOL in print_symbol()
kconfig: document Kconfig source file comments
kconfig: fix line numbers for if-entries in menu tree
stack-protector: Fix test with 32-bit userland and CONFIG_64BIT=y
powerpc: Remove -Wattribute-alias pragmas
disable -Wattribute-alias warning for SYSCALL_DEFINEx()
kbuild: add macro for controlling warnings to linux/compiler.h
The patch noted in the fixes below converted get_user_pages_fast() to
get_user_pages_longterm(), however the two calls differ in a few ways.
First _fast() is documented to not require the mmap_sem, while _longterm()
is documented to need it. Hold the mmap sem as required.
Second, _fast accepts an 'int write' while _longterm uses 'unsigned int
gup_flags', so the expression '!!(prot & IOMMU_WRITE)' is only working by
luck as FOLL_WRITE is currently == 0x1. Use the expected FOLL_WRITE
constant instead.
Fixes: 94db151dc8 ("vfio: disable filesystem-dax page pinning")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pull x86 fixes from Ingo Molnar:
"The biggest diffstat comes from self-test updates, plus there's entry
code fixes, 5-level paging related fixes, console debug output fixes,
and misc fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Clean up the printk()s in show_fault_oops()
x86/mm: Drop unneeded __always_inline for p4d page table helpers
x86/efi: Fix efi_call_phys_epilog() with CONFIG_X86_5LEVEL=y
selftests/x86/sigreturn: Do minor cleanups
selftests/x86/sigreturn/64: Fix spurious failures on AMD CPUs
x86/entry/64/compat: Fix "x86/entry/64/compat: Preserve r8-r11 in int $0x80"
x86/mm: Don't free P4D table when it is folded at runtime
x86/entry/32: Add explicit 'l' instruction suffix
x86/mm: Get rid of KERN_CONT in show_fault_oops()
Pull perf fixes from Ingo Molnar:
"Tooling fixes mostly, plus a build warning fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
perf/core: Move inline keyword at the beginning of declaration
tools/headers: Pick up latest kernel ABIs
perf tools: Fix crash caused by accessing feat_ops[HEADER_LAST_FEATURE]
perf script: Fix crash because of missing evsel->priv
perf script: Add missing output fields in a hint
perf bench: Fix numa report output code
perf stat: Remove duplicate event counting
perf alias: Rebuild alias expression string to make it comparable
perf alias: Remove trailing newline when reading sysfs files
perf tools: Fix a clang 7.0 compilation error
tools include uapi: Synchronize bpf.h with the kernel
tools include uapi: Update if_link.h to pick IFLA_{BRPORT_ISOLATED,VXLAN_TTL_INHERIT}
tools include powerpc: Update arch/powerpc/include/uapi/asm/unistd.h copy to get 'rseq' syscall
perf tools: Update x86's syscall_64.tbl, adding 'io_pgetevents' and 'rseq'
tools headers uapi: Synchronize drm/drm.h
perf intel-pt: Fix packet decoding of CYC packets
perf tests: Add valid callback for parse-events test
perf tests: Add event parsing error handling to parse events test
perf report powerpc: Fix crash if callchain is empty
perf test session topology: Fix test on s390
...
Pull selinux fix from Paul Moore:
"One fairly straightforward patch to fix a longstanding issue where a
process could stall while accessing files in selinuxfs and block
everyone else due to a held mutex.
The patch passes all our tests and looks to apply cleanly to your
current tree"
* tag 'selinux-pr-20180629' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: move user accesses in selinuxfs out of locked regions
Pull block fixes from Jens Axboe:
"Small set of fixes for this series. Mostly just minor fixes, the only
oddball in here is the sg change.
The sg change came out of the stall fix for NVMe, where we added a
mempool and limited us to a single page allocation. CONFIG_SG_DEBUG
sort-of ruins that, since we'd need to account for that. That's
actually a generic problem, since lots of drivers need to allocate SG
lists. So this just removes support for CONFIG_SG_DEBUG, which I added
back in 2007 and to my knowledge it was never useful.
Anyway, outside of that, this pull contains:
- clone of request with special payload fix (Bart)
- drbd discard handling fix (Bart)
- SATA blk-mq stall fix (me)
- chunk size fix (Keith)
- double free nvme rdma fix (Sagi)"
* tag 'for-linus-20180629' of git://git.kernel.dk/linux-block:
sg: remove ->sg_magic member
drbd: Fix drbd_request_prepare() discard handling
blk-mq: don't queue more if we get a busy return
block: Fix cloning of requests with a special payload
nvme-rdma: fix possible double free of controller async event buffer
block: Fix transfer when chunk sectors exceeds max
Commit 546eb0317c "libnvdimm, pmem: Do not flush power-fail protected CPU caches"
fixed the write_cache detection to correctly show the lack of a write
cache based on the platform capabilities described in the ACPI NFIT. The
nfit_test unit tests expected a write cache to be present, so change the
nfit test namespaces to only advertise a persistence domain limited to
the memory controller. This allows the kernel to show a write_cache
attribute, and the test behaviour remains unchanged.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
cmd_rc is passed in by reference to the acpi_nfit_ctl() function and the
caller expects a value returned. However, when the package is pass through
via the ND_CMD_CALL command, cmd_rc is not touched. Make sure cmd_rc is
always set.
Fixes: aef2533822 ("libnvdimm, nfit: centralize command status translation")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
After commit f9d4b0c1e9 ("fib_rules: move common handling of newrule
delrule msgs into fib_nl2rule"), rule_exists got replaced by rule_find
for existing rule lookup in both the add and del paths. While this
is good for the delete path, it solves a few problems but opens up
a few invalid key matches in the add path.
$ip -4 rule add table main tos 10 fwmark 1
$ip -4 rule add table main tos 10
RTNETLINK answers: File exists
The problem here is rule_find does not check if the key masks in
the new and old rule are the same and hence ends up matching a more
secific rule. Rule key masks cannot be easily compared today without
an elaborate if-else block. Its best to introduce key masks for easier
and accurate rule comparison in the future. Until then, due to fear of
regressions this patch re-introduces older loose rule_exists during add.
Also fixes both rule_exists and rule_find to cover missing attributes.
Fixes: f9d4b0c1e9 ("fib_rules: move common handling of newrule delrule msgs into fib_nl2rule")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When doing device hotplug the sub channel must be async to avoid
deadlock issues because device is discovered in softirq context.
When doing changes to MTU and number of channels, the setup
must be synchronous to avoid races such as when MTU and device
settings are done in a single ip command.
Reported-by: Thomas Walker <Thomas.Walker@twosigma.com>
Fixes: 8195b1396e ("hv_netvsc: fix deadlock on hotplug")
Fixes: 732e49850c ("netvsc: fix race on sub channel creation")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As noticed by Eric, we need to switch to the helper
dev_change_tx_queue_len() for SIOCSIFTXQLEN call path too,
otheriwse still miss dev_qdisc_change_tx_queue_len().
Fixes: 6a643ddb56 ("net: introduce helper dev_change_tx_queue_len()")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pool can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/atm/zatm.c:1491 zatm_ioctl() warn: potential spectre issue
'zatm_dev->pool_info' (local cap)
Fix this by sanitizing pool before using it to index
zatm_dev->pool_info
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann says:
====================
s390/qeth: fixes 2018-06-29
please apply a few qeth fixes for -net and your 4.17 stable queue.
Patches 1-3 fix several issues wrt to MAC address management that were
introduced during the 4.17 cycle.
Patch 4 tackles a long-standing issue with busy multi-connection workloads
on devices in af_iucv mode.
Patch 5 makes sure to re-enable all active HW offloads, after a card was
previously set offline and thus lost its HW context.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
commit e830baa9c3 ("qeth: restore device features after recovery") and
commit ce34435641 ("s390/qeth: rely on kernel for feature recovery")
made sure that the HW functions for device features get re-programmed
after recovery.
But we missed that the same handling is also required when a card is
first set offline (destroying all HW context), and then online again.
Fix this by moving the re-enable action out of the recovery-only path.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If qeth_qdio_output_handler() detects that a transmit requires async
completion, it replaces the pending buffer's metadata object
(qeth_qdio_out_buffer) so that this queue buffer can be re-used while
the data is pending completion.
Later when the CQ indicates async completion of such a metadata object,
qeth_qdio_cq_handler() tries to free any data associated with this
object (since HW has now completed the transfer). By calling
qeth_clear_output_buffer(), it erronously operates on the queue buffer
that _previously_ belonged to this transfer ... but which has been
potentially re-used several times by now.
This results in double-free's of the buffer's data, and failing
transmits as the buffer descriptor is scrubbed in mid-air.
The correct way of handling this situation is to
1. scrub the queue buffer when it is prepared for re-use, and
2. later obtain the data addresses from the async-completion notifier
(ie. the AOB), instead of the queue buffer.
All this only affects qeth devices used for af_iucv HiperTransport.
Fixes: 0da9581ddb ("qeth: exploit asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
*ether_addr*_64bits functions have been introduced to optimize
performance critical paths, which access 6-byte ethernet address as u64
value to get "nice" assembly. A harmless hack works nicely on ethernet
addresses shoved into a structure or a larger buffer, until busted by
Kasan on smth like plain (u8 *)[6].
qeth_l2_set_mac_address calls qeth_l2_remove_mac passing
u8 old_addr[ETH_ALEN] as an argument.
Adding/removing macs for an ethernet adapter is not that performance
critical. Moreover is_multicast_ether_addr_64bits itself on s390 is not
faster than is_multicast_ether_addr:
is_multicast_ether_addr(%r2) -> %r2
llc %r2,0(%r2)
risbg %r2,%r2,63,191,0
is_multicast_ether_addr_64bits(%r2) -> %r2
llgc %r2,0(%r2)
risbg %r2,%r2,63,191,0
So, let's just use is_multicast_ether_addr instead of
is_multicast_ether_addr_64bits.
Fixes: bcacfcbc82 ("s390/qeth: fix MAC address update sequence")
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When qeth_l2_set_mac_address() finds the card in a non-reachable state,
it merely copies the new MAC address into dev->dev_addr so that
__qeth_l2_set_online() can later register it with the HW.
But __qeth_l2_set_online() may very well be running concurrently, so we
can't trust the card state without appropriate locking:
If the online sequence is past the point where it registers
dev->dev_addr (but not yet in SOFTSETUP state), any address change needs
to be properly programmed into the HW. Otherwise the netdevice ends up
with a different MAC address than what's set in the HW, and inbound
traffic is not forwarded as expected.
This is most likely to occur for OSD in LPAR, where
commit 21b1702af1 ("s390/qeth: improve fallback to random MAC address")
now triggers eg. systemd to immediately change the MAC when the netdevice
is registered with a NET_ADDR_RANDOM address.
Fixes: bcacfcbc82 ("s390/qeth: fix MAC address update sequence")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit b7493e91c1.
On its own, querying RDEV for a MAC address works fine. But when upgrading
from a qeth that previously queried DDEV on a z/VM NIC (ie. any kernel with
commit ec61bd2fd2), the RDEV query now returns a _different_ MAC address
than the DDEV query.
If the NIC is configured with MACPROTECT, z/VM apparently requires us to
use the MAC that was initially returned (on DDEV) and registered. So after
upgrading to a kernel that uses RDEV, the SETVMAC registration cmd for the
new MAC address fails and we end up with a non-operabel interface.
To avoid regressions on upgrade, switch back to using DDEV for the MAC
address query. The downgrade path (first RDEV, later DDEV) is fine, in this
case both queries return the same MAC address.
Fixes: b7493e91c1 ("s390/qeth: use Read device to query hypervisor for MAC")
Reported-by: Michal Kubecek <mkubecek@suse.com>
Tested-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The __alx_open function can be called from ndo_open, which is called
under RTNL, or from alx_resume, which isn't. Since commit d768319cd4,
we're calling the netif_set_real_num_{tx,rx}_queues functions, which
need to be called under RTNL.
This is similar to commit 0c2cc02e57 ("igb: Move the calls to set the
Tx and Rx queues into igb_open").
Fixes: d768319cd4 ("alx: enable multiple tx queues")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a bug where INT_STAT1 was written twice and
INT_STAT2 was ignored when disabling interrupts.
Fixes: b753a9faaf ("net: phy: DP83TC811: Introduce support for the DP83TC811 phy")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini reported that a recent commit broke prefix routes for linklocal
addresses. The newly added modify_prefix_route is attempting to add a
new prefix route when the ifp priority does not match the route metric
however the check needs to account for the default priority. In addition,
the route add fails because the route already exists, and then the delete
removes the one that exists. Flip the order to do the delete first.
Fixes: 8308f3ff17 ("net/ipv6: Add support for specifying metric of connected routes")
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Tested-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc_skb_with_frags uses __GFP_NORETRY for non-sleeping allocations
which is just a noop and a little bit confusing.
__GFP_NORETRY was added by ed98df3361 ("net: use __GFP_NORETRY for
high order allocations") to prevent from the OOM killer. Yet this was
not enough because fb05e7a89f ("net: don't wait for order-3 page
allocation") didn't want an excessive reclaim for non-costly orders
so it made it completely NOWAIT while it preserved __GFP_NORETRY in
place which is now redundant.
Drop the pointless __GFP_NORETRY because this function is used as
copy&paste source for other places.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur says:
====================
DPAA fixes
A couple of fixes for the DPAA drivers, addressing an issue
with short UDP or TCP frames (with padding) that were marked
as having a wrong checksum and dropped by the FMan hardware
and a problem with the buffer used for the scatter-gather
table being too small as per the hardware requirements.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The DPAA HW requires that at least 256 bytes from the start of the
first scatter-gather table entry are allocated and accessible. The
hardware reads the maximum size the table can have in one access,
thus requiring that the allocation and mapping to be done for the
maximum size of 256B even if there is a smaller number of entries
in the table.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The FMan hardware parser needs to be configured to remove the
short frame padding from the checksum calculation, otherwise
short UDP and TCP frames are likely to be marked as having a
bad checksum.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver performs the internal reload when it receives tx-timeout event from
the OS. Internal reload might fail in some scenarios e.g., fatal HW issues.
In such cases OS still see the link, which would result in undesirable
functionalities such as re-generation of tx-timeouts.
The patch addresses this issue by indicating the link-down to OS when
tx-timeout is detected, and keeping the link in down state till the
internal reload is successful.
Please consider applying it to 'net' branch.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Static checkers complain that id_tbl->table points to longs and 4 bytes
is smaller than sizeof(long). But the since other side is dividing by
32 instead of sizeof(long), that means the current code works fine.
Anyway, it's more conventional to use the BITS_TO_LONGS() macro when
we're allocating a bitmap.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code assumes that there is 4 bytes in a pointer and it doesn't
allocate enough memory.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fast Open key could be stored in different endian based on the CPU.
Previously hosts in different endianness in a server farm using
the same key config (sysctl value) would produce different cookies.
This patch fixes it by always storing it as little endian to keep
same API for LE hosts.
Reported-by: Daniele Iamartino <danielei@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull powerpc fixes from Michael Ellerman:
"Two regression fixes, and a new syscall wire-up:
- A fix for the recent conversion to time64_t in the powermac RTC
routines, which caused time to go backward.
- Another fix for fallout from the split PMD PTL conversion.
- Wire up the new io_pgetevents() syscall.
Thanks to: Aneesh Kumar K.V, Arnd Bergmann, Breno Leitao, Mathieu
Malaterre"
* tag 'powerpc-4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powermac: Fix rtc read/write functions
powerpc/mm/32: Fix pgtable_page_dtor call
powerpc: Wire up io_pgetevents
This fixes polarity of SD card write-protect pin on DA850 EVM
and fixes interrupt property for DA850 SoC GPIO as defined in
device-tree.
Both of these are not introduced with v4.18 merge but have
existed prior.
* tag 'davinci-fixes-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: dts: da850: Fix interrups property for gpio
ARM: davinci: board-da850-evm: fix WP pin polarity for MMC/SD
Signed-off-by: Olof Johansson <olof@lixom.net>
ARM64: hisi fixes for 4.18
- Added power capabilities for the mmc host controller on the
hikey and hikey960 boards to avoid broken wifi.
* tag 'hisi-fixes-for-4.18' of git://github.com/hisilicon/linux-hisi:
arm64: dts: hikey960: Define wl1837 power capabilities
arm64: dts: hikey: Define wl1835 power capabilities
Signed-off-by: Olof Johansson <olof@lixom.net>
Before 8d85a7a4f2 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0"),
pci_sriov_set_totalvfs(pdev, 0) meant "we can enable TotalVFs virtual
functions". After 8d85a7a4f2, it means "we can't enable *any* VFs".
That broke this scenario where nfp intends to remove any limit on the
number of VFs that can be enabled:
nfp_pci_probe
nfp_pcie_sriov_read_nfd_limit
nfp_rtsym_read_le("nfd_vf_cfg_max_vfs", &err)
pci_sriov_set_totalvfs(pf->pdev, 0) # if FW didn't expose a limit
...
# userspace writes N to sysfs "sriov_numvfs":
sriov_numvfs_store
pci_sriov_get_totalvfs # now returns 0
return -ERANGE
Prior to 8d85a7a4f2, pci_sriov_get_totalvfs() returned TotalVFs, but it
now returns 0.
Remove the pci_sriov_set_totalvfs(pdev, 0) calls so we don't limit the
number of VFs that can be enabled.
Fixes: 8d85a7a4f2 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The TotalVFs register in the SR-IOV capability is the hardware limit on the
number of VFs. A PF driver can limit the number of VFs further with
pci_sriov_set_totalvfs(). When the PF driver is removed, reset any VF
limit that was imposed by the driver because that limit may not apply to
other drivers.
Before 8d85a7a4f2 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0"),
pci_sriov_set_totalvfs(pdev, 0) meant "we can enable TotalVFs virtual
functions", and the nfp driver used that to remove the VF limit when the
driver unloads.
8d85a7a4f2 broke that because instead of removing the VF limit,
pci_sriov_set_totalvfs(pdev, 0) actually sets the limit to zero, and that
limit persists even if another driver is loaded.
We could fix that by making the nfp driver reset the limit when it unloads,
but it seems more robust to do it in the PCI core instead of relying on the
driver.
The regression scenario is:
nfp_pci_probe (driver 1)
...
nfp_pci_remove
pci_sriov_set_totalvfs(pf->pdev, 0) # limits VFs to 0
...
nfp_pci_probe (driver 2)
nfp_rtsym_read_le("nfd_vf_cfg_max_vfs")
# no VF limit from firmware
Now driver 2 is broken because the VF limit is still 0 from driver 1.
Fixes: 8d85a7a4f2 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[bhelgaas: changelog, rename functions]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull i2c fixes from Wolfram Sang:
- a revert because of bugzilla #200045 (and some documentation about
it)
- another regression fix in the i2c-gpio driver
- a leak fix for the i2c core
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: gpio: initialize SCL to HIGH again
i2c: smbus: kill memory leak on emulated and failed DMA SMBus xfers
i2c: algos: bit: mention our experience about initial states
Revert "i2c: algo-bit: init the bus to a known state"
Pull ceph fix from Ilya Dryomov:
"A trivial dentry leak fix from Zheng"
* tag 'ceph-for-4.18-rc3' of git://github.com/ceph/ceph-client:
ceph: fix dentry leak in splice_dentry()
The call to of_get_next_child() returns a node pointer with refcount
incremented thus it must be explicitly decremented here in the error
path and after the last usage.
Fixes: d3c68e0a7e ("PCI: faraday: Add Faraday Technology FTPCI100 PCI Host Bridge driver")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
The call to of_get_next_child() returns a node pointer with
refcount incremented thus it must be explicitly decremented
here after the last usage.
Fixes: ab597d35ef ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The call to of_get_next_child() returns a node pointer with refcount
incremented thus it must be explicitly decremented here after the last
usage.
Fixes: 8961def568 ("PCI: xilinx: Add Xilinx AXI PCIe Host Bridge IP driver")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[lorenzo.pieralisi@arm.com: reworked commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We need to use list_for_each_entry_safe() because the
pci_ep_cfs_remove_epf_group() function frees "group".
Fixes: ef1433f717 ("PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Commit f5a4670de9 ("clk: imx: Add new clo01 and clo2 controlled
by CCOSR") introduced the CLK_CLKO definitions, but didn't put them
at the end of the list, which may cause dtb breakage when running an old
dtb with a newer kernel.
In order to avoid that, simply add the new CLK_CKO clock definitions
at the end of the list.
Fixes: f5a4670de9 ("clk: imx: Add new clo01 and clo2 controlled by CCOSR")
Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Daniel Borkmann says:
====================
This set contains three fixes that are mostly JIT and set_memory_*()
related. The third in the series in particular fixes the syzkaller
bugs that were still pending; aside from local reproduction & testing,
also 'syz test' wasn't able to trigger them anymore. I've tested this
series on x86_64, arm64 and s390x, and kbuild bot wasn't yelling either
for the rest. For details, please see patches as usual, thanks!
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Partially undo commit 9facc33687 ("bpf: reject any prog that failed
read-only lock") since it caused a regression, that is, syzkaller was
able to manage to cause a panic via fault injection deep in set_memory_ro()
path by letting an allocation fail: In x86's __change_page_attr_set_clr()
it was able to change the attributes of the primary mapping but not in
the alias mapping via cpa_process_alias(), so the second, inner call
to the __change_page_attr() via __change_page_attr_set_clr() had to split
a larger page and failed in the alloc_pages() with the artifically triggered
allocation error which is then propagated down to the call site.
Thus, for set_memory_ro() this means that it returned with an error, but
from debugging a probe_kernel_write() revealed EFAULT on that memory since
the primary mapping succeeded to get changed. Therefore the subsequent
hdr->locked = 0 reset triggered the panic as it was performed on read-only
memory, so call-site assumptions were infact wrong to assume that it would
either succeed /or/ not succeed at all since there's no such rollback in
set_memory_*() calls from partial change of mappings, in other words, we're
left in a state that is "half done". A later undo via set_memory_rw() is
succeeding though due to matching permissions on that part (aka due to the
try_preserve_large_page() succeeding). While reproducing locally with
explicitly triggering this error, the initial splitting only happens on
rare occasions and in real world it would additionally need oom conditions,
but that said, it could partially fail. Therefore, it is definitely wrong
to bail out on set_memory_ro() error and reject the program with the
set_memory_*() semantics we have today. Shouldn't have gone the extra mile
since no other user in tree today infact checks for any set_memory_*()
errors, e.g. neither module_enable_ro() / module_disable_ro() for module
RO/NX handling which is mostly default these days nor kprobes core with
alloc_insn_page() / free_insn_page() as examples that could be invoked long
after bootup and original 314beb9bca ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks") did neither when it got first introduced to BPF
so "improving" with bailing out was clearly not right when set_memory_*()
cannot handle it today.
Kees suggested that if set_memory_*() can fail, we should annotate it with
__must_check, and all callers need to deal with it gracefully given those
set_memory_*() markings aren't "advisory", but they're expected to actually
do what they say. This might be an option worth to move forward in future
but would at the same time require that set_memory_*() calls from supporting
archs are guaranteed to be "atomic" in that they provide rollback if part
of the range fails, once that happened, the transition from RW -> RO could
be made more robust that way, while subsequent RO -> RW transition /must/
continue guaranteeing to always succeed the undo part.
Reported-by: syzbot+a4eb8c7766952a1ca872@syzkaller.appspotmail.com
Reported-by: syzbot+d866d1925855328eac3b@syzkaller.appspotmail.com
Fixes: 9facc33687 ("bpf: reject any prog that failed read-only lock")
Cc: Laura Abbott <labbott@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
If we would ever fail in the bpf_jit_prog() pass that writes the
actual insns to the image after we got header via bpf_jit_binary_alloc()
then we also need to make sure to free it through bpf_jit_binary_free()
again when bailing out. Given we had prior bpf_jit_prog() passes to
initially probe for clobbered registers, program size and to fill in
addrs arrray for jump targets, this is more of a theoretical one,
but at least make sure this doesn't break with future changes.
Fixes: 0546231057 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Any eBPF JIT that where its underlying arch supports ARCH_HAS_SET_MEMORY
would need to use bpf_jit_binary_{un,}lock_ro() pair instead of the
set_memory_{ro,rw}() pair directly as otherwise changes to the former
might break. arm32's eBPF conversion missed to change it, so fix this
up here.
Fixes: 39c13c204b ("arm: eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
As suggested by Nick Piggin it seems we can drop the -ffunction-sections
compile flag, now that the kernel uses thin archives. Testing with 32-
and 64-bit kernel showed no difference in kernel size.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
This was introduced more than a decade ago when sg chaining was
added, but we never really caught anything with it. The scatterlist
entry size can be critical, since drivers allocate it, so remove
the magic member. Recently it's been triggering allocation stalls
and failures in NVMe.
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull power management fixes from Rafael Wysocki:
"These fix up recently added features (the Kryo cpufreq driver and
performance states coverage in the generic power domains framework),
add missing documentation for a recently added sysfs knob in the
intel_pstate driver and fix an error in its documentation.
Specifics:
- Fix the initialization time error handling in the recently added
Kryo cpufreq driver (Dan Carpenter).
- Fix up the recently added coverage of performance states in the
generic power domains (genpd) framework (Viresh Kumar).
- Add missing documentation of the new hwp_dynamic_boost sysfs knob
in the intel_pstate driver (Rafael Wysocki).
- Fix incorrect sysfs path in the intel_pstate driver documentation
(Rafael Wysocki)"
* tag 'pm-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Documentation: intel_pstate: Describe hwp_dynamic_boost sysfs knob
Documentation: admin-guide: intel_pstate: Fix sysfs path
PM / Domains: Rename opp_node to np
PM / Domains: Fix return value of of_genpd_opp_to_performance_state()
cpufreq: qcom-kryo: Fix error handling in probe()
Pull drm fixes from Dave Airlie:
"Nothing too major this round:
- small set of mali-dp fixes
- single meson fix
- a bunch of amdgpu fixes (one makes non-4k page sizes not be a bad
experience)"
* tag 'drm-fixes-2018-06-29' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: release spinlock before committing updates to stream
drm/amdgpu:Support new VCN FW version naming convention
drm/amdgpu: fix UBSAN: Undefined behaviour for amdgpu_fence.c
drm/meson: Fix an un-handled error path in 'meson_drv_bind_master()'
drm/amdgpu: GPU vs CPU page size fixes in amdgpu_vm_bo_split_mapping
drm/amdgpu: Count disabled CRTCs in commit tail earlier
drm/mali-dp: Rectify the width and height passed to rotmem_required()
drm/arm/malidp: Preserve LAYER_FORMAT contents when setting format
drm: mali-dp: Enable Global SE interrupts mask for DP500
drm/arm/malidp: Ensure that the crtcs are shutdown before removing any encoder/connector
Pull device mapper fixes from Mike Snitzer:
- Fix dm core to use more efficient bio_split() instead of
bio_clone_bioset(). Also fixes splitting bio that has integrity
payload.
- Three fixes related to properly validating DAX capabilities of a
stacked DM device that will advertise DAX support.
- Update DM writecache target to use 2-factor allocator arguments. Kees
says this is the last related change for 4.18.
- Fix DM zoned target to use GFP_NOIO to avoid triggering reclaim
during IO submission (caught by lockdep).
- Fix DM thinp to gracefully recover from running out of data space
while a previous async discard completes (whereby freeing space).
- Fix DM thinp's metadata transaction commit to avoid needless work.
* tag 'for-4.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: prevent DAX mounts if not supported
dax: check for QUEUE_FLAG_DAX in bdev_dax_supported()
pmem: only set QUEUE_FLAG_DAX for fsdax mode
dm thin: handle running out of data space vs concurrent discard
dm raid: don't use 'const' in function return
dm zoned: avoid triggering reclaim from inside dmz_map()
dm writecache: use 2-factor allocator arguments
dm thin metadata: remove needless work from __commit_transaction
dm: use bio_split() when splitting out the already processed bio
Pull single NVMe fix from Christoph.
* 'nvme-4.18' of git://git.infradead.org/nvme:
nvme-rdma: fix possible double free of controller async event buffer
Fix the test that verifies whether bio_op(bio) represents a discard
or write zeroes operation. Compile-tested only.
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Fixes: 7435e9018f ("drbd: zero-out partial unaligned discards on local backend")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Some devices have different queue limits depending on the type of IO. A
classic case is SATA NCQ, where some commands can queue, but others
cannot. If we have NCQ commands inflight and encounter a non-queueable
command, the driver returns busy. Currently we attempt to dispatch more
from the scheduler, if we were able to queue some commands. But for the
case where we ended up stopping due to BUSY, we should not attempt to
retrieve more from the scheduler. If we do, we can get into a situation
where we attempt to queue a non-queueable command, get BUSY, then
successfully retrieve more commands from that scheduler and queue those.
This can repeat forever, starving the non-queuable command indefinitely.
Fix this by NOT attempting to pull more commands from the scheduler, if
we get a BUSY return. This should also be more optimal in terms of
letting requests stay in the scheduler for as long as possible, if we
get a BUSY due to the regular out-of-tags condition.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_pgetevents() will not change the signal mask. Mark it const to make
it clear and to reduce the need for casts in user code.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Avi Kivity <avi@scylladb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[hch: reapply the patch that got incorrectly reverted]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Berg says:
====================
Just three fixes:
* fix HT operation in mesh mode
* disable preemption in control frame TX
* check nla_parse_nested() return values
where missing (two places)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the kernel accounts the memory for network traffic through
mem_cgroup_[un]charge_skmem() interface. However the memory accounted
only includes the truesize of sk_buff which does not include the size of
sock objects. In our production environment, with opt-out kmem
accounting, the sock kmem caches (TCP[v6], UDP[v6], RAW[v6], UNIX) are
among the top most charged kmem caches and consume a significant amount
of memory which can not be left as system overhead. So, this patch
converts the kmem caches of all sock objects to SLAB_ACCOUNT.
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The limit_id_fallback array uses enum drm_ipp_size_id to index its
content. The content itself is of type enum drm_exynos_ipp_limit_type.
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
The only bits that should be preserved in decon_win_set_fmt() is
WINCONx_ENWIN_F. All other bits depends on the selected pixel formats and
are set by the mentioned function.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Set per-plane global alpha to maximum value to get proper blending of
XRGB and ARGB planes. This fixes the strange order of overlapping planes.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
DMA hardware should respect buffer pitch, so use the width calculated from
the buffer pitch instead of the virtual one.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Fix following issues related to planar YUV pixel format configuration:
- NV16/61 modes were incorrectly programmed as NV12/21,
- YVU420 was programmed as YUV420 on source,
- YVU420 and YUV422 were programmed as YUV420 on output.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Horizontal (DRM_MODE_REFLECT_Y) and vertical (DMR_MODE_REFLECT_Y) flip
were swapped in GScaler driver. Fix this by swapping code for interpreting
them.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Investigation revealed that GScaler hardware requires the real buffer width
(pitch) to be aligned to 16 pixels.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
DMA hardware should respect buffer pitch, so use the width calculated from
the buffer pitch instead of the virtual one.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Fix Cb/CR components order in two-planar YUV420, YUV422 and YUV444 modes.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Ensure that Scaler hardware is properly reset and interrupts are cleared
before processing next image.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Horizontal (DRM_MODE_REFLECT_Y) and vertical (DMR_MODE_REFLECT_Y) flip
were swapped in Rotator driver. Fix this by swapping code for interpreting
them.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Prepare a common function for size and scale checks and call it for
source and destination buffers. Then also move there the state-less checks
from exynos_drm_ipp_task_setup_buffer, so the format information is already
available in limits processing. Finally perform the IPP_LIMIT_BUFFER check
on the real width of the buffer (the width calculated from the provided
buffer pitch).
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Merge fixups for the recent extenstion of the generic power domains
(genpd) framework covering performance states.
* pm-domains:
PM / Domains: Rename opp_node to np
PM / Domains: Fix return value of of_genpd_opp_to_performance_state()
At the very least we should check the return value if
nla_parse_nested() is called with a non-NULL policy.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Commit 9757235f45, "nl80211: correct checks for
NL80211_MESHCONF_HT_OPMODE value") relaxed the range for the HT
operation field in meshconf, while also adding checks requiring
the non-greenfield and non-ht-sta bits to be set in certain
circumstances. The latter bit is actually reserved for mesh BSSes
according to Table 9-168 in 802.11-2016, so in fact it should not
be set.
wpa_supplicant sets these bits because the mesh and AP code share
the same implementation, but authsae does not. As a result, some
meshconf updates from authsae which set only the NONHT_MIXED
protection bits were being rejected.
In order to avoid breaking userspace by changing the rules again,
simply accept the values with or without the bits set, and mask
off the reserved bit to match the spec.
While in here, update the 802.11-2012 reference to 802.11-2016.
Fixes: 9757235f45 ("nl80211: correct checks for NL80211_MESHCONF_HT_OPMODE value")
Cc: Masashi Honma <masashi.honma@gmail.com>
Signed-off-by: Bob Copeland <bobcopeland@fb.com>
Reviewed-by: Masashi Honma <masashi.honma@gmail.com>
Reviewed-by: Masashi Honma <masashi.honma@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
On pre-emption enabled kernels the following print was being seen due to
missing local_bh_disable/local_bh_enable calls. mac80211 assumes that
pre-emption is disabled in the data path.
BUG: using smp_processor_id() in preemptible [00000000] code: iwd/517
caller is __ieee80211_subif_start_xmit+0x144/0x210 [mac80211]
[...]
Call Trace:
dump_stack+0x5c/0x80
check_preemption_disabled.cold.0+0x46/0x51
__ieee80211_subif_start_xmit+0x144/0x210 [mac80211]
Fixes: 9118064914 ("mac80211: Add support for tx_control_port")
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
[commit message rewrite, fixes tag]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Without this patch, firmware will not run properly on rtl8821ae, and it
causes bad user experience. For example, bad connection performance with
low rate, higher power consumption, and so on.
rtl8821ae uses two kinds of firmwares for normal and WoWlan cases, and
each firmware has firmware data buffer and size individually. Original
code always overwrite size of normal firmware rtlpriv->rtlhal.fwsize, and
this mismatch causes firmware checksum error, then firmware can't start.
In this situation, driver gives message "Firmware is not ready to run!".
Fixes: fe89707f0a ("rtlwifi: rtl8821ae: Simplify loading of WOWLAN firmware")
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Cc: Stable <stable@vger.kernel.org> # 4.0+
Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Kbuilt test robot reported:
drivers/phy/motorola/phy-mapphone-mdm6600.c:188:16: warning: is used
uninitialized in this function [-Wuninitialized]
val |= values[i] << i;
~~~~~~^~~
Looking at the phy_mdm6600_status() values does get initialized by
gpiod_get_array_value_cansleep(), but we are using wrong enum
in that function. Let's fix the use, both end up being three though
so urgent rush on this one AFAIK.
Fixes: 5d1ebbda03 ("phy: mapphone-mdm6600: Add USB PHY driver for
MDM6600 on Droid 4")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Unset is required to enable USB 3.0 PHY when XHCI reenabled in response
to setting PHY3_IDDQ_OVERRIDE in uninit().
Fixes: cd6f769fde ("phy: phy-brcm-usb-init: Power down USB 3.0 PHY when XHCI disabled")
Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
If DMA safe memory was allocated, but the subsequent I2C transfer
fails the memory is leaked. Plug this leak.
Fixes: 8a77821e74 ("i2c: smbus: use DMA safe buffers for emulated SMBus transactions")
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
This reverts commit 3e5f06bed7. As per
bugzilla #200045, this caused a regression. I don't really see a way to
fix it without having the hardware. So, revert the patch and I will fix
the issue I was seeing originally in the i2c-gpio driver itself. I
couldn't find new users of this algorithm since, so there should be no
one depending on the new behaviour.
Reported-by: Sergey Larin <cerg2010cerg2010@mail.ru>
Fixes: 3e5f06bed7 ("i2c: algo-bit: init the bus to a known state")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Sergey Larin <cerg2010cerg2010@mail.ru>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
This is easily triggered from userspace, so let's ratelimit the
messages.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Commit 60622d6822 "x86/asm/memcpy_mcsafe: Return bytes remaining"
converted callers of memcpy_mcsafe() to expect a positive 'bytes
remaining' value rather than a negative error code. The nsio_rw_bytes()
conversion failed to return success. The failure is benign in that
nsio_rw_bytes() will end up writing back what it just read.
Fixes: 60622d6822 ("x86/asm/memcpy_mcsafe: Return bytes remaining")
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If a user is accessing a file in selinuxfs with a pointer to a userspace
buffer that is backed by e.g. a userfaultfd, the userspace access can
stall indefinitely, which can block fsi->mutex if it is held.
For sel_read_policy(), remove the locking, since this method doesn't seem
to access anything that requires locking.
For sel_read_bool(), move the user access below the locked region.
For sel_write_bool() and sel_commit_bools_write(), move the user access
up above the locked region.
Cc: stable@vger.kernel.org
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: removed an unused variable in sel_read_policy()]
Signed-off-by: Paul Moore <paul@paul-moore.com>
For ACLs implemented using either FIB rules or FIB entries, the BPF
program needs the FIB lookup status to be able to drop the packet.
Since the bpf_fib_lookup API has not reached a released kernel yet,
change the return code to contain an encoding of the FIB lookup
result and return the nexthop device index in the params struct.
In addition, inform the BPF program of any post FIB lookup reason as
to why the packet needs to go up the stack.
The fib result for unicast routes must have an egress device, so remove
the check that it is non-NULL.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Flag with FLAG_EXPECTED_FAIL the BPF_MAXINSNS tests that cannot be jited
on s390 because they exceed BPF_SIZE_MAX and fail when
CONFIG_BPF_JIT_ALWAYS_ON is set. Also set .expected_errcode to -ENOTSUPP
so the tests pass in that case.
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The current MIPS implementation of arch_trigger_cpumask_backtrace() is
broken because it attempts to use synchronous IPIs despite the fact that
it may be run with interrupts disabled.
This means that when arch_trigger_cpumask_backtrace() is invoked, for
example by the RCU CPU stall watchdog, we may:
- Deadlock due to use of synchronous IPIs with interrupts disabled,
causing the CPU that's attempting to generate the backtrace output
to hang itself.
- Not succeed in generating the desired output from remote CPUs.
- Produce warnings about this from smp_call_function_many(), for
example:
[42760.526910] INFO: rcu_sched detected stalls on CPUs/tasks:
[42760.535755] 0-...!: (1 GPs behind) idle=ade/140000000000000/0 softirq=526944/526945 fqs=0
[42760.547874] 1-...!: (0 ticks this GP) idle=e4a/140000000000000/0 softirq=547885/547885 fqs=0
[42760.559869] (detected by 2, t=2162 jiffies, g=266689, c=266688, q=33)
[42760.568927] ------------[ cut here ]------------
[42760.576146] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:416 smp_call_function_many+0x88/0x20c
[42760.587839] Modules linked in:
[42760.593152] CPU: 2 PID: 1216 Comm: sh Not tainted 4.15.4-00373-gee058bb4d0c2 #2
[42760.603767] Stack : 8e09bd20 8e09bd20 8e09bd20 fffffff0 00000007 00000006 00000000 8e09bca8
[42760.616937] 95b2b379 95b2b379 807a0080 00000007 81944518 0000018a 00000032 00000000
[42760.630095] 00000000 00000030 80000000 00000000 806eca74 00000009 8017e2b8 000001a0
[42760.643169] 00000000 00000002 00000000 8e09baa4 00000008 808b8008 86d69080 8e09bca0
[42760.656282] 8e09ad50 805e20aa 00000000 00000000 00000000 8017e2b8 00000009 801070ca
[42760.669424] ...
[42760.673919] Call Trace:
[42760.678672] [<27fde568>] show_stack+0x70/0xf0
[42760.685417] [<84751641>] dump_stack+0xaa/0xd0
[42760.692188] [<699d671c>] __warn+0x80/0x92
[42760.698549] [<68915d41>] warn_slowpath_null+0x28/0x36
[42760.705912] [<f7c76c1c>] smp_call_function_many+0x88/0x20c
[42760.713696] [<6bbdfc2a>] arch_trigger_cpumask_backtrace+0x30/0x4a
[42760.722216] [<f845bd33>] rcu_dump_cpu_stacks+0x6a/0x98
[42760.729580] [<796e7629>] rcu_check_callbacks+0x672/0x6ac
[42760.737476] [<059b3b43>] update_process_times+0x18/0x34
[42760.744981] [<6eb94941>] tick_sched_handle.isra.5+0x26/0x38
[42760.752793] [<478d3d70>] tick_sched_timer+0x1c/0x50
[42760.759882] [<e56ea39f>] __hrtimer_run_queues+0xc6/0x226
[42760.767418] [<e88bbcae>] hrtimer_interrupt+0x88/0x19a
[42760.775031] [<6765a19e>] gic_compare_interrupt+0x2e/0x3a
[42760.782761] [<0558bf5f>] handle_percpu_devid_irq+0x78/0x168
[42760.790795] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
[42760.798117] [<1b6d462c>] gic_handle_local_int+0x38/0x86
[42760.805545] [<b2ada1c7>] gic_irq_dispatch+0xa/0x14
[42760.812534] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
[42760.820086] [<c7521934>] do_IRQ+0x16/0x20
[42760.826274] [<9aef3ce6>] plat_irq_dispatch+0x62/0x94
[42760.833458] [<6a94b53c>] except_vec_vi_end+0x70/0x78
[42760.840655] [<22284043>] smp_call_function_many+0x1ba/0x20c
[42760.848501] [<54022b58>] smp_call_function+0x1e/0x2c
[42760.855693] [<ab9fc705>] flush_tlb_mm+0x2a/0x98
[42760.862730] [<0844cdd0>] tlb_flush_mmu+0x1c/0x44
[42760.869628] [<cb259b74>] arch_tlb_finish_mmu+0x26/0x3e
[42760.877021] [<1aeaaf74>] tlb_finish_mmu+0x18/0x66
[42760.883907] [<b3fce717>] exit_mmap+0x76/0xea
[42760.890428] [<c4c8a2f6>] mmput+0x80/0x11a
[42760.896632] [<a41a08f4>] do_exit+0x1f4/0x80c
[42760.903158] [<ee01cef6>] do_group_exit+0x20/0x7e
[42760.909990] [<13fa8d54>] __wake_up_parent+0x0/0x1e
[42760.917045] [<46cf89d0>] smp_call_function_many+0x1a2/0x20c
[42760.924893] [<8c21a93b>] syscall_common+0x14/0x1c
[42760.931765] ---[ end trace 02aa09da9dc52a60 ]---
[42760.938342] ------------[ cut here ]------------
[42760.945311] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:291 smp_call_function_single+0xee/0xf8
...
This patch switches MIPS' arch_trigger_cpumask_backtrace() to use async
IPIs & smp_call_function_single_async() in order to resolve this
problem. We ensure use of the pre-allocated call_single_data_t
structures is serialized by maintaining a cpumask indicating that
they're busy, and refusing to attempt to send an IPI when a CPU's bit is
set in this mask. This should only happen if a CPU hasn't responded to a
previous backtrace IPI - ie. if it's hung - and we print a warning to
the console in this case.
I've marked this for stable branches as far back as v4.9, to which it
applies cleanly. Strictly speaking the faulty MIPS implementation can be
traced further back to commit 856839b768 ("MIPS: Add
arch_trigger_all_cpu_backtrace() function") in v3.19, but kernel
versions v3.19 through v4.8 will require further work to backport due to
the rework performed in commit 9a01c3ed5c ("nmi_backtrace: add more
trigger_*_cpu_backtrace() methods").
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19597/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.9+
Fixes: 856839b768 ("MIPS: Add arch_trigger_all_cpu_backtrace() function")
Fixes: 9a01c3ed5c ("nmi_backtrace: add more trigger_*_cpu_backtrace() methods")
The other day I was testing one of the HP laptops at my office with an
i915/amdgpu hybrid setup and noticed that hotplugging was non-functional
on almost all of the display outputs. I eventually discovered that all
of the external outputs were connected to the amdgpu device instead of
i915, and that the hotplugs weren't being detected so long as the GPU
was in runtime suspend. After some talking with folks at AMD, I learned
that amdgpu is actually supposed to support hotplug detection in runtime
suspend so long as the OEM has implemented it properly in the firmware.
On this HP ZBook 15 G4 (the machine in question), amdgpu wasn't managing
to find the ATIF handle at all despite the fact that I could see acpi
events being sent in response to any hotplugging. After going through
dumps of the firmware, I discovered that this machine did in fact
support ATIF, but that it's ATIF method lived in an entirely different
namespace than this device's handle (the device handle was
\_SB_.PCI0.PEG0.PEGP, but ATIF lives in ATPX's handle at
\_SB_.PCI0.GFX0).
So, fix this by probing ATPX's ACPI parent's namespace if we can't find
ATIF elsewhere, along with storing a pointer to the proper handle to use
for ATIF and using that instead of the device's handle.
This fixes HPD detection while in runtime suspend for this ZBook!
v2: Update the comment to reflect how the namespaces are arranged
based on the system configuration. (Alex)
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Since it seems that some vendors are storing the ATIF ACPI methods under
the same handle that ATPX lives under instead of the device's own
handle, we're going to need to be able to retrieve this handle later so
we can probe for ATIF there.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Currently, there is nothing in amdgpu that actually uses these structs
other than amdgpu_acpi.c. Additionally, since we're about to start
saving the correct ACPI handle to use for calling ATIF in this struct
this saves us from having to handle making sure that the acpi_handle
(and by proxy, the type definition for acpi_handle and all of the other
acpi headers) doesn't need to be included within the amdgpu_drv struct
itself. This follows the example set by amdgpu_atpx_handler.c.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
flag is set on the device's request queue to decide whether or not the
device supports filesystem DAX. Really we should be using
bdev_dax_supported() like filesystems do at mount time. This performs
other tests like checking to make sure the dax_direct_access() path works.
We also explicitly clear QUEUE_FLAG_DAX on the DM device's request queue if
any of the underlying devices do not support DAX. This makes the handling
of QUEUE_FLAG_DAX consistent with the setting/clearing of most other flags
in dm_table_set_restrictions().
Now that bdev_dax_supported() explicitly checks for QUEUE_FLAG_DAX, this
will ensure that filesystems built upon DM devices will only be able to
mount with DAX if all underlying devices also support DAX.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: commit 545ed20e6d ("dm: add infrastructure for DAX support")
Cc: stable@vger.kernel.org
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Add an explicit check for QUEUE_FLAG_DAX to __bdev_dax_supported(). This
is needed for DM configurations where the first element in the dm-linear or
dm-stripe target supports DAX, but other elements do not. Without this
check __bdev_dax_supported() will pass for such devices, letting a
filesystem on that device mount with the DAX option.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Mike Snitzer <snitzer@redhat.com>
Fixes: commit 545ed20e6d ("dm: add infrastructure for DAX support")
Cc: stable@vger.kernel.org
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
QUEUE_FLAG_DAX is an indication that a given block device supports
filesystem DAX and should not be set for PMEM namespaces which are in "raw"
mode. These namespaces lack struct page and are prevented from
participating in filesystem DAX as of commit 569d0365f5 ("dax: require
'struct page' by default for filesystem dax").
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Mike Snitzer <snitzer@redhat.com>
Fixes: 569d0365f5 ("dax: require 'struct page' by default for filesystem dax")
Cc: stable@vger.kernel.org
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
During assemble, the spare marked for replacement is not checked.
conf->fullsync cannot be updated to be 1. As a result, recovery will
treat it as a clean array. All recovering sectors are skipped. Original
device is replaced with the not-recovered spare.
mdadm -C /dev/md0 -l10 -n4 -pn2 /dev/loop[0123]
mdadm /dev/md0 -a /dev/loop4
mdadm /dev/md0 --replace /dev/loop0
mdadm -S /dev/md0 # stop array during recovery
mdadm -A /dev/md0 /dev/loop[01234]
After reassemble, you can see recovery go on, but it completes
immediately. In fact, recovery is not actually processed.
To solve this problem, we just add the missing logics for replacment
spares. (In raid1.c or raid5.c, they have already been checked.)
Reported-by: Alex Chen <alexchen@synology.com>
Reviewed-by: Alex Wu <alexwu@synology.com>
Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Pull printk fix from Petr Mladek:
"Revert a commit that went in by mistake. I already have a better fix
in the queue for 4.19"
* tag 'printk-for-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
Revert "lib/test_printf.c: call wait_for_random_bytes() before plain %p tests"
Pull sound fixes from Takashi Iwai:
"Over a dozen changes, but all small and clear fixes.
Half of them are the regression fixes for CA0132 HD-audio codec, and
the rest are, again, a few more fixups for HD-audio, two UBSAN fixes
in the core ioctls, and a trivial fix in the error path handling in
lx6464es driver"
* tag 'sound-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: seq: Fix UBSAN warning at SNDRV_SEQ_IOCTL_QUERY_NEXT_CLIENT ioctl
ALSA: timer: Fix UBSAN warning at SNDRV_TIMER_IOCTL_NEXT_DEVICE ioctl
ALSA: hda/realtek - Fix the problem of two front mics on more machines
ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210
ALSA: hda/ca0132: make array ca0132_alt_chmaps static
ALSA: hda - Force to link down at runtime suspend on ATI/AMD HDMI
ALSA: lx6464es: Missing error code in snd_lx6464es_create()
ALSA: hda/ca0132: Fix DMic data rate for Alienware M17x R4
ALSA: hda/ca0132: Restore PCM Analog Mic-In2
ALSA: hda/ca0132: Don't test for QUIRK_NONE
ALSA: hda/ca0132: Restore behavior of QUIRK_ALIENWARE
ALSA: hda/ca0132: Delete redundant UNSOL event requests
ALSA: hda/ca0132: Delete pointless assignments to struct auto_pin_cfg fields
ALSA: hda/realtek - Fix pop noise on Lenovo P50 & co
Pull mtd fixes from Boris Brezillon:
"NAND fixes:
- add a quirk for a bunch of broken Macronix chips
- fix nand_block_bad() when chip->ecc.read_oob() returns a positive
value encoding the number of bitflips
- fix OOB handling in the MXC driver fo V2.1 controllers
- flag the ONFI_FEATURE_ON_DIE_ECC as supported in the Micron driver
- hardcode clk rate in the denali_dt driver to address a bad DT
representation (the proper fix will be queued for 4.19)
SPI NOR fixes:
- add an ULL constant to some ID definitions so that the ID is not
truncated on 32-bit platforms
MTD fixes:
- fix the sector unlocking logic in the CFI driver"
* tag 'mtd/fixes-for-4.18-rc3' of git://git.infradead.org/linux-mtd:
mtd: rawnand: denali_dt: set clk_x_rate to 200 MHz unconditionally
mtd: dataflash: Use ULL suffix for 64-bit constants
mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking.
mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary
mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips
mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock()
mtd: rawnand: All AC chips have a broken GET_FEATURES(TIMINGS).
mtd: rawnand: fix return value check for bad block status
mtd: rawnand: mxc: set spare area size register explicitly
mtd: rawnand: micron: add ONFI_FEATURE_ON_DIE_ECC to supported features
The vbios firmware structure changed between v3_1 and v3_2. So,
the code to setup bootup values needs different paths based
on header version.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The generic nmi_cpu_backtrace() function calls show_regs() when a struct
pt_regs is available, and dump_stack() otherwise. If we were to make use
of the generic nmi_cpu_backtrace() with MIPS' current implementation of
show_regs() this would mean that we see only register data with no
accompanying stack information, in contrast with our current
implementation which calls dump_stack() regardless of whether register
state is available.
In preparation for making use of the generic nmi_cpu_backtrace() to
implement arch_trigger_cpumask_backtrace(), have our implementation of
show_regs() call dump_stack() and drop the explicit dump_stack() call in
arch_dump_stack() which is invoked by arch_trigger_cpumask_backtrace().
This will allow the output we produce to remain the same after a later
patch switches to using nmi_cpu_backtrace(). It may mean that we produce
extra stack output in other uses of show_regs(), but this:
1) Seems harmless.
2) Is good for consistency between arch_trigger_cpumask_backtrace()
and other users of show_regs().
3) Matches the behaviour of the ARM & PowerPC architectures.
Marked for stable back to v4.9 as a prerequisite of the following patch
"MIPS: Call dump_stack() from show_regs()".
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19596/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.9+
Merge fixes from Andrew Morton:
"7 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
proc: add Alexey to MAINTAINERS
kasan: depend on CONFIG_SLUB_DEBUG
include/linux/dax.h: dax_iomap_fault() returns vm_fault_t
x86/e820: put !E820_TYPE_RAM regions into memblock.reserved
slub: fix failure when we delete and create a slab cache
Revert mm/vmstat.c: fix vmstat_update() preemption BUG
lib/percpu_ida.c: don't do alloc from per-CPU list if there is none
There is a kernel panic that is triggered when reading /proc/kpageflags
on the kernel booted with kernel parameter 'memmap=nn[KMG]!ss[KMG]':
BUG: unable to handle kernel paging request at fffffffffffffffe
PGD 9b20e067 P4D 9b20e067 PUD 9b210067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 2 PID: 1728 Comm: page-types Not tainted 4.17.0-rc6-mm1-v4.17-rc6-180605-0816-00236-g2dfb086ef02c+ #160
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 04/01/2014
RIP: 0010:stable_page_flags+0x27/0x3c0
Code: 00 00 00 0f 1f 44 00 00 48 85 ff 0f 84 a0 03 00 00 41 54 55 49 89 fc 53 48 8b 57 08 48 8b 2f 48 8d 42 ff 83 e2 01 48 0f 44 c7 <48> 8b 00 f6 c4 01 0f 84 10 03 00 00 31 db 49 8b 54 24 08 4c 89 e7
RSP: 0018:ffffbbd44111fde0 EFLAGS: 00010202
RAX: fffffffffffffffe RBX: 00007fffffffeff9 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffffed1182fff5c0
RBP: ffffffffffffffff R08: 0000000000000001 R09: 0000000000000001
R10: ffffbbd44111fed8 R11: 0000000000000000 R12: ffffed1182fff5c0
R13: 00000000000bffd7 R14: 0000000002fff5c0 R15: ffffbbd44111ff10
FS: 00007efc4335a500(0000) GS:ffff93a5bfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffffffffffffe CR3: 00000000b2a58000 CR4: 00000000001406e0
Call Trace:
kpageflags_read+0xc7/0x120
proc_reg_read+0x3c/0x60
__vfs_read+0x36/0x170
vfs_read+0x89/0x130
ksys_pread64+0x71/0x90
do_syscall_64+0x5b/0x160
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7efc42e75e23
Code: 09 00 ba 9f 01 00 00 e8 ab 81 f4 ff 66 2e 0f 1f 84 00 00 00 00 00 90 83 3d 29 0a 2d 00 00 75 13 49 89 ca b8 11 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 db d3 01 00 48 89 04 24
According to kernel bisection, this problem became visible due to commit
f7f99100d8 ("mm: stop zeroing memory during allocation in vmemmap")
which changes how struct pages are initialized.
Memblock layout affects the pfn ranges covered by node/zone. Consider
that we have a VM with 2 NUMA nodes and each node has 4GB memory, and
the default (no memmap= given) memblock layout is like below:
MEMBLOCK configuration:
memory size = 0x00000001fff75c00 reserved size = 0x000000000300c000
memory.cnt = 0x4
memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000100000000-0x000000013fffffff], 0x0000000040000000 bytes on node 0 flags: 0x0
memory[0x3] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
...
If you give memmap=1G!4G (so it just covers memory[0x2]),
the range [0x100000000-0x13fffffff] is gone:
MEMBLOCK configuration:
memory size = 0x00000001bff75c00 reserved size = 0x000000000300c000
memory.cnt = 0x3
memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
...
This causes shrinking node 0's pfn range because it is calculated by the
address range of memblock.memory. So some of struct pages in the gap
range are left uninitialized.
We have a function zero_resv_unavail() which does zeroing the struct pages
within the reserved unavailable range (i.e. memblock.memory &&
!memblock.reserved). This patch utilizes it to cover all unavailable
ranges by putting them into memblock.reserved.
Link: http://lkml.kernel.org/r/20180615072947.GB23273@hori1.linux.bs1.fc.nec.co.jp
Fixes: f7f99100d8 ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: Oscar Salvador <osalvador@suse.de>
Tested-by: "Herton R. Krzesinski" <herton@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In kernel 4.17 I removed some code from dm-bufio that did slab cache
merging (commit 21bb132767: "dm bufio: remove code that merges slab
caches") - both slab and slub support merging caches with identical
attributes, so dm-bufio now just calls kmem_cache_create and relies on
implicit merging.
This uncovered a bug in the slub subsystem - if we delete a cache and
immediatelly create another cache with the same attributes, it fails
because of duplicate filename in /sys/kernel/slab/. The slub subsystem
offloads freeing the cache to a workqueue - and if we create the new
cache before the workqueue runs, it complains because of duplicate
filename in sysfs.
This patch fixes the bug by moving the call of kobject_del from
sysfs_slab_remove_workfn to shutdown_cache. kobject_del must be called
while we hold slab_mutex - so that the sysfs entry is deleted before a
cache with the same attributes could be created.
Running device-mapper-test-suite with:
dmtest run --suite thin-provisioning -n /commit_failure_causes_fallback/
triggered:
Buffer I/O error on dev dm-0, logical block 1572848, async page read
device-mapper: thin: 253:1: metadata operation 'dm_pool_alloc_data_block' failed: error = -5
device-mapper: thin: 253:1: aborting current metadata transaction
sysfs: cannot create duplicate filename '/kernel/slab/:a-0000144'
CPU: 2 PID: 1037 Comm: kworker/u48:1 Not tainted 4.17.0.snitm+ #25
Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
Workqueue: dm-thin do_worker [dm_thin_pool]
Call Trace:
dump_stack+0x5a/0x73
sysfs_warn_dup+0x58/0x70
sysfs_create_dir_ns+0x77/0x80
kobject_add_internal+0xba/0x2e0
kobject_init_and_add+0x70/0xb0
sysfs_slab_add+0xb1/0x250
__kmem_cache_create+0x116/0x150
create_cache+0xd9/0x1f0
kmem_cache_create_usercopy+0x1c1/0x250
kmem_cache_create+0x18/0x20
dm_bufio_client_create+0x1ae/0x410 [dm_bufio]
dm_block_manager_create+0x5e/0x90 [dm_persistent_data]
__create_persistent_data_objects+0x38/0x940 [dm_thin_pool]
dm_pool_abort_metadata+0x64/0x90 [dm_thin_pool]
metadata_operation_failed+0x59/0x100 [dm_thin_pool]
alloc_data_block.isra.53+0x86/0x180 [dm_thin_pool]
process_cell+0x2a3/0x550 [dm_thin_pool]
do_worker+0x28d/0x8f0 [dm_thin_pool]
process_one_work+0x171/0x370
worker_thread+0x49/0x3f0
kthread+0xf8/0x130
ret_from_fork+0x35/0x40
kobject_add_internal failed for :a-0000144 with -EEXIST, don't try to register things with the same name in the same directory.
kmem_cache_create(dm_bufio_buffer-16) failed with error -17
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1806151817130.6333@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert commit c7f26ccfb2 ("mm/vmstat.c: fix vmstat_update() preemption
BUG"). Steven saw a "using smp_processor_id() in preemptible" message
and added a preempt_disable() section around it to keep it quiet. This
is not the right thing to do it does not fix the real problem.
vmstat_update() is invoked by a kworker on a specific CPU. This worker
it bound to this CPU. The name of the worker was "kworker/1:1" so it
should have been a worker which was bound to CPU1. A worker which can
run on any CPU would have a `u' before the first digit.
smp_processor_id() can be used in a preempt-enabled region as long as
the task is bound to a single CPU which is the case here. If it could
run on an arbitrary CPU then this is the problem we have an should seek
to resolve.
Not only this smp_processor_id() must not be migrated to another CPU but
also refresh_cpu_vm_stats() which might access wrong per-CPU variables.
Not to mention that other code relies on the fact that such a worker
runs on one specific CPU only.
Therefore revert that commit and we should look instead what broke the
affinity mask of the kworker.
Link: http://lkml.kernel.org/r/20180504104451.20278-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The poll() changes were not well thought out, and completely
unexplained. They also caused a huge performance regression, because
"->poll()" was no longer a trivial file operation that just called down
to the underlying file operations, but instead did at least two indirect
calls.
Indirect calls are sadly slow now with the Spectre mitigation, but the
performance problem could at least be largely mitigated by changing the
"->get_poll_head()" operation to just have a per-file-descriptor pointer
to the poll head instead. That gets rid of one of the new indirections.
But that doesn't fix the new complexity that is completely unwarranted
for the regular case. The (undocumented) reason for the poll() changes
was some alleged AIO poll race fixing, but we don't make the common case
slower and more complex for some uncommon special case, so this all
really needs way more explanations and most likely a fundamental
redesign.
[ This revert is a revert of about 30 different commits, not reverted
individually because that would just be unnecessarily messy - Linus ]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These properties are required for compatibility with runtime PM.
Without these properties, MMC host controller will not be aware
of power capabilities. When the wlcore driver attempts to power
on the device, it will erroneously fail with -EACCES. This fixes
a regression found here: https://lkml.org/lkml/2018/6/12/930
Fixes: 60f36637bb ("wlcore: sdio: allow pm to handle sdio power")
Signed-off-by: Ryan Grachek <ryan@edited.us>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
These properties are required for compatibility with runtime PM.
Without these properties, MMC host controller will not be aware
of power capabilities. When the wlcore driver attempts to power
on the device, it will erroneously fail with -EACCES.
Fixes: 60f36637bb ("wlcore: sdio: allow pm to handle sdio power")
Signed-off-by: Ryan Grachek <ryan@edited.us>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
This patch avoids that removing a path controlled by the dm-mpath driver
while mkfs is running triggers the following kernel bug:
kernel BUG at block/blk-core.c:3347!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 20 PID: 24369 Comm: mkfs.ext4 Not tainted 4.18.0-rc1-dbg+ #2
RIP: 0010:blk_end_request_all+0x68/0x70
Call Trace:
<IRQ>
dm_softirq_done+0x326/0x3d0 [dm_mod]
blk_done_softirq+0x19b/0x1e0
__do_softirq+0x128/0x60d
irq_exit+0x100/0x110
smp_call_function_single_interrupt+0x90/0x330
call_function_single_interrupt+0xf/0x20
</IRQ>
Fixes: f9d03f96b9 ("block: improve handling of the magic discard payload")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
I haven't seen any real SMP machine yet with > 4 CPUs (we don't suport
SuperDomes yet), so reducing the default maximum number of CPUs may speed up
various bitop functions which depend on number of CPUs in the system.
bload-o-meter on a typical 64-bit kernel shows:
Data: add/remove: 0/0 grow/shrink: 0/10 up/down: 0/-3724 (-3724)
Total: Before=1910404, After=1906680, chg -0.19%
Code: add/remove: 0/2 grow/shrink: 42/38 up/down: 2320/-3500 (-1180)
Total: Before=11053099, After=11051919, chg -0.01%
Signed-off-by: Helge Deller <deller@gmx.de>
Convert printk(KERN_LEVEL) type of calls to pr_lvl() macros.
While here,
- convert printk() to pr_info()
- join back string literal to be on one line
- use %*phN (note, it gives 1 byte more for sake of simplicity)
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
A full boot only succeeds with 4kB page sizes currently.
For 16kB and 64kB page size support somone needs to fix the LBA PCI code
at least, so mark those broken for now.
Signed-off-by: Helge Deller <deller@gmx.de>
This header file isn't exported to userspace, so there is no benefit in
defining struct sigaction for userspace here.
Signed-off-by: Helge Deller <deller@gmx.de>
If reconnect/reset failed where the controller async event buffer
was freed, we might end up freeing it again as we call
nvme_rdma_destroy_admin_queue again in the remove path. Given that
the sequence is guaranteed to serialize by .ctrl_stop, we simply
set ctrl->async_event_sqe.data to NULL and don't free it in future
visits.
Reported-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Make sure we initialize *bno and *len, before jumping to out_bad_rec
label, and risk calling xfs_warn() with uninitialized variables.
Coverity: 100898
Coverity: 1437081
Coverity: 1437129
Coverity: 1437191
Coverity: 1437201
Coverity: 1437212
Coverity: 1437341
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
If buf[-1] just happens to hold the byte 0x0A, then nread can wrap around
to (size_t)-1, leading to invalid memory accesses.
This has caused segmentation faults when trying to build the latest
kernel snapshots for i686 in Fedora:
https://bugzilla.redhat.com/show_bug.cgi?id=1592374
Signed-off-by: Jerry James <loganjerry@gmail.com>
[alexpl@fedoraproject.org: reformatted patch for submission]
Signed-off-by: Alexander Ploumistos <alexpl@fedoraproject.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Since commit 5d20ee3192 ("kbuild: Allow LD_DEAD_CODE_DATA_ELIMINATION
to be selectable if enabled"), HAVE_LD_DEAD_CODE_DATA_ELIMINATION is
supposed to be selected by architectures that are capable of this
functionality. LD_DEAD_CODE_DATA_ELIMINATION is now users' selection.
Update the help message.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Each symbol has a property of type P_SYMBOL since commit
59e89e3ddf (kconfig: save location of config symbols).
Handle those properties in print_symbol().
Further, place a pointer to print_symbol() in the comment above the
list of known property type.
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
This is clearly a bug.
We need to set the DMA buffer size in the HW otherwise corruption can
occur when receiving packets.
This is probably not occuring because of small MTU values and because HW
has a default value internally (which currently is bigger than default
buffer size).
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent poll change may lead to stalls for non-blocking connecting
SMC sockets, since sock_poll_wait is no longer performed on the
internal CLC socket, but on the outer SMC socket. kernel_connect() on
the internal CLC socket returns with -EINPROGRESS, but the wake up
logic does not work in all cases. If the internal CLC socket is still
in state TCP_SYN_SENT when polled, sock_poll_wait() from sock_poll()
does not sleep. It is supposed to sleep till the state of the internal
CLC socket switches to TCP_ESTABLISHED.
This problem triggered a redesign of the SMC nonblocking connect logic.
This patch introduces a connect worker covering all connect steps
followed by a wake up of socket waiters. It allows to get rid of all
delays and locks in smc_poll().
Fixes: c0129a0614 ("smc: convert to ->poll_mask")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Larry Brakmo proposal ( https://patchwork.ozlabs.org/patch/935233/
tcp: force cwnd at least 2 in tcp_cwnd_reduction) made us rethink
about our recent patch removing ~16 quick acks after ECN events.
tcp_enter_quickack_mode(sk, 1) makes sure one immediate ack is sent,
but in the case the sender cwnd was lowered to 1, we do not want
to have a delayed ack for the next packet we will receive.
Fixes: 522040ea5f ("tcp: do not aggressively quick ack after ECN events")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Neal Cardwell <ncardwell@google.com>
Cc: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
What we want here is to embed a user-space program into the kernel.
Instead of the complex ELF magic, let's simply wrap it in the assembly
with the '.incbin' directive.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On receving an incomplete message, the existing code stores the
remaining length of the cloned skb in the early_eaten field instead of
incrementing the value returned by __strp_recv. This defers invocation
of sock_rfree for the current skb until the next invocation of
__strp_recv, which returns early_eaten if early_eaten is non-zero.
This behavior causes a stall when the current message occupies the very
tail end of a massive skb, and strp_peek/need_bytes indicates that the
remainder of the current message has yet to arrive on the socket. The
TCP receive buffer is totally full, causing the TCP window to go to
zero, so the remainder of the message will never arrive.
Incrementing the value returned by __strp_recv by the amount otherwise
stored in early_eaten prevents stalls of this nature.
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to free all resources associated with the ida on module
exit.
Fixes: cd6484e183 ("serdev: Introduce new bus for serial attached devices")
Cc: stable <stable@vger.kernel.org> # 4.11
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After the commit
7d8905d064 ("serial: 8250_pci: Enable device after we check black list")
pure serial multi-port cards, such as CH355, got blacklisted and thus
not being enumerated anymore. Previously, it seems, blacklisting them
was on purpose to shut up pciserial_init_one() about record duplication.
So, remove the entries from blacklist in order to get cards enumerated.
Fixes: 7d8905d064 ("serial: 8250_pci: Enable device after we check black list")
Reported-by: Matt Turner <mattst88@gmail.com>
Cc: Sergej Pupykin <ml@sergej.pp.ru>
Cc: Alexandr Petrenko <petrenkoas83@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzbot is reporting stalls at __process_echoes() [1]. This is because
since ldata->echo_commit < ldata->echo_tail becomes true for some reason,
the discard loop is serving as almost infinite loop. This patch tries to
avoid falling into ldata->echo_commit < ldata->echo_tail situation by
making access to echo_* variables more carefully.
Since reset_buffer_flags() is called without output_lock held, it should
not touch echo_* variables. And omit a call to reset_buffer_flags() from
n_tty_open() by using vzalloc().
Since add_echo_byte() is called without output_lock held, it needs memory
barrier between storing into echo_buf[] and incrementing echo_head counter.
echo_buf() needs corresponding memory barrier before reading echo_buf[].
Lack of handling the possibility of not-yet-stored multi-byte operation
might be the reason of falling into ldata->echo_commit < ldata->echo_tail
situation, for if I do WARN_ON(ldata->echo_commit == tail + 1) prior to
echo_buf(ldata, tail + 1), the WARN_ON() fires.
Also, explicitly masking with buffer for the former "while" loop, and
use ldata->echo_commit > tail for the latter "while" loop.
[1] https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+108696293d7a21ab688f@syzkaller.appspotmail.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For architectures that do not use per-device dma ops we need to export
the dma_map_ops structure returned from get_arch_dma_ops().
Fixes: 10314e09 ("riscv: add swiotlb support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Andreas Schwab <schwab@suse.de>
Johan writes:
USB-serial fixes for v4.18-rc3
Here are bunch of new device ids for cp210x.
All have been in linux-next with no reported issues.
Signed-off-by: Johan Hovold <johan@kernel.org>
If a power failure happens while the qgroup rescan kthread is running,
the next mount operation will always fail. This is because of a recent
regression that makes qgroup_rescan_init() incorrectly return -EINVAL
when we are mounting the filesystem (through btrfs_read_qgroup_config()).
This causes the -EINVAL error to be returned regardless of any qgroup
flags being set instead of returning the error only when neither of
the flags BTRFS_QGROUP_STATUS_FLAG_RESCAN nor BTRFS_QGROUP_STATUS_FLAG_ON
are set.
A test case for fstests follows up soon.
Fixes: 9593bf4967 ("btrfs: qgroup: show more meaningful qgroup_rescan_init error message")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The vm_fault_t conversion commit introduced a ret2 variable for tracking
the integer return values from internal btrfs functions. It was
sometimes returning VM_FAULT_LOCKED for pages that were actually invalid
and had been removed from the radix. Something like this:
ret2 = btrfs_delalloc_reserve_space() // returns zero on success
lock_page(page)
if (page->mapping != inode->i_mapping)
goto out_unlock;
...
out_unlock:
if (!ret2) {
...
return VM_FAULT_LOCKED;
}
This ends up triggering this WARNING in btrfs_destroy_inode()
WARN_ON(BTRFS_I(inode)->block_rsv.size);
xfstests generic/095 was able to reliably reproduce the errors.
Since out_unlock: is only used for errors, this fix moves it below the
if (!ret2) check we use to return VM_FAULT_LOCKED for success.
Fixes: a528a24150 (btrfs: change return type of btrfs_page_mkwrite to vm_fault_t)
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit ff3d27a048 ("btrfs: qgroup: Finish rescan when hit the last leaf
of extent tree") added a new exit for rescan finish.
However after finishing quota rescan, we set
fs_info->qgroup_rescan_progress to (u64)-1 before we exit through the
original exit path.
While we missed that assignment of (u64)-1 in the new exit path.
The end result is, the quota status item doesn't have the same value.
(-1 vs the last bytenr + 1)
Although it doesn't affect quota accounting, it's still better to keep
the original behavior.
Reported-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Fixes: ff3d27a048 ("btrfs: qgroup: Finish rescan when hit the last leaf of extent tree")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Older gcc (< 4.4) doesn't like files starting with a Unicode BOM:
drivers/net/wireless/ath/wcn36xx/testmode.c:1: error: stray ‘\357’ in program
drivers/net/wireless/ath/wcn36xx/testmode.c:1: error: stray ‘\273’ in program
drivers/net/wireless/ath/wcn36xx/testmode.c:1: error: stray ‘\277’ in program
Remove the BOM, the rest of the file is plain ASCII anyway.
Output of "file drivers/net/wireless/ath/wcn36xx/testmode.c" before:
drivers/net/wireless/ath/wcn36xx/testmode.c: C source, UTF-8 Unicode (with BOM) text
and after:
drivers/net/wireless/ath/wcn36xx/testmode.c: C source, ASCII text
Fixes: 87f825e6e2 ("wcn36xx: Add support for Factory Test Mode (FTM)")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ramon Fried <ramon.fried@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In the case of Station connects to AP with narrower bandwidth at beginning.
And later the AP changes the bandwidth to winder bandwidth, the AP will
beacon with wider bandwidth IE, eg VHT20->VHT40->VHT80 or VHT40->VHT80.
Since the supported BANDWIDTH will be limited by the PHYMODE, so while
Station receives the bandwidth change request, it will also need to
reconfigure the PHYMODE setting to firmware instead of just configuring
the BANDWIDTH info, otherwise it'll trigger a firmware crash with
non-support bandwidth.
The issue was observed in WLAN.RM.4.4.1-00051-QCARMSWP-1, QCA6174 with
below scenario:
AP xxx changed bandwidth, new config is 5200 MHz, width 2 (5190/0 MHz)
disconnect from AP xxx for new auth to yyy
RX ReassocResp from xxx (capab=0x1111 status=0 aid=102)
associated
....
AP xxx changed bandwidth, new config is 5200 MHz, width 2 (5190/0 MHz)
AP xxx changed bandwidth, new config is 5200 MHz, width 3 (5210/0 MHz)
....
firmware register dump:
[00]: 0x05030000 0x000015B3 0x00987291 0x00955B31
[04]: 0x00987291 0x00060730 0x00000004 0x00000001
[08]: 0x004089F0 0x00955A00 0x000A0B00 0x00400000
[12]: 0x00000009 0x00000000 0x00952CD0 0x00952CE6
[16]: 0x00952CC4 0x0098E25F 0x00000000 0x0091080D
[20]: 0x40987291 0x0040E7A8 0x00000000 0x0041EE3C
[24]: 0x809ABF05 0x0040E808 0x00000000 0xC0987291
[28]: 0x809A650C 0x0040E948 0x0041FE40 0x004345C4
[32]: 0x809A5C63 0x0040E988 0x0040E9AC 0x0042D1A8
[36]: 0x8091D252 0x0040E9A8 0x00000002 0x00000001
[40]: 0x809FDA9D 0x0040EA58 0x0043D554 0x0042D554
[44]: 0x809F8B22 0x0040EA78 0x0043D554 0x00000001
[48]: 0x80911210 0x0040EAC8 0x00000010 0x004041D0
[52]: 0x80911154 0x0040EB28 0x00400000 0x00000000
[56]: 0x8091122D 0x0040EB48 0x00000000 0x00400600
Reported-by: Rouven Czerwinski <rouven@czerwinskis.de>
Tested-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Ryan Hsu <ryanhsu@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Saeed Mahameed says:
====================
mlx5-fixes-2018-06-26
Fixes for mlx5 core and netdev driver:
Two fixes from Alex Vesker to address command interface issues
- Race in command interface polling mode
- Incorrect raw command length parsing
From Shay Agroskin, Fix wrong size allocation for QoS ETC TC regitster.
From Or Gerlitz and Eli Cohin, Address backward compatability issues for when
Eswitch capability is not advertised for the PF host driver
- Fix required capability for manipulating MPFS
- E-Switch, Disallow vlan/spoofcheck setup if not being esw manager
- Avoid dealing with vport IB/eth representors if not being e-switch manager
- E-Switch, Avoid setup attempt if not being e-switch manager
- Don't attempt to dereference the ppriv struct if not being eswitch manager
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
fib_tests.sh became non-executable at some point. This is
what happens:
selftests: net: fib_tests.sh: Warning: file fib_tests.sh is
not executable, correct this.
not ok 1..11 selftests: net: fib_tests.sh [FAIL]
Fixes: d69faad765 ("selftests: fib_tests: Add prefix route tests with metric")
Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The d->chans[] array has d->dma_requests elements so the > should be
>= here.
Fixes: 8e6152bc66 ("dmaengine: Add hisilicon k3 DMA engine driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The reported residue is already calculated in BURST unit granularity, so
advertise this capability properly to other devices in the system.
Fixes: aee4d1fac8 ("dmaengine: pl330: improve pl330_tx_status() function")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Jesper Dangaard Brouer says:
====================
xdp: don't mix XDP_TX and XDP_REDIRECT flush ops
Fix driver logic that are combining XDP_TX flush and XDP_REDIRECT map
flushing. These are two different XDP xmit modes, and it is clearly
wrong to invoke both types of flush operations when only one of the
XDP xmit modes is used.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver was combining XDP_TX virtqueue_kick and XDP_REDIRECT
map flushing (xdp_do_flush_map). This is suboptimal, these two
flush operations should be kept separate.
The suboptimal behavior was introduced in commit 9267c430c6
("virtio-net: add missing virtqueue kick when flushing packets").
Fixes: 9267c430c6 ("virtio-net: add missing virtqueue kick when flushing packets")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver was combining the XDP_TX tail flush and XDP_REDIRECT
map flushing (xdp_do_flush_map). This is suboptimal, these two
flush operations should be kept separate.
It looks like the mistake was copy-pasted from ixgbe.
Fixes: d9314c474d ("i40e: add support for XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver was combining the XDP_TX tail flush and XDP_REDIRECT
map flushing (xdp_do_flush_map). This is suboptimal, these two
flush operations should be kept separate.
Fixes: 11393cc9b9 ("xdp: Add batching support to redirect map")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The macb driver currently crashes on at91rm9200 with the following trace:
Unable to handle kernel NULL pointer dereference at virtual address 00000014
[...]
[<c031da44>] (macb_rx_desc) from [<c031f2bc>] (at91ether_open+0x2e8/0x3f8)
[<c031f2bc>] (at91ether_open) from [<c041e8d8>] (__dev_open+0x120/0x13c)
[<c041e8d8>] (__dev_open) from [<c041ec08>] (__dev_change_flags+0x17c/0x1a8)
[<c041ec08>] (__dev_change_flags) from [<c041ec4c>] (dev_change_flags+0x18/0x4c)
[<c041ec4c>] (dev_change_flags) from [<c07a5f4c>] (ip_auto_config+0x220/0x10b0)
[<c07a5f4c>] (ip_auto_config) from [<c000a4fc>] (do_one_initcall+0x78/0x18c)
[<c000a4fc>] (do_one_initcall) from [<c0783e50>] (kernel_init_freeable+0x184/0x1c4)
[<c0783e50>] (kernel_init_freeable) from [<c0574d70>] (kernel_init+0x8/0xe8)
[<c0574d70>] (kernel_init) from [<c00090e0>] (ret_from_fork+0x14/0x34)
Solve that by initializing bp->queues[0].bp in at91ether_init (as is done
in macb_init).
Fixes: ae1f2a56d2 ("net: macb: Added support for many RX queues")
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the brand-new syntax extension of Kconfig, we can directly
check the compiler capability in the configuration phase.
If the cc-can-link.sh fails, the BPFILTER_UMH is automatically
hidden by the dependency.
I also deleted 'default n', which is no-op.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following warning is seen when rmmod hinic. This is because affinity
value is not reset before calling free_irq(). This patch fixes it.
[ 55.181232] WARNING: CPU: 38 PID: 19589 at kernel/irq/manage.c:1608
__free_irq+0x2aa/0x2c0
Fixes: 352f58b0d9 ("net-next/hinic: Set Rxq irq to specific cpu for NUMA")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree:
1) Missing netlink attribute validation in nf_queue, uncovered by KASAN,
from Eric Dumazet.
2) Use pointer to sysctl table, save us 192 bytes of memory per netns.
Also from Eric.
3) Possible use-after-free when removing conntrack helper modules due
to missing synchronize RCU call. From Taehee Yoo.
4) Fix corner case in systcl writes to nf_log that lead to appending
data to uninitialized buffer, from Jann Horn.
5) Jann Horn says we may indefinitely block other users of nf_log_mutex
if a userspace access in proc_dostring() blocked e.g. due to a
userfaultfd.
6) Fix garbage collection race for unconfirmed conntrack entries,
from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
kmemleak reported some memory leak on reading proc files. After adding
some debug lines, find that proc_seq_fops is using seq_release as
release handler, which won't handle the free of 'private' field of
seq_file, while in fact the open handler proc_seq_open could create
the private data with __seq_open_private when state_size is greater
than zero. So after reading files created with proc_create_seq_private,
such as /proc/timer_list and /proc/vmallocinfo, the private mem of a
seq_file is not freed. Fix it by adding the paired proc_seq_release
as the default release handler of proc_seq_ops instead of seq_release.
Fixes: 44414d82cf ("proc: introduce proc_create_seq_private")
Reviewed-by: Christoph Hellwig <hch@lst.de>
CC: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
meson-gxl-mali.dtsi is only used on GXL SoCs. Thus it should use the GXL
specific compatible string instead of the GXBB one.
For now this is purely cosmetic since the (out-of-tree) lima driver for
this GPU currently uses the "arm,mali-450" match instead of the SoC
specific one. However, update the .dts to match the documentation since
this driver behavior might change in the future.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Like the odroid-c2 and wetek, the s400 uses the RTL8211F and seems to
suffer from the kind of stability issue.
Doing an iperf3 download test, we can see a significant number of LPI
interrupts on the tx path. After a short while (5 to 15 seconds), the
network connection dies. If using rootfs over NFS, the connection may
also break during the boot sequence.
We still don't have a real explanation for this problem so let's disable
EEE once again.
Fixes: f6f6ac914b ("ARM64: dts: meson-axg: enable ethernet for A113D S400 board")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Vendor firmware/uboot has different reserved regions depending on
firmware version, but current codebase reserves the same regions on
GXL and GXBB, so move the additional reserved memory region to common
.dtsi.
Found when putting a recent vendor u-boot on meson-gxbb-p200.
Suggested-by: Neil Armstrong <narmstrong@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Like LibreTech-CC, the USB0 needs the 5V regulator to be enabled to power the
devices on the P212 Reference Design based boards.
Fixes: b9f07cb4f4 ("ARM64: dts: meson-gxl-s905x-p212: enable the USB controller")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Based on updated information from Amlogic, correct the register range
for the SD/eMMC blocks to the right size.
Reported-by: Yixun Lan <yixun.lan@amlogic.com>
Tested-by: Yixun Lan <yixun.lan@amlogic.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
There is a problem with the sd-uhs mode when doing a soft reboot.
Switching back from 1.8v to 3.3v messes with the card, which no longer
respond (timeout errors). According to the specification, we should
perform a card reset (power cycling the card) but this is something we
cannot control on this design.
Then the only solution to restore the communication with the card is an
"unplug-plug" which is not acceptable
Until we find a solution, if any, disable the sd-uhs modes on this design.
For the people using uhs at the moment, there will a performance drop as
a result.
Fixes: 3cde63ebc8 ("ARM64: dts: meson-gxl: libretech-cc: enable high speed modes")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Currently, amdgpu_do_flip() spinlocks crtc->dev->event_lock and
releases it only after committing updates to the stream.
dc_commit_updates_for_stream() should be moved out of
spinlock for the below reasons:
1. event_lock is supposed to protect access to acrct->pflip_status _only_
2. dc_commit_updates_for_stream() has potential sleep's
and also its not appropriate to be in an atomic state
for such long sequences of code.
Signed-off-by: Shirish S <shirish.s@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Support new VCN FW version naming convention:
[31, 28] for VEP interface major version if applicable
[27, 24] for decode interface major version
[23, 20] for encode interface major version
[19, 12] for encode interface minor version
[11, 0] for firmware revision
Bit 20-23, it is encode major and non-zero for new naming convention.
This field is part of version minor and DRM_DISABLED_FLAG in old naming
convention. Since the latest version minor is 0x5B and DRM_DISABLED_FLAG
is zero in old naming convention, this field is always zero so far.
These four bits are used to tell which naming convention is present.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Fang, Peter <Peter.Fang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Here is the UBSAN dump:
[ 3.866656] index 2 is out of range for type 'amdgpu_uvd_inst [2]'
[ 3.866693] Workqueue: events work_for_cpu_fn
[ 3.866702] Call Trace:
[ 3.866710] dump_stack+0x85/0xc5
[ 3.866719] ubsan_epilogue+0x9/0x40
[ 3.866727] __ubsan_handle_out_of_bounds+0x89/0x90
[ 3.866737] ? rcu_read_lock_sched_held+0x58/0x60
[ 3.866746] ? __kmalloc+0x26c/0x2d0
[ 3.866846] amdgpu_fence_driver_start_ring+0x259/0x280 [amdgpu]
[ 3.866896] amdgpu_ring_init+0x12c/0x710 [amdgpu]
[ 3.866906] ? sprintf+0x42/0x50
[ 3.866956] amdgpu_gfx_kiq_init_ring+0x1bc/0x3a0 [amdgpu]
[ 3.867009] gfx_v8_0_sw_init+0x1ad3/0x2360 [amdgpu]
[ 3.867062] ? smu7_init+0xec/0x160 [amdgpu]
[ 3.867109] amdgpu_device_init+0x112c/0x1dc0 [amdgpu]
'ring->me' might be set as 2 with 'amdgpu_gfx_kiq_init_ring', that would
cause out of range for 'amdgpu_uvd_inst[2]'.
v2: simplified with ring type
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull xfs fixes from Darrick Wong:
"Here are some patches for 4.18 to fix regressions, accounting
problems, overflow problems, and to strengthen metadata validation to
prevent corruption.
This series has been run through a full xfstests run over the weekend
and through a quick xfstests run against this morning's master, with
no major failures reported.
Changes since last update:
- more metadata validation strengthening to prevent crashes.
- fix extent offset overflow problem when insert_range on a 512b
block fs
- fix some off-by-one errors in the realtime fsmap code
- fix some math errors in the default resblks calculation when free
space is low
- fix a problem where stale page contents are exposed via mmap read
after a zero_range at eof
- fix accounting problems with per-ag reservations causing statfs
reports to vary incorrectly"
* tag 'xfs-4.18-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation
xfs: ensure post-EOF zeroing happens after zeroing part of a file
xfs: fix off-by-one error in xfs_rtalloc_query_range
xfs: fix uninitialized field in rtbitmap fsmap backend
xfs: recheck reflink state after grabbing ILOCK_SHARED for a write
xfs: don't allow insert-range to shift extents past the maximum offset
xfs: don't trip over negative free space in xfs_reserve_blocks
xfs: allow empty transactions while frozen
xfs: xfs_iflush_abort() can be called twice on cluster writeback failure
xfs: More robust inode extent count validation
xfs: simplify xfs_bmap_punch_delalloc_range
Timur Tabi no longer works for Qualcomm, and he now has a kernel.org
email address, so update MAINTAINERS accordingly.
Signed-off-by: Timur Tabi <timur@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull davinci clk fixes for 4.18 from David Lechner:
Here are a couple of typo fixes for clk-davinci for 4.18.
* tag 'clk-davinci-fixes-4.18' of https://github.com/dlech/linux:
clk: davinci: fix a typo (which leads to build failures)
clk: davinci: cfgchip: testing the wrong variable
Commit 7f0b1bf045 ("arm64: Fix barriers used for page table modifications")
fixed a reported issue with fixmap page-table entries not being visible
to the walker due to a missing DSB instruction. At the same time, it added
ISB instructions to the arm64 set_{pte,pmd,pud} functions, which are not
required by the architecture and make little sense in isolation.
Remove the redundant ISBs.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The implementation of flush_icache_range() includes instruction sequences
which are themselves patched at runtime, so it is not safe to call from
the patching framework.
This patch reworks the alternatives cache-flushing code so that it rolls
its own internal D-cache maintenance using DC CIVAC before invalidating
the entire I-cache after all alternatives have been applied at boot.
Modules don't cause any issues, since flush_icache_range() is safe to
call by the time they are loaded.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Rohit Khanna <rokhanna@nvidia.com>
Cc: Alexander Van Brunt <avanbrunt@nvidia.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull MIPS build fix from Paul Burton:
"A single build fix for 4.18:
Adjust rseq_signal_deliver() & rseq_handle_notify_resume() calls to
add the ksig argument introduced in v4.18-rc2, around the same time as
the unadjusted MIPS rseq support"
* tag 'mips_fixes_4.18_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Add ksig argument to rseq_{signal_deliver,handle_notify_resume}
Pull ARM SoC fixes from Olof Johansson:
"A handful of fixes, nothing really concerning and most touching
devicetree files for various platforms.
I also regenerated the shared multiplatform defconfigs; they have
drifted quite a bit due to Kconfig changes and reordering, and several
platform maintainers tried doing the same which resulted in a lot of
conflict pain -- this way we get everybody onto the same base for next
merge window"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
arm64: dts: uniphier: fix widget name of headphone for LD11/LD20 boards
ARM: dts: Fix SPI node for Arria10
arm64: dts: stratix10: Fix SPI nodes for Stratix10
qcom: cmd-db: enforce CONFIG_OF_RESERVED_MEM dependency
ARM: Always build secure_cntvoff.S on ARM V7 to fix shmobile !SMP build
ARM: multi_v7_defconfig: renormalize based on recent additions
arm64: defconfig: renormalize based on recent additions
arm64: dts: msm8916: fix Coresight ETF graph connections
arm64: dts: apq8096-db820c: disable uart0 by default
ARM: dts: imx6sx: fix irq for pcie bridge
arm64: dts: Stingray: Fix I2C controller interrupt type
arm64: dts: ns2: Fix PCIe controller interrupt type
arm64: dts: ns2: Fix I2C controller interrupt type
arm64: dts: specify 1.8V EMMC capabilities for bcm958742t
arm64: dts: specify 1.8V EMMC capabilities for bcm958742k
ARM: dts: Cygnus: Fix PCIe controller interrupt type
ARM: dts: Cygnus: Fix I2C controller interrupt type
ARM: dts: BCM5301x: Fix i2c controller interrupt type
ARM: dts: HR2: Fix interrupt types for i2c and PCIe
ARM: dts: NSP: Fix PCIe controllers interrupt types
...
Pull SCSI fixes from James Bottomley:
"Three small bug fixes (barrier elimination, memory leak on unload,
spinlock recursion) and a technical enhancement left over from the
merge window: the TCMU read length support is required for tape
devices read when the length of the read is greater than the tape
block size"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_debug: Fix memory leak on module unload
scsi: qla2xxx: Spinlock recursion in qla_target
scsi: ipr: Eliminate duplicate barriers
scsi: target: tcmu: add read length support
Pull input updates from Dmitry Torokhov:
- the main change is a fix for my brain-dead patch to PS/2 button
reporting for some protocols that made it in 4.17
- there is a new driver for Spreadtum vibrator that I intended to send
during merge window but ended up not sending the 2nd pull request.
Given that this is a brand new driver we should not see regressions
here
- a fixup to Elantech PS/2 driver to avoid decoding errors on Thinkpad
P52
- addition of few more ACPI IDs for Silead and Elan drivers
- RMI4 is switched to using IRQ domain code instead of rolling its own
implementation
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: psmouse - fix button reporting for basic protocols
Input: xpad - fix GPD Win 2 controller name
Input: elan_i2c_smbus - fix more potential stack buffer overflows
Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID
Input: elantech - fix V4 report decoding for module with middle key
Input: elantech - enable middle button of touchpads on ThinkPad P52
Input: do not assign new tracking ID when changing tool type
Input: make input_report_slot_state() return boolean
Input: synaptics-rmi4 - fix axis-swap behavior
Input: synaptics-rmi4 - fix the error return code in rmi_probe_interrupts()
Input: synaptics-rmi4 - convert irq distribution to irq_domain
Input: silead - add MSSL0002 ACPI HID
Input: goldfish_events - fix checkpatch warnings
Input: add Spreadtrum vibrator driver
Pull more security subsystem fixes from James Morris:
"Two further fixes for the keys subsystem"
* 'fixes-v4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
dh key: fix rounding up KDF output length
certs/blacklist: fix const confusion
It may not be the actual real stable mailing list address, but the
stable scripts to actually pick up on the traditional way to mark stable
patches.
There are also reasons to explicitly avoid using the actual mailing list
address, since security patches with embargo dates generally do want the
stable marking, but don't want tools etc to mistakenly send the patch
out to the mailing list early.
So don't warn for things that are still actively used and explicitly
supported.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This ought to be an omission in e619492323 ("esp: Fix memleaks on error
paths."). The memleak on error path in esp6_input is similar to esp_input
of esp4.
Fixes: e619492323 ("esp: Fix memleaks on error paths.")
Fixes: 3f29770723 ("ipsec: check return value of skb_to_sgvec always")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
This patch fixes wrong name of headphone widget for receiving events
of insert/remove headphone plug from simple-card or audio-graph-card.
If we use wrong widget name then we get warning messages such as
"asoc-audio-graph-card sound: ASoC: DAPM unknown pin Headphones"
when the plug is inserted or removed from headphone jack.
Fixes: fb21a0acaa ("arm64: dts: uniphier: add sound node")
Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Since commit 48231f289e ("media: rc: drivers should produce alternate
pulse and space timing events"), on meson-ir we are regularly producing
errors. Reduce to warning level and only warn once to avoid flooding
the log.
A proper fix for meson-ir is going to be too large for v4.18.
Signed-off-by: Sean Young <sean@mess.org>
Cc: stable@vger.kernel.org # 4.17+
Tested-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Russell King reported:
"When removing and reloading the etnaviv module, the following splat
occurs:
sysfs: cannot create duplicate filename '/devices/platform/etnaviv'
CPU: 0 PID: 1471 Comm: modprobe Not tainted 4.17.0+ #1608
Hardware name: Marvell Dove (Cubox)
Backtrace:
[<c00157d4>] (dump_backtrace) from [<c0015b8c>] (show_stack+0x18/0x1c)
r6:ef033e38 r5:ee07b340 r4:edb9d000 r3:00000000
[<c0015b74>] (show_stack) from [<c0620784>] (dump_stack+0x20/0x28)
[<c0620764>] (dump_stack) from [<c01bcd24>] (sysfs_warn_dup+0x5c/0x70)
[<c01bccc8>] (sysfs_warn_dup) from [<c01bce14>] (sysfs_create_dir_ns+0x90/0x98)
..."
Commit 246774d17f ("drm/etnaviv: remove the need for a gpu-subsystem
DT node") introduced DRM registration via
platform_device_register_simple(), but missed to call
platform_device_unregister() inside etnaviv_exit().
Fix the problem by calling platform_device_unregister() inside
etnaviv_exit(). While at it, also rearrange the function calls
in the exit path to make them happen in the opposite order of
registration.
Tested on a imx6-sabresd board.
Cc: <stable@vger.kernel.org>
Fixes: 246774d17f ("drm/etnaviv: remove the need for a gpu-subsystem DT node")
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
platform_device_register_simple() may fail, so we should better
check its return value and propagate it in the case of error.
Cc: <stable@vger.kernel.org>
Fixes: 246774d17f ("drm/etnaviv: remove the need for a gpu-subsystem DT node")
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Discards issued to a DM thin device can complete to userspace (via
fstrim) _before_ the metadata changes associated with the discards is
reflected in the thinp superblock (e.g. free blocks). As such, if a
user constructs a test that loops repeatedly over these steps, block
allocation can fail due to discards not having completed yet:
1) fill thin device via filesystem file
2) remove file
3) fstrim
From initial report, here:
https://www.redhat.com/archives/dm-devel/2018-April/msg00022.html
"The root cause of this issue is that dm-thin will first remove
mapping and increase corresponding blocks' reference count to prevent
them from being reused before DISCARD bios get processed by the
underlying layers. However. increasing blocks' reference count could
also increase the nr_allocated_this_transaction in struct sm_disk
which makes smd->old_ll.nr_allocated +
smd->nr_allocated_this_transaction bigger than smd->old_ll.nr_blocks.
In this case, alloc_data_block() will never commit metadata to reset
the begin pointer of struct sm_disk, because sm_disk_get_nr_free()
always return an underflow value."
While there is room for improvement to the space-map accounting that
thinp is making use of: the reality is this test is inherently racey and
will result in the previous iteration's fstrim's discard(s) completing
vs concurrent block allocation, via dd, in the next iteration of the
loop.
No amount of space map accounting improvements will be able to allow
user's to use a block before a discard of that block has completed.
So the best we can really do is allow DM thinp to gracefully handle such
aggressive use of all the pool's data by degrading the pool into
out-of-data-space (OODS) mode. We _should_ get that behaviour already
(if space map accounting didn't falsely cause alloc_data_block() to
believe free space was available).. but short of that we handle the
current reality that dm_pool_alloc_data_block() can return -ENOSPC.
Reported-by: Dennis Yang <dennisyang@qnap.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The intc #interrupt-cells is equal to 1. Currently gpio
node has 2 cells per IRQ which is wrong. Remove the additional
cell for each of the interrupts.
Signed-off-by: Keerthy <j-keerthy@ti.com>
Fixes: 2e38b946dc ("ARM: davinci: da850: add GPIO DT node")
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Document the recently introduced hwp_dynamic_boost sysfs knob
allowing user space to tell intel_pstate to use iowait boosting
in the active mode with HWP enabled (to improve performance).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Fix an incorrect sysfs path in the intel_pstate admin-guide
documentation.
Fixes: 33fc30b470 (cpufreq: intel_pstate: Document the current behavior and user interface)
Reported-by: Pawit Pornkitprasan <p.pawit@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts the following commits:
1ea66554d3 ("x86/mm: Mark p4d_offset() __always_inline")
046c0dbec0 ("x86: Mark native_set_p4d() as __always_inline")
p4d_offset(), native_set_p4d() and native_p4d_clear() were marked
__always_inline in attempt to move __pgtable_l5_enabled into __initdata
section.
It was required as KASAN initialization code is a user of
USE_EARLY_PGTABLE_L5, so all pgtable_l5_enabled() translated to
__pgtable_l5_enabled there. This includes pgtable_l5_enabled() called
from inline p4d helpers.
If compiler would decided to not inline these p4d helpers, but leave
them standalone, we end up with section mismatch.
We don't need __always_inline here anymore. __pgtable_l5_enabled moved
back to be __ro_after_init. See the following commit:
51be133515 ("Revert "x86/mm: Mark __pgtable_l5_enabled __initdata"")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180626100341.49910-1-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
sizeof() will return unsigned value so in the error check
negative error code will be always larger than sizeof().
Fixes: a0d8e02c35 ("nfp: add support for reading nffw info")
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As Mathieu pointed out, my conversion to time64_t was incorrect and
resulted in negative times to be read from the RTC. The problem is
that during the conversion from a byte array to a time64_t, the
'unsigned char' variable holding the top byte gets turned into a
negative signed 32-bit integer before being assigned to the 64-bit
variable for any times after 1972.
This changes the logic to cast to an unsigned 32-bit number first for
the Macintosh time and then convert that to the Unix time, which then
gives us a time in the documented 1904..2040 year range. I decided not
to use the longer 1970..2106 range that other drivers use, for
consistency with the literal interpretation of the register, but that
could be easily changed if we decide we want to support any Mac after
2040.
Just to be on the safe side, I'm also adding a WARN_ON that will
trigger if either the year 2040 has come and is observed by this
driver, or we run into an RTC that got set back to a pre-1970 date for
some reason (the two are indistinguishable).
For the RTC write functions, Andreas found another problem: both
pmu_request() and cuda_request() are varargs functions, so changing
the type of the arguments passed into them from 32 bit to 64 bit
breaks the API for the set_rtc_time functions. This changes it back to
32 bits.
The same code exists in arch/m68k/ and is patched in an identical way
now in a separate patch.
Fixes: 5bfd643583 ("powerpc: use time64_t in read_persistent_clock")
Reported-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Jakub Kicinski says:
====================
nfp: MPLS and shared blocks TC offload fixes
This series brings two fixes to TC filter/action offload code.
Pieter fixes matching MPLS packets when the match is purely on
the MPLS ethertype and none of the MPLS fields are used.
John provides a fix for offload of shared blocks. Unfortunately,
with shared blocks there is currently no guarantee that filters
which were added by the core will be removed before block unbind.
Our simple fix is to not support offload of rules on shared blocks
at all, a revert of this fix will be send for -next once the
reoffload infrastructure lands. The shared blocks became important
as we are trying to use them for bonding offload (managed from user
space) and lack of remove calls leads to resource leaks.
v2:
- fix build error reported by kbuild bot due to missing
tcf_block_shared() helper.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
TC shared blocks allow multiple qdiscs to be grouped together and filters
shared between them. Currently the chains of filters attached to a block
are only flushed when the block is removed. If a qdisc is removed from a
block but the block still exists, flow del messages are not passed to the
callback registered for that qdisc. For the NFP, this presents the
possibility of rules still existing in hw when they should be removed.
Prevent binding to shared blocks until the kernel can send per qdisc del
messages when block unbinds occur.
tcf_block_shared() was not used outside of the core until now, so also
add an empty implementation for builds with CONFIG_NET_CLS=n.
Fixes: 4861738775 ("net: sched: introduce shared filter blocks infrastructure")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously it was not possible to distinguish between mpls ether types and
other ether types. This leads to incorrect classification of offloaded
filters that match on mpls ether type. For example the following two
filters overlap:
# tc filter add dev eth0 parent ffff: \
protocol 0x8847 flower \
action mirred egress redirect dev eth1
# tc filter add dev eth0 parent ffff: \
protocol 0x0800 flower \
action mirred egress redirect dev eth2
The driver now correctly includes the mac_mpls layer where HW stores mpls
fields, when it detects an mpls ether type. It also sets the MPLS_Q bit to
indicate that the filter should match mpls packets.
Fixes: bb055c198d ("nfp: add mpls match offloading support")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two rules with different values of suppress_prefix or suppress_ifgroup
are not the same. This fixes an -EEXIST when running:
$ ip -4 rule add table main suppress_prefixlength 0
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Fixes: f9d4b0c1e9 ("fib_rules: move common handling of newrule delrule msgs into fib_nl2rule")
Signed-off-by: David S. Miller <davem@davemloft.net>
The RDS core module creates rds_connections based on callbacks
from rds_loop_transport when sending/receiving packets to local
addresses.
These connections will need to be cleaned up when they are
created from a netns that is not init_net, and that netns is deleted.
Add the changes aligned with the changes from
commit ebeeb1ad9b ("rds: tcp: use rds_destroy_pending() to synchronize
netns/module teardown and rds connection/workq management") for
rds_loop_transport
Reported-and-tested-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit ba667650c5 ("Input: psmouse - clean up code") was pretty
brain-dead and broke extra buttons reporting for variety of PS/2 mice:
Genius, Thinkmouse and Intellimouse Explorer. We need to actually inspect
the data coming from the device when reporting events.
Fixes: ba667650c5 ("Input: psmouse - clean up code")
Reported-by: Jiri Slaby <jslaby@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The command interface can work in two modes: Events and Polling.
In the general case, each time we invoke a command, a work is
queued to handle it.
When working in events, the interrupt handler completes the
command execution. On the other hand, when working in polling
mode, the work itself completes it.
Due to a bug in the work handler, a command could have been
completed by the interrupt handler, while the work handler
hasn't finished yet, causing the it to complete once again
if the command interface mode was changed from Events to
polling after the interrupt handler was called.
mlx5_unload_one()
mlx5_stop_eqs()
// Destroy the EQ before cmd EQ
...cmd_work_handler()
write_doorbell()
--> EVENT_TYPE_CMD
mlx5_cmd_comp_handler() // First free
free_ent(cmd, ent->idx)
complete(&ent->done)
<-- mlx5_stop_eqs //cmd was complete
// move to polling before destroying the last cmd EQ
mlx5_cmd_use_polling()
cmd->mode = POLL;
--> cmd_work_handler (continues)
if (cmd->mode == POLL)
mlx5_cmd_comp_handler() // Double free
The solution is to store the cmd->mode before writing the doorbell.
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The NULL character was not set correctly for the string containing
the command length, this caused failures reading the output of the
command due to a random length. The fix is to initialize the output
length string.
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Manipulating of the MPFS requires eswitch manager capabilities.
Fixes: eeb66cdb68 ('net/mlx5: Separate between E-Switch and MPFS')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In smartnic env, if the host (PF) driver is not an e-switch manager, we
are not allowed to apply eswitch ports setups such as vlan (VST),
spoof-checks, min/max rate or state.
Make sure we are eswitch manager when coming to issue these callbacks
and err otherwise.
Also fix the definition of ESW_ALLOWED to rely on eswitch_manager
capability and on the vport_group_manger.
Operations on the VF nic vport context, such as setting a mac or reading
the vport counters are allowed to the PF in this scheme.
The modify nic vport guid code was modified to omit checking the
nic_vport_node_guid_modify eswitch capability.
The reason for doing so is that modifying node guid requires vport group
manager capability, and there's no need to check further capabilities.
1. set_vf_vlan - disallowed
2. set_vf_spoofchk - disallowed
3. set_vf_mac - allowed
4. get_vf_config - allowed
5. set_vf_trust - disallowed
6. set_vf_rate - disallowed
7. get_vf_stat - allowed
8. set_vf_link_state - disallowed
Fixes: f942380c12 ('net/mlx5: E-Switch, Vport ingress/egress ACLs rules for spoofchk')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the switchdev mode representors are running on
the embedded cpu (EC) and not at the host.
As such, we should avoid dealing with vport representors if
not being esw manager.
Fixes: b5ca15ad7e ('IB/mlx5: Add proper representors support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the switchdev mode representors are running on
the embedded cpu (EC) and not at the host.
As such, we should avoid dealing with vport representors if
not being esw manager.
While here, make sure to disallow eswitch switchdev related
setups through devlink if we are not esw managers.
Fixes: cb67b83292 ('net/mlx5e: Introduce SRIOV VF representors')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the FW will err on driver attempts to deal with
setting/unsetting the eswitch and as a result the overall setup
of sriov will fail.
Fix that by avoiding the operation if e-switch management is not
allowed for this driver instance. While here, move to use the
correct name for the esw manager capability name.
Fixes: 81848731ff ('net/mlx5: E-Switch, Add SR-IOV (FDB) support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Guy Kushnir <guyk@mellanox.com>
Reviewed-by: Eli Cohen <eli@melloanox.com>
Tested-by: Eli Cohen <eli@melloanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The check for cpu hit statistics was not returning immediate false for
any non vport rep netdev and hence we crashed (say on mlx5 probed VFs) if
user-space tool was calling into any possible netdev in the system.
Fix that by doing a proper check before dereferencing.
Fixes: 1d447a3914 ('net/mlx5e: Extendable vport representor netdev private data')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Eli Cohen <eli@melloanox.com>
Reviewed-by: Eli Cohen <eli@melloanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Commit 51bc085d64 ("PCI: Improve host drivers compile test coverage")
added configuration options to allow PCI host controller drivers to be
compile tested on all architectures.
Some host controller drivers (eg PCIE_ALTERA) config entries select the
PCI_DOMAINS config option to enable PCI domains management in the kernel.
Now that host controller drivers can be compiled on all architectures, this
triggers build regressions on arches that do not implement the PCI_DOMAINS
required API (ie pci_domain_nr()):
drivers/ata/pata_ali.c: In function 'ali_init_chipset':
drivers/ata/pata_ali.c:469:38: error: implicit declaration of function 'pci_domain_nr'; did you mean 'pci_iomap_wc'?
Furthemore, some software configurations (ie Jailhouse) require a
PCI_DOMAINS enabled kernel to configure multiple host controllers without
having an explicit dependency on the ARM platform on which they run.
Make PCI_DOMAINS a visible configuration option on ARM so that software
configurations that need it can manually select it and move the PCI_DOMAINS
selection from PCI controllers configuration file to ARM sub-arch config
entries that currently require it, fixing the issue.
Fixes: 51bc085d64 ("PCI: Improve host drivers compile test coverage")
Link: https://lkml.kernel.org/r/20180612170229.GA10141@roeck-us.net
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Scott Branden <scott.branden@broadcom.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
The endpoint library must be initialized before its users, which are in
drivers/pci/controllers. The endpoint initialization currently depends on
link order.
This corrects a kernel crash when loading the Cadence EP driver, since it
calls devm_pci_epc_create() and this is only valid once the endpoint
library has been initialized.
Fixes: 6e0832fa43 ("PCI: Collect all native drivers under drivers/pci/controller/")
Signed-off-by: Alan Douglas <adouglas@cadence.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The __get_txreq() function can return a pointer, ERR_PTR(-EBUSY), or NULL.
All of the relevant call sites look for IS_ERR, so the NULL return would
lead to a NULL pointer exception.
Do not use the ERR_PTR mechanism for this function.
Update all call sites to handle the return value correctly.
Clean up error paths to reflect return value.
Fixes: 45842abbb2 ("staging/rdma/hfi1: move txreq header code")
Cc: <stable@vger.kernel.org> # 4.9.x+
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In an early version of the I2C patch that was posted to the list the
default I2C frequency (if none was specified) was 400 kHz. There was
debate on the list and we decided that it would be more consistent
with the rest of i2c if we defaulted to 100 kHz. ...but we never
updated the bindings. Let's fix this.
NOTE: since the i2c driver itself hasn't actually landed yet and the
SoC here is very new it seems terribly unlikely that anyone was
relying on the old 400 kHz number, so I'll assume this is an OK
"incompatible" device tree change.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Rob Herring <robh@kernel.org>
Remove unneeded unit address from onewire node, so that dtc does not
complain that unit address is present without a corresponding reg entry,
when such use is made on a real dts file.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Trivial fix to spelling mistake in tilcdc.txt devicetree documentation.
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Qualcomm Fixes for v4.18-rc2
* Fix compiler warnings for cmd-db driver
* tag 'qcom-fixes-for-4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux:
qcom: cmd-db: enforce CONFIG_OF_RESERVED_MEM dependency
Signed-off-by: Olof Johansson <olof@lixom.net>
As Al Viro noted in commit 128394eff3 ("sg_write()/bsg_write() is not fit
to be called under KERNEL_DS"), sg improperly accesses userspace memory
outside the provided buffer, permitting kernel memory corruption via
splice(). But it doesn't just do it on ->write(), also on ->read().
As a band-aid, make sure that the ->read() and ->write() handlers can not
be called in weird contexts (kernel context or credentials different from
file opener), like for ib_safe_file_access().
If someone needs to use these interfaces from different security contexts,
a new interface should be written that goes through the ->ioctl() handler.
I've mostly copypasted ib_safe_file_access() over as sg_safe_file_access()
because I couldn't find a good common header - please tell me if you know a
better way.
[mkp: s/_safe_/_check_/]
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In any case, d_splice_alias() does not drop reference of original
dentry.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Yi-Hung Wei and Justin Pettit found a race in the garbage collection scheme
used by nf_conncount.
When doing list walk, we lookup the tuple in the conntrack table.
If the lookup fails we remove this tuple from our list because
the conntrack entry is gone.
This is the common cause, but turns out its not the only one.
The list entry could have been created just before by another cpu, i.e. the
conntrack entry might not yet have been inserted into the global hash.
The avoid this, we introduce a timestamp and the owning cpu.
If the entry appears to be stale, evict only if:
1. The current cpu is the one that added the entry, or,
2. The timestamp is older than two jiffies
The second constraint allows GC to be taken over by other
cpu too (e.g. because a cpu was offlined or napi got moved to another
cpu).
We can't pretend the 'doubtful' entry wasn't in our list.
Instead, when we don't find an entry indicate via IS_ERR
that entry was removed ('did not exist' or withheld
('might-be-unconfirmed').
This most likely also fixes a xt_connlimit imbalance earlier reported by
Dmitry Andrianov.
Cc: Dmitry Andrianov <dmitry.andrianov@alertme.com>
Reported-by: Justin Pettit <jpettit@vmware.com>
Reported-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The driver fails to set the correct queue depth for native devices, due to
failing to set the device type prior to calling aac_set_safw_target_qd().
This results in slave configure setting the queue depth to 1.
This causes around 30% performance degradation. Fixed by setting the dev
type before trying to set queue depth.
Reported-by: Steve Best <sbest@redhat.com>
Fixes: 0bcb45fb20 ("scsi: aacraid: Add helper function to set queue depth")
cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull security subsystem fixes from James Morris:
- Smack: fix a regression caused by 1bbc55131e
- X.509: fix a (usually un-seen) bug in RSA signature parsing
* 'fixes-v4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
X.509: unpack RSA signatureValue field from BIT STRING
Smack: Mark inode instant in smack_task_to_inode
Pull btrfs fixes from David Sterba:
"Two regression fixes and an incorrect error value propagation fix from
'rename exchange'"
* tag 'for-4.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix return value on rename exchange failure
btrfs: fix invalid-free in btrfs_extent_same
Btrfs: fix physical offset reported by fiemap for inline extents
The old code would indefinitely block other users of nf_log_mutex if
a userspace access in proc_dostring() blocked e.g. due to a userfaultfd
region. Fix it by moving proc_dostring() out of the locked region.
This is a followup to commit 266d07cb1c ("netfilter: nf_log: fix
sleeping function called from invalid context"), which changed this code
from using rcu_read_lock() to taking nf_log_mutex.
Fixes: 266d07cb1c ("netfilter: nf_log: fix sleeping function calle[...]")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When proc_dostring() is called with a non-zero offset in strict mode, it
doesn't just write to the ->data buffer, it also reads. Make sure it
doesn't read uninitialized data.
Fixes: c6ac37d8d8 ("netfilter: nf_log: fix error on write NONE to [...]")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When booting from MMC/SD in rw mode on DA850 EVM, the system
crashes because the write protect pin should be active high
and not active low. This patch fixes the polarity of the WP pin.
Fixes: bdf0e8364f ("ARM: davinci: da850-evm: use gpio descriptor for mmc pins")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Dave Stevenson says:
====================
lan78xx minor fixes
This is a small set of patches for the Microchip LAN78xx chip,
as used in the Raspberry Pi 3B+.
The main debug/discussion was on
https://github.com/raspberrypi/linux/issues/2458
Initial symptoms were that VLANs were very unreliable.
A couple of things were found:
- firstly that the hardware timeout value set failed to
take into account the VLAN tag, so a full MTU packet
would be timed out.
- second was that regular checksum failures were being
reported. Disabling checksum offload confirmed that
the checksums were valid, and further experimentation
identified that it was only if the VLAN tags were being
passed through to the kernel that there were issues.
The hardware supports VLAN filtering and tag stripping,
therefore those have been implemented (much of the work
was already done), and the driver drops back to s/w
checksums should the choice be made not to use the h/w
VLAN stripping.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Observations of VLANs dropping packets due to invalid
checksums when not offloading VLAN tag receive.
With VLAN tag stripping enabled no issue is observed.
Drop back to s/w checksums if VLAN offload is disabled.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The chip supports stripping the VLAN tag and reporting it
in metadata.
Complete the support for this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW_VLAN_CTAG_FILTER was partially implemented, but not advertised
to Linux.
Complete the implementation of this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The frame abort timeout being set by lan78xx_set_rx_max_frame_length
didn't account for any VLAN headers, resulting in very low
throughput if used with tagged VLANs.
Use VLAN_ETH_HLEN instead of ETH_HLEN to correct for this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 667416f385 ("powerpc/mm: Fix kernel crash on page table free")
added a call for pgtable_page_dtor in the rcu page table free routine. We missed
the fact that for 32 bit platforms we did call the 'dtor' early. Drop the extra
call for pgtable_page_dtor. We remove the call from __pte_free_tlb so that we
do the page table free and 'dtor' call together. This should help when we
switch these platforms to pte fragments.
Fixes: 667416f385 ("powerpc/mm: Fix kernel crash on page table free")
Reported-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
An SHPC can be operated either by platform firmware or by the OS. The OS
uses a host bridge ACPI _OSC method to negotiate for control of SHPC. If
firmware wants to prevent an OS from operating an SHPC, it must supply an
_OSC method that declines to grant SHPC ownership to the OS.
If acpi_pci_find_root() returns NULL, it means there's no ACPI host bridge
device (PNP0A03 or PNP0A08) and hence no _OSC method, so the OS is always
allowed to manage the SHPC.
Fix a NULL pointer dereference when CONFIG_ACPI=y but the current
hardware/firmware platform doesn't support ACPI. In that case,
acpi_get_hp_hw_control_from_firmware() is implemented but
acpi_pci_find_root() returns NULL.
Fixes: 90cc0c3cc7 ("PCI: shpchp: Add shpchp_is_native()")
Link: https://lkml.kernel.org/r/20180621164715.28160-1-marc.zyngier@arm.com
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
This test needs root privilege for it's successful execution.
This patch is atleast used to notify the user about the privilege
the script demands for the smooth execution of the test.
Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The test_lirc_mode2.sh script require root privilege for the successful
execution of the test.
This patch is to notify the user about the privilege the script
demands for the successful execution of the test.
Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
CONFIG_NET_SCHED wasn't enabled in arm64's defconfig only for x86.
So bpf/test_tunnel.sh tests fails with:
RTNETLINK answers: Operation not supported
RTNETLINK answers: Operation not supported
We have an error talking to the kernel, -1
Enable NET_SCHED and more tests pass.
Fixes: 3bce593ac0 ("selftests: bpf: config: add config fragments")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
If the kernel is compiled with CONFIG_CGROUP_BPF not enabled, it is not
possible to attach, detach or query IR BPF programs to /dev/lircN devices,
making them impossible to use. For embedded devices, it should be possible
to use IR decoding without cgroups or CONFIG_CGROUP_BPF enabled.
This change requires some refactoring, since bpf_prog_{attach,detach,query}
functions are now always compiled, but their code paths for cgroups need
moving out. Rather than a #ifdef CONFIG_CGROUP_BPF in kernel/bpf/syscall.c,
moving them to kernel/bpf/cgroup.c and kernel/bpf/sockmap.c does not
require #ifdefs since that is already conditionally compiled.
Fixes: f4364dcfc8 ("media: rc: introduce BPF_PROG_LIRC_MODE2")
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
When unplugging an r8152 adapter while the interface is UP, the NIC
becomes unusable. usb->disconnect (aka rtl8152_disconnect) deletes
napi. Then, rtl8152_disconnect calls unregister_netdev and that invokes
netdev->ndo_stop (aka rtl8152_close). rtl8152_close tries to
napi_disable, but the napi is already deleted by disconnect above. So
the first while loop in napi_disable never finishes. This results in
complete deadlock of the network layer as there is rtnl_mutex held by
unregister_netdev.
So avoid the call to napi_disable in rtl8152_close when the device is
already gone.
The other calls to usb_kill_urb, cancel_delayed_work_sync,
netif_stop_queue etc. seem to be fine. The urb and netdev is not
destroyed yet.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: linux-usb@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
perf bench: (Jiri Olsa):
- Fix NUMA report output code handling of less than 1s runtimes.
perf script: (Ravi Bangoria)
- Add missing output fields in a 'perf script -h' hint.
- Fix crash because of missing evsel->priv.
- Fix crash caused by accessing feat_ops[HEADER_LAST_FEATURE], which
is just a end of features header marker.
perf stat: (Thomas Richter)
- Remove duplicate event counting
perf test:
- Wire parsing error handling in 'parse events' test (Jiri Olsa)
- Fix 'session topology' test on s/390 (Thomas Richter)
eBPF: (Yonghong Song)
- Fix a clang 7.0 compilation error when building perf linking
with libclang
intel-pt: (Adrian Hunter)
- Fix packet decoding of CYC packets.
Copies of kernel files: (Arnaldo Carvalho de Melo)
- Synchronize drm/drm.h UAPI
- Update x86's syscall_64.tbl, adding support for 'io_pgetevents' and 'rseq'
in 'perf trace'.
- Update powerpc uapi/asm/unistd.h, adding support for the 'rseq' syscall.
- Update if_link.h and bpf.h, no effect on tool features.
PowerPC: (Sandipan Das)
- Fix crash if callchain is empty.
s/390: (Thomas Richter)
- Support random socked_id assignment in the perf header.
- Support s390 random socket_id assignment in perf.data file.
- Make PMU alias definitions taken from sysfs and JSON files comparable
by normalizing them wrt spaces and newlines.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is no default implementation for dma_buf_ops->unmap.
So add a function unmapping the page, otherwise we'll leak them.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Without CONFIG_OF_RESERVED_MEM, gcc sees that the global cmd_db_header
variable is never initialized, and through code optimization concludes
that a lot of other code cannot possibly work after that:
drivers/soc/qcom/cmd-db.c: In function 'cmd_db_read_addr':
drivers/soc/qcom/cmd-db.c:197:21: error: 'ent.addr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
return ret < 0 ? 0 : le32_to_cpu(ent.addr);
drivers/soc/qcom/cmd-db.c: In function 'cmd_db_read_aux_data':
drivers/soc/qcom/cmd-db.c:224:10: error: 'ent.len' may be used uninitialized in this function [-Werror=maybe-uninitialized]
ent_len = le16_to_cpu(ent.len);
drivers/soc/qcom/cmd-db.c:115:6: error: 'rsc_hdr.data_offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
u16 offset = le16_to_cpu(hdr->data_offset);
^~~~~~
drivers/soc/qcom/cmd-db.c:116:6: error: 'ent.offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
u16 loffset = le16_to_cpu(ent->offset);
^~~~~~~
drivers/soc/qcom/cmd-db.c: In function 'cmd_db_read_aux_data_len':
drivers/soc/qcom/cmd-db.c:250:38: error: 'ent.len' may be used uninitialized in this function [-Werror=maybe-uninitialized]
return ret < 0 ? 0 : le16_to_cpu(ent.len);
^
drivers/soc/qcom/cmd-db.c: In function 'cmd_db_read_slave_id':
drivers/soc/qcom/cmd-db.c:272:7: error: 'ent.addr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
Using a hard CONFIG_OF_RESERVED_MEM dependency avoids this warning,
and we can remove the CONFIG_OF dependency.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
The commit 4e88d4c083 ("usb: add a flag to skip PHY
initialization to struct usb_hcd") delete the assignment
for hcd->usb_phy, it causes usb_phy_notify_connect{disconnect)
are not called, the USB PHY driver is not notified of hot plug
event, then the disconnection will not be detected by hardware.
Fixes: 4e88d4c083 ("usb: add a flag to skip PHY initialization
to struct usb_hcd")
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reported-by: Mats Karrman <mats.dev.list@gmail.com>
Tested-by: Mats Karrman <mats.dev.list@gmail.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Jonathan writes:
First set of IIO fixes for the 4.18 cycle.
* bmp280
- Fix wrong relative humidity unit.
* buffer
- Fix a function signature to match the function.
* inv_mpu6050
- Fix a regression in which older ACPI devices won't have working
interrupts due to lack of information on the interrupt type.
* mma8452
- Don't ignore data ready interrupt when handling interrupts as will
look like an unhandled interrupt.
* tsl2x7x/tsl2772
- Avoid a potential division by zero.
There is a copy and paste bug here. We should be testing "usb1" instead
of "usb0".
Fixes: 58e1e2d2cd ("clk: davinci: cfgchip: Add TI DA8XX USB PHY clocks")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Lechner <david@lechnology.com>
The check of cmd.flow_attr.size should check into account the size of the
reserved field (2 bytes), otherwise user can provide a size which will
cause a slab-out-of-bounds warning below.
==================================================================
BUG: KASAN: slab-out-of-bounds in ib_uverbs_ex_create_flow+0x1740/0x1d00
Read of size 2 at addr ffff880068dff1a6 by task syz-executor775/269
CPU: 0 PID: 269 Comm: syz-executor775 Not tainted 4.18.0-rc1+ #245
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0xef/0x17e
print_address_description+0x83/0x3b0
kasan_report+0x18d/0x4d0
ib_uverbs_ex_create_flow+0x1740/0x1d00
ib_uverbs_write+0x923/0x1010
__vfs_write+0x10d/0x720
vfs_write+0x1b0/0x550
ksys_write+0xc6/0x1a0
do_syscall_64+0xa7/0x590
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x433899
Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d
89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 91 fd ff c3 66
2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc2724db58 EFLAGS: 00000217 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000020006880 RCX: 0000000000433899
RDX: 00000000000000e0 RSI: 0000000020002480 RDI: 0000000000000003
RBP: 00000000006d7018 R08: 00000000004002f8 R09: 00000000004002f8
R10: 00000000004002f8 R11: 0000000000000217 R12: 0000000000000000
R13: 000000000040cd20 R14: 000000000040cdb0 R15: 0000000000000006
Allocated by task 269:
kasan_kmalloc+0xa0/0xd0
__kmalloc+0x1a9/0x510
ib_uverbs_ex_create_flow+0x26c/0x1d00
ib_uverbs_write+0x923/0x1010
__vfs_write+0x10d/0x720
vfs_write+0x1b0/0x550
ksys_write+0xc6/0x1a0
do_syscall_64+0xa7/0x590
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 0:
__kasan_slab_free+0x12e/0x180
kfree+0x159/0x630
detach_buf+0x559/0x7a0
virtqueue_get_buf_ctx+0x3cc/0xab0
virtblk_done+0x1eb/0x3d0
vring_interrupt+0x16d/0x2b0
__handle_irq_event_percpu+0x10a/0x980
handle_irq_event_percpu+0x77/0x190
handle_irq_event+0xc6/0x1a0
handle_edge_irq+0x211/0xd80
handle_irq+0x3d/0x60
do_IRQ+0x9b/0x220
The buggy address belongs to the object at ffff880068dff180
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 38 bytes inside of
64-byte region [ffff880068dff180, ffff880068dff1c0)
The buggy address belongs to the page:
page:ffffea0001a37fc0 count:1 mapcount:0 mapping:ffff88006c401780
index:0x0
flags: 0x4000000000000100(slab)
raw: 4000000000000100 ffffea0001a31100 0000001100000011 ffff88006c401780
raw: 0000000000000000 00000000802a002a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff880068dff080: fb fb fb fb fc fc fc fc fb fb fb fb fb fb fb fb
ffff880068dff100: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
>ffff880068dff180: 00 00 00 00 07 fc fc fc fc fc fc fc fb fb fb fb
^
ffff880068dff200: fb fb fb fb fc fc fc fc 00 00 00 00 00 00 fc fc
ffff880068dff280: fc fc fc fc 00 00 00 00 00 00 00 00 fc fc fc fc
==================================================================
Cc: <stable@vger.kernel.org> # 3.12
Fixes: f884827438 ("IB/core: clarify overflow/underflow checks on ib_create/destroy_flow")
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Flows can be created on UD and RAW_PACKET QP types. Attempts to provide
other QP types as an input causes to various unpredictable failures.
The reason is that in order to support all various types (e.g. XRC), we
are supposed to use real_qp handle and not qp handle and expect to
driver/FW to fail such (XRC) flows. The simpler and safer variant is to
ban all QP types except UD and RAW_PACKET, instead of relying on
driver/FW.
Cc: <stable@vger.kernel.org> # 3.11
Fixes: 436f2ad05a ("IB/core: Export ib_create/destroy_flow through uverbs")
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The code was mistakenly using the length of the page array memory instead
of the depth of the page array.
This would cause MR creation to fail in some cases.
Fixes: 8376b86de7 ("iw_cxgb4: Support the new memory registration API")
Cc: stable@vger.kernel.org
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The signatureValue field of a X.509 certificate is encoded as a BIT STRING.
For RSA signatures this BIT STRING is of so-called primitive subtype, which
contains a u8 prefix indicating a count of unused bits in the encoding.
We have to strip this prefix from signature data, just as we already do for
key data in x509_extract_key_data() function.
This wasn't noticed earlier because this prefix byte is zero for RSA key
sizes divisible by 8. Since BIT STRING is a big-endian encoding adding zero
prefixes has no bearing on its value.
The signature length, however was incorrect, which is a problem for RSA
implementations that need it to be exactly correct (like AMD CCP).
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes: c26fd69fa0 ("X.509: Add a crypto key parser for binary (DER) X.509 certificates")
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.morris@microsoft.com>
perf_event__process_feature() accesses feat_ops[HEADER_LAST_FEATURE]
which is not defined and thus perf is crashing. HEADER_LAST_FEATURE is
used as an end marker for the perf report but it's unused for perf
script/annotate. Ignore HEADER_LAST_FEATURE for perf script/annotate,
just like it is done in 'perf report'.
Before:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
#
After:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
ls 7031 4392.099856: 250000 cpu-clock:uhH: 7f5e0ce7cd60
ls 7031 4392.100355: 250000 cpu-clock:uhH: 7f5e0c706ef7
#
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 57b5de4639 ("perf report: Support forced leader feature in pipe mode")
Link: http://lkml.kernel.org/r/20180625124220.6434-4-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we can hit following assert when running numa bench:
$ perf bench numa mem -p 3 -t 1 -P 512 -s 100 -zZ0cm --thp 1
perf: bench/numa.c:1577: __bench_numa: Assertion `!(!(((wait_stat) & 0x7f) == 0))' failed.
The assertion is correct, because we hit the SIGFPE in following line:
Thread 2.2 "thread 0/0" received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7fffd28c6700 (LWP 11750)]
0x000.. in worker_thread (__tdata=0x7.. ) at bench/numa.c:1257
1257 td->speed_gbs = bytes_done / (td->runtime_ns / NSEC_PER_SEC) / 1e9;
We don't check if the runtime is actually bigger than 1 second,
and thus this might end up with zero division within FPU.
Adding the check to prevent this.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180620094036.17278-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf stat' shows a mismatch in perf stat regarding counter names on
s390:
Run command:
[root@s35lp76 perf]# ./perf stat -e tx_nc_tend -v --
~/mytesttx 1 >/tmp/111
tx_nc_tend: 1 573146 573146
tx_nc_tend: 1 573146 573146
Performance counter stats for '/root/mytesttx 1':
3 tx_nc_tend
0.001037252 seconds time elapsed
[root@s35lp76 perf]#
shows transaction counter tx_nc_tend with value 3 but it was triggered
only once as seen by the output of mytesttx.
When looking up the event name tx_nc_tend the following function
sequence is called:
parse_events_multi_pmu_add()
+--> perf_pmu__scan() being called with NULL argument
+--> pmu_read_sysfs() scans directory ../devices/ for
all PMUs
+--> perf_pmu__find() tries to find a PMU in the
global pmu list.
+--> pmu_lookup() called to read all file
entries when not in global
list.
pmu_lookup() causes the issue. It calls
+---> pmu_aliases() to read all the entries in the PMU directory.
On s390 this is named
/sys/devices/cpum_cf/events.
+--> pmu_aliases_parse() reads all files and creates an
alias for each file name.
So we end up with first entry created by
reading the sysfs file
[root@s35lp76 perf]# cat /sys/devices/cpum_cf
/events/TX_NC_TEND
event=0x008d
[root@s35lp76 perf]#
Debug output shows this entry
tx_nc_tend -> 'cpum_cf'/'event=0x008d
'/
After all files in this directory have been
read and aliases created this function is called:
+--> pmu_add_cpu_aliases()
This function looks up the CPU tables
created by the json files.
With json files for s390 now available all
the aliases are added to
the PMU alias list a second time.
The second entry is added by
reading the json file converted by jevent
resulting in file pmu-events/pmu-events.c:
{
.name = "tx_nc_tend",
.event = "event=0x8d",
.desc = "Unit: cpum_cf Completed TEND \
instructions \
in non-constrained TX mode",
.topic = "extended",
.long_desc = "A TEND instruction has \
completed in a \
non-constrained \
transactional-execution mode",
.pmu = "cpum_cf",
},
Debug output shows this entry
tx_nc_tend -> 'cpum_cf'/'event=0x8d'/
Function pmu_aliases_parse() and pmu_add_cpu_aliases() both use
__perf_pmu__new_alias() to add an alias to the PMU alias list. There is
no check if an alias already exist
So we end up with 2 entries for tx_nc_tend in the PMU alias list.
Having set up the PMU alias list for this PMU now
parse_events_multi_add_pmu() reads the complete alias list and adds each
alias with parse_events_add_pmu() to the global perfev_list. This
causes the alias to be added multiple times to the event list.
Fix this by making __perf_pmu__new_alias() to merge alias definitions if
an alias is already on the alias list. Also print a debug message when
the alias has mismatches in some fields.
Output before:
[root@s35lp76 perf]# ./perf stat -e tx_nc_tend -v \
-- ~/mytesttx 1 >/tmp/111
tx_nc_tend: 1 551446 551446
Performance counter stats for '/root/mytesttx 1':
3 tx_nc_tend
0.000961134 seconds time elapsed
[root@s35lp76 perf]#
Output after:
[root@s35lp76 perf]# ./perf stat -e tx_nc_tend -v \
-- ~/mytesttx 1 >/tmp/111
tx_nc_tend: 1 551446 551446
Performance counter stats for '/root/mytesttx 1':
1 tx_nc_tend
0.000961134 seconds time elapsed
[root@s35lp76 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180615101105.47047-3-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PMU alias definitions in sysfs files may have spaces, newlines and
numbers with leading zeroes. Some alias definitions may also appear in
JSON files without spaces, etc.
Scan alias definitions and remove leading zeroes, spaces, newlines, etc
and rebuild string to make alias->str member comparable.
s390 for example has terms specified as event=0x0091 (read from files
../<PMU>/events/<FILE> and terms specified as event=0x91 (read from JSON
files).
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180615101105.47047-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo reported the perf build failure with latest llvm/clang compiler
(7.0).
$ make LIBCLANGLLVM=1 -C tools/perf/
<SNIP>
CC /tmp/tmp.t53Qo38zci/tests/kmod-path.o
util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::SmallVectorImpl<char> >
perf::getBPFObjectFromModule(llvm::Module*)’:
util/c++/clang.cpp:150:43: error: no matching function for call to
‘llvm::TargetMachine::addPassesToEmitFile(llvm::legacy::PassManager&,
llvm::raw_svector_ostream&, llvm::TargetMachine::CodeGenFileType)’
TargetMachine::CGFT_ObjectFile)) {
^
In file included from util/c++/clang.cpp:25:0:
/usr/local/include/llvm/Target/TargetMachine.h:254:16: note: candidate:
virtual bool llvm::TargetMachine::addPassesToEmitFile(
llvm::legacy::PassManagerBase&, llvm::raw_pwrite_stream&,
llvm::raw_pwrite_stream*, llvm::TargetMachine::CodeGenFileType, bool,
llvm::MachineModuleInfo*)
virtual bool addPassesToEmitFile(PassManagerBase &, raw_pwrite_stream &,
^~~~~~~~~~~~~~~~~~~
/usr/local/include/llvm/Target/TargetMachine.h:254:16: note:
candidate expects 6 arguments, 3 provided
mv: cannot stat '/tmp/tmp.t53Qo38zci/util/c++/.clang.o.tmp': No such file or directory
make[7]: *** [/home/acme/git/perf/tools/build/Makefile.build:101:
/tmp/tmp.t53Qo38zci/util/c++/clang.o] Error 1
make[6]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: c++] Error 2
make[5]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
make[5]: *** Waiting for unfinished jobs....
CC /tmp/tmp.t53Qo38zci/tests/thread-map.o
The function addPassesToEmitFile signature changed in llvm 7.0 and such
a change caused the failure. This patch fixed the issue with using
proper function signatures under different compiler versions.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180616174739.1076733-1-yhs@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates the tools/perf/ copy of the system call table for x86 which makes
'perf trace' become aware of the new 'io_pgetevents' and 'rseq' syscalls, no
matter in which system it gets built, i.e. older systems where the syscalls are
not available in the running kernel (via tracefs) or in the system headers will
still be aware of these syscalls/.
These are the csets introducing the source drift:
05c17cedf8 ("x86: Wire up restartable sequence system call")
7a074e96de ("aio: implement io_pgetevents")
This results in this build time change:
$ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.old /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c
--- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.old 2018-06-15 11:48:17.648948094 -0300
+++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c 2018-06-15 11:48:22.133942480 -0300
@@ -332,5 +332,7 @@
[330] = "pkey_alloc",
[331] = "pkey_free",
[332] = "statx",
+ [333] = "io_pgetevents",
+ [334] = "rseq",
};
-#define SYSCALLTBL_x86_64_MAX_ID 332
+#define SYSCALLTBL_x86_64_MAX_ID 334
$
This silences the following tools/perf/ build warning:
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-tfvyz51sabuzemrszbrhzxni@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add missing error handling for parse_events calls in test_event function
that led to following segfault on s390:
running test 52 'intel_pt//u'
perf: Segmentation fault
...
/lib64/libc.so.6(vasprintf+0xe6) [0x3fffca3f106]
/lib64/libc.so.6(asprintf+0x46) [0x3fffca1aa96]
./perf(parse_events_add_pmu+0xb8) [0x80132088]
./perf(parse_events_parse+0xc62) [0x8019529a]
./perf(parse_events+0x98) [0x801341c0]
./perf(test__parse_events+0x48) [0x800cd140]
./perf(cmd_test+0x26a) [0x800bd44a]
test child interrupted
Adding the struct parse_events_error argument to parse_events call. Also
adding parse_events_print_error to get more details on the parsing
failures, like:
# perf test 6 -v
running test 52 'intel_pt//u'failed to parse event 'intel_pt//u', err 1, str 'Cannot find PMU `intel_pt'. Missing kernel support?'
event syntax error: 'intel_pt//u'
\___ Cannot find PMU `intel_pt'. Missing kernel support?
Committer note:
Use named initializers in the struct parse_events_error variable to
avoid breaking the build on centos5, 6 and others with a similar gcc:
cc1: warnings being treated as errors
tests/parse-events.c: In function 'test_event':
tests/parse-events.c:1696: error: missing initializer
tests/parse-events.c:1696: error: (near initialization for 'err.str')
Reported-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20180611093422.1005-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For some cases, the callchain provided by the kernel may be empty. So,
the callchain ip filtering code will cause a crash if we do not check
whether the struct ip_callchain pointer is NULL before accessing any
members.
This can be observed on a powerpc64le system running Fedora 27 as shown
below.
# perf record -b -e cycles:u ls
Before:
# perf report --branch-history
perf: Segmentation fault
-------- backtrace --------
perf[0x1027615c]
linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x7fff856304d8]
perf(arch_skip_callchain_idx+0x44)[0x10257c58]
perf[0x1017f2e4]
perf(thread__resolve_callchain+0x124)[0x1017ff5c]
perf(sample__resolve_callchain+0xf0)[0x10172788]
...
After:
# perf report --branch-history
Samples: 25 of event 'cycles:u', Event count (approx.): 2306870
Overhead Source:Line Symbol Shared Object
+ 11.60% _init+35736 [.] _init ls
+ 9.84% strcoll_l.c:137 [.] __strcoll_l libc-2.26.so
+ 9.16% memcpy.S:175 [.] __memcpy_power7 libc-2.26.so
+ 9.01% gconv_charset.h:54 [.] _nl_find_locale libc-2.26.so
+ 8.87% dl-addr.c:52 [.] _dl_addr libc-2.26.so
+ 8.83% _init+236 [.] _init ls
...
Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180611104049.11048-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On s390 this test case fails because the socket identifiction numbers
assigned to the CPU are higher than the CPU identification numbers.
F/ix this by adding the platform architecture into the perf data header
flag information. This helps identifiing the test platform and handles
s390 specifics in process_cpu_topology().
Before:
[root@p23lp27 perf]# perf test -vvvvv -F 39
39: Session topology :
--- start ---
templ file: /tmp/perf-test-iUv755
socket_id number is too big.You may need to upgrade the perf tool.
---- end ----
Session topology: Skip
[root@p23lp27 perf]#
After:
[root@p23lp27 perf]# perf test -vvvvv -F 39
39: Session topology :
--- start ---
templ file: /tmp/perf-test-8X8VTs
CPU 0, core 0, socket 6
CPU 1, core 1, socket 3
---- end ----
Session topology: Ok
[root@p23lp27 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: c84974ed9f ("perf test: Add entry to test cpu topology")
Link: http://lkml.kernel.org/r/20180611073153.15592-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On s390 the socket identifier assigned to a CPU identifier is random and
(depending on the configuration of the LPAR) may be higher than the CPU
identifier. This is currently not supported.
Fix this by allowing arbitrary socket identifiers being assigned to
CPU id.
Output before:
[root@p23lp27 perf]# ./perf report --header -I -v
...
socket_id number is too big.You may need to upgrade the perf tool.
Error:
The perf.data file has no samples!
# ========
# captured on : Tue May 29 09:29:57 2018
# header version : 1
...
# Core ID and Socket ID information is not available
...
[root@p23lp27 perf]#
Output after:
[root@p23lp27 perf]# ./perf report --header -I -v
...
Error:
The perf.data file has no samples!
# ========
# captured on : Tue May 29 09:29:57 2018
# header version : 1
...
# CPU 0: Core ID 0, Socket ID 6
# CPU 1: Core ID 1, Socket ID 3
# CPU 2: Core ID -1, Socket ID -1
...
[root@p23lp27 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180611073153.15592-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
nlmsg_multicast() always frees the skb, so in case we cannot call
it we must do that ourselves.
Fixes: 21ee543edc ("xfrm: fix race between netns cleanup and state expire notification")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
I saw this type of Kconfig construct on LKML:
config SYMBOOL
#bool "prompt string"
default y
and wondered what it does. Then I wondered if '#' comments are
even documented. They aren't, so add a little doc for that.
Ah, good. kconfig says:
arch/x86/Kconfig:2942:warning: config symbol defined without type
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The line numers for if-entries in the menu tree are off by one or more
lines which is confusing when debugging for correctness of unrelated changes.
According to the git log, commit a02f0570ae (kconfig: improve
error handling in the parser) was the last one that changed that part
of the parser and replaced
"if_entry: T_IF expr T_EOL"
by
"if_entry: T_IF expr nl"
but the commit message does not state why this has been done.
When reverting that part of the commit, only the line numers are
corrected (checked with cdebug = DEBUG_PARSE in zconf.y), otherwise
the menu tree remains unchanged (checked with zconfdump() enabled in
conf.c).
An example for the corrected line numbers:
drivers/soc/Kconfig:15:source drivers/soc/tegra/Kconfig
drivers/soc/tegra/Kconfig:4:if
drivers/soc/tegra/Kconfig:6:if
changes to:
drivers/soc/Kconfig:15:source drivers/soc/tegra/Kconfig
drivers/soc/tegra/Kconfig:1:if
drivers/soc/tegra/Kconfig:4:if
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
When building a 64-bit 4.18-rc1 kernel with a 32-bit userland, I
noticed that stack protection was silently disabled. Adding -m64 in
gcc-x86_64-has-stack-protector.sh fixed that, similar to what has been
noticed in commit 2a61f4747e ("stack-protector: test compiler
capability in Kconfig and drop AUTO mode") for
gcc-x86_32-has-stack-protector.sh.
Signed-off-by: Sven Joachim <svenjoac@gmx.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
With SYSCALL_DEFINEx() disabling -Wattribute-alias generically, there's
no need to duplicate that for PowerPC syscalls.
This reverts commit 4155203739 ("powerpc: fix build failure by
disabling attribute-alias warning in pci_32") and commit 2479bfc9bc
("powerpc: Fix build by disabling attribute-alias warning for
SYSCALL_DEFINEx").
Signed-off-by: Paul Burton <paul.burton@mips.com>
Acked-by: Christophe Leroy <christophe.leroy@c-s.fr>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
gcc-8 warns for every single definition of a system call entry
point, e.g.:
include/linux/compat.h:56:18: error: 'compat_sys_rt_sigprocmask' alias between functions of incompatible types 'long int(int, compat_sigset_t *, compat_sigset_t *, compat_size_t)' {aka 'long int(int, struct <anonymous> *, struct <anonymous> *, unsigned int)'} and 'long int(long int, long int, long int, long int)' [-Werror=attribute-alias]
asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))\
^~~~~~~~~~
include/linux/compat.h:45:2: note: in expansion of macro 'COMPAT_SYSCALL_DEFINEx'
COMPAT_SYSCALL_DEFINEx(4, _##name, __VA_ARGS__)
^~~~~~~~~~~~~~~~~~~~~~
kernel/signal.c:2601:1: note: in expansion of macro 'COMPAT_SYSCALL_DEFINE4'
COMPAT_SYSCALL_DEFINE4(rt_sigprocmask, int, how, compat_sigset_t __user *, nset,
^~~~~~~~~~~~~~~~~~~~~~
include/linux/compat.h:60:18: note: aliased declaration here
asmlinkage long compat_SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__))\
^~~~~~~~~~
The new warning seems reasonable in principle, but it doesn't
help us here, since we rely on the type mismatch to sanitize the
system call arguments. After I reported this as GCC PR82435, a new
-Wno-attribute-alias option was added that could be used to turn the
warning off globally on the command line, but I'd prefer to do it a
little more fine-grained.
Interestingly, turning a warning off and on again inside of
a single macro doesn't always work, in this case I had to add
an extra statement inbetween and decided to copy the __SC_TEST
one from the native syscall to the compat syscall macro. See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83256 for more details
about this.
[paul.burton@mips.com:
- Rebase atop current master.
- Split GCC & version arguments to __diag_ignore() in order to match
changes to the preceding patch.
- Add the comment argument to match the preceding patch.]
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82435
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Tested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
I have occasionally run into a situation where it would make sense to
control a compiler warning from a source file rather than doing so from
a Makefile using the $(cc-disable-warning, ...) or $(cc-option, ...)
helpers.
The approach here is similar to what glibc uses, using __diag() and
related macros to encapsulate a _Pragma("GCC diagnostic ...") statement
that gets turned into the respective "#pragma GCC diagnostic ..." by
the preprocessor when the macro gets expanded.
Like glibc, I also have an argument to pass the affected compiler
version, but decided to actually evaluate that one. For now, this
supports GCC_4_6, GCC_4_7, GCC_4_8, GCC_4_9, GCC_5, GCC_6, GCC_7,
GCC_8 and GCC_9. Adding support for CLANG_5 and other interesting
versions is straightforward here. GNU compilers starting with gcc-4.2
could support it in principle, but "#pragma GCC diagnostic push"
was only added in gcc-4.6, so it seems simpler to not deal with those
at all. The same versions show a large number of warnings already,
so it seems easier to just leave it at that and not do a more
fine-grained control for them.
The use cases I found so far include:
- turning off the gcc-8 -Wattribute-alias warning inside of the
SYSCALL_DEFINEx() macro without having to do it globally.
- Reducing the build time for a simple re-make after a change,
once we move the warnings from ./Makefile and
./scripts/Makefile.extrawarn into linux/compiler.h
- More control over the warnings based on other configurations,
using preprocessor syntax instead of Makefile syntax. This should make
it easier for the average developer to understand and change things.
- Adding an easy way to turn the W=1 option on unconditionally
for a subdirectory or a specific file. This has been requested
by several developers in the past that want to have their subsystems
W=1 clean.
- Integrating clang better into the build systems. Clang supports
more warnings than GCC, and we probably want to classify them
as default, W=1, W=2 etc, but there are cases in which the
warnings should be classified differently due to excessive false
positives from one or the other compiler.
- Adding a way to turn the default warnings into errors (e.g. using
a new "make E=0" tag) while not also turning the W=1 warnings into
errors.
This patch for now just adds the minimal infrastructure in order to
do the first of the list above. As the #pragma GCC diagnostic
takes precedence over command line options, the next step would be
to convert a lot of the individual Makefiles that set nonstandard
options to use __diag() instead.
[paul.burton@mips.com:
- Rebase atop current master.
- Add __diag_GCC, or more generally __diag_<compiler>, abstraction to
avoid code outside of linux/compiler-gcc.h needing to duplicate
knowledge about different GCC versions.
- Add a comment argument to __diag_{ignore,warn,error} which isn't
used in the expansion of the macros but serves to push people to
document the reason for using them - per feedback from Kees Cook.
- Translate severity to GCC-specific pragmas in linux/compiler-gcc.h
rather than using GCC-specific in linux/compiler_types.h.
- Drop all but GCC 8 macros, since we only need to define macros for
versions that we need to introduce pragmas for, and as of this
series that's just GCC 8.
- Capitalize comments in linux/compiler-gcc.h to match the style of
the rest of the file.
- Line up macro definitions with tabs in linux/compiler-gcc.h.]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Tested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The port->logbuffer_head may be wrong if the two processes enters
_tcpm_log at the mostly same time. The 2nd process enters _tcpm_log
before the 1st process update the index, then the 2nd process will
not allocate logbuffer, when the 2nd process tries to use log buffer,
the index has already updated by the 1st process, so it will get
NULL pointer for updated logbuffer, the error message like below:
tcpci 0-0050: Log buffer index 6 is NULL
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jun Li <jun.li@nxp.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pn533_recv_response() is an urb completion handler, so it must use
GFP_ATOMIC. pn533_usb_send_frame() OTOH runs from a regular sleeping
context, so the pn533_submit_urb_for_response() there (and only there)
can use the regular GFP_KERNEL flags.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1514134
Fixes: 9815c7cf22 ("NFC: pn533: Separate physical layer from ...")
Cc: Michael Thalmeier <michael.thalmeier@hale.at>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix Kconfig warning and build errors in staging/typec/rt1711h.c.
The driver uses I2C interfaces so it should depend on I2C.
WARNING: unmet direct dependencies detected for TYPEC_TCPCI
Depends on [m]: STAGING [=y] && TYPEC_TCPM [=y] && I2C [=m]
Selected by [y]:
- TYPEC_RT1711H [=y] && STAGING [=y] && TYPEC_TCPM [=y]
and then:
drivers/staging/typec/tcpci.o: In function `tcpci_probe':
../drivers/staging/typec/tcpci.c:536: undefined reference to `__devm_regmap_init_i2c'
drivers/staging/typec/tcpci.o: In function `tcpci_i2c_driver_init':
../drivers/staging/typec/tcpci.c:593: undefined reference to `i2c_register_driver'
drivers/staging/typec/tcpci.o: In function `tcpci_i2c_driver_exit':
../drivers/staging/typec/tcpci.c:593: undefined reference to `i2c_del_driver'
drivers/staging/typec/tcpci_rt1711h.o: In function `rt1711h_check_revision':
../drivers/staging/typec/tcpci_rt1711h.c:218: undefined reference to `i2c_smbus_read_word_data'
../drivers/staging/typec/tcpci_rt1711h.c:225: undefined reference to `i2c_smbus_read_word_data'
drivers/staging/typec/tcpci_rt1711h.o: In function `rt1711h_probe':
../drivers/staging/typec/tcpci_rt1711h.c:251: undefined reference to `__devm_regmap_init_i2c'
drivers/staging/typec/tcpci_rt1711h.o: In function `rt1711h_i2c_driver_init':
../drivers/staging/typec/tcpci_rt1711h.c:308: undefined reference to `i2c_register_driver'
drivers/staging/typec/tcpci_rt1711h.o: In function `rt1711h_i2c_driver_exit':
../drivers/staging/typec/tcpci_rt1711h.c:308: undefined reference to `i2c_del_driver'
Fixes: ce08eaeb63 ("staging: typec: rt1711h typec chip driver")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: ShuFan Lee <shufan_lee@richtek.com>
Cc: kbuild-all@01.org
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Revieved-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to UCSI Specification, Connector Change Event only
means a change in the Connector Status and Operation Mode
fields of the STATUS data structure. So any other change
should create another event.
Unfortunately on some platforms the firmware acting as PPM
(platform policy manager - usually embedded controller
firmware) still does not report any other status changes if
there is a connector change event. So if the connector power
or data role was changed when a device was plugged to the
connector, the driver does not get any indication about
that. The port will show wrong roles if that happens.
To fix the issue, always checking the data and power role
together with a connector change event.
Fixes: c1b0bc2dab ("usb: typec: Add support for UCSI interface")
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes an issue where the driver fails with an error:
ioremap error for 0x3f799000-0x3f79a000, requested 0x2, got 0x0
On some platforms the UCSI ACPI mailbox SystemMemory
Operation Region may be setup before the driver has been
loaded. That will lead into the driver failing to map the
mailbox region, as it has been already marked as write-back
memory. acpi_os_ioremap() for x86 uses ioremap_cache()
unconditionally.
When the issue happens, the embedded controller has a
pending query event for the UCSI notification right after
boot-up which causes the operation region to be setup before
UCSI driver has been loaded.
The fix is to notify acpi core that the driver is about to
access memory region which potentially overlaps with an
operation region right before mapping it.
acpi_release_memory() will check if the memory has already
been setup (mapped) by acpi core, and deactivate it (unmap)
if it has. The driver is then able to map the memory with
ioremap_nocache() and set the memtype to uncached for the
region.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Fixes: 8243edf441 ("usb: typec: ucsi: Add ACPI driver")
Cc: stable@vger.kernel.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sometimes memory resource may be overlapping with
SystemMemory Operation Region by design, for example if the
memory region is used as a mailbox for communication with a
firmware in the system. One occasion of such mailboxes is
USB Type-C Connector System Software Interface (UCSI).
With regions like that, it is important that the driver is
able to map the memory with the requirements it has. For
example, the driver should be allowed to map the memory as
non-cached memory. However, if the operation region has been
accessed before the driver has mapped the memory, the memory
has been marked as write-back by the time the driver is
loaded. That means the driver will fail to map the memory
if it expects non-cached memory.
To work around the problem, introducing helper that the
drivers can use to temporarily deactivate (unmap)
SystemMemory Operation Regions that overlap with their
IO memory.
Fixes: 8243edf441 ("usb: typec: ucsi: Add ACPI driver")
Cc: stable@vger.kernel.org
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Detected on the Dell XPS 9365.
The laptop has 2 devices that benefit from the hid-generic auto-unbinding.
When those 2 devices are presented to the userspace, udev loads both wacom and
hid-multitouch. When this happens, the code in __hid_bus_reprobe_drivers() is
called concurrently and the second device gets reprobed twice.
An other bug in the power_supply subsystem prevent to remove the wacom driver
if it just finished its initialization, which basically kills the wacom node.
[jkosina@suse.cz: reformat changelog a bit]
Fixes c17a7476e4 ("HID: core: rewrite the hid-generic automatic unbind")
Cc: stable@vger.kernel.org # v4.17
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Some controllers take almost 55ms to complete controller
restore state (CRS).
There is no timeout limit mentioned in xhci specification so
fixing the issue by increasing the timeout limit to 100ms
[reformat code comment -Mathias]
Signed-off-by: Ajay Gupta <ajaykuee@gmail.com>
Signed-off-by: Nagaraj Annaiah <naga.annaiah@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The address-of operator will always evaluate to true. However,
power should be explicitly disabled if no power domain is used.
Remove the address-of operator.
Fixes: 58c38116c6 ("usb: xhci: tegra: Add support for managing powergates")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Initialize the 'err' variate to remove the build warning,
the warning is shown as below:
drivers/usb/host/xhci-tegra.c: In function 'tegra_xusb_mbox_thread':
drivers/usb/host/xhci-tegra.c:552:6: warning: 'err' may be used uninitialized in this function [-Wuninitialized]
drivers/usb/host/xhci-tegra.c:482:6: note: 'err' was declared here
Fixes: e84fce0f88 ("usb: xhci: Add NVIDIA Tegra XUSB controller driver")
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Don't rely on event interrupt (EINT) bit alone to detect pending port
change in resume. If no change event is detected the host may be suspended
again, oterwise roothubs are resumed.
There is a lag in xHC setting EINT. If we don't notice the pending change
in resume, and the controller is runtime suspeded again, it causes the
event handler to assume host is dead as it will fail to read xHC registers
once PCI puts the controller to D3 state.
[ 268.520969] xhci_hcd: xhci_resume: starting port polling.
[ 268.520985] xhci_hcd: xhci_hub_status_data: stopping port polling.
[ 268.521030] xhci_hcd: xhci_suspend: stopping port polling.
[ 268.521040] xhci_hcd: // Setting command ring address to 0x349bd001
[ 268.521139] xhci_hcd: Port Status Change Event for port 3
[ 268.521149] xhci_hcd: resume root hub
[ 268.521163] xhci_hcd: port resume event for port 3
[ 268.521168] xhci_hcd: xHC is not running.
[ 268.521174] xhci_hcd: handle_port_status: starting port polling.
[ 268.596322] xhci_hcd: xhci_hc_died: xHCI host controller not responding, assume dead
The EINT lag is described in a additional note in xhci specs 4.19.2:
"Due to internal xHC scheduling and system delays, there will be a lag
between a change bit being set and the Port Status Change Event that it
generated being written to the Event Ring. If SW reads the PORTSC and
sees a change bit set, there is no guarantee that the corresponding Port
Status Change Event has already been written into the Event Ring."
Cc: <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes an issue uncovered when a recent change to add the "page
table" flag was merged. During bootup we see many errors like the
following:
BUG: Bad page state in process mkdir pfn:00bae
page:c1ff15c0 count:0 mapcount:-1024 mapping:00000000 index:0x0
flags: 0x0()
raw: 00000000 00000000 00000000 fffffbff 00000000 00000100 00000200 00000000
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 46 Comm: mkdir Tainted: G B 4.17.0-simple-smp-07461-g1d40a5ea01d5-dirty #993
Call trace:
[<(ptrval)>] show_stack+0x44/0x54
[<(ptrval)>] dump_stack+0xb0/0xe8
[<(ptrval)>] bad_page+0x138/0x174
[<(ptrval)>] ? cpumask_next+0x24/0x34
[<(ptrval)>] free_pages_check_bad+0x6c/0xd0
[<(ptrval)>] free_pcppages_bulk+0x174/0x42c
[<(ptrval)>] free_unref_page_commit.isra.17+0xb8/0xc8
[<(ptrval)>] free_unref_page_list+0x10c/0x190
[<(ptrval)>] ? set_reset_devices+0x0/0x2c
[<(ptrval)>] release_pages+0x3a0/0x414
[<(ptrval)>] tlb_flush_mmu_free+0x5c/0x90
[<(ptrval)>] tlb_flush_mmu+0x90/0xa4
[<(ptrval)>] arch_tlb_finish_mmu+0x50/0x94
[<(ptrval)>] tlb_finish_mmu+0x30/0x64
[<(ptrval)>] exit_mmap+0x110/0x1e0
[<(ptrval)>] mmput+0x50/0xf0
[<(ptrval)>] do_exit+0x274/0xa94
[<(ptrval)>] do_group_exit+0x50/0x110
[<(ptrval)>] __wake_up_parent+0x0/0x38
[<(ptrval)>] _syscall_return+0x0/0x4
During the __pte_free_tlb path openrisc fails to call the page
destructor which would clear the new bits that were introduced.
To fix this we are calling the destructor.
It seem openrisc was the only architecture missing this, all other
architectures either call the destructor like we are doing here or use
pte_free.
Note: failing to call the destructor was also messing up the zone stats
(and will be cause other problems if you were using SPLIT_PTE_PTLOCKS,
which we are not yet).
Fixes: 1d40a5ea01 ("mm: mark pages in use for page tables")
Acked-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Stafford Horne <shorne@gmail.com>
usb: fixes for v4.18-rc1
First set of fixes for the current -rc cycle. The main parts being
warnings of different kinds being fixed. We're also adding support for
Intel'l Icelake devices on dwc3-pci.c.
This reverts commit ee410f15b1.
It might prevent the machine from boot. It would wait for enough
randomness at the very beginning of kernel_init(). But there is
basically nothing running in parallel that would help to produce
any randomness.
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Stopping offload completely if replace of program failed dates
back to days of transparent offload. Back then we wanted to
silently fall back to the in-driver processing. Today we mark
programs for offload when they are loaded into the kernel, so
the transparent offload is no longer a reality.
Flags check in the driver will only allow replace of a driver
program with another driver program or an offload program with
another offload program.
When driver program is replaced stopping offload is a no-op,
because driver program isn't offloaded. When replacing
offloaded program if the offload fails the entire operation
will fail all the way back to user space and we should continue
using the old program. IOW when replacing a driver program
stopping offload is unnecessary and when replacing offloaded
program - it's a bug, old program should continue to run.
In practice this bug would mean that if offload operation was to
fail (either due to FW communication error, kernel OOM or new
program being offloaded but for a different netdev) driver
would continue reporting that previous XDP program is offloaded
but in fact no program will be loaded in hardware. The failure
is fairly unlikely (found by inspection, when working on the code)
but it's unpleasant.
Backport note: even though the bug was introduced in commit
cafa92ac25 ("nfp: bpf: add support for XDP_FLAGS_HW_MODE"),
this fix depends on commit 441a33031f ("net: xdp: don't allow
device-bound programs in driver mode"), so this fix is sufficient
only in v4.15 or newer. Kernels v4.13.x and v4.14.x do need to
stop offload if it was transparent/opportunistic, i.e. if
XDP_FLAGS_HW_MODE was not set on running program.
Fixes: cafa92ac25 ("nfp: bpf: add support for XDP_FLAGS_HW_MODE")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The kernel may spew a WARNING with UBSAN undefined behavior at
handling ALSA sequencer ioctl SNDRV_SEQ_IOCTL_QUERY_NEXT_CLIENT:
UBSAN: Undefined behaviour in sound/core/seq/seq_clientmgr.c:2007:14
signed integer overflow:
2147483647 + 1 cannot be represented in type 'int'
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x122/0x1c8 lib/dump_stack.c:113
ubsan_epilogue+0x12/0x86 lib/ubsan.c:159
handle_overflow+0x1c2/0x21f lib/ubsan.c:190
__ubsan_handle_add_overflow+0x2a/0x31 lib/ubsan.c:198
snd_seq_ioctl_query_next_client+0x1ac/0x1d0 sound/core/seq/seq_clientmgr.c:2007
snd_seq_ioctl+0x264/0x3d0 sound/core/seq/seq_clientmgr.c:2144
....
It happens only when INT_MAX is passed there, as we're incrementing it
unconditionally. So the fix is trivial, check the value with
INT_MAX. Although the bug itself is fairly harmless, it's better to
fix it so that fuzzers won't hit this again later.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200211
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The kernel may spew a WARNING about UBSAN undefined behavior at
handling ALSA timer ioctl SNDRV_TIMER_IOCTL_NEXT_DEVICE:
UBSAN: Undefined behaviour in sound/core/timer.c:1524:19
signed integer overflow:
2147483647 + 1 cannot be represented in type 'int'
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x122/0x1c8 lib/dump_stack.c:113
ubsan_epilogue+0x12/0x86 lib/ubsan.c:159
handle_overflow+0x1c2/0x21f lib/ubsan.c:190
__ubsan_handle_add_overflow+0x2a/0x31 lib/ubsan.c:198
snd_timer_user_next_device sound/core/timer.c:1524 [inline]
__snd_timer_user_ioctl+0x204d/0x2520 sound/core/timer.c:1939
snd_timer_user_ioctl+0x67/0x95 sound/core/timer.c:1994
....
It happens only when a value with INT_MAX is passed, as we're
incrementing it unconditionally. So the fix is trivial, check the
value with INT_MAX. Although the bug itself is fairly harmless, it's
better to fix it so that fuzzers won't hit this again later.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200213
Reported-and-tested-by: Team OWL337 <icytxw@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In the critical section cleanup we must not mess with r1. For march=z9
or older, larl + ex (instead of exrl) are used with r1 as a temporary
register. This can clobber r1 in several interrupt handlers. Fix this by
using r11 as a temp register. r11 is being saved by all callers of
cleanup_critical.
Fixes: 6dd85fbb87 ("s390: move expoline assembler macros to a header")
Cc: stable@vger.kernel.org #v4.16
Reported-by: Oliver Kurz <okurz@suse.com>
Reported-by: Petr Tesařík <ptesarik@suse.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We have 3 more Lenovo machines, they all have 2 front mics on them,
so they need the fixup to change the location for one of two mics.
Among these 3 Lenovo machines, one of them has the same pin cfg as the
machine with subid 0x17aa3138, so use the pin cfg table to apply fixup
for them. The rest machines don't share the same pin cfg, so far use
the subid to apply fixup for them.
Fixes: a3dafb2200 ("ALSA: hda/realtek - adjust the location of one mic")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull networking fixes from David Miller:
1) Fix netpoll OOPS in r8169, from Ville Syrjälä.
2) Fix bpf instruction alignment on powerpc et al., from Eric Dumazet.
3) Don't ignore IFLA_MTU attribute when creating new ipvlan links. From
Xin Long.
4) Fix use after free in AF_PACKET, from Eric Dumazet.
5) Mis-matched RTNL unlock in xen-netfront, from Ross Lagerwall.
6) Fix VSOCK loopback on big-endian, from Claudio Imbrenda.
7) Missing RX buffer offset correction when computing DMA addresses in
mvneta driver, from Antoine Tenart.
8) Fix crashes in DCCP's ccid3_hc_rx_send_feedback, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
sfc: make function efx_rps_hash_bucket static
strparser: Corrected typo in documentation.
qmi_wwan: add support for the Dell Wireless 5821e module
cxgb4: when disabling dcb set txq dcb priority to 0
net_sched: remove a bogus warning in hfsc
net: dccp: switch rx_tstamp_last_feedback to monotonic clock
net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
net: Remove depends on HAS_DMA in case of platform dependency
MAINTAINERS: Add file patterns for dsa device tree bindings
net: mscc: make sparse happy
net: mvneta: fix the Rx desc DMA address in the Rx path
Documentation: e1000: Fix docs build error
Documentation: e100: Fix docs build error
Documentation: e1000: Use correct heading adornment
Documentation: e100: Use correct heading adornment
ipv6: mcast: fix unsolicited report interval after receiving querys
vhost_net: validate sock before trying to put its fd
VSOCK: fix loopback on big-endian systems
net: ethernet: ti: davinci_cpdma: make function cpdma_desc_pool_create static
xen-netfront: Update features after registering netdev
...
The DT node passed here isn't necessarily an OPP node, as this routine
can also be used for cases where the "required-opps" property is present
directly in the device's node. Rename it.
This also removes a stale comment.
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The touchscreen driver no longer configures the device as wakeup source by
default. A "wakeup-source" property is needed.
Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
When kcs_bmc_handle_event calls kcs_force_abort function to handle the
not open (no user running) KCS channel transaction, the returned status
value -ENODEV causes the low level IRQ handler indicating that the irq
was not for him by returning IRQ_NONE. After some time, this IRQ will
be treated to be spurious one, and the exception dump happens.
irq 30: nobody cared (try booting with the "irqpoll" option)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.15-npcm750 #1
Hardware name: NPCMX50 Chip family
[<c010b264>] (unwind_backtrace) from [<c0106930>] (show_stack+0x20/0x24)
[<c0106930>] (show_stack) from [<c03dad38>] (dump_stack+0x8c/0xa0)
[<c03dad38>] (dump_stack) from [<c0168810>] (__report_bad_irq+0x3c/0xdc)
[<c0168810>] (__report_bad_irq) from [<c0168c34>] (note_interrupt+0x29c/0x2ec)
[<c0168c34>] (note_interrupt) from [<c0165c80>] (handle_irq_event_percpu+0x5c/0x68)
[<c0165c80>] (handle_irq_event_percpu) from [<c0165cd4>] (handle_irq_event+0x48/0x6c)
[<c0165cd4>] (handle_irq_event) from [<c0169664>] (handle_fasteoi_irq+0xc8/0x198)
[<c0169664>] (handle_fasteoi_irq) from [<c016529c>] (__handle_domain_irq+0x90/0xe8)
[<c016529c>] (__handle_domain_irq) from [<c01014bc>] (gic_handle_irq+0x58/0x9c)
[<c01014bc>] (gic_handle_irq) from [<c010752c>] (__irq_svc+0x6c/0x90)
Exception stack(0xc0a01de8 to 0xc0a01e30)
1de0: 00002080 c0a6fbc0 00000000 00000000 00000000 c096d294
1e00: 00000000 00000001 dc406400 f03ff100 00000082 c0a01e94 c0a6fbc0 c0a01e38
1e20: 00200102 c01015bc 60000113 ffffffff
[<c010752c>] (__irq_svc) from [<c01015bc>] (__do_softirq+0xbc/0x358)
[<c01015bc>] (__do_softirq) from [<c011c798>] (irq_exit+0xb8/0xec)
[<c011c798>] (irq_exit) from [<c01652a0>] (__handle_domain_irq+0x94/0xe8)
[<c01652a0>] (__handle_domain_irq) from [<c01014bc>] (gic_handle_irq+0x58/0x9c)
[<c01014bc>] (gic_handle_irq) from [<c010752c>] (__irq_svc+0x6c/0x90)
Exception stack(0xc0a01ef8 to 0xc0a01f40)
1ee0: 00000000 000003ae
1f00: dcc0f338 c0111060 c0a00000 c0a0cc44 c0a0cbe4 c0a1c22b c07bc218 00000001
1f20: dcffca40 c0a01f54 c0a01f58 c0a01f48 c0103524 c0103528 60000013 ffffffff
[<c010752c>] (__irq_svc) from [<c0103528>] (arch_cpu_idle+0x48/0x4c)
[<c0103528>] (arch_cpu_idle) from [<c0681390>] (default_idle_call+0x30/0x3c)
[<c0681390>] (default_idle_call) from [<c0156f24>] (do_idle+0xc8/0x134)
[<c0156f24>] (do_idle) from [<c015722c>] (cpu_startup_entry+0x28/0x2c)
[<c015722c>] (cpu_startup_entry) from [<c067ad74>] (rest_init+0x84/0x88)
[<c067ad74>] (rest_init) from [<c0900d44>] (start_kernel+0x388/0x394)
[<c0900d44>] (start_kernel) from [<0000807c>] (0x807c)
handlers:
[<c041c5dc>] npcm7xx_kcs_irq
Disabling IRQ #30
It needs to change the returned status from -ENODEV to 0. The -ENODEV
was originally used to tell the low level IRQ handler that no user was
running, but not consider the IRQ handling desgin.
And multiple KCS channels share one IRQ handler, it needs to check the
IBF flag before doing force abort. If the IBF is set, after handling,
return 0 to low level IRQ handler to indicate that the IRQ is handled.
Signed-off-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Commit 93c303d204 "ipmi_si: Clean up shutdown a bit" didn't
copy the behavior of the cleanup in one spot, it needed to
check for a non-NULL interface before cleaning it up.
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: Meelis Roos <mroos@linux.ee>
We should return if get_cpu_device() fails or it leads to a NULL
dereference. Also dev_pm_opp_of_get_opp_desc_node() returns NULL on
error, it never returns error pointers.
Fixes: 46e2856b8e (cpufreq: Add Kryo CPU scaling driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In __xfs_ag_resv_init we incorrectly calculate the amount by which to
decrease fdblocks when reserving blocks for the rmapbt. Because rmapbt
allocations do not decrease fdblocks, we must decrease fdblocks by the
entire size of the requested reservation in order to achieve our goal of
always having enough free blocks to satisfy an rmapbt expansion.
This is in contrast to the refcountbt/finobt, which /do/ subtract from
fdblocks whenever they allocate a block. For this allocation type we
preserve the existing behavior where we decrease fdblocks only by the
requested reservation minus the size of the existing tree.
This fixes the problem where the available block counts reported by
statfs change across a remount if there had been an rmapbt size change
since mount time.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
If a user asks us to zero_range part of a file, the end of the range is
EOF, and not aligned to a page boundary, invoke writeback of the EOF
page to ensure that the post-EOF part of the page is zeroed. This
ensures that we don't expose stale memory contents via mmap, if in a
clumsy manner.
Found by running generic/127 when it runs zero_range and mapread at EOF
one after the other.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
In commit 8ad560d256 ("xfs: strengthen rtalloc query range checks")
we strengthened the input parameter checks in the rtbitmap range query
function, but introduced an off-by-one error in the process. The call
to xfs_rtfind_forw deals with the high key being rextents, but we clamp
the high key to rextents - 1. This causes the returned results to stop
one block short of the end of the rtdev, which is incorrect.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Initialize the extent count field of the high key so that when we use
the high key to synthesize an 'unknown owner' record (i.e. used space
record) at the end of the queried range we have a field with which to
compute rm_blockcount. This is not strictly necessary because the
synthesizer never uses the rm_blockcount field, but we can shut up the
static code analysis anyway.
Coverity-id: 1437358
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Zorro Lang reports that generic/485 blows an assert on a filesystem with
512 byte blocks. The test tries to fallocate a post-eof extent at the
maximum file size and calls insert range to shift the extents right by
two blocks. On a 512b block filesystem this causes startoff to overflow
the 54-bit startoff field, leading to the assert.
Therefore, always check the rightmost extent to see if it would overflow
prior to invoking the insert range machinery.
Reported-by: zlang@redhat.com
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200137
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
If we somehow end up with a filesystem that has fewer free blocks than
the blocks set aside to avoid ENOSPC deadlocks, it's possible that the
free space calculation in xfs_reserve_blocks will spit out a negative
number (because percpu_counter_sum returns s64). We fail to notice
this negative number and set fdblks_delta to it. Now we increment
fdblocks(!) and the unsigned type of m_resblks means that we end up
setting a ridiculously huge m_resblks reservation.
Avoid this comedy of errors by detecting the negative free space and
returning -ENOSPC.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
In commit e89c041338 ("xfs: implement the GETFSMAP ioctl") we
created the ability to obtain empty transactions. These transactions
have no log or block reservations and therefore can't modify anything.
Since they're also NO_WRITECOUNT they can run while the fs is frozen,
so we don't need to WARN_ON about that usage.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Commit 784e0300fe ("rseq: Avoid infinite recursion when delivering
SIGSEGV") added a new ksig argument to the rseq_signal_deliver() &
rseq_handle_notify_resume() functions, and was merged in v4.18-rc2.
Meanwhile MIPS support for restartable sequences was also merged in
v4.18-rc2 with commit 9ea141ad54 ("MIPS: Add support for restartable
sequences"), and therefore didn't get updated for the API change.
This results in build failures like the following:
CC arch/mips/kernel/signal.o
arch/mips/kernel/signal.c: In function 'handle_signal':
arch/mips/kernel/signal.c:804:22: error: passing argument 1 of
'rseq_signal_deliver' from incompatible pointer type
[-Werror=incompatible-pointer-types]
rseq_signal_deliver(regs);
^~~~
In file included from ./include/linux/context_tracking.h:5,
from arch/mips/kernel/signal.c:12:
./include/linux/sched.h:1811:56: note: expected 'struct ksignal *' but
argument is of type 'struct pt_regs *'
static inline void rseq_signal_deliver(struct ksignal *ksig,
~~~~~~~~~~~~~~~~^~~~
arch/mips/kernel/signal.c:804:2: error: too few arguments to function
'rseq_signal_deliver'
rseq_signal_deliver(regs);
^~~~~~~~~~~~~~~~~~~
Fix this by adding the ksig argument as was done for other architectures
in commit 784e0300fe ("rseq: Avoid infinite recursion when delivering
SIGSEGV").
Signed-off-by: Paul Burton <paul.burton@mips.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Patchwork: https://patchwork.linux-mips.org/patch/19603/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
It turned out that we can run calibration without already received
frames with RSSI data. In such case just don't update AGC register
and don't print the warning.
Fixes: b305a6ab02 ("mt7601u: use EWMA to calculate avg_rssi")
Reported-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Race condition is observed during rmmod of mwifiex_usb:
1. The rmmod thread will call mwifiex_usb_disconnect(), download
SHUTDOWN command and do wait_event_interruptible_timeout(),
waiting for response.
2. The main thread will handle the response and will do a
wake_up_interruptible(), unblocking rmmod thread.
3. On getting unblocked, rmmod thread will make rx_cmd.urb = NULL in
mwifiex_usb_free().
4. The main thread will try to resubmit rx_cmd.urb in
mwifiex_usb_submit_rx_urb(), which is NULL.
To fix this, move mwifiex_usb_free() from mwifiex_usb_disconnect
to mwifiex_unregister_dev(). Function mwifiex_unregister_dev() is
called after flushing the command and RX work queues.
Suggested-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Ganapathi Bhat <gbhat@marvell.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This reverts commit b817047ae7.
We have a better fix for this issue, which will be sent on top
of this revert.
Signed-off-by: Ganapathi Bhat <gbhat@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
When connecting to AP, mac80211 asks driver to enter and leave PS quickly,
but driver deinit doesn't wait for delayed work complete when entering PS,
then driver reinit procedure and delay work are running simultaneously.
This will cause unpredictable kernel oops or crash like
rtl8723be: error H2C cmd because of Fw download fail!!!
WARNING: CPU: 3 PID: 159 at drivers/net/wireless/realtek/rtlwifi/
rtl8723be/fw.c:227 rtl8723be_fill_h2c_cmd+0x182/0x510 [rtl8723be]
CPU: 3 PID: 159 Comm: kworker/3:2 Tainted: G O 4.16.13-2-ARCH #1
Hardware name: ASUSTeK COMPUTER INC. X556UF/X556UF, BIOS X556UF.406
10/21/2016
Workqueue: rtl8723be_pci rtl_c2hcmd_wq_callback [rtlwifi]
RIP: 0010:rtl8723be_fill_h2c_cmd+0x182/0x510 [rtl8723be]
RSP: 0018:ffffa6ab01e1bd70 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffa26069071520 RCX: 0000000000000001
RDX: 0000000080000001 RSI: ffffffff8be70e9c RDI: 00000000ffffffff
RBP: 0000000000000000 R08: 0000000000000048 R09: 0000000000000348
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: ffffa26069071520 R14: 0000000000000000 R15: ffffa2607d205f70
FS: 0000000000000000(0000) GS:ffffa26081d80000(0000) knlGS:000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000443b39d3000 CR3: 000000037700a005 CR4: 00000000003606e0
Call Trace:
? halbtc_send_bt_mp_operation.constprop.17+0xd5/0xe0 [btcoexist]
? ex_btc8723b1ant_bt_info_notify+0x3b8/0x820 [btcoexist]
? rtl_c2hcmd_launcher+0xab/0x110 [rtlwifi]
? process_one_work+0x1d1/0x3b0
? worker_thread+0x2b/0x3d0
? process_one_work+0x3b0/0x3b0
? kthread+0x112/0x130
? kthread_create_on_node+0x60/0x60
? ret_from_fork+0x35/0x40
Code: 00 76 b4 e9 e2 fe ff ff 4c 89 ee 4c 89 e7 e8 56 22 86 ca e9 5e ...
This patch ensures all delayed works done before entering PS to satisfy
our expectation, so use cancel_delayed_work_sync() instead. An exception
is delayed work ips_nic_off_wq because running task may be itself, so add
a parameter ips_wq to deinit function to handle this case.
This issue is reported and fixed in below threads:
https://github.com/lwfinger/rtlwifi_new/issues/367https://github.com/lwfinger/rtlwifi_new/issues/366
Tested-by: Evgeny Kapun <abacabadabacaba@gmail.com> # 8723DE
Tested-by: Shivam Kakkar <shivam543@gmail.com> # 8723BE on 4.18-rc1
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Fixes: cceb0a5973 ("rtlwifi: Add work queue for c2h cmd.")
Cc: Stable <stable@vger.kernel.org> # 4.11+
Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The function efx_rps_hash_bucket is local to the source and
does not need to be in global scope, so make it static.
Cleans up sparse warning:
symbol 'efx_rps_hash_bucket' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 5ec6486daa ("iio:imu: inv_mpu6050: support more interrupt types")
causes inv_mpu_core_probe() to fail if the IRQ does not have a
trigger-type setup.
This happens on machines where the mpu6050 is enumerated through ACPI and
an older Interrupt type ACPI resource is used for the interrupt, rather
then a GpioInt type type, causing the mpu6050 driver to no longer work
there. This happens on e.g. the Asus T100TA.
This commits makes the mpu6050 fallback to the old IRQF_TRIGGER_RISING
default if the irq-type is not setup, fixing this.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Fixes: 5ec6486daa ("iio:imu: inv_mpu6050: support more interrupt types")
Reviewed-by: Martin Kelly <mkelly@xevo.com>
Reviewed-by: Jean-Baptiste Maneyrol <jmaneyrol@invensense.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
linux/iio/buffer-dma.h was not updated to when length was changed to
unsigned int.
Fixes: c043ec1ca5 ("iio:buffer: make length types match kfifo types")
Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Interrupts are ignored if no event bit is set in the status status
register and this breaks the buffer interface. No data is shown when
running "iio_generic_buffer -n mma8451 -a" and interrupt counts go
crazy.
Fix by not returning IRQ_NONE if DRDY is set.
Fixes: 605f72de13 ("iio: accel: mma8452: improvements to handle
multiple events")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
It may be possible for tsl2772_get_lux to return a zero lux value
and hence a division by zero can occur when lux_val is zero. Check
for this case and return -ERANGE to avoid the division by zero.
Detected by CoverityScan, CID#1469484 ("Division or modulo by zero")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Brian Masney <masneyb@onstation.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
According to IIO ABI relative humidity reading should be
returned in milli percent.
This patch addresses that by applying proper scaling and
returning integer instead of fractional format type specifier.
Note that the fixes tag is before the driver was heavily refactored
to introduce spi support, so the patch won't apply that far back.
Signed-off-by: Tomasz Duszynski <tduszyns@gmail.com>
Fixes: 14beaa8f5a ("iio: pressure: bmp280: add humidity support")
Acked-by: Matt Ranostay <matt.ranostay@konsulko.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Pull perf fixes from Thomas Gleixner:
"A pile of perf updates:
Kernel side:
- Remove an incorrect warning in uprobe_init_insn() when
insn_get_length() fails. The error return code is handled at the
call site.
- Move the inline keyword to the right place in the perf ringbuffer
code to address a W=1 build warning.
Tooling:
perf stat:
- Fix metric column header display alignment
- Improve error messages for default attributes, providing better
output for error in command line.
- Add --interval-clear option, to provide a 'watch' like printing
perf script:
- Show hw-cache events too
perf c2c:
- Fix data dependency problem in layout of 'struct c2c_hist_entry'
Core:
- Do not blindly assume that 'struct perf_evsel' can be obtained via
a straight forward container_of() as there are call sites which
hand in a plain 'struct hist' which is not part of a container.
- Fix error index in the PMU event parser, so that error messages can
point to the problematic token"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Move the inline keyword at the beginning of the function declaration
uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
perf script: Show hw-cache events
perf c2c: Keep struct hist_entry at the end of struct c2c_hist_entry
perf stat: Add event parsing error handling to add_default_attributes
perf stat: Allow to specify specific metric column len
perf stat: Fix metric column header display alignment
perf stat: Use only color_fprintf call in print_metric_only
perf stat: Add --interval-clear option
perf tools: Fix error index for pmu event parser
perf hists: Reimplement hists__has_callchains()
perf hists browser gtk: Use hist_entry__has_callchains()
perf hists: Make hist_entry__has_callchains() work with 'perf c2c'
perf hists: Save the callchain_size in struct hist_entry
Pull rseq fixes from Thomas Gleixer:
"A pile of rseq related fixups:
- Prevent infinite recursion when delivering SIGSEGV
- Remove the abort of rseq critical section on fork() as syscalls
inside rseq critical sections are explicitely forbidden. So no
point in doing the abort on the child.
- Align the rseq structure on 32 bytes in the ARM selftest code.
- Fix file permissions of the test script"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Avoid infinite recursion when delivering SIGSEGV
rseq/cleanup: Do not abort rseq c.s. in child on fork()
rseq/selftests/arm: Align 'struct rseq_cs' on 32 bytes
rseq/selftests: Make run_param_test.sh executable
Pull EFI fixes from Thomas Gleixner:
"Two fixlets for the EFI maze:
- Properly zero variables to prevent an early boot hang on EFI mixed
mode systems
- Fix the fallout of merging the 32bit and 64bit variants of EFI PCI
related code which ended up chosing the 32bit variant of the actual
EFi call invocation which leads to failures on 64bit"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/x86: Fix incorrect invocation of PciIo->Attributes()
efi/libstub/tpm: Initialize efi_physical_addr_t vars to zero for mixed mode
Pull core fixes from Thomas Gleixner:
"Two tiny fixes:
- Add the missing machine_real_restart() to objtools noreturn list so
it stops complaining
- Fix a trivial comment typo"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kernel.h: Fix a typo in comment
objtool: Add machine_real_restart() to the noreturn list
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Make Xen PV guest deal with speculative store bypass correctly
- Address more fallout from the 5-Level pagetable handling. Undo an
__initdata annotation to avoid section mismatch and malfunction
when post init code would touch the freed variable.
- Handle exception fixup in math_error() before calling notify_die().
The reverse call order incorrectly triggers notify_die() listeners
for soemthing which is handled correctly at the site which issues
the floating point instruction.
- Fix an off by one in the LLC topology calculation on AMD
- Handle non standard memory block sizes gracefully un UV platforms
- Plug a memory leak in the microcode loader
- Sanitize the purgatory build magic
- Add the x86 specific device tree bindings directory to the x86
MAINTAINER file patterns"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix 'no5lvl' handling
Revert "x86/mm: Mark __pgtable_l5_enabled __initdata"
x86/CPU/AMD: Fix LLC ID bit-shift calculation
MAINTAINERS: Add file patterns for x86 device tree bindings
x86/microcode/intel: Fix memleak in save_microcode_patch()
x86/platform/UV: Add kernel parameter to set memory block size
x86/platform/UV: Use new set memory block size function
x86/platform/UV: Add adjustable set memory block size function
x86/build: Remove unnecessary preparation for purgatory
Revert "kexec/purgatory: Add clean-up for purgatory directory"
x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
x86: Call fixup_exception() before notify_die() in math_error()
Pull x86 pti fixes from Thomas Gleixner:
"Two small updates for the speculative distractions:
- Make it more clear to the compiler that array_index_mask_nospec()
is not subject for optimizations. It's not perfect, but ...
- Don't report XEN PV guests as vulnerable because their mitigation
state depends on the hypervisor. Report unknown and refer to the
hypervisor requirement"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
x86/pti: Don't report XenPV as vulnerable
Pull locking fixes from Thomas Gleixner:
"A set of fixes and updates for the locking code:
- Prevent lockdep from updating irq state within its own code and
thereby confusing itself.
- Buid fix for older GCCs which mistreat anonymous unions
- Add a missing lockdep annotation in down_read_non_onwer() which
causes up_read_non_owner() to emit a lockdep splat
- Remove the custom alpha dec_and_lock() implementation which is
incorrect in terms of ordering and use the generic one.
The remaining two commits are not strictly fixes. They provide irqsave
variants of atomic_dec_and_lock() and refcount_dec_and_lock(). These
are required to merge the relevant updates and cleanups into different
maintainer trees for 4.19, so routing them into mainline without
actual users is the sanest approach.
They should have been in -rc1, but last weekend I took the liberty to
just avoid computers in order to regain some mental sanity"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/qspinlock: Fix build for anonymous union in older GCC compilers
locking/lockdep: Do not record IRQ state within lockdep code
locking/rwsem: Fix up_read_non_owner() warning with DEBUG_RWSEMS
locking/refcounts: Implement refcount_dec_and_lock_irqsave()
atomic: Add irqsave variant of atomic_dec_and_lock()
alpha: Remove custom dec_and_lock() implementation
Pull ras fixes from Thomas Gleixner:
"A set of fixes for RAS/MCE:
- Improve the error message when the kernel cannot recover from a MCE
so the maximum amount of information gets provided.
- Individually check MCE recovery features on SkyLake CPUs instead of
assuming none when the CAPID0 register does not advertise the
general ability for recovery.
- Prevent MCE to output inconsistent messages which first show an
error location and then claim that the source is unknown.
- Prevent overwriting MCi_STATUS in the attempt to gather more
information when a fatal MCE has alreay been detected. This leads
to empty status values in the printout and failing to react
promptly on the fatal event"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Fix incorrect "Machine check from unknown source" message
x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
x86/mce: Check for alternate indication of machine check recovery on Skylake
x86/mce: Improve error message when kernel cannot recover
Pull timer fixes from Thomas Gleixner:
"A small set of fixes for time(r) related issues:
- Fix a long standing conversion issue in jiffies_to_msecs() for odd
HZ values like 1024 or 1200 which resulted in returning 0 for small
jiffies values due to rounding down.
- Use the proper CONFIG symbol in the new Y2038 safe compat code for
posix-timers. Not yet a visible breakage, but this will immediately
trigger when the architecture support for the new interfaces is
merged.
- Return an error code in the STM32 clocksource driver on failure
instead of success.
- Remove the redundant and stale irq disabled check in the posix cpu
timer code. The check is at the wrong place anyway and lockdep
already covers it via the sighand lock locking coverage"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Make sure jiffies_to_msecs() preserves non-zero time periods
posix-timers: Fix nanosleep_copyout() for CONFIG_COMPAT_32BIT_TIME
clocksource/drivers/stm32: Fix error return code
posix-cpu-timers: Remove lockdep_assert_irqs_disabled()
Pull irq fixes from Thomas Gleixner:
"A set of fixes mostly for the ARM/GIC world:
- Fix the MSI affinity handling in the ls-scfg irq chip driver so it
updates and uses the effective affinity mask correctly
- Prevent binding LPIs to offline CPUs and respect the Cavium erratum
which requires that LPIs which belong to an offline NUMA node are
not bound to a CPU on a different NUMA node.
- Free only the amount of allocated interrupts in the GIC-V2M driver
instead of trying to free log2(nrirqs).
- Prevent emitting SYNC and VSYNC targetting non existing interrupt
collections in the GIC-V3 ITS driver
- Ensure that the GIV-V3 interrupt redistributor is correctly
reprogrammed on CPU hotplug
- Remove a stale unused helper function"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqdesc: Delete irq_desc_get_msi_desc()
irqchip/gic-v3-its: Fix reprogramming of redistributors on CPU hotplug
irqchip/gic-v3-its: Only emit VSYNC if targetting a valid collection
irqchip/gic-v3-its: Only emit SYNC if targetting a valid collection
irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node
irqchip/gic-v2m: Fix SPI release on error path
irqchip/ls-scfg-msi: Fix MSI affinity handling
genirq/debugfs: Add missing IRQCHIP_SUPPORTS_LEVEL_MSI debug
Pull MIPS fixes from Paul Burton:
"A few MIPS fixes for 4.18:
- a GPIO device name fix for a regression in v4.15-rc1.
- an errata workaround for the BCM5300X platform.
- a fix to ftrace function graph tracing, broken for a long time with
the fix applying cleanly back as far as v3.17.
- addition of read barriers to in{b,w,l,q}() functions, matching
behavior of other architectures & mirroring the equivalent addition
to read{b,w,l,q} in v4.17-rc2.
Plus changes to wire up new syscalls introduced in the 4.18 cycle:
- Restartable sequences support is added, including MIPS support in
the selftests.
- io_pgetevents is wired up"
* tag 'mips_fixes_4.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Wire up io_pgetevents syscall
rseq/selftests: Implement MIPS support
MIPS: Wire up the restartable sequences (rseq) syscall
MIPS: Add syscall detection for restartable sequences
MIPS: Add support for restartable sequences
MIPS: io: Add barrier after register read in inX()
mips: ftrace: fix static function graph tracing
MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum
MIPS: pb44: Fix i2c-gpio GPIO descriptor table
Replaced strp_pause() with strp_unpause() to correct a seemingly copy
paste documentation mistake.
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following commit:
2c3625cb9f ("efi/x86: Fold __setup_efi_pci32() and __setup_efi_pci64() into one function")
... merged the two versions of __setup_efi_pciXX(), without taking into
account that the 32-bit version used a rather dodgy trick to pass an
immediate 0 constant as argument for a uint64_t parameter.
The issue is caused by the fact that on x86, UEFI protocol method calls
are redirected via struct efi_config::call(), which is a variadic function,
and so the compiler has to infer the types of the parameters from the
arguments rather than from the prototype.
As the 32-bit x86 calling convention passes arguments via the stack,
passing the unqualified constant 0 twice is the same as passing 0ULL,
which is why the 32-bit code in __setup_efi_pci32() contained the
following call:
status = efi_early->call(pci->attributes, pci,
EfiPciIoAttributeOperationGet, 0, 0,
&attributes);
to invoke this UEFI protocol method:
typedef
EFI_STATUS
(EFIAPI *EFI_PCI_IO_PROTOCOL_ATTRIBUTES) (
IN EFI_PCI_IO_PROTOCOL *This,
IN EFI_PCI_IO_PROTOCOL_ATTRIBUTE_OPERATION Operation,
IN UINT64 Attributes,
OUT UINT64 *Result OPTIONAL
);
After the merge, we inadvertently ended up with this version for both
32-bit and 64-bit builds, breaking the latter.
So replace the two zeroes with the explicitly typed constant 0ULL,
which works as expected on both 32-bit and 64-bit builds.
Wilfried tested the 64-bit build, and I checked the generated assembly
of a 32-bit build with and without this patch, and they are identical.
Reported-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
Tested-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hdegoede@redhat.com
Cc: linux-efi@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This module exposes two USB configurations: a QMI+AT capable setup on
USB config #1 and a MBIM capable setup on USB config #2.
By default the kernel will choose the MBIM capable configuration as
long as the cdc_mbim driver is available. This patch adds support for
the QMI port in the secondary configuration.
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we are disabling DCB, store "0" in txq->dcb_prio
since that's used for future TX Work Request "OVLAN_IDX"
values. Setting non zero priority upon disabling DCB
would halt the traffic.
Reported-by: AMG Zollner Robert <robert@cloudmedia.eu>
CC: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_SMP=n, building a kernel for R-Car Gen2 fails with:
arch/arm/mach-shmobile/setup-rcar-gen2.o: In function `rcar_gen2_timer_init':
setup-rcar-gen2.c:(.init.text+0x30): undefined reference to `secure_cntvoff_init'
Indeed, on R-Car Gen2 SoCs, secure_cntvoff_init() is not only needed for
secondary CPUs, but also for the boot CPU. This is most visible on SoCs
with Cortex A7 cores (e.g. R-Car E2, cfr. commit 9ce3fa6816 ("ARM:
shmobile: rcar-gen2: Add CA7 arch_timer initialization for r8a7794")),
but Cortex A15 is affected, too.
Fix this by always providing secure_cntvoff_init() when building for ARM
V7.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 7c607944bc ("ARM: smp: Add initialization of CNTVOFF")
Fixes: cad160ed0a ("ARM: shmobile: Convert file to use cntvoff")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Maxime Ripard <maxime.ripard@bootlin.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull block fixes from Jens Axboe:
- Further timeout fixes. We aren't quite there yet, so expect another
round of fixes for that to completely close some of the IRQ vs
completion races. (Christoph/Bart)
- Set of NVMe fixes from the usual suspects, mostly error handling
- Two off-by-one fixes (Dan)
- Another bdi race fix (Jan)
- Fix nbd reconfigure with NBD_DISCONNECT_ON_CLOSE (Doron)
* tag 'for-linus-20180623' of git://git.kernel.dk/linux-block:
blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE
bdi: Fix another oops in wb_workfn()
lightnvm: Remove depends on HAS_DMA in case of platform dependency
nvme-pci: limit max IO size and segments to avoid high order allocations
nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl
nvme-fc: release io queues to allow fast fail
nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag.
block: sed-opal: Fix a couple off by one bugs
blk-mq-debugfs: Off by one in blk_mq_rq_state_name()
nvmet: reset keep alive timer in controller enable
nvme-rdma: don't override opts->queue_size
nvme-rdma: Fix command completion race at error recovery
nvme-rdma: fix possible free of a non-allocated async event buffer
nvme-rdma: fix possible double free condition when failing to create a controller
Revert "block: Add warning for bi_next not NULL in bio_endio()"
block: fix timeout changes for legacy request drivers
Pull crypto fixes from Herbert Xu:
- Fix use after free in chtls
- Fix RBP breakage in sha3
- Fix use after free in hwrng_unregister
- Fix overread in morus640
- Move sleep out of kernel_neon in arm64/aes-blk
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: core - Always drop the RNG in hwrng_unregister()
crypto: morus640 - Fix out-of-bounds access
crypto: don't optimize keccakf()
crypto: arm64/aes-blk - fix and move skcipher_walk_done out of kernel_neon_begin, _end
crypto: chtls - use after free in chtls_pt_recvmsg()
Pull kselftest fixes from Shuah Khan:
- fix new sparc64 adi driver test compile errors on non-sparc systems
- fix config fragment for sync framework for improved test coverage
- fix several tests to return correct Kselftest skip code
* tag 'linux-kselftest-4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: sparc64: Add missing SPDX License Identifiers
selftests: sparc64: delete RUN_TESTS and EMIT_TESTS overrides
selftests: sparc64: Fix to do nothing on non-sparc64
selftests: sync: add config fragment for testing sync framework
selftests: vm: return Kselftest Skip code for skipped tests
selftests: zram: return Kselftest Skip code for skipped tests
selftests: user: return Kselftest Skip code for skipped tests
selftests: sysctl: return Kselftest Skip code for skipped tests
selftests: static_keys: return Kselftest Skip code for skipped tests
selftests: pstore: return Kselftest Skip code for skipped tests
Pull tracing fixes from Steven Rostedt:
"This contains a few fixes and a clean up.
- a bad merge caused an "endif" to go in the wrong place in
scripts/Makefile.build
- softirq tracing fix for tracing that corrupts lockdep and causes a
false splat
- histogram documentation typo fixes
- fix a bad memory reference when passing in no filter to the filter
code
- simplify code by using the swap macro instead of open coding the
swap"
* tag 'trace-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix SKIP_STACK_VALIDATION=1 build due to bad merge with -mrecord-mcount
tracing: Fix some errors in histogram documentation
tracing: Use swap macro in update_max_tr
softirq: Reorder trace_softirqs_on to prevent lockdep splat
tracing: Check for no filter when processing event filters
The defconfig has drifted over time, as Kconfig entries have changed order
or default values. Several maintainers ended up running 'savedefconfig'
themselves which caused a cascade of conflicts. Let's do it once and
for all in our tree before -rc2 instead.
Signed-off-by: Olof Johansson <olof@lixom.net>
The defconfig has drifted over time, as Kconfig entries have changed order
or default values. Several maintainers ended up running 'savedefconfig'
themselves which caused a cascade of conflicts. Let's do it once and
for all in our tree before -rc2 instead.
Signed-off-by: Olof Johansson <olof@lixom.net>
This pull request contains Broadcom ARM64-based SoCs MAINTAINERS file
fixes for 4.18, please pull the following:
- Florian fixes the entry for the Northstar2 SoCs which was somehow lost
during a directory move. He also adds an entry for the Stingray SoCs
which was missing from the day files were added
* tag 'arm-soc/for-4.18/maintainers-arm64-fixes' of https://github.com/Broadcom/stblinux:
MAINTAINERS: Update Broadcom iProc entry with Stingray
MAINAINTERS: Corrected Broadcom Northstar2 entry
Signed-off-by: Olof Johansson <olof@lixom.net>
This pull request contains Broadcom ARM64-based SoCs Device Tree fixes
for 4.18, please pull the following:
- Scott fixes both the bcm958742k and bcm958742t reference boards to
have the correct eMMC voltage specified
- Ray fixes the I2C and PCIe Device Tree nodes interrupt specifiers for
Northstar 2 and Stingray SoCs.
* tag 'arm-soc/for-4.18/devicetree-arm64-fixes' of https://github.com/Broadcom/stblinux:
arm64: dts: Stingray: Fix I2C controller interrupt type
arm64: dts: ns2: Fix PCIe controller interrupt type
arm64: dts: ns2: Fix I2C controller interrupt type
arm64: dts: specify 1.8V EMMC capabilities for bcm958742t
arm64: dts: specify 1.8V EMMC capabilities for bcm958742k
Signed-off-by: Olof Johansson <olof@lixom.net>
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
4.18, please pull the following:
- Ray fixes the I2C and PCIe interrupt types for the Cygnus SoC
- Florian fixes the I2C and PCIe interrupts for the Northstar
(BCM5301x), Northstar Plus and Hurricane 2 SoCs
* tag 'arm-soc/for-4.18/devicetree-fixes' of https://github.com/Broadcom/stblinux:
ARM: dts: Cygnus: Fix PCIe controller interrupt type
ARM: dts: Cygnus: Fix I2C controller interrupt type
ARM: dts: BCM5301x: Fix i2c controller interrupt type
ARM: dts: HR2: Fix interrupt types for i2c and PCIe
ARM: dts: NSP: Fix PCIe controllers interrupt types
ARM: dts: NSP: Fix i2c controller interrupt type
Signed-off-by: Olof Johansson <olof@lixom.net>
i.MX fixes for 4.18:
- Fix i.MX6SX PCIe MSI interrupt number, so that MSI IRQs can be
properly propagated to the upstream interrupt controller.
- Fix GPCv2 MIPI/PCIe/USB_HSIC's PGC offset. The values in Reference
Manual are incorrect.
- Correct SDMA setting for i.MX6Q SPI5 device to fix the issue, that
the SPI controller RX FIFO was not empty after a DMA transfer, and
the driver gets stuck in the next PIO transfer when reading one word
more than expected.
* tag 'imx-fixes-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6sx: fix irq for pcie bridge
soc: imx: gpcv2: correct PGC offset
ARM: dts: imx6q: Use correct SDMA script for SPI5 core
Signed-off-by: Olof Johansson <olof@lixom.net>
Renesas ARM Based SoC Fixes for v4.18
Make PM domain initialization more robust in Renesas R-Car SYSC driver.
This resolves a regression due to re-parenting of PM domains by
086b399965 ("soc: renesas: r8a77990-sysc: Add workaround for 3DG-{A,B}").
* tag 'renesas-fixes-for-v4.18' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
soc: renesas: rcar-sysc: Make PM domain initialization more robust
Signed-off-by: Olof Johansson <olof@lixom.net>
mvebu fixes for 4.17 (part 2)
- Use correct size for ICU nodes (irq controller) on Armada 7K/8K
- Fix "#cooling-cells" property's name on Synology DS116 (Armada 385)
* tag 'mvebu-fixes-4.17-2' of git://git.infradead.org/linux-mvebu:
arm: dts: armada: Fix "#cooling-cells" property's name
arm64: dts: marvell: fix CP110 ICU node size
Signed-off-by: Olof Johansson <olof@lixom.net>
Make sure that RQF_TIMED_OUT is cleared when a request is reused
after a block driver timeout handler has returned BLK_EH_DONE.
Fixes: da66126739 ("blk-mq: don't time out requests again that are in the timeout handler")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Andrew Randrianasulu <randrianasulu@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fix missing dst_release() when local broadcast or multicast traffic is
xfrm policy blocked.
For IPv4 this results to dst leak: ip_route_output_flow() allocates
dst_entry via __ip_route_output_key() and passes it to
xfrm_lookup_route(). xfrm_lookup returns ERR_PTR(-EPERM) that is
propagated. The dst that was allocated is never released.
IPv4 local broadcast testcase:
ping -b 192.168.1.255 &
sleep 1
ip xfrm policy add src 0.0.0.0/0 dst 192.168.1.255/32 dir out action block
IPv4 multicast testcase:
ping 224.0.0.1 &
sleep 1
ip xfrm policy add src 0.0.0.0/0 dst 224.0.0.1/32 dir out action block
For IPv6 the missing dst_release() causes trouble e.g. when used in netns:
ip netns add TEST
ip netns exec TEST ip link set lo up
ip link add dummy0 type dummy
ip link set dev dummy0 netns TEST
ip netns exec TEST ip addr add fd00::1111 dev dummy0
ip netns exec TEST ip link set dummy0 up
ip netns exec TEST ping -6 -c 5 ff02::1%dummy0 &
sleep 1
ip netns exec TEST ip xfrm policy add src ::/0 dst ff02::1 dir out action block
wait
ip netns del TEST
After netns deletion we see:
[ 258.239097] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 268.279061] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 278.367018] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 288.375259] unregister_netdevice: waiting for lo to become free. Usage count = 2
Fixes: ac37e2515c ("xfrm: release dst_orig in case of error in xfrm_lookup()")
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Pull powerpc fixes from Michael Ellerman:
- a fix for hugetlb with 4K pages, broken by our recent changes for
split PMD PTL.
- set the correct assembler machine type on e500mc, needed since
binutils 2.26 introduced two forms for the "wait" instruction.
- a fix for potential missed TLB flushes with MADV_[FREE|DONTNEED] etc.
and THP on Power9 Radix.
- three fixes to try and make our panic handling more robust by hard
disabling interrupts, and not marking stopped CPUs as offline because
they haven't been properly offlined.
- three other minor fixes.
Thanks to: Aneesh Kumar K.V, Michael Jeanson, Nicholas Piggin.
* tag 'powerpc-4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/hash/4k: Free hugetlb page table caches correctly.
powerpc/64s/radix: Fix radix_kvm_prefetch_workaround paca access of not possible CPU
powerpc/64s: Fix build failures with CONFIG_NMI_IPI=n
powerpc/64: hard disable irqs on the panic()ing CPU
powerpc: smp_send_stop do not offline stopped CPUs
powerpc/64: hard disable irqs in panic_smp_self_stop
powerpc/64s: Fix DT CPU features Power9 DD2.1 logic
powerpc/64s/radix: Fix MADV_[FREE|DONTNEED] TLB flush miss problem with THP
powerpc/e500mc: Set assembler machine type to e500mc
Pull arm64 fixes from Catalin Marinas:
- clear buffers allocated with FORCE_CONTIGUOUS explicitly until the
CMA code honours __GFP_ZERO
- notrace annotation for secondary_start_kernel()
- use early_param() instead of __setup() for "kpti=" as it is needed
for the cpufeature callback remapping swapper to non-global mappings
- ensure writes to swapper are ordered wrt subsequent cache maintenance
in the kpti non-global remapping code
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenance
arm64: kpti: Use early_param for kpti= command-line option
arm64: make secondary_start_kernel() notrace
arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag
Pull KVM fixes from Radim Krčmář:
"ARM:
- Lazy FPSIMD switching fixes
- Really disable compat ioctls on architectures that don't want it
- Disable compat on arm64 (it was never implemented...)
- Rely on architectural requirements for GICV on GICv3
- Detect bad alignments in unmap_stage2_range
x86:
- Add nested VM entry checks to avoid broken error recovery path
- Minor documentation fix"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number
kvm: vmx: Nested VM-entry prereqs for event inj.
KVM: arm64: Prevent KVM_COMPAT from being selected
KVM: Enforce error in ioctl for compat tasks when !KVM_COMPAT
KVM: arm/arm64: add WARN_ON if size is not PAGE_SIZE aligned in unmap_stage2_range
KVM: arm64: Avoid mistaken attempts to save SVE state for vcpus
KVM: arm64/sve: Fix SVE trap restoration for non-current tasks
KVM: arm64: Don't mask softirq with IRQs disabled in vcpu_put()
arm64: Introduce sysreg_clear_set()
KVM: arm/arm64: Drop resource size check for GICV window
Pull xen fixes from Juergen Gross:
"This contains the following fixes/cleanups:
- the removal of a BUG_ON() which wasn't necessary and which could
trigger now due to a recent change
- a correction of a long standing bug happening very rarely in Xen
dom0 when a hypercall buffer from user land was not accessible by
the hypervisor for very short periods of time due to e.g. page
migration or compaction
- usage of EXPORT_SYMBOL_GPL() instead of EXPORT_SYMBOL() in a
Xen-related driver (no breakage possible as using those symbols
without others already exported via EXPORT-SYMBOL_GPL() wouldn't
make any sense)
- a simplification for Xen PVH or Xen ARM guests
- some additional error handling for callers of xenbus_printf()"
* tag 'for-linus-4.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Remove unnecessary BUG_ON from __unbind_from_irq()
xen: add new hypercall buffer mapping device
xen/scsiback: add error handling for xenbus_printf
scsi: xen-scsifront: add error handling for xenbus_printf
xen/grant-table: Export gnttab_{alloc|free}_pages as GPL
xen: add error handling for xenbus_printf
xen: share start flags between PV and PVH
This reverts commit e4e961e36f.
We need to use early version of pgtable_l5_enabled() in
early_identify_cpu() as this code runs before cpu_feature_enabled() is
usable.
But it leads to section mismatch:
cpu_init()
load_mm_ldt()
ldt_slot_va()
LDT_BASE_ADDR
LDT_PGD_ENTRY
pgtable_l5_enabled()
__pgtable_l5_enabled
__pgtable_l5_enabled marked as __initdata, but cpu_init() is not __init.
It's fixable: early code can be isolated into a separate translation unit,
but such change collides with other work in the area. That's too much
hassle to save 4 bytes of memory.
Return __pgtable_l5_enabled back to be __ro_after_init.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20180622220841.54135-2-kirill.shutemov@linux.intel.com
Wire up io_pgetevents system call on powerpc.
io_pgetevents is a new syscall to read asynchronous I/O events from the
completion queue.
Tested with libaio branch aio-poll[1] and the io_pgetevents test (#22) passed
on both ppc64 LE and BE modes.
[1] https://pagure.io/libaio/branch/aio-poll
CC: Christoph Hellwig <hch@lst.de>
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When a (broken) node wrongly sends multicast TT entries with a ROAM
flag then this causes any receiving node to drop all entries for the
same multicast MAC address announced by other nodes, leading to
packet loss.
Fix this DoS vector by only storing TT sync flags. For multicast TT
non-sync'ing flag bits like ROAM are unused so far anyway.
Fixes: 1d8ab8d3c1 ("batman-adv: Modified forwarding behaviour for multicast packets")
Reported-by: Leonardo Mörlein <me@irrelefant.net>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Since commit 54e22f265e ("batman-adv: fix TT sync flag inconsistencies")
TT sync flags and TT non-sync'd flags are supposed to be stored
separately.
The previous patch missed to apply this separation on a TT entry with
only a single TT orig entry.
This is a minor fix because with only a single TT orig entry the DDoS
issue the former patch solves does not apply.
Fixes: 54e22f265e ("batman-adv: fix TT sync flag inconsistencies")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
batman-adv is creating special debugfs directories in the init
net_namespace for each created soft-interface (batadv net_device). But it
is possible to rename a net_device to a completely different name then the
original one.
It can therefore happen that a user registers a new batadv net_device with
the name "bat0". batman-adv is then also adding a new directory under
$debugfs/batman-adv/ with the name "wlan0".
The user then decides to rename this device to "bat1" and registers a
different batadv device with the name "bat0". batman-adv will then try to
create a directory with the name "bat0" under $debugfs/batman-adv/ again.
But there already exists one with this name under this path and thus this
fails. batman-adv will detect a problem and rollback the registering of
this device.
batman-adv must therefore take care of renaming the debugfs directories for
soft-interfaces whenever it detects such a net_device rename.
Fixes: c6c8fea297 ("net: Add batman-adv meshing protocol")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
batman-adv is creating special debugfs directories in the init
net_namespace for each valid hard-interface (net_device). But it is
possible to rename a net_device to a completely different name then the
original one.
It can therefore happen that a user registers a new net_device which gets
the name "wlan0" assigned by default. batman-adv is also adding a new
directory under $debugfs/batman-adv/ with the name "wlan0".
The user then decides to rename this device to "wl_pri" and registers a
different device. The kernel may now decide to use the name "wlan0" again
for this new device. batman-adv will detect it as a valid net_device and
tries to create a directory with the name "wlan0" under
$debugfs/batman-adv/. But there already exists one with this name under
this path and thus this fails. batman-adv will detect a problem and
rollback the registering of this device.
batman-adv must therefore take care of renaming the debugfs directories
for hard-interfaces whenever it detects such a net_device rename.
Fixes: 5bc7c1eb44 ("batman-adv: add debugfs structure for information per interface")
Reported-by: John Soros <sorosj@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
A reference for the best gateway is taken when the list of gateways in the
mesh is sent via netlink. This is necessary to check whether the currently
dumped entry is the currently selected gateway or not. This information is
then transferred as flag BATADV_ATTR_FLAG_BEST.
After the comparison of the current entry is done,
batadv_v_gw_dump_entry() has to decrease the reference counter again.
Otherwise the reference will be held and thus prevents a proper shutdown of
the batman-adv interfaces (and some of the interfaces enslaved in it).
Fixes: b71bb6f924 ("batman-adv: add B.A.T.M.A.N. V bat_gw_dump implementations")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
A reference for the best gateway is taken when the list of gateways in the
mesh is sent via netlink. This is necessary to check whether the currently
dumped entry is the currently selected gateway or not. This information is
then transferred as flag BATADV_ATTR_FLAG_BEST.
After the comparison of the current entry is done,
batadv_iv_gw_dump_entry() has to decrease the reference counter again.
Otherwise the reference will be held and thus prevents a proper shutdown of
the batman-adv interfaces (and some of the interfaces enslaved in it).
Fixes: efb766af06 ("batman-adv: add B.A.T.M.A.N. IV bat_gw_dump implementations")
Reported-by: Andreas Ziegler <dev@andreas-ziegler.de>
Tested-by: Andreas Ziegler <dev@andreas-ziegler.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The ETF input should be connected to the funnel output, and the ETF
output should be connected to the replicator input. The labels are wrong
and these got swapped:
Warning (graph_endpoint): /soc/funnel@821000/ports/port@8/endpoint: graph connection to node '/soc/etf@825000/ports/port@1/endpoint' is not bidirectional
Warning (graph_endpoint): /soc/replicator@824000/ports/port@2/endpoint: graph connection to node '/soc/etf@825000/ports/port@0/endpoint' is not bidirectional
Fixes: 7c10da3736 ("arm64: dts: qcom: Add msm8916 CoreSight components")
Cc: Ivan T. Ivanov <ivan.ivanov@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Andy Gross <andy.gross@linaro.org>
Cc: David Brown <david.brown@linaro.org>
Cc: linux-arm-msm@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Access to UART0 is disabled by bootloaders. By leaving it enabled by
default would reboot the board.
Disable this for now, this would alteast give a board which boots.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
In update_vf():
cftree_remove(cl);
update_cfmin(cl->cl_parent);
the cl_cfmin of cl->cl_parent is intentionally updated to 0
when that parent only has one child. And if this parent is
root qdisc, we could end up, in hfsc_schedule_watchdog(),
that we can't decide the next schedule time for qdisc watchdog.
But it seems safe that we can just skip it, as this watchdog is
not always scheduled anyway.
Thanks to Marco for testing all the cases, nothing is broken.
Reported-by: Marco Berizzi <pupilla@libero.it>
Tested-by: Marco Berizzi <pupilla@libero.it>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
net: dccp: fixes around rx_tstamp_last_feedback
This patch series fix some issues with rx_tstamp_last_feedback.
- Switch to monotonic clock.
- Avoid potential overflows on fast hosts/networks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
To compute delays, better not use time of the day which can
be changed by admins or malicious programs.
Also change ccid3_first_li() to use s64 type for delta variable
to avoid potential overflows.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Smack: Mark inode instant in smack_task_to_inode
/proc clean-up in commit 1bbc55131e
resulted in smack_task_to_inode() being called before smack_d_instantiate.
This resulted in the smk_inode value being ignored, even while present
for files in /proc/self. Marking the inode as instant here fixes that.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.
Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.
This simplifies the dependencies, and allows to improve compile-testing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a sparse warning about using an incorrect type in
argument 2 of ocelot_write_rix(), as an u32 was expected but a __be32
was given. The conversion to u32 is forced, which is safe as the value
will be written as-is in the hardware without any modification.
Fixes: 08d02364b1 ("net: mscc: fix the injection header")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using s/w buffer management, buffers are allocated and DMA mapped.
When doing so on an arm64 platform, an offset correction is applied on
the DMA address, before storing it in an Rx descriptor. The issue is
this DMA address is then used later in the Rx path without removing the
offset correction. Thus the DMA address is wrong, which can led to
various issues.
This patch fixes this by removing the offset correction from the DMA
address retrieved from the Rx descriptor before using it in the Rx path.
Fixes: 8d5047cf9c ("net: mvneta: Convert to be 64 bits compatible")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent patch updated e1000 docs to rst format. Docs build (`make
htmldocs`) is currently failing due to this file with error:
(SEVERE/4) Unexpected section title.
This is because a section of the file is indented 2 spaces. Build error
can be cleared by aligning the text with column 0. While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.
Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.
Fixes commit (228046e761 Documentation: e1000: Update kernel documentation)
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent patch updated e100 docs to rst format. Docs build (`make
htmldocs`) is currently failing due to this file with error:
(SEVERE/4) Unexpected section title.
This is because a section of the file is indented 2 spaces. Build error
can be cleared by aligning the text with column 0. While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.
Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.
Fixes commit (85d63445f4 Documentation: e100: Update the Intel 10/100 driver doc)
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently documentation file was converted to rst. The document title
has the incorrect heading adornment. From kernel docs:
* Please stick to this order of heading adornments:
1. ``=`` with overline for document title::
==============
Document title
==============
Add overline heading adornment to document title.
Fixes commit (228046e761 Documentation: e1000: Update kernel documentation)
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently documentation file was converted to rst. The document title
has the incorrect heading adornment. From kernel docs:
* Please stick to this order of heading adornments:
1. ``=`` with overline for document title::
==============
Document title
==============
Add overline heading adornment to document title.
Fixes commit (85d63445f4 Documentation: e100: Update the Intel 10/100 driver doc)
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After recieving MLD querys, we update idev->mc_maxdelay with max_delay
from query header. This make the later unsolicited reports have the same
interval with mc_maxdelay, which means we may send unsolicited reports with
long interval time instead of default configured interval time.
Also as we will not call ipv6_mc_reset() after device up. This issue will
be there even after leave the group and join other groups.
Fixes: fc4eba58b4 ("ipv6: make unsolicited report intervals configurable for mld")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when
we meet errors during ubuf allocation, the code does not check for
NULL before calling sockfd_put(), this will lead NULL
dereferencing. Fixing by checking sock pointer before.
Fixes: bab632d69e ("vhost: vhost TX zero-copy support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a regression I accidentally reduced that was picked up by
kasan, where we were checking the CRTC atomic states after DRM's helpers
had already freed them. Example:
==================================================================
BUG: KASAN: use-after-free in amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
Read of size 1 at addr ffff8803a697b071 by task kworker/u16:0/7
CPU: 7 PID: 7 Comm: kworker/u16:0 Tainted: G O 4.18.0-rc1Lyude-Upstream+ #1
Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.21 05/02/2018
Workqueue: events_unbound commit_work [drm_kms_helper]
Call Trace:
dump_stack+0xc1/0x169
? dump_stack_print_info.cold.1+0x42/0x42
? kmsg_dump_rewind_nolock+0xd9/0xd9
? printk+0x9f/0xc5
? amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
print_address_description+0x6c/0x23c
? amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
kasan_report.cold.6+0x241/0x2fd
amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
? commit_planes_to_stream.constprop.45+0x13b0/0x13b0 [amdgpu]
? cpu_load_update_active+0x290/0x290
? finish_task_switch+0x2bd/0x840
? __switch_to_asm+0x34/0x70
? read_word_at_a_time+0xe/0x20
? strscpy+0x14b/0x460
? drm_atomic_helper_wait_for_dependencies+0x47d/0x7e0 [drm_kms_helper]
commit_tail+0x96/0xe0 [drm_kms_helper]
process_one_work+0x88a/0x1360
? create_worker+0x540/0x540
? __sched_text_start+0x8/0x8
? move_queued_task+0x760/0x760
? call_rcu_sched+0x20/0x20
? vsnprintf+0xcda/0x1350
? wait_woken+0x1c0/0x1c0
? mutex_unlock+0x1d/0x40
? init_timer_key+0x190/0x230
? schedule+0xea/0x390
? __schedule+0x1ea0/0x1ea0
? need_to_create_worker+0xe4/0x210
? init_worker_pool+0x700/0x700
? try_to_del_timer_sync+0xbf/0x110
? del_timer+0x120/0x120
? __mutex_lock_slowpath+0x10/0x10
worker_thread+0x196/0x11f0
? flush_rcu_work+0x50/0x50
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __schedule+0x7d6/0x1ea0
? migrate_swap_stop+0x850/0x880
? __sched_text_start+0x8/0x8
? save_stack+0x8c/0xb0
? kasan_kmalloc+0xbf/0xe0
? kmem_cache_alloc_trace+0xe4/0x190
? kthread+0x98/0x390
? ret_from_fork+0x35/0x40
? ret_from_fork+0x35/0x40
? deactivate_slab.isra.67+0x3c4/0x5c0
? kthread+0x98/0x390
? kthread+0x98/0x390
? set_track+0x76/0x120
? schedule+0xea/0x390
? __schedule+0x1ea0/0x1ea0
? wait_woken+0x1c0/0x1c0
? kasan_unpoison_shadow+0x30/0x40
? parse_args.cold.15+0x17a/0x17a
? flush_rcu_work+0x50/0x50
kthread+0x2d4/0x390
? kthread_create_worker_on_cpu+0xc0/0xc0
ret_from_fork+0x35/0x40
Allocated by task 1124:
kasan_kmalloc+0xbf/0xe0
kmem_cache_alloc_trace+0xe4/0x190
dm_crtc_duplicate_state+0x78/0x130 [amdgpu]
drm_atomic_get_crtc_state+0x147/0x410 [drm]
page_flip_common+0x57/0x230 [drm_kms_helper]
drm_atomic_helper_page_flip+0xa6/0x110 [drm_kms_helper]
drm_mode_page_flip_ioctl+0xc4b/0x10a0 [drm]
drm_ioctl_kernel+0x1d4/0x260 [drm]
drm_ioctl+0x433/0x920 [drm]
amdgpu_drm_ioctl+0x11d/0x290 [amdgpu]
do_vfs_ioctl+0x1a1/0x13d0
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x6f/0xb0
do_syscall_64+0x147/0x440
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 1124:
__kasan_slab_free+0x12e/0x180
kfree+0x92/0x1a0
drm_atomic_state_default_clear+0x315/0xc40 [drm]
__drm_atomic_state_free+0x35/0xd0 [drm]
drm_atomic_helper_update_plane+0xac/0x350 [drm_kms_helper]
__setplane_internal+0x2d6/0x840 [drm]
drm_mode_cursor_universal+0x41e/0xbe0 [drm]
drm_mode_cursor_common+0x49f/0x880 [drm]
drm_mode_cursor_ioctl+0xd8/0x130 [drm]
drm_ioctl_kernel+0x1d4/0x260 [drm]
drm_ioctl+0x433/0x920 [drm]
amdgpu_drm_ioctl+0x11d/0x290 [amdgpu]
do_vfs_ioctl+0x1a1/0x13d0
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x6f/0xb0
do_syscall_64+0x147/0x440
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The buggy address belongs to the object at ffff8803a697b068
which belongs to the cache kmalloc-1024 of size 1024
The buggy address is located 9 bytes inside of
1024-byte region [ffff8803a697b068, ffff8803a697b468)
The buggy address belongs to the page:
page:ffffea000e9a5e00 count:1 mapcount:0 mapping:ffff88041e00efc0 index:0x0 compound_mapcount: 0
flags: 0x8000000000008100(slab|head)
raw: 8000000000008100 ffffea000ecbc208 ffff88041e000c70 ffff88041e00efc0
raw: 0000000000000000 0000000000170017 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8803a697af00: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8803a697af80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8803a697b000: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
^
ffff8803a697b080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8803a697b100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
So, we fix this by counting the number of CRTCs this atomic commit disabled
early on in the function before their atomic states have been freed, then use
that count later to do the appropriate number of RPM puts at the end of the
function.
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: stable@vger.kernel.org
Fixes: 97028037a3 ("drm/amdgpu: Grab/put runtime PM references in atomic_commit_tail()")
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Michel Dänzer <michel@daenzer.net>
Reported-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The current logic incorrectly calculates the LLC ID from the APIC ID.
Unless specified otherwise, the LLC ID should be calculated by removing
the Core and Thread ID bits from the least significant end of the APIC
ID. For more info, see "ApicId Enumeration Requirements" in any Fam17h
PPR document.
[ bp: Improve commit message. ]
Fixes: 68091ee7ac ("Calculate last level cache ID from number of sharing threads")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1528915390-30533-1-git-send-email-suravee.suthikulpanit@amd.com
A newly introduced function has 'const int' as the return type,
but as "make W=1" reports, that has no meaning:
drivers/md/dm-raid.c:510:18: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
This changes the return type to plain 'int'.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 33e53f0685 ("dm raid: introduce extended superblock and new raid types to support takeover/reshaping")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Fixes: 552aa679f2 ("dm raid: use rs_is_raid*()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This adjusts the allocator calls to use the 2-factor argument style, as
already done treewide for better defense against allocator overflows.
Signed-off-by: Kees Cook <keescook@chromium.org>
[snitzer: tweaked code to leave assignment in a test alone]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Commit 5a32083d03 ("dm: take care to copy the space map roots before
locking the superblock") properly removed the calls to dm_sm_root_size()
from __write_initial_superblock(). But the dm_sm_root_size() calls were
left dangling in __commit_transaction().
Fixes: 5a32083d03 ("dm: take care to copy the space map roots before locking the superblock")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Use of bio_clone_bioset() is inefficient if there is no need to clone
the original bio's bio_vec array. Best to use the bio_clone_fast()
variant. Also, just using bio_advance() is only part of what is needed
to properly setup the clone -- it doesn't account for the various
bio_integrity() related work that also needs to be performed (see
bio_split).
Address both of these issues by switching from bio_clone_bioset() to
bio_split().
Fixes: 18a25da8 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.15+, requires removal of '&' before md->queue->bio_split
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to
wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was
WB_shutting_down after wb->bdi->dev became NULL. This indicates that
unregister_bdi() failed to call wb_shutdown() on one of wb objects.
The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus
drops bdi's reference to wb structures before going through the list of
wbs again and calling wb_shutdown() on each of them. This way the loop
iterating through all wbs can easily miss a wb if that wb has already
passed through cgwb_remove_from_bdi_list() called from wb_shutdown()
from cgwb_release_workfn() and as a result fully shutdown bdi although
wb_workfn() for this wb structure is still running. In fact there are
also other ways cgwb_bdi_unregister() can race with
cgwb_release_workfn() leading e.g. to use-after-free issues:
CPU1 CPU2
cgwb_bdi_unregister()
cgwb_kill(*slot);
cgwb_release()
queue_work(cgwb_release_wq, &wb->release_work);
cgwb_release_workfn()
wb = list_first_entry(&bdi->wb_list, ...)
spin_unlock_irq(&cgwb_lock);
wb_shutdown(wb);
...
kfree_rcu(wb, rcu);
wb_shutdown(wb); -> oops use-after-free
We solve these issues by synchronizing writeback structure shutdown from
cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That
way we also no longer need synchronization using WB_shutting_down as the
mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
bdi_unregister().
Reported-by: syzbot <syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.
Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.
This simplifies the dependencies, and allows to improve compile-testing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When delivering a signal to a task that is using rseq, we call into
__rseq_handle_notify_resume() so that the registers pushed in the
sigframe are updated to reflect the state of the restartable sequence
(for example, ensuring that the signal returns to the abort handler if
necessary).
However, if the rseq management fails due to an unrecoverable fault when
accessing userspace or certain combinations of RSEQ_CS_* flags, then we
will attempt to deliver a SIGSEGV. This has the potential for infinite
recursion if the rseq code continuously fails on signal delivery.
Avoid this problem by using force_sigsegv() instead of force_sig(), which
is explicitly designed to reset the SEGV handler to SIG_DFL in the case
of a recursive fault. In doing so, remove rseq_signal_deliver() from the
internal rseq API and have an optional struct ksignal * parameter to
rseq_handle_notify_resume() instead.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: peterz@infradead.org
Cc: paulmck@linux.vnet.ibm.com
Cc: boqun.feng@gmail.com
Link: https://lkml.kernel.org/r/1529664307-983-1-git-send-email-will.deacon@arm.com
Since commit 1bb8866677 ("mtd: nand: denali: handle timing parameters
by setup_data_interface()"), denali_dt.c gets the clock rate from the
clock driver. The driver expects the frequency of the bus interface
clock, whereas the clock driver of SOCFPGA provides the core clock.
Thus, the setup_data_interface() hook calculates timing parameters
based on a wrong frequency.
To make it work without relying on the clock driver, hard-code the clock
frequency, 200MHz. This is fine for existing DT of UniPhier, and also
fixes the issue of SOCFPGA because both platforms use 200 MHz for the
bus interface clock.
Fixes: 1bb8866677 ("mtd: nand: denali: handle timing parameters by setup_data_interface()")
Cc: linux-stable <stable@vger.kernel.org> #4.14+
Reported-by: Philipp Rosenberger <p.rosenberger@linutronix.de>
Suggested-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
When rewriting swapper using nG mappings, we must performance cache
maintenance around each page table access in order to avoid coherency
problems with the host's cacheable alias under KVM. To ensure correct
ordering of the maintenance with respect to Device memory accesses made
with the Stage-1 MMU disabled, DMBs need to be added between the
maintenance and the corresponding memory access.
This patch adds a missing DMB between writing a new page table entry and
performing a clean+invalidate on the same line.
Fixes: f992b4dfd5 ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings")
Cc: <stable@vger.kernel.org> # 4.16.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We inspect __kpti_forced early on as part of the cpufeature enable
callback which remaps the swapper page table using non-global entries.
Ensure that __kpti_forced has been updated to reflect the kpti=
command-line option before we start using it.
Fixes: ea1e3de85e ("arm64: entry: Add fake CPU feature for unmapping the kernel at EL0")
Cc: <stable@vger.kernel.org> # 4.16.x-
Reported-by: Wei Xu <xuwei5@hisilicon.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For the common cases where 1000 is a multiple of HZ, or HZ is a multiple of
1000, jiffies_to_msecs() never returns zero when passed a non-zero time
period.
However, if HZ > 1000 and not an integer multiple of 1000 (e.g. 1024 or
1200, as used on alpha and DECstation), jiffies_to_msecs() may return zero
for small non-zero time periods. This may break code that relies on
receiving back a non-zero value.
jiffies_to_usecs() does not need such a fix: one jiffy can only be less
than one µs if HZ > 1000000, and such large values of HZ are already
rejected at build time, twice:
- include/linux/jiffies.h does #error if HZ >= 12288,
- kernel/time/time.c has BUILD_BUG_ON(HZ > USEC_PER_SEC).
Broken since forever.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622143357.7495-1-geert@linux-m68k.org
KVM_CAP_HYPERV_TLBFLUSH collided with KVM_CAP_S390_PSW-BPB, its paragraph
number should now be 8.18.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
This patch extends the checks done prior to a nested VM entry.
Specifically, it extends the check_vmentry_prereqs function with checks
for fields relevant to the VM-entry event injection information, as
described in the Intel SDM, volume 3.
This patch is motivated by a syzkaller bug, where a bad VM-entry
interruption information field is generated in the VMCS02, which causes
the nested VM launch to fail. Then, KVM fails to resume L1.
While KVM should be improved to correctly resume L1 execution after a
failed nested launch, this change is justified because the existing code
to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is
sparse.
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Marc Orr <marcorr@google.com>
[Removed comment whose parts were describing previous revisions and the
rest was obvious from function/variable naming. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Pull NVMe fixes from Christoph:
"Various relatively small fixes, mostly to fix error handling of various
sorts."
* 'nvme-4.18' of git://git.infradead.org/nvme:
nvme-pci: limit max IO size and segments to avoid high order allocations
nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl
nvme-fc: release io queues to allow fast fail
nvmet: reset keep alive timer in controller enable
nvme-rdma: don't override opts->queue_size
nvme-rdma: Fix command completion race at error recovery
nvme-rdma: fix possible free of a non-allocated async event buffer
nvme-rdma: fix possible double free condition when failing to create a controller
KVM/arm fixes for 4.18, take #1
- Lazy FPSIMD switching fixes
- Really disable compat ioctls on architectures that don't want it
- Disable compat on arm64 (it was never implemented...)
- Rely on architectural requirements for GICV on GICv3
- Detect bad alignments in unmap_stage2_range
Some injection testing resulted in the following console log:
mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
mce: [Hardware Error]: Run the above through 'mcelog --ascii'
Kernel panic - not syncing: Machine check from unknown source
This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".
The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.
It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.
We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually every happens).
[ bp: Remove unneeded severity assignment. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org # 4.2
Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com
mce_no_way_out() does a quick check during #MC to see whether some of
the MCEs logged would require the kernel to panic immediately. And it
passes a struct mce where MCi_STATUS gets written.
However, after having saved a valid status value, the next iteration
of the loop which goes over the MCA banks on the CPU, overwrites the
valid status value because we're using struct mce as storage instead of
a temporary variable.
Which leads to MCE records with an empty status value:
mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000
mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10}
In order to prevent the loss of the status register value, return
immediately when severity is a panic one so that we can panic
immediately with the first fatal MCE logged. This is also the intention
of this function and not to noodle over the banks while a fatal MCE is
already logged.
Tony: read the rest of the MCA bank to populate the struct mce fully.
Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de
It is possible, under obscure circumstances, to convince the ITS driver to
emit a SYNC operation that targets a collection that is not bound to any
redistributor (and the target_address field is zero) because the
corresponding CPU has not been seen yet (the system has been booted with
max_cpus="something small").
If the ITS is using the linear CPU number as the target, this is not a big
deal, as we just end-up issuing a SYNC to CPU0. But if the ITS requires the
physical address of the redistributor (with GITS_TYPER.PTA==1), we end-up
asking the ITS to write to the physical address zero, which is not exactly
a good idea (there has been report of the ITS locking up). This should of
course never happen, but hey, this is SW...
In order to avoid the above disaster, let's track which collections have
been actually initialized, and let's not generate a SYNC if the collection
hasn't been properly bound to a redistributor. Take this opportunity to
spit our a warning, in the hope that someone may report the issue if it
arrises again.
Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-6-marc.zyngier@arm.com
If we failed during a rename exchange operation after starting/joining a
transaction, we would end up replacing the return value, stored in the
local 'ret' variable, with the return value from btrfs_end_transaction().
So this could end up returning 0 (success) to user space despite the
operation having failed and aborted the transaction, because if there are
multiple tasks having a reference on the transaction at the time
btrfs_end_transaction() is called by the rename exchange, that function
returns 0 (otherwise it returns -EIO and not the original error value).
So fix this by not overwriting the return value on error after getting
a transaction handle.
Fixes: cdd1fedf82 ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT")
CC: stable@vger.kernel.org # 4.9+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Pull udf, quota, ext2 fixes from Jan Kara:
"UDF:
- fix an oops due to corrupted disk image
- two small cleanups
quota:
- a fixfor lru handling
- cleanup
ext2:
- a warning about a deprecated mount option"
* tag 'for_v4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Drop unused arguments of udf_delete_aext()
udf: Provide function for calculating dir entry length
udf: Detect incorrect directory size
ext2: add warning when specifying nocheck option
quota: Cleanup list iteration in dqcache_shrink_scan()
quota: reclaim least recently used dquots
Commit:
79832f0b5f ("efi/libstub/tpm: Initialize pointer variables to zero for mixed mode")
fixes a problem with the tpm code on mixed mode (64-bit kernel on 32-bit UEFI),
where 64-bit pointer variables are not fully initialized by the 32-bit EFI code.
A similar problem applies to the efi_physical_addr_t variables which
are written by the ->get_event_log() EFI call. Even though efi_physical_addr_t
is 64-bit everywhere, it seems that some 32-bit UEFI implementations only
fill in the lower 32 bits when passed a pointer to an efi_physical_addr_t
to fill.
This commit initializes these to 0 to, to ensure the upper 32 bits are
0 in mixed mode. This fixes recent kernels sometimes hanging during
early boot on mixed mode UEFI systems.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org> # v4.16+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180622064222.11633-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With gcc 4.1.2 when compiling for 32-bit:
drivers/mtd/devices/mtd_dataflash.c:736: warning: integer constant is too large for ‘long’ type
drivers/mtd/devices/mtd_dataflash.c:737: warning: integer constant is too large for ‘long’ type
Add the missing "ULL" suffixes to fix this.
Fixes: 67e4145ebf ("mtd: dataflash: Add flash_info for AT45DB641E")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Commit 910f8befdf ("xen/pirq: fix error path cleanup when binding
MSIs") fixed a couple of errors in error cleanup path of
xen_bind_pirq_msi_to_irq(). This cleanup allowed a call to
__unbind_from_irq() with an unbound irq, which would result in
triggering the BUG_ON there.
Since there is really no reason for the BUG_ON (xen_free_irq() can
operate on unbound irqs) we can remove it.
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
When a corrupt inode is detected during xfs_iflush_cluster, we can
get a shutdown ASSERT failure like this:
XFS (pmem1): Metadata corruption detected at xfs_symlink_shortform_verify+0x5c/0xa0, inode 0x86627 data fork
XFS (pmem1): Unmount and run xfs_repair
XFS (pmem1): xfs_do_force_shutdown(0x8) called from line 3372 of file fs/xfs/xfs_inode.c. Return address = ffffffff814f4116
XFS (pmem1): Corruption of in-memory data detected. Shutting down filesystem
XFS (pmem1): xfs_do_force_shutdown(0x1) called from line 222 of file fs/xfs/libxfs/xfs_defer.c. Return address = ffffffff814a8a88
XFS (pmem1): xfs_do_force_shutdown(0x1) called from line 222 of file fs/xfs/libxfs/xfs_defer.c. Return address = ffffffff814a8ef9
XFS (pmem1): Please umount the filesystem and rectify the problem(s)
XFS: Assertion failed: xfs_isiflocked(ip), file: fs/xfs/xfs_inode.h, line: 258
.....
Call Trace:
xfs_iflush_abort+0x10a/0x110
xfs_iflush+0xf3/0x390
xfs_inode_item_push+0x126/0x1e0
xfsaild+0x2c5/0x890
kthread+0x11c/0x140
ret_from_fork+0x24/0x30
Essentially, xfs_iflush_abort() has been called twice on the
original inode that that was flushed. This happens because the
inode has been flushed to teh buffer successfully via
xfs_iflush_int(), and so when another inode is detected as corrupt
in xfs_iflush_cluster, the buffer is marked stale and EIO, and
iodone callbacks are run on it.
Running the iodone callbacks walks across the original inode and
calls xfs_iflush_abort() on it. When xfs_iflush_cluster() returns
to xfs_iflush(), it runs the error path for that function, and that
calls xfs_iflush_abort() on the inode a second time, leading to the
above assert failure as the inode is not flush locked anymore.
This bug has been there a long time.
The simple fix would be to just avoid calling xfs_iflush_abort() in
xfs_iflush() if we've got a failure from xfs_iflush_cluster().
However, xfs_iflush_cluster() has magic delwri buffer handling that
means it may or may not have run IO completion on the buffer, and
hence sometimes we have to call xfs_iflush_abort() from
xfs_iflush(), and sometimes we shouldn't.
After reading through all the error paths and the delwri buffer
code, it's clear that the error handling in xfs_iflush_cluster() is
unnecessary. If the buffer is delwri, it leaves it on the delwri
list so that when the delwri list is submitted it sees a shutdown
fliesystem in xfs_buf_submit() and that marks the buffer stale, EIO
and runs IO completion. i.e. exactly what xfs+iflush_cluster() does
when it's not a delwri buffer. Further, marking a buffer stale
clears the _XBF_DELWRI_Q flag on the buffer, which means when
submission of the buffer occurs, it just skips over it and releases
it.
IOWs, the error handling in xfs_iflush_cluster doesn't need to care
if the buffer is already on a the delwri queue or not - it just
needs to mark the buffer stale, EIO and run completions. That means
we can just use the easy fix for xfs_iflush() to avoid the double
abort.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
For passing arbitrary data from user land to the Xen hypervisor the
Xen tools today are using mlock()ed buffers. Unfortunately the kernel
might change access rights of such buffers for brief periods of time
e.g. for page migration or compaction, leading to access faults in the
hypervisor, as the hypervisor can't use the locks of the kernel.
In order to solve this problem add a new device node to the Xen privcmd
driver to easily allocate hypercall buffers via mmap(). The memory is
allocated in the kernel and just mapped into user space. Marked as
VM_IO the user mapping will not be subject to page migration et al.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
When the inode is in extent format, it can't have more extents that
fit in the inode fork. We don't currenty check this, and so this
corruption goes unnoticed by the inode verifiers. This can lead to
crashes operating on invalid in-memory structures.
Attempts to access such a inode will now error out in the verifier
rather than allowing modification operations to proceed.
Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix a typedef, add some braces and breaks to shut up compiler warnings]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Instead of using xfs_bmapi_read to find delalloc extents and then punch
them out using xfs_bunmapi, opencode the loop to iterate over the extents
and call xfs_bmap_del_extent_delay directly. This both simplifies the
code and reduces the number of extent tree lookups required.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Pull drm fixes from Dave Airlie:
"Just run of the mill fixes,
core:
- regression fix in device unplug
qxl:
- regression fix for might sleep in cursor handling
nouveau:
- regression fix in multi-screen cursor handling
amdgpu:
- switch off DC by default on Kaveri and older
- some minor fixes
i915:
- some GEM regression fixes
- doublescan mode fixes
sun4i:
- revert fix for a regression
sii8620 bridge:
- misc fixes"
* tag 'drm-fixes-2018-06-22' of git://anongit.freedesktop.org/drm/drm: (28 commits)
drm/bridge/sii8620: fix display of packed pixel modes in MHL2
drm/amdgpu: Make amdgpu_vram_mgr_bo_invisible_size always accurate
drm/amdgpu: Refactor amdgpu_vram_mgr_bo_invisible_size helper
drm/amdgpu: Update pin_size values before unpinning BO
drm/amdgpu:All UVD instances share one idle_work handle
drm/amdgpu: Don't default to DC support for Kaveri and older
drm/amdgpu: Use kvmalloc_array for allocating VRAM manager nodes array
drm/amd/pp: Fix uninitialized variable
drm/i915: Enable provoking vertex fix on Gen9 systems.
drm/i915: Fix context ban and hang accounting for client
drm/i915: Turn off g4x DP port in .post_disable()
drm/i915: Disallow interlaced modes on g4x DP outputs
drm/i915: Fix PIPESTAT irq ack on i965/g4x
drm/i915: Allow DBLSCAN user modes with eDP/LVDS/DSI
drm/i915/execlists: Avoid putting the error pointer
drm/i915: Apply batch location restrictions before pinning
drm/nouveau/kms/nv50-: cursors always use core channel vram ctxdma
Revert "drm/sun4i: Handle DRM_BUS_FLAG_PIXDATA_*EDGE"
drm/atmel-hlcdc: check stride values in the first plane
drm/bridge/sii8620: fix HDMI cable connection to dongle
...
One of my tests compiles the kernel with gcc 4.5.3, and I hit the
following build error:
include/linux/semaphore.h: In function 'sema_init':
include/linux/semaphore.h:35:17: error: unknown field 'val' specified in initializer
include/linux/semaphore.h:35:17: warning: missing braces around initializer
include/linux/semaphore.h:35:17: warning: (near initialization for '(anonymous).raw_lock.<anonymous>.val')
I bisected it down to:
625e88be1f ("locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'")
... which makes qspinlock have an anonymous union, which makes initializing it special
for older compilers. By adding strategic brackets, it makes the build
happy again.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: 625e88be1f ("locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'")
Link: http://lkml.kernel.org/r/20180621203526.172ab5c4@vmware.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The dst_cid and src_cid are 64 bits, therefore 64 bit accessors should be
used, and in fact in virtio_transport_common.c only 64 bit accessors are
used. Using 32 bit accessors for 64 bit values breaks big endian systems.
This patch fixes a wrong use of le32_to_cpu in virtio_transport_send_pkt.
Fixes: b911682318 ("VSOCK: add loopback to virtio_transport")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function cpdma_desc_pool_create is local to the source and does not
need to be in global scope, so make it static.
Cleans up sparse warning:
warning: symbol 'cpdma_desc_pool_create' was not declared. Should it
be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes using the controller with SDL2.
SDL2 has a naive algorithm to apply the correct settings to a controller.
For X-Box compatible controllers it expects that the controller name
contains a variation of a 'XBOX'-string.
This patch changes the identifier to contain "X-Box" as substring. Tested
with Steam and C-Dogs-SDL which both detect the controller properly after
adding this patch.
Fixes: c1ba08390a ("Input: xpad - add GPD Win 2 Controller USB IDs")
Cc: stable@vger.kernel.org
Signed-off-by: Enno Boland <gottox@voidlinux.eu>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Commit 40f7090bb1 ("Input: elan_i2c_smbus - fix corrupted stack")
fixed most of the functions using i2c_smbus_read_block_data() to
allocate a buffer with the maximum block size. However three
functions were left unchanged:
* In elan_smbus_initialize(), increase the buffer size in the same
way.
* In elan_smbus_calibrate_result(), the buffer is provided by the
caller (calibrate_store()), so introduce a bounce buffer. Also
name the result buffer size.
* In elan_smbus_get_report(), the buffer is provided by the caller
but happens to be the right length. Add a compile-time assertion
to ensure this remains the case.
Cc: <stable@vger.kernel.org> # 3.19+
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Some touchpad has middle key and it will be indicated in bit 2 of packet[0].
We need to fix V4 formation's byte mask to prevent error decoding.
Signed-off-by: KT Liao <kt.liao@emc.com.tw>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The "sector is in requested range" test used to determine whether
sectors should be re-locked or not is done on a variable that is reset
everytime we cross a chip boundary, which can lead to some blocks being
re-locked while the caller expect them to be unlocked.
Fix the check to make sure this cannot happen.
Fixes: 1648eaaa15 ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking")
Cc: stable@vger.kernel.org
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Ross Lagerwall says:
====================
xen-netfront: Fix issues with commit f599c64fdf
Fix a couple of issues with commit f599c64fdf ("xen-netfront: Fix race
between device setup and open").
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Update the features after calling register_netdev() otherwise the
device features are not set up correctly and it not possible to change
the MTU of the device. After this change, the features reported by
ethtool match the device's features before the commit which introduced
the issue and it is possible to change the device's MTU.
Fixes: f599c64fdf ("xen-netfront: Fix race between device setup and open")
Reported-by: Liam Shepherd <liam@dancer.es>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cfi_ppb_unlock() tries to relock all sectors that were locked before
unlocking the whole chip.
This locking used the chip start address + the FULL offset from the
first flash chip, thereby forming an illegal address. Fix that by using
the chip offset(adr).
Fixes: 1648eaaa15 ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking")
Cc: stable@vger.kernel.org
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
do_ppb_xxlock() fails to add chip->start when querying for lock status
(and chip_ready test), which caused false status reports.
Fix that by adding adr += chip->start and adjust call sites
accordingly.
Fixes: 1648eaaa15 ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking")
Cc: stable@vger.kernel.org
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
On one of our production test machine, when running
bpf selftest test_sockmap, I got the following error:
# sudo ./test_sockmap
libbpf: failed to create map (name: 'sock_map'): Operation not permitted
libbpf: failed to load object 'test_sockmap_kern.o'
libbpf: Can't get the 0th fd from program sk_skb1: only -1 instances
......
load_bpf_file: (-1) Operation not permitted
ERROR: (-1) load bpf failed
The error is due to not-big-enough rlimit
struct rlimit r = {10 * 1024 * 1024, RLIM_INFINITY};
The test already includes "bpf_rlimit.h", which sets current
and max rlimit to RLIM_INFINITY. Let us just use it.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The test_kmod.sh script require root privilege for the successful
execution of the test.
This patch is to notify the user about the privilege the script
demands for the successful execution of the test.
Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
If flower filter is created without the skip_sw flag, fl_mask_put()
can race with fl_classify() and we can destroy the mask rhashtable
while a lookup operation is accessing it.
BUG: unable to handle kernel paging request at 00000000000911d1
PGD 0 P4D 0
SMP PTI
CPU: 3 PID: 5582 Comm: vhost-5541 Not tainted 4.18.0-rc1.vanilla+ #1950
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
RIP: 0010:rht_bucket_nested+0x20/0x60
Code: 31 c8 c1 c1 18 29 c8 c3 66 90 8b 4f 04 ba 01 00 00 00 8b 07 48 8b bf 80 00 00 0
RSP: 0018:ffffafc5cfbb7a48 EFLAGS: 00010206
RAX: 0000000000001978 RBX: ffff9f12dff88a00 RCX: 00000000ffff9f12
RDX: 00000000000911d1 RSI: 0000000000000148 RDI: 0000000000000001
RBP: ffff9f12dff88a00 R08: 000000005f1cc119 R09: 00000000a715fae2
R10: ffffafc5cfbb7aa8 R11: ffff9f1cb4be804e R12: ffff9f1265e13000
R13: 0000000000000000 R14: ffffafc5cfbb7b48 R15: ffff9f12dff88b68
FS: 0000000000000000(0000) GS:ffff9f1d3f0c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000911d1 CR3: 0000001575a94006 CR4: 00000000001626e0
Call Trace:
fl_lookup+0x134/0x140 [cls_flower]
fl_classify+0xf3/0x180 [cls_flower]
tcf_classify+0x78/0x150
__netif_receive_skb_core+0x69e/0xa50
netif_receive_skb_internal+0x42/0xf0
tun_get_user+0xdd5/0xfd0 [tun]
tun_sendmsg+0x52/0x70 [tun]
handle_tx+0x2b3/0x5f0 [vhost_net]
vhost_worker+0xab/0x100 [vhost]
kthread+0xf8/0x130
ret_from_fork+0x35/0x40
Modules linked in: act_mirred act_gact cls_flower vhost_net vhost tap sch_ingress
CR2: 00000000000911d1
Fix the above waiting for a RCU grace period before destroying the
rhashtable: we need to use tcf_queue_work(), as rhashtable_destroy()
must run in process context, as pointed out by Cong Wang.
v1 -> v2: use tcf_queue_work to run rhashtable_destroy().
Fixes: 05cd271fd6 ("cls_flower: Support multiple masks per priority")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Positive return value from read_oob() is making false BAD
blocks. For some of the NAND controllers, OOB bytes will be
protected with ECC and read_oob() will return number of bitflips.
If there is any bitflip in ECC protected OOB bytes for BAD block
status page, then that block is getting treated as BAD.
Fixes: c120e75e0e ("mtd: nand: use read_oob() instead of cmdfunc() for bad block check")
Cc: <stable@vger.kernel.org>
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Pull NFS client bugfixes from Trond Myklebust:
"Hightlights include:
- fix an rcu deadlock in nfs_delegation_find_inode()
- fix NFSv4 deadlocks due to not freeing the session slot in
layoutget
- don't send layoutreturn if the layout is already invalid
- prevent duplicate XID allocation
- flexfiles: Don't tie up all the rpciod threads in resends"
* tag 'nfs-for-4.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
pNFS/flexfiles: Process writeback resends from nfsiod context as well
pNFS/flexfiles: Don't tie up all the rpciod threads in resends
sunrpc: Prevent duplicate XID allocation
pNFS: Don't send layoutreturn if the layout is already invalid
pNFS: Always free the session slot on error in nfs4_layoutget_handle_exception
NFS: Fix an rcu deadlock in nfs_delegation_find_inode()
Pull pin control fixes from Linus Walleij:
"Some fallout in the pin control subsystem in the first week after the
merge window, some minor fixes so I'd like to get it to you ASAP.
- fix a serious kernel panic on the Mediatek driver with the external
interrupt controller.
- fix an uninitialized compiler warning in the owl (actions) driver.
- allocation failure in the pinctrl-single driver.
- pointer overwrite problem in the i.MX driver.
- fix a small compiler warning"
* tag 'pinctrl-v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: mt7622: fix a kernel panic when pio don't work as EINT controller
pinctrl: actions: Fix uninitialized error in owl_pin_config_set()
pinctrl: single: Add allocation failure checking of saved_vals
pinctrl: devicetree: Fix pctldev pointer overwrite
pinctrl: mediatek: remove redundant return value check of platform_get_resource()
Jakub Kicinski says:
====================
Two small fixes for error handling in bpftool prog load, first patch
removes a duplicated message, second one frees resources correctly.
Multiple error messages break JSON:
{
"error": "can't pin the object (/sys/fs/bpf/a): File exists"
},{
"error": "failed to pin program"
}
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Remembering to close all descriptors and free memory may not seem
important in a user space tool like bpftool, but if we were to run
in batch mode the consumed resources start to add up quickly. Make
sure program load closes the libbpf object (which unloads and frees
it).
Fixes: 49a086c201 ("bpftool: implement prog load command")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
do_pin_fd() will already print out an error message if something
goes wrong. Printing another error is unnecessary and will break
JSON output, since error messages are full objects:
$ bpftool -jp prog load tracex1_kern.o /sys/fs/bpf/a
{
"error": "can't pin the object (/sys/fs/bpf/a): File exists"
},{
"error": "failed to pin program"
}
Fixes: 49a086c201 ("bpftool: implement prog load command")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Pull hwmon fixes from Guenter Roeck:
- fix a loop limit in nct6775 driver
- disable fan support for Dell XPS13 9333
* tag 'hwmon-for-linus-v4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (nct6775) Fix loop limit
hwmon: (dell-smm) Disable fan support for Dell XPS13 9333
Pull ACPI fixes from Rafael Wysocki:
"These fix a suspend/resume regression in the ACPI driver for Intel
SoCs (LPSS), add a new system wakeup quirk to the ACPI EC driver and
fix an inline stub of a function in the ACPI processor driver that
diverged from the original.
Specifics:
- Fix a suspend/resume regression in the ACPI driver for Intel SoCs
(LPSS) to make it work on systems where some power management
quirks should only be applied for runtime PM and suspend-to-idle
and not for suspend-to-RAM (Rafael Wysocki).
- Add a system wakeup quirk for Thinkpad X1 Carbon 6th to the ACPI EC
driver to avoid drainig battery too fast while suspended to idle on
those systems (Mika Westerberg).
- Fix an inline stub of acpi_processor_ppc_has_changed() to match the
original function definition (Brian Norris)"
* tag 'acpi-4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / processor: Finish making acpi_processor_ppc_has_changed() void
ACPI / EC: Use ec_no_wakeup on Thinkpad X1 Carbon 6th
ACPI / LPSS: Avoid PM quirks on suspend and resume from S3
Pull power management updates from Rafael Wysocki:
"These are mostly fixes, including some fixes for changes made during
the recent merge window and some "stable" material, plus some minor
extensions of the turbostat utility.
Specifics:
- Fix the PM core to avoid introducing a runtime PM usage counter
imbalance when adding device links during driver probe (Rafael
Wysocki).
- Fix the operating performance points (OPP) framework to ensure that
the regulator voltage is always updated as appropriate when
updating clock rates (Waldemar Rymarkiewicz).
- Fix the intel_pstate driver to use correct max/min limits for cores
with differing maximum frequences (Srinivas Pandruvada).
- Fix a typo in the intel_pstate driver documentation (Rafael
Wysocki).
- Fix two issues with the recently added Kryo cpufreq driver (Ilia
Lin).
- Fix two recent regressions and some other minor issues in the
turbostat utility and extend it to provide some more diagnostic
information (Len Brown, Nathan Ciobanu)"
* tag 'pm-4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Documentation: intel_pstate: Fix typo
tools/power turbostat: version 18.06.20
tools/power turbostat: add the missing command line switches
tools/power turbostat: add single character tokens to help
tools/power turbostat: alphabetize the help output
tools/power turbostat: fix segfault on 'no node' machines
tools/power turbostat: add optional APIC X2APIC columns
tools/power turbostat: decode cpuid.1.HT
tools/power turbostat: fix show/hide issues resulting from mis-merge
PM / OPP: Update voltage in case freq == old_freq
cpufreq: intel_pstate: Fix scaling max/min limits with Turbo 3.0
cpufreq: kryo: Add module remove and exit
cpufreq: kryo: Fix possible error code dereference
PM / core: Fix supplier device runtime PM usage counter imbalance
The array ca0132_alt_chmaps is local to the source and does not
need to be in global scope, so make it static.
Cleans up sparse warning:
warning: symbol 'ca0132_alt_chmaps' was not declared. Should it be
static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Non gcc-5 builds with CONFIG_STACK_VALIDATION=y and
SKIP_STACK_VALIDATION=1 fail.
Example output:
/bin/sh: init/.tmp_main.o: Permission denied
commit 96f60dfa58 ("trace: Use -mcount-record for dynamic ftrace"),
added a mismatched endif. This causes cmd_objtool to get mistakenly
set.
Relocate endif to balance the newly added -record-mcount check.
Link: http://lkml.kernel.org/r/20180608214746.136554-1-gthelen@google.com
Fixes: 96f60dfa58 ("trace: Use -mcount-record for dynamic ftrace")
Acked-by: Andi Kleen <ak@linux.intel.com>
Tested-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The syzkaller detected a out-of-bounds issue with the events filter code,
specifically here:
prog[N].pred = NULL; /* #13 */
prog[N].target = 1; /* TRUE */
prog[N+1].pred = NULL;
prog[N+1].target = 0; /* FALSE */
-> prog[N-1].target = N;
prog[N-1].when_to_branch = false;
As that's the first reference to a "N-1" index, it appears that the code got
here with N = 0, which means the filter parser found no filter to parse
(which shouldn't ever happen, but apparently it did).
Add a new error to the parsing code that will check to make sure that N is
not zero before going into this part of the code. If N = 0, then -EINVAL is
returned, and a error message is added to the filter.
Cc: stable@vger.kernel.org
Fixes: 80765597bc ("tracing: Rewrite filter logic to be simpler and faster")
Reported-by: air icy <icytxw@gmail.com>
bugzilla url: https://bugzilla.kernel.org/show_bug.cgi?id=200019
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
If this condition ((BTRFS_I(src)->flags & BTRFS_INODE_NODATASUM) !=
(BTRFS_I(dst)->flags & BTRFS_INODE_NODATASUM))
is hit, we will go to free the uninitialized cmp.src_pages and
cmp.dst_pages.
Fixes: 67b07bd4be ("Btrfs: reuse cmp workspace in EXTENT_SAME ioctl")
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 9d311e11fc ("Btrfs: fiemap: pass correct bytenr when
fm_extent_count is zero") introduced a regression where we no longer
report 0 as the physical offset for inline extents (and other extents
with a special block_start value). This is because it always sets the
variable used to report the physical offset ("disko") as em->block_start
plus some offset, and em->block_start has the value 18446744073709551614
((u64) -2) for inline extents.
This made the btrfs test 004 (from fstests) often fail, for example, for
a file with an inline extent we have the following items in the subvolume
tree:
item 101 key (418 INODE_ITEM 0) itemoff 11029 itemsize 160
generation 25 transid 38 size 1525 nbytes 1525
block group 0 mode 100666 links 1 uid 0 gid 0 rdev 0
sequence 0 flags 0x2(none)
atime 1529342058.461891730 (2018-06-18 18:14:18)
ctime 1529342058.461891730 (2018-06-18 18:14:18)
mtime 1529342058.461891730 (2018-06-18 18:14:18)
otime 1529342055.869892885 (2018-06-18 18:14:15)
item 102 key (418 INODE_REF 264) itemoff 11016 itemsize 13
index 25 namelen 3 name: fc7
item 103 key (418 EXTENT_DATA 0) itemoff 9470 itemsize 1546
generation 38 type 0 (inline)
inline extent data size 1525 ram_bytes 1525 compression 0 (none)
Then when test 004 invoked fiemap against the file it got a non-zero
physical offset:
$ filefrag -v /mnt/p0/d4/d7/fc7
Filesystem type is: 9123683e
File size of /mnt/p0/d4/d7/fc7 is 1525 (1 block of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 4095: 18446744073709551614.. 4093: 4096: last,not_aligned,inline,eof
/mnt/p0/d4/d7/fc7: 1 extent found
This resulted in the test failing like this:
btrfs/004 49s ... [failed, exit status 1]- output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/004.out.bad)
--- tests/btrfs/004.out 2016-08-23 10:17:35.027012095 +0100
+++ /home/fdmanana/git/hub/xfstests/results//btrfs/004.out.bad 2018-06-18 18:15:02.385872155 +0100
@@ -1,3 +1,10 @@
QA output created by 004
*** test backref walking
-*** done
+./tests/btrfs/004: line 227: [: 7.55578637259143e+22: integer expression expected
+ERROR: 7.55578637259143e+22 is not a valid numeric value.
+unexpected output from
+ /home/fdmanana/git/hub/btrfs-progs/btrfs inspect-internal logical-resolve -s 65536 -P 7.55578637259143e+22 /home/fdmanana/btrfs-tests/scratch_1
...
(Run 'diff -u tests/btrfs/004.out /home/fdmanana/git/hub/xfstests/results//btrfs/004.out.bad' to see the entire diff)
Ran: btrfs/004
The large number in scientific notation reported as an invalid numeric
value is the result from the filter passed to perl which multiplies the
physical offset by the block size reported by fiemap.
So fix this by ensuring the physical offset is always set to 0 when we
are processing an extent with a special block_start value.
Fixes: 9d311e11fc ("Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
nvme requires an sg table allocation for each request. If the request
is large, then the allocation can become quite large. For instance,
with our default software settings of 1280KB IO size, we'll need
10248 bytes of sg table. That turns into a 2nd order allocation,
which we can't always guarantee. If we fail the allocation, blk-mq
will retry it later. But there's no guarantee that we'll EVER be
able to allocate that much contigious memory.
Limit the IO size such that we never need more than a single page
of memory. That's a lot faster and more reliable. Then back that
allocation with a mempool, so that we know we'll always be able
to succeed the allocation at some point.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There is very little point in trying to support the 32bit KVM/arm API
on arm64, and this was never an anticipated use case.
Let's make it clear by not selecting KVM_COMPAT.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The current behaviour of the compat ioctls is a bit odd.
We provide a compat_ioctl method when KVM_COMPAT is set, and NULL
otherwise. But NULL means that the normal, non-compat ioctl should
be used directly for compat tasks, and there is no way to actually
prevent a compat task from issueing KVM ioctls.
This patch changes this behaviour, by always registering a compat_ioctl
method, even if KVM_COMPAT is not selected. In that case, the callback
will always return -EINVAL.
Fixes: de8e5d7440 ("KVM: Disable compat ioctl for s390")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We had commit 06e226c7fb ("clk: sunxi-ng: Move all clock types to a
library") and commit 799c434154 ("kbuild: thin archives make default
for all archs") in the same development cycle, from different trees.
With migration to the thin archive, the entire drivers/clk/sunxi-ng/lib.a
is linked to the vmlinux. This does not break build, but we do not get
any size saving.
However, we do not need to go back to the individual Kconfig options.
The default configuration pulls in all (or most) of the CCU parts anyway.
Also, once we enable CONFIG_LD_DEAD_CODE_DATA_ELIMINATION, we can simply
list all files with obj-y, and the linker will drop all unused functions
by itself.
After the long discussion [1], people there agreed to fix this, but
nobody sent a patch after all. I am doing it now.
I lifted up CONFIG_SUNXI_CCU to drivers/clk/Makefile because everything
in drivers/clk/sunxi-ng/ depends on SUNXI_CCU.
[1] https://patchwork.kernel.org/patch/9796521/
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
There is race between nvme_remove and nvme_reset_work that can
lead to io hang.
nvme_remove nvme_reset_work
-> nvme_remove_dead_ctrl
-> nvme_dev_disable
-> quiesce request_queue
-> queue remove_work
-> cancel_work_sync reset_work
-> nvme_remove_namespaces
-> splice ctrl->namespaces
nvme_remove_dead_ctrl_work
-> nvme_kill_queues
-> nvme_ns_remove do nothing
-> blk_cleanup_queue
-> blk_freeze_queue
Finally, the request_queue is quiesced state when wait freeze,
we will get io hang here. To fix it, move the nvme_kill_queues
from nvme_remove_dead_ctrl_work to nvme_remove_dead_ctrl.
Suggested-by: Keith Busch <keith.busch@linux.intel.com>
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Mark Rutland noticed that GCC optimization passes have the potential to elide
necessary invocations of the array_index_mask_nospec() instruction sequence,
so mark the asm() volatile.
Mark explains:
"The volatile will inhibit *some* cases where the compiler could lift the
array_index_nospec() call out of a branch, e.g. where there are multiple
invocations of array_index_nospec() with the same arguments:
if (idx < foo) {
idx1 = array_idx_nospec(idx, foo)
do_something(idx1);
}
< some other code >
if (idx < foo) {
idx2 = array_idx_nospec(idx, foo);
do_something_else(idx2);
}
... since the compiler can determine that the two invocations yield the same
result, and reuse the first result (likely the same register as idx was in
originally) for the second branch, effectively re-writing the above as:
if (idx < foo) {
idx = array_idx_nospec(idx, foo);
do_something(idx);
}
< some other code >
if (idx < foo) {
do_something_else(idx);
}
... if we don't take the first branch, then speculatively take the second, we
lose the nospec protection.
There's more info on volatile asm in the GCC docs:
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
"
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: babdde2698 ("x86: Implement array_index_mask_nospec")
Link: https://lkml.kernel.org/lkml/152838798950.14521.4893346294059739135.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
These are a stable-candidate suspend/resume fix of the ACPI driver for
Intel SoCs (LPSS) and an inline stub fix for the ACPI processor driver.
* acpi-soc:
ACPI / LPSS: Avoid PM quirks on suspend and resume from S3
* acpi-processor:
ACPI / processor: Finish making acpi_processor_ppc_has_changed() void
These are turbostat utility updates for 4.18-rc2 including two fixes
for recent regressions and some minor extensions.
* pm-tools:
tools/power turbostat: version 18.06.20
tools/power turbostat: add the missing command line switches
tools/power turbostat: add single character tokens to help
tools/power turbostat: alphabetize the help output
tools/power turbostat: fix segfault on 'no node' machines
tools/power turbostat: add optional APIC X2APIC columns
tools/power turbostat: decode cpuid.1.HT
tools/power turbostat: fix show/hide issues resulting from mis-merge
Xen PV domain kernel is not by design affected by meltdown as it's
enforcing split CR3 itself. Let's not report such systems as "Vulnerable"
in sysfs (we're also already forcing PTI to off in X86_HYPER_XEN_PV cases);
the security of the system ultimately depends on presence of mitigation in
the Hypervisor, which can't be easily detected from DomU; let's report
that.
Reported-and-tested-by: Mike Latimer <mlatimer@suse.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1806180959080.6203@cbobk.fhfr.pm
[ Merge the user-visible string into a single line. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
These are a PM core fix and an OPP framework fix for 4.18-rc2,
both "stable" material.
* pm-core:
PM / core: Fix supplier device runtime PM usage counter imbalance
* pm-opp:
PM / OPP: Update voltage in case freq == old_freq
Now that platform.c only has the GPIO reset handling left, move the
initcall to reset.c and remove platform.c.
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
The call to of_platform_bus_probe has no effect because the DT core
already probes default buses like "simple-bus" before this call.
Michal Simek said 'xlnx,compound' hasn't been used in a long time, so
that match entry isn't needed.
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Henning Kühn reported that the discrete AMD GPU on his hybrid graphics
laptop no longer runtime-suspends due to the recent commit
07f4f97d7b ("vga_switcheroo: Use device link for HDA controller").
The root cause is that the HDMI codec on AMD GPU doesn't support
CLKSTOP and EPSS, which are currently mandatory for powering down the
HD-audio link at runtime suspend. Because the HD-audio link is still
up, HD-audio controller driver blocks the transition to D3.
For addressing the regression, this patch adds a new flag to indicate
the forced link-down, and sets it for AMD HDMI codecs appropriately
in the codec driver.
Fixes: 07f4f97d7b ("vga_switcheroo: Use device link for HDA controller")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106957
Reported-by: Lukas Wunner <lukas@wunner.de>
Reported-and-tested-by: Henning Kühn <prg@cooco.de>
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reverts the following commit:
b0108f9e93 ("kexec: purgatory: add clean-up for purgatory directory")
... which incorrectly stated that the kexec-purgatory.c and purgatory.ro files
were not removed after 'make mrproper'.
In fact, they are. You can confirm it after reverting it.
$ make mrproper
$ touch arch/x86/purgatory/kexec-purgatory.c
$ touch arch/x86/purgatory/purgatory.ro
$ make mrproper
CLEAN arch/x86/purgatory
$ ls arch/x86/purgatory/
entry64.S Makefile purgatory.c setup-x86_64.S stack.S string.c
This is obvious from the build system point of view.
arch/x86/Makefile adds 'arch/x86' to core-y.
Hence 'make clean' descends like this:
arch/x86/Kbuild
-> arch/x86/purgatory/Makefile
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/1529401422-28838-2-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is a panic in armv8a server(QDF2400) under memory pressure tests
(start 20 guests and run memhog in the host).
---------------------------------begin--------------------------------
[35380.800950] BUG: Bad page state in process qemu-kvm pfn:dd0b6
[35380.805825] page:ffff7fe003742d80 count:-4871 mapcount:-2126053375
mapping: (null) index:0x0
[35380.815024] flags: 0x1fffc00000000000()
[35380.818845] raw: 1fffc00000000000 0000000000000000 0000000000000000
ffffecf981470000
[35380.826569] raw: dead000000000100 dead000000000200 ffff8017c001c000
0000000000000000
[35380.805825] page:ffff7fe003742d80 count:-4871 mapcount:-2126053375
mapping: (null) index:0x0
[35380.815024] flags: 0x1fffc00000000000()
[35380.818845] raw: 1fffc00000000000 0000000000000000 0000000000000000
ffffecf981470000
[35380.826569] raw: dead000000000100 dead000000000200 ffff8017c001c000
0000000000000000
[35380.834294] page dumped because: nonzero _refcount
[...]
--------------------------------end--------------------------------------
The root cause might be what was fixed at [1]. But from the KVM points of
view, it would be better if the issue was caught earlier.
If the size is not PAGE_SIZE aligned, unmap_stage2_range might unmap the
wrong(more or less) page range. Hence it caused the "BUG: Bad page
state"
Let's WARN in that case, so that the issue is obvious.
[1] https://lkml.org/lkml/2018/5/3/1042
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: jia.he@hxt-semitech.com
[maz: tidied up commit message]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We forgot to set the error code on this error path.
Fixes: 4a23fc8cc0 ("ALSA: lx6464es: add error handling for pci_ioremap_bar")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit e6b673b ("KVM: arm64: Optimise FPSIMD handling to reduce
guest/host thrashing") uses fpsimd_save() to save the FPSIMD state
for a vcpu when scheduling the vcpu out. However, currently
current's value of TIF_SVE is restored before calling fpsimd_save()
which means that fpsimd_save() may erroneously attempt to save SVE
state from the vcpu. This enables current's vector state to be
polluted with guest data. current->thread.sve_state may be
unallocated or not large enough, so this can also trigger a NULL
dereference or buffer overrun.
Instead of this, TIF_SVE should be configured properly for the
guest when calling fpsimd_save() with the vcpu context loaded.
This patch ensures this by delaying restoration of current's
TIF_SVE until after the call to fpsimd_save().
Fixes: e6b673b741 ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Commit e6b673b ("KVM: arm64: Optimise FPSIMD handling to reduce
guest/host thrashing") attempts to restore the configuration of
userspace SVE trapping via a call to fpsimd_bind_task_to_cpu(), but
the logic for determining when to do this is not correct.
The patch makes the errnoenous assumption that the only task that
may try to enter userspace with the currently loaded FPSIMD/SVE
register content is current. This may not be the case however: if
some other user task T is scheduled on the CPU during the execution
of the KVM run loop, and the vcpu does not try to use the registers
in the meantime, then T's state may be left there intact. If T
happens to be the next task to enter userspace on this CPU then the
hooks for reloading the register state and configuring traps will
be skipped.
(Also, current never has SVE state at this point anyway and should
always have the trap enabled, as a side-effect of the ioctl()
syscall needed to reach the KVM run loop in the first place.)
This patch instead restores the state of the EL0 trap from the
state observed at the most recent vcpu_load(), ensuring that the
trap is set correctly for the loaded context (if any).
Fixes: e6b673b741 ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Commit e6b673b ("KVM: arm64: Optimise FPSIMD handling to reduce
guest/host thrashing") introduces a specific helper
kvm_arch_vcpu_put_fp() for saving the vcpu FPSIMD state during
vcpu_put().
This function uses local_bh_disable()/_enable() to protect the
FPSIMD context manipulation from interruption by softirqs.
This approach is not correct, because vcpu_put() can be invoked
either from the KVM host vcpu thread (when exiting the vcpu run
loop), or via a preempt notifier. In the former case, only
preemption is disabled. In the latter case, the function is called
from inside __schedule(), which means that IRQs are disabled.
Use of local_bh_disable()/_enable() with IRQs disabled is considerd
an error, resulting in lockdep splats while running VMs if lockdep
is enabled.
This patch disables IRQs instead of attempting to disable softirqs,
avoiding the problem of calling local_bh_enable() with IRQs
disabled in the __schedule() path. This creates an additional
interrupt blackout during vcpu run loop exit, but this is the rare
case and the blackout latency is still less than that of
__schedule().
Fixes: e6b673b741 ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing")
Reported-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Currently we have a couple of helpers to manipulate bits in particular
sysregs:
* config_sctlr_el1(u32 clear, u32 set)
* change_cpacr(u64 val, u64 mask)
The parameters of these differ in naming convention, order, and size,
which is unfortunate. They also differ slightly in behaviour, as
change_cpacr() skips the sysreg write if the bits are unchanged, which
is a useful optimization when sysreg writes are expensive.
Before we gain yet another sysreg manipulation function, let's
unify these with a common helper, providing a consistent order for
clear/set operands, and the write skipping behaviour from
change_cpacr(). Code will be migrated to the new helper in subsequent
patches.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When booting a 64 KB pages kernel on a ACPI GICv3 system that
implements support for v2 emulation, the following warning is
produced
GICV size 0x2000 not a multiple of page size 0x10000
and support for v2 emulation is disabled, preventing GICv2 VMs
from being able to run on such hosts.
The reason is that vgic_v3_probe() performs a sanity check on the
size of the window (it should be a multiple of the page size),
while the ACPI MADT parsing code hardcodes the size of the window
to 8 KB. This makes sense, considering that ACPI does not bother
to describe the size in the first place, under the assumption that
platforms implementing ACPI will follow the architecture and not
put anything else in the same 64 KB window.
So let's just drop the sanity check altogether, and assume that
the window is at least 64 KB in size.
Fixes: 9097773245 ("KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The audio divider is one based. This offset was mistakenly dropped from
recalc_rate() when migrating to clk_regmap.
Fixes: 88a4e12836 ("clk: meson: migrate the audio divider clock to clk_regmap")
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Rather than leaving io queues quiesced after tearing down an association,
restart them. This allows ios to be replayed, with fastfail ios terminating
and non-fastfail getting into loops of retry.
This follows rdma's lead.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimber.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When delta passed to gem_ptp_adjtime is negative, the sign is
maintained in the ns_to_timespec64 conversion. Hence timespec_add
should be used directly. timespec_sub will just subtract the negative
value thus increasing the time difference.
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 296d485680 ("ipvlan: inherit MTU from master device") adjusted
the mtu from the master device when creating a ipvlan device, but it
would also override the mtu value set in rtnl_create_link. It causes
IFLA_MTU param not to take effect.
So this patch is to not adjust the mtu if IFLA_MTU param is set when
creating a ipvlan device.
Fixes: 296d485680 ("ipvlan: inherit MTU from master device")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently it is incrementing SctpFragUsrMsgs when the user message size
is of the exactly same size as the maximum fragment size, which is wrong.
The fix is to increment it only when user message is bigger than the
maximum fragment size.
Fixes: bfd2e4b873 ("sctp: refactor sctp_datamsg_from_user")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 9facc33687 ("bpf: reject any prog that failed read-only lock")
offsetof(struct bpf_binary_header, image) became 3 instead of 4,
breaking powerpc BPF badly, since instructions need to be word aligned.
Fixes: 9facc33687 ("bpf: reject any prog that failed read-only lock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When injecting frames in the Ocelot switch driver an injection header
(IFH) should be used to configure various parameters related to a given
frame, such as the port onto which the frame should be departed or its
vlan id. Other parameters in the switch configuration can led to an
injected frame being sent without an IFH but this led to various issues
as the per-frame parameters are then not used. This is especially true
when using multiple ports for injection.
The IFH was injected with the wrong endianness which led to the switch
not taking it into account as the IFH_INJ_BYPASS bit was then unset.
(The bit tells the switch to use the IFH over its internal
configuration). This patch fixes it.
In addition to the endianness fix, the IFH is also fixed. As it was
(unwillingly) unused, some of its fields were not configured the right
way.
Fixes: a556c76adc ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device tree based systems without of_dev_auxdata will have the mdio
device named differently than "davinci_mdio(.0)". In this case use the
device's parent's compatible string for matching
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If NBD_DISCONNECT_ON_CLOSE is set on a device, then the driver will
issue a disconnect from nbd_release if the device has no remaining
bdev->bd_openers.
Fix ret val so reconfigure with only setting the flag succeeds.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In function strp_data_ready(), it is useless to call queue_work if
the state of strparser is already paused. The state checking should
be done before calling queue_work. The change reduces the context
switches and improves the ktls-rx throughput by approx 20% (measured
on cortex-a53 based platform).
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add fragments to pass bridge and vlan tests.
Fixes: 33b01b7b4f ("selftests: add rtnetlink test script")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use $(OBJDUMP) instead of literal 'objdump' to avoid
using host toolchain when cross compiling.
Fixes: 421780fd49 ("bpfilter: fix build error")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
machine_desc->init_per_cpu() hook is supposed to be per cpu
initialization and would seem to apply equally to UP and/or SMP.
Infact the comment in header file seems to suggest it works for
UP too, which was not the case and this patch.
This enables !CONFIG_SMP build for platforms such as hsdk.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: trimmeed changelog]
Pull turbostat utility changes for 4.18-rc2 from Len Brown.
"This includes two regression fixes, plus a couple more random, but
worthy, patches."
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 18.06.20
tools/power turbostat: add the missing command line switches
tools/power turbostat: add single character tokens to help
tools/power turbostat: alphabetize the help output
tools/power turbostat: fix segfault on 'no node' machines
tools/power turbostat: add optional APIC X2APIC columns
tools/power turbostat: decode cpuid.1.HT
tools/power turbostat: fix show/hide issues resulting from mis-merge
Pull rdma fixes from Jason Gunthorpe:
"Here are eight fairly small fixes collected over the last two weeks.
Regression and crashing bug fixes:
- mlx4/5: Fixes for issues found from various checkers
- A resource tracking and uverbs regression in the core code
- qedr: NULL pointer regression found during testing
- rxe: Various small bugs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/rxe: Fix missing completion for mem_reg work requests
RDMA/core: Save kernel caller name when creating CQ using ib_create_cq()
IB/uverbs: Fix ordering of ucontext check in ib_uverbs_write
IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()'
RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CM
IB/mlx5: Fix return value check in flow_counters_set_data()
IB/mlx5: Fix memory leak in mlx5_ib_create_flow
IB/rxe: avoid double kfree skb
Pull networking fixes from David Miller:
1) Fix crash on bpf_prog_load() errors, from Daniel Borkmann.
2) Fix ATM VCC memory accounting, from David Woodhouse.
3) fib6_info objects need RCU freeing, from Eric Dumazet.
4) Fix SO_BINDTODEVICE handling for TCP sockets, from David Ahern.
5) Fix clobbered error code in enic_open() failure path, from
Govindarajulu Varadarajan.
6) Propagate dev_get_valid_name() error returns properly, from Li
RongQing.
7) Fix suspend/resume in davinci_emac driver, from Bartosz Golaszewski.
8) Various act_ife fixes (recursive locking, IDR leaks, etc.) from
Davide Caratti.
9) Fix buggy checksum handling in sungem driver, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
ip: limit use of gso_size to udp
stmmac: fix DMA channel hang in half-duplex mode
net: stmmac: socfpga: add additional ocp reset line for Stratix10
net: sungem: fix rx checksum support
bpfilter: ignore binary files
bpfilter: fix build error
net/usb/drivers: Remove useless hrtimer_active check
net/sched: act_ife: preserve the action control in case of error
net/sched: act_ife: fix recursive lock and idr leak
net: ethernet: fix suspend/resume in davinci_emac
net: propagate dev_get_valid_name return code
enic: do not overwrite error code
net/tcp: Fix socket lookups with SO_BINDTODEVICE
ptp: replace getnstimeofday64() with ktime_get_real_ts64()
net/ipv6: respect rcu grace period before freeing fib6_info
net: net_failover: fix typo in net_failover_slave_register()
ipvlan: use ETH_MAX_MTU as max mtu
net: hamradio: use eth_broadcast_addr
enic: initialize enic->rfs_h.lock in enic_probe
MAINTAINERS: Add Sam as the maintainer for NCSI
...
resp->num is the number of tokens in resp->tok[]. It gets set in
response_parse(). So if n == resp->num then we're reading beyond the
end of the data.
Fixes: 455a7b238c ("block: Add Sed-opal library")
Reviewed-by: Scott Bauer <scott.bauer@intel.com>
Tested-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Sort the command line arguments output of help() in
alphabetical order in line with other linux tools.
Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Running turbostat on machines that don't expose nodes
in sysfs (no /sys/bus/node) causes a segfault or a -nan
value diesplayed in the log. This is caused by
physical_node_id being reported as -1 and logical_node_id
being calculated as a negative number resulting in the new
GET_THREAD/GET_CORE returning an incorrect address.
Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Add APIC and X2APIC columns to the topology section.
They are disabled-by-default -- enable like so:
--debug
or
--enable APIC,X2APIC
Signed-off-by: Len Brown <len.brown@intel.com>
The --show and --hide options failed on "Node", which was listed as "Node%".
The --show and --hide options were generally fouled-up do due to come
content merges that scrambled the list of column name indexes.
Signed-off-by: Len Brown <len.brown@intel.com>
If rq_state == ARRAY_SIZE() then we read one element beyond the end of
the blk_mq_rq_state_name_array[] array.
Fixes: ec6dcf63c5 ("blk-mq-debugfs: Show more request state information")
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Controllers that are not yet enabled should not really enforce keep alive
timeouts, but we still want to track a timeout and cleanup in case a host
died before it enabled the controller. Hence, simply reset the keep
alive timer when the controller is enabled.
Suggested-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
That is user argument, and theoretically controller limits can change
over time (over reconnects/resets). Instead, use the sqsize controller
attribute to check queue depth boundaries and use it to the tagset
allocation.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The race is between completing the request at error recovery work and
rdma completions. If we cancel the request before getting the good
rdma completion we get a NULL deref of the request MR at
nvme_rdma_process_nvme_rsp().
When Canceling the request we return its mr to the mr pool (set mr to
NULL) and also unmap its data. Canceling the requests while the rdma
queues are active is not safe. Because rdma queues are active and we
get good rdma completions that can use the mr pointer which may be NULL.
Completing the request too soon may lead also to performing DMA to/from
user buffers which might have been already unmapped.
The commit fixes the race by draining the QP before starting the abort
commands mechanism.
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
If nvme_rdma_configure_admin_queue fails before we allocated
the async event buffer, we will falsly free it because
nvme_rdma_free_queue is freeing it. Fix it by allocating the buffer right
after nvme_rdma_alloc_queue and free it right before nvme_rdma_queue_free
to maintain orderly reverse cleanup sequence.
Reported-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Failures after nvme_init_ctrl will defer resource cleanups to .free_ctrl
when the reference is released, hence we should not free the controller
queues for these failures.
Fix that by moving controller queues allocation before controller
initialization and correctly freeing them for failures before
initialization and skip them for failures after initialization.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We should be returning "retval". The "mrst_rtc.rtc" variable is a valid
pointer.
Fixes: 32b41f93dc ("rtc: mrst: switch to devm functions")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
fpu__drop() has an explicit fwait which under some conditions can trigger a
fixable FPU exception while in kernel. Thus, we should attempt to fixup the
exception first, and only call notify_die() if the fixup failed just like
in do_general_protection(). The original call sequence incorrectly triggers
KDB entry on debug kernels under particular FPU-intensive workloads.
Andy noted, that this makes the whole conditional irq enable thing even
more inconsistent, but fixing that it outside the scope of this.
Signed-off-by: Siarhei Liakh <siarhei.liakh@concurrent-rt.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Borislav Petkov" <bpetkov@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/DM5PR11MB201156F1CAB2592B07C79A03B17D0@DM5PR11MB2011.namprd11.prod.outlook.com
Detect when a directory entry is (possibly partially) beyond directory
size and return EIO in that case since it means the filesystem is
corrupted. Otherwise directory operations can further corrupt the
directory and possibly also oops the kernel.
CC: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
CC: stable@vger.kernel.org
Reported-and-tested-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
The option nocheck(nocheck/check=none) is useless but considering
backwards compatibility it's better to print warning for a while
before completely remove from the code.
This patch add proper warning message for option 'nocheck' and
remove unnecessary comment/function declaration which is used for
removed option 'check'.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Use list_first_entry() and list_empty() instead of opencoded variants.
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Jan Kara <jack@suse.cz>
The dquots in the free_dquots list are not reclaimed in LRU way.
put_dquot_last() puts entries to the tail and dqcache_shrink_scan()
frees from the tail. Free unreferenced dquots in LRU order because it
seems more reasonable than freeing most recently used.
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Commit bca5f557dc "ACPI / processor: Make acpi_processor_ppc_has_changed()
void" changed one of the declarations of acpi_processor_ppc_has_changed()
to return void, but the !CPU_FREQ version still returns int. Let's return
void to be consistent.
Fixes: bca5f557dc "ACPI / processor: Make acpi_processor_ppc_has_changed() void"
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull HID fixes from Jiri Kosina:
- Wacom 2nd-gen Intuos Pro large Y axis handling fix from Jason Gerecke
- fix for hibernation in Intel ISH driver, from Even Xu
- crash fix for hid-steam driver, from Rodrigo Rivas Costa
- new device ID addition to google-hammer driver
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: wacom: Correct logical maximum Y for 2nd-gen Intuos Pro large
HID: intel_ish-hid: ipc: register more pm callbacks to support hibernation
HID: steam: use hid_device.driver_data instead of hid_set_drvdata()
HID: google: Add support for whiskers
Pull dma-mapping rename from Christoph Hellwig:
"Move all the dma-mapping code to kernel/dma and lose their dma-*
prefixes"
* tag 'dma-rename-4.18' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: move all DMA mapping code to kernel/dma
dma-mapping: use obj-y instead of lib-y for generic dma ops
The HID descriptor for the 2nd-gen Intuos Pro large (PTH-860) contains
a typo which defines an incorrect logical maximum Y value. This causes
a small portion of the bottom of the tablet to become unusable (both
because the area is below the "bottom" of the tablet and because
'wacom_wac_event' ignores out-of-range values). It also results in a
skewed aspect ratio.
To fix this, we add a quirk to 'wacom_usage_mapping' which overwrites
the data with the correct value.
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
CC: stable@vger.kernel.org # v4.10+
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Current ISH driver only registers suspend/resume PM callbacks which don't
support hibernation (suspend to disk). Basically after hiberation, the ISH
can't resume properly and user may not see sensor events (for example: screen
rotation may not work).
User will not see a crash or panic or anything except the following message
in log:
hid-sensor-hub 001F:8086:22D8.0001: timeout waiting for response from ISHTP device
So this patch adds support for S4/hiberbation to ISH by using the
SIMPLE_DEV_PM_OPS() MACRO instead of struct dev_pm_ops directly. The suspend
and resume functions will now be used for both suspend to RAM and hibernation.
If power management is disabled, SIMPLE_DEV_PM_OPS will do nothing, the suspend
and resume related functions won't be used, so mark them as __maybe_unused to
clarify that this is the intended behavior, and remove #ifdefs for power
management.
Cc: stable@vger.kernel.org
Signed-off-by: Even Xu <even.xu@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
When creating the low-level hidraw device, the reference to steam_device
was stored using hid_set_drvdata(). But this value is not guaranteed to
be kept when set before calling probe. If this pointer is reset, it
crashes when opening the emulated hidraw device.
It looks like hid_set_drvdata() is for users "avobe" this hid_device,
while hid_device.driver_data it for users "below" this one.
In this case, we are creating a virtual hidraw device, so we must use
hid_device.driver_data.
Signed-off-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com>
Tested-by: Mariusz Ceier <mceier+kernel@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The rewrite of the cmdline fetching missed the fact that we used to also
return the final terminating NUL character of the last argument. I
hadn't noticed, and none of the tools I tested cared, but something
obviously must care, because Michal Kubecek noticed the change in
behavior.
Tweak the "find the end" logic to actually include the NUL character,
and once past the eend of argv, always start the strnlen() at the
expected (original) argument end.
This whole "allow people to rewrite their arguments in place" is a nasty
hack and requires that odd slop handling at the end of the argv array,
but it's our traditional model, so we continue to support it.
Repored-and-bisected-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-and-tested-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ipcm(6)_cookie field gso_size is set only in the udp path. The ip
layer copies this to cork only if sk_type is SOCK_DGRAM. This check
proved too permissive. Ping and l2tp sockets have the same type.
Limit to sockets of type SOCK_DGRAM and protocol IPPROTO_UDP to
exclude ping sockets.
v1 -> v2
- remove irrelevant whitespace changes
Fixes: bec1f6f697 ("udp: generate gso with UDP_SEGMENT")
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW does not support Half-duplex mode in multi-queue
scenario. Fix it by not advertising the Half-Duplex
mode if multi-queue enabled.
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Stratix10 platform has an additional reset line, OCP(Open Core Protocol),
that also needs to get deasserted for the stmmac ethernet controller to work.
Thus we need to update the Kconfig to include ARCH_STRATIX10 in order to build
dwmac-socfpga.
Also, remove the redundant check for the reset controller pointer. The
reset driver already checks for the pointer and returns 0 if the pointer
is NULL.
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 88078d98d1 ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
are friends"), sungem owners reported the infamous "eth0: hw csum failure"
message.
CHECKSUM_COMPLETE has in fact never worked for this driver, but this
was masked by the fact that upper stacks had to strip the FCS, and
therefore skb->ip_summed was set back to CHECKSUM_NONE before
my recent change.
Driver configures a number of bytes to skip when the chip computes
the checksum, and for some reason only half of the Ethernet header
was skipped.
Then a second problem is that we should strip the FCS by default,
unless the driver is updated to eventually support NETIF_F_RXFCS in
the future.
Finally, a driver should check if NETIF_F_RXCSUM feature is enabled
or not, so that the admin can turn off rx checksum if wanted.
Many thanks to Andreas Schwab and Mathieu Malaterre for their
help in debugging this issue.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
SPC5r17 states that the contents of the ADDITIONAL LENGTH field are not
altered based on the allocation length, so always calculate and pack the
full key list length even if the list itself is truncated.
According to Maged:
Yes it fixes the "Storage Spaces Persistent Reservation" test in the
Windows 2016 Server Failover Cluster validation suites when having
many connections that result in more than 8 registrations. I tested
your patch on 4.17 with iblock.
This behaviour can be tested using the libiscsi PrinReadKeys.Truncate test.
Cc: stable@vger.kernel.org
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Maged Mokhtar <mmokhtar@petasan.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
net/bpfilter/bpfilter_umh is a binary file generated when bpfilter is
enabled, add it to .gitignore to avoid committing it.
Fixes: d2ba09c17a ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bpfilter Makefile assumes that the system locale is en_US, and the
parsing of objdump output fails.
Set LC_ALL=C and, while at it, rewrite the objdump parsing so it spawns
only 2 processes instead of 7.
Fixes: d2ba09c17a ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code does:
if (hrtimer_active(&t))
hrtimer_cancel(&t);
However, hrtimer_cancel() checks if the timer is active, so the
test above is pointless.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
in the following script
# tc actions add action ife encode allow prio pass index 42
# tc actions replace action ife encode allow tcindex drop index 42
the action control should remain equal to 'pass', if the kernel failed
to replace the TC action. Pospone the assignment of the action control,
to ensure it is not overwritten in the error path of tcf_ife_init().
Fixes: ef6980b6be ("introduce IFE action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a recursive lock warning [1] can be observed with the following script,
# $TC actions add action ife encode allow prio pass index 42
IFE type 0xED3E
# $TC actions replace action ife encode allow tcindex pass index 42
in case the kernel was unable to run the last command (e.g. because of
the impossibility to load 'act_meta_skbtcindex'). For a similar reason,
the kernel can leak idr in the error path of tcf_ife_init(), because
tcf_idr_release() is not called after successful idr reservation:
# $TC actions add action ife encode allow tcindex index 47
IFE type 0xED3E
RTNETLINK answers: No such file or directory
We have an error talking to the kernel
# $TC actions add action ife encode allow tcindex index 47
IFE type 0xED3E
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# $TC actions add action ife encode use mark 7 type 0xfefe pass index 47
IFE type 0xFEFE
RTNETLINK answers: No space left on device
We have an error talking to the kernel
Since tcfa_lock is already taken when the action is being edited, a call
to tcf_idr_release() wrongly makes tcf_idr_cleanup() take the same lock
again. On the other hand, tcf_idr_release() needs to be called in the
error path of tcf_ife_init(), to undo the last tcf_idr_create() invocation.
Fix both problems in tcf_ife_init().
Since the cleanup() routine can now be called when ife->params is NULL,
also add a NULL pointer check to avoid calling kfree_rcu(NULL, rcu).
[1]
============================================
WARNING: possible recursive locking detected
4.17.0-rc4.kasan+ #417 Tainted: G E
--------------------------------------------
tc/3932 is trying to acquire lock:
000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_cleanup+0x19/0x80 [act_ife]
but task is already holding lock:
000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&p->tcfa_lock)->rlock);
lock(&(&p->tcfa_lock)->rlock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by tc/3932:
#0: 000000007ca8e990 (rtnl_mutex){+.+.}, at: tcf_ife_init+0xf61/0x13c0 [act_ife]
#1: 000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife]
stack backtrace:
CPU: 3 PID: 3932 Comm: tc Tainted: G E 4.17.0-rc4.kasan+ #417
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
dump_stack+0x9a/0xeb
__lock_acquire+0xf43/0x34a0
? debug_check_no_locks_freed+0x2b0/0x2b0
? debug_check_no_locks_freed+0x2b0/0x2b0
? debug_check_no_locks_freed+0x2b0/0x2b0
? __mutex_lock+0x62f/0x1240
? kvm_sched_clock_read+0x1a/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1d0
? lock_acquire+0x10b/0x330
lock_acquire+0x10b/0x330
? tcf_ife_cleanup+0x19/0x80 [act_ife]
_raw_spin_lock_bh+0x38/0x70
? tcf_ife_cleanup+0x19/0x80 [act_ife]
tcf_ife_cleanup+0x19/0x80 [act_ife]
__tcf_idr_release+0xff/0x350
tcf_ife_init+0xdde/0x13c0 [act_ife]
? ife_exit_net+0x290/0x290 [act_ife]
? __lock_is_held+0xb4/0x140
tcf_action_init_1+0x67b/0xad0
? tcf_action_dump_old+0xa0/0xa0
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? kvm_sched_clock_read+0x1a/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? memset+0x1f/0x40
tcf_action_init+0x30f/0x590
? tcf_action_init_1+0xad0/0xad0
? memset+0x1f/0x40
tc_ctl_action+0x48e/0x5e0
? mutex_lock_io_nested+0x1160/0x1160
? tca_action_gd+0x990/0x990
? sched_clock+0x5/0x10
? find_held_lock+0x39/0x1d0
rtnetlink_rcv_msg+0x4da/0x990
? validate_linkmsg+0x680/0x680
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1d0
netlink_rcv_skb+0x127/0x350
? validate_linkmsg+0x680/0x680
? netlink_ack+0x970/0x970
? __kmalloc_node_track_caller+0x304/0x3a0
netlink_unicast+0x40f/0x5d0
? netlink_attachskb+0x580/0x580
? _copy_from_iter_full+0x187/0x760
? import_iovec+0x90/0x390
netlink_sendmsg+0x67f/0xb50
? netlink_unicast+0x5d0/0x5d0
? copy_msghdr_from_user+0x206/0x340
? netlink_unicast+0x5d0/0x5d0
sock_sendmsg+0xb3/0xf0
___sys_sendmsg+0x60a/0x8b0
? copy_msghdr_from_user+0x340/0x340
? lock_downgrade+0x5e0/0x5e0
? tty_write_lock+0x18/0x50
? kvm_sched_clock_read+0x1a/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1d0
? lock_downgrade+0x5e0/0x5e0
? lock_acquire+0x10b/0x330
? __audit_syscall_entry+0x316/0x690
? current_kernel_time64+0x6b/0xd0
? __fget_light+0x55/0x1f0
? __sys_sendmsg+0xd2/0x170
__sys_sendmsg+0xd2/0x170
? __ia32_sys_shutdown+0x70/0x70
? syscall_trace_enter+0x57a/0xd60
? rcu_read_lock_sched_held+0xdc/0x110
? __bpf_trace_sys_enter+0x10/0x10
? do_syscall_64+0x22/0x480
do_syscall_64+0xa5/0x480
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fd646988ba0
RSP: 002b:00007fffc9fab3c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fffc9fab4f0 RCX: 00007fd646988ba0
RDX: 0000000000000000 RSI: 00007fffc9fab440 RDI: 0000000000000003
RBP: 000000005b28c8b3 R08: 0000000000000002 R09: 0000000000000000
R10: 00007fffc9faae20 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffc9fab504 R14: 0000000000000001 R15: 000000000066c100
Fixes: 4e8c861550 ("net sched: net sched: ife action fix late binding")
Fixes: ef6980b6be ("introduce IFE action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch reverts commit 3243ff2a05 ("net: ethernet: davinci_emac:
Deduplicate bus_find_device() by name matching") and adds a comment
which should stop anyone from reintroducing the same "fix" in the future.
We can't use bus_find_device_by_name() here because the device name is
not guaranteed to be 'davinci_mdio'. On some systems it can be
'davinci_mdio.0' so we need to use strncmp() against the first part of
the string to correctly match it.
Fixes: 3243ff2a05 ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching")
Cc: stable@vger.kernel.org
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
With 4k page size for hugetlb we allocate hugepage directories from its on slab
cache. With patch 0c4d26802 ("powerpc/book3s64/mm: Simplify the rcu callback for page table free")
we missed to free these allocated hugepd tables.
Update pgtable_free to handle hugetlb hugepd directory table.
Fixes: 0c4d268029 ("powerpc/book3s64/mm: Simplify the rcu callback for page table free")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Add CONFIG_HUGETLB_PAGE guard to fix build break]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
If possible CPUs are limited (e.g., by kexec), then the kvm prefetch
workaround function can access the paca pointer for a !possible CPU.
Fixes: d2e60075a3 ("powerpc/64: Use array of paca pointers and allocate pacas individually")
Cc: stable@kernel.org
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
if dev_get_valid_name failed, propagate its return code
and remove the setting err to ENODEV, it will be set to
0 again before dev_change_net_namespace exits.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In failure path, we overwrite err to what vnic_rq_disable() returns. In
case it returns 0, enic_open() returns success in case of error.
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Fixes: e8588e2685 ("enic: enable rq before updating rq descriptors")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to 69678bcd4d ("udp: fix SO_BINDTODEVICE"), TCP socket lookups
need to fail if dev_match is not true. Currently, a packet to a given port
can match a socket bound to device when it should not. In the VRF case,
this causes the lookup to hit a VRF socket and not a global socket
resulting in a response trying to go through the VRF when it should not.
Fixes: 3fa6f616a7 ("net: ipv4: add second dif to inet socket lookups")
Fixes: 4297a0ef08 ("net: ipv6: add second dif to inet6 socket lookups")
Reported-by: Lou Berger <lberger@labn.net>
Diagnosed-by: Renato Westphal <renato@opensourcerouting.org>
Tested-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
getnstimeofday64() is deprecated and getting replaced throughout
the kernel with ktime_get_*() based helpers for a more consistent
interface.
The two functions do the exact same thing, so this is just
a cosmetic change.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to the fixes on team and bonding, this restores the ability
to set an ipvlan device's mtu to anything higher than 1500.
Fixes: 91572088e3 ("net: use core MTU range checking in core net infra")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The array bpq_eth_addr is only used to get the size of an
address, whereas the bcast_addr is used to set the broadcast
address. This leads to a warning when using clang:
drivers/net/hamradio/bpqether.c:94:13: warning: variable 'bpq_eth_addr' is not
needed and will not be emitted [-Wunneeded-internal-declaration]
static char bpq_eth_addr[6];
^
Remove both variables and use the common eth_broadcast_addr
to set the broadcast address.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
lockdep spotted that we are using rfs_h.lock in enic_get_rxnfc() without
initializing. rfs_h.lock is initialized in enic_open(). But ethtool_ops
can be called when interface is down.
Move enic_rfs_flw_tbl_init to enic_probe.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 18 PID: 1189 Comm: ethtool Not tainted 4.17.0-rc7-devel+ #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
dump_stack+0x85/0xc0
register_lock_class+0x550/0x560
? __handle_mm_fault+0xa8b/0x1100
__lock_acquire+0x81/0x670
lock_acquire+0xb9/0x1e0
? enic_get_rxnfc+0x139/0x2b0 [enic]
_raw_spin_lock_bh+0x38/0x80
? enic_get_rxnfc+0x139/0x2b0 [enic]
enic_get_rxnfc+0x139/0x2b0 [enic]
ethtool_get_rxnfc+0x8d/0x1c0
dev_ethtool+0x16c8/0x2400
? __mutex_lock+0x64d/0xa00
? dev_load+0x6a/0x150
dev_ioctl+0x253/0x4b0
sock_do_ioctl+0x9a/0x130
sock_ioctl+0x1af/0x350
do_vfs_ioctl+0x8e/0x670
? syscall_trace_enter+0x1e2/0x380
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x5a/0x170
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joel Stanley says:
====================
Slience NCSI logging
v2:
Fix indent issue and commit message based on Joe's feedback
Add Sam's acks
Here are three changes to silence unnecessary warnings in the ncsi code.
The final patch adds Sam as the maintainer for NCSI.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam has been handing the maintenance of NCSI for a number release cycles
now.
Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This moves all of the netdev_printk(KERN_DEBUG, ...) messages over to
netdev_dbg.
As Joe explains:
> netdev_dbg is not included in object code unless
> DEBUG is defined or CONFIG_DYNAMIC_DEBUG is set.
> And then, it is not emitted into the log unless
> DEBUG is set or this specific netdev_dbg is enabled
> via the dynamic debug control file.
Which is what we're after in this case.
Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This does not provide useful information. As the ncsi maintainer said:
> either we get a channel or broadcom has gone out to lunch
Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
In normal operation we see this series of messages as the host drives
the network device:
ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state down
ftgmac100 1e660000.ethernet eth0: NCSI: suspending channel 0
ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0
ftgmac100 1e660000.ethernet eth0: NCSI: channel 0 link down after config
ftgmac100 1e660000.ethernet eth0: NCSI interface down
ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state up
ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0
ftgmac100 1e660000.ethernet eth0: NCSI interface up
ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state down
ftgmac100 1e660000.ethernet eth0: NCSI: suspending channel 0
ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0
ftgmac100 1e660000.ethernet eth0: NCSI: channel 0 link down after config
ftgmac100 1e660000.ethernet eth0: NCSI interface down
ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state up
ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0
ftgmac100 1e660000.ethernet eth0: NCSI interface up
This makes all of these messages netdev_dbg. They are still useful to
debug eg. misbehaving network device firmware, but we do not need them
filling up the kernel logs in normal operation.
Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using skb_reserve(skb, I40E_SKB_PAD + (xdp->data - xdp->data_hard_start))
is clearly wrong since I40E_SKB_PAD already points to the offset where
the original xdp->data was sitting since xdp->data_hard_start is defined
as xdp->data - i40e_rx_offset(rx_ring) where latter offsets to I40E_SKB_PAD
when build skb is used.
However, also before cc5b114dcf ("bpf, i40e: add meta data support")
this seems broken since bpf_xdp_adjust_head() helper could have been used
to alter headroom and enlarge / shrink the frame and with that the assumption
that the xdp->data remains unchanged does not hold and would push a bogus
packet to upper stack.
ixgbe got this right in 9247080816 ("ixgbe: add XDP support for pass and
drop actions"). In any case, fix it by removing the I40E_SKB_PAD from both
skb_reserve() and truesize calculation.
Fixes: cc5b114dcf ("bpf, i40e: add meta data support")
Fixes: 0c8493d90b ("i40e: add XDP support for pass and drop actions")
Reported-by: Keith Busch <keith.busch@linux.intel.com>
Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Tested-by: Keith Busch <keith.busch@linux.intel.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru says:
====================
qed*: Fix series.
The patch series fixes few issues in the qed/qede drivers.
Please consider applying this series to "net".
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not advertise DCBX_LLD_MANAGED capability i.e., do not allow
external agent to manage the dcbx/lldp negotiation. MFW acts as lldp agent
for qed* devices, and no other lldp agent is allowed to coexist with mfw.
Also updated a debug print, to not to display the redundant info.
Fixes: a1d8d8a51 ("qed: Add dcbnl support.")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid calling a SIMD fastpath handler if it is NULL. The check is needed
to handle an unlikely scenario where unsolicited interrupt is destined to
a PF in INTa mode.
Fixes: fe56b9e6a ("qed: Add module with basic common support")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Memory for packet buffers need to be freed in the error paths as there is
no consumer (e.g., upper layer) for such packets and that memory will never
get freed.
The issue was uncovered when port was attacked with flood of isatap
packets, these are multicast packets hence were directed at all the PFs.
For foce PF, this meant they were routed to the ll2 module which in turn
drops such packets.
Fixes: 0a7fb11c ("qed: Add Light L2 support")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even BOs with AMDGPU_GEM_CREATE_NO_CPU_ACCESS may end up at least
partially in CPU visible VRAM, in particular when all VRAM is visible.
v2:
* Don't take VRAM mgr spinlock, not needed (Christian König)
* Make loop logic simpler and clearer.
Cc: stable@vger.kernel.org
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Commit 0ba99ca483 ("block: Add warning for bi_next not NULL in
bio_endio()") breaks the dm driver. end_clone_bio() detects whether
or not a bio is the last bio associated with a request by checking
the .bi_next field. Commit 0ba99ca483 clears that field before
end_clone_bio() has had a chance to inspect that field. Hence revert
commit 0ba99ca483.
This patch avoids that KASAN reports the following complaint when
running the srp-test software (srp-test/run_tests -c -d -r 10 -t 02-mq):
==================================================================
BUG: KASAN: use-after-free in bio_advance+0x11b/0x1d0
Read of size 4 at addr ffff8801300e06d0 by task ksoftirqd/0/9
CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.18.0-rc1-dbg+ #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0xa4/0xf5
print_address_description+0x6f/0x270
kasan_report+0x241/0x360
__asan_load4+0x78/0x80
bio_advance+0x11b/0x1d0
blk_update_request+0xa7/0x5b0
scsi_end_request+0x56/0x320 [scsi_mod]
scsi_io_completion+0x7d6/0xb20 [scsi_mod]
scsi_finish_command+0x1c0/0x280 [scsi_mod]
scsi_softirq_done+0x19a/0x230 [scsi_mod]
blk_mq_complete_request+0x160/0x240
scsi_mq_done+0x50/0x1a0 [scsi_mod]
srp_recv_done+0x515/0x1330 [ib_srp]
__ib_process_cq+0xa0/0xf0 [ib_core]
ib_poll_handler+0x38/0xa0 [ib_core]
irq_poll_softirq+0xe8/0x1f0
__do_softirq+0x128/0x60d
run_ksoftirqd+0x3f/0x60
smpboot_thread_fn+0x352/0x460
kthread+0x1c1/0x1e0
ret_from_fork+0x24/0x30
Allocated by task 1918:
save_stack+0x43/0xd0
kasan_kmalloc+0xad/0xe0
kasan_slab_alloc+0x11/0x20
kmem_cache_alloc+0xfe/0x350
mempool_alloc_slab+0x15/0x20
mempool_alloc+0xfb/0x270
bio_alloc_bioset+0x244/0x350
submit_bh_wbc+0x9c/0x2f0
__block_write_full_page+0x299/0x5a0
block_write_full_page+0x16b/0x180
blkdev_writepage+0x18/0x20
__writepage+0x42/0x80
write_cache_pages+0x376/0x8a0
generic_writepages+0xbe/0x110
blkdev_writepages+0xe/0x10
do_writepages+0x9b/0x180
__filemap_fdatawrite_range+0x178/0x1c0
file_write_and_wait_range+0x59/0xc0
blkdev_fsync+0x46/0x80
vfs_fsync_range+0x66/0x100
do_fsync+0x3d/0x70
__x64_sys_fsync+0x21/0x30
do_syscall_64+0x77/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 9:
save_stack+0x43/0xd0
__kasan_slab_free+0x137/0x190
kasan_slab_free+0xe/0x10
kmem_cache_free+0xd3/0x380
mempool_free_slab+0x17/0x20
mempool_free+0x63/0x160
bio_free+0x81/0xa0
bio_put+0x59/0x60
end_bio_bh_io_sync+0x5d/0x70
bio_endio+0x1a7/0x360
blk_update_request+0xd0/0x5b0
end_clone_bio+0xa3/0xd0 [dm_mod]
bio_endio+0x1a7/0x360
blk_update_request+0xd0/0x5b0
scsi_end_request+0x56/0x320 [scsi_mod]
scsi_io_completion+0x7d6/0xb20 [scsi_mod]
scsi_finish_command+0x1c0/0x280 [scsi_mod]
scsi_softirq_done+0x19a/0x230 [scsi_mod]
blk_mq_complete_request+0x160/0x240
scsi_mq_done+0x50/0x1a0 [scsi_mod]
srp_recv_done+0x515/0x1330 [ib_srp]
__ib_process_cq+0xa0/0xf0 [ib_core]
ib_poll_handler+0x38/0xa0 [ib_core]
irq_poll_softirq+0xe8/0x1f0
__do_softirq+0x128/0x60d
The buggy address belongs to the object at ffff8801300e0640
which belongs to the cache bio-0 of size 200
The buggy address is located 144 bytes inside of
200-byte region [ffff8801300e0640, ffff8801300e0708)
The buggy address belongs to the page:
page:ffffea0004c03800 count:1 mapcount:0 mapping:ffff88015a563a00 index:0x0 compound_mapcount: 0
flags: 0x8000000000008100(slab|head)
raw: 8000000000008100 dead000000000100 dead000000000200 ffff88015a563a00
raw: 0000000000000000 0000000000330033 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8801300e0580: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc
ffff8801300e0600: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
>ffff8801300e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8801300e0700: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8801300e0780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Fixes: 0ba99ca483 ("block: Add warning for bi_next not NULL in bio_endio()")
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We've had a number of users report failures to detect and light up
display with DC with LVDS and VGA. These connector types are not
currently supported with DC. I'd like to add support but unfortunately
don't have a system with LVDS or VGA available.
In order not to cause regressions we should probably fallback to the
non-DC driver for ASICs that support VGA and LVDS.
These ASICs are:
* Bonaire
* Kabini
* Kaveri
* Mullins
ASIC support can always be force enabled with amdgpu.dc=1
v2: Keep Hawaii on DC
v3: Added Mullins to the list
Cc: stable@vger.kernel.org
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
blk_mq_complete_request can only be called for blk-mq drivers, but when
removing the BLK_EH_HANDLED return value, two legacy request timeout
methods incorrectly got switched to call blk_mq_complete_request.
Call __blk_complete_request instead to reinstance the previous behavior.
For that __blk_complete_request needs to be exported.
Fixes: 1fc2b62e ("scsi_transport_fc: complete requests from ->timeout")
Fixes: 0df0bb08 ("null_blk: complete requests from ->timeout")
Reported-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull an OPP fix for 4.18-rc2 from Viresh Kumar.
* 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
PM / OPP: Update voltage in case freq == old_freq
We can never pass in the internal tag to this helper, it'll
always be the hardware tag. So there's no need to check and
do an internal translation of that tag.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
We need to iterate all commands, including the internal one,
for ATAPI error handling.
Fixes: 28361c4036 ("libata: add extra internal command")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
Now that we have the internal tag as a special (higher) value tag,
it gets a bit tricky to iterate the internal commands as some loops
will exceed ATA_MAX_QUEUE. Add explicit helpers for iterating pending
commands, both inflight and internal.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
On Amlogic Meson GXBB & GXL platforms, the SCPI Cortex-M4 Co-Processor
seems to be dependent on the FCLK_DIV2 to be operationnal.
The issue occurred since v4.17-rc1 by freezing the kernel boot when
the 'schedutil' cpufreq governor was selected as default :
[ 12.071837] scpi_protocol scpi: SCP Protocol 0.0 Firmware 0.0.0 version
domain-0 init dvfs: 4
[ 12.087757] hctosys: unable to open rtc device (rtc0)
[ 12.087907] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 12.102241] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
But when disabling the MMC driver, the boot finished but cpufreq failed to
change the CPU frequency :
[ 12.153045] cpufreq: __target_index: Failed to change cpu frequency: -5
A bisect between v4.16 and v4.16-rc1 gave
05f814402d ("clk: meson: add fdiv clock gates") to be the first bad commit.
This commit added support for the missing clock gates before the fixed PLL
fixed dividers (FCLK_DIVx) and the clock framework basically disabled
all the unused fixed dividers, thus disabled a critical clock path for
the SCPI Co-Processor.
This patch simply sets the FCLK_DIV2 gate as critical to ensure
nobody can disable it.
Fixes: 05f814402d ("clk: meson: add fdiv clock gates")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
[few corrections in the commit description]
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Although the writeback resends are more robust than the reads, since they
are not immediately rescheduled by the same thread, we are better off
processing them in the same place as the reads.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
We do not want to have rpciod threads perform recursive calls into the
RPC layer since that can deadlock. In particular, having to wait for
a layoutget can be nasty... We want rather to defer scheduling those
retries until we're in the rpc_release() callback, since that is
called from the nfsiod workqueue.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
We can't call function trace hook before setup percpu offset.
When entering secondary_start_kernel(), percpu offset has not
been initialized. So this lead hotplug malfunction.
Here is the flow to reproduce this bug:
echo 0 > /sys/devices/system/cpu/cpu1/online
echo function > /sys/kernel/debug/tracing/current_tracer
echo 1 > /sys/kernel/debug/tracing/tracing_on
echo 1 > /sys/devices/system/cpu/cpu1/online
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Zhizhou Zhang <zhizhouzhang@asrmicro.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
dma_alloc_*() buffers might be exposed to userspace via mmap() call, so
they should be cleared on allocation. In case of IOMMU-based dma-mapping
implementation such buffer clearing was missing in the code path for
DMA_ATTR_FORCE_CONTIGUOUS flag handling, because dma_alloc_from_contiguous()
doesn't honor __GFP_ZERO flag. This patch fixes this issue. For more
information on clearing buffers allocated by dma_alloc_* functions,
see commit 6829e274a6 ("arm64: dma-mapping: always clear allocated
buffers").
Fixes: 44176bb38f ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
I broke the build when CONFIG_NMI_IPI=n with my recent commit to add
arch_trigger_cpumask_backtrace(), eg:
stacktrace.c:(.text+0x1b0): undefined reference to `.smp_send_safe_nmi_ipi'
We should rework the CONFIG symbols here in future to avoid these
double barrelled ifdefs but for now they fix the build.
Fixes: 5cc05910f2 ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")
Reported-by: Christophe LEROY <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When xenbus_printf fails, the lack of error-handling code may
cause unexpected results.
This patch adds error-handling code after calling xenbus_printf.
Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
When xenbus_printf fails, the lack of error-handling code may
cause unexpected results.
This patch adds error-handling code after calling xenbus_printf.
Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Krzysztof Kozlowski <krzk@kernel.org> reports that a heavy NFSv4
WRITE workload against a slow NFS server causes his Raspberry Pi
clients to stall. Krzysztof bisected it to commit 37ac86c3a7
("SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock") .
I was able to reproduce similar behavior and it appears that rarely
the RPC client layer is re-allocating an XID for an RPC that it has
already partially sent. This results in the client ignoring the
subsequent reply, which carries the original XID.
For various reasons, checking !req->rq_xmit_bytes_sent in
xprt_prepare_transmit is not a 100% reliable mechanism for
determining when a fresh XID is needed.
Trond's preference is to allocate the XID at the time each rpc_rqst
slot is initialized.
This patch should also address a gcc 4.1.2 complaint reported by
Geert Uytterhoeven <geert@linux-m68k.org>.
Reported-by: Krzysztof Kozlowski <krzk@kernel.org>
Fixes: 37ac86c3a7 ("SUNRPC: Initialize rpc_rqst outside of ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If the layout was invalidated due to a reboot, then don't try to send
a layoutreturn for it.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Right now, we can call nfs_commit_inode() while holding the session slot,
which could lead to NFSv4 deadlocks. Ensure we only keep the slot if
the server returned a layout that we have to process.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If client is smart or lucky enough to create a new context
after each hang, our context banning mechanism will never
catch up, and as a result of that it will be saved from
client banning. This can result in a never ending streak of
gpu hangs caused by bad or malicious client, preventing
access from other legit gpu clients.
Fix this by always incrementing per client ban score if
it hangs in short successions regardless of context ban
scoring. The exception are non bannable contexts. They remain
detached from client ban scoring mechanism.
v2: xchg timestamp, tidyup (Chris)
v3: comment, bannable & banned together (Chris)
Fixes: b083a0870c ("drm/i915: Add per client max context ban limit")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180615104429.31477-1-mika.kuoppala@linux.intel.com
(cherry picked from commit 14921f3cef)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
While Bspec doesn't list a specific sequence for turning off the DP port
on g4x we are getting an underrun if the port is disabled in the
.disable() hook. Looks like the pipe stops when the port stops, and by
that time the plane disable may not have completed yet. Also the plane(s)
seem to end up in some wonky state when this happens as they also signal
another underrun immediately after we turn them back on during the next
enable sequence.
We could add a vblank wait in .disable() to avoid wedging the planes,
but I assume we're still tripping up the pipe in some way. So it seems
better to me to just follow the ILK+ sequence and turn off the DP port
in .post_disable() instead. This sequence doesn't seem to suffer from
this problem. Could be it was always the intended sequence for DP and
the gen4 bspec was just never updated to include it.
Originally we used the bad sequence even on ilk+, but I changed that
in commit 08aff3fe26 ("drm/i915: Move DP port disable to post_disable
for pch platforms") as it was causing issues on those platforms as well.
I left out g4x then only because I didn't have the hardware to test it.
Now that I do it's fairly clear that the ilk+ sequence is also the
right choice for g4x.
v2: Fix whitespace fail (Jani)
Mention the ilk+ commit (Jani)
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180613160553.11664-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 51a9f6dfc0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
On i965/g4x IIR is edge triggered. So in order for IIR to notice that
there is still a pending interrupt we have to force and edge in ISR.
For the ISR/IIR pipe event bits we can do that by temporarily
clearing all the PIPESTAT enable bits when we ack the status bits.
This will force the ISR pipe event bit low, and it can then go back
high when we restore the PIPESTAT enable bits.
This avoids the following race:
1. stat = read(PIPESTAT)
2. an enabled PIPESTAT status bit goes high
3. write(PIPESTAT, enable|stat);
4. write(IIR, PIPE_EVENT)
The end result is IIR==0 and ISR!=0. This can lead to nasty
vblank wait/flip_done timeouts if another interrupt source
doesn't trick us into looking at the PIPESTAT status bits despite
the IIR PIPE_EVENT bit being low.
Before i965 IIR was level triggered so this problem can't actually
happen there. And curiously VLV/CHV went back to the level triggered
scheme as well. But for simplicity we'll use the same i965/g4x
compatible code for all platforms.
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106033
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105225
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106030
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611200258.27121-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 132c27c97c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Only gnttab_{alloc|free}_pages are exported as EXPORT_SYMBOL
while all the rest are exported as EXPORT_SYMBOL_GPL, thus
effectively making it not possible for non-GPL driver modules
to use grant table module. Export gnttab_{alloc|free}_pages as
EXPORT_SYMBOL_GPL so all the exports are aligned.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
When xenbus_printf fails, the lack of error-handling code may
cause unexpected results.
This patch adds error-handling code after calling xenbus_printf.
Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Use a global variable to store the start flags for both PV and PVH.
This allows the xen_initial_domain macro to work properly on PVH.
Note that ARM is also switched to use the new variable.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Similar to previous patches, hard disable interrupts when a CPU is
in panic. This reduces the chance the watchdog has to interfere with
the panic, and avoids any other type of masked interrupt being
executed when crashing which minimises the length of the crash path.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Marking CPUs stopped by smp_send_stop as offline can cause warnings
due to cross-CPU wakeups. This trace was noticed on a busy system
running a sysrq+c crash test, after the injected crash:
WARNING: CPU: 51 PID: 1546 at kernel/sched/core.c:1179 set_task_cpu+0x22c/0x240
CPU: 51 PID: 1546 Comm: kworker/u352:1 Tainted: G D
Workqueue: mlx5e mlx5e_update_stats_work [mlx5_core]
[...]
NIP [c00000000017c21c] set_task_cpu+0x22c/0x240
LR [c00000000017d580] try_to_wake_up+0x230/0x720
Call Trace:
[c000000001017700] runqueues+0x0/0xb00 (unreliable)
[c00000000017d580] try_to_wake_up+0x230/0x720
[c00000000015a214] insert_work+0x104/0x140
[c00000000015adb0] __queue_work+0x230/0x690
[c000003fc5007910] [c00000000015b26c] queue_work_on+0x5c/0x90
[c0080000135fc8f8] mlx5_cmd_exec+0x538/0xcb0 [mlx5_core]
[c008000013608fd0] mlx5_core_access_reg+0x140/0x1d0 [mlx5_core]
[c00800001362777c] mlx5e_update_pport_counters.constprop.59+0x6c/0x90 [mlx5_core]
[c008000013628868] mlx5e_update_ndo_stats+0x28/0x90 [mlx5_core]
[c008000013625558] mlx5e_update_stats_work+0x68/0xb0 [mlx5_core]
[c00000000015bcec] process_one_work+0x1bc/0x5f0
[c00000000015ecac] worker_thread+0xac/0x6b0
[c000000000168338] kthread+0x168/0x1b0
[c00000000000b628] ret_from_kernel_thread+0x5c/0xb4
This happens because firstly the CPU is not really offline in the
usual sense, processes and interrupts have not been migrated away.
Secondly smp_send_stop does not happen atomically on all CPUs, so
one CPU can have marked itself offline, while another CPU is still
running processes or interrupts which can affect the first CPU.
Fix this by just not marking the CPU as offline. It's more like
frozen in time, so offline does not really reflect its state properly
anyway. There should be nothing in the crash/panic path that walks
online CPUs and synchronously waits for them, so this change should
not introduce new hangs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Similarly to commit 855bfe0de1 ("powerpc: hard disable irqs in
smp_send_stop loop"), irqs should be hard disabled by
panic_smp_self_stop.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In the device tree CPU features quirk code we want to set
CPU_FTR_POWER9_DD2_1 on all Power9s that aren't DD2.0 or earlier. But
we got the logic wrong and instead set it on all CPUs that aren't
Power9 DD2.0 or earlier, ie. including Power8.
Fix it by making sure we're on a Power9. This isn't a bug in practice
because the only code that checks the feature is Power9 only to begin
with. But we'll backport it anyway to avoid confusion.
Fixes: 9e9626ed3a ("powerpc/64s: Fix POWER9 DD2.2 and above in DT CPU features")
Cc: stable@vger.kernel.org # v4.17+
Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The patch 99baac21e4 ("mm: fix MADV_[FREE|DONTNEED] TLB flush miss
problem") added a force flush mode to the mmu_gather flush, which
unconditionally flushes the entire address range being invalidated
(even if actual ptes only covered a smaller range), to solve a problem
with concurrent threads invalidating the same PTEs causing them to
miss TLBs that need flushing.
This does not work with powerpc that invalidates mmu_gather batches
according to page size. Have powerpc flush all possible page sizes in
the range if it encounters this concurrency condition.
Patch 4647706ebe ("mm: always flush VMA ranges affected by
zap_page_range") does add a TLB flush for all page sizes on powerpc for
the zap_page_range case, but that is to be removed and replaced with
the mmu_gather flush to avoid redundant flushing. It is also thought to
not cover other obscure race conditions:
https://lkml.kernel.org/r/BD3A0EBE-ECF4-41D4-87FA-C755EA9AB6BD@gmail.com
Hash does not have a problem because it invalidates TLBs inside the
page table locks.
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The v21 version of the NAND flash controller contains a Spare Area Size
Register (SPAS) at offset 0x10. Its setting defaults to the maximum
spare area size of 218 bytes. The size that is set in this register is
used by the controller when it calculates the ECC bytes internally in
hardware.
Usually, this register is updated from settings in the IIM fuses when
the system is booting from NAND flash. For other boot media, however,
the SPAS register remains at the default setting, which may not work for
the particular flash chip on the board. The same goes for flash chips
whose configuration cannot be set in the IIM fuses (e.g. chips with 2k
sector size and 128 bytes spare area size can't be configured in the IIM
fuses on imx25 systems).
Set the SPAS register explicitly during the preset operation. Derive the
register value from mtd->oobsize that was detected during probe by
decoding the flash chip's ID bytes.
While at it, rename the define for the spare area register's offset to
NFC_V21_RSLTSPARE_AREA. The register at offset 0x10 on v1 controllers is
different from the register on v21 controllers.
Fixes: d484018 ("mtd: mxc_nand: set NFC registers after reset")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
This commit fixes a rare but possible case when the clk rate is updated
without update of the regulator voltage.
At boot up, CPUfreq checks if the system is running at the right freq. This
is a sanity check in case a bootloader set clk rate that is outside of freq
table present with cpufreq core. In such cases system can be unstable so
better to change it to a freq that is preset in freq-table.
The CPUfreq takes next freq that is >= policy->cur and this is our
target_freq that needs to be set now.
dev_pm_opp_set_rate(dev, target_freq) checks the target_freq and the
old_freq (a current rate). If these are equal it returns early. If not,
it searches for OPP (old_opp) that fits best to old_freq (not listed in
the table) and updates old_freq (!).
Here, we can end up with old_freq = old_opp.rate = target_freq, which
is not handled in _generic_set_opp_regulator(). It's supposed to update
voltage only when freq > old_freq || freq > old_freq.
if (freq > old_freq) {
ret = _set_opp_voltage(dev, reg, new_supply);
[...]
if (freq < old_freq) {
ret = _set_opp_voltage(dev, reg, new_supply);
if (ret)
It results in, no voltage update while clk rate is updated.
Example:
freq-table = {
1000MHz 1.15V
666MHZ 1.10V
333MHz 1.05V
}
boot-up-freq = 800MHz # not listed in freq-table
freq = target_freq = 1GHz
old_freq = 800Mhz
old_opp = _find_freq_ceil(opp_table, &old_freq); #(old_freq is modified!)
old_freq = 1GHz
Fixes: 6a0712f6f1 ("PM / OPP: Add dev_pm_opp_set_rate()")
Cc: 4.6+ <stable@vger.kernel.org> # v4.6+
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
PID bitfield in descriptor should be set based on particular request
length, not based on EP's mc value. PID value can't be set to 0 even
request length is 0.
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
It happens when enable debug log, if set_alt() returns
USB_GADGET_DELAYED_STATUS and usb_composite_setup_continue()
is called before increasing count of @delayed_status,
so fix it by using spinlock of @cdev->lock.
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Tested-by: Jay Hsu <shih-chieh.hsu@mediatek.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
If isoc split in transfer with no data (the length of DATA0
packet is zero), we can't simply return immediately. Because
the DATA0 can be the first transaction or the second transaction
for the isoc split in transaction. If the DATA0 packet with no
data is in the first transaction, we can return immediately.
But if the DATA0 packet with no data is in the second transaction
of isoc split in transaction sequence, we need to increase the
qtd->isoc_frame_index and giveback urb to device driver if needed,
otherwise, the MDATA packet will be lost.
A typical test case is that connect the dwc2 controller with an
usb hs Hub (GL852G-12), and plug an usb fs audio device (Plantronics
headset) into the downstream port of Hub. Then use the usb mic
to record, we can find noise when playback.
In the case, the isoc split in transaction sequence like this:
- SSPLIT IN transaction
- CSPLIT IN transaction
- MDATA packet (176 bytes)
- CSPLIT IN transaction
- DATA0 packet (0 byte)
This patch use both the length of DATA0 and qtd->isoc_split_offset
to check if the DATA0 is in the second transaction.
Tested-by: Gevorg Sahakyan <sahakyan@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Minas Harutyunyan hminas@synopsys.com>
Signed-off-by: William Wu <william.wu@rock-chips.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The commit 3bc04e28a0 ("usb: dwc2: host: Get aligned DMA in
a more supported way") rips out a lot of code to simply the
allocation of aligned DMA. However, it also introduces a new
issue when use isoc split in transfer.
In my test case, I connect the dwc2 controller with an usb hs
Hub (GL852G-12), and plug an usb fs audio device (Plantronics
headset) into the downstream port of Hub. Then use the usb mic
to record, we can find noise when playback.
It's because that the usb Hub uses an MDATA for the first
transaction and a DATA0 for the second transaction for the isoc
split in transaction. An typical isoc split in transaction sequence
like this:
- SSPLIT IN transaction
- CSPLIT IN transaction
- MDATA packet
- CSPLIT IN transaction
- DATA0 packet
The DMA address of MDATA (urb->dma) is always DWORD-aligned, but
the DMA address of DATA0 (urb->dma + qtd->isoc_split_offset) may
not be DWORD-aligned, it depends on the qtd->isoc_split_offset (the
length of MDATA). In my test case, the length of MDATA is usually
unaligned, this cause DATA0 packet transmission error.
This patch use kmem_cache to allocate aligned DMA buf for isoc
split in transaction. Note that according to usb 2.0 spec, the
maximum data payload size is 1023 bytes for each fs isoc ep,
and the maximum allowable interrupt data payload size is 64 bytes
or less for fs interrupt ep. So we set the size of object to be
1024 bytes in the kmem cache.
Tested-by: Gevorg Sahakyan <sahakyan@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Minas Harutyunyan hminas@synopsys.com>
Signed-off-by: William Wu <william.wu@rock-chips.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The dwc2_get_ls_map() use ttport to reference into the
bitmap if we're on a multi_tt hub. But the bitmaps index
from 0 to (hub->maxchild - 1), while the ttport index from
1 to hub->maxchild. This will cause invalid memory access
when the number of ttport is hub->maxchild.
Without this patch, I can easily meet a Kernel panic issue
if connect a low-speed USB mouse with the max port of FE2.1
multi-tt hub (1a40:0201) on rk3288 platform.
Fixes: 9f9f09b048 ("usb: dwc2: host: Totally redo the microframe scheduler")
Cc: <stable@vger.kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Minas Harutyunyan hminas@synopsys.com>
Signed-off-by: William Wu <william.wu@rock-chips.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In case when a hub is connected to DWC2 host
auto suspend occurs and host goes to
hibernation. When any device connected to hub
host hibernation exiting incorrectly.
- Added dwc2_hcd_rem_wakeup() function call to
exit from suspend state by remote wakeup.
- Increase timeout value for port suspend bit to be set.
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Artur Petrosyan <arturp@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The #ifdef guards around these are wrong, resulting in warnings
in certain configurations:
drivers/usb/dwc3/dwc3-qcom.c:244:12: error: 'dwc3_qcom_resume' defined but not used [-Werror=unused-function]
static int dwc3_qcom_resume(struct dwc3_qcom *qcom)
^~~~~~~~~~~~~~~~
drivers/usb/dwc3/dwc3-qcom.c:223:12: error: 'dwc3_qcom_suspend' defined but not used [-Werror=unused-function]
static int dwc3_qcom_suspend(struct dwc3_qcom *qcom)
This replaces the guards with __maybe_unused annotations to shut up
the warnings and give better compile time coverage.
Fixes: a4333c3a6b ("usb: dwc3: Add Qualcomm DWC3 glue driver")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Fix to return error code -ENODEV from the get device failed error
handling case instead of 0, as done elsewhere in this function.
Fixes: a4333c3a6b ("usb: dwc3: Add Qualcomm DWC3 glue driver")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
When scaling max/min settings are changed, internally they are converted
to a ratio using the max turbo 1 core turbo frequency. This works fine
when 1 core max is same irrespective of the core. But under Turbo 3.0,
this will not be the case. For example:
Core 0: max turbo pstate: 43 (4.3GHz)
Core 1: max turbo pstate: 45 (4.5GHz)
In this case 1 core turbo ratio will be maximum of all, so it will be
45 (4.5GHz). Suppose scaling max is set to 4GHz (ratio 40) for all cores
,then on core one it will be
= max_state * policy->max / max_freq;
= 43 * (4000000/4500000) = 38 (3.8GHz)
= 38
which is 200MHz less than the desired.
On core2, it will be correctly set to ratio 40 (4GHz). Same holds true
for scaling min frequency limit. So this requires usage of correct turbo
max frequency for core one, which in this case is 4.3GHz. So we need to
adjust per CPU cpu->pstate.turbo_freq using the maximum HWP ratio of that
core.
This change uses the HWP capability of a core to adjust max turbo
frequency. But since Broadwell HWP doesn't use ratios in the HWP
capabilities, we have to use legacy max 1 core turbo ratio. This is not
a problem as the HWP capabilities don't differ among cores in Broadwell.
We need to check for non Broadwell CPU model for applying this change,
though.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add device remove and module exit code to make the driver
functioning as a loadable module.
Fixes: ac28927659 (cpufreq: kryo: allow building as a loadable module)
Signed-off-by: Ilia Lin <ilia.lin@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In event of error returned by the nvmem_cell_read() non-pointer value
may be dereferenced. Fix this with error handling.
Additionally free the allocated speedbin buffer, as per the API.
Fixes: 9ce36edd1a52 (cpufreq: Add Kryo CPU scaling driver)
Signed-off-by: Ilia Lin <ilia.lin@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix build error on nds32 due to the merge of commit e3d5980568 ("lib:
Rename compiler intrinsic selects to GENERIC_LIB_*") during the 4.18
merge window which renames Kconfig symbols. This had raced with commit
aeaa7af744 ("nds32: lib: To use generic lib instead of libgcc to
prevent the symbol undefined issue.") merged late in the 4.17 cycle,
which added selects to nds32 using the original Kconfig symbol names.
When they came together in merge commit 763f96944c ("Merge tag
'mips_4.18' of
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux") this resulted
in the following build errors:
nds32le-linux-ld: kernel/time/timekeeping.o: in function `timekeeping_init':
timekeeping.c:(.init.text+0x140): undefined reference to `__ashldi3'
nds32le-linux-ld: timekeeping.c:(.init.text+0x144): undefined reference to `__ashldi3'
nds32le-linux-ld: timekeeping.c:(.init.text+0x17e): undefined reference to `__lshrdi3'
nds32le-linux-ld: timekeeping.c:(.init.text+0x182): undefined reference to `__lshrdi3'
nds32le-linux-ld: drivers/clocksource/mmio.o: in function `clocksource_mmio_init':
mmio.c:(.init.text+0x54): undefined reference to `__lshrdi3'
nds32le-linux-ld: mmio.c:(.init.text+0x58): undefined reference to `__lshrdi3'
Rename all 6 selects in nds32 and adjust the ordering accordingly to be
alphabetical.
Fixes: 763f96944c ("Merge tag 'mips_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[jhogan@kernel.org: Rename all 6 symbols, sort, update commit message]
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Matt Redfearn <matt.redfearn@mips.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Greentime Hu <greentime@andestech.com>
Signed-off-by: Greentime Hu <greentime@andestech.com>
nds32 depends on the macros '__NDS32_E[BL]__' to correctly
select or define endian-specific macros, structures or pieces
of code.
These macros are predefined by the compiler but sparse knows nothing
about them and thus may pre-process files differently from what
GCC would.
Fix this by adding '-D__NDS32_E[BL]__' to CHECKFLAGS.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Acked-by: Greentime Hu <greentime@andestech.com>
Signed-off-by: Greentime Hu <greentime@andestech.com>
Use the correct IRQ line for the MSI controller in the PCIe host
controller. Apparently a different IRQ line is used compared to other
i.MX6 variants. Without this change MSI IRQs aren't properly propagated
to the upstream interrupt controller.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Fixes: b1d17f68e5 ("ARM: dts: imx: add initial imx6sx device tree source")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
I'm no longer with QCOM. I am still interested in maintaining or reviewing
PCI/DMA engine patches. Update email-id to an active one.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Commit 0198d7bb8a ("ASoC: omap-mcbsp: Convert to use the sdma-pcm
instead of omap-pcm") resulted in broken audio playback on OMAP1510
(discovered on Amstrad Delta).
When running on OMAP1510, omap-pcm used to obtain DMA offset from
snd_dmaengine_pcm_pointer_no_residue() based on DMA interrupt triggered
software calculations instead of snd_dmaengine_pcm_pointer() which
depended on residue value calculated from omap_dma_get_src_pos().
Similar code path is still available in now used
sound/soc/soc-generic-dmaengine-pcm.c but it is not triggered.
It was verified already before that omap_get_dma_src_pos() from
arch/arm/plat-omap/dma.c didn't work correctly for OMAP1510 - see
commit 1bdd741991 ("ASoC: OMAP: fix OMAP1510 broken PCM pointer
callback") for details. Apparently the same applies to its successor,
omap_dma_get_src_pos() from drivers/dma/ti/omap-dma.c.
On the other hand, snd_dmaengine_pcm_pointer_no_residue() is described
as depreciated and discouraged for use in new drivers because of its
unreliable accuracy. However, it seems the only working option for
OPAM1510 now, as long as a software calculated residue is not
implemented as OMAP1510 fallback in omap-dma.
Using snd_dmaengine_pcm_pointer_no_residue() code path instead of
snd_dmaengine_pcm_pointer() in sound/soc/soc-generic-dmaengine-pcm.c
can be triggered in two ways:
- by passing pcm->flags |= SND_DMAENGINE_PCM_FLAG_NO_RESIDUE from
sound/soc/omap/sdma-pcm.c,
- by passing dma_caps.residue_granularity =
DMA_RESIDUE_GRANULARITY_DESCRIPTOR from DMA engine.
Let's do the latter.
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Since commit 80c49563e2 ("scsi: scsi_debug: implement IMMED bit") there
are long delays in F_SYNC_DELAY and F_SSU_DELAY. This can cause a memory
leak in schedule_resp(), which can be invoked while unloading the
scsi_debug module: free_all_queued() had already freed all sd_dp and
schedule_resp will alloc a new one, which will never get freed. Here's the
kmemleak report while running xfstests generic/350:
unreferenced object 0xffff88007d752b00 (size 128):
comm "rmmod", pid 26940, jiffies 4295816945 (age 7.588s)
hex dump (first 32 bytes):
00 2b 75 7d 00 88 ff ff 00 00 00 00 00 00 00 00 .+u}............
00 00 00 00 00 00 00 00 8e 31 a2 34 5f 03 00 00 .........1.4_...
backtrace:
[<000000002abd83d0>] 0xffffffffa000705e
[<000000004c063fda>] scsi_dispatch_cmd+0xc7/0x1a0
[<000000000c119a00>] scsi_request_fn+0x251/0x550
[<000000009de0c736>] __blk_run_queue+0x3f/0x60
[<000000001c4453c8>] blk_execute_rq_nowait+0x98/0xd0
[<00000000d17ec79f>] blk_execute_rq+0x3a/0x50
[<00000000a7654b6e>] scsi_execute+0x113/0x250
[<00000000fd78f7cd>] sd_sync_cache+0x95/0x160
[<0000000024dacb14>] sd_shutdown+0x9b/0xd0
[<00000000e9101710>] sd_remove+0x5f/0xb0
[<00000000c43f0d63>] device_release_driver_internal+0x13c/0x1f0
[<00000000e8ad57b6>] bus_remove_device+0xe9/0x160
[<00000000713a7b8a>] device_del+0x120/0x320
[<00000000e5db670c>] __scsi_remove_device+0x115/0x150
[<00000000eccbef30>] scsi_forget_host+0x20/0x60
[<00000000cd5a0738>] scsi_remove_host+0x6d/0x120
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The patch reverts changes done in qlt_schedule_sess_for_deletion() to
avoid spinlock recursion sess->vha->work_lock should be used instead
of ha->tgt.sess_lock, that can be locked in callers: qlt_reset() or
qlt_handle_login()
[mkp: roll in build warning reported by sfr]
Fixes: 1c6cacf4ea ("scsi: qla2xxx: Fixup locking for session deletion")
Cc: <stable@vger.kernel.org> #v4.17
Signed-off-by: Mikhail Malygin <m.malygin@yadro.com>
Reported-by: Mikhail Malygin <m.malygin@yadro.com>
Tested-by: Mikhail Malygin <m.malygin@yadro.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver does both wmb() and writel(). The latter already has a barrier
on some architectures like arm64. This ends up with CPU observing two
barriers back to back before executing the register write.
Drivers should generally assume that the barrier implied by writel() is
sufficient for ordering DMA. Remove the extraneous wmb() before it.
[mkp: Squashed Arnd's and Sinan's patches]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Sinan Kaya <okaya@codeaurora.org>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Generally target core and TCMUser seem to work fine for tape devices and
media changers. But there is at least one situation where TCMUser is not
able to support sequential access device emulation correctly.
The situation is when an initiator sends a SCSI READ CDB with a length that
is greater than the length of the tape block to read. We can distinguish
two subcases:
A) The initiator sent the READ CDB with the SILI bit being set.
In this case the sequential access device has to transfer the data from
the tape block (only the length of the tape block) and transmit a good
status. The current interface between TCMUser and the userspace does
not support reduction of the read data size by the userspace program.
The patch below fixes this subcase by allowing the userspace program to
specify a reduced data size in read direction.
B) The initiator sent the READ CDB with the SILI bit not being set.
In this case the sequential access device has to transfer the data from
the tape block as in A), but additionally has to transmit CHECK
CONDITION with the ILI bit set and NO SENSE in the sensebytes. The
information field in the sensebytes must contain the residual count.
With the below patch a user space program can specify the real read data
length and appropriate sensebytes. TCMUser then uses the se_cmd flag
SCF_TREAT_READ_AS_NORMAL, to force target core to transmit the real data
size and the sensebytes. Note: the flag SCF_TREAT_READ_AS_NORMAL is
introduced by Lee Duncan's patch "[PATCH v4] target: transport should
handle st FM/EOM/ILI reads" from Tue, 15 May 2018 18:25:24 -0700.
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ctxdmas for cursors from all heads are setup in the core channel, and due
to us tracking allocated handles per-window, we were failing with -EEXIST
on multiple-head setups trying to allocate duplicate handles.
The cursor code is hardcoded to use the core channel vram ctxdma already,
so just skip ctxdma allocation for cursor fbs to fix the issue.
Fixes: 5bca1621c0 ("drm/nouveau/kms/nv50-: move fb ctxdma tracking into windows")
Reported-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pull jfs fix from Dave Kleikamp:
"This fixes a too-small allocation in the xattr code"
* tag 'jfs-4.18' of git://github.com/kleikamp/linux-shaggy:
jfs: Fix inconsistency between memory allocation and ea_buf->max_size
Pull s390 updates from Martin Schwidefsky:
"common I/O layer
- Fix bit-fields crossing storage-unit boundaries in css_general_char
dasd driver
- Avoid a sparse warning in regard to the queue lock
- Allocate the struct dasd_ccw_req as per request data. Only for
internal I/O is the structure allocated separately
- Remove the unused function dasd_kmalloc_set_cda
- Save a few bytes in struct dasd_ccw_req by reordering fields
- Convert remaining users of dasd_kmalloc_request to
dasd_smalloc_request and remove the now unused function
vfio/ccw
- Refactor and improve pfn_array_alloc_pin/pfn_array_pin
- Add a new tracepoint for failed vfio/ccw requests
- Add a CCW translation improvement to accept more requests as valid
- Bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/dasd: only use preallocated requests
s390/dasd: reshuffle struct dasd_ccw_req
s390/dasd: remove dasd_kmalloc_set_cda
s390/dasd: move dasd_ccw_req to per request data
s390/dasd: simplify locking in process_final_queue
s390/cio: sanitize css_general_characteristics definition
vfio: ccw: add tracepoints for interesting error paths
vfio: ccw: set ccw->cda to NULL defensively
vfio: ccw: refactor and improve pfn_array_alloc_pin()
vfio: ccw: shorten kernel doc description for pfn_array_pin()
vfio: ccw: push down unsupported IDA check
vfio: ccw: fix error return in vfio_ccw_sch_event
s390/archrandom: Rework arch random implementation.
s390/net: add pnetid support
The patch fixed a W=1 warning but broke the ia64 build:
CC mm/memblock.o
mm/memblock.c:1340: error: redefinition of `memblock_virt_alloc_try_nid_raw'
./include/linux/bootmem.h:335: error: previous definition of `memblock_virt_alloc_try_nid_raw' was here
Because inlcude/linux/bootmem.h says
#if defined(CONFIG_HAVE_MEMBLOCK) && defined(CONFIG_NO_BOOTMEM)
whereas mm/Makefile says
obj-$(CONFIG_HAVE_MEMBLOCK) += memblock.o
So revert 26f09e9b3a ("mm/memblock: add missing include
<linux/bootmem.h>") while a full fix can be worked on.
Fixes: 26f09e9b3a ("mm/memblock: add missing include <linux/bootmem.h>")
Reported-by: Tony Luck <tony.luck@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow the code which provides extensions to support direct assignment
of Intel IGD (GVT-d) to be compiled out of the kernel if desired. The
config option for this was previously automatically enabled on X86,
therefore the default remains Y. This simply provides the option to
disable it even for X86.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The erratum and workaround are described by BCM5300X-ES300-RDS.pdf as
below.
R10: PCIe Transactions Periodically Fail
Description: The BCM5300X PCIe does not maintain transaction ordering.
This may cause PCIe transaction failure.
Fix Comment: Add a dummy PCIe configuration read after a PCIe
configuration write to ensure PCIe configuration access
ordering. Set ES bit of CP0 configu7 register to enable
sync function so that the sync instruction is functional.
Resolution: hndpci.c: extpci_write_config()
hndmips.c: si_mips_init()
mipsinc.h CONF7_ES
This is fixed by the CFE MIPS bcmsi chipset driver also for BCM47XX.
Also the dummy PCIe configuration read is already implemented in the
Linux BCMA driver.
Enable ExternalSync in Config7 when CONFIG_BCMA_DRIVER_PCI_HOSTMODE=y
too so that the sync instruction is externalised.
Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp>
Reviewed-by: Paul Burton <paul.burton@mips.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: Rafał Miłecki <zajec5@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/19461/
Signed-off-by: James Hogan <jhogan@kernel.org>
Fixes the following sparse warning:
drivers/ata/ahci_mvebu.c:85:5: warning:
symbol 'ahci_mvebu_stop_engine' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Currently smatch warns of possible Spectre-V1 issue in ahci_led_store():
drivers/ata/libahci.c:1150 ahci_led_store() warn: potential spectre issue 'pp->em_priv' (local cap)
Userspace controls @pmp from following callchain:
em_message->store()
->ata_scsi_em_message_store()
-->ap->ops->em_store()
--->ahci_led_store()
After the mask+shift @pmp is effectively an 8b value, which is used to
index into an array of length 8, so sanitize the array index.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Run the completer task to post a work completion after processing
a memory registration or invalidate work request. This covers the
case where the memory registration or invalidate was the last work
request posted to the qp.
Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com>
Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
On some Mali-DP processors, the LAYER_FORMAT register contains fields
other than the format. These bits were unconditionally cleared when
setting the pixel format, whereas they should be preserved at their
reset values.
Reported-by: Brian Starkey <brian.starkey@arm.com>
Reported-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Ayan Kumar halder <ayan.halder@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
In the situation that DE and SE aren’t shared the same interrupt number,
the Global SE interrupts mask bit MASK_IRQ_EN in MASKIRQ must be set, or
else other mask bits will not work and no SE interrupt will occur. This
patch enables MASK_IRQ_EN for SE to fix this problem.
Signed-off-by: Alison Wang <alison.wang@nxp.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
One needs to ensure that the crtcs are shutdown so that the
drm_crtc_state->connector_mask reflects that no connectors
are currently active. Further, it reduces the reference
count for each connector. This ensures that the connectors
and encoders can be cleanly removed either when _unbind
is called for the corresponding drivers or by
drm_mode_config_cleanup().
We need drm_atomic_helper_shutdown() to be called before
component_unbind_all() otherwise the connectors attached to the
component device will have the wrong reference count value and will not
be cleanly removed.
Signed-off-by: Ayan Kumar Halder <ayan.halder@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
This patch fixes the below parser error of the IOB SLOW PMU.
# perf stat -a -e iob-slow0/cycle-count/ sleep 1
evenf syntax error: 'iob-slow0/cycle-count/'
\___ parser error
It replaces the "-" character by "_" character inside the PMU name.
Signed-off-by: Hoan Tran <hoan.tran@amperecomputing.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Fix PCIe controller interrupt to use IRQ_TYPE_LEVEL_HIGH for Broadcom
Cygnus SoC
Fixes: cd590b50a9 ("ARM: dts: enable PCIe support for Cygnus")
Fixes: f6b889358a ("ARM: dts: Enable MSI support for Broadcom Cygnus")
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
The i2c controller should be using IRQ_TYPE_LEVEL_HIGH, fix that.
Fixes: bb097e3e00 ("ARM: dts: BCM5301X: Add I2C support to the DT")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
The i2c and PCIe controllers had an incorrect type which should have
been set to IRQ_TYPE_LEVEL_HIGH, fix that.
Fixes: b9099ec754 ("ARM: dts: Add Broadcom Hurricane 2 DTS include file")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
The interrupts for the PCIe controllers should all be of type
IRQ_TYPE_LEVEL_HIGH instead of IRQ_TYPE_NONE.
Fixes: d71eb94120 ("ARM: dts: NSP: Add MSI support on PCI")
Fixes: 522199029f ("ARM: dts: NSP: Fix PCIE DT issue")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
The i2c controller should use IRQ_TYPE_LEVEL_HIGH instead of
IRQ_TYPE_NONE.
Fixes: 0f9f27a36d ("ARM: dts: NSP: Add I2C support to the DT")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Update the MAINTAINERS file to cover the "stingray" pattern and a few
files under arch/arm64/boot/dts/broadcom/* as well as the clock driver
and binding.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
While moving the Northstar 2 DTS into a dedicated directory, the
corresponding MAINTAINERS file entry was not updated accordingly, fix
that.
Fixes: 63a913c157 ("arm64: dts: move ns2 into northstar2 directory")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Delete RUN_TESTS and EMIT_TESTS overrides and use common defines in
lib.mk. Common defines work just fine and there is no need to define
custom overrides.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Reviewed-by: Tom Hromatka <tom.hromatka@oracle.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
sparc64 test fails with the following errors on non-sparc64 systems. Fix
the Makefile to do nothing on non-sparc64 systems to suppress the errors:
make run_tests
adi-test.c: Assembler messages:
adi-test.c:302: Error: no such instruction: `rd %tick,%r13'
adi-test.c:304: Error: no such instruction: `rd %tick,%rbp'
adi-test.c:190: Error: no such instruction: `rd %tick,%rbp'
adi-test.c:192: Error: no such instruction: `rd %tick,%rdx'
adi-test.c:273: Error: no such instruction: `rd %tick,%rbx'
adi-test.c:276: Error: no such instruction: `rd %tick,%rdx'
adi-test.c:217: Error: no such instruction: `rd %tick,%rbp'
adi-test.c:220: Error: no such instruction: `rd %tick,%rdx'
adi-test.c:79: Error: no such instruction: `rd %tick,%rax'
adi-test.c:79: Error: no such instruction: `rd %tick,%rax'
adi-test.c:79: Error: no such instruction: `rd %tick,%rax'
adi-test.c:79: Error: no such instruction: `rd %tick,%rax'
adi-test.c:246: Error: no such instruction: `rd %tick,%rbp'
adi-test.c:248: Error: no such instruction: `rd %tick,%rdx'
adi-test.c:79: Error: no such instruction: `rd %tick,%rax'
adi-test.c:79: Error: no such instruction: `rd %tick,%rax'
adi-test.c:79: Error: no such instruction: `rd %tick,%rax'
<builtin>: recipe for target 'adi-test' failed
make[1]: *** [adi-test] Error 1
adi: [FAIL]
./drivers_test.sh: 24: ./drivers_test.sh: ./adi-test: not found
../lib.mk:73: recipe for target 'run_tests' failed
make: *** [run_tests] Error 1
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Reviewed-by: Tom Hromatka <tom.hromatka@oracle.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Unless the software synchronization objects (CONFIG_SW_SYNC) is enabled,
the sync test will be skipped:
TAP version 13
1..0 # Skipped: Sync framework not supported by kernel
Add a config fragment file to be able to run "make kselftest-merge" to
enable relevant configuration required in order to run the sync test.
Signed-off-by: Fathi Boudra <fathi.boudra@linaro.org>
Link: https://lkml.org/lkml/2017/5/5/14
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
When vm test is skipped because of unmet dependencies and/or unsupported
configuration, it exits with error which is treated as a fail by the
Kselftest framework. This leads to false negative result even when the
test could not be run.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
When zram test is skipped because of unmet dependencies and/or
unsupported configuration, it exits with error which is treated as
a fail by the Kselftest framework. This leads to false negative result
even when the test could not be run.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
When user test is skipped because of unmet dependencies and/or
unsupported configuration, it exits with error which is treated as
a fail by the Kselftest framework. This leads to false negative result
even when the test could not be run.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run. Add an explicit check
for module presence and return skip code if module isn't present.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
When sysctl test is skipped because of unmet dependencies and/or
unsupported configuration, it exits with error which is treated as
a fail by the Kselftest framework. This leads to false negative result
even when the test could not be run.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run.
Changed return code to kselftest skip code in skip error legs that check
requirements and module probe test error leg.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
When static_keys test is skipped because of unmet dependencies and/or
unsupported configuration, it exits with error which is treated as a fail
by the Kselftest framework. This leads to false negative result even when
the test could not be run.
Change it to return kselftest skip code when a test gets skipped to clearly
report that the test could not be run.
Added an explicit searches for test_static_key_base and test_static_keys
modules and return skip code if they aren't found to differentiate between
the failure to load the module condition and module not found condition.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
When pstore_post_reboot test gets skipped because of unmet dependencies
and/or unsupported configuration, it returns 0 which is treated as a pass
by the Kselftest framework. This leads to false positive result even when
the test could not be run.
Change it to return kselftest skip code when a test gets skipped to clearly
report that the test could not be run.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
The helper module would be unloaded after nf_conntrack_helper_unregister,
so it may cause a possible panic caused by race.
nf_ct_iterate_destroy(unhelp, me) reset the helper of conntrack as NULL,
but maybe someone has gotten the helper pointer during this period. Then
it would panic, when it accesses the helper and the module was unloaded.
Take an example as following:
CPU0 CPU1
ctnetlink_dump_helpinfo
helper = rcu_dereference(help->helper);
unhelp
set helper as NULL
unload helper module
helper->to_nlattr(skb, ct);
As above, the cpu0 tries to access the helper and its module is unloaded,
then the panic happens.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
It is a waste of memory to use a full "struct netns_sysctl_ipv6"
while only one pointer is really used, considering netns_sysctl_ipv6
keeps growing.
Also, since "struct netns_frags" has cache line alignment,
it is better to move the frags_hdr pointer outside, otherwise
we spend a full cache line for this pointer.
This saves 192 bytes of memory per netns.
Fixes: c038a767cd ("ipv6: add a new namespace for nf_conntrack_reasm")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
On this system EC interrupt triggers constantly kicking devices out of
low power states and thus blocking power management. The system also has
a PCIe root port hosting Alpine Ridge Thunderbolt controller and it
never gets a chance to go to D3cold because of this.
Since the power button works the same regardless if EC interrupt is
enabled or not during s2idle, add a quirk for this machine that sets
ec_no_wakeup=true preventing spurious wakeups.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In ISOC OUT transfer, when the OUT token received while EP disabled,
we shouldn't complete a usb request. The current flow completed one
usb request, this will lead to a packet drop to function driver.
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Commit fe8abf332b ("usb: dwc3: support clocks and resets for DWC3 core")
adds support for handling clocks and resets in the DWC3 core, so that for
platforms following the standard devicetree bindings this does not need
to be duplicated in all the different glue layers.
These changes intended for devicetree based platforms introduce an
uncoditional clk_bulk_get() in the core probe path. This leads to the
following error being logged on x86/ACPI systems:
[ 26.276783] dwc3 dwc3.3.auto: Failed to get clk 'ref': -2
This commits wraps the clk_bulk_get() in an if (dev->of_node) check so
that it only is done on devicetree instantiated devices, fixing this
error.
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In ISOC transfer, when the NAK interrupt happens, we shouldn't complete
a usb request, the current flow will complete one usb request with no
hardware transfer, this will lead to a packet drop on the usb bus.
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Zeng Tao <prime.zeng@hisilicon.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The clocks have already been explicitly disabled and put as part of
remove() so the runtime suspend callback must not be run when balancing
the runtime PM usage count before returning.
Fixes: 16adc674d0 ("usb: dwc3: add generic OF glue layer")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In case of requests queue is empty reset EP target_frame to
initial value.
This allow restarting ISOC traffic in case when function
driver queued requests with interruptions.
Tested-by: Zeng Tao <prime.zeng@hisilicon.com>
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In case *vif* is NULL at 655: if (!vif), the execution path jumps to
label out, where *vif* is dereferenced at 679:
if (vif->sta_state == QTNF_STA_CONNECTING)
Fix this by immediately returning when *vif* is NULL instead of
jumping to label out.
Addresses-Coverity-ID: 1469567 ("Dereference after null check")
Fixes: 480daa9cb6 ("qtnfmac: fix invalid STA state on EAPOL failure")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quanenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This reverts commit 2c17a4368a.
The offending commit triggers a run-time fault when accessing the panel
element of the sun4i_tcon structure when no such panel is attached.
It was apparently assumed in said commit that a panel is always used with
the TCON. Although it is often the case, this is not always true.
For instance a bridge might be used instead of a panel.
This issue was discovered using an A13-OLinuXino, that uses the TCON
in RGB mode for a simple DAC-based VGA bridge.
Cc: stable@vger.kernel.org
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180613081647.31183-1-paul.kocialkowski@bootlin.com
Silicon Labs defines alternative VID/PID pairs for some chips that when
used will automatically install drivers for Windows users without manual
intervention. Unfortunately, these IDs are not recognized by the Linux
module, so using these IDs improves user experience on one platform but
degrades it on Linux. This patch addresses this problem.
Signed-off-by: Karoly Pados <pados@pados.hu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
With gcc 4.1.2:
drivers/pinctrl/actions/pinctrl-owl.c: In function ‘owl_pin_config_set’:
drivers/pinctrl/actions/pinctrl-owl.c:336: warning: ‘ret’ may be used uninitialized in this function
Indeed, if num_configs is zero, the uninitialized value will be returned
as an error code.
Fix this by preinitializing it to zero.
Fixes: 2242ddfbf4 ("pinctrl: actions: Add Actions S900 pinctrl driver")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Currently saved_vals is being allocated and there is no check for
failed allocation (which is more likely than normal when using
GFP_ATOMIC). Fix this by checking for a failed allocation and
propagating this error return down the the caller chain.
Detected by CoverityScan, CID#1469841 ("Dereference null return value")
Fixes: 88a1dbdec6 ("pinctrl: pinctrl-single: Add functions to save and restore pinctrl context")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Commit b89405b610 ("pinctrl: devicetree: Fix dt_to_map_one_config
handling of hogs") causes the pinctrl hog pins to not get initialized
on i.MX platforms leaving them with the IOMUX settings untouched.
This causes several regressions on i.MX such as:
- OV5640 camera driver can not be probed anymore on imx6qdl-sabresd
because the camera clock pin is in a pinctrl_hog group and since
its pinctrl initialization is skipped, the camera clock is kept
in GPIO functionality instead of CLK_CKO function.
- Audio stopped working on imx6qdl-wandboard and imx53-qsb for
the same reason.
Richard Fitzgerald explains the problem:
"I see the bug. If the hog node isn't a 1st level child of the pinctrl
parent node it will go around the for(;;) loop again but on the first
pass I overwrite pctldev with the result of
get_pinctrl_dev_from_of_node() so it doesn't point to the pinctrl driver
any more."
Fix the issue by stashing the original pctldev so it doesn't
get overwritten.
Fixes: b89405b610 ("pinctrl: devicetree: Fix dt_to_map_one_config handling of hogs")
Cc: <stable@vger.kernel.org>
Reported-by: Mika Penttilä <mika.penttila@nextfour.com>
Reported-by: Steve Longerbeam <slongerbeam@gmail.com>
Suggested-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Remove unneeded error handling on the result of a call
to platform_get_resource() when the value is passed to
devm_ioremap_resource().
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull cifs fixes from Steve French:
"Misc SMB3 fixes, including particularly important ones for signing,
some minor documentation and debug improvements and another posix
smb3.11 fix"
* tag '4.18-rc1-more-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix invalid check in __cifs_calc_signature()
cifs: Use correct packet length in SMB2_TRANSFORM header
smb3: fix corrupt path in subdirs on smb311 with posix
smb3: do not display empty interface list
smb3: Fix mode on mkdir on smb311 mounts
cifs: Fix kernel oops when traceSMB is enabled
CIFS: dump every session iface info
CIFS: parse and store info on iface queries
CIFS: add iface info to struct cifs_ses
CIFS: complete PDU definitions for interface queries
CIFS: move default port definitions to cifsglob.h
cifs: Fix encryption/signing
cifs: update __smb_send_rqst() to take an array of requests
cifs: remove smb2_send_recv()
cifs: push rfc1002 generation down the stack
smb3: increase initial number of credits requested to allow write
cifs: minor documentation updates
cifs: add lease tracking to the cached root fid
smb3: note that smb3.11 posix extensions mount option is experimental
Pull dmi update from Jean Delvare:
"Expose SKU ID string as a DMI attribute"
* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi: Add access to the SKU ID string
This fixes this documentation build error that is due to a file rename:
Error: Cannot open file ../arch/x86/kernel/cpu/mtrr/main.c
Fixes: 0afe832e55 ("Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The kernel's ext4 mount-time checks were more permissive than
e2fsprogs's libext2fs checks when opening a file system. The
superblock is considered too insane for debugfs or e2fsck to operate
on it, the kernel has no business trying to mount it.
This will make file system fuzzing tools work harder, but the failure
cases that they find will be more useful and be easier to evaluate.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
This is used in some systems from user space for determining the identity
of the device.
Expose this as a file so that that user-space tools don't need to read
from /sys/firmware/dmi/tables/DMI
Signed-off-by: Simon Glass <sjg@chromium.org>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Correct MIPI/PCIe/USB_HSIC's PGC offset based on
design RTL, the values in the Reference Manual
(Rev. 1, 01/2018 and the older ones) are incorrect.
The correct offset values should be as below:
0x800 ~ 0x83F: PGC for core0 of A7 platform;
0x840 ~ 0x87F: PGC for core1 of A7 platform;
0x880 ~ 0x8BF: PGC for SCU of A7 platform;
0xA00 ~ 0xA3F: PGC for fastmix/megamix;
0xC00 ~ 0xC3F: PGC for MIPI PHY;
0xC40 ~ 0xC7F: PGC for PCIe_PHY;
0xC80 ~ 0xCBF: PGC for USB OTG1 PHY;
0xCC0 ~ 0xCFF: PGC for USB OTG2 PHY;
0xD00 ~ 0xD3F: PGC for USB HSIC PHY;
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Fixes: 03aa12629f ("soc: imx: Add GPCv2 power gating driver")
Acked-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The commentary says to use various parameters, and lays out what
the mapping is... The code used a 32KHz rate when the comment
says that it needs to use a 48KHz rate. And this has been the
case since day one.
On the Alienware M17x R4, the DMic used to have exceptionally quiet
pickup and a lot of noise. Changing the data rate fixes both of
these issues.
Searching the kernel bug tracker for ca0132-related issues shows no
mention of this being an issue for other hardware, and I have no
other hardware to test with, so a quirk is used to limit the effect
to just the M17x R4.
Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit 009b8f979b conditionalized
adding the "CA0132 Analog Mic-In2" PCM with a comment to the
effect that, "desktops don't use this ADC", but the test was set
up such that the ADC was only created for desktops. Invert the
test.
Fixes: 009b8f979b ("ALSA: hda/ca0132: update core functions for sbz + r3di")
Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
QUIRK_NONE is, quite explicitly, the default case. The entire
point of a quirks system is to allow "programming by difference"
from a given base case, which requires that merely defining a new
quirk for some piece of hardware should not change the behavior of
the driver for that hardware. In turn, this means that testing
for QUIRK_NONE explicitly is a violation of that implicit contract.
Change a test for QUIRK_NONE and QUIRK_ALIENWARE to default, and
add a test for QUIRK_SBZ to disable the default behavior in that
instance.
Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit e93ac30a32 (ALSA: HDA/ca0132:
add extra init functions for r3di + sbz) introduced an extra
initialization function that was improperly guarded, taking effect
on systems with QUIRK_ALIENWARE, even though such systems were
supposedly not affected.
It may be that this piece of initialization should be done for all
systems, but that's not a call that I can make.
Fixes: e93ac30a32 ("ALSA: HDA/ca0132: add extra init functions for r3di + sbz")
Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
During ca0132_init(), ca0132_init_unsol() is run before the
spec->spec_init_verbs are written. ca0132_init_unsol() calls
snd_hda_jack_detect_enable_callback(), which requests UNSOL events
for three or four nodes, two of which were also (redundantly)
requested by spec_init_verbs.
Kill the redundant AC_VERB_SET_UNSOLICITED_ENABLE verbs.
Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
ca0132_config() was setting some values in the auto_pin_cfg for
the codec... but it is called prior to snd_hda_parse_pin_defcfg(),
which does a memset() to clear the entire structure as one of its
first actions, making the entire exercise pointless.
Kill all use of struct auto_pin_cfg from ca0132_config().
Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Some Lenovo laptops, e.g. Lenovo P50, showed the pop noise at resume
or runtime resume. It turned out to be reduced by applying
alc_no_shutup() just like TPT440 quirk does.
Since there are many Lenovo models showing the same behavior, put this
workaround in ALC269_FIXUP_THINKPAD_ACPI entry so that it's applied
commonly to all such Lenovo machines.
Reported-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Benjamin Berg <bberg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
According to the reference manual the shp_2_mcu / mcu_2_shp
scripts must be used for devices connected through the SPBA.
This fixes an issue we saw with DMA transfers.
Sometimes the SPI controller RX FIFO was not empty after a DMA
transfer and the driver got stuck in the next PIO transfer when
it read one word more than expected.
commit dd4b487b32 ("ARM: dts: imx6: Use correct SDMA script
for SPI cores") is fixing the same issue but only for SPI1 - 4.
Fixes: 677940258d ("ARM: dts: imx6q: enable dma for ecspi5")
Signed-off-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
If there is a directory entry pointing to a system inode (such as a
journal inode), complain and declare the file system to be corrupted.
Also, if the superblock's first inode number field is too small,
refuse to mount the file system.
This addresses CVE-2018-10882.
https://bugzilla.kernel.org/show_bug.cgi?id=200069
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Use a separate journal transaction if it turns out that we need to
convert an inline file to use an data block. Otherwise we could end
up failing due to not having journal credits.
This addresses CVE-2018-10883.
https://bugzilla.kernel.org/show_bug.cgi?id=200071
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Do not set the b_modified flag in block's journal head should not
until after we're sure that jbd2_journal_dirty_metadat() will not
abort with an error due to there not being enough space reserved in
the jbd2 handle.
Otherwise, future attempts to modify the buffer may lead a large
number of spurious errors and warnings.
This addresses CVE-2018-10883.
https://bugzilla.kernel.org/show_bug.cgi?id=200071
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
When blackhole is used on top of classful qdisc like hfsc it breaks
qlen and backlog counters because packets are disappear without notice.
In HFSC non-zero qlen while all classes are inactive triggers warning:
WARNING: ... at net/sched/sch_hfsc.c:1393 hfsc_dequeue+0xba4/0xe90 [sch_hfsc]
and schedules watchdog work endlessly.
This patch return __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS,
this flag tells upper layer: this packet is gone and isn't queued.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit cc66b30382 ("hwmon: (nct6775) Rework temperature source and label
handling") changed a loop limit from "data->temp_label_num - 1" to "32",
as part of moving from a string array to a bit mask. This results in the
following error, reported by UBSAN.
UBSAN: Undefined behaviour in drivers/hwmon/nct6775.c:4179:27
shift exponent 32 is too large for 32-bit type 'long unsigned int'
Similar to the original loop, the limit has to be one less than the
number of bits.
Fixes: cc66b30382 ("hwmon: (nct6775) Rework temperature source and label handling")
Reported-by: Paul Menzel <pmenzel+linux-hwmon@molgen.mpg.de>
Cc: Paul Menzel <pmenzel+linux-hwmon@molgen.mpg.de>
Tested-by: Paul Menzel <pmenzel+linux-hwmon@molgen.mpg.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Calling fan related SMM functions implemented by Dell BIOS firmware on Dell
XPS13 9333 freeze kernel for about 500ms. Until Dell fixes it we need to
disable fan support for Dell XPS13 9333.
Via "force" module param fan support can be enabled.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195751
Signed-off-by: Helge Eichelberg <kernelorg@elchenberg.name>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
This breaks the build as this header is not meant to be used in this
way.
./include/linux/unaligned/access_ok.h:8:28: error: redefinition of ‘get_unaligned_le16’
static __always_inline u16 get_unaligned_le16(const void *p)
^~~~~~~~~~~~~~~~~~
In file included from drivers/bluetooth/hci_nokia.c:32:
./include/linux/unaligned/le_struct.h:7:19: note: previous definition of ‘get_unaligned_le16’ was here
static inline u16 get_unaligned_le16(const void *p)
Use asm/unaligned.h instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
ATM accounts for in-flight TX packets in sk_wmem_alloc of the VCC on
which they are to be sent. But it doesn't take ownership of those
packets from the sock (if any) which originally owned them. They should
remain owned by their actual sender until they've left the box.
There's a hack in pskb_expand_head() to avoid adjusting skb->truesize
for certain skbs, precisely to avoid messing up sk_wmem_alloc
accounting. Ideally that hack would cover the ATM use case too, but it
doesn't — skbs which aren't owned by any sock, for example PPP control
frames, still get their truesize adjusted when the low-level ATM driver
adds headroom.
This has always been an issue, it seems. The truesize of a packet
increases, and sk_wmem_alloc on the VCC goes negative. But this wasn't
for normal traffic, only for control frames. So I think we just got away
with it, and we probably needed to send 2GiB of LCP echo frames before
the misaccounting would ever have caused a problem and caused
atm_may_send() to start refusing packets.
Commit 14afee4b60 ("net: convert sock.sk_wmem_alloc from atomic_t to
refcount_t") did exactly what it was intended to do, and turned this
mostly-theoretical problem into a real one, causing PPPoATM to fail
immediately as sk_wmem_alloc underflows and atm_may_send() *immediately*
starts refusing to allow new packets.
The least intrusive solution to this problem is to stash the value of
skb->truesize that was accounted to the VCC, in a new member of the
ATM_SKB(skb) structure. Then in atm_pop_raw() subtract precisely that
value instead of the then-current value of skb->truesize.
Fixes: 158f323b98 ("net: adjust skb->truesize in pskb_expand_head()")
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Tested-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2018-06-16
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix a panic in devmap handling in generic XDP where return type
of __devmap_lookup_elem() got changed recently but generic XDP
code missed the related update, from Toshiaki.
2) Fix a freeze when BPF progs are loaded that include BPF to BPF
calls when JIT is enabled where we would later bail out via error
path w/o dropping kallsyms, and another one to silence syzkaller
splats from locking prog read-only, from Daniel.
3) Fix a bug in test_offloads.py BPF selftest which must not assume
that the underlying system have no BPF progs loaded prior to test,
and one in bpftool to fix accuracy of program load time, from Jakub.
4) Fix a bug in bpftool's probe for availability of the bpf(2)
BPF_TASK_FD_QUERY subcommand, from Yonghong.
5) Fix a regression in AF_XDP's XDP_SKB receive path where queue
id check got erroneously removed, from Björn.
6) Fix missing state cleanup in BPF's xfrm tunnel test, from William.
7) Check tunnel type more accurately in BPF's tunnel collect metadata
kselftest, from Jian.
8) Fix missing Kconfig fragments for BPF kselftests, from Anders.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When expanding the extra isize space, we must never move the
system.data xattr out of the inode body. For performance reasons, it
doesn't make any sense, and the inline data implementation assumes
that system.data xattr is never in the external xattr block.
This addresses CVE-2018-10880
https://bugzilla.kernel.org/show_bug.cgi?id=200005
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
The following check would never evaluate to true:
> if (i == 0 && iov[0].iov_len <= 4)
Because 'i' always starts at 1.
This patch fixes it and also move the header checks outside the for loop
- which makes more sense.
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
In smb3_init_transform_rq(), 'orig_len' was only counting the request
length, but forgot to count any data pages in the request.
Writing or creating files with the 'seal' mount option was broken.
In addition, do some code refactoring by exporting smb2_rqst_len() to
calculate the appropriate packet size and avoid duplicating the same
calculation all over the code.
The start of the io vector is either the rfc1002 length (4 bytes) or a
SMB2 header which is always > 4. Use this fact to check and skip the
rfc1002 length if requested.
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Commit 67f29e07e1 ("bpf: devmap introduce dev_map_enqueue") changed
the return value type of __devmap_lookup_elem() from struct net_device *
to struct bpf_dtab_netdev * but forgot to modify generic XDP code
accordingly.
Thus generic XDP incorrectly used struct bpf_dtab_netdev where struct
net_device is expected, then skb->dev was set to invalid value.
v2:
- Fix compiler warning without CONFIG_BPF_SYSCALL.
Fixes: 67f29e07e1 ("bpf: devmap introduce dev_map_enqueue")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Daniel Borkmann says:
====================
First one is a panic I ran into while testing the second
one where we got several syzkaller reports. Series here
fixes both.
Thanks!
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We currently lock any JITed image as read-only via bpf_jit_binary_lock_ro()
as well as the BPF image as read-only through bpf_prog_lock_ro(). In
the case any of these would fail we throw a WARN_ON_ONCE() in order to
yell loudly to the log. Perhaps, to some extend, this may be comparable
to an allocation where __GFP_NOWARN is explicitly not set.
Added via 65869a47f3 ("bpf: improve read-only handling"), this behavior
is slightly different compared to any of the other in-kernel set_memory_ro()
users who do not check the return code of set_memory_ro() and friends /at
all/ (e.g. in the case of module_enable_ro() / module_disable_ro()). Given
in BPF this is mandatory hardening step, we want to know whether there
are any issues that would leave both BPF data writable. So it happens
that syzkaller enabled fault injection and it triggered memory allocation
failure deep inside x86's change_page_attr_set_clr() which was triggered
from set_memory_ro().
Now, there are two options: i) leaving everything as is, and ii) reworking
the image locking code in order to have a final checkpoint out of the
central bpf_prog_select_runtime() which probes whether any of the calls
during prog setup weren't successful, and then bailing out with an error.
Option ii) is a better approach since this additional paranoia avoids
altogether leaving any potential W+X pages from BPF side in the system.
Therefore, lets be strict about it, and reject programs in such unlikely
occasion. While testing I noticed also that one bpf_prog_lock_ro()
call was missing on the outer dummy prog in case of calls, e.g. in the
destructor we call bpf_prog_free_deferred() on the main prog where we
try to bpf_prog_unlock_free() the program, and since we go via
bpf_prog_select_runtime() do that as well.
Reported-by: syzbot+3b889862e65a98317058@syzkaller.appspotmail.com
Reported-by: syzbot+9e762b52dd17e616a7a5@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
While testing I found that when hitting error path in bpf_prog_load()
where we jump to free_used_maps and prog contained BPF to BPF calls
that were JITed earlier, then we never clean up the bpf_prog_kallsyms_add()
done under jit_subprogs(). Add proper API to make BPF kallsyms deletion
more clear and fix that.
Fixes: 1c2a088a66 ("bpf: x64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When converting from an inode from storing the data in-line to a data
block, ext4_destroy_inline_data_nolock() was only clearing the on-disk
copy of the i_blocks[] array. It was not clearing copy of the
i_blocks[] in ext4_inode_info, in i_data[], which is the copy actually
used by ext4_map_blocks().
This didn't matter much if we are using extents, since the extents
header would be invalid and thus the extents could would re-initialize
the extents tree. But if we are using indirect blocks, the previous
contents of the i_blocks array will be treated as block numbers, with
potentially catastrophic results to the file system integrity and/or
user data.
This gets worse if the file system is using a 1k block size and
s_first_data is zero, but even without this, the file system can get
quite badly corrupted.
This addresses CVE-2018-10881.
https://bugzilla.kernel.org/show_bug.cgi?id=200015
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
enable_best_rng() is used in hwrng_unregister() to switch away from the
currently active RNG, if that is the one currently being removed.
However enable_best_rng() might fail, if the next RNG's init routine
fails. In that case enable_best_rng() will return an error code and
the currently active RNG will remain active.
After unregistering this might lead to crashes due to use-after-free.
Fix this by dropping the currently active RNG, if enable_best_rng()
failed. This will result in no RNG to be active, if the next-best
one failed to initialize.
This problem was introduced by 142a27f0a7
Fixes: 142a27f0a7 ("hwrng: core - Reset user selected rng by...")
Reported-by: Wirz <spam@lukas-wirz.de>
Tested-by: Wirz <spam@lukas-wirz.de>
Signed-off-by: Michael Büsch <m@bues.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We call chtls_free_skb() but then we dereference it on the next lines.
Also "skb" can't be NULL, we just dereferenced it on the line before.
I have moved the free down a couple lines to fix this issue.
Fixes: 17a7d24aa8 ("crypto: chtls - generic handling of data and hdr")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If server does not support listing interfaces then do not
display empty "Server interfaces" line to avoid confusing users.
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Aurelien Aptel <aaptel@suse.com>
Since the rfc1002 generation was moved down to __smb_send_rqst(),
the transform header is now in rqst->rq_iov[0].
Correctly assign the transform header pointer in crypt_message().
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Now that we have the plumbing to pass request without an rfc1002
header all the way down to the point we write to the socket we no
longer need the smb2_send_recv() function.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Move the generation of the 4 byte length field down the stack and
generate it immediately before we start writing the data to the socket.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Compared to other clients the Linux smb3 client ramps up
credits very slowly, taking more than 128 operations before a
maximum size write could be sent (since the number of credits
requested is only 2 per small operation, causing the credit
limit to grow very slowly).
This lack of credits initially would impact large i/o performance,
when large i/o is tried early before enough credits are built up.
Signed-off-by: Steve French <stfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Use a read lease for the cached root fid so that we can detect
when the content of the directory changes (via a break) at which time
we close the handle. On next access to the root the handle will be reopened
and cached again.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Make the printting of bpf xfrm tunnel better and
cleanup xfrm state and policy when xfrm test finishes.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Jakub Kicinski says:
====================
This small series allows test_offload.py selftest to run on modern
distributions which may create BPF programs for cgroups at boot,
like Ubuntu 18.04. We still expect the program list to not be
altered by any other agent while the test is running, but no longer
depend on there being no BPF programs at all at the start.
Fixing the test revealed a small problem with bpftool, which doesn't
report the program load time very accurately. Because nanoseconds
were not taken into account reported load time would fluctuate by
1 second. First patch of the series takes care of fixing that.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Modern distroes increasingly make use of BPF programs. Default
Ubuntu 18.04 installation boots with a number of cgroup_skb
programs loaded.
test_offloads.py tries to check if programs and maps are not
leaked on error paths by confirming the list of programs on the
system is empty between tests.
Since we can no longer expect the system to have no BPF objects
at boot try to remember the programs and maps present at the start,
and skip those when scanning the system.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
BPF program load time is reported from the kernel relative to boot time.
If conversion to wall clock does not take nanosecond parts into account,
the load time reported by bpftool may differ by one second from run to
run. This means JSON object reported by bpftool for a program will
randomly change.
Fixes: 71bb428fe2 ("tools: bpf: add bpftool")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
If a device link is added via device_link_add() by the driver of the
link's consumer device, the supplier's runtime PM usage counter is
going to be dropped by the pm_runtime_put_suppliers() call in
driver_probe_device(). However, in that case it is not incremented
unless the supplier driver is already present and the link is not
stateless. That leads to a runtime PM usage counter imbalance for
the supplier device in a few cases.
To prevent that from happening, bump up the supplier runtime
PM usage counter in device_link_add() for all links with the
DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
time. Use pm_runtime_get_noresume() for that as the callers of
device_link_add() who want the supplier to be resumed by it are
expected to pass DL_FLAG_RPM_ACTIVE in flags to it anyway, but
additionally resume the supplier if the link is added during
consumer driver probe to retain the existing behavior for the
callers depending on it.
Fixes: 21d5c57b37 (PM / runtime: Use device links)
Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It is reported that commit a192aa923b (ACPI / LPSS: Consolidate
runtime PM and system sleep handling) introduced a system suspend
regression on some machines, but the only functional change made by
it was to cause the PM quirks in the LPSS to also be used during
system suspend and resume. While that should always work for
suspend-to-idle, it turns out to be problematic for S3
(suspend-to-RAM).
To address that issue restore the previous S3 suspend and resume
behavior of the LPSS to avoid applying PM quirks then.
Fixes: a192aa923b (ACPI / LPSS: Consolidate runtime PM and system sleep handling)
Link: https://bugs.launchpad.net/bugs/1774950
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Currently the code is split over various files with dma- prefixes in the
lib/ and drives/base directories, and the number of files keeps growing.
Move them into a single directory to keep the code together and remove
the file name prefixes. To match the irq infrastructure this directory
is placed under the kernel/ directory.
Signed-off-by: Christoph Hellwig <hch@lst.de>
We already have exact config symbols to select the direct, non-coherent,
or virt dma ops. So use the normal obj- scheme to select them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Pull perf/urgent fixes and improvements from Arnaldo Carvalho de Melo:
perf stat:
- Fix metric column header display alignment (Jiri Olsa)
- Improve error messages for default attributes, providing better output
for error in command lines such as:
$ perf stat -T
Cannot set up transaction events
event syntax error: '..cycles,cpu/cycles-t/,cpu/tx-start/,cpu/el-start/,cpu/cycles-ct/}'
\___ unknown term
Where the "event syntax error" line now appears (Jiri Olsa)
- Add --interval-clear option, to provide a 'watch' like printing (Jiri Olsa)
perf script:
- Show hw-cache events too (Seeteena Thoufeek)
perf c2c:
- Fix data dependency problem in layout of 'struct c2c_hist_entry', where
its member 'struct hist_entry' must be at the end because it has a ZLA
as its last member, that gets space when handling callchains (Jiri Olsa)
Core:
- We cannot assume that a 'struct perf_evsel' is to be obtained from a
container_of operation on a 'struct hists' as there are tools, such as
'perf c2c' that uses 'struct hist' instances without having them in
container structs that also have 'struct perf_evsel' in a particular
layout, so provide a different way of figuring out if a 'struct hists'
and 'struct hist_entry' have callchains (Arnaldo Carvalho de Melo)
- Fix error index in the PMU event parser, so that error messages can
point to the problematic token (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The bg_flags field in the block group descripts is only valid if the
uninit_bg or metadata_csum feature is enabled. We were not
consistently looking at this field; fix this.
Also block group #0 must never have uninitialized allocation bitmaps,
or need to be zeroed, since that's where the root inode, and other
special inodes are set up. Check for these conditions and mark the
file system as corrupted if they are detected.
This addresses CVE-2018-10876.
https://bugzilla.kernel.org/show_bug.cgi?id=199403
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
It's really bad when the allocation bitmaps and the inode table
overlap with the block group descriptors, since it causes random
corruption of the bg descriptors. So we really want to head those off
at the pass.
https://bugzilla.kernel.org/show_bug.cgi?id=199865
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
GCC built for arc*-*-linux has "-mmedium-calls" implicitly enabled by default
thus we don't see any problems during Linux kernel compilation.
----------------------------->8------------------------
arc-linux-gcc -mcpu=arc700 -Q --help=target | grep calls
-mlong-calls [disabled]
-mmedium-calls [enabled]
----------------------------->8------------------------
But if we try to use so-called Elf32 toolchain with GCC configured for
arc*-*-elf* then we'd see the following failure:
----------------------------->8------------------------
init/do_mounts.o: In function 'init_rootfs':
do_mounts.c:(.init.text+0x108): relocation truncated to fit: R_ARC_S21W_PCREL
against symbol 'unregister_filesystem' defined in .text section in fs/filesystems.o
arc-elf32-ld: final link failed: Symbol needs debug section which does not exist
make: *** [vmlinux] Error 1
----------------------------->8------------------------
That happens because neither "-mmedium-calls" nor "-mlong-calls" are enabled in
Elf32 GCC:
----------------------------->8------------------------
arc-elf32-gcc -mcpu=arc700 -Q --help=target | grep calls
-mlong-calls [disabled]
-mmedium-calls [disabled]
----------------------------->8------------------------
Now to make it possible to use Elf32 toolchain for building Linux kernel
we're explicitly add "-mmedium-calls" to CFLAGS.
And since we add "-mmedium-calls" to the global CFLAGS there's no point in
having per-file copies thus removing them.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
MHL bridge is usually connected to TV via MHL dongle. Currently plugging
HDMI cable to dongle is handled improperly.
Fix it by splitting connecting of a dongle and a HDMI cable. The driver
should now handle unplugging a sink from a dongle and plugging a
different sink with new edid.
Tested on MHL1, MHL2 and MHL3 using various vendors' dongles both in
DVI and HDMI mode.
Signed-off-by: Maciej Purski <m.purski@samsung.com>
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1516705996-8928-1-git-send-email-m.purski@samsung.com
The vendor code waits for infoframe to detect video mode set by source.
We do not need to follow this pattern, because video mode information is
provided by drm core. As a result most of the infoframe handling
code can be removed.
Start transmission immediately after detecting stream on HDMI lines
in irq_scdt() function without waiting for infoframe interrupt.
Signed-off-by: Maciej Purski <m.purski@samsung.com>
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1511956130-24482-1-git-send-email-m.purski@samsung.com
If there an inode points to a block which is also some other type of
metadata block (such as a block allocation bitmap), the
buffer_verified flag can be set when it was validated as that other
metadata block type; however, it would make a really terrible external
attribute block. The reason why we use the verified flag is to avoid
constantly reverifying the block. However, it doesn't take much
overhead to make sure the magic number of the xattr block is correct,
and this will avoid potential crashes.
This addresses CVE-2018-10879.
https://bugzilla.kernel.org/show_bug.cgi?id=200001
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@kernel.org
Commit b04df400c3 ("tools/bpftool: add perf subcommand")
introduced bpftool subcommand perf to query bpf program
kuprobe and tracepoint attachments.
The perf subcommand will first test whether bpf subcommand
BPF_TASK_FD_QUERY is supported in kernel or not. It does it
by opening a file with argv[0] and feeds the file descriptor
and current task pid to the kernel for querying.
Such an approach won't work if the argv[0] cannot be opened
successfully in the current directory. This is especially
true when bpftool is accessible through PATH env variable.
The error below reflects the open failure for file argv[0]
at home directory.
[yhs@localhost ~]$ which bpftool
/usr/local/sbin/bpftool
[yhs@localhost ~]$ bpftool perf
Error: perf_query_support: No such file or directory
To fix the issue, let us open root directory ("/")
which exists in every linux system. With the fix, the
error message will correctly reflect the permission issue.
[yhs@localhost ~]$ which bpftool
/usr/local/sbin/bpftool
[yhs@localhost ~]$ bpftool perf
Error: perf_query_support: Operation not permitted
HINT: non root or kernel doesn't support TASK_FD_QUERY
Fixes: b04df400c3 ("tools/bpftool: add perf subcommand")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
There are in-tree users of refcount_dec_and_lock() which must acquire the
spin lock with interrupts disabled. To workaround the lack of an irqsave
variant of refcount_dec_and_lock() they use local_irq_save() at the call
site. This causes extra code and creates in some places unneeded long
interrupt disabled times. These places need also extra treatment for
PREEMPT_RT due to the disconnect of the irq disabling and the lock
function.
Implement the missing irqsave variant of the function.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r20180612161621.22645-4-bigeasy@linutronix.de
[bigeasy: s@atomic_dec_and_lock@refcount_dec_and_lock@g]
There are in-tree users of atomic_dec_and_lock() which must acquire the
spin lock with interrupts disabled. To workaround the lack of an irqsave
variant of atomic_dec_and_lock() they use local_irq_save() at the call
site. This causes extra code and creates in some places unneeded long
interrupt disabled times. These places need also extra treatment for
PREEMPT_RT due to the disconnect of the irq disabling and the lock
function.
Implement the missing irqsave variant of the function.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r20180612161621.22645-3-bigeasy@linutronix.de
Alpha provides a custom implementation of dec_and_lock(). The functions
is split into two parts:
- atomic_add_unless() + return 0 (fast path in assembly)
- remaining part including locking (slow path in C)
Comparing the result of the alpha implementation with the generic
implementation compiled by gcc it looks like the fast path is optimized
by avoiding a stack frame (and reloading the GP), register store and all
this. This is only done in the slowpath.
After marking the slowpath (atomic_dec_and_lock_1()) as "noinline" and
doing the slowpath in C (the atomic_add_unless(atomic, -1, 1) part) I
noticed differences in the resulting assembly:
- the GP is still reloaded
- atomic_add_unless() adds more memory barriers compared to the custom
assembly
- the custom assembly here does "load, sub, beq" while
atomic_add_unless() does "load, cmpeq, add, bne". This is okay because
it compares against zero after subtraction while the generic code
compares against 1 before.
I'm not sure if avoiding the stack frame (and GP reloading) brings a lot
in terms of performance. Regarding the different barriers, Peter
Zijlstra says:
|refcount decrement needs to be a RELEASE operation, such that all the
|load/stores to the object happen before we decrement the refcount.
|
|Otherwise things like:
|
| obj->foo = 5;
| refcnt_dec(&obj->ref);
|
|can be re-ordered, which then allows fun scenarios like:
|
| CPU0 CPU1
|
| refcnt_dec(&obj->ref);
| if (dec_and_test(&obj->ref))
| free(obj);
| obj->foo = 5; // oops UaF
|
|
|This means (for alpha) that there should be a memory barrier _before_
|the decrement, however the dec_and_lock asm thing only has one _after_,
|which, per the above, is too late.
|
|The generic version using add_unless will result in memory barrier
|before and after (because that is the rule for atomic ops with a return
|value) which is strictly too many barriers for the refcount story, but
|who knows what other ordering requirements code has.
Remove the custom alpha implementation of dec_and_lock() and if it is an
issue (performance wise) then the fast path could still be inlined.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-alpha@vger.kernel.org
Link: https://lkml.kernel.org/r/20180606115918.GG12198@hirez.programming.kicks-ass.net
Link: https://lkml.kernel.org/r20180612161621.22645-2-bigeasy@linutronix.de
During disassociation the ucontext will become NULL, however due to how
the SRCU locking works the ucontext must only be examined after looking
at the ib_dev, which governs the RCU control flow.
With the wrong ordering userspace will see EINVAL instead of EIO for a
disassociated uverbs FD, which breaks rdma-core.
Cc: stable@vger.kernel.org
Fixes: 491d5c6a30 ("RDMA/uverbs: Move uncontext check before SRCU read lock")
Reported-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
The lockdep_assert_irqs_disabled() was a BUG_ON() statement in the
beginning and it was added just before the "spin_lock(siglock)"
statement to ensure this lock was taken with disabled interrupts.
This is no longer the case: the siglock is acquired via
lock_task_sighand() and this function already disables the interrupts.
The lock is also acquired before this "lockdep_assert_irqs_disabled" so
it is best to remove it.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Link: https://lkml.kernel.org/r20180504152548.7166-1-bigeasy@linutronix.de
Change the remaining users of dasd_kmalloc_request to use
preallocated memory and remove this function.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move some members of struct dasd_ccw_req to get rid of padding
bytes. This saves 16 bytes per dasd request.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Simplify locking in __dasd_device_process_final_queue to fix
the following sparse warning:
drivers/s390/block/dasd.c:1902:9: warning:
context imbalance in '__dasd_device_process_final_queue' - different lock contexts for basic block
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Höppner <hoeppner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Change css_general_characteristics such that the bitfields don't
straddle storage-unit boundaries of the base types.
This does not change the offsets of the structs members but now
we do as documented and also fix the following sparse complaint:
drivers/s390/cio/chsc.c:926:56:
warning: invalid access past the end of 'css_general_characteristics' (16 18)
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull vfio-ccw from Cornelia Huck with the following changes:
- Various fixes and improvements in vfio-ccw, including a first stab
at adding tracepoints.
There is no reason to keep this gpio based code in architecture. Use
ledtrig-heartbeat.c instead which is much more flexible then this
ancient code.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Since commit 96f0e6fcc9 ("microblaze: remove redundant early_printk
support") prom.h was removed and one instance in heartbeat.c remained.
Include of.h as it is the actual header needed.
Fixes: 96f0e6fcc9 ("microblaze: remove redundant early_printk support")
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Commit 173d3adb6f ("xsk: add zero-copy support for Rx") introduced a
regression on the XDP_SKB receive path, when the queue id checks were
removed. Now, they are back again.
Fixes: 173d3adb6f ("xsk: add zero-copy support for Rx")
Reported-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Before returning -EPERM we should release some resources, as already done
in the other error handling path of the function.
Fixes: d8f9cc328c ("IB/mlx4: Mark user MR as writable if actual virtual memory is writable")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The documentation for the touchscreen-swapped-x-y property states that
swapping is done after inverting if both are used. RMI4 did it the other
way around, leading to inconsistent behavior with regard to other
touchscreens.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Nick Dyer <nick@shmanahar.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The error return code PTR_ERR(data->irqdomain) is always 0 since
data->irqdomain is equal to NULL in this error handling case.
Fixes: 24d28e4f12 ("Input: synaptics-rmi4 - convert irq distribution to irq_domain")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Some RoCE specific code in qedr_modify_qp was run over an iWARP device
when running perftest benchmarks without the -R option.
The commit 3e44e0ee08 ("IB/providers: Avoid null netdev check for RoCE")
exposed this. Dropping the check for NULL pointer on ndev in
qedr_modify_qp lead to a null pointer dereference when running over
iWARP. Before the code would identify ndev as being NULL and return an
error.
Fixes: 3e44e0ee08 ("IB/providers: Avoid null netdev check for RoCE")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In case of error, the function mlx5_fc_create() returns ERR_PTR() and
never returns NULL. The NULL test in the return value check should be
replaced with IS_ERR().
Fixes: 3b3233fbf0 ("IB/mlx5: Add flow counters binding support")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In case memory resources for *ucmd* were allocated, release them
before return.
Addresses-Coverity-ID: 1469857 ("Resource leak")
Fixes: 3b3233fbf0 ("IB/mlx5: Add flow counters binding support")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In rxe_send, when network_type is not RDMA_NETWORK_IPV4 or
RDMA_NETWORK_IPV6, skb is freed and -EINVAL is returned.
Then rxe_xmit_packet will return -EINVAL, too. In rxe_requester,
this skb is double freed.
In rxe_requester, kfree_skb is needed only after fill_packet fails.
So kfree_skb is moved from label err to test fill_packet.
Fixes: 5793b46521 ("IB/rxe: remove unnecessary skb_clone in xmit")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Exactly as the comment just before 'struct c2c_hist_entry" says, i.e.
the last entry in struct hist_entry is a zero length array, that when
allocating space for hist_entry gets extra space if callchains are in
use, which, if hist_entry is not at the end of c2c_hist_entry, the
members after it gets corrupted when callchains get added to the rb
trees collecting them, etc.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Fixes: 7f834c2e84 ("perf c2c report: Display node for cacheline address")
Link: http://lkml.kernel.org/n/tip-bh0ke4fh2ygpj3yowna7o1di@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The quirk for R-Car E3 ES1.0 added in commit 086b399965 ("soc:
renesas: r8a77990-sysc: Add workaround for 3DG-{A,B}") makes the 3DG-A
PM domain a subdomain of the 3DG-B PM domain. However, registering
3DG-A with its parent fails silently, as the 3DG-B PM domain hasn't been
registered yet, and such failures are never reported.
Fix this by:
1. Splitting PM Domain initialization in two steps, so all PM domains
are registered before any child-parent links are established,
2. Reporting any failures in establishing child-parent relations.
Check for and report pm_genpd_init() failures, too, as that function
gained a return value in commit 7eb231c337 ("PM / Domains: Convert
pm_genpd_init() to return an error code").
Fixes: 086b399965 ("soc: renesas: r8a77990-sysc: Add workaround for 3DG-{A,B}")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Since we added support to add recovery from some errors inside the kernel in:
commit b2f9d678e2 ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")
we have done a less than stellar job at reporting the cause of recoverable
machine checks that occur in other parts of the kernel. The user just gets
the unhelpful message:
mce: [Hardware Error]: Machine check: Action required: unknown MCACOD
doubly unhelpful when they check the manual for the reported IA32_MSR_STATUS.MCACOD
and see that it is listed as one of the standard recoverable values.
Add an extra rule to the MCE severity table to catch this case and report it
as:
mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel
Fixes: b2f9d678e2 ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org # 4.6+
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/4cc7c465150a9a48b8b9f45d0b840278e77eb9b5.1527283897.git.tony.luck@intel.com
There are places where we have only access to struct hists and need to
know if any of its hist_entries has callchains, like when drawing
headers for the various output modes (stdio, TUI, etc), so, when adding
a new hist_entry, check if it has callchains, storing this info for
later use by hists__has_callchains().
This reimplementation is necessary because not always a 'struct hists'
is allocated together with a 'struct perf evsel', so we can't go from
'hists' to 'perf_event_attr.sample_type & PERF_SAMPLE_CALLCHAIN'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hg5g7yddjio3ljwyqnnaj5dt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we can figure out the real size of the struct and also be able
to tell if callchains may be present in this histogram entry.
Since we can't always guarantee that from hist_entry->hists we can use
hists_to_evsel, to then look at evsel->attr.sample_type for
PERF_SAMPLE_CALLCHAIN, like with the 'perf c2c' tool, that uses plain
'struct hists' instances, we need another way of deciding if a specific
hist_entry instance has callchains associated with it, i.e. if its
hist_entry->callchain[0] has space allocated for.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ptvndealxs1k7myluvu9flnq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When setting the skb->dst before doing the MTU check, the route PMTU
caching and reporting is done on the new dst which is about to be
released.
Instead, PMTU handling should be done using the original dst.
This is aligned with IPv4 VTI.
Fixes: ccd740cbc6 ("vti6: Add pmtu handling to vti6_xmit.")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Convert the RMI driver to use the standard mechanism for
distributing IRQs to the various functions.
Tested on:
* S7300 (F11, F34, F54)
* S7817 (F12, F34, F54)
Signed-off-by: Nick Dyer <nick@shmanahar.org>
Acked-by: Christopher Heiny <cheiny@synaptics.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The code is assuming the buffer is max_size length, but we weren't
allocating enough space for it.
Signed-off-by: Shankara Pailoor <shankarapailoor@gmail.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
When unplugging a hotpluggable DRM device we first unregister it
with drm_dev_unregister and then set drm_device.unplugged flag which
is used to mark device critical sections with drm_dev_enter()/
drm_dev_exit() preventing access to device resources that are not
available after the device is gone.
But drm_dev_unregister may lead to hotplug uevent(s) fired to
user-space on card and/or connector removal, thus making it possible
for user-space to try accessing a disconnected device.
Fix this by first making sure device is properly marked as
disconnected and only then unregister it.
Fixes: bee330f3d6 ("drm: Use srcu to protect drm_device.unplugged")
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reported-by: Andrii Chepurnyi <andrii_chepurnyi@epam.com>
Cc: "Noralf Trønnes" <noralf@tronnes.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180522141304.18646-1-andr2000@gmail.com
This refactors pfn_array_alloc_pin() and also improves it by adding
defensive code in error handling so that calling pfn_array_unpin_free()
after error return won't lead to problem. This mainly does:
1. Merge pfn_array_pin() into pfn_array_alloc_pin(), since there is no
other user of pfn_array_pin(). As a result, also remove kernel-doc
for pfn_array_pin() and add/update kernel-doc for pfn_array_alloc_pin()
and struct pfn_array.
2. For a vfio_pin_pages() failure, set pa->pa_nr to zero to indicate
zero pages were pinned.
3. Set pa->pa_iova_pfn to NULL right after it was freed.
Suggested-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.ibm.com>
Message-Id: <20180523025645.8978-3-bjsdjshi@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The kernel doc description for usage of the struct pfn_array in
pfn_array_pin() is unnecessary long. Let's shorten it by describing
the contents of the struct pfn_array fields at the struct's definition
instead.
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.ibm.com>
Message-Id: <20180523025645.8978-2-bjsdjshi@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
There is at least one relevant guest OS that doesn't set the IDA flags in
the ORB as we would like them, but never uses any IDA. So instead of
saying -EOPNOTSUPP when observing an ORB, such that a channel program
specified by it could be a not supported one, let us say -EOPNOTSUPP only
if the channel program is a not supported one.
Of course, the real solution would be doing proper translation for all
IDA. This is possible, but given the current code not straight forward.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Jason J. Herne <jjherne@linux.ibm.com>
Message-Id: <20180516173342.15174-1-pasic@linux.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The arch_get_random_seed_long() invocation done by the random device
driver is done in interrupt context and may be invoked very very
frequently. The existing s390 arch_get_random_seed*() implementation
uses the PRNO(TRNG) instruction which produces excellent high quality
entropy but is relatively slow and thus expensive.
This fix reworks the arch_get_random_seed* implementation. It
introduces a buffer concept to decouple the delivery of random data
via arch_get_random_seed*() from the generation of new random
bytes. The buffer of random data is filled asynchronously by a
workqueue thread.
If there are enough bytes in the buffer the s390_arch_random_generate()
just delivers these bytes. Otherwise false is returned until the worker
thread refills the buffer.
The worker fills the rng buffer by pulling fresh entropy from the
high quality (but slow) true hardware random generator. This entropy
is then spread over the buffer with an pseudo random generator.
As the arch_get_random_seed_long() fetches 8 bytes and the calling
function add_interrupt_randomness() counts this as 1 bit entropy the
distribution needs to make sure there is in fact 1 bit entropy
contained in 8 bytes of the buffer. The current values pull 32 byte
entropy and scatter this into a 2048 byte buffer. So 8 byte in the
buffer will contain 1 bit of entropy.
The worker thread is rescheduled based on the charge level of the
buffer but at least with 500 ms delay to avoid too much cpu consumption.
So the max. amount of rng data delivered via arch_get_random_seed is
limited to 4Kb per second.
Signed-off-by: Harald Freudenberger <freude@de.ibm.com>
Reviewed-by: Patrick Steuer <patrick.steuer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
s390 hardware supports the definition of a so-call Physical NETwork
IDentifier (short PNETID) per network device port. These PNETIDS
can be used to identify network devices that are attached to the same
physical network (broadcast domain).
This patch provides the interface to extract the PNETID of a port of
a device attached to the ccw-bus or pci-bus.
Parts of this patch are based on an initial implementation by
Thomas Richter.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The NAND compatible "denali,denal-nand-dt" property has never been used and
is obsolete. Remove it.
Cc: stable@vger.kernel.org
Fixes: f549af06e9b6("ARM: dts: socfpga: Add NAND device tree for Arria10")
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
The compatible string for the Denali NAND controller is incorrect,
fix it by replacing it with one matching the DT bindings and the
driver.
Cc: stable@vger.kernel.org
Signed-off-by: Marek Vasut <marex@denx.de>
Fixes: d837a80d19 ("ARM: dts: socfpga: add nand controller nodes")
Cc: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
The Denali NAND x-clock should be supplied by nand_x_clk, not by
nand_clk. Fix this, otherwise the Denali driver gets incorrect
clock frequency information and incorrectly configures the NAND
timing.
Cc: stable@vger.kernel.org
Signed-off-by: Marek Vasut <marex@denx.de>
Fixes: d837a80d19 ("ARM: dts: socfpga: add nand controller nodes")
Cc: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2018-05-14 09:48:58 -05:00
1635 changed files with 17530 additions and 9096 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.