Pull x86 fix from Ingo Molnar:
"A single fix for a potential deadlock when printing a message about
spurious interrupts"
* tag 'x86-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/i8259: Use printk_deferred() to prevent deadlock
Pull Kbuild fixes from Masahiro Yamada:
- clean the generated moc file for xconfig
- fix xconfig bugs, and revert some bad commits
* tag 'kbuild-fixes-v5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: remove redundant FORCE definition in scripts/Makefile.modpost
kconfig: qconf: remove wrong ConfigList::firstChild()
Revert "kconfig: qconf: don't show goback button on splitMode"
Revert "kconfig: qconf: Change title for the item window"
kconfig: qconf: remove "goBack" debug message
kconfig: qconf: use delete[] instead of delete to free array
kconfig: qconf: compile moc object separately
kconfig: qconf: use if_changed for qconf.moc rule
modpost: explain why we can't use strsep
Pull KVM fixes from Paolo Bonzini:
"Bugfixes and strengthening the validity checks on inputs from new
userspace APIs.
Now I know why I shouldn't prepare pull requests on the weekend, it's
hard to concentrate if your son is shouting about his latest Minecraft
builds in your ear. Fortunately all the patches were ready and I just
had to check the test results..."
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SVM: Fix disable pause loop exit/pause filtering capability on SVM
KVM: LAPIC: Prevent setting the tscdeadline timer if the lapic is hw disabled
KVM: arm64: Don't inherit exec permission across page-table levels
KVM: arm64: Prevent vcpu_has_ptrauth from generating OOL functions
KVM: nVMX: check for invalid hdr.vmx.flags
KVM: nVMX: check for required but missing VMCS12 in KVM_SET_NESTED_STATE
selftests: kvm: do not set guest mode flag
The same code exists a few lines above.
Fixes: 436b2ac603 ("modpost: invoke modpost only when input files are updated")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
This function returns the first child object, but the returned pointer
is not compatible with (ConfigItem *).
Commit cc1c08edcc ("kconfig: qconf: don't show goback button on
splitMode") uncovered this issue because using the pointer from this
function would make qconf crash. (https://lkml.org/lkml/2020/7/18/411)
This function does not work. Remove.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull networking fixes from David Miller:
1) Encap offset calculation is incorrect in esp6, from Sabrina Dubroca.
2) Better parameter validation in pfkey_dump(), from Mark Salyzyn.
3) Fix several clang issues on powerpc in selftests, from Tanner Love.
4) cmsghdr_from_user_compat_to_kern() uses the wrong length, from Al
Viro.
5) Out of bounds access in mlx5e driver, from Raed Salem.
6) Fix transfer buffer memleak in lan78xx, from Johan Havold.
7) RCU fixups in rhashtable, from Herbert Xu.
8) Fix ipv6 nexthop refcnt leak, from Xiyu Yang.
9) vxlan FDB dump must be done under RCU, from Ido Schimmel.
10) Fix use after free in mlxsw, from Ido Schimmel.
11) Fix map leak in HASH_OF_MAPS bpf code, from Andrii Nakryiko.
12) Fix bug in mac80211 Tx ack status reporting, from Vasanthakumar
Thiagarajan.
13) Fix memory leaks in IPV6_ADDRFORM code, from Cong Wang.
14) Fix bpf program reference count leaks in mlx5 during
mlx5e_alloc_rq(), from Xin Xiong.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
vxlan: fix memleak of fdb
rds: Prevent kernel-infoleak in rds_notify_queue_get()
net/sched: The error lable position is corrected in ct_init_module
net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq
net/mlx5e: E-Switch, Specify flow_source for rule with no in_port
net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring
net/mlx5e: CT: Support restore ipv6 tunnel
net: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_port_probe()
ionic: unlock queue mutex in error path
atm: fix atm_dev refcnt leaks in atmtcp_remove_persistent
net: ethernet: mtk_eth_soc: fix MTU warnings
net: nixge: fix potential memory leak in nixge_probe()
devlink: ignore -EOPNOTSUPP errors on dumpit
rxrpc: Fix race between recvmsg and sendmsg on immediate call failure
MAINTAINERS: Replace Thor Thayer as Altera Triple Speed Ethernet maintainer
selftests/bpf: fix netdevsim trap_flow_action_cookie read
ipv6: fix memory leaks on IPV6_ADDRFORM path
net/bpfilter: Initialize pos in __bpfilter_process_sockopt
igb: reinit_locked() should be called with rtnl_lock
e1000e: continue to init PHY even when failed to disable ULP
...
Pull thread fix from Christian Brauner:
"A simple spelling fix for dequeue_synchronous_signal()"
* tag 'for-linus-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
signal: fix typo in dequeue_synchronous_signal()
Pull perf tooling fixes from Arnaldo Carvalho de Melo:
- Fix libtraceevent build with binutils 2.35
- Fix memory leak in process_dynamic_array_len in libtraceevent
- Fix 'perf test 68' zstd compression for s390
- Fix record failure when mixed with ARM SPE event
* tag 'perf-tools-fixes-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
libtraceevent: Fix build with binutils 2.35
perf tools: Fix record failure when mixed with ARM SPE event
perf tests: Fix test 68 zstd compression for s390
tools lib traceevent: Fix memory leak in process_dynamic_array_len
Pull pin control fix from Linus Walleij:
"A single last minute pin control fix to the Qualcomm driver fixing
missing dual edge PCH interrupts"
* tag 'pinctrl-v5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: qcom: Handle broken/missing PDC dual edge IRQs on sc7180
This reverts commit cc1c08edcc.
Maxim Levitsky reports 'make xconfig' crashes since that commit
(https://lkml.org/lkml/2020/7/18/411)
Or, the following is simple test code that makes it crash:
menu "Menu"
config FOO
bool "foo"
default y
menuconfig BAR
bool "bar"
depends on FOO
endmenu
Select the Split View mode, and double-click "bar" in the right
window, then you will see Segmentation fault.
When 'last' is not set for symbolMode, the following code in
ConfigList::updateList() calls firstChild().
item = last ? last->nextSibling() : firstChild();
However, the pointer returned by ConfigList::firstChild() does not
seem to be compatible with (ConfigItem *), which seems another bug.
I'd rather want to reconsider whether hiding the goback icon is the
right thing to do.
In the following test code, the Split View shows "Menu2" and "Menu3"
in the right window. You can descend into "Menu3", but there is no way
to ascend back to "Menu2" from "Menu3".
menu "Menu1"
config FOO
bool "foo"
default y
menu "Menu2"
depends on FOO
menu "Menu3"
config BAZ
bool "baz"
endmenu
endmenu
endmenu
It is true that the goback button is currently not functional due to
yet another bug, but hiding the problem is not the right way to go.
Anyway, Segmentation fault is fatal. Revert the offending commit for
now, and we should find the right solution.
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
This reverts commit 5752ff07fd.
It added dead code to ConfigList:ConfigList().
The constructor of ConfigList has the initializer, mode(singleMode).
if (mode == symbolMode)
setHeaderLabels(QStringList() << "Item" << "Name" << "N" << "M" << "Y" << "Value");
else
setHeaderLabels(QStringList() << "Option" << "Name" << "N" << "M" << "Y" << "Value");
... always takes the else part.
The change to ConfigList::updateSelection() is strange too.
When you click the split view icon for the first time, the titles in
both windows show "Option". After you click something in the right
window, the title suddenly changes to "Item".
ConfigList::updateSelection() is not the right place to do this,
at least. It was not a good idea, I think.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Every time the goback icon is clicked, the annoying message "goBack"
is displayed on the console.
I guess this line is the left-over debug code of commit af737b4def
("kconfig: qconf: simplify the goBack() logic").
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Currently, qconf.moc is included from qconf.cc but they can be compiled
independently.
When you modify qconf.cc, qconf.moc does not need recompiling.
Rename qconf.moc to qconf-moc.cc, and split it out as an independent
compilation unit.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Regenerate qconf.moc when the moc command is changed.
This also allows 'make mrproper' to clean it up. Previously, it was
not cleaned up because 'clean-files += qconf.moc' was missing.
Now 'make mrproper' correctly cleans it up because files listed in
'targets' are cleaned.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf 2020-07-31
The following pull-request contains BPF updates for your *net* tree.
We've added 5 non-merge commits during the last 21 day(s) which contain
a total of 5 files changed, 126 insertions(+), 18 deletions(-).
The main changes are:
1) Fix a map element leak in HASH_OF_MAPS map type, from Andrii Nakryiko.
2) Fix a NULL pointer dereference in __btf_resolve_helper_id() when no
btf_vmlinux is available, from Peilin Ye.
3) Init pos variable in __bpfilter_process_sockopt(), from Christoph Hellwig.
4) Fix a cgroup sockopt verifier test by specifying expected attach type,
from Jean-Philippe Brucker.
Note that when net gets merged into net-next later on, there is a small
merge conflict in kernel/bpf/btf.c between commit 5b801dfb7f ("bpf: Fix
NULL pointer dereference in __btf_resolve_helper_id()") from the bpf tree
and commit 138b9a0511 ("bpf: Remove btf_id helpers resolving") from the
net-next tree.
Resolve as follows: remove the old hunk with the __btf_resolve_helper_id()
function. Change the btf_resolve_helper_id() so it actually tests for a
NULL btf_vmlinux and bails out:
int btf_resolve_helper_id(struct bpf_verifier_log *log,
const struct bpf_func_proto *fn, int arg)
{
int id;
if (fn->arg_type[arg] != ARG_PTR_TO_BTF_ID || !btf_vmlinux)
return -EINVAL;
id = fn->btf_id[arg];
if (!id || id > btf_vmlinux->nr_types)
return -EINVAL;
return id;
}
Let me know if you run into any others issues (CC'ing Jiri Olsa so he's in
the loop with regards to merge conflict resolution).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2020-07-31
1) Fix policy matching with mark and mask on userspace interfaces.
From Xin Long.
2) Several fixes for the new ESP in TCP encapsulation.
From Sabrina Dubroca.
3) Fix crash when the hold queue is used. The assumption that
xdst->path and dst->child are not a NULL pointer only if dst->xfrm
is not a NULL pointer is true with the exception of using the
hold queue. Fix this by checking for hold queue usage before
dereferencing xdst->path or dst->child.
4) Validate pfkey_dump parameter before sending them.
From Mark Salyzyn.
5) Fix the location of the transport header with ESP in UDPv6
encapsulation. From Sabrina Dubroca.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2020-07-30
This small patchset introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
For -stable v4.18:
('net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq')
For -stable v5.7:
('net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
rds_notify_queue_get() is potentially copying uninitialized kernel stack
memory to userspace since the compiler may leave a 4-byte hole at the end
of `cmsg`.
In 2016 we tried to fix this issue by doing `= { 0 };` on `cmsg`, which
unfortunately does not always initialize that 4-byte hole. Fix it by using
memset() instead.
Cc: stable@vger.kernel.org
Fixes: f037590fff ("rds: fix a leak of kernel memory")
Fixes: bdbe6fbc6a ("RDS: recv.c")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2020-07-30
This series contains updates to the e1000e and igb drivers.
Aaron Ma allows PHY initialization to continue if ULP disable failed for
e1000e.
Francesco Ruggeri fixes race conditions in igb reset that could cause panics.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Exchange the positions of the err_tbl_init and err_register labels in
ct_init_module function.
Fixes: c34b961a24 ("net/sched: act_ct: Create nf flow table per zone")
Signed-off-by: liujian <liujian56@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull i2c fixes from Wolfram Sang:
"Some I2C core improvements to prevent NULL pointer usage and a
MAINTAINERS update"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: slave: add sanity check when unregistering
i2c: slave: improve sanity check when registering
MAINTAINERS: Update GENI I2C maintainers list
i2c: also convert placeholder function to return errno
Pull powerpc fix from Michael Ellerman:
"Fix a bug introduced by the changes we made to lockless page table
walking this cycle.
When using the hash MMU, and perf with callchain recording, we can
deadlock if the PMI interrupts a hash fault, and the callchain
recording then takes a hash fault on the same page.
Thanks to Nicholas Piggin, Aneesh Kumar K.V, Anton Blanchard, and
Athira Rajeev"
* tag 'powerpc-5.8-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s/hash: Fix hash_preload running with interrupts enabled
Pull arm64 fixes from Will Deacon:
"The main one is to fix the build after Willy's per-cpu entropy changes
this week. Although that was already resolved elsewhere, the arm64 fix
here is useful cleanup anyway.
Other than that, we've got a fix for building with Clang's integrated
assembler and a fix to make our IPv4 checksumming robust against
invalid header lengths (this only seems to be triggerable by injected
errors).
- Fix build breakage due to circular headers
- Fix build regression when using Clang's integrated assembler
- Fix IPv4 header checksum code to deal with invalid length field
- Fix broken path for Arm PMU entry in MAINTAINERS"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
MAINTAINERS: Include drivers subdirs for ARM PMU PROFILING AND DEBUGGING entry
arm64: csum: Fix handling of bad packets
arm64: Drop unnecessary include from asm/smp.h
arm64/alternatives: move length validation inside the subsection
Pull ARM fixes from Russell King:
- avoid invoking overflow handler for uaccess watchpoints
- fix incorrect clock_gettime64 availability
- fix EFI crash in create_mapping_late()
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8988/1: mmu: fix crash in EFI calls due to p4d typo in create_mapping_late()
ARM: 8987/1: VDSO: Fix incorrect clock_gettime64
ARM: 8986/1: hw_breakpoint: Don't invoke overflow handler on uaccess watchpoints
Pull rdma fixes from Jason Gunthorpe:
"Two more merge window regressions, a corruption bug in hfi1 and a few
other small fixes.
- Missing user input validation regression in ucma
- Disallowing a previously allowed user combination regression in
mlx5
- ODP prefetch memory leaking triggerable by userspace
- Memory corruption in hf1 due to faulty ring buffer logic
- Missed mutex initialization crash in mlx5
- Two small defects with RDMA DIM"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/core: Free DIM memory in error unwind
RDMA/core: Stop DIM before destroying CQ
RDMA/mlx5: Initialize QP mutex for the debug kernels
IB/rdmavt: Fix RQ counting issues causing use of an invalid RWQE
RDMA/mlx5: Allow providing extra scatter CQE QP flag
RDMA/mlx5: Fix prefetch memory leak if get_prefetchable_mr fails
RDMA/cm: Add min length checks to user structure copies
Pull sound fixes from Takashi Iwai:
"A few wrap-up small fixes for the usual HD-audio and USB-audio stuff:
- A regression fix for S3 suspend on old Intel platforms
- A fix for possible Oops in ASoC HD-audio binding
- Trivial quirks for various devices"
* tag 'sound-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - Fixed HP right speaker no sound
ALSA: hda: fix NULL pointer dereference during suspend
ALSA: hda/hdmi: Fix keep_power assignment for non-component devices
ALSA: hda: Workaround for spurious wakeups on some Intel platforms
ALSA: hda/realtek: Fix add a "ultra_low_power" function for intel reference board (alc256)
ALSA: hda/realtek: typo_fix: enable headset mic of ASUS ROG Zephyrus G14(GA401) series with ALC289
ALSA: hda/realtek: enable headset mic of ASUS ROG Zephyrus G15(GA502) series with ALC289
ALSA: usb-audio: Add implicit feedback quirk for SSL2
When recording with cache-misses and arm_spe_x event, I found that it
will just fail without showing any error info if i put cache-misses
after 'arm_spe_x' event.
[root@localhost 0620]# perf record -e cache-misses \
-e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.067 MB perf.data ]
[root@localhost 0620]#
[root@localhost 0620]# perf record -e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ \
-e cache-misses sleep 1
[root@localhost 0620]#
The current code can only work if the only event to be traced is an
'arm_spe_x', or if it is the last event to be specified. Otherwise the
last event type will be checked against all the arm_spe_pmus[i]->types,
none will match and an out of bound 'i' index will be used in
arm_spe_recording_init().
We don't support concurrent multiple arm_spe_x events currently, that
is checked in arm_spe_recording_options(), and it will show the relevant
info. So add the check and record of the first found 'arm_spe_pmu' to
fix this issue here.
Fixes: ffd3d18c20 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200724071111.35593-2-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 5aa98879ef ("s390/cpum_sf: prohibit callchain data collection")
prohibits call graph sampling for hardware events on s390. The
information recorded is out of context and does not match.
On s390 this commit now breaks test case 68 Zstd perf.data
compression/decompression.
Therefore omit call graph sampling on s390 in this test.
Output before:
[root@t35lp46 perf]# ./perf test -Fv 68
68: Zstd perf.data compression/decompression :
--- start ---
Collecting compressed record file:
Error:
cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
Try 'perf stat'
---- end ----
Zstd perf.data compression/decompression: FAILED!
[root@t35lp46 perf]#
Output after:
[root@t35lp46 perf]# ./perf test -Fv 68
68: Zstd perf.data compression/decompression :
--- start ---
Collecting compressed record file:
500+0 records in
500+0 records out
256000 bytes (256 kB, 250 KiB) copied, 0.00615638 s, 41.6 MB/s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.004 MB /tmp/perf.data.X3M,
compressed (original 0.002 MB, ratio is 3.609) ]
Checking compressed events stats:
# compressed : Zstd, level = 1, ratio = 4
COMPRESSED events: 1
2ELIFREPh---- end ----
Zstd perf.data compression/decompression: Ok
[root@t35lp46 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200729135314.91281-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I compiled with AddressSanitizer and I had these memory leaks while I
was using the tep_parse_format function:
Direct leak of 28 byte(s) in 4 object(s) allocated from:
#0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe)
#1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985
#2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140
#3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206
#4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291
#5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299
#6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849
#7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161
#8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207
#9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786
#10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285
#11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369
#12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335
#13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389
#14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431
#15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251
#16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284
#17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593
#18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727
#19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048
#20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127
#21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152
#22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252
#23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347
#24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461
#25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673
#26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
The token variable in the process_dynamic_array_len function is
allocated in the read_expect_type function, but is not freed before
calling the read_token function.
Free the token variable before calling read_token in order to plug the
leak.
Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'Commit 8566ac8b8e ("KVM: SVM: Implement pause loop exit logic in SVM")'
drops disable pause loop exit/pause filtering capability completely, I
guess it is a merge fault by Radim since disable vmexits capabilities and
pause loop exit for SVM patchsets are merged at the same time. This patch
reintroduces the disable pause loop exit/pause filtering capability support.
Reported-by: Haiwei Li <lihaiwei@tencent.com>
Tested-by: Haiwei Li <lihaiwei@tencent.com>
Fixes: 8566ac8b ("KVM: SVM: Implement pause loop exit logic in SVM")
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1596165141-28874-3-git-send-email-wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull more drm fixes from Dave Airlie:
"As mentioned previously this contains the nouveau regression fix.
amdgpu had three fixes outstanding as well, one revert, an info leak
and use after free. The use after free is a bit trickier than I'd
like, and I've personally gone over it to confirm I'm happy that it is
doing what it says.
nouveau:
- final modifiers regression fix
amdgpu:
- Revert a fix which caused other regressions
- Fix potential kernel info leak
- Fix a use-after-free bug that was uncovered by another change in 5.7"
* tag 'drm-fixes-2020-07-31' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau: Accept 'legacy' format modifiers
Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers"
drm/amd/display: Clear dm_state for fast updates
drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl()
Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK()
family of modifiers to handle broken userspace
Xorg modesetting and Mesa drivers. Existing Mesa
drivers are still aware of only these older
format modifiers which do not differentiate
between different variations of the block linear
layout. When the format modifier support flag was
flipped in the nouveau kernel driver, the X.org
modesetting driver began attempting to use its
format modifier-enabled framebuffer path. Because
the set of format modifiers advertised by the
kernel prior to this change do not intersect with
the set of format modifiers advertised by Mesa,
allocating GBM buffers using format modifiers
fails and the modesetting driver falls back to
non-modifier allocation. However, it still later
queries the modifier of the GBM buffer when
creating its DRM-KMS framebuffer object, receives
the old-format modifier from Mesa, and attempts
to create a framebuffer with it. Since the kernel
is still not aware of these formats, this fails.
Userspace should not be attempting to query format
modifiers of GBM buffers allocated with a non-
format-modifier-aware allocation path, but to
avoid breaking existing userspace behavior, this
change accepts the old-style format modifiers when
creating framebuffers and applying them to planes
by translating them to the equivalent new-style
modifier. To accomplish this, some layout
parameters must be assumed to match properties of
the device targeted by the relevant ioctls. To
avoid perpetuating misuse of the old-style
modifiers, this change does not advertise support
for them. Doing so would imply compatibility
between devices with incompatible memory layouts.
Tested with Xorg 1.20 modesetting driver,
weston@c46c70dac84a4b3030cd05b380f9f410536690fc,
gnome & KDE wayland desktops from Ubuntu 18.04,
and sway 1.5
Reported-by: Kirill A. Shutemov <kirill@shutemov.name>
Fixes: fa4f4c213f ("drm/nouveau/kms: Support NVIDIA format modifiers")
Link: https://lkml.org/lkml/2020/6/30/1251
Signed-off-by: James Jones <jajones@nvidia.com>
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The function invokes bpf_prog_inc(), which increases the reference
count of a bpf_prog object "rq->xdp_prog" if the object isn't NULL.
The refcount leak issues take place in two error handling paths. When
either mlx5_wq_ll_create() or mlx5_wq_cyc_create() fails, the function
simply returns the error code and forgets to drop the reference count
increased earlier, causing a reference count leak of "rq->xdp_prog".
Fix this issue by jumping to the error handling path err_rq_wq_destroy
while either function fails.
Fixes: 422d4c401e ("net/mlx5e: RX, Split WQ objects for different RQ types")
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The flow_source must be specified, even for rule without matching
source vport, because some actions are only allowed in uplink.
Otherwise, rule can't be offloaded and firmware syndrome happens.
Fixes: 6fb0701a9c ("net/mlx5: E-Switch, Add support for offloading rules with no in_port")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The modified flow_context fields in FTE must be indicated in
modify_enable bitmask. Previously, the misc bit in modify_enable is
always set as source vport must be set for each rule. So, when parsing
vxlan/gre/geneve/qinq rules, this bit is not set because those are all
from the same misc fileds that source vport fields are located at, and
we don't need to set the indicator twice.
After adding per vport tables for mirroring, misc bit is not set, then
firmware syndrome happens. To fix it, set the bit wherever misc fileds
are changed. This also makes it unnecessary to check misc fields and set
the misc bit accordingly in metadata matching, so here remove it.
Besides, flow_source must be specified for uplink because firmware
will check it and some actions are only allowed for packets received
from uplink.
Fixes: 96e326878f ("net/mlx5e: Eswitch, Use per vport tables for mirroring")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently the driver restores only IPv4 tunnel headers.
Add support for restoring IPv6 tunnel header.
Fixes: b8ce903709 ("net/mlx5e: Restore tunnel metadata on miss")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Johannes Berg says:
====================
A couple of more changes:
* remove a warning that can trigger in certain races
* check a function pointer before using it
* check before adding 6 GHz to avoid a warning in mesh
* fix two memory leaks in mesh
* fix a TX status bug leading to a memory leak
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the missing clk_disable_unprepare() before return
from gemini_ethernet_port_probe() in the error handling case.
Fixes: 4d5ae32f5e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On an error return, jump to the unlock at the end to be sure
to unlock the queue_lock mutex.
Fixes: 0925e9db4d ("ionic: use mutex to protect queue operations")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
atmtcp_remove_persistent() invokes atm_dev_lookup(), which returns a
reference of atm_dev with increased refcount or NULL if fails.
The refcount leaks issues occur in two error handling paths. If
dev_data->persist is zero or PRIV(dev)->vcc isn't NULL, the function
returns 0 without decreasing the refcount kept by a local variable,
resulting in refcount leaks.
Fix the issue by adding atm_dev_put() before returning 0 both when
dev_data->persist is zero or PRIV(dev)->vcc isn't NULL.
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
in recent kernel versions there are warnings about incorrect MTU size
like these:
eth0: mtu greater than device maximum
mtk_soc_eth 1b100000.ethernet eth0: error -22 setting MTU to include DSA overhead
Fixes: bfcb813203 ("net: dsa: configure the MTU for switch ports")
Fixes: 72579e14a1 ("net: dsa: don't fail to probe if we couldn't set the MTU")
Fixes: 7a4c53bee3 ("net: report invalid mtu value via netlink extack")
Signed-off-by: Landen Chao <landen.chao@mediatek.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
If some processes in nixge_probe() fail, free_netdev(dev)
needs to be called to aviod a memory leak.
Fixes: 87ab207981 ("net: nixge: Separate ctrl and dma resources")
Fixes: abcd3d6fc6 ("net: nixge: Fix error path for obtaining mac address")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Number of .dumpit functions try to ignore -EOPNOTSUPP errors.
Recent change missed that, and started reporting all errors
but -EMSGSIZE back from dumps. This leads to situation like
this:
$ devlink dev info
devlink answers: Operation not supported
Dump should not report an error just because the last device
to be queried could not provide an answer.
To fix this and avoid similar confusion make sure we clear
err properly, and not leave it set to an error if we don't
terminate the iteration.
Fixes: c62c2cfb80 ("net: devlink: don't ignore errors during dumpit")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a race between rxrpc_sendmsg setting up a call, but then failing to
send anything on it due to an error, and recvmsg() seeing the call
completion occur and trying to return the state to the user.
An assertion fails in rxrpc_recvmsg() because the call has already been
released from the socket and is about to be released again as recvmsg deals
with it. (The recvmsg_q queue on the socket holds a ref, so there's no
problem with use-after-free.)
We also have to be careful not to end up reporting an error twice, in such
a way that both returns indicate to userspace that the user ID supplied
with the call is no longer in use - which could cause the client to
malfunction if it recycles the user ID fast enough.
Fix this by the following means:
(1) When sendmsg() creates a call after the point that the call has been
successfully added to the socket, don't return any errors through
sendmsg(), but rather complete the call and let recvmsg() retrieve
them. Make sendmsg() return 0 at this point. Further calls to
sendmsg() for that call will fail with ESHUTDOWN.
Note that at this point, we haven't send any packets yet, so the
server doesn't yet know about the call.
(2) If sendmsg() returns an error when it was expected to create a new
call, it means that the user ID wasn't used.
(3) Mark the call disconnected before marking it completed to prevent an
oops in rxrpc_release_call().
(4) recvmsg() will then retrieve the error and set MSG_EOR to indicate
that the user ID is no longer known by the kernel.
An oops like the following is produced:
kernel BUG at net/rxrpc/recvmsg.c:605!
...
RIP: 0010:rxrpc_recvmsg+0x256/0x5ae
...
Call Trace:
? __init_waitqueue_head+0x2f/0x2f
____sys_recvmsg+0x8a/0x148
? import_iovec+0x69/0x9c
? copy_msghdr_from_user+0x5c/0x86
___sys_recvmsg+0x72/0xaa
? __fget_files+0x22/0x57
? __fget_light+0x46/0x51
? fdget+0x9/0x1b
do_recvmmsg+0x15e/0x232
? _raw_spin_unlock+0xa/0xb
? vtime_delta+0xf/0x25
__x64_sys_recvmmsg+0x2c/0x2f
do_syscall_64+0x4c/0x78
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 357f5ef646 ("rxrpc: Call rxrpc_release_call() on error in rxrpc_new_client_call()")
Reported-by: syzbot+b54969381df354936d96@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to replace Thor Thayer as Altera Triple Speed Ethernet
maintainer as he is moving to a different role.
Signed-off-by: Joyce Ooi <joyce.ooi@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When read netdevsim trap_flow_action_cookie, we need to init it first,
or we will get "Invalid argument" error.
Fixes: d3cbb907ae ("netdevsim: add ACL trap reporting cookie as a metadata")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPV6_ADDRFORM causes resource leaks when converting an IPv6 socket
to IPv4, particularly struct ipv6_ac_socklist. Similar to
struct ipv6_mc_socklist, we should just close it on this path.
This bug can be easily reproduced with the following C program:
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
int main()
{
int s, value;
struct sockaddr_in6 addr;
struct ipv6_mreq m6;
s = socket(AF_INET6, SOCK_DGRAM, 0);
addr.sin6_family = AF_INET6;
addr.sin6_port = htons(5000);
inet_pton(AF_INET6, "::ffff:192.168.122.194", &addr.sin6_addr);
connect(s, (struct sockaddr *)&addr, sizeof(addr));
inet_pton(AF_INET6, "fe80::AAAA", &m6.ipv6mr_multiaddr);
m6.ipv6mr_interface = 5;
setsockopt(s, SOL_IPV6, IPV6_JOIN_ANYCAST, &m6, sizeof(m6));
value = AF_INET;
setsockopt(s, SOL_IPV6, IPV6_ADDRFORM, &value, sizeof(value));
close(s);
return 0;
}
Reported-by: ch3332xr@gmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KVM/arm64 fixes for Linux 5.8, take #3
- Fix a corner case of a new mapping inheriting exec permission without
and yet bypassing invalidation of the I-cache
- Make sure PtrAuth predicates oinly generate inline code for the
non-VHE hypervisor code
Pull virtio fixes from Michael Tsirkin:
"A couple of last minute bugfixes"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio-mem: Fix build error due to improper use 'select'
virtio_balloon: fix up endian-ness for free cmd id
virtio-balloon: Document byte ordering of poison_val
vhost/scsi: fix up req type endian-ness
firmware: Fix a reference count leak.
Fix build error for the case:
defined(CONFIG_SMP) && !defined(CONFIG_CPU_V6)
config: keystone_defconfig
CC arch/arm/kernel/signal.o
In file included from ../include/linux/random.h:14,
from ../arch/arm/kernel/signal.c:8:
../arch/arm/include/asm/percpu.h: In function ‘__my_cpu_offset’:
../arch/arm/include/asm/percpu.h:29:34: error: ‘current_stack_pointer’ undeclared (first use in this function); did you mean ‘user_stack_pointer’?
: "Q" (*(const unsigned long *)current_stack_pointer));
^~~~~~~~~~~~~~~~~~~~~
user_stack_pointer
Fixes: f227e3ec3b ("random32: update the net random state on interrupt and activity")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We observed two panics involving races with igb_reset_task.
The first panic is caused by this race condition:
kworker reboot -f
igb_reset_task
igb_reinit_locked
igb_down
napi_synchronize
__igb_shutdown
igb_clear_interrupt_scheme
igb_free_q_vectors
igb_free_q_vector
adapter->q_vector[v_idx] = NULL;
napi_disable
Panics trying to access
adapter->q_vector[v_idx].napi_state
The second panic (a divide error) is caused by this race:
kworker reboot -f tx packet
igb_reset_task
__igb_shutdown
rtnl_lock()
...
igb_clear_interrupt_scheme
igb_free_q_vectors
adapter->num_tx_queues = 0
...
rtnl_unlock()
rtnl_lock()
igb_reinit_locked
igb_down
igb_up
netif_tx_start_all_queues
dev_hard_start_xmit
igb_xmit_frame
igb_tx_queue_mapping
Panics on
r_idx % adapter->num_tx_queues
This commit applies to igb_reset_task the same changes that
were applied to ixgbe in commit 2f90b8657e ("ixgbe: this patch
adds support for DCB to the kernel and ixgbe driver"),
commit 8f4c5c9fb8 ("ixgbe: reinit_locked() should be called with
rtnl_lock") and commit 88adce4ea8 ("ixgbe: fix possible race in
reset subtask").
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
After 'commit e086ba2fcc ("e1000e: disable s0ix entry and exit flows
for ME systems")',
ThinkPad P14s always failed to disable ULP by ME.
'commit 0c80cdbf33 ("e1000e: Warn if disabling ULP failed")'
break out of init phy:
error log:
[ 42.364753] e1000e 0000:00:1f.6 enp0s31f6: Failed to disable ULP
[ 42.524626] e1000e 0000:00:1f.6 enp0s31f6: PHY Wakeup cause - Unicast Packet
[ 42.822476] e1000e 0000:00:1f.6 enp0s31f6: Hardware Error
When disable s0ix, E1000_FWSM_ULP_CFG_DONE will never be 1.
If continue to init phy like before, it can work as before.
iperf test result good too.
Fixes: 0c80cdbf33 ("e1000e: Warn if disabling ULP failed")
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Pull block fixes from Jens Axboe:
"Three NVMe fixes"
* tag 'block-5.8-2020-07-30' of git://git.kernel.dk/linux-block:
nvme: add a Identify Namespace Identification Descriptor list quirk
nvme-pci: prevent SK hynix PC400 from using Write Zeroes command
nvme-tcp: fix possible hang waiting for icresp response
Pull io_uring fixes from Jens Axboe:
"Two small fixes for corner/error cases"
* tag 'io_uring-5.8-2020-07-30' of git://git.kernel.dk/linux-block:
io_uring: fix lockup in io_fail_links()
io_uring: fix ->work corruption with poll_add
Daniel Díaz and Kees Cook independently reported that commit
f227e3ec3b ("random32: update the net random state on interrupt and
activity") broke arm64 due to a circular dependency on include files
since the addition of percpu.h in random.h.
The correct fix would definitely be to move all the prandom32 stuff out
of random.h but for backporting, a smaller solution is preferred.
This one replaces linux/percpu.h with asm/percpu.h, and this fixes the
problem on x86_64, arm64, arm, and mips. Note that moving percpu.h
around didn't change anything and that removing it entirely broke
differently. When backporting, such options might still be considered
if this patch fails to help.
[ It turns out that an alternate fix seems to be to just remove the
troublesome <asm/pointer_auth.h> remove from the arm64 <asm/smp.h>
that causes the circular dependency.
But we might as well do the whole belt-and-suspenders thing, and
minimize inclusion in <linux/random.h> too. Either will fix the
problem, and both are good changes. - Linus ]
Reported-by: Daniel Díaz <daniel.diaz@linaro.org>
Reported-by: Kees Cook <keescook@chromium.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Fixes: f227e3ec3b
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Although iph is expected to point to at least 20 bytes of valid memory,
ihl may be bogus, for example on reception of a corrupt packet. If it
happens to be less than 5, we really don't want to run away and
dereference 16GB worth of memory until it wraps back to exactly zero...
Fixes: 0e455d8e80 ("arm64: Implement optimised IP checksum helpers")
Reported-by: guodeqing <geffrey.guo@huawei.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
asm/pointer_auth.h is not needed anymore in asm/smp.h, as 62a679cb28
("arm64: simplify ptrauth initialization") removed the keys from the
secondary_data structure.
This also cures a compilation issue introduced by f227e3ec3b
("random32: update the net random state on interrupt and activity").
Fixes: 62a679cb28 ("arm64: simplify ptrauth initialization")
Fixes: f227e3ec3b ("random32: update the net random state on interrupt and activity")
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
This patch fixes a race condition that causes a use-after-free during
amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits
are requested and the second one finishes before the first. Essentially,
this bug occurs when the following sequence of events happens:
1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is
deferred to the workqueue.
2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is
deferred to the workqueue.
3. Commit #2 starts before commit #1, dm_state #1 is used in the
commit_tail and commit #2 completes, freeing dm_state #1.
4. Commit #1 starts after commit #2 completes, uses the freed dm_state
1 and dereferences a freelist pointer while setting the context.
Since this bug has only been spotted with fast commits, this patch fixes
the bug by clearing the dm_state instead of using the old dc_state for
fast updates. In addition, since dm_state is only used for its dc_state
and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found,
removing the dm_state should not have any consequences in fast updates.
This use-after-free bug has existed for a while now, but only caused a
noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate
freelist pointer to middle of object") moving the freelist pointer from
dm_state->base (which was unused) to dm_state->context (which is
dereferenced).
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
Fixes: bd200d190f ("drm/amd/display: Don't replace the dc_state for fast updates")
Reported-by: Duncan <1i5t5.duncan@cox.net>
Signed-off-by: Mazin Rezk <mnrzk@protonmail.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Compiler leaves a 4-byte hole near the end of `dev_info`, causing
amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace
when `size` is greater than 356.
In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which
unfortunately does not initialize that 4-byte hole. Fix it by using
memset() instead.
Cc: stable@vger.kernel.org
Fixes: c193fa91b9 ("drm/amdgpu: information leak in amdgpu_info_ioctl()")
Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This warning can trigger if there is a mismatch between frames that were
sent with the sta pointer set vs tx status frames reported for the sta address.
This can happen due to race conditions on re-creating stations, or even
in the case of .sta_add/remove being used instead of .sta_state, which can cause
frames to be sent to a station that has not been uploaded yet.
If there is an actual underflow issue, it should show up in the device airtime
warning below, so it is better to remove this one.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20200725084533.13829-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Allocated ack_frame id from local->ack_status_frames is not really
stored in the tx_info for 802.3 Tx path. Due to this, tx ack status
is not reported and ack_frame id is not freed for the buffers requiring
tx ack status. Also move the memset to 0 of tx_info before
IEEE80211_TX_CTL_REQ_TX_STATUS flag assignment.
Fixes: 50ff477a86 ("mac80211: add 802.11 encapsulation offloading support")
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@codeaurora.org>
Link: https://lore.kernel.org/r/1595427617-1713-1-git-send-email-vthiagar@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In the case where a vendor command does not implement doit, and has no
flags set, doit would not be validated and a NULL pointer dereference
would occur, for example when invoking the vendor command via iw.
I encountered this while developing new vendor commands. Perhaps in
practice it is advisable to always implement doit along with dumpit,
but it seems reasonable to me to always check doit anyway, not just
when NEED_WDEV.
Signed-off-by: Julian Squires <julian@cipht.net>
Link: https://lore.kernel.org/r/20200706211353.2366470-1-julian@cipht.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The commit 24a2042cb2 ("mac80211: add HE 6 GHz Band Capability
element") failed to check device capability before adding HE 6 GHz
capability element. Below warning is reported in 11ac device in mesh.
Fix that by checking device capability at HE 6 GHz cap IE addition
in mesh beacon and association request.
WARNING: CPU: 1 PID: 1897 at net/mac80211/util.c:2878
ieee80211_ie_build_he_6ghz_cap+0x149/0x150 [mac80211]
[ 3138.720358] Call Trace:
[ 3138.720361] ieee80211_mesh_build_beacon+0x462/0x530 [mac80211]
[ 3138.720363] ieee80211_start_mesh+0xa8/0xf0 [mac80211]
[ 3138.720365] __cfg80211_join_mesh+0x122/0x3e0 [cfg80211]
[ 3138.720368] nl80211_join_mesh+0x3d3/0x510 [cfg80211]
Fixes: 24a2042cb2 ("mac80211: add HE 6 GHz Band Capability element")
Reported-by: Markus Theil <markus.theil@tu-ilmenau.de>
Signed-off-by: Rajkumar Manoharan <rmanohar@codeaurora.org>
Link: https://lore.kernel.org/r/1593656424-18240-1-git-send-email-rmanohar@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Currently, espintcp_rcv drops packets silently, which makes debugging
issues difficult. Count packets as either XfrmInHdrError (when the
packet was too short or contained invalid data) or XfrmInError (for
other issues).
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Currently, short messages (less than 4 bytes after the length header)
will break the stream of messages. This is unnecessary, since we can
still parse messages even if they're too short to contain any usable
data. This is also bogus, as keepalive messages (a single 0xff byte),
though not needed with TCP encapsulation, should be allowed.
This patch changes the stream parser so that short messages are
accepted and dropped in the kernel. Messages that contain a valid SPI
or non-ESP header are processed as before.
Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Reported-by: Andrew Cagney <cagney@libreswan.org>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
It turns out that the plugin right now ends up being really unhappy
about the change from 'static' to 'extern' storage that happened in
commit f227e3ec3b ("random32: update the net random state on interrupt
and activity").
This is probably a trivial fix for the latent_entropy plugin, but for
now, just remove net_rand_state from the list of things the plugin
worries about.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Recently ASPM handling was changed to allow ASPM on PCIe-to-PCI/PCI-X
bridges. Unfortunately the ASMedia ASM1083/1085 PCIe to PCI bridge device
doesn't seem to function properly with ASPM enabled. On an Asus PRIME
H270-PRO motherboard, it causes errors like these:
pcieport 0000:00:1c.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
pcieport 0000:00:1c.0: AER: device [8086:a292] error status/mask=00003000/00002000
pcieport 0000:00:1c.0: AER: [12] Timeout
pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
pcieport 0000:00:1c.0: AER: can't find device of ID00e0
In addition to flooding the kernel log, this also causes the machine to
wake up immediately after suspend is initiated.
The device advertises ASPM L0s and L1 support in the Link Capabilities
register, but the ASMedia web page for ASM1083 [1] claims "No PCIe ASPM
support".
Windows 10 (build 2004) enables L0s, but it also logs correctable PCIe
errors.
Add a quirk to disable ASPM for this device.
[1] https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114
[bhelgaas: commit log]
Fixes: 66ff14e59e ("PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208667
Link: https://lore.kernel.org/r/20200722021803.17958-1-hancockrwd@gmail.com
Signed-off-by: Robert Hancock <hancockrwd@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix HASH_OF_MAPS bug of not putting inner map pointer on bpf_map_elem_update()
operation. This is due to per-cpu extra_elems optimization, which bypassed
free_htab_elem() logic doing proper clean ups. Make sure that inner map is put
properly in optimized case as well.
Fixes: 8c290e60fa ("bpf: fix hashmap extra_elems logic")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200729040913.2815687-1-andriin@fb.com
RX queue IRQ mappings are disposed in both the TX IRQ and RX IRQ
error paths. Fix this and dispose of TX IRQ mappings correctly in
case of an error.
Fixes: ea22d51a78 ("ibmvnic: simplify and improve driver probe function")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull audit fixes from Paul Moore:
"One small audit fix that you can hopefully merge before v5.8 is
released. Unfortunately it is a revert of a patch that went in during
the v5.7 window and we just recently started to see some bug reports
relating to that commit.
We are working on a proper fix, but I'm not yet clear on when that
will be ready and we need to fix the v5.7 kernels anyway, so in the
interest of time a revert seemed like the best solution right now"
* tag 'audit-pr-20200729' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
revert: 1320a4052e ("audit: trigger accompanying records when no rules present")
Pull 9p fixes from Dominique Martinet:
"A couple of syzcaller fixes for 5.8
The first one in particular has been quite noisy ("broke" in -rc5) so
this would be worth landing even this late even if users likely won't
see a difference"
* tag '9p-for-5.8-2' of git://github.com/martinetd/linux:
9p/trans_fd: Fix concurrency del of req_list in p9_fd_cancelled/p9_read_work
net/9p: validate fds in p9_fd_open
Ido Schimmel says:
====================
mlxsw fixes
This patch set contains various fixes for mlxsw.
Patches #1-#2 fix two trap related issues introduced in previous cycle.
Patches #3-#5 fix rare use-after-frees discovered by syzkaller. After
over a week of fuzzing with the fixes, the bugs did not reproduce.
Patch #6 from Amit fixes an issue in the ethtool selftest that was
recently discovered after running the test on a new platform that
supports only 1Gbps and 10Gbps speeds.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The test case check_highest_speed_is_chosen() configures $h1 to
advertise a subset of its supported speeds and checks that $h2 chooses
the highest speed from the subset.
To find the common advertised speeds between $h1 and $h2,
common_speeds_get() is called.
Currently, the first speed returned from common_speeds_get() is removed
claiming "h1 does not advertise this speed". The claim is wrong because
the function is called after $h1 already advertised a subset of speeds.
In case $h1 supports only two speeds, it will advertise a single speed
which will be later removed because of previously mentioned bug. This
results in the test needlessly failing. When more than two speeds are
supported this is not an issue because the first advertised speed
is the lowest one.
Fix this by not removing any speed from the list of commonly advertised
speeds.
Fixes: 64916b57c0 ("selftests: forwarding: Add speed and auto-negotiation test")
Reported-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The lifetime of the Rx listener item ('rxl_item') is managed using RCU,
but is dereferenced outside of RCU read-side critical section, which can
lead to a use-after-free.
Fix this by increasing the scope of the RCU read-side critical section.
Fixes: 93c1edb27f ("mlxsw: Introduce Mellanox switch driver core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cited commit mistakenly removed the trap group for externally routed
packets (e.g., via the management interface) and grouped locally routed
and externally routed packet traps under the same group, thereby
subjecting them to the same policer.
This can result in problems, for example, when FRR is restarted and
suddenly all transient traffic is trapped to the CPU because of a
default route through the management interface. Locally routed packets
required to re-establish a BGP connection will never reach the CPU and
the routing tables will not be re-populated.
Fix this by using a different trap group for externally routed packets.
Fixes: 8110668ecd ("mlxsw: spectrum_trap: Register layer 3 control traps")
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cited commit added the ability to program link-local prefix routes to
the ASIC so that relevant packets are routed and trapped correctly.
However, host routes were not included in the change and thus not
programmed to the ASIC. This can result in packets being trapped via an
external route trap instead of a local route trap as in IPv4.
Fix this by programming all the link-local routes to the ASIC.
Fixes: 10d3757fcb ("mlxsw: spectrum_router: Allow programming link-local prefix routes")
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fib_trie_unmerge() is called with RTNL held, but not from an RCU
read-side critical section. This leads to the following warning [1] when
the FIB alias list in a leaf is traversed with
hlist_for_each_entry_rcu().
Since the function is always called with RTNL held and since
modification of the list is protected by RTNL, simply use
hlist_for_each_entry() and silence the warning.
[1]
WARNING: suspicious RCU usage
5.8.0-rc4-custom-01520-gc1f937f3f83b #30 Not tainted
-----------------------------
net/ipv4/fib_trie.c:1867 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ip/164:
#0: ffffffff85a27850 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x49a/0xbd0
stack backtrace:
CPU: 0 PID: 164 Comm: ip Not tainted 5.8.0-rc4-custom-01520-gc1f937f3f83b #30
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack+0x100/0x184
lockdep_rcu_suspicious+0x153/0x15d
fib_trie_unmerge+0x608/0xdb0
fib_unmerge+0x44/0x360
fib4_rule_configure+0xc8/0xad0
fib_nl_newrule+0x37a/0x1dd0
rtnetlink_rcv_msg+0x4f7/0xbd0
netlink_rcv_skb+0x17a/0x480
rtnetlink_rcv+0x22/0x30
netlink_unicast+0x5ae/0x890
netlink_sendmsg+0x98a/0xf40
____sys_sendmsg+0x879/0xa00
___sys_sendmsg+0x122/0x190
__sys_sendmsg+0x103/0x1d0
__x64_sys_sendmsg+0x7d/0xb0
do_syscall_64+0x54/0xa0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fc80a234e97
Code: Bad RIP value.
RSP: 002b:00007ffef8b66798 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc80a234e97
RDX: 0000000000000000 RSI: 00007ffef8b66800 RDI: 0000000000000003
RBP: 000000005f141b1c R08: 0000000000000001 R09: 0000000000000000
R10: 00007fc80a2a8ac0 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 00007ffef8b67008 R15: 0000556fccb10020
Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit cited below removed the RCU read-side critical section from
rtnl_fdb_dump() which means that the ndo_fdb_dump() callback is invoked
without RCU protection.
This results in the following warning [1] in the VXLAN driver, which
relied on the callback being invoked from an RCU read-side critical
section.
Fix this by calling rcu_read_lock() in the VXLAN driver, as already done
in the bridge driver.
[1]
WARNING: suspicious RCU usage
5.8.0-rc4-custom-01521-g481007553ce6 #29 Not tainted
-----------------------------
drivers/net/vxlan.c:1379 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by bridge/166:
#0: ffffffff85a27850 (rtnl_mutex){+.+.}-{3:3}, at: netlink_dump+0xea/0x1090
stack backtrace:
CPU: 1 PID: 166 Comm: bridge Not tainted 5.8.0-rc4-custom-01521-g481007553ce6 #29
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack+0x100/0x184
lockdep_rcu_suspicious+0x153/0x15d
vxlan_fdb_dump+0x51e/0x6d0
rtnl_fdb_dump+0x4dc/0xad0
netlink_dump+0x540/0x1090
__netlink_dump_start+0x695/0x950
rtnetlink_rcv_msg+0x802/0xbd0
netlink_rcv_skb+0x17a/0x480
rtnetlink_rcv+0x22/0x30
netlink_unicast+0x5ae/0x890
netlink_sendmsg+0x98a/0xf40
__sys_sendto+0x279/0x3b0
__x64_sys_sendto+0xe6/0x1a0
do_syscall_64+0x54/0xa0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fe14fa2ade0
Code: Bad RIP value.
RSP: 002b:00007fff75bb5b88 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00005614b1ba0020 RCX: 00007fe14fa2ade0
RDX: 000000000000011c RSI: 00007fff75bb5b90 RDI: 0000000000000003
RBP: 00007fff75bb5b90 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00005614b1b89160
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Fixes: 5e6d243587 ("bridge: netlink dump interface at par with brctl")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The lookaside count is improperly initialized to the size of the
Receive Queue with the additional +1. In the traces below, the
RQ size is 384, so the count was set to 385.
The lookaside count is then rarely refreshed. Note the high and
incorrect count in the trace below:
rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9008 wr_id 55c7206d75a0 qpn c
qpt 2 pid 3018 num_sge 1 head 1 tail 0, count 385
rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1
The head,tail indicate there is only one RWQE posted although the count
says 385 and we correctly return the element 0.
The next call to rvt_get_rwqe with the decremented count:
rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9058 wr_id 0 qpn c
qpt 2 pid 3018 num_sge 0 head 1 tail 1, count 384
rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1
Note that the RQ is empty (head == tail) yet we return the RWQE at tail 1,
which is not valid because of the bogus high count.
Best case, the RWQE has never been posted and the rc logic sees an RWQE
that is too small (all zeros) and puts the QP into an error state.
In the worst case, a server slow at posting receive buffers might fool
rvt_get_rwqe() into fetching an old RWQE and corrupt memory.
Fix by deleting the faulty initialization code and creating an
inline to fetch the posted count and convert all callers to use
new inline.
Fixes: f592ae3c99 ("IB/rdmavt: Fracture single lock used for posting and processing RWQEs")
Link: https://lore.kernel.org/r/20200728183848.22226.29132.stgit@awfm-01.aw.intel.com
Reported-by: Zhaojuan Guo <zguo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.4.x
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Tested-by: Honggang Li <honli@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Pull drm fixes from Dave Airlie:
"The nouveau fixes missed the last pull by a few hours, and we had a
few arm driver/panel/bridge fixes come in.
This is possibly a bit more than I'm comfortable sending at this
stage, but I've looked at each patch, the core + nouveau patches fix
regressions, and the arm related ones are all around screens turning
on and working, and are mostly trivial patches, the line count is
mostly in comments.
core:
- fix possible use-after-free
drm_fb_helper:
- regression fix to use memcpy_io on bochs' sparc64
nouveau:
- format modifiers fixes
- HDA regression fix
- turing modesetting race fix
of:
- fix a double free
dbi:
- fix SPI Type 1 transfer
mcde:
- fix screen stability crash
panel:
- panel: fix display noise on auo,kd101n80-45na
- panel: delay HPD checks for boe_nv133fhm_n61
bridge:
- bridge: drop connector check in nwl-dsi bridge
- bridge: set proper bridge type for adv7511"
* tag 'drm-fixes-2020-07-29' of git://anongit.freedesktop.org/drm/drm:
drm: hold gem reference until object is no longer accessed
drm/dbi: Fix SPI Type 1 (9-bit) transfer
drm/drm_fb_helper: fix fbdev with sparc64
drm/mcde: Fix stability issue
drm/bridge: nwl-dsi: Drop DRM_BRIDGE_ATTACH_NO_CONNECTOR check.
drm/panel: Fix auo, kd101n80-45na horizontal noise on edges of panel
drm: panel: simple: Delay HPD checking on boe_nv133fhm_n61 for 15 ms
drm/bridge/adv7511: set the bridge type properly
drm: of: Fix double-free bug
drm/nouveau/fbcon: zero-initialise the mode_cmd2 structure
drm/nouveau/fbcon: fix module unload when fbcon init has failed for some reason
drm/nouveau/kms/tu102: wait for core update to complete when assigning windows
drm/nouveau/kms/gf100: use correct format modifiers
drm/nouveau/disp/gm200-: fix regression from HDA SOR selection changes
This modifies the first 32 bits out of the 128 bits of a random CPU's
net_rand_state on interrupt or CPU activity to complicate remote
observations that could lead to guessing the network RNG's internal
state.
Note that depending on some network devices' interrupt rate moderation
or binding, this re-seeding might happen on every packet or even almost
never.
In addition, with NOHZ some CPUs might not even get timer interrupts,
leaving their local state rarely updated, while they are running
networked processes making use of the random state. For this reason, we
also perform this update in update_process_times() in order to at least
update the state when there is user or system activity, since it's the
only case we care about.
Reported-by: Amit Klein <aksecurity@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The poison_val field in the virtio_balloon_config is treated as a
little-endian field by the host. Since we are currently only having to deal
with a single byte poison value this isn't a problem, however if the value
should ever expand it would cause byte ordering issues. Document that in
the code so that we know that if the value should ever expand we need to
byte swap the value on big-endian architectures.
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Link: https://lore.kernel.org/r/20200713203539.17140.71425.stgit@localhost.localdomain
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
vhost/scsi doesn't handle type conversion correctly
for request type when using virtio 1.0 and up for BE,
or cross-endian platforms.
Fix it up using vhost_32_to_cpu.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Pull NVMe fixes from Christoph.
* 'nvme-5.8' of git://git.infradead.org/nvme:
nvme: add a Identify Namespace Identification Descriptor list quirk
nvme-pci: prevent SK hynix PC400 from using Write Zeroes command
nvme-tcp: fix possible hang waiting for icresp response
Scatter CQE feature relies on two flags MLX5_QP_FLAG_SCATTER_CQE and
MLX5_QP_FLAG_ALLOW_SCATTER_CQE, both of them can be provided without
relation to device capability.
Relax global validity check to allow MLX5_QP_FLAG_ALLOW_SCATTER_CQE QP
flag.
Existing user applications are failing on this new validity check.
Fixes: 90ecb37a75 ("RDMA/mlx5: Change scatter CQE flag to be set like other vendor flags")
Fixes: 37518fa49f ("RDMA/mlx5: Process all vendor flags in one place")
Link: https://lore.kernel.org/r/20200728120255.805733-1-leon@kernel.org
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
kobject_init_and_add() takes reference even when it fails.
If this function returns an error, kobject_put() must be called to
properly clean up the memory associated with the object.
Callback function fw_cfg_sysfs_release_entry() in kobject_put()
can handle the pointer "entry" properly.
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Link: https://lore.kernel.org/r/20200613190533.15712-1-wu000273@umn.edu
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
0day reported a possible circular locking dependency:
Chain exists of:
&irq_desc_lock_class --> console_owner --> &port_lock_key
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&port_lock_key);
lock(console_owner);
lock(&port_lock_key);
lock(&irq_desc_lock_class);
The reason for this is a printk() in the i8259 interrupt chip driver
which is invoked with the irq descriptor lock held, which reverses the
lock operations vs. printk() from arbitrary contexts.
Switch the printk() to printk_deferred() to avoid that.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87365abt2v.fsf@nanos.tec.linutronix.de
Unfortunately the commit listed in the subject line above failed
to ensure that the task's audit_context was properly initialized/set
before enabling the "accompanying records". Depending on the
situation, the resulting audit_context could have invalid values in
some of it's fields which could cause a kernel panic/oops when the
task/syscall exists and the audit records are generated.
We will revisit the original patch, with the necessary fixes, in a
future kernel but right now we just want to fix the kernel panic
with the least amount of added risk.
Cc: stable@vger.kernel.org
Fixes: 1320a4052e ("audit: trigger accompanying records when no rules present")
Reported-by: j2468h@googlemail.com
Signed-off-by: Paul Moore <paul@paul-moore.com>
When the ASoC card registration fails and the codec component driver
never probes, the codec device is not initialized and therefore
memory for codec->wcaps is not allocated. This results in a NULL pointer
dereference when the codec driver suspend callback is invoked during
system suspend. Fix this by returning without performing any actions
during codec suspend/resume if the card was not registered successfully.
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20200728231011.1454066-1-ranjani.sridharan@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Add a quirk for a device that does not support the Identify Namespace
Identification Descriptor list despite claiming 1.3 compliance.
Fixes: ea43d9709f ("nvme: fix identify error status silent ignore")
Reported-by: Ingo Brunberg <ingo_brunberg@web.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Ingo Brunberg <ingo_brunberg@web.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
* drm: fix possible use-after-free
* dbi: fix SPI Type 1 transfer
* drm_fb_helper: use memcpy_io on bochs' sparc64
* mcde: fix stability
* panel: fix display noise on auo,kd101n80-45na
* panel: delay HPD checks for boe_nv133fhm_n61
* bridge: drop connector check in nwl-dsi bridge
* bridge: set proper bridge type for adv7511
* of: fix a double free
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728110446.GA8076@linux-uq9g
Removed redundant words.
Fixes: 571912c69f ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In multiproto mode, bareudp_xmit() accepts sending multicast MPLS and
IPv6 packets regardless of the bareudp ethertype. In practice, this
let an IP tunnel send multicast MPLS packets, or an MPLS tunnel send
IPv6 packets.
We need to restrict the test further, so that the multiproto mode only
enables
* IPv6 for IPv4 tunnels,
* or multicast MPLS for unicast MPLS tunnels.
To improve clarity, the protocol validation is moved to its own
function, where each logical test has its own condition.
v2: s/ntohs/htons/
Fixes: 4b5f67232d ("net: Special handling for IP & MPLS.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip6_route_info_create() invokes nexthop_get(), which increases the
refcount of the "nh".
When ip6_route_info_create() returns, local variable "nh" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling path of
ip6_route_info_create(). When nexthops can not be used with source
routing, the function forgets to decrease the refcnt increased by
nexthop_get(), causing a refcnt leak.
Fix this issue by pulling up the error source routing handling when
nexthops can not be used with source routing.
Fixes: f88d8ea67f ("ipv6: Plumb support for nexthop object in a fib6_info")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subbaraya Sundeep says:
====================
Fix bugs in Octeontx2 netdev driver
There are problems in the existing Octeontx2
netdev drivers like missing cancel_work for the
reset task, missing lock in reset task and
missing unergister_netdev in driver remove.
This patch set fixes the above problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Added unregister_netdev in the driver remove
function. Generally unregister_netdev is called
after disabling all the device interrupts but here
it is called before disabling device mailbox
interrupts. The reason behind this is VF needs
mailbox interrupt to communicate with its PF to
clean up its resources during otx2_stop.
otx2_stop disables packet I/O and queue interrupts
first and by using mailbox interrupt communicates
to PF to free VF resources. Hence this patch
calls unregister_device just before
disabling mailbox interrupts.
Fixes: 3184fb5ba9 ("octeontx2-vf: Virtual function driver support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During driver exit cancel the queued
reset_task work in VF driver.
Fixes: 3184fb5ba9 ("octeontx2-vf: Virtual function driver support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two bugs exist in the code related to reset_task
in PF driver one is the missing protection
against network stack ndo_open and ndo_close.
Other one is the missing cancel_work.
This patch fixes those problems.
Fixes: 4ff7d1488a ("octeontx2-pf: Error handling support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu says:
====================
rhashtable: Fix unprotected RCU dereference in __rht_ptr
This patch series fixes an unprotected dereference in __rht_ptr.
The first patch is a minimal fix that does not use the correct
RCU markings but is suitable for backport, and the second patch
cleans up the RCU markings.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch restores the RCU marking on bucket_table->buckets as
it really does need RCU protection. Its removal had led to a fatal
bug.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rcu_dereference call in rht_ptr_rcu is completely bogus because
we've already dereferenced the value in __rht_ptr and operated on it.
This causes potential double readings which could be fatal. The RCU
dereference must occur prior to the comparison in __rht_ptr.
This patch changes the order of RCU dereference so that it is done
first and the result is then fed to __rht_ptr. The RCU marking
changes have been minimised using casts which will be removed in
a follow-up patch.
Fixes: ba6306e3f6 ("rhashtable: Remove RCU marking from...")
Reported-by: "Gong, Sishuai" <sishuai@purdue.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
mlx5 fixes-2020-07-28
This series introduces some fixes to mlx5 driver.
v1->v2:
- Drop the "Hold reference on mirred devices" patch, until Or's
comments are addressed.
- Imporve "Modify uplink state" patch commit message per Or's request.
Please pull and let me know if there is any problem.
For -Stable:
For -stable v4.9
('net/mlx5e: Fix error path of device attach')
For -stable v4.15
('net/mlx5: Verify Hardware supports requested ptp function on a given
pin')
For -stable v5.3
('net/mlx5e: Modify uplink state on interface up/down')
For -stable v5.4
('net/mlx5e: Fix kernel crash when setting vf VLANID on a VF dev')
('net/mlx5: E-switch, Destroy TSAR when fail to enable the mode')
For -stable v5.5
('net/mlx5: E-switch, Destroy TSAR after reload interface')
For -stable v5.7
('net/mlx5: Fix a bug of using ptp channel index as pin index')
====================
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hovold says:
====================
net: lan78xx: fix NULL deref and memory leak
The first two patches fix a NULL-pointer dereference at probe that can
be triggered by a malicious device and a small transfer-buffer memory
leak, respectively.
For another subsystem I would have marked them:
Cc: stable@vger.kernel.org # 4.3
The third one replaces the driver's current broken endpoint lookup
helper, which could end up accepting incomplete interfaces and whose
results weren't even useeren
Johan
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop the bogus endpoint-lookup helper which could end up accepting
interfaces based on endpoints belonging to unrelated altsettings.
Note that the returned bulk pipes and interrupt endpoint descriptor
were never actually used. Instead the bulk-endpoint numbers are
hardcoded to 1 and 2 (matching the specification), while the interrupt-
endpoint descriptor was assumed to be the third descriptor created by
USB core.
Try to bring some order to this by dropping the bogus lookup helper and
adding the missing endpoint sanity checks while keeping the interrupt-
descriptor assumption for now.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the missing endpoint sanity check to prevent a NULL-pointer
dereference should a malicious device lack the expected endpoints.
Note that the driver has a broken endpoint-lookup helper,
lan78xx_get_endpoints(), which can end up accepting interfaces in an
altsetting without endpoints as long as *some* altsetting has a bulk-in
and a bulk-out endpoint.
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Cc: Woojung.Huh@microchip.com <Woojung.Huh@microchip.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When setting the PF interface up/down, notify the firmware to update
uplink state via MODIFY_VPORT_STATE, when E-Switch is enabled.
This behavior will prevent sending traffic out on uplink port when PF is
down, such as sending traffic from a VF interface which is still up.
Currently when calling mlx5e_open/close(), the driver only sends PAOS
command to notify the firmware to set the physical port state to
up/down, however, it is not sufficient. When VF is in "auto" state, it
follows the uplink state, which was not updated on mlx5e_open/close()
before this patch.
When switchdev mode is enabled and uplink representor is first enabled,
set the uplink port state value back to its FW default "AUTO".
Fixes: 63bfd399de ("net/mlx5e: Send PAOS command on interface up/down")
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In a special configuration, a ConnectX6-Dx pin pps-out might be activated
when driver is loaded. Fix the driver to always read the operational pin
mode when registering it, and advertise it accordingly.
Fixes: ee7f12205a ("net/mlx5e: Implement 1PPS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5e_rep_is_lag_netdev is used as first check as part of netdev events
handler for bond device of non-uplink representors, this handler can get
any netdevice under the same network namespace of mlx5e netdevice. Current
code treats the netdev as mlx5e netdev and only later on verifies this,
hence causes the following Kasan trace:
[15402.744990] ==================================================================
[15402.746942] BUG: KASAN: slab-out-of-bounds in mlx5e_rep_is_lag_netdev+0xcb/0xf0 [mlx5_core]
[15402.749009] Read of size 8 at addr ffff880391f3f6b0 by task ovs-vswitchd/5347
[15402.752065] CPU: 7 PID: 5347 Comm: ovs-vswitchd Kdump: loaded Tainted: G B O --------- -t - 4.18.0-g3dcc204d291d-dirty #1
[15402.755349] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[15402.757600] Call Trace:
[15402.758968] dump_stack+0x71/0xab
[15402.760427] print_address_description+0x6a/0x270
[15402.761969] kasan_report+0x179/0x2d0
[15402.763445] ? mlx5e_rep_is_lag_netdev+0xcb/0xf0 [mlx5_core]
[15402.765121] mlx5e_rep_is_lag_netdev+0xcb/0xf0 [mlx5_core]
[15402.766782] mlx5e_rep_esw_bond_netevent+0x129/0x620 [mlx5_core]
Fix by deferring the violating access to be post the netdev verify check.
Fixes: 7e51891a23 ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Fix a bug where driver did not verify Hardware pin capabilities for
PTP functions.
Fixes: ee7f12205a ("net/mlx5e: Implement 1PPS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
On PTP mlx5_ptp_enable(on=0) flow, driver mistakenly used channel index
as pin index.
After ptp patch marked in fixes tag was introduced, driver can freely
call ptp_find_pin() as part of the .enable() callback.
Fix driver mlx5_ptp_enable(on=0) flow to always use ptp_find_pin(). With
that, Driver will use the correct pin index in mlx5_ptp_enable(on=0) flow.
In addition, when initializing the pins, always set channel to zero. As
all pins can be attached to all channels, let ptp_set_pinfunc() to move
them between the channels.
For stable branches, this fix to be applied only on kernels that includes
both patches in fixes tag. Otherwise, mlx5_ptp_enable(on=0) will be stuck
on pincfg_mux.
Fixes: 62582a7ee7 ("ptp: Avoid deadlocks in the programmable pin code.")
Fixes: ee7f12205a ("net/mlx5e: Implement 1PPS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The cited commit add initialization of ethtool steering during
representor rx initializations without cleaning it up in representor
rx cleanup, this may cause for stale ethtool flows to remain after
moving back from switchdev mode to legacy mode.
Fixed by calling ethtool steering cleanup during rep rx cleanup.
Fixes: 6783e8b29f ("net/mlx5e: Init ethtool steering for representors")
Signed-off-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
On failure to attach the netdev, fix the rollback by re-setting the
device's state back to MLX5E_STATE_DESTROYING.
Failing to attach doesn't stop statistics polling via .ndo_get_stats64.
In this case, although the device is not attached, it falsely continues
to query the firmware for counters. Setting the device's state back to
MLX5E_STATE_DESTROYING prevents the firmware counters query.
Fixes: 26e59d8077 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The steering tree is as follow (nic RX as example):
---------
|root_ns|
---------
|
--------------------------------
| | |
---------- ---------- ---------
|p(prio)0| | p1 | | pn |
---------- ---------- ---------
| |
---------------- ---------------
|ns(e.g bypass)| |ns(e.g. lag) |
---------------- ---------------
| | |
---- ---- ----
|p0| |p1| |pn|
---- ---- ----
|
----
|FT|
----
find_next_chained_ft(prio) returns the first flow table in the next
priority. If prio is a parent of a flow table then it returns the first
flow table in the next priority in the same namespace, else if prio
is parent of namespace, then it should return the first flow table
in the next namespace. Currently if the user requests to forward to
next namespace, the code calls to find_next_chained_ft with the prio
of the next namespace and not the prio of the namesapce itself.
Fixes: 9254f8ed15 ("net/mlx5: Add support in forward to namespace")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When eswitch offloads is enabled, TSAR is created before reloading
the interfaces.
However when eswitch offloads mode is disabled, TSAR is disabled before
reloading the interfaces.
To keep the eswitch enable/disable sequence as mirror, destroy TSAR
after reloading the interfaces.
Fixes: 1bd27b11c1 ("net/mlx5: Introduce E-switch QoS management")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When either esw_legacy_enable() or esw_offloads_enable() fails,
code missed to destroy the created TSAR.
Hence, add the missing call to destroy the TSAR.
Fixes: 610090ebce ("net/mlx5: E-switch, Initialize TSAR Qos hardware block before its user vports")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Huazhong Tan says:
====================
net: hns3: fixes for -net
There are some bugfixes for the HNS3 ethernet driver. patch#1 fixes
a desc filling bug, patch#2 fixes a false TX timeout issue, and
patch#3~#5 fixes some bugs related to VLAN and FD.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When device is resetting or reset failed, firmware is unable to
handle mailbox. VLAN should not be configured in this case.
Fixes: fe4144d47e ("net: hns3: sync VLAN filter entries when kill VLAN ID failed")
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When user had created a FD rule, all the aRFS rules should be clear up.
HNS3 process flow as below:
1.get spin lock of fd_ruls_list
2.clear up all aRFS rules
3.release lock
4.get spin lock of fd_ruls_list
5.creat a rules
6.release lock;
There is a short period of time between step 3 and step 4, which would
creatting some new aRFS FD rules if driver was receiving packet.
So refactor the fd_rule_lock to fix it.
Fixes: 4412288757 ("net: hns3: refine the flow director handle")
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently hclgevf_update_port_base_vlan_info() may be called when
VF is resetting, which may cause hns3_nic_net_open() being called
twice unexpectedly.
So fix it by adding a reset check for it, and extend critical
region for rntl_lock in hclgevf_update_port_base_vlan_info().
Fixes: 92f11ea177 ("net: hns3: fix set port based VLAN issue for VF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the queue depth and queue parameters are modified, there is
a low probability that TX timeout occurs. The two operations cause
the link to be down or up when the watchdog is still working. All
queues are stopped when the link is down. After the carrier is on,
all queues are woken up. If the watchdog detects the link between
the carrier on and wakeup queues, a false TX timeout occurs.
So fix this issue by modifying the sequence of carrier on and queue
wakeup, which is symmetrical to the link down action.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The linear and frag data part may be changed when the skb is expanded
or lineared in skb_cow_head() or skb_checksum_help(), which is called
by hns3_fill_skb_desc(), so the linear len return by skb_headlen()
before the calling of hns3_fill_skb_desc() is unreliable.
Move hns3_fill_skb_desc() before the calling of skb_headlen() to fix
this bug.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull asm-generic bugfix from Arnd Bergmann:
"A single bugfix for a regression introduced through a typo in the v5.8
merge window, leading to incorrect data returned from inl() on some
architectures"
* tag 'asm-generic-fixes-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
io: Fix return type of _inb and _inl
Pull ARM SoC DT fixes from Arnd Bergmann:
"These are the latest device tree fixes for Arm SoCs:
- TI Keystone2 ethernet regressed after a driver change broke with
incorrect phy-mode in a board's DT source.
- A similar fix is needed for two i.MX boards that were missed in an
earlier bugfix.
- DT change for Armada 38x allowing to add the register needed to fix
NETA lockup when repeatedly switching speed.
- One fix on imx6qdl-icore pin muxing to get USB OTG_ID and SD card
detect work correctly.
- Two fixes for the Allwinner SoCs, one to relax the CMA allocation
ranges that were failing on older SoCs and one to fix Cedrus on the
H6"
* tag 'arm-fixes-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: dts: keystone-k2g-evm: fix rgmii phy-mode for ksz9031 phy
ARM: dts: armada-38x: fix NETA lockup when repeatedly switching speeds
ARM: dts: imx6qdl-icore: Fix OTG_ID pin and sdcard detect
ARM: dts: imx6sx-sabreauto: Fix the phy-mode on fec2
ARM: dts: imx6sx-sdb: Fix the phy-mode on fec2
arm64: dts: allwinner: h6: Fix Cedrus IOMMU usage
ARM: dts sunxi: Relax a bit the CMA pool allocation range
It's been reported that, when neither nouveau nor Nvidia graphics
driver is used, the screen starts flickering. And, after comparing
between the working case (stable 4.4.x) and the broken case, it turned
out that the problem comes from the audio component binding. The
Nvidia and AMD audio binding code clears the bus->keep_power flag
whenever snd_hdac_acomp_init() succeeds. But this doesn't mean that
the component is actually bound, but it merely indicates that it's
ready for binding. So, when both nouveau and Nvidia are blacklisted
or not ready, the driver keeps running without the audio component but
also with bus->keep_power = false. This made the driver runtime PM
kicked in and powering down when unused, which results in flickering
in the graphics side, as it seems.
For fixing the bug, this patch moves the bus->keep_power flag change
into generic_acomp_notifier_set() that is the function called from the
master_bind callback of component ops; i.e. it's guaranteed that the
binding succeeded.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208609
Fixes: 5a858e79c9 ("ALSA: hda - Disable audio component for legacy Nvidia HDMI codecs")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200728082033.23933-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
If a stage-2 page-table contains an executable, read-only mapping at the
pte level (e.g. due to dirty logging being enabled), a subsequent write
fault to the same page which tries to install a larger block mapping
(e.g. due to dirty logging having been disabled) will erroneously inherit
the exec permission and consequently skip I-cache invalidation for the
rest of the block.
Ensure that exec permission is only inherited by write faults when the
new mapping is of the same size as the existing one. A subsequent
instruction abort will result in I-cache invalidation for the entire
block mapping.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Quentin Perret <qperret@google.com>
Reviewed-by: Quentin Perret <qperret@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200723101714.15873-1-will@kernel.org
So far, vcpu_has_ptrauth() is implemented in terms of system_supports_*_auth()
calls, which are declared "inline". In some specific conditions (clang
and SCS), the "inline" very much turns into an "out of line", which
leads to a fireworks when this predicate is evaluated on a non-VHE
system (right at the beginning of __hyp_handle_ptrauth).
Instead, make sure vcpu_has_ptrauth gets expanded inline by directly
using the cpus_have_final_cap() helpers, which are __always_inline,
generate much better code, and are the only thing that make sense when
running at EL2 on a nVHE system.
Fixes: 29eb5a3c57 ("KVM: arm64: Handle PtrAuth traps early")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://lore.kernel.org/r/20200722162231.3689767-1-maz@kernel.org
commit 17175d1a27 ("xfrm: esp6: fix encapsulation header offset
computation") changed esp6_input_done2 to correctly find the size of
the IPv6 header that precedes the TCP/UDP encapsulation header, but
didn't adjust the final call to skb_set_transport_header, which I
assumed was correct in using skb_network_header_len.
Xiumei Mu reported that when we create xfrm states that include port
numbers in the selector, traffic from the user sockets is dropped. It
turns out that we get a state mismatch in __xfrm_policy_check, because
we end up trying to compare the encapsulation header's ports with the
selector that's based on user traffic ports.
Fixes: 0146dca70b ("xfrm: add support for UDPv6 encapsulation of ESP")
Fixes: 26333c37fc ("xfrm: add IPv6 support for espintcp")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Pull arch/sh fixes from Rich Felker:
"Two last-minute fixes: one is for a boot regression (mmu code broken)
and the other fixes a long-standing broken syscall number bounds
check"
* tag 'sh-for-5.8-part2' of git://git.libc.org/linux-sh:
sh: Fix validation of system call number
sh/tlb: Fix PGTABLE_LEVELS > 2
commit 547ce4cfb3 ("switch cmsghdr_from_user_compat_to_kern() to
copy_from_user()") missed one of the places where ucmlen should've been
replaced with cmsg.cmsg_len, now that we are fetching the entire struct
rather than doing it field-by-field.
As the result, compat sendmsg() with several different-sized cmsg
attached started to fail with EINVAL. Trivial to fix, fortunately.
Fixes: 547ce4cfb3 ("switch cmsghdr_from_user_compat_to_kern() to copy_from_user()")
Reported-by: Nick Bowler <nbowler@draconx.ca>
Tested-by: Nick Bowler <nbowler@draconx.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The slow path for traced system call entries accessed a wrong memory
location to get the number of the maximum allowed system call number.
Renumber the numbered "local" label for the correct location to avoid
collisions with actual local labels.
Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Fixes: f3a8308864 ("sh: Add a few missing irqflags tracing markers.")
Signed-off-by: Rich Felker <dalias@libc.org>
Geert reported that his SH7722-based Migo-R board failed to boot after
commit:
c5b27a889d ("sh/tlb: Convert SH to generic mmu_gather")
That commit fell victim to copying the wrong pattern --
__pmd_free_tlb() used to be implemented with pmd_free().
Fixes: c5b27a889d ("sh/tlb: Convert SH to generic mmu_gather")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rich Felker <dalias@libc.org>
When size_t maps to unsigned int (e.g. on 32-bit powerpc), then the
comparison with 1<<35 is always true. Clang 9 threw:
warning: result of comparison of constant 34359738368 with \
expression of type 'size_t' (aka 'unsigned int') is always true \
[-Wtautological-constant-out-of-range-compare]
while (total < FILE_SZ) {
Tested: make -C tools/testing/selftests TARGETS="net" run_tests
Fixes: 192dc405f3 ("selftests: net: add tcp_mmap program")
Signed-off-by: Tanner Love <tannerlove@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On powerpcle, int64_t maps to long long. Clang 9 threw:
warning: absolute value function 'labs' given an argument of type \
'long long' but has parameter of type 'long' which may cause \
truncation of value [-Wabsolute-value]
if (labs(tstop - texpect) > cfg_variance_us)
Tested: make -C tools/testing/selftests TARGETS="net" run_tests
Fixes: af5136f950 ("selftests/net: SO_TXTIME with ETF and FQ")
Signed-off-by: Tanner Love <tannerlove@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang 9 threw:
warning: format specifies type 'unsigned short' but the argument has \
type 'int' [-Wformat]
typeflags, PORT_BASE, PORT_BASE + port_off);
Tested: make -C tools/testing/selftests TARGETS="net" run_tests
Fixes: 77f65ebdca ("packet: packet fanout rollover during socket overload")
Signed-off-by: Tanner Love <tannerlove@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The signedness of char is implementation-dependent. Some systems
(including PowerPC and ARM) use unsigned char. Clang 9 threw:
warning: result of comparison of constant -1 with expression of type \
'char' is always true [-Wtautological-constant-out-of-range-compare]
&arg_index)) != -1) {
Tested: make -C tools/testing/selftests TARGETS="net" run_tests
Fixes: 16e7812241 ("selftests/net: Add a test to validate behavior of rx timestamps")
Signed-off-by: Tanner Love <tannerlove@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The next hw timestamp should be snapshoot to the read registers
only once the current timestamp has been read.
If none of the pending skbs matches the current HW timestamp
just gracefully flush the available timestamp by reading it.
Signed-off-by: laurent brando <laurent.brando@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unblocking sockets used for outgoing connections were not containing
inet info about the initial connection due to a typo there: the value of
"err" variable is negative in the kernelspace.
This fixes the creation of additional subflows where the remote port has
to be reused if the other host didn't announce another one. This also
fixes inet_diag showing blank info about MPTCP sockets from unblocking
sockets doing a connect().
Fixes: 41be81a8d3 ("mptcp: fix unblocking connect()")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function mipi_dbi_spi1_transfer() will transfer its payload as 9-bit
data, the 9th (MSB) bit being the data/command bit. In order to do that,
it unpacks the 8-bit values into 16-bit values, then sets the 9th bit if
the byte corresponds to data, clears it otherwise. The 7 MSB are
padding. The array of now 16-bit values is then passed to the SPI core
for transfer.
This function was broken since its introduction, as the length of the
SPI transfer was set to the payload size before its conversion, but the
payload doubled in size due to the 8-bit -> 16-bit conversion.
Fixes: 02dd95fe31 ("drm/tinydrm: Add MIPI DBI support")
Cc: <stable@vger.kernel.org> # 5.4+
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200703141341.1266263-1-paul@crapouillou.net
Alok Chauhan has moved out of GENI team, he no longer supports GENI I2C
driver, remove him from maintainer list.
Add Akash Asthana & Mukesh Savaliya as maintainers for GENI I2C drivers.
Signed-off-by: Akash Asthana <akashast@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
All i2c_new_device-alike functions return ERR_PTR these days, but this
fallback function was missed.
Fixes: 2dea645ffc ("i2c: acpi: Return error pointers from i2c_acpi_new_device()")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[wsa: changed from 'ENOSYS' to 'ENODEV']
Signed-off-by: Wolfram Sang <wsa@kernel.org>
We've received a regression report on Intel HD-audio controller that
wakes up immediately after S3 suspend. The bisection leads to the
commit c4c8dd6ef8 ("ALSA: hda: Skip controller resume if not
needed"). This commit replaces the system-suspend to use
pm_runtime_force_suspend() instead of the direct call of
__azx_runtime_suspend(). However, by some really mysterious reason,
pm_runtime_force_suspend() causes a spurious wakeup (although it calls
the same __azx_runtime_suspend() internally).
As an ugly workaround for now, revert the behavior to call
__azx_runtime_suspend() and __azx_runtime_resume() for those old Intel
platforms that may exhibit such a problem, while keeping the new
standard pm_runtime_force_suspend() and pm_runtime_force_resume()
pair for the remaining chips.
Fixes: c4c8dd6ef8 ("ALSA: hda: Skip controller resume if not needed")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208649
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200727164443.4233-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
These are missing throughout ucma, it harmlessly copies garbage from
userspace, but in this new code which uses min to compute the copy length
it can result in uninitialized stack memory. Check for minimum length at
the very start.
BUG: KMSAN: uninit-value in ucma_connect+0x2aa/0xab0 drivers/infiniband/core/ucma.c:1091
CPU: 0 PID: 8457 Comm: syz-executor069 Not tainted 5.8.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1df/0x240 lib/dump_stack.c:118
kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121
__msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
ucma_connect+0x2aa/0xab0 drivers/infiniband/core/ucma.c:1091
ucma_write+0x5c5/0x630 drivers/infiniband/core/ucma.c:1764
do_loop_readv_writev fs/read_write.c:737 [inline]
do_iter_write+0x710/0xdc0 fs/read_write.c:1020
vfs_writev fs/read_write.c:1091 [inline]
do_writev+0x42d/0x8f0 fs/read_write.c:1134
__do_sys_writev fs/read_write.c:1207 [inline]
__se_sys_writev+0x9b/0xb0 fs/read_write.c:1204
__x64_sys_writev+0x4a/0x70 fs/read_write.c:1204
do_syscall_64+0xb0/0x150 arch/x86/entry/common.c:386
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 34e2ab57a9 ("RDMA/ucma: Extend ucma_connect to receive ECE parameters")
Fixes: 0cb15372a6 ("RDMA/cma: Connect ECE to rdma_accept")
Link: https://lore.kernel.org/r/0-v1-d5b86dab17dc+28c25-ucma_syz_min_jgg@nvidia.com
Reported-by: syzbot+086ab5ca9eafd2379aa6@syzkaller.appspotmail.com
Reported-by: syzbot+7446526858b83c8828b2@syzkaller.appspotmail.com
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Recent kernels have been reported to panic using the bochs_drm
framebuffer under qemu-system-sparc64 which was bisected to
commit 7a0483ac4f ("drm/bochs: switch to generic drm fbdev emulation").
The backtrace indicates that the shadow framebuffer copy in
drm_fb_helper_dirty_blit_real() is trying to access the real
framebuffer using a virtual address rather than use an IO access
typically implemented using a physical (ASI_PHYS) access on SPARC.
The fix is to replace the memcpy with memcpy_toio() from io.h.
memcpy_toio() uses writeb() where the original fbdev code
used sbus_memcpy_toio(). The latter uses sbus_writeb().
The difference between writeb() and sbus_memcpy_toio() is
that writeb() writes bytes in little-endian, where sbus_writeb() writes
bytes in big-endian. As endian does not matter for byte writes they are
the same. So we can safely use memcpy_toio() here.
Note that this only fixes bochs, in general fbdev helpers still have
issues with mixing up system memory and __iomem space. Fixing that will
require a lot more work.
v3:
- Improved changelog (Daniel)
- Added FIXME to fbdev_use_iomem (Daniel)
v2:
- Added missing __iomem cast (kernel test robot)
- Made changelog readable and fix typos (Mark)
- Add flag to select iomem - and set it in the bochs driver
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reported-by: kernel test robot <lkp@intel.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200709193016.291267-1-sam@ravnborg.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200725191012.GA434957@ravnborg.org
hdr.vmx.flags is meant for future extensions to the ABI, rejecting
invalid flags is necessary to avoid broken half-loads of the
nVMX state.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A missing VMCS12 was not causing -EINVAL (it was just read with
copy_from_user, so it is not a security issue, but it is still
wrong). Test for VMCS12 validity and reject the nested state
if a VMCS12 is required but not present.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Setting KVM_STATE_NESTED_GUEST_MODE enables various consistency checks
on VMCS12 and therefore causes KVM_SET_NESTED_STATE to fail spuriously
with -EINVAL. Do not set the flag so that we're sure to cover the
conditions included by the test, and cover the case where VMCS12 is
set and KVM_SET_NESTED_STATE is called with invalid VMCS12 contents.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The return type of functions _inb, _inw and _inl are all u16 which looks
wrong. This patch makes them u8, u16 and u32 respectively.
The original commit text for these does not indicate that these should
be all forced to u16.
Fixes: f009c89df7 ("io: Provide _inX() and _outX()")
Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Commit 2f92447f9f ("powerpc/book3s64/hash: Use the pte_t address from the
caller") removed the local_irq_disable from hash_preload, but it was
required for more than just the page table walk: the hash pte busy bit is
effectively a lock which may be taken in interrupt context, and the local
update flag test must not be preempted before it's used.
This solves apparent lockups with perf interrupting __hash_page_64K. If
get_perf_callchain then also takes a hash fault on the same page while it
is already locked, it will loop forever taking hash faults, which looks like
this:
cpu 0x49e: Vector: 100 (System Reset) at [c00000001a4f7d70]
pc: c000000000072dc8: hash_page_mm+0x8/0x800
lr: c00000000000c5a4: do_hash_page+0x24/0x38
sp: c0002ac1cc69ac70
msr: 8000000000081033
current = 0xc0002ac1cc602e00
paca = 0xc00000001de1f280 irqmask: 0x03 irq_happened: 0x01
pid = 20118, comm = pread2_processe
Linux version 5.8.0-rc6-00345-g1fad14f18bc6
49e:mon> t
[c0002ac1cc69ac70] c00000000000c5a4 do_hash_page+0x24/0x38 (unreliable)
--- Exception: 300 (Data Access) at c00000000008fa60 __copy_tofrom_user_power7+0x20c/0x7ac
[link register ] c000000000335d10 copy_from_user_nofault+0xf0/0x150
[c0002ac1cc69af70] c00032bf9fa3c880 (unreliable)
[c0002ac1cc69afa0] c000000000109df0 read_user_stack_64+0x70/0xf0
[c0002ac1cc69afd0] c000000000109fcc perf_callchain_user_64+0x15c/0x410
[c0002ac1cc69b060] c000000000109c00 perf_callchain_user+0x20/0x40
[c0002ac1cc69b080] c00000000031c6cc get_perf_callchain+0x25c/0x360
[c0002ac1cc69b120] c000000000316b50 perf_callchain+0x70/0xa0
[c0002ac1cc69b140] c000000000316ddc perf_prepare_sample+0x25c/0x790
[c0002ac1cc69b1a0] c000000000317350 perf_event_output_forward+0x40/0xb0
[c0002ac1cc69b220] c000000000306138 __perf_event_overflow+0x88/0x1a0
[c0002ac1cc69b270] c00000000010cf70 record_and_restart+0x230/0x750
[c0002ac1cc69b620] c00000000010d69c perf_event_interrupt+0x20c/0x510
[c0002ac1cc69b730] c000000000027d9c performance_monitor_exception+0x4c/0x60
[c0002ac1cc69b750] c00000000000b2f8 performance_monitor_common_virt+0x1b8/0x1c0
--- Exception: f00 (Performance Monitor) at c0000000000cb5b0 pSeries_lpar_hpte_insert+0x0/0x160
[link register ] c0000000000846f0 __hash_page_64K+0x210/0x540
[c0002ac1cc69ba50] 0000000000000000 (unreliable)
[c0002ac1cc69bb00] c000000000073ae0 update_mmu_cache+0x390/0x3a0
[c0002ac1cc69bb70] c00000000037f024 wp_page_copy+0x364/0xce0
[c0002ac1cc69bc20] c00000000038272c do_wp_page+0xdc/0xa60
[c0002ac1cc69bc70] c0000000003857bc handle_mm_fault+0xb9c/0x1b60
[c0002ac1cc69bd50] c00000000006c434 __do_page_fault+0x314/0xc90
[c0002ac1cc69be20] c00000000000c5c8 handle_page_fault+0x10/0x2c
--- Exception: 300 (Data Access) at 00007fff8c861fe8
SP (7ffff6b19660) is in userspace
Fixes: 2f92447f9f ("powerpc/book3s64/hash: Use the pte_t address from the caller")
Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reported-by: Anton Blanchard <anton@ozlabs.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200727060947.10060-1-npiggin@gmail.com
Mention why we open-code strsep, so it is clear that it is intentional.
Fixes: 736bb11898 ("modpost: remove use of non-standard strsep() in HOSTCC code")
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull Kbuild fixes from Masahiro Yamada:
- do not use non-portable strsep() in a host program
- fix single target builds for external modules
- change Clang's --prefix option to make it work for the latest Clang
* tag 'kbuild-fixes-v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
kbuild: fix single target builds for external modules
modpost: remove use of non-standard strsep() in HOSTCC code
Whenever a display update was sent, apart from updating
the memory base address, we called mcde_display_send_one_frame()
which also sent a command to the display requesting the TE IRQ
and enabling the FIFO.
When continuous updates are running this is wrong: we need
to only send this to start the flow to the display on
the very first update. This lead to the display pipeline
locking up and crashing.
Check if the flow is already running and in that case
do not call mcde_display_send_one_frame().
This fixes crashes on the Samsung GT-S7710 (Skomer).
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Stephan Gerhold <stephan@gerhold.net>
Cc: Stephan Gerhold <stephan@gerhold.net>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200718233323.3407670-1-linus.walleij@linaro.org
Pull parisc fixes from Helge Deller:
"Two fixes:
- Add the cmpxchg() function for pointers to u8 values. This fixes a
kernel linking error when building the tusb1210 driver (from Liam
Beguin).
- Add a define for atomic64_set_release() to fix CPU soft lockups
which happen because of missing unlocks while processing bit
operations (from John David Anglin)"
* 'parisc-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Add atomic64_set_release() define to avoid CPU soft lockups
parisc: add support for cmpxchg on u8 pointers
On boe_nv133fhm_n62 (and presumably on boe_nv133fhm_n61) a scope shows
a small spike on the HPD line right when you power the panel on. The
picture looks something like this:
+--------------------------------------
|
|
|
Power ---+
+---
|
++ |
+----+| |
HPD -----+ +---------------------------+
So right when power is applied there's a little bump in HPD and then
there's small spike right before it goes low. The total time of the
little bump plus the spike was measured on one panel as being 8 ms
long. The total time for the HPD to go high on the same panel was
51.2 ms, though the datasheet only promises it is < 200 ms.
When asked about this glitch, BOE indicated that it was expected and
persisted until the TCON has been initialized.
If this was a real hotpluggable DP panel then this wouldn't matter a
whole lot. We'd debounce the HPD signal for a really long time and so
the little blip wouldn't hurt. However, this is not a hotpluggable DP
panel and the the debouncing logic isn't needed and just shows down
the time needed to get the display working. This is why the code in
panel_simple_prepare() doesn't do debouncing and just waits for HPD to
go high once. Unfortunately if we get unlucky and happen to poll the
HPD line right at the spike we can try talking to the panel before
it's ready.
Let's handle this situation by putting in a 15 ms prepare delay and
decreasing the "hpd absent delay" by 15 ms. That means:
* If you don't have HPD hooked up at all you've still got the
hardcoded 200 ms delay.
* If you've got HPD hooked up you will always wait at least 15 ms
before checking HPD. The only case where this could be bad is if
the panel is sharing a voltage rail with something else in the
system and was already turned on long before the panel came up. In
such a case we'll be delaying 15 ms for no reason, but it's not a
huge delay and I don't see any other good solution to handle that
case.
Even though the delay was measured as 8 ms, 15 ms was chosen to give a
bit of margin.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716132120.1.I01e738cd469b61fc9b28b3ef1c6541a4f48b11bf@changeid
Pull char/misc driver fixes from Greg KH:
"Here are a few small driver fixes for 5.8-rc7
They include:
- habanalabs fixes
- tiny fpga driver fixes
- /dev/mem fixup from previous changes
- interconnect driver fixes
- binder fix
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
interconnect: msm8916: Fix buswidth of pcnoc_s nodes
interconnect: Do not skip aggregation for disabled paths
/dev/mem: Add missing memory barriers for devmem_inode
binder: Don't use mmput() from shrinker function.
habanalabs: prevent possible out-of-bounds array access
fpga: dfl: fix bug in port reset handshake
fpga: dfl: pci: reduce the scope of variable 'ret'
habanalabs: set 4s timeout for message to device CPU
habanalabs: set clock gating per engine
habanalabs: block WREG_BULK packet on PDMA
Pull driver core fix from Greg KH:
"A single driver core fix for 5.8-rc7. It resolves a problem found in
the previous fix for this code made in 5.8-rc6. Hopefully this is all
now cleared up, as this seems to be the last of the reported issues in
this area, and was tested on the problem hardware.
This patch has been in linux-next with no reported problems"
* tag 'driver-core-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
device property: Avoid NULL pointer dereference in device_get_next_child_node()
Pull staging driver fixes from Greg KH:
"Five small staging driver fixes for 5.8-rc7 to resolve some reported
problems:
- four comedi driver fixes for problems found with them
- a syzbot-found fix for the wlang-ng driver that resolves a much
reported problem.
All of these have been in linux-next with no reported issues"
* tag 'staging-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: wlan-ng: properly check endpoint types
staging: comedi: addi_apci_1564: check INSN_CONFIG_DIGITAL_TRIG shift
staging: comedi: addi_apci_1500: check INSN_CONFIG_DIGITAL_TRIG shift
staging: comedi: addi_apci_1032: check INSN_CONFIG_DIGITAL_TRIG shift
staging: comedi: ni_6527: fix INSN_CONFIG_DIGITAL_TRIG support
Pull tty/serial/fbcon fixes from Greg KH:
"Here are some small tty and serial and fbcon fixes for 5.8-rc7 to
resolve some reported issues.
The fbcon fix is in here as it was simpler to take it this way (and it
was acked by the maintainer) as it was related to the vt console fix
as well, both of which resolve syzbot-found issues in the console
handling code.
The other serial driver fixes are for small issues reported in the -rc
releases.
All of these have been in linux-next with no reported issues"
* tag 'tty-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: exar: Fix GPIO configuration for Sealevel cards based on XR17V35X
fbdev: Detect integer underflow at "struct fbcon_ops"->clear_margins.
serial: 8250_mtk: Fix high-speed baud rates clamping
serial: 8250: fix null-ptr-deref in serial8250_start_tx()
serial: tegra: drop bogus NULL tty-port checks
serial: tegra: fix CREAD handling for PIO
tty: xilinx_uartps: Really fix id assignment
vt: Reject zero-sized screen buffer size.
Pull USB fixes from Greg KH:
"Three small USB XHCI driver fixes for 5.8-rc7.
They all resolve some minor issues that have been reported on some
different platforms.
All of these have been in linux-next with no reported issues"
* tag 'usb-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: tegra: Fix allocation for the FPCI context
usb: xhci: Fix ASM2142/ASM3142 DMA addressing
usb: xhci-mtk: fix the failure of bandwidth allocation
Pull SCSI fix from James Bottomley:
"Small core patch to fix a corner case bug: we forgot to run the queues
to handle starvation in the error exit from the scsi_queue_rq routine,
which can lead to hangs on error conditions"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: core: Run queue in case of I/O resource contention failure
After commit 6e02318eae ("nvme: add support for the Write Zeroes
command"), SK hynix PC400 becomes very slow with the following error
message:
[ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]
SK Hynix PC400 has a buggy firmware that treats NLB as max value instead
of a range, so the NLB passed isn't a valid value to the firmware.
According to SK hynix there are three commands are affected:
- Write Zeroes
- Compare
- Write Uncorrectable
Right now only Write Zeroes is implemented, so disable it completely on
SK hynix PC400.
BugLink: https://bugs.launchpad.net/bugs/1872383
Cc: kyounghwan sohn <kyounghwan.sohn@sk.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
If the controller died exactly when we are receiving icresp
we hang because icresp may never return. Make sure to set a
high finite limit.
Fixes: 3f2304f8c6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Pull RISC-V fixes from Palmer Dabbelt:
"A few more fixes this week:
- A fix to avoid using SBI calls during kasan initialization, as the
SBI calls themselves have not been probed yet.
- Three fixes related to systems with multiple memory regions"
* tag 'riscv-for-linus-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Parse all memory blocks to remove unusable memory
RISC-V: Do not rely on initrd_start/end computed during early dt parsing
RISC-V: Set maximum number of mapped pages correctly
riscv: kasan: use local_tlb_flush_all() to avoid uninitialized __sbi_rfence
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- Fix a section end page alignment assumption that was causing
crashes
- Fix ORC unwinding on freshly forked tasks which haven't executed
yet and which have empty user task stacks
- Fix the debug.exception-trace=1 sysctl dumping of user stacks,
which was broken by recent maccess changes"
* tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/dumpstack: Dump user space code correctly again
x86/stacktrace: Fix reliable check for empty user task stacks
x86/unwind/orc: Fix ORC for newly forked tasks
x86, vmlinux.lds: Page-align end of ..page_aligned sections
Pull uprobe fix from Ingo Molnar:
"Fix an interaction/regression between uprobes based shared library
tracing & GDB"
* tag 'perf-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB regression
Pull timer fix from Ingo Molnar:
"Fix a suspend/resume regression (crash) on TI AM3/AM4 SoC's"
* tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4
Pull scheduler fixes from Ingo Molnar:
"Fix a race introduced by the recent loadavg race fix, plus add a debug
check for a hard to debug case of bogus wakeup function flags"
* tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Warn if garbage is passed to default_wake_function()
sched: Fix race against ptrace_freeze_trace()
Pull EFI fixes from Ingo Molnar:
"Various EFI fixes:
- Fix the layering violation in the use of the EFI runtime services
availability mask in users of the 'efivars' abstraction
- Revert build fix for GCC v4.8 which is no longer supported
- Clean up some x86 EFI stub details, some of which are borderline
bugs that copy around garbage into padding fields - let's fix these
out of caution.
- Fix build issues while working on RISC-V support
- Avoid --whole-archive when linking the stub on arm64"
* tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: Revert "efi/x86: Fix build with gcc 4"
efi/efivars: Expose RT service availability via efivars abstraction
efi/libstub: Move the function prototypes to header file
efi/libstub: Fix gcc error around __umoddi3 for 32 bit builds
efi/libstub/arm64: link stub lib.a conditionally
efi/x86: Only copy upto the end of setup_header
efi/x86: Remove unused variables
Pull cifs fix from Steve French:
"A fix for a recently discovered regression in rename to older servers
caused by a recent patch"
* tag '5.8-rc6-cifs-fix' of git://git.samba.org/sfrench/cifs-2.6:
Revert "cifs: Fix the target file was deleted when rename failed."
Pull networking fixes from David Miller:
1) Fix RCU locaking in iwlwifi, from Johannes Berg.
2) mt76 can access uninitialized NAPI struct, from Felix Fietkau.
3) Fix race in updating pause settings in bnxt_en, from Vasundhara
Volam.
4) Propagate error return properly during unbind failures in ax88172a,
from George Kennedy.
5) Fix memleak in adf7242_probe, from Liu Jian.
6) smc_drv_probe() can leak, from Wang Hai.
7) Don't muck with the carrier state if register_netdevice() fails in
the bonding driver, from Taehee Yoo.
8) Fix memleak in dpaa_eth_probe, from Liu Jian.
9) Need to check skb_put_padto() return value in hsr_fill_tag(), from
Murali Karicheri.
10) Don't lose ionic RSS hash settings across FW update, from Shannon
Nelson.
11) Fix clobbered SKB control block in act_ct, from Wen Xu.
12) Missing newlink in "tx_timeout" sysfs output, from Xiongfeng Wang.
13) IS_UDPLITE cleanup a long time ago, incorrectly handled
transformations involving UDPLITE_RECV_CC. From Miaohe Lin.
14) Unbalanced locking in netdevsim, from Taehee Yoo.
15) Suppress false-positive error messages in qed driver, from Alexander
Lobakin.
16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye.
17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost.
18) Uninitialized value in geneve_changelink(), from Cong Wang.
19) Fix deadlock in xen-netfront, from Andera Righi.
19) flush_backlog() frees skbs with IRQs disabled, so should use
dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov
Kasiviswanathan.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
drivers/net/wan: lapb: Corrected the usage of skb_cow
dev: Defer free of skbs in flush_backlog
qrtr: orphan socket in qrtr_release()
xen-netfront: fix potential deadlock in xennet_remove()
flow_offload: Move rhashtable inclusion to the source file
geneve: fix an uninitialized value in geneve_changelink()
bonding: check return value of register_netdevice() in bond_newlink()
tcp: allow at most one TLP probe per flight
AX.25: Prevent integer overflows in connect and sendmsg
cxgb4: add missing release on skb in uld_send()
net: atlantic: fix PTP on AQC10X
AX.25: Prevent out-of-bounds read in ax25_sendmsg()
sctp: shrink stream outq when fails to do addstream reconf
sctp: shrink stream outq only when new outcnt < old outcnt
AX.25: Fix out-of-bounds read in ax25_connect()
enetc: Remove the mdio bus on PF probe bailout
net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
net: ethernet: ave: Fix error returns in ave_init
drivers/net/wan/x25_asy: Fix to make it work
ipvs: fix the connection sync failed in some cases
...
Since commit bcf3440c6d ("net: phy: micrel: add phy-mode support for the
KSZ9031 PHY") the networking is broken on keystone-k2g-evm board.
The above board have phy-mode = "rgmii-id" and it is worked before because
KSZ9031 PHY started with default RGMII internal delays configuration (TX
off, RX on 1.2 ns) and MAC provided TX delay by default.
After above commit, the KSZ9031 PHY starts handling phy mode properly and
enables both RX and TX delays, as result networking is become broken.
Fix it by switching to phy-mode = "rgmii-rxid" to reflect previous
behavior.
Fixes: bcf3440c6d ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Philippe Schenker <philippe.schenker@toradex.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Currently, maximum physical memory allowed is equal to -PAGE_OFFSET.
That's why we remove any memory blocks spanning beyond that size. However,
it is done only for memblock containing linux kernel which will not work
if there are multiple memblocks.
Process all memory blocks to figure out how much memory needs to be removed
and remove at the end instead of updating the memblock list in place.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This patch fixed 2 issues with the usage of skb_cow in LAPB drivers
"lapbether" and "hdlc_x25":
1) After skb_cow fails, kfree_skb should be called to drop a reference
to the skb. But in both drivers, kfree_skb is not called.
2) skb_cow should be called before skb_push so that is can ensure the
safety of skb_push. But in "lapbether", it is incorrectly called after
skb_push.
More details about these 2 issues:
1) The behavior of calling kfree_skb on failure is also the behavior of
netif_rx, which is called by this function with "return netif_rx(skb);".
So this function should follow this behavior, too.
2) In "lapbether", skb_cow is called after skb_push. This results in 2
logical issues:
a) skb_push is not protected by skb_cow;
b) An extra headroom of 1 byte is ensured after skb_push. This extra
headroom has no use in this function. It also has no use in the
upper-layer function that this function passes the skb to
(x25_lapb_receive_frame in net/x25/x25_dev.c).
So logically skb_cow should instead be called before skb_push.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IRQs are disabled when freeing skbs in input queue.
Use the IRQ safe variant to free skbs here.
Fixes: 145dd5f9c8 ("net: flush the softnet backlog in process context")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, maximum number of mapper pages are set to the pfn calculated
from the memblock size of the memblock containing kernel. This will work
until that memblock spans the entire memory. However, it will be set to
a wrong value if there are multiple memblocks defined in kernel
(e.g. with efi runtime services).
Set the the maximum value to the pfn calculated from dram size.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Pull PCI fixes from Bjorn Helgaas:
- Reject invalid IRQ 0 command line argument for virtio_mmio because
IRQ 0 now generates warnings (Bjorn Helgaas)
- Revert "PCI/PM: Assume ports without DLL Link Active train links in
100 ms", which broke nouveau (Bjorn Helgaas)
* tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
virtio-mmio: Reject invalid IRQ 0 command line argument
Kalle Valo says:
====================
wireless-drivers fixes for v5.8
Second set of fixes for v5.8, and hopefully also the last. Three
important regressions fixed.
ath9k
* fix a regression which broke support for all ath9k usb devices
ath10k
* fix a regression which broke support for all QCA4019 AHB devices
iwlwifi
* fix a regression which broke support for some Killer Wireless-AC 1550 cards
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a potential race in xennet_remove(); this is what the driver is
doing upon unregistering a network device:
1. state = read bus state
2. if state is not "Closed":
3. request to set state to "Closing"
4. wait for state to be set to "Closing"
5. request to set state to "Closed"
6. wait for state to be set to "Closed"
If the state changes to "Closed" immediately after step 1 we are stuck
forever in step 4, because the state will never go back from "Closed" to
"Closing".
Make sure to check also for state == "Closed" in step 4 to prevent the
deadlock.
Also add a 5 sec timeout any time we wait for the bus state to change,
to avoid getting stuck forever in wait_event().
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull nfsd fix from Bruce Fields:
"Just one fix for a NULL dereference if someone happens to read
/proc/fs/nfsd/client/../state at the wrong moment"
* tag 'nfsd-5.8-2' of git://linux-nfs.org/~bfields/linux:
nfsd4: fix NULL dereference in nfsd/clients display code
I noticed that touching linux/rhashtable.h causes lib/vsprintf.c to
be rebuilt. This dependency came through a bogus inclusion in the
file net/flow_offload.h. This patch moves it to the right place.
This patch also removes a lingering rhashtable inclusion in cls_api
created by the same commit.
Fixes: 4e481908c5 ("flow_offload: move tc indirect block to...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge misc fixes from Andrew Morton:
"Subsystems affected by this patch series: mm/pagemap, mm/shmem,
mm/hotfixes, mm/memcg, mm/hugetlb, mailmap, squashfs, scripts,
io-mapping, MAINTAINERS, and gdb"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/gdb: fix lx-symbols 'gdb.error' while loading modules
MAINTAINERS: add KCOV section
io-mapping: indicate mapping failure
scripts/decode_stacktrace: strip basepath from all paths
squashfs: fix length field overlap check in metadata reading
mailmap: add entry for Mike Rapoport
khugepaged: fix null-pointer dereference due to race
mm/hugetlb: avoid hardcoding while checking if cma is enabled
mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
mm/memcg: fix refcount error while moving and swapping
mm/memcontrol: fix OOPS inside mem_cgroup_get_nr_swap_pages()
mm: initialize return of vm_insert_pages
vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
Pull xtensa csum regression fix from Al Viro:
"Max Filippov caught a breakage introduced in xtensa this cycle
by the csum_and_copy_..._user() series.
Cut'n'paste from the wrong source - the check that belongs
in csum_and_copy_to_user() ended up both there and in
csum_and_copy_from_user()"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
xtensa: fix access check in csum_and_copy_from_user
Pull arm64 fix from Will Deacon:
"Fix compat vDSO build flags for recent versions of clang to tell it
where to find the assembler"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: vdso32: Fix '--prefix=' value for newer versions of clang
Pull btrfs fixes from David Sterba:
"A few resouce leak fixes from recent patches, all are stable material.
The problems have been observed during testing or have a reproducer"
* tag 'for-5.8-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix mount failure caused by race with umount
btrfs: fix page leaks after failure to lock page for delalloc
btrfs: qgroup: fix data leak caused by race between writeback and truncate
btrfs: fix double free on ulist after backref resolution failure
Pull zonefs fixes from Damien Le Moal:
"Two fixes, the first one to remove compilation warnings and the second
to avoid potentially inefficient allocation of BIOs for direct writes
into sequential zones"
* tag 'zonefs-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: count pages after truncating the iterator
zonefs: Fix compilation warning
Pull io_uring fixes from Jens Axboe:
- Fix discrepancy in how sqe->flags are treated for a few requests,
this makes it consistent (Daniele)
- Ensure that poll driven retry works with double waitqueue poll users
- Fix a missing io_req_init_async() (Pavel)
* tag 'io_uring-5.8-2020-07-24' of git://git.kernel.dk/linux-block:
io_uring: missed req_init_async() for IOSQE_ASYNC
io_uring: always allow drain/link/hardlink/async sqe flags
io_uring: ensure double poll additions work with both request types
Pull iommu fix from Joerg Roedel:
"Fix a NULL-ptr dereference in the QCOM IOMMU driver"
* tag 'iommu-fix-v5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/qcom: Use domain rather than dev as tlb cookie
Pull rdma fixes from Jason Gunthorpe:
"One merge window regression, some corruption bugs in HNS and a few
more syzkaller fixes:
- Two long standing syzkaller races
- Fix incorrect HW configuration in HNS
- Restore accidentally dropped locking in IB CM
- Fix ODP prefetch bug added in the big rework several versions ago"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx5: Prevent prefetch from racing with implicit destruction
RDMA/cm: Protect access to remote_sidr_table
RDMA/core: Fix race in rdma_alloc_commit_uobject()
RDMA/hns: Fix wrong PBL offset when VA is not aligned to PAGE_SIZE
RDMA/hns: Fix wrong assignment of lp_pktn_ini in QPC
RDMA/mlx5: Use xa_lock_irq when access to SRQ table
Pull device mapper fix from Mike Snitzer:
"A stable fix for DM integrity target's integrity recalculation that
gets skipped when resuming a device. This is a fix for a previous
stable@ fix"
* tag 'for-5.8/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm integrity: fix integrity recalculation that is improperly skipped
Pull i2c fixes from Wolfram Sang:
"Again some driver bugfixes and some documentation fixes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i2c-qcom-geni: Fix DMA transfer race
i2c: rcar: always clear ICSAR to avoid side effects
MAINTAINERS: i2c: at91: handover maintenance to Codrin Ciubotariu
i2c: drop duplicated word in the header file
i2c: cadence: Clear HOLD bit at correct time in Rx path
Revert "i2c: cadence: Fix the hold bit setting"
Pull drm fixes from Dave Airlie:
"Quiet fixes, I may have a single regression fix follow up to this for
nouveau, but it might be next week, Ben was testing it a bit more .
Otherwise two amdgpu fixes, one lima and one sun4i:
amdgpu:
- Fix crash when overclocking VegaM
- Fix possible crash when editing dpm levels
sun4i:
- Fix inverted HPD result; fixes an earlier fix
lima:
- fix timeout during reset"
* tag 'drm-fixes-2020-07-24' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: Fix NULL dereference in dpm sysfs handlers
drm/amd/powerplay: fix a crash when overclocking Vega M
drm/lima: fix wait pp reset timeout
drm: sun4i: hdmi: Fix inverted HPD result
Commit ed66f991bb ("module: Refactor section attr into bin attribute")
removed the 'name' field from 'struct module_sect_attr' triggering the
following error when invoking lx-symbols:
(gdb) lx-symbols
loading vmlinux
scanning for modules in linux/build
loading @0xffffffffc014f000: linux/build/drivers/net/tun.ko
Python Exception <class 'gdb.error'> There is no member named name.:
Error occurred in Python: There is no member named name.
This patch fixes the issue taking the module name from the 'struct
attribute'.
Fixes: ed66f991bb ("module: Refactor section attr into bin attribute")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Kieran Bingham <kbingham@kernel.org>
Link: http://lkml.kernel.org/r/20200722102239.313231-1-sgarzare@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The !ATOMIC_IOMAP version of io_maping_init_wc will always return
success, even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error this is unexpected.
During a device probe, where the ioremap failed, a crash can look like
this:
BUG: unable to handle page fault for address: 0000000000210000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 177 Comm:
RIP: 0010:fill_page_dma [i915]
gen8_ppgtt_create [i915]
i915_ppgtt_create [i915]
intel_gt_init [i915]
i915_gem_init [i915]
i915_driver_probe [i915]
pci_device_probe
really_probe
driver_probe_device
The remap failure occurred much earlier in the probe. If it had been
propagated, the driver would have exited with an error.
Return NULL on ioremap failure.
[akpm@linux-foundation.org: detect ioremap_wc() errors earlier]
Fixes: cafaf14a5d ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
khugepaged has to drop mmap lock several times while collapsing a page.
The situation can change while the lock is dropped and we need to
re-validate that the VMA is still in place and the PMD is still subject
for collapse.
But we miss one corner case: while collapsing an anonymous pages the VMA
could be replaced with file VMA. If the file VMA doesn't have any
private pages we get NULL pointer dereference:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
anon_vma_lock_write include/linux/rmap.h:120 [inline]
collapse_huge_page mm/khugepaged.c:1110 [inline]
khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
khugepaged_do_scan mm/khugepaged.c:2193 [inline]
khugepaged+0x3bba/0x5a10 mm/khugepaged.c:2238
The fix is to make sure that the VMA is anonymous in
hugepage_vma_revalidate(). The helper is only used for collapsing
anonymous pages.
Fixes: 99cb0dbd47 ("mm,thp: add read-only THP support for (non-shmem) FS")
Reported-by: syzbot+ed318e8b790ca72c5ad0@syzkaller.appspotmail.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200722121439.44328-1-kirill.shutemov@linux.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the kmem_cache refcount is greater than one, we should not mark the
root kmem_cache as dying. If we mark the root kmem_cache dying
incorrectly, the non-root kmem_cache can never be destroyed. It
resulted in memory leak when memcg was destroyed. We can use the
following steps to reproduce.
1) Use kmem_cache_create() to create a new kmem_cache named A.
2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
so the refcount of B is just increased.
3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
decrease the B's refcount but mark the B as dying.
4) Create a new memory cgroup and alloc memory from the kmem_cache
B. It leads to create a non-root kmem_cache for allocating memory.
5) When destroy the memory cgroup created in the step 4), the
non-root kmem_cache can never be destroyed.
If we repeat steps 4) and 5), this will cause a lot of memory leak. So
only when refcount reach zero, we mark the root kmem_cache as dying.
Fixes: 92ee383f6d ("mm: fix race between kmem_cache destroy, create and deactivate")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200716165103.83462-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.
This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids.
Concurrent activity can free up swap quicker than the task is scanned,
bringing id refcount down 0 (which should only be possible when
offlining).
Just skip that optimization: do that part of the accounting immediately.
Fixes: 615d66c37c ("mm: memcontrol: fix memcg id ref counter on swap charge move")
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2007071431050.4726@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Prabhakar reported an OOPS inside mem_cgroup_get_nr_swap_pages()
function in a corner case seen on some arm64 boards when kdump kernel
runs with "cgroup_disable=memory" passed to the kdump kernel via
bootargs.
The root-cause behind the same is that currently mem_cgroup_swap_init()
function is implemented as a subsys_initcall() call instead of a
core_initcall(), this means 'cgroup_memory_noswap' still remains set to
the default value (false) even when memcg is disabled via
"cgroup_disable=memory" boot parameter.
This may result in premature OOPS inside mem_cgroup_get_nr_swap_pages()
function in corner cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000188
Mem abort info:
ESR = 0x96000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
[0000000000000188] user address but active_mm is swapper
Internal error: Oops: 96000006 [#1] SMP
Modules linked in:
<..snip..>
Call trace:
mem_cgroup_get_nr_swap_pages+0x9c/0xf4
shrink_lruvec+0x404/0x4f8
shrink_node+0x1a8/0x688
do_try_to_free_pages+0xe8/0x448
try_to_free_pages+0x110/0x230
__alloc_pages_slowpath.constprop.106+0x2b8/0xb48
__alloc_pages_nodemask+0x2ac/0x2f8
alloc_page_interleave+0x20/0x90
alloc_pages_current+0xdc/0xf8
atomic_pool_expand+0x60/0x210
__dma_atomic_pool_init+0x50/0xa4
dma_atomic_pool_init+0xac/0x158
do_one_initcall+0x50/0x218
kernel_init_freeable+0x22c/0x2d0
kernel_init+0x18/0x110
ret_from_fork+0x10/0x18
Code: aa1403e3 91106000 97f82a27 14000011 (f940c663)
---[ end trace 9795948475817de4 ]---
Kernel panic - not syncing: Fatal exception
Rebooting in 10 seconds..
Fixes: eccb52e788 ("mm: memcontrol: prepare swap controller setup for integration")
Reported-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: http://lkml.kernel.org/r/1593641660-13254-2-git-send-email-bhsharma@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
VMA with VM_GROWSDOWN or VM_GROWSUP flag set can change their size under
mmap_read_lock(). It can lead to race with __do_munmap():
Thread A Thread B
__do_munmap()
detach_vmas_to_be_unmapped()
mmap_write_downgrade()
expand_downwards()
vma->vm_start = address;
// The VMA now overlaps with
// VMAs detached by the Thread A
// page fault populates expanded part
// of the VMA
unmap_region()
// Zaps pagetables partly
// populated by Thread B
Similar race exists for expand_upwards().
The fix is to avoid downgrading mmap_lock in __do_munmap() if detached
VMAs are next to VM_GROWSDOWN or VM_GROWSUP VMA.
[akpm@linux-foundation.org: s/mmap_sem/mmap_lock/ in comment]
Fixes: dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org> [4.20+]
Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
req->work might be already initialised by the time it gets into
__io_arm_poll_handler(), which will corrupt it by using fields that are
in an union with req->work. Luckily, the only side effect is missing
put_creds(). Clean req->work before going there.
Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
mvebu fixes for 5.8 (part 1)
- DT change for Armada 38x allowing to add the register needed to fix
NETA lockup when repeatedly switching speed.
* tag 'mvebu-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu:
ARM: dts: armada-38x: fix NETA lockup when repeatedly switching speeds
If a tracee is uprobed and it hits int3 inserted by debugger, handle_swbp()
does send_sig(SIGTRAP, current, 0) which means si_code == SI_USER. This used
to work when this code was written, but then GDB started to validate si_code
and now it simply can't use breakpoints if the tracee has an active uprobe:
# cat test.c
void unused_func(void)
{
}
int main(void)
{
return 0;
}
# gcc -g test.c -o test
# perf probe -x ./test -a unused_func
# perf record -e probe_test:unused_func gdb ./test -ex run
GNU gdb (GDB) 10.0.50.20200714-git
...
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffff7ddf909 in dl_main () from /lib64/ld-linux-x86-64.so.2
(gdb)
The tracee hits the internal breakpoint inserted by GDB to monitor shared
library events but GDB misinterprets this SIGTRAP and reports a signal.
Change handle_swbp() to use force_sig(SIGTRAP), this matches do_int3_user()
and fixes the problem.
This is the minimal fix for -stable, arch/x86/kernel/uprobes.c is equally
wrong; it should use send_sigtrap(TRAP_TRACE) instead of send_sig(SIGTRAP),
but this doesn't confuse GDB and needs another x86-specific patch.
Reported-by: Aaron Merey <amerey@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200723154420.GA32043@redhat.com
Since the default_wake_function() passes its flags onto
try_to_wake_up(), warn if those flags collide with internal values.
Given that the supplied flags are garbage, no repair can be done but at
least alert the user to the damage they are causing.
In the belief that these errors should be picked up during testing, the
warning is only compiled in under CONFIG_SCHED_DEBUG.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20200723201042.18861-1-chris@chris-wilson.co.uk
Sealevel XR17V35X based devices are inoperable on kernel versions
4.11 and above due to a change in the GPIO preconfiguration introduced in
commit
7dea8165f1. This patch fixes this by preconfiguring the GPIO on Sealevel
cards to the value (0x00) used prior to commit 7dea8165f1
With GPIOs preconfigured as per commit 7dea8165f1 all ports on
Sealevel XR17V35X based devices become stuck in high impedance
mode, regardless of dip-switch or software configuration. This
causes the device to become effectively unusable. This patch (in
various forms) has been distributed to our customers and no issues
related to it have been reported.
Fixes: 7dea8165f1 ("serial: exar: Preconfigure xr17v35x MPIOs as output")
Signed-off-by: Matthew Howell <matthew.howell@sealevel.com>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2007221605270.13247@tstest-VirtualBox
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The disp015x classes are used by both gt21x and gf1xx (aside from gf119), but page
kinds differ between Tesla and Fermi.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
To support the change in "phy: armada-38x: fix NETA lockup when
repeatedly switching speeds" we need to update the DT with the
additional register.
Fixes: 14dc100b44 ("phy: armada38x: add common phy support")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for net:
1) Fix NAT hook deletion when table is dormant, from Florian Westphal.
2) Fix IPVS sync stalls, from guodeqing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 9ffad9263b.
Upon additional testing with older servers, it was found that
the original commit introduced a regression when using the old SMB1
dialect and rsyncing over an existing file.
The patch will need to be respun to address this, likely including
a larger refactoring of the SMB1 and SMB3 rename code paths to make
it less confusing and also to address some additional rename error
cases that SMB3 may be able to workaround.
Signed-off-by: Steve French <stfrench@microsoft.com>
Reported-by: Patrick Fernie <patrick.fernie@gmail.com>
CC: Stable <stable@vger.kernel.org>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Acked-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Pull s390 fixes from Heiko Carstens:
- Change cpum_cf/perf counter name from DFLT_CCERROR to DFLT_CCFINISH
to reflect reality and avoid further confusion. This is a user space
visible change therefore the commit has also a stable tag for 5.7,
where this counter was introduced.
- Add Matthew Rosato as s390 IOMMU maintainer.
* tag 's390-5.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
MAINTAINERS: add Matthew for s390 IOMMU
s390/cpum_cf,perf: change DFLT_CCERROR counter name
When I have KASAN enabled on my kernel and I start stressing the
touchscreen my system tends to hang. The touchscreen is one of the
only things that does a lot of big i2c transfers and ends up hitting
the DMA paths in the geni i2c driver. It appears that KASAN adds
enough delay in my system to tickle a race condition in the DMA setup
code.
When the system hangs, I found that it was running the geni_i2c_irq()
over and over again. It had these:
m_stat = 0x04000080
rx_st = 0x30000011
dm_tx_st = 0x00000000
dm_rx_st = 0x00000000
dma = 0x00000001
Notably we're in DMA mode but are getting M_RX_IRQ_EN and
M_RX_FIFO_WATERMARK_EN over and over again.
Putting some traces in geni_i2c_rx_one_msg() showed that when we
failed we were getting to the start of geni_i2c_rx_one_msg() but were
never executing geni_se_rx_dma_prep().
I believe that the problem here is that we are starting the geni
command before we run geni_se_rx_dma_prep(). If a transfer makes it
far enough before we do that then we get into the state I have
observed. Let's change the order, which seems to work fine.
Although problems were seen on the RX path, code inspection suggests
that the TX should be changed too. Change it as well.
Fixes: 37692de5d5 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: Akash Asthana <akashast@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Mukesh Kumar Savaliya <msavaliy@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
On R-Car Gen2, we get a timeout when reading from the address set in
ICSAR, even though the slave interface is disabled. Clearing it fixes
this situation. Note that Gen3 is not affected.
To reproduce: bind and undbind an I2C slave on some bus, run
'i2cdetect' on that bus.
Fixes: de20d1857d ("i2c: rcar: add slave support")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Previously TLP may send multiple probes of new data in one
flight. This happens when the sender is cwnd limited. After the
initial TLP containing new data is sent, the sender receives another
ACK that acks partial inflight. It may re-arm another TLP timer
to send more, if no further ACK returns before the next TLP timeout
(PTO) expires. The sender may send in theory a large amount of TLP
until send queue is depleted. This only happens if the sender sees
such irregular uncommon ACK pattern. But it is generally undesirable
behavior during congestion especially.
The original TLP design restrict only one TLP probe per inflight as
published in "Reducing Web Latency: the Virtue of Gentle Aggression",
SIGCOMM 2013. This patch changes TLP to send at most one probe
per inflight.
Note that if the sender is app-limited, TLP retransmits old data
and did not have this issue.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We recently added some bounds checking in ax25_connect() and
ax25_sendmsg() and we so we removed the AX25_MAX_DIGIS checks because
they were no longer required.
Unfortunately, I believe they are required to prevent integer overflows
so I have added them back.
Fixes: 8885bb0621 ("AX.25: Prevent out-of-bounds read in ax25_sendmsg()")
Fixes: 2f2a7ffad5 ("AX.25: Fix out-of-bounds read in ax25_connect()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit adc0daad36 ("dm: report suspended
device during destroy") broke integrity recalculation.
The problem is dm_suspended() returns true not only during suspend,
but also during resume. So this race condition could occur:
1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
2. integrity_recalc (&ic->recalc_work) preempts the current thread
3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
4. integrity_recalc exits and no recalculating is done.
To fix this race condition, add a function dm_post_suspending that is
only true during the postsuspend phase and use it instead of
dm_suspended().
Signed-off-by: Mikulas Patocka <mpatocka redhat com>
Fixes: adc0daad36 ("dm: report suspended device during destroy")
Cc: stable vger kernel org # v4.18+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
IOSQE_ASYNC branch of io_queue_sqe() is another place where an
unitialised req->work can be accessed (i.e. prior io_req_init_async()).
Nothing really bad though, it just looses IO_WQ_WORK_CONCURRENT flag.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
syzbot is reporting general protection fault in bitfill_aligned() [1]
caused by integer underflow in bit_clear_margins(). The cause of this
problem is when and how do_vc_resize() updates vc->vc_{cols,rows}.
If vc_do_resize() fails (e.g. kzalloc() fails) when var.xres or var.yres
is going to shrink, vc->vc_{cols,rows} will not be updated. This allows
bit_clear_margins() to see info->var.xres < (vc->vc_cols * cw) or
info->var.yres < (vc->vc_rows * ch). Unexpectedly large rw or bh will
try to overrun the __iomem region and causes general protection fault.
Also, vc_resize(vc, 0, 0) does not set vc->vc_{cols,rows} = 0 due to
new_cols = (cols ? cols : vc->vc_cols);
new_rows = (lines ? lines : vc->vc_rows);
exception. Since cols and lines are calculated as
cols = FBCON_SWAP(ops->rotate, info->var.xres, info->var.yres);
rows = FBCON_SWAP(ops->rotate, info->var.yres, info->var.xres);
cols /= vc->vc_font.width;
rows /= vc->vc_font.height;
vc_resize(vc, cols, rows);
in fbcon_modechanged(), var.xres < vc->vc_font.width makes cols = 0
and var.yres < vc->vc_font.height makes rows = 0. This means that
const int fd = open("/dev/fb0", O_ACCMODE);
struct fb_var_screeninfo var = { };
ioctl(fd, FBIOGET_VSCREENINFO, &var);
var.xres = var.yres = 1;
ioctl(fd, FBIOPUT_VSCREENINFO, &var);
easily reproduces integer underflow bug explained above.
Of course, callers of vc_resize() are not handling vc_do_resize() failure
is bad. But we can't avoid vc_resize(vc, 0, 0) which returns 0. Therefore,
as a band-aid workaround, this patch checks integer underflow in
"struct fbcon_ops"->clear_margins call, assuming that
vc->vc_cols * vc->vc_font.width and vc->vc_rows * vc->vc_font.heigh do not
cause integer overflow.
[1] https://syzkaller.appspot.com/bug?id=a565882df74fa76f10d3a6fec4be31098dbb37c6
Reported-and-tested-by: syzbot <syzbot+e5fd3e65515b48c02a30@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200715015102.3814-1-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 5c4e8d3781 ("usb: host: xhci-tegra: Add support for XUSB
context save/restore") is using the IPFS 'num_offsets' value when
allocating memory for FPCI context instead of the FPCI 'num_offsets'.
After commit cad064f1bd ("devres: handle zero size in devm_kmalloc()")
was added system suspend started failing on Tegra186. The kernel log
showed that the Tegra XHCI driver was crashing on entry to suspend when
attempting the save the USB context. On Tegra186, the IPFS context has a
zero length but the FPCI content has a non-zero length, and because of
the bug in the Tegra XHCI driver we are incorrectly allocating a zero
length array for the FPCI context. The crash seen on entering suspend
when we attempt to save the FPCI context and following commit
cad064f1bd ("devres: handle zero size in devm_kmalloc()") this now
causes a NULL pointer deference when we access the memory. Fix this by
correcting the amount of memory we are allocating for FPCI contexts.
Cc: stable@vger.kernel.org
Fixes: 5c4e8d3781 ("usb: host: xhci-tegra: Add support for XUSB context save/restore")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200715113842.30680-1-jonathanh@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Newer versions of clang only look for $(COMPAT_GCC_TOOLCHAIN_DIR)as [1],
rather than $(COMPAT_GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE_COMPAT)as,
resulting in the following build error:
$ make -skj"$(nproc)" ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \
CROSS_COMPILE_COMPAT=arm-linux-gnueabi- LLVM=1 O=out/aarch64 distclean \
defconfig arch/arm64/kernel/vdso32/
...
/home/nathan/cbl/toolchains/llvm-binutils/bin/as: unrecognized option '-EL'
clang-12: error: assembler command failed with exit code 1 (use -v to see invocation)
make[3]: *** [arch/arm64/kernel/vdso32/Makefile:181: arch/arm64/kernel/vdso32/note.o] Error 1
...
Adding the value of CROSS_COMPILE_COMPAT (adding notdir to account for a
full path for CROSS_COMPILE_COMPAT) fixes this issue, which matches the
solution done for the main Makefile [2].
[1]: 3452a0d8c1
[2]: https://lore.kernel.org/lkml/20200721173125.1273884-1-maskray@google.com/
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
Link: https://lore.kernel.org/r/20200723041509.400450-1-natechancellor@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
WRITE_ONCE() isn't the correct way to publish a pointer to a data
structure, since it doesn't include a write memory barrier. Therefore
other tasks may see that the pointer has been set but not see that the
pointed-to memory has finished being initialized yet. Instead a
primitive with "release" semantics is needed.
Use smp_store_release() for this.
The use of READ_ONCE() on the read side is still potentially correct if
there's no control dependency, i.e. if all memory being "published" is
transitively reachable via the pointer itself. But this pairing is
somewhat confusing and error-prone. So just upgrade the read side to
smp_load_acquire() so that it clearly pairs with smp_store_release().
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: 3234ac664a ("/dev/mem: Revoke mappings when a driver claims the region")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20200716060553.24618-1-ebiggers@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the implementation of uld_send(), the skb is consumed on all
execution paths except one. Release skb when returning NET_XMIT_DROP.
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if
$(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit,
GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to
/usr/bin/ and Clang as of 11 will search for both
$(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle.
GCC searchs for $(prefix)aarch64-linux-gnu/$version/$needle,
$(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice,
$(prefix)aarch64-linux-gnu/$needle rarely contains executables.
To better model how GCC's -B/--prefix takes in effect in practice, newer
Clang (since
3452a0d8c1)
only searches for $(prefix)$needle. Currently it will find /usr/bin/as
instead of /usr/bin/aarch64-linux-gnu-as.
Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE))
(/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the
appropriate cross compiling GNU as (when -no-integrated-as is in
effect).
Cc: stable@vger.kernel.org
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Fangrui Song <maskray@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
This patch fixes PTP on AQC10X.
PTP support on AQC10X requires FW involvement and FW configures the
TPS data arb mode itself.
So we must make sure driver doesn't touch TPS data arb mode on AQC10x
if PTP is enabled. Otherwise, there are no timestamps even though
packets are flowing.
Fixes: 2deac71ac4 ("net: atlantic: QoS implementation: min_rate")
Signed-off-by: Egor Pomozov <epomozov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Checks on `addr_len` and `usax->sax25_ndigis` are insufficient.
ax25_sendmsg() can go out of bounds when `usax->sax25_ndigis` equals to 7
or 8. Fix it.
It is safe to remove `usax->sax25_ndigis > AX25_MAX_DIGIS`, since
`addr_len` is guaranteed to be less than or equal to
`sizeof(struct full_sockaddr_ax25)`
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long says:
====================
sctp: shrink stream outq in the right place
Patch 1 is an improvement, and Patch 2 is a bug fix.
====================
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When adding a stream with stream reconf, the new stream firstly is in
CLOSED state but new out chunks can still be enqueued. Then once gets
the confirmation from the peer, the state will change to OPEN.
However, if the peer denies, it needs to roll back the stream. But when
doing that, it only sets the stream outcnt back, and the chunks already
in the new stream don't get purged. It caused these chunks can still be
dequeued in sctp_outq_dequeue_data().
As its stream is still in CLOSE, the chunk will be enqueued to the head
again by sctp_outq_head_data(). This chunk will never be sent out, and
the chunks after it can never be dequeued. The assoc will be 'hung' in
a dead loop of sending this chunk.
To fix it, this patch is to purge these chunks already in the new
stream by calling sctp_stream_shrink_out() when failing to do the
addstream reconf.
Fixes: 11ae76e67a ("sctp: implement receiver-side procedures for the Reconf Response Parameter")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's not necessary to go list_for_each for outq->out_chunk_list
when new outcnt >= old outcnt, as no chunk with higher sid than
new (outcnt - 1) exists in the outqueue.
While at it, also move the list_for_each code in a new function
sctp_stream_shrink_out(), which will be used in the next patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Checks on `addr_len` and `fsa->fsa_ax25.sax25_ndigis` are insufficient.
ax25_connect() can go out of bounds when `fsa->fsa_ax25.sax25_ndigis`
equals to 7 or 8. Fix it.
This issue has been reported as a KMSAN uninit-value bug, because in such
a case, ax25_connect() reaches into the uninitialized portion of the
`struct sockaddr_storage` statically allocated in __sys_connect().
It is safe to remove `fsa->fsa_ax25.sax25_ndigis > AX25_MAX_DIGIS` because
`addr_len` is guaranteed to be less than or equal to
`sizeof(struct full_sockaddr_ax25)`.
Reported-by: syzbot+c82752228ed975b0a623@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=55ef9d629f3b3d7d70b69558015b63b48d01af66
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For ENETC ports that register an external MDIO bus,
the bus doesn't get removed on the error bailout path
of enetc_pf_probe().
This issue became much more visible after recent:
commit 07095c025a ("net: enetc: Use DT protocol information to set up the ports")
Before this commit, one could make probing fail on the error
path only by having register_netdev() fail, which is unlikely.
But after this commit, because it moved the enetc_of_phy_get()
call up in the probing sequence, now we can trigger an mdiobus_free()
bug just by forcing enetc_alloc_msix() to return error, i.e. with the
'pci=nomsi' kernel bootarg (since ENETC relies on MSI support to work),
as the calltrace below shows:
kernel BUG at /home/eiz/work/enetc/net/drivers/net/phy/mdio_bus.c:648!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[...]
Hardware name: LS1028A RDB Board (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO BTYPE=--)
pc : mdiobus_free+0x50/0x58
lr : devm_mdiobus_free+0x14/0x20
[...]
Call trace:
mdiobus_free+0x50/0x58
devm_mdiobus_free+0x14/0x20
release_nodes+0x138/0x228
devres_release_all+0x38/0x60
really_probe+0x1c8/0x368
driver_probe_device+0x5c/0xc0
device_driver_attach+0x74/0x80
__driver_attach+0x8c/0xd8
bus_for_each_dev+0x7c/0xd8
driver_attach+0x24/0x30
bus_add_driver+0x154/0x200
driver_register+0x64/0x120
__pci_register_driver+0x44/0x50
enetc_pf_driver_init+0x24/0x30
do_one_initcall+0x60/0x1c0
kernel_init_freeable+0x1fc/0x274
kernel_init+0x14/0x110
ret_from_fork+0x10/0x34
Fixes: ebfcb23d62 ("enetc: Add ENETC PF level external MDIO support")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull EFI fixes from Ard Biesheuvel:
- Fix the layering violation in the use of the EFI runtime services
availability mask in users of the 'efivars' abstraction
- Revert build fix for GCC v4.8 which is no longer supported
- Some fixes for build issues found by Atish while working on RISC-V support
- Avoid --whole-archive when linking the stub on arm64
- Some x86 EFI stub cleanups from Arvind
H.J. reported that post 5.7 a segfault of a user space task does not longer
dump the Code bytes when /proc/sys/debug/exception-trace is enabled. It
prints 'Code: Bad RIP value.' instead.
This was broken by a recent change which made probe_kernel_read() reject
non-kernel addresses.
Update show_opcodes() so it retrieves user space opcodes via
copy_from_user_nmi().
Fixes: 98a23609b1 ("maccess: always use strict semantics for probe_kernel_read")
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/87h7tz306w.fsf@nanos.tec.linutronix.de
If a user task's stack is empty, or if it only has user regs, ORC
reports it as a reliable empty stack. But arch_stack_walk_reliable()
incorrectly treats it as unreliable.
That happens because the only success path for user tasks is inside the
loop, which only iterates on non-empty stacks. Generally, a user task
must end in a user regs frame, but an empty stack is an exception to
that rule.
Thanks to commit 71c9582528 ("x86/unwind/orc: Fix error handling in
__unwind_start()"), unwind_start() now sets state->error appropriately.
So now for both ORC and FP unwinders, unwind_done() and !unwind_error()
always means the end of the stack was successfully reached. So the
success path for kthreads is no longer needed -- it can also be used for
empty user tasks.
Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lkml.kernel.org/r/f136a4e5f019219cbc4f4da33b30c2f44fa65b84.1594994374.git.jpoimboe@redhat.com
We hold the cl_lock here, and that's enough to keep stateid's from going
away, but it's not enough to prevent the files they point to from going
away. Take fi_lock and a reference and check for NULL, as we do in
other code.
Reported-by: NeilBrown <neilb@suse.de>
Fixes: 78599c42ae ("nfsd4: add file to display list of client's opens")
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
i.MX fixes for 5.8, round 3:
- A couple of FEC2 phy-mode fixes on imx6sx-sabreauto and imx6sx-sdb
board.
- One fix on imx6qdl-icore pin muxing to get USB OTG_ID and SD card
detect work correctly.
* tag 'imx-fixes-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6qdl-icore: Fix OTG_ID pin and sdcard detect
ARM: dts: imx6sx-sabreauto: Fix the phy-mode on fec2
ARM: dts: imx6sx-sdb: Fix the phy-mode on fec2
Link: https://lore.kernel.org/r/20200720040148.GA20462@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Two fixes for the Allwinner SoCs, one to relax the CMA allocation ranges that
were failing on older SoCs and one to fix Cedrus on the H6.
* tag 'sunxi-fixes-for-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: dts: allwinner: h6: Fix Cedrus IOMMU usage
ARM: dts sunxi: Relax a bit the CMA pool allocation range
Link: https://lore.kernel.org/r/e24f0608-6a4f-4163-b99e-a5f48e796184.lettre@localhost
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull media fixes from Mauro Carvalho Chehab:
"A series of fixes for the upcoming atomisp driver. They solve issues
when probing atomisp on devices with multiple cameras and get rid of
warnings when built with W=1.
The diffstat is a bit long, as this driver has several abstractions.
The patches that solved the issues with W=1 had to get rid of some
duplicated code (there used to have 2 versions of the same code, one
for ISP2401 and another one for ISP2400).
As this driver is not in 5.7, such changes won't cause regressions"
* tag 'media/v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (38 commits)
Revert "media: atomisp: keep the ISP powered on when setting it"
media: atomisp: fix mask and shift operation on ISPSSPM0
media: atomisp: move system_local consts into a C file
media: atomisp: get rid of version-specific system_local.h
media: atomisp: move global stuff into a common header
media: atomisp: remove non-used 32-bits consts at system_local
media: atomisp: get rid of some unused static vars
media: atomisp: Fix error code in ov5693_probe()
media: atomisp: Replace trace_printk by pr_info
media: atomisp: Fix __func__ style warnings
media: atomisp: fix help message for ISP2401 selection
media: atomisp: i2c: atomisp-ov2680.c: fixed a brace coding style issue.
media: atomisp: make const arrays static, makes object smaller
media: atomisp: Clean up non-existing folders from Makefile
media: atomisp: Get rid of ACPI specifics in gmin_subdev_add()
media: atomisp: Provide Gmin subdev as parameter to gmin_subdev_add()
media: atomisp: Use temporary variable for device in gmin_subdev_add()
media: atomisp: Refactor PMIC detection to a separate function
media: atomisp: Deduplicate return ret in gmin_i2c_write()
media: atomisp: Make pointer to PMIC client global
...
Pull exfat fixes from Namjae Jeon:
- fix overflow issue at sector calculation
- fix wrong hint_stat initialization
- fix wrong size update of stream entry
- fix endianness of upname in name_hash computation
* tag 'exfat-for-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: fix name_hash computation on big endian systems
exfat: fix wrong size update of stream entry by typo
exfat: fix wrong hint_stat initialization in exfat_find_dir_entry()
exfat: fix overflow issue in exfat_cluster_to_sector()
The "virtio_mmio.device=" command line argument allows a user to specify
the size, address, and IRQ of a virtio device. Previously the only
requirement for the IRQ was that it be an unsigned integer.
Zero is an unsigned integer but an invalid IRQ number, and after
a85a6c86c2 ("driver core: platform: Clarify that IRQ 0 is invalid"),
attempts to use IRQ 0 cause warnings.
If the user specifies IRQ 0, return failure instead of registering a device
with IRQ 0.
Fixes: a85a6c86c2 ("driver core: platform: Clarify that IRQ 0 is invalid")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
My colleague Codrin Ciubotariu, now, maintains this driver internally.
Then I handover the mainline maintenance to him.
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
There are few issues on Zynq SOC observed in the stress tests causing
timeout errors. Even though all the data is received, timeout error
is thrown. This is due to an IP bug in which the COMP bit in ISR is
not set at end of transfer and completion interrupt is not generated.
This bug is seen on Zynq platforms when the following condition occurs:
Master read & HOLD bit set & Transfer size register reaches '0'.
One workaround is to clear the HOLD bit before the transfer size
register reaches '0'. The current implementation checks for this at
the start of the loop and also only for less than FIFO DEPTH case
(ignoring the equal to case).
So clear the HOLD bit when the data yet to receive is less than or
equal to the FIFO DEPTH. This avoids the IP bug condition.
Signed-off-by: Raviteja Narayanam <raviteja.narayanam@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
This reverts commit d358def706.
There are two issues with "i2c: cadence: Fix the hold bit setting" commit.
1. In case of combined message request from user space, when the HOLD
bit is cleared in cdns_i2c_mrecv function, a STOP condition is sent
on the bus even before the last message is started. This is because when
the HOLD bit is cleared, the FIFOS are empty and there is no pending
transfer. The STOP condition should occur only after the last message
is completed.
2. The code added by the commit is redundant. Driver is handling the
setting/clearing of HOLD bit in right way before the commit.
The setting of HOLD bit based on 'bus_hold_flag' is taken care in
cdns_i2c_master_xfer function even before cdns_i2c_msend/cdns_i2c_recv
functions.
The clearing of HOLD bit is taken care at the end of cdns_i2c_msend and
cdns_i2c_recv functions based on bus_hold_flag and byte count.
Since clearing of HOLD bit is done after the slave address is written to
the register (writing to address register triggers the message transfer),
it is ensured that STOP condition occurs at the right time after
completion of the pending transfer (last message).
Signed-off-by: Raviteja Narayanam <raviteja.narayanam@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
There is apparently one site that violates the rule that only current
and ttwu() will modify task->state, namely ptrace_{,un}freeze_traced()
will change task->state for a remote task.
Oleg explains:
"TASK_TRACED/TASK_STOPPED was always protected by siglock. In
particular, ttwu(__TASK_TRACED) must be always called with siglock
held. That is why ptrace_freeze_traced() assumes it can safely do
s/TASK_TRACED/__TASK_TRACED/ under spin_lock(siglock)."
This breaks the ordering scheme introduced by commit:
dbfb089d36 ("sched: Fix loadavg accounting race")
Specifically, the reload not matching no longer implies we don't have
to block.
Simply things by noting that what we need is a LOAD->STORE ordering
and this can be provided by a control dependency.
So replace:
prev_state = prev->state;
raw_spin_lock(&rq->lock);
smp_mb__after_spinlock(); /* SMP-MB */
if (... && prev_state && prev_state == prev->state)
deactivate_task();
with:
prev_state = prev->state;
if (... && prev_state) /* CTRL-DEP */
deactivate_task();
Since that already implies the 'prev->state' load must be complete
before allowing the 'prev->on_rq = 0' store to become visible.
Fixes: dbfb089d36 ("sched: Fix loadavg accounting race")
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
page-aligned, but the end of the .bss..page_aligned section is not
guaranteed to be page-aligned.
As a result, objects from other .bss sections may end up on the same 4k
page as the idt_table, and will accidentially get mapped read-only during
boot, causing unexpected page-faults when the kernel writes to them.
This could be worked around by making the objects in the page aligned
sections page sized, but that's wrong.
Explicit sections which store only page aligned objects have an implicit
guarantee that the object is alone in the page in which it is placed. That
works for all objects except the last one. That's inconsistent.
Enforcing page sized objects for these sections would wreckage memory
sanitizers, because the object becomes artificially larger than it should
be and out of bound access becomes legit.
Align the end of the .bss..page_aligned and .data..page_aligned section on
page-size so all objects places in these sections are guaranteed to have
their own page.
[ tglx: Amended changelog ]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
Currently drive supports taprio offload which is a tc feature offloaded
to cpsw hardware. So driver has to set the hw feature flag, NETIF_F_HW_TC
in the net device to be compliant. This patch adds the flag.
Fixes: 8127224c27 ("ethernet: ti: am65-cpsw-qos: add TAPRIO offload support")
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When regmap_update_bits failed in ave_init(), calls of the functions
reset_control_assert() and clk_disable_unprepare() were missed.
Add goto out_reset_assert to do this.
Fixes: 57878f2f46 ("net: ethernet: ave: add support for phy-mode setting of system controller")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver is not working because of problems of its receiving code.
This patch fixes it to make it work.
When the driver receives an LAPB frame, it should first pass the frame
to the LAPB module to process. After processing, the LAPB module passes
the data (the packet) back to the driver, the driver should then add a
one-byte pseudo header and pass the data to upper layers.
The changes to the "x25_asy_bump" function and the
"x25_asy_data_indication" function are to correctly implement this
procedure.
Also, the "x25_asy_unesc" function ignores any frame that is shorter
than 3 bytes. However the shortest frames are 2-byte long. So we need
to change it to allow 2-byte frames to pass.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Reviewed-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sync_thread_backup only checks sk_receive_queue is empty or not,
there is a situation which cannot sync the connection entries when
sk_receive_queue is empty and sk_rmem_alloc is larger than sk_rcvbuf,
the sync packets are dropped in __udp_enqueue_schedule_skb, this is
because the packets in reader_queue is not read, so the rmem is
not reclaimed.
Here I add the check of whether the reader_queue of the udp sock is
empty or not to solve this problem.
Fixes: 2276f58ac5 ("udp: use a separate rx queue for packet reception")
Reported-by: zhouxudong <zhouxudong8@huawei.com>
Signed-off-by: guodeqing <geffrey.guo@huawei.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
According to 'man 8 ip-netns', if `ip netns identify` returns an empty string,
there's no net namespace associated with current PID: fix the net ns entrance
logic.
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Lobakin says:
====================
qed: suppress irrelevant error messages on HW init
This raises the verbosity level of several error/warning messages on
driver/module initialization, most of which are false-positives, and
the one actively spamming the log for no reason.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It was found that qed_pglueb_rbc_attn_handler() can produce a lot of
false-positive error detections on driver load/reload (especially after
crashes/recoveries) and spam the kernel log:
[ 4.958275] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d00ff0
[ 2079.146764] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0
[ 2116.374631] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0
[ 2135.250564] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0
[...]
Reduce the logging level of two false-positive prone error messages from
notice to verbose on initialization (only) to not mix it with real error
attentions while debugging.
Fixes: 666db4862f ("qed: Revise load sequence to avoid PCI errors")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the verbosity of the "don't support RoCE & iWARP simultaneously"
warning to debug level to stop flooding on driver/hardware initialization:
[ 4.783230] qede 01:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth0]
[ 4.810020] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[ 4.861186] qede 01:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth1]
[ 4.893311] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[ 5.181713] qede a1:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth2]
[ 5.224740] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[ 5.276449] qede a1:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth3]
[ 5.318671] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[ 5.369548] qede a1:00.02: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth4]
[ 5.411645] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
Fixes: e0a8f9de16 ("qed: Add iWARP enablement support")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the nsim_create(), rtnl_lock() is called before nsim_bpf_init().
If nsim_bpf_init() is failed, rtnl_unlock() should be called,
but it isn't called.
So, unbalanced locking would occur.
Fixes: e05b2d141f ("netdevsim: move netdev creation/destruction to dev probe")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When doing "ip link set dev ... up" for a ksz9477 backed link,
ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
called. Doing so reverts any previous change to advertised link modes
e.g. using a udevd .link file.
phy_remove_link_mode is not meant to be used while opening a link and
should be called during phy probe when the link is not yet available to
userspace.
Therefore move the phy_remove_link_mode calls into
ksz9477_switch_register. It indirectly calls dsa_register_switch, which
creates the relevant struct phy_devices and we update the link modes
right after that. At that time dev->features is already initialized by
ksz9477_switch_detect.
Remove phy_setup from ksz_dev_ops as no users remain.
Link: https://lore.kernel.org/netdev/20200715192722.GD1256692@lunn.ch/
Fixes: 42fc6a4c61 ("net: dsa: microchip: prepare PHY for proper advertisement")
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan says:
====================
net: hns3: fixes for -net
There are some bugfixes for the HNS3 ethernet driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, PF queries the MAC link status per second by calling
function hclge_get_mac_link_status(). It return the error code
when failed to send cmdq command to firmware. It's incorrect,
because this return value is used as the MAC link status, which
0 means link down, and none-zero means link up. So fixes it.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The content of the TX desc is automatically cleared by the HW
when the HW has sent out the packet to the wire. When desc filling
fails in hns3_nic_net_xmit(), it will call hns3_clear_desc() to do
the error handling, which miss zeroing of the TX desc and the
checking if a unmapping is needed.
So add the zeroing and checking in hns3_clear_desc() to avoid the
above problem. Also add DESC_TYPE_UNKNOWN to indicate the info in
desc_cb is not valid, because hns3_nic_reclaim_desc() may treat
the desc_cb->type of zero as packet and add to the sent pkt
statistics accordingly.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With GRO and fraglist support, the SKB can be aggregated to
a total size of 65535, and when that SKB is forwarded through
a bridge, the size of the SKB may be pushed to exceed the size
of 65535 when br_dev_queue_push_xmit() is called.
The max send size of BD supported by the HW is 65535, when a SKB
with a headlen of over 65535 is sent to the driver, the driver
needs to use multi BD to send the linear data, and the send size
of the last BD is calculated incorrectly by the driver who is
using '&' operation, which causes a TX error.
Use '%' operation to fix this problem.
Fixes: 3fe13ed95d ("net: hns3: avoid mult + div op in critical data path")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a big TX buffer is sent using multi BD, the driver maps the
whole TX buffer, and unmaps it using info in desc_cb corresponding
to each BD, but only the info in the desc_cb of first BD is correct,
other info in desc_cb is wrong, which causes TX unmapping problem
when SMMU is on.
Only set the mapping and freeing info in the desc_cb of first BD to
fix this problem, because the TX buffer only need to be unmapped and
freed once.
Fixes: 1e8a7977d09f("net: hns3: add handling for big TX fragment")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huzhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can't use IS_UDPLITE to replace udp_sk->pcflag when UDPLITE_RECV_CC is
checked.
Fixes: b2bf1e2659 ("[UDP]: Clean up for IS_UDPLITE macro")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I cat 'tx_timeout' by sysfs, it displays as follows. It's better to
add a newline for easy reading.
root@syzkaller:~# cat /sys/devices/virtual/net/lo/queues/tx-0/tx_timeout
0root@syzkaller:~#
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the report of [1], this driver is possible to cause
the following error in ravb_tx_timeout_work().
ravb e6800000.ethernet ethernet: failed to switch device to config mode
This error means that the hardware could not change the state
from "Operation" to "Configuration" while some tx and/or rx queue
are operating. After that, ravb_config() in ravb_dmac_init() will fail,
and then any descriptors will be not allocaled anymore so that NULL
pointer dereference happens after that on ravb_start_xmit().
To fix the issue, the ravb_tx_timeout_work() should check
the return values of ravb_stop_dma() and ravb_dmac_init().
If ravb_stop_dma() fails, ravb_tx_timeout_work() re-enables TX and RX
and just exits. If ravb_dmac_init() fails, just exits.
[1]
https://lore.kernel.org/linux-renesas-soc/20200518045452.2390-1-dirk.behme@de.bosch.com/
Reported-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima says:
====================
udp: Fix reuseport selection with connected sockets.
This patch set addresses two issues which happen when both connected and
unconnected sockets are in the same UDP reuseport group.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, SO_REUSEPORT does not work well if connected sockets are in a
UDP reuseport group.
Then reuseport_has_conns() returns true and the result of
reuseport_select_sock() is discarded. Also, unconnected sockets have the
same score, hence only does the first unconnected socket in udp_hslot
always receive all packets sent to unconnected sockets.
So, the result of reuseport_select_sock() should be used for load
balancing.
The noteworthy point is that the unconnected sockets placed after
connected sockets in sock_reuseport.socks will receive more packets than
others because of the algorithm in reuseport_select_sock().
index | connected | reciprocal_scale | result
---------------------------------------------
0 | no | 20% | 40%
1 | no | 20% | 20%
2 | yes | 20% | 0%
3 | no | 20% | 40%
4 | yes | 20% | 0%
If most of the sockets are connected, this can be a problem, but it still
works better than now.
Fixes: acdcecc612 ("udp: correct reuseport selection with connected sockets")
CC: Willem de Bruijn <willemb@google.com>
Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If an unconnected socket in a UDP reuseport group connect()s, has_conns is
set to 1. Then, when a packet is received, udp[46]_lib_lookup2() scans all
sockets in udp_hslot looking for the connected socket with the highest
score.
However, when the number of sockets bound to the port exceeds max_socks,
reuseport_grow() resets has_conns to 0. It can cause udp[46]_lib_lookup2()
to return without scanning all sockets, resulting in that packets sent to
connected sockets may be distributed to unconnected sockets.
Therefore, reuseport_grow() should copy has_conns.
Fixes: acdcecc612 ("udp: correct reuseport selection with connected sockets")
CC: Willem de Bruijn <willemb@google.com>
Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible to cause a btrfs mount to fail by racing it with a slow
umount. The crux of the sequence is generic_shutdown_super not yet
calling sop->put_super before btrfs_mount_root calls btrfs_open_devices.
If that occurs, btrfs_open_devices will decide the opened counter is
non-zero, increment it, and skip resetting fs_devices->total_rw_bytes to
0. From here, mount will call sget which will result in grab_super
trying to take the super block umount semaphore. That semaphore will be
held by the slow umount, so mount will block. Before up-ing the
semaphore, umount will delete the super block, resulting in mount's sget
reliably allocating a new one, which causes the mount path to dutifully
fill it out, and increment total_rw_bytes a second time, which causes
the mount to fail, as we see double the expected bytes.
Here is the sequence laid out in greater detail:
CPU0 CPU1
down_write sb->s_umount
btrfs_kill_super
kill_anon_super(sb)
generic_shutdown_super(sb);
shrink_dcache_for_umount(sb);
sync_filesystem(sb);
evict_inodes(sb); // SLOW
btrfs_mount_root
btrfs_scan_one_device
fs_devices = device->fs_devices
fs_info->fs_devices = fs_devices
// fs_devices-opened makes this a no-op
btrfs_open_devices(fs_devices, mode, fs_type)
s = sget(fs_type, test, set, flags, fs_info);
find sb in s_instances
grab_super(sb);
down_write(&s->s_umount); // blocks
sop->put_super(sb)
// sb->fs_devices->opened == 2; no-op
spin_lock(&sb_lock);
hlist_del_init(&sb->s_instances);
spin_unlock(&sb_lock);
up_write(&sb->s_umount);
return 0;
retry lookup
don't find sb in s_instances (deleted by CPU0)
s = alloc_super
return s;
btrfs_fill_super(s, fs_devices, data)
open_ctree // fs_devices total_rw_bytes improperly set!
btrfs_read_chunk_tree
read_one_dev // increment total_rw_bytes again!!
super_total_bytes < fs_devices->total_rw_bytes // ERROR!!!
To fix this, we clear total_rw_bytes from within btrfs_read_chunk_tree
before the calls to read_one_dev, while holding the sb umount semaphore
and the uuid mutex.
To reproduce, it is sufficient to dirty a decent number of inodes, then
quickly umount and mount.
for i in $(seq 0 500)
do
dd if=/dev/zero of="/mnt/foo/$i" bs=1M count=1
done
umount /mnt/foo&
mount /mnt/foo
does the trick for me.
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When locking pages for delalloc, we check if it's dirty and mapping still
matches. If it does not match, we need to return -EAGAIN and release all
pages. Only the current page was put though, iterate over all the
remaining pages too.
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When running tests like generic/013 on test device with btrfs quota
enabled, it can normally lead to data leak, detected at unmount time:
BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
------------[ cut here ]------------
WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
Call Trace:
btrfs_put_super+0x15/0x17 [btrfs]
generic_shutdown_super+0x72/0x110
kill_anon_super+0x18/0x30
btrfs_kill_super+0x17/0x30 [btrfs]
deactivate_locked_super+0x3b/0xa0
deactivate_super+0x40/0x50
cleanup_mnt+0x135/0x190
__cleanup_mnt+0x12/0x20
task_work_run+0x64/0xb0
__prepare_exit_to_usermode+0x1bc/0x1c0
__syscall_return_slowpath+0x47/0x230
do_syscall_64+0x64/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace caf08beafeca2392 ]---
BTRFS error (device dm-3): qgroup reserved space leaked
[CAUSE]
In the offending case, the offending operations are:
2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0
The following sequence of events could happen after the writev():
CPU1 (writeback) | CPU2 (truncate)
-----------------------------------------------------------------
btrfs_writepages() |
|- extent_write_cache_pages() |
|- Got page for 1003520 |
| 1003520 is Dirty, no writeback |
| So (!clear_page_dirty_for_io()) |
| gets called for it |
|- Now page 1003520 is Clean. |
| | btrfs_setattr()
| | |- btrfs_setsize()
| | |- truncate_setsize()
| | New i_size is 18388
|- __extent_writepage() |
| |- page_offset() > i_size |
|- btrfs_invalidatepage() |
|- Page is clean, so no qgroup |
callback executed
This means, the qgroup reserved data space is not properly released in
btrfs_invalidatepage() as the page is Clean.
[FIX]
Instead of checking the dirty bit of a page, call
btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().
As qgroup rsv are completely bound to the QGROUP_RESERVED bit of
io_tree, not bound to page status, thus we won't cause double freeing
anyway.
Fixes: 0b34c261e2 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
NULL dereference occurs when string that is not ended with space or
newline is written to some dpm sysfs interface (for example pp_dpm_sclk).
This happens because strsep replaces the tmp with NULL if the delimiter
is not present in string, which is then dereferenced by tmp[0].
Reproduction example:
sudo sh -c 'echo -n 1 > /sys/class/drm/card0/device/pp_dpm_sclk'
Signed-off-by: Paweł Gronowski <me@woland.xyz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Commit 7b668c064e ("serial: 8250: Fix max baud limit in generic 8250
port") fixed limits of a baud rate setting for a generic 8250 port.
In other words since that commit the baud rate has been permitted to be
within [uartclk / 16 / UART_DIV_MAX; uartclk / 16], which is absolutely
normal for a standard 8250 UART port. But there are custom 8250 ports,
which provide extended baud rate limits. In particular the Mediatek 8250
port can work with baud rates up to "uartclk" speed.
Normally that and any other peculiarity is supposed to be handled in a
custom set_termios() callback implemented in the vendor-specific
8250-port glue-driver. Currently that is how it's done for the most of
the vendor-specific 8250 ports, but for some reason for Mediatek a
solution has been spread out to both the glue-driver and to the generic
8250-port code. Due to that a bug has been introduced, which permitted the
extended baud rate limit for all even for standard 8250-ports. The bug
has been fixed by the commit 7b668c064e ("serial: 8250: Fix max baud
limit in generic 8250 port") by narrowing the baud rates limit back down to
the normal bounds. Unfortunately by doing so we also broke the
Mediatek-specific extended bauds feature.
A fix of the problem described above is twofold. First since we can't get
back the extended baud rate limits feature to the generic set_termios()
function and that method supports only a standard baud rates range, the
requested baud rate must be locally stored before calling it and then
restored back to the new termios structure after the generic set_termios()
finished its magic business. By doing so we still use the
serial8250_do_set_termios() method to set the LCR/MCR/FCR/etc. registers,
while the extended baud rate setting procedure will be performed later in
the custom Mediatek-specific set_termios() callback. Second since a true
baud rate is now fully calculated in the custom set_termios() method we
need to locally update the port timeout by calling the
uart_update_timeout() function. After the fixes described above are
implemented in the 8250_mtk.c driver, the Mediatek 8250-port should
get back to normally working with extended baud rates.
Link: https://lore.kernel.org/linux-serial/20200701211337.3027448-1-danielwinkler@google.com
Fixes: 7b668c064e ("serial: 8250: Fix max baud limit in generic 8250 port")
Reported-by: Daniel Winkler <danielwinkler@google.com>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: stable <stable@vger.kernel.org>
Tested-by: Claire Chang <tientzu@chromium.org>
Link: https://lore.kernel.org/r/20200714124113.20918-1-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prefetch work in mlx5_ib_prefetch_mr_work can be queued and able to run
concurrently with destruction of the implicit MR. The num_deferred_work
was intended to serialize this, but there is a race:
CPU0 CPU1
mlx5_ib_free_implicit_mr()
xa_erase(odp_mkeys)
synchronize_srcu()
__xa_erase(implicit_children)
mlx5_ib_prefetch_mr_work()
pagefault_mr()
pagefault_implicit_mr()
implicit_get_child_mr()
xa_cmpxchg()
atomic_dec_and_test(num_deferred_mr)
wait_event(imr->q_deferred_work)
ib_umem_odp_release(odp_imr)
kfree(odp_imr)
At this point in mlx5_ib_free_implicit_mr() the implicit_children list is
supposed to be empty forever so that destroy_unused_implicit_child_mr()
and related are not and will not be running.
Since it is not empty the destroy_unused_implicit_child_mr() flow ends up
touching deallocated memory as mlx5_ib_free_implicit_mr() already tore down the
imr parent.
The solution is to flush out the prefetch wq by driving num_deferred_work
to zero after creation of new prefetch work is blocked.
Fixes: 5256edcb98 ("RDMA/mlx5: Rework implicit ODP destroy")
Link: https://lore.kernel.org/r/20200719065435.130722-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The problems started with the revert (18cc7ac8a2). The
cdns_uart_console.index is statically assigned -1. When the port is
registered, Linux assigns consecutive numbers to it. It turned out that
when using ttyPS1 as console, the index is not updated as we are reusing
the same cdns_uart_console instance for multiple ports. When registering
ttyPS0, it gets updated from -1 to 0, but when registering ttyPS1, it
already is 0 and not updated.
That led to 2ae11c46d5. It assigns the index prior to registering
the uart_driver once. Unfortunately, that ended up breaking the
situation where the probe order does not match the id order. When using
the same device tree for both uboot and linux, it is important that the
serial0 alias points to the console. So some boards reverse those
aliases. This was reported by Jan Kiszka. The proposed fix was reverting
the index assignment and going back to the previous iteration.
However such a reversed assignement (serial0 -> uart1, serial1 -> uart0)
was already partially broken by the revert (18cc7ac8a2). While the
ttyPS device works, the kmsg connection is already broken and kernel
messages go missing. Reverting the id assignment does not fix this.
>From the xilinx_uartps driver pov (after reverting the refactoring
commits), there can be only one console. This manifests in static
variables console_pprt and cdns_uart_console. These variables are not
properly linked and can go out of sync. The cdns_uart_console.index is
important for uart_add_one_port. We call that function for each port -
one of which hopefully is the console. If it isn't, the CON_ENABLED flag
is not set and console_port is cleared. The next cdns_uart_probe call
then tries to register the next port using that same cdns_uart_console.
It is important that console_port and cdns_uart_console (and its index
in particular) stay in sync. The index assignment implemented by
Shubhrajyoti Datta is correct in principle. It just may have to happen a
second time if the first cdns_uart_probe call didn't encounter the
console device. And we shouldn't change the index once the console uart
is registered.
Reported-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Link: https://lore.kernel.org/linux-serial/f4092727-d8f5-5f91-2c9f-76643aace993@siemens.com/
Fixes: 18cc7ac8a2 ("Revert "serial: uartps: Register own uart console and driver structures"")
Fixes: 2ae11c46d5 ("tty: xilinx_uartps: Fix missing id assignment to the console")
Fixes: 76ed2e1057 ("Revert "tty: xilinx_uartps: Fix missing id assignment to the console"")
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200713073227.GA3805@laureti-dev
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzbot is reporting general protection fault in do_con_write() [1] caused
by vc->vc_screenbuf == ZERO_SIZE_PTR caused by vc->vc_screenbuf_size == 0
caused by vc->vc_cols == vc->vc_rows == vc->vc_size_row == 0 caused by
fb_set_var() from ioctl(FBIOPUT_VSCREENINFO) on /dev/fb0 , for
gotoxy(vc, 0, 0) from reset_terminal() from vc_init() from vc_allocate()
from con_install() from tty_init_dev() from tty_open() on such console
causes vc->vc_pos == 0x10000000e due to
((unsigned long) ZERO_SIZE_PTR) + -1U * 0 + (-1U << 1).
I don't think that a console with 0 column or 0 row makes sense. And it
seems that vc_do_resize() does not intend to allow resizing a console to
0 column or 0 row due to
new_cols = (cols ? cols : vc->vc_cols);
new_rows = (lines ? lines : vc->vc_rows);
exception.
Theoretically, cols and rows can be any range as long as
0 < cols * rows * 2 <= KMALLOC_MAX_SIZE is satisfied (e.g.
cols == 1048576 && rows == 2 is possible) because of
vc->vc_size_row = vc->vc_cols << 1;
vc->vc_screenbuf_size = vc->vc_rows * vc->vc_size_row;
in visual_init() and kzalloc(vc->vc_screenbuf_size) in vc_allocate().
Since we can detect cols == 0 or rows == 0 via screenbuf_size = 0 in
visual_init(), we can reject kzalloc(0). Then, vc_allocate() will return
an error, and con_write() will not be called on a console with 0 column
or 0 row.
We need to make sure that integer overflow in visual_init() won't happen.
Since vc_do_resize() restricts cols <= 32767 and rows <= 32767, applying
1 <= cols <= 32767 and 1 <= rows <= 32767 restrictions to vc_allocate()
will be practically fine.
This patch does not touch con_init(), for returning -EINVAL there
does not help when we are not returning -ENOMEM.
[1] https://syzkaller.appspot.com/bug?extid=017265e8553724e514e8
Reported-and-tested-by: syzbot <syzbot+017265e8553724e514e8@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200712111013.11881-1-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit
84e6ffb2c4 ("arm: add support for folded p4d page tables")
updated create_mapping_late() to take folded P4Ds into account when
creating mappings, but inverted the p4d_alloc() failure test, resulting
in no mapping to be created at all.
When the EFI rtc driver subsequently tries to invoke the EFI GetTime()
service, the memory regions covering the EFI data structures are missing
from the page tables, resulting in a crash like
Unable to handle kernel paging request at virtual address 5ae0cf28
pgd = (ptrval)
[5ae0cf28] *pgd=80000040205003, *pmd=00000000
Internal error: Oops: 207 [#1] SMP THUMB2
Modules linked in:
CPU: 0 PID: 7 Comm: kworker/u32:0 Not tainted 5.7.0+ #92
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
Workqueue: efi_rts_wq efi_call_rts
PC is at efi_call_rts+0x94/0x294
LR is at efi_call_rts+0x83/0x294
pc : [<c0b4f098>] lr : [<c0b4f087>] psr: 30000033
sp : e6219ef0 ip : 00000000 fp : ffffe000
r10: 00000000 r9 : 00000000 r8 : 30000013
r7 : e6201dd0 r6 : e6201ddc r5 : 00000000 r4 : c181f264
r3 : 5ae0cf10 r2 : 00000001 r1 : e6201dd0 r0 : e6201ddc
Flags: nzCV IRQs on FIQs on Mode SVC_32 ISA Thumb Segment none
Control: 70c5383d Table: 661cc840 DAC: 00000001
Process kworker/u32:0 (pid: 7, stack limit = 0x(ptrval))
...
[<c0b4f098>] (efi_call_rts) from [<c0448219>] (process_one_work+0x16d/0x3d8)
[<c0448219>] (process_one_work) from [<c0448581>] (worker_thread+0xfd/0x408)
[<c0448581>] (worker_thread) from [<c044ca7b>] (kthread+0x103/0x104)
...
Fixes: 84e6ffb2c4 ("arm: add support for folded p4d page tables")
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
__vdso_*() should be removed and fallback used if CNTCVT is not
available by cntvct_functional(). __vdso_clock_gettime64 when added
previous commit is using the incorrect CNTCVT value in that state.
__vdso_clock_gettime64 is also added to remove it's symbol.
Cc: stable@vger.kernel.org
Fixes: 74d06efb9c ("ARM: 8932/1: Add clock_gettime64 entry point")
Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Tested-by: Robin Murphy <robin.mruphy@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Unprivileged memory accesses generated by the so-called "translated"
instructions (e.g. LDRT) in kernel mode can cause user watchpoints to fire
unexpectedly. In such cases, the hw_breakpoint logic will invoke the user
overflow handler which will typically raise a SIGTRAP back to the current
task. This is futile when returning back to the kernel because (a) the
signal won't have been delivered and (b) userspace can't handle the thing
anyway.
Avoid invoking the user overflow handler for watchpoints triggered by
kernel uaccess routines, and instead single-step over the faulting
instruction as we would if no overflow handler had been installed.
Cc: <stable@vger.kernel.org>
Fixes: f81ef4a920 ("ARM: 6356/1: hw-breakpoint: add ARM backend for the hw-breakpoint framework")
Reported-by: Luis Machado <luis.machado@linaro.org>
Tested-by: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Stalls are quite frequent with recent kernels. I enabled
CONFIG_SOFTLOCKUP_DETECTOR and I caught the following stall:
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [cc1:22803]
CPU: 0 PID: 22803 Comm: cc1 Not tainted 5.6.17+ #3
Hardware name: 9000/800/rp3440
IAOQ[0]: d_alloc_parallel+0x384/0x688
IAOQ[1]: d_alloc_parallel+0x388/0x688
RP(r2): d_alloc_parallel+0x134/0x688
Backtrace:
[<000000004036974c>] __lookup_slow+0xa4/0x200
[<0000000040369fc8>] walk_component+0x288/0x458
[<000000004036a9a0>] path_lookupat+0x88/0x198
[<000000004036e748>] filename_lookup+0xa0/0x168
[<000000004036e95c>] user_path_at_empty+0x64/0x80
[<000000004035d93c>] vfs_statx+0x104/0x158
[<000000004035dfcc>] __do_sys_lstat64+0x44/0x80
[<000000004035e5a0>] sys_lstat64+0x20/0x38
[<0000000040180054>] syscall_exit+0x0/0x14
The code was stuck in this loop in d_alloc_parallel:
4037d414: 0e 00 10 dc ldd 0(r16),ret0
4037d418: c7 fc 5f ed bb,< ret0,1f,4037d414 <d_alloc_parallel+0x384>
4037d41c: 08 00 02 40 nop
This is the inner loop of bit_spin_lock which is called by hlist_bl_unlock in
d_alloc_parallel:
static inline void bit_spin_lock(int bitnum, unsigned long *addr)
{
/*
* Assuming the lock is uncontended, this never enters
* the body of the outer loop. If it is contended, then
* within the inner loop a non-atomic test is used to
* busywait with less bus contention for a good time to
* attempt to acquire the lock bit.
*/
preempt_disable();
#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
while (unlikely(test_and_set_bit_lock(bitnum, addr))) {
preempt_enable();
do {
cpu_relax();
} while (test_bit(bitnum, addr));
preempt_disable();
}
#endif
__acquire(bitlock);
}
After consideration, I realized that we must be losing bit unlocks.
Then, I noticed that we missed defining atomic64_set_release().
Adding this define fixes the stalls in bit operations.
Signed-off-by: Dave Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
Pull sound fixes from Takashi Iwai:
"This became fairly large, containing mostly the collection of ASoC
fixes that slipped from the previous request, so I sent now a bit
earlier than usual. But all changes look small and mostly
device-specific, hence nothing to worry too much.
Majority of changes are for x86 based platforms and their CODEC
drivers, in order to address some issues hit by their recent tests and
fuzzing. The rest are other ASoC device-specific fixes (imx, qcom,
wm8974, amd, rockchip) as well as a trivial fix for a kernel WARNING
hit by syzkaller"
* tag 'sound-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (28 commits)
ALSA: hda/realtek: Fixed ALC298 sound bug by adding quirk for Samsung Notebook Pen S
ALSA: info: Drop WARN_ON() from buffer NULL sanity check
ASoC: rt5682: Report the button event in the headset type only
ASoC: Intel: bytcht_es8316: Add missed put_device()
ASoC: rt5682: Enable Vref2 under using PLL2
ASoC: rt286: fix unexpected interrupt happens
ASoC: wm8974: remove unsupported clock mode
ASoC: wm8974: fix Boost Mixer Aux Switch
ASoC: SOF: core: fix null-ptr-deref bug during device removal
ASoc: codecs: max98373: remove Idle_bias_on to let codec suspend
ASoC: codecs: max98373: Removed superfluous volume control from chip default
ASoC: topology: fix tlvs in error handling for widget_dmixer
ASoC: topology: fix kernel oops on route addition error
ASoC: SOF: imx: add min/max channels for SAI/ESAI on i.MX8/i.MX8M
ASoC: Intel: bdw-rt5677: fix non BE conversion
ASoC: soc-dai: set dai_link dpcm_ flags with a helper
MAINTAINERS: Add Shengjiu to reviewer list of sound/soc/fsl
ASoC: core: Remove only the registered component in devm functions
MAINTAINERS: Change Maintainer for some at91 drivers
ASoC: dt-bindings: simple-card: Fix 'make dt_binding_check' warnings
...
Carlos Hernandez <ceh@ti.com> reported that we now have a suspend and
resume regresssion on am3 and am4 compared to the earlier kernels. While
suspend and resume works with v5.8-rc3, we now get errors with rtcwake:
pm33xx pm33xx: PM: Could not transition all powerdomains to target state
...
rtcwake: write error
This is because we now fail to idle the system timer clocks that the
idle code checks and the error gets propagated to the rtcwake.
Turns out there are several issues that need to be fixed:
1. Ignore no-idle and no-reset configured timers for the ti-sysc
interconnect target driver as otherwise it will keep the system timer
clocks enabled
2. Toggle the system timer functional clock for suspend for am3 and am4
(but not for clocksource on am3)
3. Only reconfigure type1 timers in dmtimer_systimer_disable()
4. Use of_machine_is_compatible() instead of of_device_is_compatible()
for checking the SoC type
Fixes: 52762fbd1c ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support")
Reported-by: Carlos Hernandez <ceh@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Carlos Hernandez <ceh@ti.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200713162601.6829-1-tony@atomide.com
Change the counter name DLFT_CCERROR to DLFT_CCFINISH on IBM z15.
This counter counts completed DEFLATE instructions with exit code
0, 1 or 2. Since exit code 0 means success and exit code 1 or 2
indicate errors, change the counter name to avoid confusion.
This counter is incremented each time the DEFLATE instruction
completed regardless if an error was detected or not.
Fixes: d68d5d51dc ("s390/cpum_cf: Add new extended counters for IBM z15")
Fixes: e7950166e4 ("perf vendor events s390: Add new deflate counters for IBM z15")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
The commits "xfrm: Move dst->path into struct xfrm_dst"
and "net: Create and use new helper xfrm_dst_child()."
changed xfrm bundle handling under the assumption
that xdst->path and dst->child are not a NULL pointer
only if dst->xfrm is not a NULL pointer. That is true
with one exception. If the xfrm hold queue is used
to wait until a SA is installed by the key manager,
we create a dummy bundle without a valid dst->xfrm
pointer. The current xfrm bundle handling crashes
in that case. Fix this by extending the NULL check
of dst->xfrm with a test of the DST_XFRM_QUEUE flag.
Fixes: 0f6c480f23 ("xfrm: Move dst->path into struct xfrm_dst")
Fixes: b92cf4aab8 ("net: Create and use new helper xfrm_dst_child().")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The kernel test bot reported[1] that using set_mask_bits on a u8 causes
the following issue on parisc:
hppa-linux-ld: drivers/phy/ti/phy-tusb1210.o: in function `tusb1210_probe':
>> (.text+0x2f4): undefined reference to `__cmpxchg_called_with_bad_pointer'
>> hppa-linux-ld: (.text+0x324): undefined reference to `__cmpxchg_called_with_bad_pointer'
hppa-linux-ld: (.text+0x354): undefined reference to `__cmpxchg_called_with_bad_pointer'
Add support for cmpxchg on u8 pointers.
[1] https://lore.kernel.org/patchwork/patch/1272617/#1468946
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Liam Beguin <liambeguin@gmail.com>
Tested-by: Dave Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Sabrina Dubroca says:
====================
xfrm: a few fixes for espintc
Andrew Cagney reported some issues when trying to use async operations
on the encapsulation socket. Patches 1 and 2 take care of these bugs.
In addition, I missed a spot when adding IPv6 support and converting
to the common config option.
====================
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
It fails to boot the v5.8-rc4 kernel with CONFIG_KASAN because kasan_init
and kasan_early_init use uninitialized __sbi_rfence as executing the
tlb_flush_all(). Actually, at this moment, only the CPU which is
responsible for the system initialization enables the MMU. Other CPUs are
parking at the .Lsecondary_start. Hence the tlb_flush_all() is able to be
replaced by local_tlb_flush_all() to avoid using uninitialized
__sbi_rfence.
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Commit 02288248b0 ("tipc: eliminate gap indicator from ACK messages")
eliminated sending of the 'gap' indicator in regular ACK messages and
only allowed to build NACK message with enabled probe/probe_reply.
However, necessary correction for building NACK message was missed
in tipc_link_timeout() function. This leads to significant delay and
link reset (due to retransmission failure) in lossy environment.
This commit fixes it by setting the 'probe' flag to 'true' when
the receive deferred queue is not empty. As a result, NACK message
will be built to send back to another peer.
Fixes: 02288248b0 ("tipc: eliminate gap indicator from ACK messages")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stream.size field is updated to the value of create timestamp
of the file entry. Fix this to use correct stream entry pointer.
Fixes: 29bbb14bfc ("exfat: fix incorrect update of stream entry in __exfat_truncate()")
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
We found the wrong hint_stat initialization in exfat_find_dir_entry().
It should be initialized when cluster is EXFAT_EOF_CLUSTER.
Fixes: ca06197382 ("exfat: add directory operations")
Cc: stable@vger.kernel.org # v5.7
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
An overflow issue can occur while calculating sector in
exfat_cluster_to_sector(). It needs to cast clus's type to sector_t
before left shifting.
Fixes: 1acf1a564b ("exfat: add in-memory and on-disk structures and headers")
Cc: stable@vger.kernel.org # v5.7
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Fix the warning: [-Werror=-Wframe-larger-than=]
drivers/net/ethernet/neterion/vxge/vxge-main.c:
In function'VXGE_COMPLETE_VPATH_TX.isra.37':
drivers/net/ethernet/neterion/vxge/vxge-main.c:119:1:
warning: the frame size of 1056 bytes is larger than 1024 bytes
Dropping the NR_SKB_COMPLETED to 16 is appropriate that won't
have much impact on performance and functionality.
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ag71xx_mdio_probe() forgets to call clk_disable_unprepare() when
of_reset_control_get_exclusive() failed. Add the missed call to fix it.
Fixes: d51b6ce441 ("net: ethernet: add ag71xx driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fragment packets do defrag in tcf_ct_handle_fragments
will clear the skb->cb which make the qdisc_skb_cb clear
too. So the qdsic_skb_cb should be store before defrag and
restore after that.
It also update the pkt_len after all the
fragments finish the defrag to one packet and make the
following actions counter correct.
Fixes: b57dc7c13e ("net/sched: Introduce action ct")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
The implementation of s3fwrn5_recv_frame() is supposed to consume skb on
all execution paths. Release skb before returning -ENODEV.
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip_dev_find() call holds net_device reference which is not needed,
use __ip_dev_find() which does not hold reference.
v1->v2:
- Correct submission tree.
- Add fixes tag.
Fixes: cc35c88ae4 ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tls data skb is pending for Tx and tls alert comes , It
is wrongly overwrite the record type of tls data to tls alert
record type. fix the issue correcting it.
v1->v2:
- Correct submission tree.
- Add fixes tag.
Fixes: 6919a8264a ("Crypto/chtls: add/delete TLS header in driver")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson says:
====================
ionic: locking and filter fixes
These patches address an ethtool show regs problem, some locking sightings,
and issues with RSS hash and filter_id tracking after a managed FW update.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The ionic_wait_on_bit_lock() was a open-coded mutex knock-off
used only for protecting the queue reset operations, and there
was no reason not to use the real thing. We can use the lock
more correctly and to better protect the queue stop and start
operations from cross threading. We can also remove a useless
and expensive bit operation from the Rx path.
This fixes a case found where the link_status_check from a link
flap could run into an MTU change and cause a crash.
Fixes: beead698b1 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure the RSS hash key is kept across a fw update by not
de-initing it when an update is happening.
Fixes: c672412f61 ("ionic: remove lifs on fw reset")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we replay the rx filters after a fw-upgrade we get new
filter_id values from the FW, which we need to save and update
in our local filter list. This allows us to delete the filters
with the correct filter_id when we're done.
Fixes: 7e4d47596b ("ionic: replay filters after fw upgrade")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add in a couple of forgotten spinlocks and fix up some of
the debug messages around filter management.
Fixes: c1e329ebec ("ionic: Add management of rx filters")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use an offset to write the second half of the regs data into the
second half of the buffer instead of overwriting the first half.
Fixes: 4d03e00a21 ("ionic: Add initial ethtool support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_put_padto() can fail. So check for return type and return NULL
for skb. Caller checks for skb and acts correctly if it is NULL.
Fixes: 6d6148bc78 ("net: hsr: fix incorrect lsdu size in the tag of HSR frames for small frames")
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bareudp.rst was written before iproute2 gained support for this new
type of tunnel. Therefore, the sample command lines didn't match the
final iproute2 implementation.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When mlxsw_core_trap_register fails in mlxsw_emad_init,
destroy_workqueue() shouled be called to destroy mlxsw_core->emad_wq.
Fixes: d965465b60 ("mlxsw: core: Fix possible deadlock")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul says:
====================
net/smc: fixes 2020-07-20
Please apply the following patch series for smc to netdev's net tree.
Patch 1 fixes a problem with a buffer that is not put back when the
connection was killed in the meantime.
Patch 2 fixes a wrong behaviour when the maximum dmb buffer count
exceeded.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a current limit of 1920 registered dmb buffers per ISM device
for smc-d. One link group can contain 255 connections, each connection
is using one dmb buffer. When the connection is closed then the
registered buffer is held in a queue and is reused by the next
connection. When a link group is 'full' then another link group is
created and uses an own buffer pool. The link groups are added to a
list using list_add() which puts a new link group to the first position
in the list.
In the situation that many connections are opened (>1920) and a few of
them stay open while others are closed quickly we end up with at least 8
link groups. For a new connection a matching link group is looked up,
iterating over the list of link groups. The trailing 7 link groups
all have registered dmb buffers which could be reused, while the first
link group has only a few dmb buffers and then hit the 1920 limit.
Because the first link group is not full (255 connection limit not
reached) it is chosen and finally the connection falls back to TCP
because there is no dmb buffer available in this link group.
There are multiple ways to fix that: using list_add_tail() allows
to scan older link groups first for free buffers which ensures that
buffers are reused first. This fixes the problem for smc-r link groups
as well. For smc-d there is an even better way to address this problem
because smc-d does not have the 255 connections per link group limit.
So fix the problem for smc-d by allowing large link groups.
Fixes: c6ba7c9ba4 ("net/smc: add base infrastructure for SMC-D and ISM")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To get a send slot smc_wr_tx_get_free_slot() is called, which might
wait for a free slot. When smc_wr_tx_get_free_slot() returns there is a
check if the connection was killed in the meantime. In that case don't
only return an error, but also put back the free slot.
Fixes: b290098092 ("net/smc: cancel send and receive for terminated socket")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rxrpc_sendmsg() returns EPIPE if there's an outstanding error, such as if
rxrpc_recvmsg() indicating ENODATA if there's nothing for it to read.
Change rxrpc_recvmsg() to return EAGAIN instead if there's nothing to read
as this particular error doesn't get stored in ->sk_err by the networking
core.
Also change rxrpc_sendmsg() so that it doesn't fail with delayed receive
errors (there's no way for it to report which call, if any, the error was
caused by).
Fixes: 17926a7932 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Schmidt says:
====================
pull-request: ieee802154 for net 2020-07-20
An update from ieee802154 for your *net* tree.
A potential memory leak fix for adf7242 from Liu Jian,
and one more HTTPS link change from Alexander A. Klimov.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver forgets to call clk_disable_unprepare() in error path after
a success calling for clk_prepare_enable().
Fix to goto err_clk_disable if clk_prepare_enable() is successful.
Fixes: c80d36ff63 ("net: bcmgenet: Use devm_clk_get_optional() to get the clocks")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver forgets to call clk_disable_unprepare() in error path after
a success calling for clk_prepare_enable().
Fix to goto err_clk_disable if clk_prepare_enable() is successful.
Fixes: 99d55638d4 ("net: bcmgenet: enable NETIF_F_HIGHDMA flag")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull VFIO fix from Alex Williamson:
"Fix race with eventfd ctx cleared outside of mutex (Zeng Tao)"
* tag 'vfio-v5.8-rc7' of git://github.com/awilliam/linux-vfio:
vfio/pci: fix racy on error and request eventfd ctx
Count pages after possibly truncating the iterator to the maximum zone
append size, not before.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Avoid the compilation warning "Variable 'ret' is reassigned a value
before the old one has been used." in zonefs_create_zgroup() by setting
ret for the error path only if an error happens.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this.
Fixes: 1e15687ea4 ("staging: comedi: addi_apci_1564: add Change-of-State interrupt subdevice and required functions")
Cc: <stable@vger.kernel.org> #3.17+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-4-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this, adjusting the checks
for invalid channels so that enabled channel bits that would have been
lost by shifting are also checked for validity. Only channels 0 to 15
are valid.
Fixes: a8c66b684e ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions")
Cc: <stable@vger.kernel.org> #4.0+: ef75e14a6c: staging: comedi: verify array index is correct before using it
Cc: <stable@vger.kernel.org> #4.0+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-5-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this.
Fixes: 33cdce6293 ("staging: comedi: addi_apci_1032: conform to new INSN_CONFIG_DIGITAL_TRIG")
Cc: <stable@vger.kernel.org> #3.8+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-3-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
`ni6527_intr_insn_config()` processes `INSN_CONFIG` comedi instructions
for the "interrupt" subdevice. When `data[0]` is
`INSN_CONFIG_DIGITAL_TRIG` it is configuring the digital trigger. When
`data[2]` is `COMEDI_DIGITAL_TRIG_ENABLE_EDGES` it is configuring rising
and falling edge detection for the digital trigger, using a base channel
number (or shift amount) in `data[3]`, a rising edge bitmask in
`data[4]` and falling edge bitmask in `data[5]`.
If the base channel number (shift amount) is greater than or equal to
the number of channels (24) of the digital input subdevice, there are no
changes to the rising and falling edges, so the mask of channels to be
changed can be set to 0, otherwise the mask of channels to be changed,
and the rising and falling edge bitmasks are shifted by the base channel
number before calling `ni6527_set_edge_detection()` to change the
appropriate registers. Unfortunately, the code is comparing the base
channel (shift amount) to the interrupt subdevice's number of channels
(1) instead of the digital input subdevice's number of channels (24).
Fix it by comparing to 32 because all shift amounts for an `unsigned
int` must be less than that and everything from bit 24 upwards is
ignored by `ni6527_set_edge_detection()` anyway.
Fixes: 110f9e687c ("staging: comedi: ni_6527: support INSN_CONFIG_DIGITAL_TRIG")
Cc: <stable@vger.kernel.org> # 3.17+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-2-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Oded writes:
This tag contains a single bug fix for 5.8-rc7:
- Check that an index is in valid range before using it to access an
array. The index is received from the user. This is to prevent a
possible out-of-bounds access error.
* tag 'misc-habanalabs-fixes-2020-07-19' of git://people.freedesktop.org/~gabbayo/linux:
habanalabs: prevent possible out-of-bounds array access
Moritz writes:
FPGA manager fixes for 5.8
Here are two (late) dfl fixes for the the 5.8 release.
Matthew's fix addresses an issue in the reset of a port.
Xu'x fix addresses a linter warning.
All patches have been reviewed on the mailing list, and have been in the
last few linux-next releases (as part of my fixes branch) without issues.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
* tag 'fpga-late-fixes-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga:
fpga: dfl: fix bug in port reset handshake
fpga: dfl: pci: reduce the scope of variable 'ret'
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Link: https://lore.kernel.org/r/20200719113142.58304-1-grandmaster@al2klimov.de
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
The current pin muxing scheme muxes GPIO_1 pad for USB_OTG_ID
because of which when card is inserted, usb otg is enumerated
and the card is never detected.
[ 64.492645] cfg80211: failed to load regulatory.db
[ 64.492657] imx-sdma 20ec000.sdma: external firmware not found, using ROM firmware
[ 76.343711] ci_hdrc ci_hdrc.0: EHCI Host Controller
[ 76.349742] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 2
[ 76.388862] ci_hdrc ci_hdrc.0: USB 2.0 started, EHCI 1.00
[ 76.396650] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.08
[ 76.405412] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 76.412763] usb usb2: Product: EHCI Host Controller
[ 76.417666] usb usb2: Manufacturer: Linux 5.8.0-rc1-next-20200618 ehci_hcd
[ 76.424623] usb usb2: SerialNumber: ci_hdrc.0
[ 76.431755] hub 2-0:1.0: USB hub found
[ 76.435862] hub 2-0:1.0: 1 port detected
The TRM mentions GPIO_1 pad should be muxed/assigned for card detect
and ENET_RX_ER pad for USB_OTG_ID for proper operation.
This patch fixes pin muxing as per TRM and is tested on a
i.Core 1.5 MX6 DL SOM.
[ 22.449165] mmc0: host does not support reading read-only switch, assuming write-enable
[ 22.459992] mmc0: new high speed SDHC card at address 0001
[ 22.469725] mmcblk0: mmc0:0001 EB1QT 29.8 GiB
[ 22.478856] mmcblk0: p1 p2
Fixes: 6df11287f7 ("ARM: dts: imx6q: Add Engicam i.CoreM6 Quad/Dual initial support")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
Signed-off-by: Suniel Mahesh <sunil@amarulasolutions.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Commit 0672d22a19 ("ARM: dts: imx: Fix the AR803X phy-mode") fixed the
phy-mode for fec1, but missed to fix it for the fec2 node.
Fix fec2 to also use "rgmii-id" as the phy-mode.
Cc: <stable@vger.kernel.org>
Fixes: 0672d22a19 ("ARM: dts: imx: Fix the AR803X phy-mode")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Commit 0672d22a19 ("ARM: dts: imx: Fix the AR803X phy-mode") fixed the
phy-mode for fec1, but missed to fix it for the fec2 node.
Fix fec2 to also use "rgmii-id" as the phy-mode.
Cc: <stable@vger.kernel.org>
Fixes: 0672d22a19 ("ARM: dts: imx: Fix the AR803X phy-mode")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The commit below caused a regression for clearfog-gt-8k, where the link
between the switch and the host does not come up.
Investigation revealed two issues:
- MV88E6xxx DSA no longer allows an in-band link to come up as the link
is programmed to be forced down. Commit "net: dsa: mv88e6xxx: fix
in-band AN link establishment" addresses this.
- The dts configured dissimilar link modes at each end of the host to
switch link; the host was configured using a fixed link (so has no
in-band status) and the switch was configured to expect in-band
status.
With both issues fixed, the regression is resolved.
Fixes: 34b5e6a33c ("net: dsa: mv88e6xxx: Configure MAC when using fixed link")
Reported-by: Martin Rowe <martin.p.rowe@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
If in-band negotiation or fixed-link modes are specified for a DSA
port, the DSA code will force the link down during initialisation. For
fixed-link mode, this is fine, as phylink will manage the link state.
However, for in-band mode, phylink expects the PCS to detect link,
which will not happen if the link is forced down.
There is a related issue that in in-band mode, the link could come up
while we are making configuration changes, so we should force the link
down prior to reconfiguring the interface mode.
This patch addresses both issues.
Fixes: 3be98b2d5f ("net: dsa: Down cpu/dsa ports phylink will control")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a regression encountered while running the
gdb.base/corefile.exp test in GDB's test suite.
In my testing, the typo prevented the sw_reserved field of struct
fxregs_state from being output to the kernel XSAVES area. Thus the
correct mask corresponding to XCR0 was not present in the core file for
GDB to interrogate, resulting in the following behavior:
[kev@f32-1 gdb]$ ./gdb -q testsuite/outputs/gdb.base/corefile/corefile testsuite/outputs/gdb.base/corefile/corefile.core
Reading symbols from testsuite/outputs/gdb.base/corefile/corefile...
[New LWP 232880]
warning: Unexpected size of section `.reg-xstate/232880' in core file.
With the typo fixed, the test works again as expected.
Signed-off-by: Kevin Buettner <kevinb@redhat.com>
Fixes: 9e46365459 ("copy_xstate_to_kernel(): don't leave parts of destination uninitialized")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Karsten Graul says:
====================
net/smc: fixes 2020-07-16
Please apply the following patch series for smc to netdev's net tree.
The patches address problems caused by late or unexpected link layer
control packets, dma sync calls for unmapped memory, freed buffers
that are not removed from the buffer list and a possible null pointer
access that results in a crash.
v1->v2: in patch 4, improve patch description and correct the comment
for the new mutex
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When a listen socket is closed then all non-accepted sockets in its
accept queue are to be released. Inside __smc_release() the helper
smc_restore_fallback_changes() restores the changes done to the socket
without to check if the clcsocket has a file set. This can result in
a crash. Fix this by checking the file pointer first.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: f536dffc0b ("net/smc: fix closing of fallback SMC sockets")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two buffers are allocated for each SMC connection. Each buffer is
added to a buffer list after creation. When the second buffer
allocation fails, the first buffer is freed but not deleted from
the list. This might result in crashes when another connection picks
up the freed buffer later and starts to work with it.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: 6511aad3f0 ("net/smc: change smc_buf_free function parameters")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dma related ...sync_sg... functions check the link state before the
dma function is actually called. But the check in smc_link_usable()
allows links in ACTIVATING state which are not yet mapped to dma memory.
Under high load it may happen that the sync_sg functions are called for
such a link which results in an debug output like
DMA-API: mlx5_core 0002:00:00.0: device driver tries to sync
DMA memory it has not allocated [device address=0x0000000103370000]
[size=65536 bytes]
To fix that introduce a helper to check for the link state ACTIVE and
use it where appropriate. And move the link state update to ACTIVATING
to the end of smcr_link_init() when most initial setup is done.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: d854fcbfae ("net/smc: add new link state and related helpers")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As smc client the delete link requests are assigned to the flow when
_any_ flow is active. This may break other flows that do not expect
delete link requests during their handling. Fix that by assigning the
request only when an add link flow is active. With that fix the code
for smc client and smc server is the same, so remove the separate
handling.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: 9ec6bf19ec ("net/smc: llc_del_link_work and use the LLC flow for delete link")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a new ib device is up smc will send an add link invitation to the
peer if needed. This is currently done with rudimentary flow control.
Under high workload these add link invitations can disturb other llc
flows because they arrive unexpected. Fix this by integrating the
invitations into the normal llc event flow and handle them as a flow.
While at it, check for already assigned requests in the flow before
the new add link request is assigned.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: 1f90a05d9f ("net/smc: add smcr_port_add() and smcr_link_up() processing")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To be save from unexpected or late llc response messages check if the
arrived message fits to the current flow type and drop out-of-flow
messages. And drop it when there is already a response assigned to
the flow.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: ef79d439cd ("net/smc: process llc responses in tasklet context")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before an smc ib device is used the first time for an smc link it is
lazily initialized. When there are 2 active link groups and a new ib
device is brought online then it might happen that 2 link creations run
in parallel and enter smc_ib_setup_per_ibdev(). Both allocate new send
and receive completion queues on the device, but only one set of them
keeps assigned and the other leaks.
Fix that by protecting the setup and cleanup code using a mutex.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: f3c1deddb2 ("net/smc: separate function for link initialization")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For new rdma connections the SMC server assigns the link and sends the
link data in the clc accept message. To match the correct link use not
only the qp_num but also the gid and the mac of the links. If there are
equal qp_nums for different links the wrong link would be chosen.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: 0fb0b02bd6 ("net/smc: adapt SMC client code to use the LLC flow")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a link-down condition we notify the SMC server and expect that the
server will finally trigger the link clear processing on the client
side. This could fail when anything along this notification path goes
wrong. Clear the link as part of SMC client link-down processing to
prevent dangling links.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: 541afa10c1 ("net/smc: add smcr_port_err() and smcr_link_down() processing")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A delete link could arrive during confirm link processing. Handle this
situation directly in smc_llc_srv_conf_link() rather than using the
logic in smc_llc_wait() to avoid the unexpected message handling there.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Fixes: 1551c95b61 ("net/smc: final part of add link processing as SMC server")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull perf tooling fixes from Arnaldo Carvalho de Melo:
- Update hashmap.h from libbpf and kvm.h from x86's kernel UAPI.
- Set opt->set in libsubcmd's OPT_CALLBACK_SET(). This fixes
'perf record --switch-output-event event-name' usage"
* tag 'perf-tools-fixes-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
tools arch kvm: Sync kvm headers with the kernel sources
perf tools: Sync hashmap.h with libbpf's
libsubcmd: Fix OPT_CALLBACK_SET()
Pull x86 fixes from Thomas Gleixner:
"A pile of fixes for x86:
- Fix the I/O bitmap invalidation on XEN PV, which was overlooked in
the recent ioperm/iopl rework. This caused the TSS and XEN's I/O
bitmap to get out of sync.
- Use the proper vectors for HYPERV.
- Make disabling of stack protector for the entry code work with GCC
builds which enable stack protector by default. Removing the option
is not sufficient, it needs an explicit -fno-stack-protector to
shut it off.
- Mark check_user_regs() noinstr as it is called from noinstr code.
The missing annotation causes it to be placed in the text section
which makes it instrumentable.
- Add the missing interrupt disable in exc_alignment_check()
- Fixup a XEN_PV build dependency in the 32bit entry code
- A few fixes to make the Clang integrated assembler happy
- Move EFI stub build to the right place for out of tree builds
- Make prepare_exit_to_usermode() static. It's not longer called from
ASM code"
* tag 'x86-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Don't add the EFI stub to targets
x86/entry: Actually disable stack protector
x86/ioperm: Fix io bitmap invalidation on Xen PV
x86: math-emu: Fix up 'cmp' insn for clang ias
x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERV
x86/entry: Add compatibility with IAS
x86/entry/common: Make prepare_exit_to_usermode() static
x86/entry: Mark check_user_regs() noinstr
x86/traps: Disable interrupts in exc_aligment_check()
x86/entry/32: Fix XEN_PV build dependency
Pull timer fixes from Thomas Gleixner:
"Two fixes for the timer wheel:
- A timer which is already expired at enqueue time can set the
base->next_expiry value backwards. As a consequence base->clk can
be set back as well. This can lead to timers expiring early. Add a
sanity check to prevent this.
- When a timer is queued with an expiry time beyond the wheel
capacity then it should be queued in the bucket of the last wheel
level which is expiring last.
The code adjusted the expiry time to the maximum wheel capacity,
which is only correct when the wheel clock is 0. Aside of that the
check whether the delta is larger than wheel capacity does not
check the delta, it checks the expiry value itself. As a result
timers can expire at random.
Fix this by checking the right variable and adjust expiry time so
it becomes base->clock plus capacity which places it into the
outmost bucket in the last wheel level"
* tag 'timers-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timer: Fix wheel index calculation on last level
timer: Prevent base->clk from moving backward
Pull scheduler fixes from Thomas Gleixner:
"A set of scheduler fixes:
- Plug a load average accounting race which was introduced with a
recent optimization casing load average to show bogus numbers.
- Fix the rseq CPU id initialization for new tasks. sched_fork() does
not update the rseq CPU id so the id is the stale id of the parent
task, which can cause user space data corruption.
- Handle a 0 return value of task_h_load() correctly in the load
balancer, which does not decrease imbalance and therefore pulls
until the maximum number of loops is reached, which might be all
tasks just created by a fork bomb"
* tag 'sched-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: handle case of task_h_load() returning 0
sched: Fix unreliable rseq cpu_id for new tasks
sched: Fix loadavg accounting race
Pull irq fixes from Thomas Gleixner:
"Two fixes for the interrupt subsystem:
- Make the handling of the firmware node consistent and do not free
the node after the domain has been created successfully. The core
code stores a pointer to it which can lead to a use after free or
double free.
This used to "work" because the pointer was not stored when the
initial code was written, but at some point later it was required
to store it. Of course nobody noticed that the existing users break
that way.
- Handle affinity setting on inactive interrupts correctly when
hierarchical irq domains are enabled.
When interrupts are inactive with the modern hierarchical irqdomain
design, the interrupt chips are not necessarily in a state where
affinity changes can be handled. The legacy irq chip design allowed
this because interrupts are immediately fully initialized at
allocation time. X86 has a hacky workaround for this, but other
implementations do not.
This cased malfunction on GIC-V3. Instead of playing whack a mole
to find all affected drivers, change the core code to store the
requested affinity setting and then establish it when the interrupt
is allocated, which makes the X86 hack go away"
* tag 'irq-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/affinity: Handle affinity setting on inactive interrupts correctly
irqdomain/treewide: Keep firmware node unconditionally allocated
Pull USB fixes from Greg KH:
"Here are a few small USB fixes, and one thunderbolt fix, for 5.8-rc6.
Nothing huge in here, just the normal collection of gadget, dwc2/3,
serial, and other minor USB driver fixes and id additions. Full
details are in the shortlog.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: iuu_phoenix: fix memory corruption
USB: c67x00: fix use after free in c67x00_giveback_urb
usb: gadget: function: fix missing spinlock in f_uac1_legacy
usb: gadget: udc: atmel: fix uninitialized read in debug printk
usb: gadget: udc: atmel: remove outdated comment in usba_ep_disable()
usb: dwc2: Fix shutdown callback in platform
usb: cdns3: trace: fix some endian issues
usb: cdns3: ep0: fix some endian issues
usb: gadget: udc: gr_udc: fix memleak on error handling path in gr_ep_init()
usb: gadget: fix langid kernel-doc warning in usbstring.c
usb: dwc3: pci: add support for the Intel Jasper Lake
usb: dwc3: pci: add support for the Intel Tiger Lake PCH -H variant
usb: chipidea: core: add wakeup support for extcon
USB: serial: option: add Quectel EG95 LTE modem
thunderbolt: Fix path indices used in USB3 tunnel discovery
USB: serial: ch341: add new Product ID for CH340
USB: serial: option: add GosunCn GM500 series
USB: serial: cypress_m8: enable Simply Automated UPB PIM
Pull dma-mapping fixes from Christoph Hellwig:
"Ensure we always have fully addressable memory in the dma coherent
pool (Nicolas Saenz Julienne)"
* tag 'dma-mapping-5.8-6' of git://git.infradead.org/users/hch/dma-mapping:
dma-pool: do not allocate pool memory from CMA
dma-pool: make sure atomic pool suits device
dma-pool: introduce dma_guess_pool()
dma-pool: get rid of dma_in_atomic_pool()
dma-direct: provide function to check physical memory area validity
p9_fd_open just fgets file descriptors passed in from userspace, but
doesn't verify that they are valid for read or writing. This gets
cought down in the VFS when actually attempting a read or write, but
a new warning added in linux-next upsets syzcaller.
Fix this by just verifying the fds early on.
Link: http://lkml.kernel.org/r/20200710085722.435850-1-hch@lst.de
Reported-by: syzbot+e6f77e16ff68b2434a2c@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
[Dominique: amend goto as per Doug Nazar's review]
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
changeset d0213061a501 ("media: atomisp: fix mask and shift operation on ISPSSPM0")
solved the existing issue with the IUNIT power on code.
So, the driver can now use the right code again.
This reverts commit 95d1f398c4.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Currently the check on bits 25:24 on ISPSSPM0 is always 0 because
the mask and shift operations are incorrect. Fix this by shifting
by MRFLD_ISPSSPM0_ISPSSS_OFFSET (24 bits right) and then masking
with RFLD_ISPSSPM0_ISPSSC_MASK (0x03) to get the appropriate 2 bits
to check.
Addresses-Coverity: ("Operands don't affect result")
Fixes: 0f441fd70b ("media: atomisp: simplify the power down/up code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
vmlinux-objs-y is added to targets, which currently means that the EFI
stub gets added to the targets as well. It shouldn't be added since it
is built elsewhere.
This confuses Makefile.build which interprets the EFI stub as a target
$(obj)/$(objtree)/drivers/firmware/efi/libstub/lib.a
and will create drivers/firmware/efi/libstub/ underneath
arch/x86/boot/compressed, to hold this supposed target, if building
out-of-tree. [0]
Fix this by pulling the stub out of vmlinux-objs-y into efi-obj-y.
[0] See scripts/Makefile.build near the end:
# Create directories for object files if they do not exist
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200715032631.1562882-1-nivedita@alum.mit.edu
Some builds of GCC enable stack protector by default. Simply removing
the arguments is not sufficient to disable stack protector, as the stack
protector for those GCC builds must be explicitly disabled. Remove the
argument removals and add -fno-stack-protector. Additionally include
missed x32 argument updates, and adjust whitespace for readability.
Fixes: 20355e5f73 ("x86/entry: Exclude low level entry code from sanitizing")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/202006261333.585319CA6B@keescook
Instead of declaring all those consts everywhere when the
headers are included, just place them on a single place.
This change shuts up lots of warnings when built with W=1:
In file included from drivers/staging/media/atomisp/pci/ia_css_acc_types.h:23,
from drivers/staging/media/atomisp/pci/ia_css.h:26,
from drivers/staging/media/atomisp/pci/atomisp_compat_css20.h:24,
from drivers/staging/media/atomisp/pci/atomisp_compat.h:22,
from drivers/staging/media/atomisp/pci/atomisp_drvfs.c:23:
./drivers/staging/media/atomisp//pci/system_local.h:193:26: warning: ‘STREAM2MMIO_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
193 | static const hrt_address STREAM2MMIO_CTRL_BASE[N_STREAM2MMIO_ID] = {
| ^~~~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:186:26: warning: ‘PIXELGEN_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
186 | static const hrt_address PIXELGEN_CTRL_BASE[N_PIXELGEN_ID] = {
| ^~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:179:26: warning: ‘CSI_RX_BE_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
179 | static const hrt_address CSI_RX_BE_CTRL_BASE[N_CSI_RX_BACKEND_ID] = {
| ^~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:172:26: warning: ‘CSI_RX_FE_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
172 | static const hrt_address CSI_RX_FE_CTRL_BASE[N_CSI_RX_FRONTEND_ID] = {
| ^~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:165:26: warning: ‘ISYS_IRQ_BASE’ defined but not used [-Wunused-const-variable=]
165 | static const hrt_address ISYS_IRQ_BASE[N_ISYS_IRQ_ID] = {
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:158:26: warning: ‘IBUF_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
158 | static const hrt_address IBUF_CTRL_BASE[N_IBUF_CTRL_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:153:26: warning: ‘RX_BASE’ defined but not used [-Wunused-const-variable=]
153 | static const hrt_address RX_BASE[N_RX_ID] = {
| ^~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:138:26: warning: ‘INPUT_SYSTEM_BASE’ defined but not used [-Wunused-const-variable=]
138 | static const hrt_address INPUT_SYSTEM_BASE[N_INPUT_SYSTEM_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:130:26: warning: ‘INPUT_FORMATTER_BASE’ defined but not used [-Wunused-const-variable=]
130 | static const hrt_address INPUT_FORMATTER_BASE[N_INPUT_FORMATTER_ID] = {
| ^~~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:125:26: warning: ‘TIMED_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
125 | static const hrt_address TIMED_CTRL_BASE[N_TIMED_CTRL_ID] = {
| ^~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:120:26: warning: ‘GPIO_BASE’ defined but not used [-Wunused-const-variable=]
120 | static const hrt_address GPIO_BASE[N_GPIO_ID] = {
| ^~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:116:26: warning: ‘GP_TIMER_BASE’ defined but not used [-Wunused-const-variable=]
116 | static const hrt_address GP_TIMER_BASE =
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:109:26: warning: ‘GP_DEVICE_BASE’ defined but not used [-Wunused-const-variable=]
109 | static const hrt_address GP_DEVICE_BASE[N_GP_DEVICE_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:96:26: warning: ‘FIFO_MONITOR_BASE’ defined but not used [-Wunused-const-variable=]
96 | static const hrt_address FIFO_MONITOR_BASE[N_FIFO_MONITOR_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:90:26: warning: ‘GDC_BASE’ defined but not used [-Wunused-const-variable=]
90 | static const hrt_address GDC_BASE[N_GDC_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:78:26: warning: ‘IRQ_BASE’ defined but not used [-Wunused-const-variable=]
78 | static const hrt_address IRQ_BASE[N_IRQ_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:73:26: warning: ‘ISYS2401_DMA_BASE’ defined but not used [-Wunused-const-variable=]
73 | static const hrt_address ISYS2401_DMA_BASE[N_ISYS2401_DMA_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:69:26: warning: ‘DMA_BASE’ defined but not used [-Wunused-const-variable=]
69 | static const hrt_address DMA_BASE[N_DMA_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:63:26: warning: ‘MMU_BASE’ defined but not used [-Wunused-const-variable=]
63 | static const hrt_address MMU_BASE[N_MMU_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:54:26: warning: ‘SP_DMEM_BASE’ defined but not used [-Wunused-const-variable=]
54 | static const hrt_address SP_DMEM_BASE[N_SP_ID] = {
| ^~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:50:26: warning: ‘SP_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
50 | static const hrt_address SP_CTRL_BASE[N_SP_ID] = {
| ^~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:45:26: warning: ‘ISP_BAMEM_BASE’ defined but not used [-Wunused-const-variable=]
45 | static const hrt_address ISP_BAMEM_BASE[N_BAMEM_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:41:26: warning: ‘ISP_DMEM_BASE’ defined but not used [-Wunused-const-variable=]
41 | static const hrt_address ISP_DMEM_BASE[N_ISP_ID] = {
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:37:26: warning: ‘ISP_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
37 | static const hrt_address ISP_CTRL_BASE[N_ISP_ID] = {
| ^~~~~~~~~~~~~
In file included from drivers/staging/media/atomisp/pci/ia_css_acc_types.h:23,
from drivers/staging/media/atomisp/pci/ia_css.h:26,
from drivers/staging/media/atomisp/pci/atomisp_file.c:27:
./drivers/staging/media/atomisp//pci/system_local.h:193:26: warning: ‘STREAM2MMIO_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
193 | static const hrt_address STREAM2MMIO_CTRL_BASE[N_STREAM2MMIO_ID] = {
| ^~~~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:186:26: warning: ‘PIXELGEN_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
186 | static const hrt_address PIXELGEN_CTRL_BASE[N_PIXELGEN_ID] = {
| ^~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:179:26: warning: ‘CSI_RX_BE_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
179 | static const hrt_address CSI_RX_BE_CTRL_BASE[N_CSI_RX_BACKEND_ID] = {
| ^~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:172:26: warning: ‘CSI_RX_FE_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
172 | static const hrt_address CSI_RX_FE_CTRL_BASE[N_CSI_RX_FRONTEND_ID] = {
| ^~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:165:26: warning: ‘ISYS_IRQ_BASE’ defined but not used [-Wunused-const-variable=]
165 | static const hrt_address ISYS_IRQ_BASE[N_ISYS_IRQ_ID] = {
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:158:26: warning: ‘IBUF_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
158 | static const hrt_address IBUF_CTRL_BASE[N_IBUF_CTRL_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:153:26: warning: ‘RX_BASE’ defined but not used [-Wunused-const-variable=]
153 | static const hrt_address RX_BASE[N_RX_ID] = {
| ^~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:138:26: warning: ‘INPUT_SYSTEM_BASE’ defined but not used [-Wunused-const-variable=]
138 | static const hrt_address INPUT_SYSTEM_BASE[N_INPUT_SYSTEM_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:130:26: warning: ‘INPUT_FORMATTER_BASE’ defined but not used [-Wunused-const-variable=]
130 | static const hrt_address INPUT_FORMATTER_BASE[N_INPUT_FORMATTER_ID] = {
| ^~~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:125:26: warning: ‘TIMED_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
125 | static const hrt_address TIMED_CTRL_BASE[N_TIMED_CTRL_ID] = {
| ^~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:120:26: warning: ‘GPIO_BASE’ defined but not used [-Wunused-const-variable=]
120 | static const hrt_address GPIO_BASE[N_GPIO_ID] = {
| ^~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:116:26: warning: ‘GP_TIMER_BASE’ defined but not used [-Wunused-const-variable=]
116 | static const hrt_address GP_TIMER_BASE =
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:109:26: warning: ‘GP_DEVICE_BASE’ defined but not used [-Wunused-const-variable=]
109 | static const hrt_address GP_DEVICE_BASE[N_GP_DEVICE_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:96:26: warning: ‘FIFO_MONITOR_BASE’ defined but not used [-Wunused-const-variable=]
96 | static const hrt_address FIFO_MONITOR_BASE[N_FIFO_MONITOR_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:90:26: warning: ‘GDC_BASE’ defined but not used [-Wunused-const-variable=]
90 | static const hrt_address GDC_BASE[N_GDC_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:78:26: warning: ‘IRQ_BASE’ defined but not used [-Wunused-const-variable=]
78 | static const hrt_address IRQ_BASE[N_IRQ_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:73:26: warning: ‘ISYS2401_DMA_BASE’ defined but not used [-Wunused-const-variable=]
73 | static const hrt_address ISYS2401_DMA_BASE[N_ISYS2401_DMA_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:69:26: warning: ‘DMA_BASE’ defined but not used [-Wunused-const-variable=]
69 | static const hrt_address DMA_BASE[N_DMA_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:63:26: warning: ‘MMU_BASE’ defined but not used [-Wunused-const-variable=]
63 | static const hrt_address MMU_BASE[N_MMU_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:54:26: warning: ‘SP_DMEM_BASE’ defined but not used [-Wunused-const-variable=]
54 | static const hrt_address SP_DMEM_BASE[N_SP_ID] = {
| ^~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:50:26: warning: ‘SP_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
50 | static const hrt_address SP_CTRL_BASE[N_SP_ID] = {
| ^~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:45:26: warning: ‘ISP_BAMEM_BASE’ defined but not used [-Wunused-const-variable=]
45 | static const hrt_address ISP_BAMEM_BASE[N_BAMEM_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:41:26: warning: ‘ISP_DMEM_BASE’ defined but not used [-Wunused-const-variable=]
41 | static const hrt_address ISP_DMEM_BASE[N_ISP_ID] = {
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:37:26: warning: ‘ISP_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
37 | static const hrt_address ISP_CTRL_BASE[N_ISP_ID] = {
| ^~~~~~~~~~~~~
In file included from ./drivers/staging/media/atomisp//pci/ia_css_acc_types.h:23,
from ./drivers/staging/media/atomisp//pci/ia_css_pipe_public.h:29,
from drivers/staging/media/atomisp/pci/sh_css_legacy.h:23,
from drivers/staging/media/atomisp/pci/atomisp_internal.h:34,
from drivers/staging/media/atomisp/pci/atomisp_cmd.h:30,
from drivers/staging/media/atomisp/pci/atomisp_csi2.c:21:
./drivers/staging/media/atomisp//pci/system_local.h:193:26: warning: ‘STREAM2MMIO_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
193 | static const hrt_address STREAM2MMIO_CTRL_BASE[N_STREAM2MMIO_ID] = {
| ^~~~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:186:26: warning: ‘PIXELGEN_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
186 | static const hrt_address PIXELGEN_CTRL_BASE[N_PIXELGEN_ID] = {
| ^~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:179:26: warning: ‘CSI_RX_BE_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
179 | static const hrt_address CSI_RX_BE_CTRL_BASE[N_CSI_RX_BACKEND_ID] = {
| ^~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:172:26: warning: ‘CSI_RX_FE_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
172 | static const hrt_address CSI_RX_FE_CTRL_BASE[N_CSI_RX_FRONTEND_ID] = {
| ^~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:165:26: warning: ‘ISYS_IRQ_BASE’ defined but not used [-Wunused-const-variable=]
165 | static const hrt_address ISYS_IRQ_BASE[N_ISYS_IRQ_ID] = {
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:158:26: warning: ‘IBUF_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
158 | static const hrt_address IBUF_CTRL_BASE[N_IBUF_CTRL_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:153:26: warning: ‘RX_BASE’ defined but not used [-Wunused-const-variable=]
153 | static const hrt_address RX_BASE[N_RX_ID] = {
| ^~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:138:26: warning: ‘INPUT_SYSTEM_BASE’ defined but not used [-Wunused-const-variable=]
138 | static const hrt_address INPUT_SYSTEM_BASE[N_INPUT_SYSTEM_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:130:26: warning: ‘INPUT_FORMATTER_BASE’ defined but not used [-Wunused-const-variable=]
130 | static const hrt_address INPUT_FORMATTER_BASE[N_INPUT_FORMATTER_ID] = {
| ^~~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:125:26: warning: ‘TIMED_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
125 | static const hrt_address TIMED_CTRL_BASE[N_TIMED_CTRL_ID] = {
| ^~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:120:26: warning: ‘GPIO_BASE’ defined but not used [-Wunused-const-variable=]
120 | static const hrt_address GPIO_BASE[N_GPIO_ID] = {
| ^~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:116:26: warning: ‘GP_TIMER_BASE’ defined but not used [-Wunused-const-variable=]
116 | static const hrt_address GP_TIMER_BASE =
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:109:26: warning: ‘GP_DEVICE_BASE’ defined but not used [-Wunused-const-variable=]
109 | static const hrt_address GP_DEVICE_BASE[N_GP_DEVICE_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:96:26: warning: ‘FIFO_MONITOR_BASE’ defined but not used [-Wunused-const-variable=]
96 | static const hrt_address FIFO_MONITOR_BASE[N_FIFO_MONITOR_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:90:26: warning: ‘GDC_BASE’ defined but not used [-Wunused-const-variable=]
90 | static const hrt_address GDC_BASE[N_GDC_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:78:26: warning: ‘IRQ_BASE’ defined but not used [-Wunused-const-variable=]
78 | static const hrt_address IRQ_BASE[N_IRQ_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:73:26: warning: ‘ISYS2401_DMA_BASE’ defined but not used [-Wunused-const-variable=]
73 | static const hrt_address ISYS2401_DMA_BASE[N_ISYS2401_DMA_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:69:26: warning: ‘DMA_BASE’ defined but not used [-Wunused-const-variable=]
69 | static const hrt_address DMA_BASE[N_DMA_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:63:26: warning: ‘MMU_BASE’ defined but not used [-Wunused-const-variable=]
63 | static const hrt_address MMU_BASE[N_MMU_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:54:26: warning: ‘SP_DMEM_BASE’ defined but not used [-Wunused-const-variable=]
54 | static const hrt_address SP_DMEM_BASE[N_SP_ID] = {
| ^~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:50:26: warning: ‘SP_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
50 | static const hrt_address SP_CTRL_BASE[N_SP_ID] = {
| ^~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:45:26: warning: ‘ISP_BAMEM_BASE’ defined but not used [-Wunused-const-variable=]
45 | static const hrt_address ISP_BAMEM_BASE[N_BAMEM_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:41:26: warning: ‘ISP_DMEM_BASE’ defined but not used [-Wunused-const-variable=]
41 | static const hrt_address ISP_DMEM_BASE[N_ISP_ID] = {
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:37:26: warning: ‘ISP_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
37 | static const hrt_address ISP_CTRL_BASE[N_ISP_ID] = {
| ^~~~~~~~~~~~~
In file included from ./drivers/staging/media/atomisp//pci/ia_css_acc_types.h:23,
from ./drivers/staging/media/atomisp//pci/ia_css_pipe_public.h:29,
from drivers/staging/media/atomisp/pci/sh_css_legacy.h:23,
from drivers/staging/media/atomisp/pci/atomisp_internal.h:34,
from drivers/staging/media/atomisp/pci/atomisp_acc.h:23,
from drivers/staging/media/atomisp/pci/atomisp_acc.c:29:
./drivers/staging/media/atomisp//pci/system_local.h:193:26: warning: ‘STREAM2MMIO_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
193 | static const hrt_address STREAM2MMIO_CTRL_BASE[N_STREAM2MMIO_ID] = {
| ^~~~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:186:26: warning: ‘PIXELGEN_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
186 | static const hrt_address PIXELGEN_CTRL_BASE[N_PIXELGEN_ID] = {
| ^~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:179:26: warning: ‘CSI_RX_BE_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
179 | static const hrt_address CSI_RX_BE_CTRL_BASE[N_CSI_RX_BACKEND_ID] = {
| ^~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:172:26: warning: ‘CSI_RX_FE_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
172 | static const hrt_address CSI_RX_FE_CTRL_BASE[N_CSI_RX_FRONTEND_ID] = {
| ^~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:165:26: warning: ‘ISYS_IRQ_BASE’ defined but not used [-Wunused-const-variable=]
165 | static const hrt_address ISYS_IRQ_BASE[N_ISYS_IRQ_ID] = {
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:158:26: warning: ‘IBUF_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
158 | static const hrt_address IBUF_CTRL_BASE[N_IBUF_CTRL_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:153:26: warning: ‘RX_BASE’ defined but not used [-Wunused-const-variable=]
153 | static const hrt_address RX_BASE[N_RX_ID] = {
| ^~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:138:26: warning: ‘INPUT_SYSTEM_BASE’ defined but not used [-Wunused-const-variable=]
138 | static const hrt_address INPUT_SYSTEM_BASE[N_INPUT_SYSTEM_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:130:26: warning: ‘INPUT_FORMATTER_BASE’ defined but not used [-Wunused-const-variable=]
130 | static const hrt_address INPUT_FORMATTER_BASE[N_INPUT_FORMATTER_ID] = {
| ^~~~~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:125:26: warning: ‘TIMED_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
125 | static const hrt_address TIMED_CTRL_BASE[N_TIMED_CTRL_ID] = {
| ^~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:120:26: warning: ‘GPIO_BASE’ defined but not used [-Wunused-const-variable=]
120 | static const hrt_address GPIO_BASE[N_GPIO_ID] = {
| ^~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:116:26: warning: ‘GP_TIMER_BASE’ defined but not used [-Wunused-const-variable=]
116 | static const hrt_address GP_TIMER_BASE =
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:109:26: warning: ‘GP_DEVICE_BASE’ defined but not used [-Wunused-const-variable=]
109 | static const hrt_address GP_DEVICE_BASE[N_GP_DEVICE_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:96:26: warning: ‘FIFO_MONITOR_BASE’ defined but not used [-Wunused-const-variable=]
96 | static const hrt_address FIFO_MONITOR_BASE[N_FIFO_MONITOR_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:90:26: warning: ‘GDC_BASE’ defined but not used [-Wunused-const-variable=]
90 | static const hrt_address GDC_BASE[N_GDC_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:78:26: warning: ‘IRQ_BASE’ defined but not used [-Wunused-const-variable=]
78 | static const hrt_address IRQ_BASE[N_IRQ_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:73:26: warning: ‘ISYS2401_DMA_BASE’ defined but not used [-Wunused-const-variable=]
73 | static const hrt_address ISYS2401_DMA_BASE[N_ISYS2401_DMA_ID] = {
| ^~~~~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:69:26: warning: ‘DMA_BASE’ defined but not used [-Wunused-const-variable=]
69 | static const hrt_address DMA_BASE[N_DMA_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:63:26: warning: ‘MMU_BASE’ defined but not used [-Wunused-const-variable=]
63 | static const hrt_address MMU_BASE[N_MMU_ID] = {
| ^~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:54:26: warning: ‘SP_DMEM_BASE’ defined but not used [-Wunused-const-variable=]
54 | static const hrt_address SP_DMEM_BASE[N_SP_ID] = {
| ^~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:50:26: warning: ‘SP_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
50 | static const hrt_address SP_CTRL_BASE[N_SP_ID] = {
| ^~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:45:26: warning: ‘ISP_BAMEM_BASE’ defined but not used [-Wunused-const-variable=]
45 | static const hrt_address ISP_BAMEM_BASE[N_BAMEM_ID] = {
| ^~~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:41:26: warning: ‘ISP_DMEM_BASE’ defined but not used [-Wunused-const-variable=]
41 | static const hrt_address ISP_DMEM_BASE[N_ISP_ID] = {
| ^~~~~~~~~~~~~
./drivers/staging/media/atomisp//pci/system_local.h:37:26: warning: ‘ISP_CTRL_BASE’ defined but not used [-Wunused-const-variable=]
37 | static const hrt_address ISP_CTRL_BASE[N_ISP_ID] = {
| ^~~~~~~~~~~~~
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
After removing the unused 32-bits data, the isp2401_system_local.h
now contains everything that it is needed, either by isp2401 or
by isp2400.
So, remove code duplication.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Right now, there are two versions of system_global.h headers.
Both share a lot of common code. There are some ISP2401 specific
types on one of the headers, but it doesn't conflict with the
ISP2400 ones.
Also, the common code is identical.
So, remove code duplication by moving such code into a
common header.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
There is an abstraction at the code in order to support
32 or 64 bits address/data length. However, for all
Atom chipsets supported by this version, the size is fixed.
So, cleanup the mess, removing the uused code and placing
the data sizes on a single place.
The end goal is to completely remove those local/global
headers, replacing them by some ISP-version dependent struct,
in order for the driver to decide what version it would need
in runtime.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
There are several static vars declared inside the
system local headers. This causes lots of warnings when W=1.
Remove the unused ones.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
If gmin_camera_platform_data() returns NULL then we should return a
negative error instead of success.
Fixes: 90ebe55ab8 ("media: staging: atomisp: Add driver prefix to Kconfig option and module names")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
This patch fixes the checkpatch.pl warning:
Prefer using '"%s...", __func__' to using '<function name>',
this function's name, in a string
Signed-off-by: Baidyanath Kundu <kundubaidya99@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
I'm pretty sure I named this right, but it sounds that I ended
doing something weird maybe while solving some conflict.
So, fix the title of this config var.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Queue index is received from the user. Therefore, we must validate it
before using it to access the queue props array.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Pull SCSI fix from James Bottomley:
"One small driver fix. Although the one liner makes it sound like a
cosmetic change, it's a regression fix for the megaraid_sas driver"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: megaraid_sas: Remove undefined ENABLE_IRQ_POLL macro
We currently filter these for timeout_remove/async_cancel/files_update,
but we only should be filtering for fixed file and buffer select. This
also causes a second read of sqe->flags, which isn't needed.
Just check req->flags for the relevant bits. This then allows these
commands to be used in links, for example, like everything else.
Signed-off-by: Daniele Albano <d.albano@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull hwmon fixes from Guenter Roeck:
- Using SCT on some Tohsiba drives causes firmware hangs. Disable its
use in the drivetemp driver.
- Handle potential buffer overflows in scmi and aspeed-pwm-tacho
driver.
- Energy reporting does not work well on all AMD CPUs. Restrict
amd_energy to known working models.
- Enable reading the CPU temperature on NCT6798D using undocumented
registers.
- Fix read errors seen if PEC is enabled in adm1275 driver.
- Fix setting the pwm1_enable in emc2103 driver.
* tag 'hwmon-for-v5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (drivetemp) Avoid SCT usage on Toshiba DT01ACA family drives
hwmon: (scmi) Fix potential buffer overflow in scmi_hwmon_probe()
hwmon: (nct6775) Accept PECI Calibration as temperature source for NCT6798D
hwmon: (adm1275) Make sure we are reading enough data for different chips
hwmon: (emc2103) fix unable to change fan pwm1_enable attribute
hwmon: (amd_energy) match for supported models
hwmon: (aspeed-pwm-tacho) Avoid possible buffer overflow
Pull RISC-V fixes from Palmer Dabbelt:
"Two fixes:
- 16KiB kernel stacks on rv64, which fixes a lot of crashes.
- Rolling an mmiowb() into the scheduler, which when combined with
Will's fix to the mmiowb()-on-spinlock should fix the PREEMPT
issues we've been seeing"
* tag 'riscv-for-linus-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Upgrade smp_mb__after_spinlock() to iorw,iorw
riscv: use 16KB kernel stack on 64-bit
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.8:
- A fix to the VAS code we merged this cycle, to report the proper
error code to userspace for address translation failures. And a
selftest update to match.
- Another fix for our pkey handling of PROT_EXEC mappings.
- A fix for a crash when booting a "secure VM" under an ultravisor
with certain numbers of CPUs.
Thanks to: Aneesh Kumar K.V, Haren Myneni, Laurent Dufour, Sandipan
Das, Satheesh Rajendran, Thiago Jung Bauermann"
* tag 'powerpc-5.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Use proper error code to check fault address
powerpc/vas: Report proper error code for address translation failure
powerpc/pseries/svm: Fix incorrect check for shared_lppaca_size
powerpc/book3s64/pkeys: Fix pkey_access_permitted() for execute disable pkey
It has been observed that Toshiba DT01ACA family drives have
WRITE FPDMA QUEUED command timeouts and sometimes just freeze until
power-cycled under heavy write loads when their temperature is getting
polled in SCT mode. The SMART mode seems to be fine, though.
Let's make sure we don't use SCT mode for these drives then.
While only the 3 TB model was actually caught exhibiting the problem let's
play safe here to avoid data corruption and extend the ban to the whole
family.
Fixes: 5b46903d8b ("hwmon: Driver for disk and solid state drives with temperature sensors")
Cc: stable@vger.kernel.org
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Link: https://lore.kernel.org/r/0cb2e7022b66c6d21d3f189a12a97878d0e7511b.1595075458.git.mail@maciej.szmigiero.name
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Don't populate const arrays on the stack but instead make them
static. Makes the object code smaller by 150 bytes.
Before:
text data bss dec hex filename
111083 23692 64 134839 20eb7 atomisp/pci/atomisp_compat_css20.o
After:
text data bss dec hex filename
110773 23852 64 134689 20e21 atomisp/pci/atomisp_compat_css20.o
After:
(gcc version 9.3.0, amd64)
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Compiler is not happy about leftovers:
cc1: warning: .../pci/hrt/: No such file or directory [-Wmissing-include-dirs]
cc1: warning: .../pci/hive_isp_css_include/memory_access/: No such file or directory [-Wmissing-include-dirs]
cc1: warning: .../pci/css_2400_system/hrt/: No such file or directory [-Wmissing-include-dirs]
Drop them from Makefile.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
First of all ACPI HID is a part of the device name which is printed
as a part of the dev_info(dev, ...); line. Second, since the only BID
is left, it's a part of ACPI path, which can be printed via %pfw.
Besides that, drop ACPI handle from atomisp_get_acpi_power() parameters.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Use temporary variable for device in gmin_subdev_add().
While here, drop unused temporary variable for device in other places.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Refactor PMIC detection to a separate function. In the future
we may move this code somewhere else where it's more suitable.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Deduplicate return ret in gmin_i2c_write().
While here, replace 'Kernel' by 'kernel' in the message and
reduce amount of LOCs.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
When we enumerate second device when PMIC has been successfully detected
to AXP, we got into troubles dereferencing NULL pointer. It seems
we have to detect PMIC only once because pmic_id is a global variable
and code doesn't expect the change of it: Two PMICs on one platform?
It's impossible.
Crash excerpt:
[ 34.335237] BUG: kernel NULL pointer dereference, address: 0000000000000002
...
[ 35.652036] RIP: 0010:gmin_subdev_add.cold+0x32f/0x33e [atomisp_gmin_platform]
So, as a quick fix make power a global variable. In next patches we move
PMIC detection to be more independent from subdevices.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
There are devices with completely different _DSM() format,
and accessing object as a package leads to crashes.
Bail out in the case of unexpected object type.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Unify pdev to be pointer to struct pci_device.
While here, reindent some (touched) lines for better readability.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
struct atomisp device has struct device and struct pci_dev pointers
which are basically duplicates of each other. Drop the latter
in favour of the former.
While here, unify pdev to be pointer to struct pci_device and reindent
some (touched) lines for better readability.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
There is no need to pass a pointer to struct device_driver
when we have an access to struct device already.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
There are specific ACPI and I\xB2C APIs to match device by different
parameters, such as ACPI HID, and retrieve an I\xB2C client.
Use them instead of home grown approach.
Note, it fixes a resource leak as well.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
IOSF MBI header contains a lot of definitions, such as
end point addresses of IPs. Move CCK address from AtomISP driver
to generic header.
While here, drop unused one.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Instead of parsing GPIO for two exceptions inside the logic
which would be trying to use the GPIO, move the init code
to the routine which adds a new gmin device.
Move the notes to it too, as this helps to identify where
those two GPIO settings are used, explaining why they're
defaulting to -1 when not found.
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
There's only one place where a subdev can be added: when the
sensor driver registers it. Trying to do it elsewhere will
cause problems, as the detection code needs to access the
I2C bus in order to probe some things.
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
If the ACPI DSDT tables provide _CRS for the camera, the
GPIO core code should be able to handle them automatically.
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
The media core has now a check if fi->pad is bigger than zero
or bigger than sd->entity.num_pads, if the media controller
is defined.
This causes a call to g_frame_interval to return -EINVAL.
Fix it by first cleaning up the struct.
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Newer devices don't place the PMIC CLK line inside an EFI
var. Instead, those are found at the _PR0 table.
Add a parser for it, using info stored via the DSDT table.
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Add support code for this driver to use ACPI power management.
Yet, at least with known devices, this won't work without
further changes.
The rationale is that the ACPI handling code checks for the _PR? tables
in order to know what is required to switch the device from power state
D0 (_PR0) up to D3COLD (_PR3).
The adev->flags.power_manageable is set to true if the device has a _PR0
table, which can be checked by calling acpi_device_power_manageable(adev).
However, this only says that the device can be set to power off mode.
At least on the DSDT tables we've seen so far, there's no _PR3 nor _PS3
(which would have a somewhat similar effect).
So, using ACPI for power management won't work, except if adding
an ACPI override logic somewhere.
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
The gmin_subdev_add() currently doesn't use ACPI device
power management. In order to prepare for adding support
for it, let's shift some things, placing the PM-related
stuff at the end of the probing logic.
Let's also store the current gs on a temporary var, in
order to simplify the source code.
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Fix wrong reading of upper pages for SFP EEPROM. According to "Memory
Organization" figure in SFF-8472 spec: When reading upper pages 1, 2 and
3 the offset should be set relative to zero and I2C high address 0x51
[1010001X (A2h)] is to be used.
Fixes: a45bfb5a50 ("mlxsw: core: Extend QSFP EEPROM size for ethtool")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Validate MAC address before copying the same to outgoing frame
skb destination address. Since a node can have zero mac
address for Link B until a valid frame is received over
that link, this fix address the issue of a zero MAC address
being in the packet.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For small Ethernet frames with size less than minimum size 66 for HSR
vs 60 for regular Ethernet frames, hsr driver currently doesn't pad the
frame to make it minimum size. This results in incorrect LSDU size being
populated in the HSR tag for these frames. Fix this by padding the frame
to the minimum size applicable for HSR.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The double poll additions were centered around doing POLL_ADD on file
descriptors that use more than one waitqueue (typically one for read,
one for write) when being polled. However, it can also end up being
triggered for when we use poll triggered retry. For that case, we cannot
safely use req->io, as that could be used by the request type itself.
Add a second io_poll_iocb pointer in the structure we allocate for poll
based retry, and ensure we use the right one from the two paths.
Fixes: 18bceab101 ("io_uring: allow POLL_ADD with double poll_wait() users")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is one RGMII check not using the phy_interface_mode_is_rgmii()
helper. This prevents the driver from configuring the MAC properly when
using a phy-mode that is not just rgmii, e.g. rgmii-rxid. This became an
issue on sama5d3 xplained since the ksz9031 driver is hadling phy-mode
properly and the phy-mode has to be set to rgmii-rxid.
Fixes: bcf3440c6d ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY")
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch disables PTP on AQC111 and AQC112 due to a known HW issue,
which can cause datapath issues.
Ideally PTP block should have been disabled via PHY provisioning, but
unfortunately many units have been shipped with enabled PTP block.
Thus, we have to work around this in the driver.
Fixes: dbcd6806af ("net: aquantia: add support for Phy access")
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull NFS client fixes from Anna Schumaker:
"A few more NFS client bugfixes for Linux 5.8:
NFS:
- Fix interrupted slots by using the SEQUENCE operation
SUNRPC:
- revert d03727b248 to fix unkillable IOs
xprtrdma:
- Fix double-free in rpcrdma_ep_create()
- Fix recursion into rpcrdma_xprt_disconnect()
- Fix return code from rpcrdma_xprt_connect()
- Fix handling of connect errors
- Fix incorrect header size calculations"
* tag 'nfs-for-5.8-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
SUNRPC reverting d03727b248 ("NFSv4 fix CLOSE not waiting for direct IO compeletion")
xprtrdma: fix incorrect header size calculations
NFS: Fix interrupted slots by sending a solo SEQUENCE operation
xprtrdma: Fix handling of connect errors
xprtrdma: Fix return code from rpcrdma_xprt_connect()
xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()
xprtrdma: Fix double-free in rpcrdma_ep_create()
Pull ARM SoC fixes from Arnd Bergmann:
"This time there are a number of actual code fixes, plus a small set of
device tree issues getting addressed:
Renesas:
- one defconfig cleanup to allow a later Kconfig change
Intel socfpga:
- enable QSPI devices on some machines
- fix DTC validation warnings
TI OMAP:
- Two DEBUG_ATOMIC_SLEEP fixes for ti-sysc interconnect target
module driver
- A regression fix for ti-sysc no-idle handling that caused issues
compared to earlier platform data based booting
- A fix for memory leak for omap_hwmod_allocate_module
- Fix d_can driver probe for am437x
NXP i.MX:
- A couple of fixes on i.MX platform device registration code to
stop the use of invalid IRQ 0.
- Fix a regression seen on ls1021a platform, caused by commit
52102a3ba6 ("soc: imx: move cpu code to drivers/soc/imx").
- Fix a misconfiguration of audio SSI on imx6qdl-gw551x board.
Amlogic Meson:
- misc DT fixes
- SoC ID fixes to detect all chips correctly"
* tag 'arm-fixes-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: dts: spcfpga: Align GIC, NAND and UART nodenames with dtschema
ARM: dts: socfpga: Align L2 cache-controller nodename with dtschema
arm64: dts: stratix10: increase QSPI reg address in nand dts file
arm64: dts: stratix10: add status to qspi dts node
arm64: dts: agilex: add status to qspi dts node
ARM: dts: Fix dcan driver probe failed on am437x platform
ARM: OMAP2+: Fix possible memory leak in omap_hwmod_allocate_module
arm64: defconfig: Enable CONFIG_PCIE_RCAR_HOST
soc: imx: check ls1021a
ARM: imx: Remove imx_add_imx_dma() unused irq_err argument
ARM: imx: Provide correct number of resources when registering gpio devices
ARM: dts: imx6qdl-gw551x: fix audio SSI
bus: ti-sysc: Do not disable on suspend for no-idle
bus: ti-sysc: Fix sleeping function called from invalid context for RTC quirk
bus: ti-sysc: Fix wakeirq sleeping function called from invalid context
ARM: dts: meson: Align L2 cache-controller nodename with dtschema
arm64: dts: meson-gxl-s805x: reduce initial Mali450 core frequency
arm64: dts: meson: add missing gxl rng clock
soc: amlogic: meson-gx-socinfo: Fix S905X3 and S905D3 ID's
Pull arm64 fixes from Will Deacon:
"A batch of arm64 fixes.
Although the diffstat is a bit larger than we'd usually have at this
stage, a decent amount of it is the addition of comments describing
our syscall tracing behaviour, and also a sweep across all the modular
arm64 PMU drivers to make them rebust against unloading and unbinding.
There are a couple of minor things kicking around at the moment (CPU
errata and module PLTs for very large modules), but I'm not expecting
any significant changes now for us in 5.8.
- Fix kernel text addresses for relocatable images booting using EFI
and with KASLR disabled so that they match the vmlinux ELF binary.
- Fix unloading and unbinding of PMU driver modules.
- Fix generic mmiowb() when writeX() is called from preemptible
context (reported by the riscv folks).
- Fix ptrace hardware single-step interactions with signal handlers,
system calls and reverse debugging.
- Fix reporting of 64-bit x0 register for 32-bit tasks via
'perf_regs'.
- Add comments describing syscall entry/exit tracing ABI"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
drivers/perf: Prevent forced unbinding of PMU drivers
asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible()
arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP
arm64: ptrace: Use NO_SYSCALL instead of -1 in syscall_trace_enter()
arm64: syscall: Expand the comment about ptrace and syscall(-1)
arm64: ptrace: Add a comment describing our syscall entry/exit trap ABI
arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return
arm64: ptrace: Override SPSR.SS when single-stepping is enabled
arm64: ptrace: Consistently use pseudo-singlestep exceptions
drivers/perf: Fix kernel panic when rmmod PMU modules during perf sampling
efi/libstub/arm64: Retain 2MB kernel Image alignment if !KASLR
Setting interrupt affinity on inactive interrupts is inconsistent when
hierarchical irq domains are enabled. The core code should just store the
affinity and not call into the irq chip driver for inactive interrupts
because the chip drivers may not be in a state to handle such requests.
X86 has a hacky workaround for that but all other irq chips have not which
causes problems e.g. on GIC V3 ITS.
Instead of adding more ugly hacks all over the place, solve the problem in
the core code. If the affinity is set on an inactive interrupt then:
- Store it in the irq descriptors affinity mask
- Update the effective affinity to reflect that so user space has
a consistent view
- Don't call into the irq chip driver
This is the core equivalent of the X86 workaround and works correctly
because the affinity setting is established in the irq chip when the
interrupt is activated later on.
Note, that this is only effective when hierarchical irq domains are enabled
by the architecture. Doing it unconditionally would break legacy irq chip
implementations.
For hierarchial irq domains this works correctly as none of the drivers can
have a dependency on affinity setting in inactive state by design.
Remove the X86 workaround as it is not longer required.
Fixes: 02edee152d ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
Reported-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com
Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
When nfc_register_device fails in nci_register_device,
destroy_workqueue() shouled be called to destroy ndev->tx_wq.
Fixes: 3c1c0f5dc8 ("NFC: NCI: Fix nci_register_device init sequence")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doug Berger says:
====================
net: bcmgenet: fix WAKE_FILTER resume from deep sleep
The WAKE_FILTER logic can only wake the system from the standby
power state. However, some systems that include the GENET IP
support deeper power saving states and the driver should suspend
and resume correctly from those states as well.
This commit set squashes a few issues uncovered while testing
suspend and resume from these deep sleep states.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The Hardware Filter Block RAM may not be preserved when the GENET
block is reset during a deep sleep, so it is not sufficient to
only backup and restore the enables.
This commit clears out the HFB block and reprograms the rxnfc
rules when the system resumes from a suspended state. To support
this the bcmgenet_hfb_create_rxnfc_filter() function is modified
to access the register space directly so that it can't fail due
to memory allocation issues.
Fixes: f50932cca6 ("net: bcmgenet: add WAKE_FILTER support")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the GENET driver resumes from deep sleep the UMAC_CMD
register may not be accessible and therefore should not be
accessed from bcmgenet_wol_power_up_cfg() if the GENET has
been reset.
This commit adds a check of the RBUF_ACPI_EN flag when Wake
on Filter is enabled. A clear flag indicates that the GENET
hardware must have been reset so the remainder of the
hardware programming is bypassed.
Fixes: f50932cca6 ("net: bcmgenet: add WAKE_FILTER support")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the GENET driver resumes from deep sleep the UMAC_CMD
register may not be accessible and therefore should not be
accessed from bcmgenet_wol_power_up_cfg() if the GENET has
been reset.
This commit adds a check of the MPD_EN flag when Wake on
Magic Packet is enabled. A clear flag indicates that the
GENET hardware must have been reset so the remainder of the
hardware programming is bypassed.
Fixes: 1a1d5106c1 ("net: bcmgenet: move clk_wol management to bcmgenet_wol")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If try_toggle_control_gpio() failed in smc_drv_probe(), free_netdev(ndev)
should be called to free the ndev created earlier. Otherwise, a memleak
will occur.
Fixes: 7d2911c438 ("net: smc91x: Fix gpios for device tree based booting")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an expiration delta falls into the last level of the wheel, that delta
has be compared against the maximum possible delay and reduced to fit in if
necessary.
However instead of comparing the delta against the maximum, the code
compares the actual expiry against the maximum. Then instead of fixing the
delta to fit in, it sets the maximum delta as the expiry value.
This can result in various undesired outcomes, the worst possible one
being a timer expiring 15 days ahead to fire immediately.
Fixes: 500462a9de ("timers: Switch to a non-cascading wheel")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200717140551.29076-2-frederic@kernel.org
Reverting commit d03727b248 "NFSv4 fix CLOSE not waiting for
direct IO compeletion". This patch made it so that fput() by calling
inode_dio_done() in nfs_file_release() would wait uninterruptably
for any outstanding directIO to the file (but that wait on IO should
be killable).
The problem the patch was also trying to address was REMOVE returning
ERR_ACCESS because the file is still opened, is supposed to be resolved
by server returning ERR_FILE_OPEN and not ERR_ACCESS.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Pull io_uring fix from Jens Axboe:
"Fix for a case where, with automatic buffer selection, we can leak the
buffer descriptor for recvmsg"
* tag 'io_uring-5.8-2020-07-17' of git://git.kernel.dk/linux-block:
io_uring: fix recvmsg memory leak with buffer selection
Pull block fix from Jens Axboe:
"Single NVMe multipath capacity fix"
* tag 'block-5.8-2020-07-17' of git://git.kernel.dk/linux-block:
nvme: explicitly update mpath disk capacity on revalidation
Pull fuse fixes from Miklos Szeredi:
- two regressions in this cycle caused by the conversion of writepage
list to an rb_tree
- two regressions in v5.4 cause by the conversion to the new mount API
- saner behavior of fsconfig(2) for the reconfigure case
- an ancient issue with FS_IOC_{GET,SET}FLAGS ioctls
* tag 'fuse-fixes-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: Fix parameter for FS_IOC_{GET,SET}FLAGS
fuse: don't ignore errors from fuse_writepages_fill()
fuse: clean up condition for writepage sending
fuse: reject options on reconfigure via fsconfig(2)
fuse: ignore 'data' argument of mount(..., MS_REMOUNT)
fuse: use ->reconfigure() instead of ->remount_fs()
fuse: fix warning in tree_insert() and clean up writepage insertion
fuse: move rb_erase() before tree_insert()
Pull overlayfs fixes from Miklos Szeredi:
- fix a regression introduced in v4.20 in handling a regenerated
squashfs lower layer
- two regression fixes for this cycle, one of which is Oops inducing
- miscellaneous issues
* tag 'ovl-fixes-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix lookup of indexed hardlinks with metacopy
ovl: fix unneeded call to ovl_change_flags()
ovl: fix mount option checks for nfs_export with no upperdir
ovl: force read-only sb on failure to create index dir
ovl: fix regression with re-formatted lower squashfs
ovl: fix oops in ovl_indexdir_cleanup() with nfs_export=on
ovl: relax WARN_ON() when decoding lower directory file handle
ovl: remove not used argument in ovl_check_origin
ovl: change ovl_copy_up_flags static
ovl: inode reference leak in ovl_is_inuse true case.
Add below to “Ancillary clock features” section
- Low Pass Filter (LPF) access from user space
Add below to list of “Supported hardware” section
+ Renesas (IDT) ClockMatrix™
Signed-off-by: Min Li <min.li.xe@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull spi fixes from Mark Brown:
"A couple of small driver specific fixes for fairly minor issues"
* tag 'spi-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-sun6i: sun6i_spi_transfer_one(): fix setting of clock rate
spi: mediatek: use correct SPI_CFG2_REG MACRO
Pull regulator fixes from Mark Brown:
"The more substantial fix here is the rename of the da903x driver which
fixes a collision with the parent MFD driver name which caused issues
when things were built as modules.
There's also a fix for a mislableled regulator on the pmi8994 which is
quite important for systems with that device"
* tag 'regulator-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
MAINTAINERS: remove obsolete entry after file renaming
regulator: rename da903x to da903x-regulator
regulator: qcom_smd: Fix pmi8994 label
Pull regmap fixes from Mark Brown:
"A couple of substantial fixes here, one from Doug which fixes the
debugfs code for MMIO regmaps (fortunately not the common case) and
one from Marc fixing lookups of multiple regmaps for the same device
(a very unusual case).
There's also a fix for Kconfig to ensure we enable SoundWire properly"
* tag 'regmap-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: debugfs: Don't sleep while atomic for fast_io regmaps
regmap: add missing dependency on SoundWire
regmap: dev_get_regmap_match(): fix string comparison
Pull HID fixes from Jiri Kosina:
- linked list race condition fix in hid-steam driver from Rodrigo Rivas
Costa
- assorted deviceID-specific quirks and other small cosmetic cleanups
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: logitech-hidpp: avoid repeated "multiplier = " log messages
HID: logitech: Use HIDPP_RECEIVER_INDEX instead of 0xff
HID: quirks: Ignore Simply Automated UPB PIM
HID: apple: Disable Fn-key key-re-mapping on clone keyboards
MAINTAINERS: update uhid and hid-wiimote entry
HID: steam: fixes race in handling device list.
HID: magicmouse: do not set up autorepeat
HID: alps: support devices with report id 2
HID: quirks: Always poll Obins Anne Pro 2 keyboard
HID: i2c-hid: add Mediacom FlexBook edge13 to descriptor override
While digging through the recent mmiowb preemption issue it came up that
we aren't actually preventing IO from crossing a scheduling boundary.
While it's a bit ugly to overload smp_mb__after_spinlock() with this
behavior, it's what PowerPC is doing so there's some precedent.
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
ASoC: Fixes for v5.8
An awful lot of mostly small fixes here, mainly for x86 based platforms
and the CODEC drivers mainly used on them. For the most part this is
either minor device specific stuff which seems to come from detailed
testing or robustness against errors which comes from people having done
some fuzzing runs aginst the topology code.
Renesas fixes for v5.8
- Replace CONFIG_PCIE_RCAR by CONFIG_PCIE_RCAR_HOST in the defconfig,
to unblock a planned Kconfig change.
* tag 'renesas-fixes-for-v5.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
arm64: defconfig: Enable CONFIG_PCIE_RCAR_HOST
Link: https://lore.kernel.org/r/20200717100523.15418-1-geert+renesas@glider.be
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull sound fixes from Takashi Iwai:
"No surprise here, just a few device-specific small fixes: two fixes
for USB LINE6 and one for USB-audio drivers wrt syzkaller fuzzer
issues, while the rest are all HD-audio Realtek quirks"
* tag 'sound-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - fixup for yet another Intel reference board
ALSA: hda/realtek - Enable Speaker for ASUS UX563
ALSA: hda/realtek - Enable Speaker for ASUS UX533 and UX534
ALSA: hda/realtek: Enable headset mic of Acer TravelMate B311R-31 with ALC256
ALSA: hda/realtek: enable headset mic of ASUS ROG Zephyrus G14(G401) series with ALC289
ALSA: hda/realtek - change to suitable link model for ASUS platform
ALSA: usb-audio: Fix race against the error recovery URB submission
ALSA: line6: Sync the pending work cancel at disconnection
ALSA: line6: Perform sanity check for each URB creation
Right now, the driver is not doing the right thing to detect
the clock like used by the sensor, at least on devices
without the gmin's EFI vars.
Add some notes at the code to explain why and skip the wrong
value provided by the _DSM table.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To pick up the changes from:
83d31e5271 ("KVM: nVMX: fixes for preemption timer migration")
That don't entail changes in tooling.
This silences these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
b2f9f1535b ("libbpf: Fix libbpf hashmap on (I)LP32 architectures")
Silencing this warning:
Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h'
diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h
I'll eventually update the warning to remove the "Kernel ABI" part
and instead state libbpf when noticing that the original is at
"tools/lib/something".
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Jakub Bogusz <qboosh@pld-linux.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Any option macro with _SET suffix should set opt->set variable which is
not happening for OPT_CALLBACK_SET(). This is causing issues with perf
record --switch-output-event. Fix that.
Before:
# ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \
--switch-output-event syscalls:sys_enter_mmap
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.297 MB perf.data (657 samples) ]
After:
$ ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \
--switch-output-event syscalls:sys_enter_mmap
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144542 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144608 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144660 ]
^C[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144784 ]
[ perf record: Woken up 0 times to write data ]
[ perf record: Dump perf.data.2020061918144803 ]
[ perf record: Captured and wrote 0.419 MB perf.data.<timestamp> ]
Fixes: 636eb4d001 ("libsubcmd: Introduce OPT_CALLBACK_SET()")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200619133412.50705-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This tag contains the following fixes for 5.8-rc4/5:
- Prevent user from using command WREG_BULK in PCI DMA channel. The command
won't be parsed correctly by the driver and will cause unknown behavior.
As the user doesn't need to use that command in that channel, its better
to just prevent it completely.
- Change the interface of the clock gating debugfs property from true/false
to bitmask with bit per engine. This will allow the user to debug the
ASIC while disabling the clock gating feature with fine-grain
granularity.
- Increase message-to-ASIC-CPU timeout to 4s (from 100ms/1s). The ASIC CPU
might respond sometimes after a large delay due to slow external
interfaces (such as temperature sensors) and that will result in a driver
timeout which will lead to ASIC reset.
* tag 'misc-habanalabs-fixes-2020-07-10' of git://people.freedesktop.org/~gabbayo/linux:
habanalabs: set 4s timeout for message to device CPU
habanalabs: set clock gating per engine
habanalabs: block WREG_BULK packet on PDMA
Forcefully unbinding PMU drivers during perf sampling will lead to
a kernel panic, because the perf upper-layer framework call a NULL
pointer in this situation.
To solve this issue, "suppress_bind_attrs" should be set to true, so
that bind/unbind can be disabled via sysfs and prevent unbinding PMU
drivers during perf sampling.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1594975763-32966-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Although mmiowb() is concerned only with serialising MMIO writes occuring
in contexts where a spinlock is held, the call to mmiowb_set_pending()
from the MMIO write accessors can occur in preemptible contexts, such
as during driver probe() functions where ordering between CPUs is not
usually a concern, assuming that the task migration path provides the
necessary ordering guarantees.
Unfortunately, the default implementation of mmiowb_set_pending() is not
preempt-safe, as it makes use of a a per-cpu variable to track its
internal state. This has been reported to generate the following splat
on riscv:
| BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
| caller is regmap_mmio_write32le+0x1c/0x46
| CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
| Call Trace:
| walk_stackframe+0x0/0x7a
| dump_stack+0x6e/0x88
| regmap_mmio_write32le+0x18/0x46
| check_preemption_disabled+0xa4/0xaa
| regmap_mmio_write32le+0x18/0x46
| regmap_mmio_write+0x26/0x44
| regmap_write+0x28/0x48
| sifive_gpio_probe+0xc0/0x1da
Although it's possible to fix the driver in this case, other splats have
been seen from other drivers, including the infamous 8250 UART, and so
it's better to address this problem in the mmiowb core itself.
Fix mmiowb_set_pending() by using the raw_cpu_ptr() to get at the mmiowb
state and then only updating the 'mmiowb_pending' field if we are not
preemptible (i.e. we have a non-zero nesting count).
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Palmer Dabbelt <palmer@dabbelt.com>
Reported-by: Emil Renner Berthing <kernel@esmil.dk>
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Link: https://lore.kernel.org/r/20200716112816.7356-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
snd_info_get_line() has a sanity check of NULL buffer -- both buffer
itself being NULL and buffer->buffer being NULL. Basically both
checks are valid and necessary, but the problem is that it's with
snd_BUG_ON() macro that triggers WARN_ON(). The latter condition
(NULL buffer->buffer) can be met arbitrarily by user since the buffer
is allocated at the first write, so it means that user can trigger
WARN_ON() at will.
This patch addresses it by simply moving buffer->buffer NULL check out
of snd_BUG_ON() so that spurious WARNING is no longer triggered.
Reported-by: syzbot+e42d0746c3c3699b6061@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200717084023.5928-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In case we're compiling espintcp support only for IPv6, we should
still initialize the common code.
Fixes: 26333c37fc ("xfrm: add IPv6 support for espintcp")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
man 2 recv says:
RETURN VALUE
When a stream socket peer has performed an orderly shutdown, the
return value will be 0 (the traditional "end-of-file" return).
Currently, this works for blocking reads, but non-blocking reads will
return -EAGAIN. This patch overwrites that return value when the peer
won't send us any more data.
Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Reported-by: Andrew Cagney <cagney@libreswan.org>
Tested-by: Andrew Cagney <cagney@libreswan.org>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Currently, non-blocking sends from userspace result in EOPNOTSUPP.
To support this, we need to tell espintcp_sendskb_locked() and
espintcp_sendskmsg_locked() that non-blocking operation was requested
from espintcp_sendmsg().
Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Reported-by: Andrew Cagney <cagney@libreswan.org>
Tested-by: Andrew Cagney <cagney@libreswan.org>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Pull drm fixes from Dave Airlie:
"Weekly fixes pull, big bigger than I'd normally like, but they are
fairly scattered and small individually.
The vmwgfx one is a black screen regression, otherwise the largest is
an MST encoder fix for amdgpu which results in a WARN in some cases,
and a scattering of i915 fixes.
I'm tracking two regressions at the moment that hopefully we get
nailed down this week for rc7.
dma-buf:
- sleeping atomic fix
amdgpu:
- Fix a race condition with KIQ
- Preemption fix
- Fix handling of fake MST encoders
- OLED panel fix
- Handle allocation failure in stream construction
- Renoir SMC fix
- SDMA 5.x fix
i915:
- FBC w/a stride fix
- Fix use-after-free fix on module reload
- Ignore irq enabling on the virtual engines to fix device sleep
- Use GTT when saving/restoring engine GPR
- Fix selftest sort function
vmwgfx:
- black screen fix
aspeed:
- fbcon init warn fix"
* tag 'drm-fixes-2020-07-17-1' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu/sdma5: fix wptr overwritten in ->get_wptr()
drm/amdgpu/powerplay: Modify SMC message name for setting power profile mode
drm/amd/display: handle failed allocation during stream construction
drm/amd/display: OLED panel backlight adjust not work with external display connected
drm/amdgpu/display: create fake mst encoders ahead of time (v4)
drm/amdgpu: fix preemption unit test
drm/amdgpu/gfx10: fix race condition for kiq
drm/i915: Recalculate FBC w/a stride when needed
drm/i915: Move cec_notifier to intel_hdmi_connector_unregister, v2.
drm/i915/gt: Only swap to a random sibling once upon creation
drm/i915/gt: Ignore irq enabling on the virtual engines
drm/i915/perf: Use GTT when saving/restoring engine GPR
drm/i915/selftests: Fix compare functions provided for sorting
drm/vmwgfx: fix update of display surface when resolution changes
dmabuf: use spinlock to access dmabuf->name
drm/aspeed: Call drm_fbdev_generic_setup after drm_dev_register
task_h_load() can return 0 in some situations like running stress-ng
mmapfork, which forks thousands of threads, in a sched group on a 224 cores
system. The load balance doesn't handle this correctly because
env->imbalance never decreases and it will stop pulling tasks only after
reaching loop_max, which can be equal to the number of running tasks of
the cfs. Make sure that imbalance will be decreased by at least 1.
misfit task is the other feature that doesn't handle correctly such
situation although it's probably more difficult to face the problem
because of the smaller number of CPUs and running tasks on heterogenous
system.
We can't simply ensure that task_h_load() returns at least one because it
would imply to handle underflow in other places.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: <stable@vger.kernel.org> # v4.4+
Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
From Documentation/networking/timestamping.txt:
A driver which supports hardware time stamping shall update the
struct with the actual, possibly more permissive configuration.
Do update the struct passed when we upscale the requested time
stamping mode.
Fixes: cb646e2b02 ("ptp: Added a clock driver for the National Semiconductor PHYTER.")
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fixes for omaps for v5.8-rc cycle
Few fixes for issues noticed during testing:
- Two DEBUG_ATOMIC_SLEEP fixes for ti-sysc interconnect target module
driver
- A regression fix for ti-sysc no-idle handling that caused issues
compared to earlier platform data based booting
- A fix for memory leak for omap_hwmod_allocate_module
- Fix d_can driver probe for am437x
* tag 'omap-for-v5.8/fixes-rc5-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: Fix dcan driver probe failed on am437x platform
ARM: OMAP2+: Fix possible memory leak in omap_hwmod_allocate_module
bus: ti-sysc: Do not disable on suspend for no-idle
bus: ti-sysc: Fix sleeping function called from invalid context for RTC quirk
bus: ti-sysc: Fix wakeirq sleeping function called from invalid context
Link: https://lore.kernel.org/r/pull-1594840100-132735@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
i.MX fixes for 5.8, round 2:
- A couple of fixes on i.MX platform device registration code to stop
the use of invalid IRQ 0.
- Fix a regression seen on ls1021a platform, caused by commit
52102a3ba6 ("soc: imx: move cpu code to drivers/soc/imx").
- Fix a misconfiguration of audio SSI on imx6qdl-gw551x board.
* tag 'imx-fixes-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
soc: imx: check ls1021a
ARM: imx: Remove imx_add_imx_dma() unused irq_err argument
ARM: imx: Provide correct number of resources when registering gpio devices
ARM: dts: imx6qdl-gw551x: fix audio SSI
Link: https://lore.kernel.org/r/20200714145649.GP15718@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
If a regmap has "fast_io" set then its lock function uses a spinlock.
That doesn't work so well with the functions:
* regmap_cache_only_write_file()
* regmap_cache_bypass_write_file()
Both of the above functions have the pattern:
1. Lock the regmap.
2. Call:
debugfs_write_file_bool()
copy_from_user()
__might_fault()
__might_sleep()
Let's reorder things a bit so that we do all of our sleepable
functions before we grab the lock.
Fixes: d3dc5430d6 ("regmap: debugfs: Allow writes to cache state settings")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200715164611.1.I35b3533e8a80efde0cec1cc70f71e1e74b2fa0da@changeid
Signed-off-by: Mark Brown <broonie@kernel.org>
Pull power management fixes from Rafael Wysocki:
"Add missing handling of a command line switch to the intel_pstate
driver (Rafael Wysocki) and fix the freeing of the operating
performance point (OPP) entries for the legacy (v1) OPP table type
(Walter Lozano)"
* tag 'pm-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
opp: Increase parsed_static_opps in _of_add_opp_table_v1()
cpufreq: intel_pstate: Fix active mode setting from command line
Pull char/misc fixes from Greg KH:
"Here are number of small char/misc driver fixes for 5.8-rc6
Not that many complex fixes here, just a number of small fixes for
reported issues, and some new device ids. Nothing fancy.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits)
virtio: virtio_console: add missing MODULE_DEVICE_TABLE() for rproc serial
intel_th: Fix a NULL dereference when hub driver is not loaded
intel_th: pci: Add Emmitsburg PCH support
intel_th: pci: Add Tiger Lake PCH-H support
intel_th: pci: Add Jasper Lake CPU support
virt: vbox: Fix guest capabilities mask check
virt: vbox: Fix VBGL_IOCTL_VMMDEV_REQUEST_BIG and _LOG req numbers to match upstream
uio_pdrv_genirq: fix use without device tree and no interrupt
uio_pdrv_genirq: Remove warning when irq is not specified
coresight: etmv4: Fix CPU power management setup in probe() function
coresight: cti: Fix error handling in probe
Revert "zram: convert remaining CLASS_ATTR() to CLASS_ATTR_RO()"
mei: bus: don't clean driver pointer
misc: atmel-ssc: lock with mutex instead of spinlock
phy: sun4i-usb: fix dereference of pointer phy0 before it is null checked
phy: rockchip: Fix return value of inno_dsidphy_probe()
phy: ti: j721e-wiz: Constify structs
phy: ti: am654-serdes: Constify regmap_config
phy: intel: fix enum type mismatch warning
phy: intel: Fix compilation error on FIELD_PREP usage
...
Fix support for external PTP-aware devices such as DSA or PTP PHY:
Make sure we never time stamp tx packets when hardware time stamping
is disabled.
Check for PTP PHY being in use and then pass ioctls related to time
stamping of Ethernet packets to the PTP PHY rather than handle them
ourselves. In addition, disable our own hardware time stamping in this
case.
Fixes: 6605b730c0 ("FEC: Add time stamping code and a PTP hardware clock")
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull driver core fixes from Greg KH:
"Here are 3 driver core fixes for 5.8-rc6.
They resolve some issues found with the deferred probe code for some
types of devices on some embedded systems. They have been tested a
bunch and have been in linux-next for a while with no reported issues"
* tag 'driver-core-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Avoid deferred probe due to fw_devlink_pause/resume()
driver core: Rename dev_links_info.defer_sync to defer_hook
driver core: Don't do deferred probe in parallel with kernel_init thread
Pull IIO and staging driver fixes from Greg KH:
"Here are some IIO and staging driver fixes for 5.8-rc6.
The majority of fixes are for IIO drivers, resolving a number of small
reported issues, and there are some counter fixes in here too that
were tied to the IIO fixes. There's only one staging driver fix here,
a comedi fix found by code inspection.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: comedi: verify array index is correct before using it
iio: adc: ad7780: Fix a resource handling path in 'ad7780_probe()'
iio:pressure:ms5611 Fix buffer element alignment
iio:humidity:hts221 Fix alignment and data leak issues
iio:humidity:hdc100x Fix alignment and data leak issues
iio:magnetometer:ak8974: Fix alignment and data leak issues
iio: adc: adi-axi-adc: Fix object reference counting
iio: pressure: zpa2326: handle pm_runtime_get_sync failure
counter: 104-quad-8: Add lock guards - filter clock prescaler
counter: 104-quad-8: Add lock guards - differential encoder
iio: core: add missing IIO_MOD_H2/ETHANOL string identifiers
iio: magnetometer: ak8974: Fix runtime PM imbalance on error
iio: mma8452: Add missed iio_device_unregister() call in mma8452_probe()
iio:health:afe4404 Fix timestamp alignment and prevent data leak.
iio:health:afe4403 Fix timestamp alignment and prevent data leak.
Pull tty/serial driver fixes from Greg KH:
:Here are some small tty and serial driver fixes for 5.8-rc6.
The largest set of patches in here is a revert of the sysrq changes
that went into 5.8-rc1 but turned out to cause a noticable overhead
and cpu usage.
Other than that, there's a few small serial driver fixes to resolve
reported issues, and finally resolving the spinlock init problem on
many serial driver consoles.
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: core: Initialise spin lock before use in uart_configure_port()
serial: mxs-auart: add missed iounmap() in probe failure and remove
serial: sh-sci: Initialize spinlock for uart console
Revert "tty: xilinx_uartps: Fix missing id assignment to the console"
serial: core: drop redundant sysrq checks
serial: core: fix sysrq overhead regression
Revert "serial: core: Refactor uart_unlock_and_check_sysrq()"
tty/serial: fix serial_core.c kernel-doc warnings
tty: serial: cpm_uart: Fix behaviour for non existing GPIOs
Pull thermal fixes from Daniel Lezcano:
- Fix invalid index array access on int340x_thermal leading to a kernel
panic (Bartosz Szczepanek)
- Fix debug message level to prevent flooding on some platform (Alex
Hung)
- Fix invalid bank access by reverting "thermal: mediatek: fix register
index error" (Enric Balletbo i Serra)
* tag 'thermal-v5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
Revert "thermal: mediatek: fix register index error"
thermal: int3403_thermal: Downgrade error message
thermal/int340x_thermal: Prevent page fault on .set_mode() op
The clang integrated assembler requires the 'cmp' instruction to
have a length prefix here:
arch/x86/math-emu/wm_sqrt.S:212:2: error: ambiguous instructions require an explicit suffix (could be 'cmpb', 'cmpw', or 'cmpl')
cmp $0xffffffff,-24(%ebp)
^
Make this a 32-bit comparison, which it was clearly meant to be.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20200527135352.1198078-1-arnd@arndb.de
When assembling with Clang via `make LLVM_IAS=1` and CONFIG_HYPERV enabled,
we observe the following error:
<instantiation>:9:6: error: expected absolute expression
.if HYPERVISOR_REENLIGHTENMENT_VECTOR == 3
^
<instantiation>:1:1: note: while in macro instantiation
idtentry HYPERVISOR_REENLIGHTENMENT_VECTOR asm_sysvec_hyperv_reenlightenment sysvec_hyperv_reenlightenment has_error_code=0
^
./arch/x86/include/asm/idtentry.h:627:1: note: while in macro instantiation
idtentry_sysvec HYPERVISOR_REENLIGHTENMENT_VECTOR sysvec_hyperv_reenlightenment;
^
<instantiation>:9:6: error: expected absolute expression
.if HYPERVISOR_STIMER0_VECTOR == 3
^
<instantiation>:1:1: note: while in macro instantiation
idtentry HYPERVISOR_STIMER0_VECTOR asm_sysvec_hyperv_stimer0 sysvec_hyperv_stimer0 has_error_code=0
^
./arch/x86/include/asm/idtentry.h:628:1: note: while in macro instantiation
idtentry_sysvec HYPERVISOR_STIMER0_VECTOR sysvec_hyperv_stimer0;
This is caused by typos in arch/x86/include/asm/idtentry.h:
HYPERVISOR_REENLIGHTENMENT_VECTOR -> HYPERV_REENLIGHTENMENT_VECTOR
HYPERVISOR_STIMER0_VECTOR -> HYPERV_STIMER0_VECTOR
For more details see ClangBuiltLinux issue #1088.
Fixes: a16be368dd ("x86/entry: Convert various hypervisor vectors to IDTENTRY_SYSVEC")
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1088
Link: https://github.com/ClangBuiltLinux/linux/issues/1043
Link: https://lore.kernel.org/patchwork/patch/1272115/
Link: https://lkml.kernel.org/r/20200714194740.4548-1-sedat.dilek@gmail.com
Pull an operating performance points (OPP) framework fix for 5.8-rc6 from
Viresh Kumar:
"This fixes freeing of the OPP entries for the legacy OPP table type (v1)."
* 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
opp: Increase parsed_static_opps in _of_add_opp_table_v1()
Commit 3b4b19721e ("nvme: fix possible deadlock when I/O is
blocked") reverted multipath head disk revalidation due to deadlocks
caused by holding the bd_mutex during revalidate.
Updating the multipath disk blockdev size is still required though for
userspace to be able to observe any resizing while the device is
mounted. Directly update the bdev inode size to avoid unnecessarily
holding the bdev->bd_mutex.
Fixes: 3b4b19721e ("nvme: fix possible deadlock when I/O is
blocked")
Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Depending on how you look at it, you can either say that:
a) There is a PDC hardware issue (with the specific IP rev that exists
on sc7180) that causes the PDC not to work properly when configured
to handle dual edges.
b) The dual edge feature of the PDC hardware was only added in later
HW revisions and thus isn't in all hardware.
Regardless of how you look at it, let's work around the lack of dual
edge support by only ever letting our parent see requests for single
edge interrupts on affected hardware.
NOTE: it's possible that a driver requesting a dual edge interrupt
might get several edges coalesced into a single IRQ. For instance if
a line starts low and then goes high and low again, the driver that
requested the IRQ is not guaranteed to be called twice. However, it
is guaranteed that once the driver's interrupt handler starts running
its first instruction that any new edges coming in will cause the
interrupt to fire again. This is relatively commonplace for dual-edge
gpio interrupts (many gpio controllers require software to emulate
dual edge with single edge) so client drivers should be setup to
handle it.
Fixes: e35a6ae0eb ("pinctrl/msm: Setup GPIO chip in hierarchy")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200714080254.v3.1.Ie0d730120b232a86a4eac1e2909bcbec844d1766@changeid
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
ROCE uses "VA % buf_page_size" to caclulate the offset in the PBL's first
page, the actual PA corresponding to the MR's VA is equal to MR's PA plus
this offset. The first PA in PBL has already been aligned to PAGE_SIZE
after calling ib_umem_get(), but the MR's VA may not. If the buf_page_size
is smaller than the PAGE_SIZE, this will lead the HW to access the wrong
memory because the offset is smaller than expected.
Fixes: 9b2cf76c9f ("RDMA/hns: Optimize PBL buffer allocation process")
Link: https://lore.kernel.org/r/1594726935-45666-1-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The RoCE Engine will schedule to another QP after one has sent
(2 ^ lp_pktn_ini) packets. lp_pktn_ini is set in QPC and should be
calculated from 2 factors:
1. current MTU as a integer
2. the RoCE Engine's maximum slice length 64KB
But the driver use MTU as a enum ib_mtu and the max inline capability, the
lp_pktn_ini will be much bigger than expected which may cause traffic of
some QPs to never get scheduled.
Fixes: b713128de7 ("RDMA/hns: Adjust lp_pktn_ini dynamically")
Link: https://lore.kernel.org/r/1594726138-49294-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Rather than open-code test_tsk_thread_flag() at each callsite, simply
replace the couple of offenders with calls to test_tsk_thread_flag()
directly.
Signed-off-by: Will Deacon <will@kernel.org>
Setting a system call number of -1 is special, as it indicates that the
current system call should be skipped.
Use NO_SYSCALL instead of -1 when checking for this scenario, which is
different from the -1 returned due to a seccomp failure.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Keno Fischer <keno@juliacomputing.com>
Cc: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
If a task executes syscall(-1), we intercept this early and force x0 to
be -ENOSYS so that we don't need to distinguish this scenario from one
where the scno is -1 because a tracer wants to skip the system call
using ptrace. With the return value set, the return path is the same as
the skip case.
Although there is a one-line comment noting this in el0_svc_common(), it
misses out most of the detail. Expand the comment to describe a bit more
about what is going on.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Keno Fischer <keno@juliacomputing.com>
Cc: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Our tracehook logic for syscall entry/exit raises a SIGTRAP back to the
tracer following a ptrace request such as PTRACE_SYSCALL. As part of this
procedure, we clobber the reported value of one of the tracee's general
purpose registers (x7 for native tasks, r12 for compat) to indicate
whether the stop occurred on syscall entry or exit. This is a slightly
unfortunate ABI, as it prevents the tracer from accessing the real
register value and is at odds with other similar stops such as seccomp
traps.
Since we're stuck with this ABI, expand the comment in our tracehook
logic to acknowledge the issue and describe the behaviour in more detail.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Luis Machado <luis.machado@linaro.org>
Reported-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Will Deacon <will@kernel.org>
Although we zero the upper bits of x0 on entry to the kernel from an
AArch32 task, we do not clear them on the exception return path and can
therefore expose 64-bit sign extended syscall return values to userspace
via interfaces such as the 'perf_regs' ABI, which deal exclusively with
64-bit registers.
Explicitly clear the upper 32 bits of x0 on return from a compat system
call.
Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Keno Fischer <keno@juliacomputing.com>
Cc: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Luis reports that, when reverse debugging with GDB, single-step does not
function as expected on arm64:
| I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP
| request by GDB won't execute the underlying instruction. As a consequence,
| the PC doesn't move, but we return a SIGTRAP just like we would for a
| regular successful PTRACE_SINGLESTEP request.
The underlying problem is that when the CPU register state is restored
as part of a reverse step, the SPSR.SS bit is cleared and so the hardware
single-step state can transition to the "active-pending" state, causing
an unexpected step exception to be taken immediately if a step operation
is attempted.
In hindsight, we probably shouldn't have exposed SPSR.SS in the pstate
accessible by the GPR regset, but it's a bit late for that now. Instead,
simply prevent userspace from configuring the bit to a value which is
inconsistent with the TIF_SINGLESTEP state for the task being traced.
Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Keno Fischer <keno@juliacomputing.com>
Link: https://lore.kernel.org/r/1eed6d69-d53d-9657-1fc9-c089be07f98c@linaro.org
Reported-by: Luis Machado <luis.machado@linaro.org>
Tested-by: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Although the arm64 single-step state machine can be fast-forwarded in
cases where we wish to generate a SIGTRAP without actually executing an
instruction, this has two major limitations outside of simply skipping
an instruction due to emulation.
1. Stepping out of a ptrace signal stop into a signal handler where
SIGTRAP is blocked. Fast-forwarding the stepping state machine in
this case will result in a forced SIGTRAP, with the handler reset to
SIG_DFL.
2. The hardware implicitly fast-forwards the state machine when executing
an SVC instruction for issuing a system call. This can interact badly
with subsequent ptrace stops signalled during the execution of the
system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt
the stepping state by updating the PSTATE for the tracee.
Resolve both of these issues by injecting a pseudo-singlestep exception
on entry to a signal handler and also on return to userspace following a
system call.
Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Tested-by: Luis Machado <luis.machado@linaro.org>
Reported-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Will Deacon <will@kernel.org>
The driver would happily overwrite its write buffer with user data in
256 byte increments due to a removed buffer-space sanity check.
Fixes: 5fcf62b0f1 ("tty: iuu_phoenix: fix locking.")
Cc: stable <stable@vger.kernel.org> # 2.6.31
Signed-off-by: Johan Hovold <johan@kernel.org>
Now that the IOMMU driver has been introduced, it prevents any access from
a DMA master going through it that hasn't properly mapped the pages, and
that link is set up through the iommus property.
Unfortunately we forgot to add that property to the video engine node when
adding the IOMMU node, so now any DMA access is broken.
Fixes: b3a0a2f910 ("arm64: dts: allwinner: h6: Add IOMMU")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20200628180804.79026-1-maxime@cerno.tech
We recently moved setting inode flag OVL_UPPERDATA to ovl_lookup().
When looking up an overlay dentry, upperdentry may be found by index
and not by name. In that case, we fail to read the metacopy xattr
and falsly set the OVL_UPPERDATA on the overlay inode.
This caused a regression in xfstest overlay/033 when run with
OVERLAY_MOUNT_OPTIONS="-o metacopy=on".
Fixes: 28166ab3c8 ("ovl: initialize OVL_UPPERDATA in ovl_lookup()")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The check if user has changed the overlay file was wrong, causing unneeded
call to ovl_change_flags() including taking f_lock on every file access.
Fixes: d989903058 ("ovl: do not generate duplicate fsnotify events for "fake" path")
Cc: <stable@vger.kernel.org> # v4.19+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Currently, when using _of_add_opp_table_v2 parsed_static_opps is
increased and this value is used in _opp_remove_all_static() to
check if there are static opp entries that need to be freed.
Unfortunately this does not happen when using _of_add_opp_table_v1(),
which leads to warnings.
This patch increases parsed_static_opps in _of_add_opp_table_v1() in a
similar way as in _of_add_opp_table_v2().
Fixes: 03758d6026 ("opp: Replace list_kref with a local counter")
Cc: v5.6+ <stable@vger.kernel.org> # v5.6+
Signed-off-by: Walter Lozano <walter.lozano@collabora.com>
[ Viresh: Do the operation with lock held and set the value to 1 instead
of incrementing it ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Pull clk fixes from Stephen Boyd:
"A couple build fixes for issues exposed this merge window and a fix
for the eMMC clk on AST2600 SoCs that fixes the rate that is
calculated by the clk framework"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: Specify IOMEM dependency for HSDK pll driver
clk: AST2600: Add mux for EMMC clock
clk: mvebu: ARMADA_AP_CPU_CLK needs to select ARMADA_AP_CP_HELPER
The fsl_mc_get_endpoint() function can return an error or directly a
NULL pointer in case the peer device is not under the root DPRC
container. Treat this case also, otherwise it would lead to a NULL
pointer when trying to access the peer fsl_mc_device.
Fixes: 7194792308 ("dpaa2-eth: add MAC/PHY support through phylink")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull x86 platform driver fixes from Andriy Shevchenko:
"Small fixes for this cycle:
- Fix procfs handling in Thinkpad ACPI driver
- Fix battery management on new ASUS laptops
- New IDs (Sapphire Rapids) in ISST tool"
* tag 'platform-drivers-x86-v5.8-2' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: asus-wmi: allow BAT1 battery name
platform/x86: ISST: Add new PCI device ids
platform/x86: thinkpad_acpi: Revert "Use strndup_user() in dispatch_proc_write()"
Fix to return negative error code -ENOMEM from kmalloc() error handling
case instead of 0, as done elsewhere in this function.
Fixes: f1774cb895 ("X.509: parse public key parameters from x509 for akcipher")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The afs filesystem driver allows unstarted operations to be cancelled by
signal, but most of these can easily be restarted (mkdir for example). The
primary culprits for reproducing this are those applications that use
SIGALRM to display a progress counter.
File lock-extension operation is marked uninterruptible as we have a
limited time in which to do it, and the release op is marked
uninterruptible also as if we fail to unlock a file, we'll have to wait 20
mins before anyone can lock it again.
The store operation logs a warning if it gets interruption, e.g.:
kAFS: Unexpected error from FS.StoreData -4
because it's run from the background - but it can also be run from
fdatasync()-type things. However, store options aren't marked
interruptible at the moment.
Fix this in the following ways:
(1) Mark store operations as uninterruptible. It might make sense to
relax this for certain situations, but I'm not sure how to make sure
that background store ops aren't affected by signals to foreground
processes that happen to trigger them.
(2) In afs_get_io_locks(), where we're getting the serialisation lock for
talking to the fileserver, return ERESTARTSYS rather than EINTR
because a lot of the operations (e.g. mkdir) are restartable if we
haven't yet started sending the op to the server.
Fixes: e49c7b2f6d ("afs: Build an abstraction around an "operation" concept")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Without upperdir mount option, there is no index dir and the dependency
checks nfs_export => index for mount options parsing are incorrect.
Allow the combination nfs_export=on,index=off with no upperdir and move
the check for dependency redirect_dir=nofollow for non-upper mount case
to mount options parsing.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
With index feature enabled, on failure to create index dir, overlay is
being mounted read-only. However, we do not forbid user to remount overlay
read-write. Fix that by setting ofs->workdir to NULL, which prevents
remount read-write.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Commit 9df085f3c9 ("ovl: relax requirement for non null uuid of lower
fs") relaxed the requirement for non null uuid with single lower layer to
allow enabling index and nfs_export features with single lower squashfs.
Fabian reported a regression in a setup when overlay re-uses an existing
upper layer and re-formats the lower squashfs image. Because squashfs
has no uuid, the origin xattr in upper layer are decoded from the new
lower layer where they may resolve to a wrong origin file and user may
get an ESTALE or EIO error on lookup.
To avoid the reported regression while still allowing the new features
with single lower squashfs, do not allow decoding origin with lower null
uuid unless user opted-in to one of the new features that require
following the lower inode of non-dir upper (index, xino, metacopy).
Reported-by: Fabian <godi.beat@gmx.net>
Link: https://lore.kernel.org/linux-unionfs/32532923.JtPX5UtSzP@fgdesktop/
Fixes: 9df085f3c9 ("ovl: relax requirement for non null uuid of lower fs")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Mounting with nfs_export=on, xfstests overlay/031 triggers a kernel panic
since v5.8-rc1 overlayfs updates.
overlayfs: orphan index entry (index/00fb1..., ftype=4000, nlink=2)
BUG: kernel NULL pointer dereference, address: 0000000000000030
RIP: 0010:ovl_cleanup_and_whiteout+0x28/0x220 [overlay]
Bisect point at commit c21c839b84 ("ovl: whiteout inode sharing")
Minimal reproducer:
--------------------------------------------------
rm -rf l u w m
mkdir -p l u w m
mkdir -p l/testdir
touch l/testdir/testfile
mount -t overlay -o lowerdir=l,upperdir=u,workdir=w,nfs_export=on overlay m
echo 1 > m/testdir/testfile
umount m
rm -rf u/testdir
mount -t overlay -o lowerdir=l,upperdir=u,workdir=w,nfs_export=on overlay m
umount m
--------------------------------------------------
When mount with nfs_export=on, and fail to verify an orphan index, we're
cleaning this index from indexdir by calling ovl_cleanup_and_whiteout().
This dereferences ofs->workdir, that was earlier set to NULL.
The design was that ovl->workdir will point at ovl->indexdir, but we are
assigning ofs->indexdir to ofs->workdir only after ovl_indexdir_cleanup().
There is no reason not to do it sooner, because once we get success from
ofs->indexdir = ovl_workdir_create(... there is no turning back.
Reported-and-tested-by: Murphy Zhou <jencce.kernel@gmail.com>
Fixes: c21c839b84 ("ovl: whiteout inode sharing")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Decoding a lower directory file handle to overlay path with cold
inode/dentry cache may go as follows:
1. Decode real lower file handle to lower dir path
2. Check if lower dir is indexed (was copied up)
3. If indexed, get the upper dir path from index
4. Lookup upper dir path in overlay
5. If overlay path found, verify that overlay lower is the lower dir
from step 1
On failure to verify step 5 above, user will get an ESTALE error and a
WARN_ON will be printed.
A mismatch in step 5 could be a result of lower directory that was renamed
while overlay was offline, after that lower directory has been copied up
and indexed.
This is a scripted reproducer based on xfstest overlay/052:
# Create lower subdir
create_dirs
create_test_files $lower/lowertestdir/subdir
mount_dirs
# Copy up lower dir and encode lower subdir file handle
touch $SCRATCH_MNT/lowertestdir
test_file_handles $SCRATCH_MNT/lowertestdir/subdir -p -o $tmp.fhandle
# Rename lower dir offline
unmount_dirs
mv $lower/lowertestdir $lower/lowertestdir.new/
mount_dirs
# Attempt to decode lower subdir file handle
test_file_handles $SCRATCH_MNT -p -i $tmp.fhandle
Since this WARN_ON() can be triggered by user we need to relax it.
Fixes: 4b91c30a5a ("ovl: lookup connected ancestor of dir in inode cache")
Cc: <stable@vger.kernel.org> # v4.16+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
ovl_check_origin outparam 'ctrp' argument not used by caller. So remove
this argument.
Signed-off-by: youngjun <her0gyugyu@gmail.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
"ovl_copy_up_flags" is used in copy_up.c.
so, change it static.
Signed-off-by: youngjun <her0gyugyu@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
io_recvmsg() doesn't free memory allocated for struct io_buffer. This can
causes a leak when used with automatic buffer selection.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fix dtschema validator warnings like:
l2-cache@fffff000: $nodename:0:
'l2-cache@fffff000' does not match '^(cache-controller|cpu)(@[0-9a-f,]+)*$'
Fixes: 475dc86d08 ("arm: dts: socfpga: Add a base DTSI for Altera's Arria10 SOC")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Match the QSPI reg address in the socfpga_stratix10_socdk.dts file.
Fixes: 80f132d737 ("arm64: dts: increase the QSPI reg address for Stratix10 and Agilex")
Cc: linux-stable <stable@vger.kernel.org> # >= v5.6
Signed-off-by: Dinh Nguyen <dinh.nguyen@intel.com>
sybot came up with following transaction:
add table ip syz0
add chain ip syz0 syz2 { type nat hook prerouting priority 0; policy accept; }
add table ip syz0 { flags dormant; }
delete chain ip syz0 syz2
delete table ip syz0
which yields:
hook not found, pf 2 num 0
WARNING: CPU: 0 PID: 6775 at net/netfilter/core.c:413 __nf_unregister_net_hook+0x3e6/0x4a0 net/netfilter/core.c:413
[..]
nft_unregister_basechain_hooks net/netfilter/nf_tables_api.c:206 [inline]
nft_table_disable net/netfilter/nf_tables_api.c:835 [inline]
nf_tables_table_disable net/netfilter/nf_tables_api.c:868 [inline]
nf_tables_commit+0x32d3/0x4d70 net/netfilter/nf_tables_api.c:7550
nfnetlink_rcv_batch net/netfilter/nfnetlink.c:486 [inline]
nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:544 [inline]
nfnetlink_rcv+0x14a5/0x1e50 net/netfilter/nfnetlink.c:562
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
Problem is that when I added ability to override base hook registration
to make nat basechains register with the nat core instead of netfilter
core, I forgot to update nft_table_disable() to use that instead of
the 'raw' hook register interface.
In syzbot transaction, the basechain is of 'nat' type. Its registered
with the nat core. The switch to 'dormant mode' attempts to delete from
netfilter core instead.
After updating nft_table_disable/enable to use the correct helper,
nft_(un)register_basechain_hooks can be folded into the only remaining
caller.
Because nft_trans_table_enable() won't do anything when the DORMANT flag
is set, remove the flag first, then re-add it in case re-enablement
fails, else this patch breaks sequence:
add table ip x { flags dormant; }
/* add base chains */
add table ip x
The last 'add' will remove the dormant flags, but won't have any other
effect -- base chains are not registered.
Then, next 'set dormant flag' will create another 'hook not found'
splat.
Reported-by: syzbot+2570f2c036e3da5db176@syzkaller.appspotmail.com
Fixes: 4e25ceb80b ("netfilter: nf_tables: allow chain type to override hook register")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently the header size calculations are using an assignment
operator instead of a += operator when accumulating the header
size leading to incorrect sizes. Fix this by using the correct
operator.
Addresses-Coverity: ("Unused value")
Fixes: 302d3deb20 ("xprtrdma: Prevent inline overflow")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
SMATCH detected a potential buffer overflow in the manipulation of
hwmon_attributes array inside the scmi_hwmon_probe function:
drivers/hwmon/scmi-hwmon.c:226
scmi_hwmon_probe() error: buffer overflow 'hwmon_attributes' 6 <= 9
Fix it by statically declaring the size of the array as the maximum
possible as defined by hwmon_max define.
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20200715121338.GA18761@e119603-lin.cambridge.arm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
ERR_NX_TRANSLATION(CSB.CC=5) is for internal to VAS for fault handling
and should not used by OS. ERR_NX_AT_FAULT(CSB.CC=250) is the proper
error code should be reported by OS when NX encounters address
translation failure.
This patch uses CC=250 to determine the fault address when the request
is not successful.
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0315251705baff94f678c33178491b5008723511.camel@linux.ibm.com
P9 DD2 NX workbook (Table 4-36) says DMA controller uses CC=5
internally for translation fault handling. NX reserves CC=250 for
OS to notify user space when NX encounters address translation
failure on the request buffer. Not an issue in earlier releases
as NX does not get faults on kernel addresses.
This patch defines CSB_CC_FAULT_ADDRESS(250) and updates CSB.CC with
this proper error code for user space.
Fixes: c96c4436ab ("powerpc/vas: Update CSB and notify process for fault CRBs")
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
[mpe: Added Fixes tag and fix typo in comment]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/019fd53e7538c6f8f332d175df74b1815ef5aa8c.camel@linux.ibm.com
The ioctl encoding for this parameter is a long but the documentation says
it should be an int and the kernel drivers expect it to be an int. If the
fuse driver treats this as a long it might end up scribbling over the stack
of a userspace process that only allocated enough space for an int.
This was previously discussed in [1] and a patch for fuse was proposed in
[2]. From what I can tell the patch in [2] was nacked in favor of adding
new, "fixed" ioctls and using those from userspace. However there is still
no "fixed" version of these ioctls and the fact is that it's sometimes
infeasible to change all userspace to use the new one.
Handling the ioctls specially in the fuse driver seems like the most
pragmatic way for fuse servers to support them without causing crashes in
userspace applications that call them.
[1]: https://lore.kernel.org/linux-fsdevel/20131126200559.GH20559@hall.aurel32.net/T/
[2]: https://sourceforge.net/p/fuse/mailman/message/31771759/
Signed-off-by: Chirantan Ekbote <chirantan@chromium.org>
Fixes: 59efec7b90 ("fuse: implement ioctl support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The battery on my laptop ASUS TUF Gaming FX706II is named BAT1.
This patch allows battery extension to load.
Signed-off-by: Vasiliy Kupriakov <rublag-ns@yandex.ru>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
This reverts commit 35d13c7a05.
This broke procfs interface due to neglecting the fact that
the strings are not coming NULL terminated.
Revert the change till we will have a better clean up.
Fixes: 35d13c7a05 ("platform/x86: thinkpad_acpi: Use strndup_user() in dispatch_proc_write()")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
This is likely firmware causing this but its starting to annoy customers.
Change the message level to verbose to prevent the spam.
Note that this seems to only show up with ISCSI enabled on the HBA via the
qedi driver.
Signed-off-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During setup():
...
for ns in h0 r1 h1 h2 h3
do
create_ns ${ns}
done
...
while in cleanup():
...
for n in h1 r1 h2 h3 h4
do
ip netns del ${n} 2>/dev/null
done
...
and after removing the stderr redirection in cleanup():
$ sudo ./fib_nexthop_multiprefix.sh
...
TEST: IPv4: host 0 to host 3, mtu 1400 [ OK ]
TEST: IPv6: host 0 to host 3, mtu 1400 [ OK ]
Cannot remove namespace file "/run/netns/h4": No such file or directory
$ echo $?
1
and a non-zero return code, make kselftests fail (even if the test
itself is fine):
...
not ok 34 selftests: net: fib_nexthop_multiprefix.sh # exit=1
...
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Dietrich reports invalid temperature source messages on Asus Formula
XII Z490.
nct6775 nct6775.656: Invalid temperature source 28 at index 0,
source register 0x100, temp register 0x73
Debugging suggests that temperature source 28 reports the CPU temperature.
Let's assume that temperature sources 28 and 29 reflect "PECI Agent {0,1}
Calibration", similar to other chips of the series.
Reported-by: Stefan Dietrich <roots@gmx.de>
Cc: Stefan Dietrich <roots@gmx.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
If there is no valid MAC address in the device tree,
use a random MAC address.
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The size used when calling 'pci_alloc_consistent()' and
'pci_free_consistent()' should match.
Fix it and have it consistent with the corresponding call in 'rr_close()'.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the current 8KB stack size there are frequent overflows in a 64-bit
configuration. We may split IRQ stacks off in the future, but this fixes a
number of issues right now.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[Palmer: mention irqstack in the commit text]
Fixes: 7db91e57a0 ("RISC-V: Task implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
"u64 *wptr" points to the the wptr value in write back buffer and
"*wptr = (*wptr) >> 2;" results in the value being overwritten each time
when ->get_wptr() is called.
umr uses /sys/kernel/debug/dri/0/amdgpu_ring_sdma0 to get rptr/wptr and
decode ring content and it is affected by this issue.
fix and simplify the logic similar as sdma_v4_0_ring_get_wptr().
v2: fix for sdma5.2 as well
v3: drop sdma 5.2 changes for 5.8 and stable
Suggested-by: Le Ma <le.ma@amd.com>
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
[Why]
amdgpu_dm->backlight_caps is for single eDP only. the caps are upddated
for very connector. Real eDP caps will be overwritten by other external
display. For OLED panel, caps->aux_support is set to 1 for OLED pnael.
after external connected, caps+.aux_support is set to 0. This causes
OLED backlight adjustment not work.
[How]
within update_conector_ext_caps, backlight caps will be updated only for
eDP connector.
Cc: stable@vger.kernel.org
Signed-off-by: hersen wu <hersenxs.wu@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prevents a warning in the MST create connector case.
v2: create global fake encoders rather per connector fake encoders
to avoid running out of encoder indices.
v3: use the actual number of crtcs on the asic rather than the max
to conserve encoders.
v4: v3 plus missing hunk I forgot to git add.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1108
Fixes: c6385e503a ("drm/amdgpu: drop legacy drm load and unload callbacks")
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.7.x
Currently we're failing to recalculate the gen9 FBC w/a stride
unless something more drastic than just the modifier itself has
changed. This often leaves us with FBC enabled with the linear
fbdev framebuffer without the w/a stride enabled. That will cause
an immediate underrun and FBC will get promptly disabled.
Fix the problem by checking if the w/a stride is about to change,
and go through the full dance if so. This part of the FBC code
is still pretty much a disaster and will need lots more work.
But this should at least fix the immediate issue.
v2: Deactivate FBC when the modifier changes since that will
likely require resetting the w/a CFB stride
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200711080336.13423-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 0428ab013f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Quite some non OF/ACPI users of irqdomains allocate firmware nodes of type
IRQCHIP_FWNODE_NAMED or IRQCHIP_FWNODE_NAMED_ID and free them right after
creating the irqdomain. The only purpose of these FW nodes is to convey
name information. When this was introduced the core code did not store the
pointer to the node in the irqdomain. A recent change stored the firmware
node pointer in irqdomain for other reasons and missed to notice that the
usage sites which do the alloc_fwnode/create_domain/free_fwnode sequence
are broken by this. Storing a dangling pointer is dangerous itself, but in
case that the domain is destroyed later on this leads to a double free.
Remove the freeing of the firmware node after creating the irqdomain from
all affected call sites to cure this.
Fixes: 711419e504 ("irqdomain: Add the missing assignment of domain->fwnode for named fwnode")
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/873661qakd.fsf@nanos.tec.linutronix.de
Got following d_can probe errors with kernel 5.8-rc1 on am437x
[ 10.730822] CAN device driver interface
Starting Wait for Network to be Configured...
[ OK ] Reached target Network.
[ 10.787363] c_can_platform 481cc000.can: probe failed
[ 10.792484] c_can_platform: probe of 481cc000.can failed with error -2
[ 10.799457] c_can_platform 481d0000.can: probe failed
[ 10.804617] c_can_platform: probe of 481d0000.can failed with error -2
actually, Tony has fixed this issue on am335x with the patch [3]
Since am437x has the same clock structure with am335x
[1][2], so reuse the code from Tony Lindgren's patch [3] to fix it.
[1]: https://www.ti.com/lit/pdf/spruh73 Chapter-23, Figure 23-1. DCAN
Integration
[2]: https://www.ti.com/lit/pdf/spruhl7 Chapter-25, Figure 25-1. DCAN
Integration
[3]: commit 516f1117d0 ("ARM: dts: Configure osc clock for d_can on
am335x")
Fixes: 1a5cd7c23c ("bus: ti-sysc: Enable all clocks directly during init to read revision")
Signed-off-by: dillon min <dillon.minfei@gmail.com>
[tony@atomide.com: aligned commit message a bit for readability]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Fix memory leak in omap_hwmod_allocate_module not freeing in
handling error path.
Fixes: 8c87970543b17("ARM: OMAP2+: Add functions to allocate module data from device tree")
Signed-off-by: Chen Tao <chentao107@huawei.com>
Reviewed-by: Paul Walmsley <paul@pwsan.com>
[tony@atomide.com: fix call iounmap for missing regs]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Starting from commit "thermal/int340x_thermal: Don't require IDSP to
exist", priv->current_uuid_index is initialized to -1. This value may
be passed to int3400_thermal_run_osc() from int3400_thermal_set_mode,
contributing to page fault when accessing int3400_thermal_uuids array
at index -1.
This commit adds a check on uuid value to int3400_thermal_run_osc.
Fixes: 8d485da0dd ("thermal/int340x_thermal: Don't require IDSP to exist")
Signed-off-by: Bartosz Szczepanek <bsz@semihalf.com>
Reviewed-by: Pandruvada, Srinivas <srinivas.pandruvada@linux.intel.com>
[ rzhang: Add Fixes tag ]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20200708134613.131555-1-bsz@semihalf.com
There is no guarantee to CMA's placement, so allocating a zone specific
atomic pool from CMA might return memory from a completely different
memory zone. So stop using it.
Fixes: c84dc6e68a ("dma-pool: add additional coherent pools to map to gfp mask")
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When allocating DMA memory from a pool, the core can only guess which
atomic pool will fit a device's constraints. If it doesn't, get a safer
atomic pool and try again.
Fixes: c84dc6e68a ("dma-pool: add additional coherent pools to map to gfp mask")
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
dma-pool's dev_to_pool() creates the false impression that there is a
way to grantee a mapping between a device's DMA constraints and an
atomic pool. It tuns out it's just a guess, and the device might need to
use an atomic pool containing memory from a 'safer' (or lower) memory
zone.
To help mitigate this, introduce dma_guess_pool() which can be fed a
device's DMA constraints and atomic pools already known to be faulty, in
order for it to provide an better guess on which pool to use.
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The function is only used once and can be simplified to a one-liner.
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
dma_coherent_ok() checks if a physical memory area fits a device's DMA
constraints.
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
fuse_writepages() ignores some errors taken from fuse_writepages_fill() I
believe it is a bug: if .writepages is called with WB_SYNC_ALL it should
either guarantee that all data was successfully saved or return error.
Fixes: 26d614df1d ("fuse: Implement writepages callback")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
fuse_writepages_fill uses following construction:
if (wpa && ap->num_pages &&
(A || B || C)) {
action;
} else if (wpa && D) {
if (E) {
the same action;
}
}
- ap->num_pages check is always true and can be removed
- "if" and "else if" calls the same action and can be merged.
Move checking A, B, C, D, E conditions to a helper, add comments.
Original-patch-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Previous patch changed handling of remount/reconfigure to ignore all
options, including those that are unknown to the fuse kernel fs. This was
done for backward compatibility, but this likely only affects the old
mount(2) API.
The new fsconfig(2) based reconfiguration could possibly be improved. This
would make the new API less of a drop in replacement for the old, OTOH this
is a good chance to get rid of some weirdnesses in the old API.
Several other behaviors might make sense:
1) unknown options are rejected, known options are ignored
2) unknown options are rejected, known options are rejected if the value
is changed, allowed otherwise
3) all options are rejected
Prior to the backward compatibility fix to ignore all options all known
options were accepted (1), even if they change the value of a mount
parameter; fuse_reconfigure() does not look at the config values set by
fuse_parse_param().
To fix that we'd need to verify that the value provided is the same as set
in the initial configuration (2). The major drawback is that this is much
more complex than just rejecting all attempts at changing options (3);
i.e. all options signify initial configuration values and don't make sense
on reconfigure.
This patch opts for (3) with the rationale that no mount options are
reconfigurable in fuse.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The command
mount -o remount -o unknownoption /mnt/fuse
succeeds on kernel versions prior to v5.4 and fails on kernel version at or
after. This is because fuse_parse_param() rejects any unrecognised options
in case of FS_CONTEXT_FOR_RECONFIGURE, just as for FS_CONTEXT_FOR_MOUNT.
This causes a regression in case the fuse filesystem is in fstab, since
remount sends all options found there to the kernel; even ones that are
meant for the initial mount and are consumed by the userspace fuse server.
Fix this by ignoring mount options, just as fuse_remount_fs() did prior to
the conversion to the new API.
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Fixes: c30da2e981 ("fuse: convert to use the new mount API")
Cc: <stable@vger.kernel.org> # v5.4
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
s_op->remount_fs() is only called from legacy_reconfigure(), which is not
used after being converted to the new API.
Convert to using ->reconfigure(). This restores the previous behavior of
syncing the filesystem and rejecting MS_MANDLOCK on remount.
Fixes: c30da2e981 ("fuse: convert to use the new mount API")
Cc: <stable@vger.kernel.org> # v5.4
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
fuse_writepages_fill() calls tree_insert() with ap->num_pages = 0 which
triggers the following warning:
WARNING: CPU: 1 PID: 17211 at fs/fuse/file.c:1728 tree_insert+0xab/0xc0 [fuse]
RIP: 0010:tree_insert+0xab/0xc0 [fuse]
Call Trace:
fuse_writepages_fill+0x5da/0x6a0 [fuse]
write_cache_pages+0x171/0x470
fuse_writepages+0x8a/0x100 [fuse]
do_writepages+0x43/0xe0
Fix up the warning and clean up the code around rb-tree insertion:
- Rename tree_insert() to fuse_insert_writeback() and make it return the
conflicting entry in case of failure
- Re-add tree_insert() as a wrapper around fuse_insert_writeback()
- Rename fuse_writepage_in_flight() to fuse_writepage_add() and reverse
the meaning of the return value to mean
+ "true" in case the writepage entry was successfully added
+ "false" in case it was in-fligt queued on an existing writepage
entry's auxiliary list or the existing writepage entry's temporary
page updated
Switch from fuse_find_writeback() + tree_insert() to
fuse_insert_writeback()
- Move setting orig_pages to before inserting/updating the entry; this may
result in the orig_pages value being discarded later in case of an
in-flight request
- In case of a new writepage entry use fuse_writepage_add()
unconditionally, only set data->wpa if the entry was added.
Fixes: 6b2fb79963 ("fuse: optimize writepages search")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Original-path-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
In fuse_writepage_end() the old writepages entry needs to be removed from
the rbtree before inserting the new one, otherwise tree_insert() would
fail. This is a very rare codepath and no reproducer exists.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Early secure guest boot hits the below crash while booting with
vcpus numbers aligned with page boundary for PAGE size of 64k
and LPPACA size of 1k i.e 64, 128 etc.
Partition configured for 64 cpus.
CPU maps initialized for 1 thread per core
------------[ cut here ]------------
kernel BUG at arch/powerpc/kernel/paca.c:89!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
This is due to the BUG_ON() for shared_lppaca_total_size equal to
shared_lppaca_size. Instead the code should only BUG_ON() if we have
exceeded the total_size, which indicates we've overflowed the array.
Fixes: bd104e6db6 ("powerpc/pseries/svm: Use shared memory for LPPACA structures")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
[mpe: Reword change log to clarify we're fixing not removing the check]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200619070113.16696-1-sathnaga@linux.vnet.ibm.com
This is to fix lkp cppcheck warnings:
drivers/fpga/dfl-pci.c:230:6: warning: The scope of the variable 'ret' can be reduced. [variableScope]
int ret = 0;
^
drivers/fpga/dfl-pci.c:230:10: warning: Variable 'ret' is assigned a value that is never used. [unreadVariable]
int ret = 0;
^
Fixes: 3c2760b78f ("fpga: dfl: pci: fix return value of cci_pci_sriov_configure")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Acked-by: Wu Hao <hao.wu@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
The assignment of metadata overwrote the new display resolution values,
hence we'd miss the size actually changed and wouldn't redefine the
surface. This would then lead to command buffer error when trying to
update the screen target (due to the size mismatch), and result in a
VM with black screen.
Fixes: 504901dbb0 ("drm/vmwgfx: Refactor surface_define to use vmw_surface_metadata")
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Cc: stable@vger.kernel.org
Pull input fixes from Dmitry Torokhov:
"A few quirks for the Elan touchpad driver, another Thinkpad is being
switched over from PS/2 to native RMI4 interface, and we gave a brand
new SW_MACHINE_COVER switch definition"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - add more hardware ID for Lenovo laptops
Input: i8042 - add Lenovo XiaoXin Air 12 to i8042 nomux list
Revert "Input: elants_i2c - report resolution information for touch major"
Input: elan_i2c - only increment wakeup count on touch
Input: synaptics - enable InterTouch for ThinkPad X1E 1st gen
ARM: dts: n900: remove mmc1 card detect gpio
Input: add `SW_MACHINE_COVER`
Kalle Valo says:
====================
wireless-drivers fixes for v5.8
First set of fixes for v5.8. Various important fixes for iwlwifi and
mt76.
iwlwifi
* fix sleeping under RCU
* fix a kernel crash when using compressed firmware images
mt76
* tx queueing fixes for mt7615/22/63
* locking fix
* fix a crash during watchdog reset
* fix memory leaks
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
KASAN report null-ptr-deref error when register_netdev() failed:
KASAN: null-ptr-deref in range [0x00000000000003c0-0x00000000000003c7]
CPU: 2 PID: 422 Comm: ip Not tainted 5.8.0-rc4+ #12
Call Trace:
ip6gre_init_net+0x4ab/0x580
? ip6gre_tunnel_uninit+0x3f0/0x3f0
ops_init+0xa8/0x3c0
setup_net+0x2de/0x7e0
? rcu_read_lock_bh_held+0xb0/0xb0
? ops_init+0x3c0/0x3c0
? kasan_unpoison_shadow+0x33/0x40
? __kasan_kmalloc.constprop.0+0xc2/0xd0
copy_net_ns+0x27d/0x530
create_new_namespaces+0x382/0xa30
unshare_nsproxy_namespaces+0xa1/0x1d0
ksys_unshare+0x39c/0x780
? walk_process_tree+0x2a0/0x2a0
? trace_hardirqs_on+0x4a/0x1b0
? _raw_spin_unlock_irq+0x1f/0x30
? syscall_trace_enter+0x1a7/0x330
? do_syscall_64+0x1c/0xa0
__x64_sys_unshare+0x2d/0x40
do_syscall_64+0x56/0xa0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
ip6gre_tunnel_uninit() has set 'ign->fb_tunnel_dev' to NULL, later
access to ign->fb_tunnel_dev cause null-ptr-deref. Fix it by saving
'ign->fb_tunnel_dev' to local variable ndev.
Fixes: dafabb6590 ("ip6_gre: fix use-after-free in ip6gre_tunnel_lookup()")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On sparc32, tcflag_t is "unsigned long", unlike on all other
architectures, where it is "unsigned int":
drivers/net/usb/hso.c: In function ‘hso_serial_set_termios’:
include/linux/kern_levels.h:5:18: warning: format ‘%d’ expects argument of type ‘unsigned int’, but argument 4 has type ‘tcflag_t {aka long unsigned int}’ [-Wformat=]
drivers/net/usb/hso.c:1393:3: note: in expansion of macro ‘hso_dbg’
hso_dbg(0x16, "Termios called with: cflags new[%d] - old[%d]\n",
^~~~~~~
include/linux/kern_levels.h:5:18: warning: format ‘%d’ expects argument of type ‘unsigned int’, but argument 5 has type ‘tcflag_t {aka long unsigned int}’ [-Wformat=]
drivers/net/usb/hso.c:1393:3: note: in expansion of macro ‘hso_dbg’
hso_dbg(0x16, "Termios called with: cflags new[%d] - old[%d]\n",
^~~~~~~
As "unsigned long" is 32-bit on sparc32, fix this by casting all tcflag_t
parameters to "unsigned int".
While at it, use "%u" to format unsigned numbers.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull iommu fixes from Joerg Roedel:
- Fix a use-after-free of the device iommu-group. Found in the arm-smmu
driver, but the fix is in generic code.
- Fix for the new Allwinner IOMMU driver to use the atomic
readl_timeout() variant in IO/TLB flushing code.
- A couple of cleanups to fix various compile warnings.
* tag 'iommu-fixes-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/arm-smmu: Mark qcom_smmu_client_of_match as possibly unused
iommu: Fix use-after-free in iommu_release_device
iommu/amd: Make amd_iommu_apply_ivrs_quirks() static inline
iommu: SUN50I_IOMMU should depend on HAS_DMA
iommu/sun50i: Remove unused variable
iommu/sun50i: Change the readl timeout to the atomic variant
Naresh Kamboju reported that the LTP tests can cause warnings on i386
going back all the way to v5.0, and bisected it to commit 2c91bd4a4e
("mm: speed up mremap by 20x on large regions").
The warning in move_normal_pmd() is actually mostly correct, but we have
a very unusual special case at process creation time, when we may move
the stack down with an overlapping mode (kind of like a "memmove()"
except using the page tables).
And when you have just the right condition of "move a large initial
stack by the right alignment in the end, but with the early part of the
move being only page-aligned", we'll be in a situation where we're
trying to move a normal PMD entry on top of an already existing - but
now empty - PMD entry.
The warning is still worth having, in case it ever triggers other cases,
and perhaps as a reminder that we could do the stack move case more
efficiently (although it's clearly rare enough that it probably doesn't
matter).
But make it do WARN_ON_ONCE(), so that you can't flood the logs with it.
And add a *big* comment above it to explain and remind us what's going
on, because it took some figuring out to see how this could trigger.
Kudos to Joel Fernandes for debugging this.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Debugged-and-acked-by: Joel Fernandes <joel@joelfernandes.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If intel_pstate starts in the passive mode by default (that happens
when the processor in the system doesn't support HWP), passing
intel_pstate=active in the kernel command line doesn't work, so
fix that.
Fixes: 33aa46f252 ("cpufreq: intel_pstate: Use passive mode by default without HWP")
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Doug Smythies <dsmythies@telus.net>
We used to do this before 3453d5708b, but this was changed to better
handle the NFS4ERR_SEQ_MISORDERED error code. This commit fixed the slot
re-use case when the server doesn't receive the interrupted operation,
but if the server does receive the operation then it could still end up
replying to the client with mis-matched operations from the reply cache.
We can fix this by sending a SEQUENCE to the server while recovering from
a SEQ_MISORDERED error when we detect that we are in an interrupted slot
situation.
Fixes: 3453d5708b (NFSv4.1: Avoid false retries when RPC calls are interrupted)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Ensure that the connect worker is awoken if an attempt to establish
a connection is unsuccessful. Otherwise the worker waits forever
and the transport workload hangs.
Connect errors should not attempt to destroy the ep, since the
connect worker continues to use it after the handler runs, so these
errors are now handled independently of DISCONNECTED events.
Reported-by: Dan Aloni <dan@kernelim.com>
Fixes: e28ce90083 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
I noticed that when rpcrdma_xprt_connect() returns -ENOMEM,
instead of retrying the connect, the RPC client kills the
RPC task that requested the connection. We want a retry
here.
Fixes: cb586decbb ("xprtrdma: Make sendctx queue lifetime the same as connection lifetime")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Both Dan and I have observed two processes invoking
rpcrdma_xprt_disconnect() concurrently. In my case:
1. The connect worker invokes rpcrdma_xprt_disconnect(), which
drains the QP and waits for the final completion
2. This causes the newly posted Receive to flush and invoke
xprt_force_disconnect()
3. xprt_force_disconnect() sets CLOSE_WAIT and wakes up the RPC task
that is holding the transport lock
4. The RPC task invokes xprt_connect(), which calls ->ops->close
5. xprt_rdma_close() invokes rpcrdma_xprt_disconnect(), which tries
to destroy the QP.
Deadlock.
To prevent xprt_force_disconnect() from waking anything, handle the
clean up after a failed connection attempt in the xprt's sndtask.
The retry loop is removed from rpcrdma_xprt_connect() to ensure
that the newly allocated ep and id are properly released before
a REJECTED connection attempt can be retried.
Reported-by: Dan Aloni <dan@kernelim.com>
Fixes: e28ce90083 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
In the error paths, there's no need to call kfree(ep) after calling
rpcrdma_ep_put(ep).
Fixes: e28ce90083 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Since commit 82046702e2 ("efi/libstub/arm64: Replace 'preferred' offset
with alignment check"), loading a relocatable arm64 kernel at a physical
address which is not 2MB aligned and subsequently booting with EFI will
leave the Image in-place, relying on the kernel to relocate itself early
during boot. In conjunction with commit dd4bc60765 ("arm64: warn on
incorrect placement of the kernel by the bootloader"), which enables
CONFIG_RELOCATABLE by default, this effectively means that entering an
arm64 kernel loaded at an alignment smaller than 2MB with EFI (e.g. using
QEMU) will result in silent relocation at runtime.
Unfortunately, this has a subtle but confusing affect for developers
trying to inspect the PC value during a crash and comparing it to the
symbol addresses in vmlinux using tools such as 'nm' or 'addr2line';
all text addresses will be displaced by a sub-2MB offset, resulting in
the wrong symbol being identified in many cases. Passing "nokaslr" on
the command line or disabling "CONFIG_RANDOMIZE_BASE" does not help,
since the EFI stub only copies the kernel Image to a 2MB boundary if it
is not relocatable.
Adjust the EFI stub for arm64 so that the minimum Image alignment is 2MB
unless KASLR is in use.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: David Brazdil <dbrazdil@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
fsl,ls1021a is a mach under arch/arm/mach-imx/, however it could
not use the soc driver which will break caam on ls1021a platform.
So directly return if it is compatible with fsl,ls1021a.
Fixes: 52102a3ba6 ("soc: imx: move cpu code to drivers/soc/imx")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Horia Geantă <horia.geanta@nxp.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Since commit a85a6c86c2 ("driver core: platform: Clarify that IRQ 0 is
invalid"), the kernel is a bit touchy when it encounters interrupt 0.
As a result, there are lots of warnings such as the following when booting
systems such as 'kzm'.
WARNING: CPU: 0 PID: 1 at drivers/base/platform.c:224 platform_get_irq_optional+0x118/0x128
0 is an invalid IRQ number
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3 #1
Hardware name: Kyoto Microcomputer Co., Ltd. KZM-ARM11-01
[<c01127d4>] (unwind_backtrace) from [<c010c620>] (show_stack+0x10/0x14)
[<c010c620>] (show_stack) from [<c06f5f54>] (dump_stack+0xe8/0x120)
[<c06f5f54>] (dump_stack) from [<c0128878>] (__warn+0xe4/0x108)
[<c0128878>] (__warn) from [<c0128910>] (warn_slowpath_fmt+0x74/0xbc)
[<c0128910>] (warn_slowpath_fmt) from [<c08b8e84>] (platform_get_irq_optional+0x118/0x128)
[<c08b8e84>] (platform_get_irq_optional) from [<c08b8eb4>] (platform_irq_count+0x20/0x3c)
[<c08b8eb4>] (platform_irq_count) from [<c0728660>] (mxc_gpio_probe+0x8c/0x494)
[<c0728660>] (mxc_gpio_probe) from [<c08b93cc>] (platform_drv_probe+0x48/0x98)
[<c08b93cc>] (platform_drv_probe) from [<c08b703c>] (really_probe+0x214/0x344)
[<c08b703c>] (really_probe) from [<c08b7274>] (driver_probe_device+0x58/0xb4)
[<c08b7274>] (driver_probe_device) from [<c08b7478>] (device_driver_attach+0x58/0x60)
[<c08b7478>] (device_driver_attach) from [<c08b7504>] (__driver_attach+0x84/0xc0)
[<c08b7504>] (__driver_attach) from [<c08b50f8>] (bus_for_each_dev+0x78/0xb8)
[<c08b50f8>] (bus_for_each_dev) from [<c08b62cc>] (bus_add_driver+0x154/0x1e0)
[<c08b62cc>] (bus_add_driver) from [<c08b82b8>] (driver_register+0x74/0x108)
[<c08b82b8>] (driver_register) from [<c0102320>] (do_one_initcall+0x80/0x3b4)
[<c0102320>] (do_one_initcall) from [<c1501008>] (kernel_init_freeable+0x170/0x208)
[<c1501008>] (kernel_init_freeable) from [<c0e178d4>] (kernel_init+0x8/0x11c)
[<c0e178d4>] (kernel_init) from [<c0100134>] (ret_from_fork+0x14/0x20)
As it turns out, mxc_register_gpio() is a bit lax when setting the
number of resources: it registers a resource with interrupt 0 when in
reality there is no such interrupt. Fix the problem by not declaring
the second interrupt resource if there is no second interrupt.
Fixes: a85a6c86c2 ("driver core: platform: Clarify that IRQ 0 is invalid")
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Michael Chan says:
====================
bnxt_en: 3 bug fixes.
2 Fixes related to PHY/link settings. The last one fixes the sizing of
the completion ring.
Please also queue for -stable. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The current completion ring sizing formula is wrong with TPA enabled.
The formula assumes that the number of TPA completions are bound by the
RX ring size, but that's not true. TPA_START completions are immediately
recycled so they are not bound by the RX ring size. We must add
bp->max_tpa to the worst case maximum RX and TPA completions.
The completion ring can overflow because of this mistake. This will
cause hardware to disable the completion ring when this happens,
leading to RX and TX traffic to stall on that ring. This issue is
generally exposed only when the RX ring size is set very small.
Fix the formula by adding bp->max_tpa to the number of RX completions
if TPA is enabled.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.");
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a shared port PHY configuration, async event is received when any of the
port modifies the configuration. Ethtool link settings should be
initialised after updated PHY configuration from firmware.
Fixes: b1613e78e9 ("bnxt_en: Add async. event logic for PHY configuration changes.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver was modified to not rely on rtnl lock to protect link
settings about 2 years ago. The pause setting was missed when
making that change. Fix it by acquiring link_lock mutex before
calling bnxt_hwrm_set_pause().
Fixes: e2dc9b6e38 ("bnxt_en: Don't use rtnl lock to protect link change logic in workqueue.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull Xtensa fixes from Max Filippov:
- fix __sync_fetch_and_{and,or}_4 declarations to avoid build warning
- update *pos in cpuinfo_op.next to avoid runtime warning
- use for_each_set_bit in xtensa_pmu_irq_handler instead of open-coding
it
* tag 'xtensa-20200712' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: simplify xtensa_pmu_irq_handler
xtensa: update *pos in cpuinfo_op.next
xtensa: fix __sync_fetch_and_{and,or}_4 declarations
Pull io_uring fixes from Jens Axboe:
"Two late fixes again:
- Fix missing msg_name assignment in certain cases (Pavel)
- Correct a previous fix for full coverage (Pavel)"
* tag 'io_uring-5.8-2020-07-12' of git://git.kernel.dk/linux-block:
io_uring: fix not initialised work->flags
io_uring: fix missing msg_name assignment
Pull btrfs fixes from David Sterba:
"Two refcounting fixes and one prepartory patch for upcoming splice
cleanup:
- fix double put of block group with nodatacow
- fix missing block group put when remounting with discard=async
- explicitly set splice callback (no functional change), to ease
integrating splice cleanup patches"
* tag 'for-5.8-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: wire up iter_file_splice_write
btrfs: fix double put of block group with nocow
btrfs: discard: add missing put when grabbing block group from unused list
59960b9deb ("io_uring: fix lazy work init") tried to fix missing
io_req_init_async(), but left out work.flags and hash. Do it earlier.
Fixes: 7cdaf587de ("io_uring: avoid whole io_wq_work copy for requests completed inline")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ensure to set msg.msg_name for the async portion of send/recvmsg,
as the header copy will copy to/from it.
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull RISC-V fixes from Palmer Dabbelt:
"I have a few KGDB-related fixes. They're mostly fixes for build
warnings, but there's also:
- Support for the qSupported and qXfer packets, which are necessary
to pass around GDB XML information which we need for the RISC-V GDB
port to fully function.
- Users can now select STRICT_KERNEL_RWX instead of forcing it on"
* tag 'riscv-for-linus-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Avoid kgdb.h including gdb_xml.h to solve unused-const-variable warning
kgdb: Move the extern declaration kgdb_has_hit_break() to generic kgdb.h
riscv: Fix "no previous prototype" compile warning in kgdb.c file
riscv: enable the Kconfig prompt of STRICT_KERNEL_RWX
kgdb: enable arch to support XML packet.
Pull SCSI fixes from James Bottomley:
"Five small fixes, four in driver and one in the SCSI Parallel
transport, which fixes an incredibly old bug so I suspect no-one has
actually used the functionality it fixes"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: dh: Add Fujitsu device to devinfo and dh lists
scsi: mpt3sas: Fix error returns in BRM_status_show
scsi: mpt3sas: Fix unlock imbalance
scsi: iscsi: Change iSCSI workqueue max_active back to 1
scsi: scsi_transport_spi: Fix function pointer check
Pull xen fix from Juergen Gross:
"Just one fix of a recent patch (double free in an error path)"
* tag 'for-linus-5.8b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: Fix a double free in xenbus_map_ring_pv()
Pull powerpc fix from Michael Ellerman:
"One fix for a crash/soft lockup on Power8, caused by the exception
rework we did in v5.7.
Thanks to Paul Menzel and Nicholas Piggin"
* tag 'powerpc-5.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s/exception: Fix 0x1500 interrupt handler crash
The HSDK pll driver uses the devm_ioremap_resource function, but does
not specify a dependency on IOMEM in Kconfig. This causes a build
failure on architectures without IOMEM, for example, UML (notably with
make allyesconfig).
Fix this by making CONFIG_CLK_HSDK depend on CONFIG_IOMEM.
Signed-off-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/r/20200630043214.1080961-1-davidgow@google.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
When building arm32 allmodconfig:
ld.lld: error: undefined symbol: ap_cp_unique_name
>>> referenced by ap-cpu-clk.c
>>> clk/mvebu/ap-cpu-clk.o:(ap_cpu_clock_probe) in archive drivers/built-in.a
ap_cp_unique_name is only compiled into the kernel image when
CONFIG_ARMADA_AP_CP_HELPER is selected (as it is not user selectable).
However, CONFIG_ARMADA_AP_CPU_CLK does not select it.
This has been a problem since the driver was added to the kernel but it
was not built before commit c318ea261749 ("cpufreq: ap806: fix cpufreq
driver needs ap cpu clk") so it was never noticed.
Fixes: f756e362d9 ("clk: mvebu: add CPU clock driver for Armada 7K/8K")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://lore.kernel.org/r/20200701201128.2448427-1-natechancellor@gmail.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Pull libnvdimm fix from Dan Williams:
"A one-line Fix for key ring search permissions to address a regression
from -rc1"
* tag 'libnvdimm-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm/security: Fix key lookup permissions
Pull cifs fixes from Steve French:
"Four cifs/smb3 fixes: the three for stable fix problems found recently
with change notification including a reference count leak"
* tag '5.8-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module version number
cifs: fix reference leak for tlink
smb3: fix unneeded error message on change notify
cifs: remove the retry in cifs_poxis_lock_set
smb3: fix access denied on change notify request to some servers
Pull coding style terminology documentation from Dan Williams:
"The discussion has tapered off as well as the incoming ack, review,
and sign-off tags. I did not see a reason to wait for the next merge
window"
* tag 'inclusive-terminology' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux:
CodingStyle: Inclusive Terminology
Pull networking fixes from David Miller:
1) Restore previous behavior of CAP_SYS_ADMIN wrt loading networking
BPF programs, from Maciej Żenczykowski.
2) Fix dropped broadcasts in mac80211 code, from Seevalamuthu
Mariappan.
3) Slay memory leak in nl80211 bss color attribute parsing code, from
Luca Coelho.
4) Get route from skb properly in ip_route_use_hint(), from Miaohe Lin.
5) Don't allow anything other than ARPHRD_ETHER in llc code, from Eric
Dumazet.
6) xsk code dips too deeply into DMA mapping implementation internals.
Add dma_need_sync and use it. From Christoph Hellwig
7) Enforce power-of-2 for BPF ringbuf sizes. From Andrii Nakryiko.
8) Check for disallowed attributes when loading flow dissector BPF
programs. From Lorenz Bauer.
9) Correct packet injection to L3 tunnel devices via AF_PACKET, from
Jason A. Donenfeld.
10) Don't advertise checksum offload on ipa devices that don't support
it. From Alex Elder.
11) Resolve several issues in TCP MD5 signature support. Missing memory
barriers, bogus options emitted when using syncookies, and failure
to allow md5 key changes in established states. All from Eric
Dumazet.
12) Fix interface leak in hsr code, from Taehee Yoo.
13) VF reset fixes in hns3 driver, from Huazhong Tan.
14) Make loopback work again with ipv6 anycast, from David Ahern.
15) Fix TX starvation under high load in fec driver, from Tobias
Waldekranz.
16) MLD2 payload lengths not checked properly in bridge multicast code,
from Linus Lüssing.
17) Packet scheduler code that wants to find the inner protocol
currently only works for one level of VLAN encapsulation. Allow
Q-in-Q situations to work properly here, from Toke
Høiland-Jørgensen.
18) Fix route leak in l2tp, from Xin Long.
19) Resolve conflict between the sk->sk_user_data usage of bpf reuseport
support and various protocols. From Martin KaFai Lau.
20) Fix socket cgroup v2 reference counting in some situations, from
Cong Wang.
21) Cure memory leak in mlx5 connection tracking offload support, from
Eli Britstein.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (146 commits)
mlxsw: pci: Fix use-after-free in case of failed devlink reload
mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON()
net: macb: fix call to pm_runtime in the suspend/resume functions
net: macb: fix macb_suspend() by removing call to netif_carrier_off()
net: macb: fix macb_get/set_wol() when moving to phylink
net: macb: mark device wake capable when "magic-packet" property present
net: macb: fix wakeup test in runtime suspend/resume routines
bnxt_en: fix NULL dereference in case SR-IOV configuration fails
libbpf: Fix libbpf hashmap on (I)LP32 architectures
net/mlx5e: CT: Fix memory leak in cleanup
net/mlx5e: Fix port buffers cell size value
net/mlx5e: Fix 50G per lane indication
net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash
net/mlx5e: Fix VXLAN configuration restore after function reload
net/mlx5e: Fix usage of rcu-protected pointer
net/mxl5e: Verify that rpriv is not NULL
net/mlx5: E-Switch, Fix vlan or qos setting in legacy mode
net/mlx5: Fix eeprom support for SFP module
cgroup: Fix sock_cgroup_data on big-endian.
selftests: bpf: Fix detach from sockmap tests
...
Since the BPF_PROG_TYPE_CGROUP_SOCKOPT verifier test does not set an
attach type, bpf_prog_load_check_attach() disallows loading the program
and the test is always skipped:
#434/p perfevent for cgroup sockopt SKIP (unsupported program type 25)
Fix the issue by setting a valid attach type.
Fixes: 0456ea170c ("bpf: Enable more helpers for BPF_PROG_TYPE_CGROUP_{DEVICE,SYSCTL,SOCKOPT}")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20200710150439.126627-1-jean-philippe@linaro.org
CONFIG_CC_IS_GCC is undefined when Clang is used, which breaks the build
(see our Travis link below).
Clang 8 was chosen as a minimum version for this check because there
were some improvements around __builtin_constant_p in that release. In
reality, MIPS was not even buildable until clang 9 so that check was not
technically necessary. Just remove all compiler checks and just assume
that we have a working compiler.
Fixes: d4e6045326 ("Restore gcc check in mips asm/unroll.h")
Link: https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/jobs/359642821
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ido Schimmel says:
====================
mlxsw: Various fixes
Fix two issues found by syzkaller.
Patch #1 removes inappropriate usage of WARN_ON() following memory
allocation failure. Constantly triggered when syzkaller injects faults.
Patch #2 fixes a use-after-free that can be triggered by 'devlink dev
info' following a failed devlink reload.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In case devlink reload failed, it is possible to trigger a
use-after-free when querying the kernel for device info via 'devlink dev
info' [1].
This happens because as part of the reload error path the PCI command
interface is de-initialized and its mailboxes are freed. When the
devlink '->info_get()' callback is invoked the device is queried via the
command interface and the freed mailboxes are accessed.
Fix this by initializing the command interface once during probe and not
during every reload.
This is consistent with the other bus used by mlxsw (i.e., 'mlxsw_i2c')
and also allows user space to query the running firmware version (for
example) from the device after a failed reload.
[1]
BUG: KASAN: use-after-free in memcpy include/linux/string.h:406 [inline]
BUG: KASAN: use-after-free in mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
Write of size 4096 at addr ffff88810ae32000 by task syz-executor.1/2355
CPU: 1 PID: 2355 Comm: syz-executor.1 Not tainted 5.8.0-rc2+ #29
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xf6/0x16e lib/dump_stack.c:118
print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
check_memory_region_inline mm/kasan/generic.c:186 [inline]
check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
memcpy+0x39/0x60 mm/kasan/common.c:106
memcpy include/linux/string.h:406 [inline]
mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
mlxsw_cmd_exec+0x249/0x550 drivers/net/ethernet/mellanox/mlxsw/core.c:2335
mlxsw_cmd_access_reg drivers/net/ethernet/mellanox/mlxsw/cmd.h:859 [inline]
mlxsw_core_reg_access_cmd drivers/net/ethernet/mellanox/mlxsw/core.c:1938 [inline]
mlxsw_core_reg_access+0x2f6/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1985
mlxsw_reg_query drivers/net/ethernet/mellanox/mlxsw/core.c:2000 [inline]
mlxsw_devlink_info_get+0x17f/0x6e0 drivers/net/ethernet/mellanox/mlxsw/core.c:1090
devlink_nl_info_fill.constprop.0+0x13c/0x2d0 net/core/devlink.c:4588
devlink_nl_cmd_info_get_dumpit+0x246/0x460 net/core/devlink.c:4648
genl_lock_dumpit+0x85/0xc0 net/netlink/genetlink.c:575
netlink_dump+0x515/0xe50 net/netlink/af_netlink.c:2245
__netlink_dump_start+0x53d/0x830 net/netlink/af_netlink.c:2353
genl_family_rcv_msg_dumpit.isra.0+0x296/0x300 net/netlink/genetlink.c:638
genl_family_rcv_msg net/netlink/genetlink.c:733 [inline]
genl_rcv_msg+0x78d/0x9d0 net/netlink/genetlink.c:753
netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0x150/0x190 net/socket.c:672
____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
___sys_sendmsg+0xff/0x170 net/socket.c:2417
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: a9c8336f65 ("mlxsw: core: Add support for devlink info command")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should not trigger a warning when a memory allocation fails. Remove
the WARN_ON().
The warning is constantly triggered by syzkaller when it is injecting
faults:
[ 2230.758664] FAULT_INJECTION: forcing a failure.
[ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0
[ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
...
[ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0
[ 2230.898179] Kernel panic - not syncing: panic_on_warn set ...
[ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
[ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Fixes: 3057224e01 ("mlxsw: spectrum_router: Implement FIB offload in deferred work")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Ferre says:
====================
net: macb: Wake-on-Lan magic packet fixes and GEM handling
Here is a split series to fix WoL magic-packet on the current macb driver. Only
fixes in this one based on current net/master.
Changes in v5:
- Addressed the error code returned by phylink_ethtool_set_wol() as suggested
by Russell.
If PHY handles WoL, MAC doesn't stay in the way.
- Removed Florian's tag on 3/5 because of the above changes.
- Correct the "Fixes" tag on 1/5.
Changes in v4:
- Pure bug fix series for 'net'. GEM addition and MACB update removed: will be
sent later.
Changes in v3:
- Revert some of the v2 changes done in macb_resume(). Now the resume function
supports in-depth re-configuration of the controller in order to deal with
deeper sleep states. Basically as it was before changes introduced by this
series
- Tested for non-regression with our deeper Power Management mode which cuts
power to the controller completely
Changes in v2:
- Add patch 4/7 ("net: macb: fix macb_suspend() by removing call to netif_carrier_off()")
needed for keeping phy state consistent
- Add patch 5/7 ("net: macb: fix call to pm_runtime in the suspend/resume functions") that prevent
putting the macb in runtime pm suspend mode when WoL is used
- Collect review tags on 3 first patches from Florian: Thanks!
- Review of macb_resume() function
- Addition of pm_wakeup_event() in both MACB and GEM WoL IRQ handlers
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Keep previous function goals and integrate phylink actions to them.
phylink_ethtool_get_wol() is not enough to figure out if Ethernet driver
supports Wake-on-Lan.
Initialization of "supported" and "wolopts" members is done in phylink
function, no need to keep them in calling function.
phylink_ethtool_set_wol() return value is considered and determines
if the MAC has to handle WoL or not. The case where the PHY doesn't
implement WoL leads to the MAC configuring it to provide this feature.
Fixes: 7897b071ac ("net: macb: convert to phylink")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the way the "magic-packet" DT property is handled in the
macb_probe() function, matching DT binding documentation.
Now we mark the device as "wakeup capable" instead of calling the
device_init_wakeup() function that would enable the wakeup source.
For Ethernet WoL, enabling the wakeup_source is done by
using ethtool and associated macb_set_wol() function that
already calls device_set_wakeup_enable() for this purpose.
That would reduce power consumption by cutting more clocks if
"magic-packet" property is set but WoL is not configured by ethtool.
Fixes: 3e2a5e1539 ("net: macb: add wake-on-lan support via magic packet")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Sergio Prado <sergio.prado@e-labworks.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the proper struct device pointer to check if the wakeup flag
and wakeup source are positioned.
Use the one passed by function call which is equivalent to
&bp->dev->dev.parent.
It's preventing the trigger of a spurious interrupt in case the
Wake-on-Lan feature is used.
Fixes: d54f89af6c ("net: macb: Add pm runtime support")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf 2020-07-09
The following pull-request contains BPF updates for your *net* tree.
We've added 4 non-merge commits during the last 1 day(s) which contain
a total of 4 files changed, 26 insertions(+), 15 deletions(-).
The main changes are:
1) fix crash in libbpf on 32-bit archs, from Jakub and Andrii.
2) fix crash when l2tp and bpf_sk_reuseport conflict, from Martin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
mlx5 fixes 2020-07-02
This series introduces some fixes to mlx5 driver.
V1->v2:
- Drop "ip -s" patch and mirred device hold reference patch.
- Will revise them in a later submission.
Please pull and let me know if there is any problem.
For -stable v5.2
('net/mlx5: Fix eeprom support for SFP module')
For -stable v5.4
('net/mlx5e: Fix 50G per lane indication')
For -stable v5.5
('net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash')
('net/mlx5e: Fix VXLAN configuration restore after function reload')
For -stable v5.7
('net/mlx5e: CT: Fix memory leak in cleanup')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull rdma fixes from Jason Gunthorpe:
"Small update, a few more merge window bugs and normal driver bug
fixes:
- Two merge window regressions in mlx5: a error path bug found by
syzkaller and some lost code during a rework preventing ipoib from
working in some configurations
- Silence clang compilation warning in OPA related code
- Fix a long standing race condition in ib_nl for ACM
- Resolve when the HFI1 is shutdown"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx5: Set PD pointers for the error flow unwind
IB/mlx5: Fix 50G per lane indication
RDMA/siw: Fix reporting vendor_part_id
IB/sa: Resolv use-after-free in ib_nl_make_request()
IB/hfi1: Do not destroy link_wq when the device is shut down
IB/hfi1: Do not destroy hfi1_wq when the device is shut down
RDMA/mlx5: Fix legacy IPoIB QP initialization
IB/hfi1: Add explicit cast OPA_MTU_8192 to 'enum ib_mtu'
Pull kselftest fixes from Shuah Khan:
"TPM2 test changes to run on python3 and kselftest framework fix to
incorrect return type"
* tag 'linux-kselftest-fixes-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kselftest: ksft_test_num return type should be unsigned
selftests: tpm: upgrade TPM2 tests from Python 2 to Python 3
Pull io_uring fixes from Jens Axboe:
- Fix memleak for error path in registered files (Yang)
- Export CQ overflow state in flags, necessary to fix a case where
liburing doesn't know if it needs to enter the kernel (Xiaoguang)
- Fix for a regression in when user memory is accounted freed, causing
issues with back-to-back ring exit + init if the ulimit -l setting is
very tight.
* tag 'io_uring-5.8-2020-07-10' of git://git.kernel.dk/linux-block:
io_uring: account user memory freed when exit has been queued
io_uring: fix memleak in io_sqe_files_register()
io_uring: fix memleak in __io_sqe_files_update()
io_uring: export cq overflow status to userspace
Pull block fixes from Jens Axboe:
- Fix for inflight accounting, which affects only dm (Ming)
- Fix documentation error for bfq (Yufen)
- Fix memory leak for nbd (Zheng)
* tag 'block-5.8-2020-07-10' of git://git.kernel.dk/linux-block:
nbd: Fix memory leak in nbd_add_socket
blk-mq: consider non-idle request as "inflight" in blk_mq_rq_inflight()
docs: block: update and fix tiny error for bfq
We see that sometimes the CPU in GOYA and GAUDI is occupied by the
power/thermal loop and can't answer requests from the driver fast enough.
Therefore, to avoid false notifications on timeouts, increase the timeout
to 4 seconds on each message sent to the device CPU.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
For debugging purposes, we need to allow the root user better control of
the clock gating feature of the DMA and compute engines. Therefore, change
the clock gating debugfs interface to be bitmask instead of true/false.
Each bit represents a different engine, according to gaudi_engine_id enum.
See debugfs documentation for more details.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
WREG_BULK is a special packet that has a variable length. Therefore, we
can't parse it when validating CBs that go to the PCI DMA queue. In case
the user needs to use it, it can put multiple WREG32 packets instead.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Pull in-kernel read and write op cleanups from Christoph Hellwig:
"Cleanup in-kernel read and write operations
Reshuffle the (__)kernel_read and (__)kernel_write helpers, and ensure
all users of in-kernel file I/O use them if they don't use iov_iter
based methods already.
The new WARN_ONs in combination with syzcaller already found a missing
input validation in 9p. The fix should be on your way through the
maintainer ASAP".
[ This is prep-work for the real changes coming 5.9 ]
* tag 'cleanup-kernel_read_write' of git://git.infradead.org/users/hch/misc:
fs: remove __vfs_read
fs: implement kernel_read using __kernel_read
integrity/ima: switch to using __kernel_read
fs: add a __kernel_read helper
fs: remove __vfs_write
fs: implement kernel_write using __kernel_write
fs: check FMODE_WRITE in __kernel_write
fs: unexport __kernel_write
bpfilter: switch to kernel_write
autofs: switch to kernel_write
cachefiles: switch to kernel_write
Pull dma-mapping fixes from Christoph Hellwig:
- add a warning when the atomic pool is depleted (David Rientjes)
- protect the parameters of the new scatterlist helper macros (Marek
Szyprowski )
* tag 'dma-mapping-5.8-5' of git://git.infradead.org/users/hch/dma-mapping:
scatterlist: protect parameters of the sg_table related macros
dma-mapping: warn when coherent pool is depleted
Pull pin control fixes from Linus Walleij:
- Fix an issue in the AMD driver for the UART0 group
- Fix a glitch issue in the Baytrail pin controller
* tag 'pinctrl-v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: baytrail: Fix pin being driven low for a while on gpiod_get(..., GPIOD_OUT_HIGH)
pinctrl: amd: fix npins for uart0 in kerncz_groups
Pull GPIO fixes from Linus Walleij:
"Some GPIO fixes, most of them for the PCA953x that Andy worked hard to
fix up.
- Fix two runtime PM errorpath problems in the Arizona GPIO driver.
- Fix three interrupt issues in the PCA953x driver.
- Fix the automatic address increment handling in the PCA953x driver
again.
- Add a quirk to the PCA953x that fixes a problem in the Intel
Galileo Gen 2"
* tag 'gpio-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: pca953x: Fix GPIO resource leak on Intel Galileo Gen 2
gpio: pca953x: disable regmap locking for automatic address incrementing
gpio: pca953x: Fix direction setting when configure an IRQ
gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2
gpio: pca953x: Synchronize interrupt handler properly
gpio: arizona: put pm_runtime in case of failure
gpio: arizona: handle pm_runtime_get_sync failure case
USB MIDI driver has an error recovery mechanism to resubmit the URB in
the delayed timer handler, and this may race with the standard start /
stop operations. Although both start and stop operations themselves
don't race with each other due to the umidi->mutex protection, but
this isn't applied to the timer handler.
For fixing this potential race, the following changes are applied:
- Since the timer handler can't use the mutex, we apply the
umidi->disc_lock protection at each input stream URB submission;
this also needs to change the GFP flag to GFP_ATOMIC
- Add a check of the URB refcount and skip if already submitted
- Move the timer cancel call at disconnection to the beginning of the
procedure; this assures the in-flight timer handler is gone properly
before killing all pending URBs
Reported-by: syzbot+0f4ecfe6a2c322c81728@syzkaller.appspotmail.com
Reported-by: syzbot+5f1d24c49c1d2c427497@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200710160656.16819-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull gfs2 fixes from Andreas Gruenbacher:
"Fix gfs2 readahead deadlocks by adding a IOCB_NOIO flag that allows
gfs2 to use the generic fiel read iterator functions without having to
worry about being called back while holding locks".
* tag 'gfs2-v5.8-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Rework read and page fault locking
fs: Add IOCB_NOIO flag for generic_file_read_iter
Pull arm64 fixes from Will Deacon:
"An unfortunately large collection of arm64 fixes for -rc5.
Some of this is absolutely trivial, but the alternatives, vDSO and CPU
errata workaround fixes are significant. At least people are finding
and fixing these things, I suppose.
- Fix workaround for CPU erratum #1418040 to disable the compat vDSO
- Fix Oops when single-stepping with KGDB
- Fix memory attributes for hypervisor device mappings at EL2
- Fix memory leak in PSCI and remove useless variable assignment
- Fix up some comments and asm labels in our entry code
- Fix broken register table formatting in our generated html docs
- Fix missing NULL sentinel in CPU errata workaround list
- Fix patching of branches in alternative instruction sections"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/alternatives: don't patch up internal branches
arm64: Add missing sentinel to erratum_1463225
arm64: Documentation: Fix broken table in generated HTML
arm64: kgdb: Fix single-step exception handling oops
arm64: entry: Tidy up block comments and label numbers
arm64: Rework ARM_ERRATUM_1414080 handling
arm64: arch_timer: Disable the compat vdso for cores affected by ARM64_WORKAROUND_1418040
arm64: arch_timer: Allow an workaround descriptor to disable compat vdso
arm64: Introduce a way to disable the 32bit vdso
arm64: entry: Fix the typo in the comment of el1_dbg()
drivers/firmware/psci: Assign @err directly in hotplug_tests()
drivers/firmware/psci: Fix memory leakage in alloc_init_cpu_groups()
KVM: arm64: Fix definition of PAGE_HYP_DEVICE
Pull s390 fixes from Heiko Carstens:
"This is mainly due to the fact that Gerald Schaefer's and also my old
email addresses currently do not work any longer. Therefore we decided
to switch to new email addresses and reflect that in the MAINTAINERS
file.
- Update email addresses in MAINTAINERS file and add .mailmap entries
for Gerald Schaefer and Heiko Carstens.
- Fix huge pte soft dirty copying"
* tag 's390-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
MAINTAINERS: update email address for Gerald Schaefer
MAINTAINERS: update email address for Heiko Carstens
s390/mm: fix huge pte soft dirty copying
Pull vkm fixes from Paolo Bonzini:
"Two simple but important bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MIPS: Fix build errors for 32bit kernel
KVM: nVMX: fixes for preemption timer migration
Pull MMC fixes from Ulf Hansson:
- Override DLL_CONFIG only with valid values in sdhci-msm
- Get rid of of_match_ptr() macro to fix warning in owl-mmc
- Limit segments to 1 to fix meson-gx G12A/G12B SoCs
* tag 'mmc-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-msm: Override DLL_CONFIG only if the valid value is supplied
mmc: owl-mmc: Get rid of of_match_ptr() macro
mmc: meson-gx: limit segments to 1 when dram-access-quirk is needed
We currently account the memory after the exit work has been run, but
that leaves a gap where a process has closed its ring and until the
memory has been accounted as freed. If the memlocked ulimit is
borderline, then that can introduce spurious setup errors returning
-ENOMEM because the free work hasn't been run yet.
Account this as freed when we close the ring, as not to expose a tiny
gap where setting up a new ring can fail.
Fixes: 85faa7b834 ("io_uring: punt final io_ring_ctx wait-and-free to workqueue")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With the earlier patch in this series, all devices that deferred probe
due to fw_devlink_pause() will have their probes delayed till the
deferred probe thread is kicked off during late_initcall. This will also
affect all their consumers.
This delayed probing in unnecessary. So this patch just keeps track of
the devices that had their probe deferred due to fw_devlink_pause() and
attempts to probe them once during fw_devlink_resume().
Fixes: 716a7a2596 ("driver core: fw_devlink: Add support for batching fwnode parsing")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20200701194259.3337652-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current deferred probe implementation can mess up suspend/resume
ordering if deferred probe thread is kicked off in parallel with the
main initcall thread (kernel_init thread) [1].
For example:
Say device-B is a consumer of device-A.
Initcall thread Deferred probe thread
=============== =====================
1. device-A is added.
2. device-B is added.
3. dpm_list is now [device-A, device-B].
4. driver-A defers probe of device-A.
5. device-A is moved to
end of dpm_list
6. dpm_list is now
[device-B, device-A]
7. driver-B is registereed and probes device-B.
8. dpm_list stays as [device-B, device-A].
The reverse order of dpm_list is used for suspend. So in this case
device-A would incorrectly get suspended before device-B.
Commit 716a7a2596 ("driver core: fw_devlink: Add support for batching
fwnode parsing") kicked off the deferred probe thread early during boot
to run in parallel with the initcall thread and caused suspend/resume
regressions. This patch removes the parallel run of the deferred probe
thread to avoid the suspend/resume regressions.
[1] - https://lore.kernel.org/lkml/CAGETcx8W96KAw-d_siTX4qHB_-7ddk0miYRDQeHE6E0_8qx-6Q@mail.gmail.com/
Fixes: 716a7a2596 ("driver core: fw_devlink: Add support for batching fwnode parsing")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20200701194259.3337652-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The comment near to uart_port_spin_lock_init() says:
Ensure that the serial console lock is initialised early.
If this port is a console, then the spinlock is already initialised.
and there is nothing about enabled or disabled consoles. The commit
a3cb39d258 ("serial: core: Allow detach and attach serial device
for console") made a change, which follows the comment, and also to
prevent reinitialisation of the lock in use, when user detaches and
attaches back the same console device. But this change discovers
another issue, that uart_add_one_port() tries to access a spin lock
that now may be uninitialised. This happens when a driver expects
the serial core to register a console on its behalf. In this case
we must initialise a spin lock before use.
Fixes: a3cb39d258 ("serial: core: Allow detach and attach serial device for console")
Reported-by: Marc Zyngier <maz@kernel.org>
Reported-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20200706214903.56148-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Until this commit the mainline kernel version (this version) of the
vboxguest module contained a bug where it defined
VBGL_IOCTL_VMMDEV_REQUEST_BIG and VBGL_IOCTL_LOG using
_IOC(_IOC_READ | _IOC_WRITE, 'V', ...) instead of
_IO(V, ...) as the out of tree VirtualBox upstream version does.
Since the VirtualBox userspace bits are always built against VirtualBox
upstream's headers, this means that so far the mainline kernel version
of the vboxguest module has been failing these 2 ioctls with -ENOTTY.
I guess that VBGL_IOCTL_VMMDEV_REQUEST_BIG is never used causing us to
not hit that one and sofar the vboxguest driver has failed to actually
log any log messages passed it through VBGL_IOCTL_LOG.
This commit changes the VBGL_IOCTL_VMMDEV_REQUEST_BIG and VBGL_IOCTL_LOG
defines to match the out of tree VirtualBox upstream vboxguest version,
while keeping compatibility with the old wrong request defines so as
to not break the kernel ABI in case someone has been using the old
request defines.
Fixes: f6ddd094f5 ("virt: Add vboxguest driver for Virtual Box Guest integration UAPI")
Cc: stable@vger.kernel.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200709120858.63928-2-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Felipe writes:
USB: fixes for v5.8-rc3
Adding support for recent Intel devices (Tiger Lake and Jasper Lake)
on dwc3. We have some endianess fixes in cdns3, a memleak fix in
gr_udc and lock API usage fix in the legacy f_uac1
Signed-off-by: Felipe Balbi <balbi@kernel.org>
* tag 'fixes-for-v5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb:
usb: gadget: function: fix missing spinlock in f_uac1_legacy
usb: gadget: udc: atmel: fix uninitialized read in debug printk
usb: gadget: udc: atmel: remove outdated comment in usba_ep_disable()
usb: dwc2: Fix shutdown callback in platform
usb: cdns3: trace: fix some endian issues
usb: cdns3: ep0: fix some endian issues
usb: gadget: udc: gr_udc: fix memleak on error handling path in gr_ep_init()
usb: gadget: fix langid kernel-doc warning in usbstring.c
usb: dwc3: pci: add support for the Intel Jasper Lake
usb: dwc3: pci: add support for the Intel Tiger Lake PCH -H variant
Commit dc6d95b153 ("KVM: MIPS: Add more MMIO load/store
instructions emulation") introduced some 64bit load/store instructions
emulation which are unavailable on 32bit platform, and it causes build
errors:
arch/mips/kvm/emulate.c: In function 'kvm_mips_emulate_store':
arch/mips/kvm/emulate.c:1734:6: error: right shift count >= width of type [-Werror]
((vcpu->arch.gprs[rt] >> 56) & 0xff);
^
arch/mips/kvm/emulate.c:1738:6: error: right shift count >= width of type [-Werror]
((vcpu->arch.gprs[rt] >> 48) & 0xffff);
^
arch/mips/kvm/emulate.c:1742:6: error: right shift count >= width of type [-Werror]
((vcpu->arch.gprs[rt] >> 40) & 0xffffff);
^
arch/mips/kvm/emulate.c:1746:6: error: right shift count >= width of type [-Werror]
((vcpu->arch.gprs[rt] >> 32) & 0xffffffff);
^
arch/mips/kvm/emulate.c:1796:6: error: left shift count >= width of type [-Werror]
(vcpu->arch.gprs[rt] << 32);
^
arch/mips/kvm/emulate.c:1800:6: error: left shift count >= width of type [-Werror]
(vcpu->arch.gprs[rt] << 40);
^
arch/mips/kvm/emulate.c:1804:6: error: left shift count >= width of type [-Werror]
(vcpu->arch.gprs[rt] << 48);
^
arch/mips/kvm/emulate.c:1808:6: error: left shift count >= width of type [-Werror]
(vcpu->arch.gprs[rt] << 56);
^
cc1: all warnings being treated as errors
make[3]: *** [arch/mips/kvm/emulate.o] Error 1
So, use #if defined(CONFIG_64BIT) && defined(CONFIG_KVM_MIPS_VZ) to
guard the 64bit load/store instructions emulation.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: dc6d95b153 ("KVM: MIPS: Add more MMIO load/store instructions emulation")
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Message-Id: <1594365797-536-1-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 850448f35a ("KVM: nVMX: Fix VMX preemption timer migration",
2020-06-01) accidentally broke nVMX live migration from older version
by changing the userspace ABI. Restore it and, while at it, ensure
that vmx->nested.has_preemption_timer_deadline is always initialized
according to the KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE flag.
Cc: Makarand Sonare <makarandsonare@google.com>
Fixes: 850448f35a ("KVM: nVMX: Fix VMX preemption timer migration")
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There exists a sleep-while-atomic bug while accessing the dmabuf->name
under mutex in the dmabuffs_dname(). This is caused from the SELinux
permissions checks on a process where it tries to validate the inherited
files from fork() by traversing them through iterate_fd() (which
traverse files under spin_lock) and call
match_file(security/selinux/hooks.c) where the permission checks happen.
This audit information is logged using dump_common_audit_data() where it
calls d_path() to get the file path name. If the file check happen on
the dmabuf's fd, then it ends up in ->dmabuffs_dname() and use mutex to
access dmabuf->name. The flow will be like below:
flush_unauthorized_files()
iterate_fd()
spin_lock() --> Start of the atomic section.
match_file()
file_has_perm()
avc_has_perm()
avc_audit()
slow_avc_audit()
common_lsm_audit()
dump_common_audit_data()
audit_log_d_path()
d_path()
dmabuffs_dname()
mutex_lock()--> Sleep while atomic.
Call trace captured (on 4.19 kernels) is below:
___might_sleep+0x204/0x208
__might_sleep+0x50/0x88
__mutex_lock_common+0x5c/0x1068
__mutex_lock_common+0x5c/0x1068
mutex_lock_nested+0x40/0x50
dmabuffs_dname+0xa0/0x170
d_path+0x84/0x290
audit_log_d_path+0x74/0x130
common_lsm_audit+0x334/0x6e8
slow_avc_audit+0xb8/0xf8
avc_has_perm+0x154/0x218
file_has_perm+0x70/0x180
match_file+0x60/0x78
iterate_fd+0x128/0x168
selinux_bprm_committing_creds+0x178/0x248
security_bprm_committing_creds+0x30/0x48
install_exec_creds+0x1c/0x68
load_elf_binary+0x3a4/0x14e0
search_binary_handler+0xb0/0x1e0
So, use spinlock to access dmabuf->name to avoid sleep-while-atomic.
Cc: <stable@vger.kernel.org> [5.3+]
Signed-off-by: Charan Teja Kalla <charante@codeaurora.org>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
[sumits: added comment to spinlock_t definition to avoid warning]
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/a83e7f0d-4e54-9848-4b58-e1acdbe06735@codeaurora.org
clang static analysis flags this error
c67x00-sched.c:489:55: warning: Use of memory after it is freed [unix.Malloc]
usb_hcd_giveback_urb(c67x00_hcd_to_hcd(c67x00), urb, urbp->status);
^~~~~~~~~~~~
Problem happens in this block of code
c67x00_release_urb(c67x00, urb);
usb_hcd_unlink_urb_from_ep(c67x00_hcd_to_hcd(c67x00), urb);
spin_unlock(&c67x00->lock);
usb_hcd_giveback_urb(c67x00_hcd_to_hcd(c67x00), urb, urbp->status);
In the call to c67x00_release_urb has this freeing of urbp
urbp = urb->hcpriv;
urb->hcpriv = NULL;
list_del(&urbp->hep_node);
kfree(urbp);
And so urbp is freed before usb_hcd_giveback_urb uses it as its 3rd
parameter.
Since all is required is the status, pass the status directly as is
done in c64x00_urb_dequeue
Fixes: e9b29ffc51 ("USB: add Cypress c67x00 OTG controller HCD driver")
Signed-off-by: Tom Rix <trix@redhat.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200708131243.24336-1-trix@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The constant arrays in gdb_xml.h are only used in arch/riscv/kernel/kgdb.c,
but other c files may include the gdb_xml.h indirectly via including the
kgdb.h. Hence, It will cause many unused-const-variable warnings. This
patch makes the kgdb.h not to include the gdb_xml.h to solve this problem.
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Currently, only riscv kgdb.c uses the kgdb_has_hit_break() to identify
the kgdb breakpoint. It causes other architectures will encounter the "no
previous prototype" warnings if the compile option has W=1. Moving the
declaration of extern kgdb_has_hit_break() from risc-v kgdb.h to generic
kgdb.h to avoid generating these warnings.
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Some functions are only used in the kgdb.c file. Add static properities
to these functions to avoid "no previous prototype" compile warnings
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Due to lack of hardware breakpoint support, the kernel option
CONFIG_STRICT_KERNEL_RWX should be disabled when using KGDB. However,
CONFIG_STRICT_KERNEL_RWX is always enabled now. Therefore, select
ARCH_OPTIONAL_KERNEL_RWX_DEFAULT to enable CONFIG_STRICT_KERNEL_RWX
by default, and then select ARCH_OPTIONAL_KERNEL_RWX to enable the
Kconfig prompt of CONFIG_STRICT_KERNEL_RWX so that users can turn it off.
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The XML packet could be supported by required architecture if the
architecture defines CONFIG_HAVE_ARCH_KGDB_QXFER_PKT and implement its own
kgdb_arch_handle_qxfer_pkt(). Except for the kgdb_arch_handle_qxfer_pkt(),
the architecture also needs to record the feature supported by gdb stub
into the kgdb_arch_gdb_stub_feature, and these features will be reported
to host gdb when gdb stub receives the qSupported packet.
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
On ILP32, 64-bit result was shifted by value calculated for 32-bit long type
and returned value was much outside hashmap capacity.
As advised by Andrii Nakryiko, this patch uses different hashing variant for
architectures with size_t shorter than long long.
Fixes: e3b9242240 ("libbpf: add resizable non-thread safe internal hashmap")
Signed-off-by: Jakub Bogusz <qboosh@pld-linux.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200709225723.1069937-1-andriin@fb.com
CT entries are deleted via a workqueue from netfilter. If removing the
module before that, the rules are cleaned by the driver itself, but the
memory entries for them are not freed. Fix that.
Fixes: ac991b48d4 ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Device unit for port buffers size, xoff_threshold and xon_threshold is
cells. Fix a bug in driver where cell unit size was hard-coded to
128 bytes. This hard-coded value is buggy, as it is wrong for some hardware
versions.
Driver to read cell size from SBCAM register and translate bytes to cell
units accordingly.
In order to fix the bug, this patch exposes SBCAM (Shared buffer
capabilities mask) layout and defines.
If SBCAM.cap_cell_size is valid, use it for all bytes to cells
calculations. If not valid, fallback to 128.
Cell size do not change on the fly per device. Instead of issuing SBCAM
access reg command every time such translation is needed, cache it in
mlx5e_dcbx as part of mlx5e_dcbnl_initialize(). Pass dcbx.port_buff_cell_sz
as a param to every function that needs bytes to cells translation.
While fixing the bug, move MLX5E_BUFFER_CELL_SHIFT macro to
en_dcbnl.c, as it is only used by that file.
Fixes: 0696d60853 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Some released FW versions mistakenly don't set the capability that 50G
per lane link-modes are supported for VFs (ptys_extended_ethernet
capability bit). When the capability is unset, read
PTYS.ext_eth_proto_capability (always reliable).
If PTYS.ext_eth_proto_capability is valid (has a non-zero value)
conclude that the HCA supports 50G per lane. Otherwise, conclude that
the HCA doesn't support 50G per lane.
Fixes: a08b4ed137 ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When detaching netdev, remove vxlan port configuration using
udp_tunnel_drop_rx_info. During function reload, configuration will be
restored using udp_tunnel_get_rx_info. This ensures sync between
firmware and driver. Use udp_tunnel_get_rx_info even if its physical
interface is down.
Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In mlx5e_configure_flower() flow pointer is protected by rcu read lock.
However, after cited commit the pointer is being used outside of rcu read
block. Extend the block to protect all pointer accesses.
Fixes: 553f932838 ("net/mlx5e: Support tc block sharing for representors")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Fix eeprom SFP query support by setting i2c_addr, offset and page number
correctly. Unlike QSFP modules, SFP eeprom params are as follow:
- i2c_addr is 0x50 for offset 0 - 255 and 0x51 for offset 256 - 511.
- Page number is always zero.
- Page offset is always relative to zero.
As part of eeprom query, query the module ID (SFP / QSFP*) via helper
function to set the params accordingly.
In addition, change mlx5_qsfp_eeprom_page() input type to be u16 to avoid
unnecessary casting.
Fixes: a708fb7b1f ("net/mlx5e: ethtool, Add support for EEPROM high pages query")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Pull drm fixes from Dave Airlie:
"I've been off most of the week, but some fixes have piled up. Seems a
bit busier than last week, but they are pretty spread out across a
bunch of drivers, none of them seem that big or worried me too much.
amdgpu:
- Fix a suspend/resume issue with PSP
- Backlight fix for Renoir
- Fix for gpu recovery debugging
radeon:
- Fix a double free in error path
i915:
- fbc fencing fix
- debugfs panic fix
- gem vma constuction fix
- gem pin under vm->nutex fix
nouveau:
- SVM fixes
- display fixes
meson:
- OSD burst length fixes
hibmc:
- runtime warning fix
mediatek:
- cmdq, mmsys fixes
- visibility check fixes"
* tag 'drm-fixes-2020-07-10' of git://anongit.freedesktop.org/drm/drm: (24 commits)
drm/amdgpu: don't do soft recovery if gpu_recovery=0
drm/radeon: fix double free
drm/amd/display: add dmcub check on RENOIR
drm/amdgpu: add TMR destory function for psp
drm/amdgpu: asd function needs to be unloaded in suspend phase
drm/hisilicon/hibmc: Move drm_fbdev_generic_setup() down to avoid the splat
drm/nouveau/nouveau: fix page fault on device private memory
drm/nouveau/svm: fix migrate page regression
drm/nouveau/i2c/g94-: increase NV_PMGR_DP_AUXCTL_TRANSACTREQ timeout
drm/nouveau/kms/nv50-: bail from nv50_audio_disable() early if audio not enabled
drm/i915/gt: Pin the rings before marking active
drm/i915: Also drop vm.ref along error paths for vma construction
drm/i915: Drop vm.ref for duplicate vma on construction
drm/i915/fbc: Fix fence_y_offset handling
drm/i915: Skip stale object handle for debugfs per-file-stats
drm/mediatek: mtk_hdmi: Remove debug messages for function calls
drm/mediatek: mtk_mt8173_hdmi_phy: Remove unnused const variables
drm/mediatek: Delete not used of_device_get_match_data
drm/mediatek: Remove unnecessary conversion to bool
drm/meson: viu: fix setting the OSD burst length in VIU_OSD1_FIFO_CTRL_STAT
...
While raising the gcc version requirement to 4.9, the compile-time check
in the unroll macro was accidentally changed from being used on gcc and
clang to being used on clang only.
Restore the gcc check, changing it from "gcc >= 4.7" to "all gcc".
[ We should probably remove this all entirely: if we remove the check
for CLANG, then the check for GCC can go away. Older versions of clang
are not really appropriate or supported for kernel builds - Linus ]
Fixes: 6ec4476ac8 ("Raise gcc version requirement to 4.9")
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.eti.br>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In order for no_refcnt and is_data to be the lowest order two
bits in the 'val' we have to pad out the bitfield of the u8.
Fixes: ad0f75e5f5 ("cgroup: fix cgroup_sk_alloc() for sk_clone_lock()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sockmap tests which rely on old bpf_prog_dispatch behaviour.
In the first case, the tests check that detaching without giving
a program succeeds. Since these are not the desired semantics,
invert the condition. In the second case, the clean up code doesn't
supply the necessary program fds.
Fixes: bb0de3131f ("bpf: sockmap: Require attach_bpf_fd when detaching a program")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20200709115151.75829-1-lmb@cloudflare.com
Pull device mapper fixes from Mike Snitzer:
- A request-based DM fix to not use a waitqueue to wait for blk-mq IO
completion because doing so is racey.
- A couple more DM zoned target fixes to address issues introduced
during the 5.8 cycle.
- A DM core fix to use proper interface to cleanup DM's static flush
bio.
- A DM core fix to prevent mm recursion during memory allocation needed
by dm_kobject_uevent.
* tag 'for-5.8/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: use noio when sending kobject event
dm zoned: Fix zone reclaim trigger
dm zoned: fix unused but set variable warnings
dm writecache: reject asynchronous pmem devices
dm: use bio_uninit instead of bio_disassociate_blkg
dm: do not use waitqueue for request-based DM
Pull kallsyms fix from Kees Cook:
"Refactor kallsyms_show_value() users for correct cred.
I'm not delighted by the timing of getting these changes to you, but
it does fix a handful of kernel address exposures, and no one has
screamed yet at the patches.
Several users of kallsyms_show_value() were performing checks not
during "open". Refactor everything needed to gain proper checks
against file->f_cred for modules, kprobes, and bpf"
* tag 'kallsyms_show_value-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
selftests: kmod: Add module address visibility test
bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()
kprobes: Do not expose probe addresses to non-CAP_SYSLOG
module: Do not expose section addresses to non-CAP_SYSLOG
module: Refactor section attr into bin attribute
kallsyms: Refactor kallsyms_show_value() to take cred
bpf_sk_reuseport_detach is currently called when sk->sk_user_data
is not NULL. It is incorrect because sk->sk_user_data may not be
managed by the bpf's reuseport_array. It has been reported in [1] that,
the bpf_sk_reuseport_detach() which is called from udp_lib_unhash() has
corrupted the sk_user_data managed by l2tp.
This patch solves it by using another bit (defined as SK_USER_DATA_BPF)
of the sk_user_data pointer value. It marks that a sk_user_data is
managed/owned by BPF.
The patch depends on a PTRMASK introduced in
commit f1ff5ce2cd ("net, sk_msg: Clear sk_user_data pointer on clone if tagged").
[ Note: sk->sk_user_data is used by bpf's reuseport_array only when a sk is
added to the bpf's reuseport_array.
i.e. doing setsockopt(SO_REUSEPORT) and having "sk->sk_reuseport == 1"
alone will not stop sk->sk_user_data being used by other means. ]
[1]: https://lore.kernel.org/netdev/20200706121259.GA20199@katalix.com/
Fixes: 5dc4c4b7d4 ("bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY")
Reported-by: James Chapman <jchapman@katalix.com>
Reported-by: syzbot+9f092552ba9a5efca5df@syzkaller.appspotmail.com
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: James Chapman <jchapman@katalix.com>
Acked-by: James Chapman <jchapman@katalix.com>
Link: https://lore.kernel.org/bpf/20200709061110.4019316-1-kafai@fb.com
It makes little sense for copying sk_user_data of reuseport_array during
sk_clone_lock(). This patch reuses the SK_USER_DATA_NOCOPY bit introduced in
commit f1ff5ce2cd ("net, sk_msg: Clear sk_user_data pointer on clone if tagged").
It is used to mark the sk_user_data is not supposed to be copied to its clone.
Although the cloned sk's sk_user_data will not be used/freed in
bpf_sk_reuseport_detach(), this change can still allow the cloned
sk's sk_user_data to be used by some other means.
Freeing the reuseport_array's sk_user_data does not require a rcu grace
period. Thus, the existing rcu_assign_sk_user_data_nocopy() is not
used.
Fixes: 5dc4c4b7d4 ("bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20200709061104.4018798-1-kafai@fb.com
If the genlmsg_put() call in ethnl_default_dumpit() fails, we bail out
without checking if we already have some messages in current skb like we do
with ethnl_default_dump_one() failure later. Therefore if existing messages
almost fill up the buffer so that there is not enough space even for
netlink and genetlink header, we lose all prepared messages and return and
error.
Rather than duplicating the skb->len check, move the genlmsg_put(),
genlmsg_cancel() and genlmsg_end() calls into ethnl_default_dump_one().
This is also more logical as all message composition will be in
ethnl_default_dump_one() and only iteration logic will be left in
ethnl_default_dumpit().
Fixes: 728480f124 ("ethtool: default handlers for GET requests")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tcf_block_get() fails inside atm_tc_init(),
atm_tc_put() is called to release the qdisc p->link.q.
But the flow->ref prevents it to do so, as the flow->ref
is still zero.
Fix this by moving the p->link.ref initialization before
tcf_block_get().
Fixes: 6529eaba33 ("net: sched: introduce tcf block infractructure")
Reported-and-tested-by: syzbot+d411cff6ab29cc2c311b@syzkaller.appspotmail.com
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NVM config file address will be modified when the MBI image is upgraded.
Driver would return stale config values if user reads the nvm-config
(via ethtool -d) in this state. The fix is to re-populate nvm attribute
info while reading the nvm config values/partition.
Changes from previous version:
-------------------------------
v3: Corrected the formatting in 'Fixes' tag.
v2: Added 'Fixes' tag.
Fixes: 1ac4329a1c ("qed: Add configuration information to register dump and debug data")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
clang static analysis flags this error
drivers/gpu/drm/radeon/ci_dpm.c:5652:9: warning: Use of memory after it is freed [unix.Malloc]
kfree(rdev->pm.dpm.ps[i].ps_priv);
^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/radeon/ci_dpm.c:5654:2: warning: Attempt to free released memory [unix.Malloc]
kfree(rdev->pm.dpm.ps);
^~~~~~~~~~~~~~~~~~~~~~
problem is reported in ci_dpm_fini, with these code blocks.
for (i = 0; i < rdev->pm.dpm.num_ps; i++) {
kfree(rdev->pm.dpm.ps[i].ps_priv);
}
kfree(rdev->pm.dpm.ps);
The first free happens in ci_parse_power_table where it cleans up locally
on a failure. ci_dpm_fini also does a cleanup.
ret = ci_parse_power_table(rdev);
if (ret) {
ci_dpm_fini(rdev);
return ret;
}
So remove the cleanup in ci_parse_power_table and
move the num_ps calculation to inside the loop so ci_dpm_fini
will know how many array elements to free.
Fixes: cc8dbbb4f6 ("drm/radeon: add dpm support for CI dGPUs (v2)")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
TMR is required to be destoried with GFX_CMD_ID_DESTROY_TMR while the
system goes to suspend. Otherwise, PSP may return the failure state
(0xFFFF007) on Gfx-2-PSP command GFX_CMD_ID_SETUP_TMR after do multiple
times suspend/resume.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
btrfs implements the iter_write op and thus can use the more efficient
iov_iter based splice implementation. For now falling back to the less
efficient default is pretty harmless, but I have a pending series that
removes the default, and thus would cause btrfs to not support splice
at all.
Reported-by: Andy Lavr <andy.lavr@gmail.com>
Tested-by: Andy Lavr <andy.lavr@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While debugging a patch that I wrote I was hitting use-after-free panics
when accessing block groups on unmount. This turned out to be because
in the nocow case if we bail out of doing the nocow for whatever reason
we need to call btrfs_dec_nocow_writers() if we called the inc. This
puts our block group, but a few error cases does
if (nocow) {
btrfs_dec_nocow_writers();
goto error;
}
unfortunately, error is
error:
if (nocow)
btrfs_dec_nocow_writers();
so we get a double put on our block group. Fix this by dropping the
error cases calling of btrfs_dec_nocow_writers(), as it's handled at the
error label now.
Fixes: 762bf09893 ("btrfs: improve error handling in run_delalloc_nocow")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Issue:
When PEC is enabled, binding adm1272 to the adm1275 would
fail due to PEC error. See below:
adm1275: probe of xxxx failed with error -74
Diagnosis:
Per the datasheet of adm1272, adm1278, adm1293 and amd1294,
PMON_CONFIG (0xd4) is 16bits wide. On the other hand,
PMON_CONFIG (0xd4) for adm1275 is 8bits wide. The driver should not
assume everything is 8bits wide and read only 8bits from it.
Solution:
If it is adm1272, adm1278, adm1293 and adm1294, use i2c_read_word.
Else, use i2c_read_byte
Testing:
Binding adm1272 to the driver.
The change is only tested on adm1272.
Signed-off-by: Chu Lin <linchuyuan@google.com>
Link: https://lore.kernel.org/r/20200709040612.3977094-1-linchuyuan@google.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Commit f7b93d4294 ("arm64/alternatives: use subsections for replacement
sequences") moved the alternatives replacement sequences into subsections,
in order to keep the as close as possible to the code that they replace.
Unfortunately, this broke the logic in branch_insn_requires_update,
which assumed that any branch into kernel executable code was a branch
that required updating, which is no longer the case now that the code
sequences that are patched in are in the same section as the patch site
itself.
So the only way to discriminate branches that require updating and ones
that don't is to check whether the branch targets the replacement sequence
itself, and so we can drop the call to kernel_text_address() entirely.
Fixes: f7b93d4294 ("arm64/alternatives: use subsections for replacement sequences")
Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20200709125953.30918-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
When a timer is enqueued with a negative delta (ie: expiry is below
base->clk), it gets added to the wheel as expiring now (base->clk).
Yet the value that gets stored in base->next_expiry, while calling
trigger_dyntick_cpu(), is the initial timer->expires value. The
resulting state becomes:
base->next_expiry < base->clk
On the next timer enqueue, forward_timer_base() may accidentally
rewind base->clk. As a possible outcome, timers may expire way too
early, the worst case being that the highest wheel levels get spuriously
processed again.
To prevent from that, make sure that base->next_expiry doesn't get below
base->clk.
Fixes: a683f390b9 ("timers: Forward the wheel clock whenever possible")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200703010657.2302-1-frederic@kernel.org
This reverts commit 5435f73d5c, which is no longer needed now
that the minimum GCC version has been bumped to v4.9
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Commit
bf67fad19e ("efi: Use more granular check for availability for variable services")
introduced a check into the efivarfs, efi-pstore and other drivers that
aborts loading of the module if not all three variable runtime services
(GetVariable, SetVariable and GetNextVariable) are supported. However, this
results in efivarfs being unavailable entirely if only SetVariable support
is missing, which is only needed if you want to make any modifications.
Also, efi-pstore and the sysfs EFI variable interface could be backed by
another implementation of the 'efivars' abstraction, in which case it is
completely irrelevant which services are supported by the EFI firmware.
So make the generic 'efivars' abstraction dependent on the availibility of
the GetVariable and GetNextVariable EFI runtime services, and add a helper
'efivar_supports_writes()' to find out whether the currently active efivars
abstraction supports writes (and wire it up to the availability of
SetVariable for the generic one).
Then, use the efivar_supports_writes() helper to decide whether to permit
efivarfs to be mounted read-write, and whether to enable efi-pstore or the
sysfs EFI variable interface altogether.
Fixes: bf67fad19e ("efi: Use more granular check for availability for variable services")
Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Tested-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Add a missing spinlock protection for play_queue, because
the play_queue may be destroyed when the "playback_work"
work func and "f_audio_out_ep_complete" callback func
operate this paly_queue at the same time.
Fixes: c6994e6f06 ("USB: gadget: add USB Audio Gadget driver")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Zhang Qiang <qiang.zhang@windriver.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Fixed commit moved the assignment of 'req', but did not update a
reference in the DBG() call. Use the argument as it was renamed.
Fixes: 5fb694f96e ("usb: gadget: udc: atmel: fix possible oops when unloading module")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Fixed commit removed the offending behaviour from the driver, but missed
the comment and associated test. Remove them now.
Fixes: 38e58986e6 ("usb: gadget: udc: atmel: don't disable enpdoints we don't own")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
To avoid lot of interrupts from dwc2 core, which can be asserted in
specific conditions need to disable interrupts on HW level instead of
disable IRQs on Kernel level, because of IRQ can be shared between
drivers.
Cc: stable@vger.kernel.org
Fixes: a40a00318c ("usb: dwc2: add shutdown callback to platform variant")
Tested-by: Frank Mori Hess <fmh6jj@gmail.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Reviewed-by: Frank Mori Hess <fmh6jj@gmail.com>
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
gr_ep_init() does not assign the allocated request anywhere if allocation
of memory for the buffer fails. This is a memory leak fixed by the given
patch.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Fix spelling of the 'langid' function argument in the kernel-doc
notation to quieten a kernel-doc warning.
../drivers/usb/gadget/usbstring.c:77: warning: Function parameter or member 'langid' not described in 'usb_validate_langid'
../drivers/usb/gadget/usbstring.c:77: warning: Excess function parameter 'lang' description in 'usb_validate_langid'
Fixes: 17309a6a43 ("usb: gadget: add "usb_validate_langid" function")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Tao Ren <rentao.bupt@gmail.com>
Cc: Tao Ren <rentao.bupt@gmail.com>
Cc: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
The following backtrace is seen when running aspeed G5 kernels.
WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_fb_helper.c:2233 drm_fbdev_generic_setup+0x138/0x198
aspeed_gfx 1e6e6000.display: Device has not been registered.
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3 #1
Hardware name: Generic DT based system
Backtrace:
[<8010d6d0>] (dump_backtrace) from [<8010d9b8>] (show_stack+0x20/0x24)
r7:00000009 r6:60000153 r5:00000000 r4:8119fa94
[<8010d998>] (show_stack) from [<80b8cb98>] (dump_stack+0xcc/0xec)
[<80b8cacc>] (dump_stack) from [<80123ef0>] (__warn+0xd8/0xfc)
r7:00000009 r6:80e62ed0 r5:00000000 r4:974c3ccc
[<80123e18>] (__warn) from [<80123f98>] (warn_slowpath_fmt+0x84/0xc4)
r9:00000009 r8:806a0140 r7:000008b9 r6:80e62ed0 r5:80e631f8 r4:974c2000
[<80123f18>] (warn_slowpath_fmt) from [<806a0140>] (drm_fbdev_generic_setup+0x138/0x198)
r9:00000001 r8:9758fc10 r7:9758fc00 r6:00000000 r5:00000020 r4:9768a000
[<806a0008>] (drm_fbdev_generic_setup) from [<806d4558>] (aspeed_gfx_probe+0x204/0x32c)
r7:9758fc00 r6:00000000 r5:00000000 r4:9768a000
[<806d4354>] (aspeed_gfx_probe) from [<806dfca0>] (platform_drv_probe+0x58/0xa8)
Since commit 1aed9509b2 ("drm/fb-helper: Remove return value from
drm_fbdev_generic_setup()"), drm_fbdev_generic_setup() must be called
after drm_dev_register() to avoid the warning. Do that.
Fixes: 1aed9509b2 ("drm/fb-helper: Remove return value from drm_fbdev_generic_setup()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701001002.74997-1-linux@roeck-us.net
The prototype of the functions handle_kernel_image & efi_enter_kernel
are defined in efi-stub.c which may result in a compiler warnings if
-Wmissing-prototypes is set in gcc compiler.
Move the prototype to efistub.h to make the compiler happy.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Link: https://lore.kernel.org/r/20200706172609.25965-2-atish.patra@wdc.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Since commit 799c434154 ("kbuild: thin archives make default for
all archs"), core-y is passed to the linker with --whole-archive.
Hence, the whole of stub library is linked to vmlinux.
Use libs-y so that lib.a is passed after --no-whole-archive for
conditional linking.
The unused drivers/firmware/efi/libstub/relocate.o will be dropped
for ARCH=arm64.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20200604022031.164207-1-masahiroy@kernel.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
For those applications which are not willing to use io_uring_enter()
to reap and handle cqes, they may completely rely on liburing's
io_uring_peek_cqe(), but if cq ring has overflowed, currently because
io_uring_peek_cqe() is not aware of this overflow, it won't enter
kernel to flush cqes, below test program can reveal this bug:
static void test_cq_overflow(struct io_uring *ring)
{
struct io_uring_cqe *cqe;
struct io_uring_sqe *sqe;
int issued = 0;
int ret = 0;
do {
sqe = io_uring_get_sqe(ring);
if (!sqe) {
fprintf(stderr, "get sqe failed\n");
break;;
}
ret = io_uring_submit(ring);
if (ret <= 0) {
if (ret != -EBUSY)
fprintf(stderr, "sqe submit failed: %d\n", ret);
break;
}
issued++;
} while (ret > 0);
assert(ret == -EBUSY);
printf("issued requests: %d\n", issued);
while (issued) {
ret = io_uring_peek_cqe(ring, &cqe);
if (ret) {
if (ret != -EAGAIN) {
fprintf(stderr, "peek completion failed: %s\n",
strerror(ret));
break;
}
printf("left requets: %d\n", issued);
continue;
}
io_uring_cqe_seen(ring, cqe);
issued--;
printf("left requets: %d\n", issued);
}
}
int main(int argc, char *argv[])
{
int ret;
struct io_uring ring;
ret = io_uring_queue_init(16, &ring, 0);
if (ret) {
fprintf(stderr, "ring setup failed: %d\n", ret);
return 1;
}
test_cq_overflow(&ring);
return 0;
}
To fix this issue, export cq overflow status to userspace by adding new
IORING_SQ_CQ_OVERFLOW flag, then helper functions() in liburing, such as
io_uring_peek_cqe, can be aware of this cq overflow and do flush accordingly.
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
As of commit 8c0637e950 ("keys: Make the KEY_NEED_* perms an enum rather
than a mask") lookup_user_key() needs an explicit declaration of what it
wants to do with the key. Add KEY_NEED_SEARCH to fix a warning with the
below signature, and fixes the inability to retrieve a key.
WARNING: CPU: 15 PID: 6276 at security/keys/permission.c:35 key_task_permission+0xd3/0x140
[..]
RIP: 0010:key_task_permission+0xd3/0x140
[..]
Call Trace:
lookup_user_key+0xeb/0x6b0
? vsscanf+0x3df/0x840
? key_validate+0x50/0x50
? key_default_cmp+0x20/0x20
nvdimm_get_user_key_payload.part.0+0x21/0x110 [libnvdimm]
nvdimm_security_store+0x67d/0xb20 [libnvdimm]
security_store+0x67/0x1a0 [libnvdimm]
kernfs_fop_write+0xcf/0x1c0
vfs_write+0xde/0x1d0
ksys_write+0x68/0xe0
do_syscall_64+0x5c/0xa0
entry_SYSCALL_64_after_hwframe+0x49/0xb3
Fixes: 8c0637e950 ("keys: Make the KEY_NEED_* perms an enum rather than a mask")
Suggested-by: David Howells <dhowells@redhat.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/159297332630.1304143.237026690015653759.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Some released FW versions mistakenly don't set the capability that 50G per
lane link-modes are supported for VFs (ptys_extended_ethernet capability
bit).
Use PTYS.ext_eth_proto_capability instead, as this indication is always
accurate. If PTYS.ext_eth_proto_capability is valid
(has a non-zero value) conclude that the HCA supports 50G per lane.
Otherwise, conclude that the HCA doesn't support 50G per lane.
Fixes: 08e8676f16 ("IB/mlx5: Add support for 50Gbps per lane link modes")
Link: https://lore.kernel.org/r/20200707110612.882962-3-leon@kernel.org
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Make sure we don't regress the CAP_SYSLOG behavior of the module address
visibility via /proc/modules nor /sys/module/*/sections/*.
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
When evaluating access control over kallsyms visibility, credentials at
open() time need to be used, not the "current" creds (though in BPF's
case, this has likely always been the same). Plumb access to associated
file->f_cred down through bpf_dump_raw_ok() and its callers now that
kallsysm_show_value() has been refactored to take struct cred.
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 7105e828c0 ("bpf: allow for correlation of maps and helpers in dump")
Signed-off-by: Kees Cook <keescook@chromium.org>
The kprobe show() functions were using "current"'s creds instead
of the file opener's creds for kallsyms visibility. Fix to use
seq_file->file->f_cred.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 81365a947d ("kprobes: Show address of kprobes if kallsyms does")
Fixes: ffb9bd68eb ("kprobes: Show blacklist addresses as same as kallsyms does")
Signed-off-by: Kees Cook <keescook@chromium.org>
In order to gain access to the open file's f_cred for kallsym visibility
permission checks, refactor the module section attributes to use the
bin_attribute instead of attribute interface. Additionally removes the
redundant "name" struct member.
Cc: stable@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
In order to perform future tests against the cred saved during open(),
switch kallsyms_show_value() to operate on a cred, and have all current
callers pass current_cred(). This makes it very obvious where callers
are checking the wrong credential in their "read" contexts. These will
be fixed in the coming patches.
Additionally switch return value to bool, since it is always used as a
direct permission check, not a 0-on-success, negative-on-error style
function return.
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Convert all-mask IP address to Big Endian, instead, for comparison.
Fixes: f286dd8eaa ("cxgb4: use correct type for all-mask IP address comparison")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A scenario has been observed where a 'bc_init' message for a link is not
retransmitted if it fails to be received by the peer. This leads to the
peer never establishing the link fully and it discarding all other data
received on the link. In this scenario the message is lost in transit to
the peer.
The issue is traced to the 'nxt_retr' field of the skb not being
initialised for links that aren't a bc_sndlink. This leads to the
comparison in tipc_link_advance_transmq() that gates whether to attempt
retransmission of a message performing in an undesirable way.
Depending on the relative value of 'jiffies', this comparison:
time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr)
may return true or false given that 'nxt_retr' remains at the
uninitialised value of 0 for non bc_sndlinks.
This is most noticeable shortly after boot when jiffies is initialised
to a high value (to flush out rollover bugs) and we compare a jiffies of,
say, 4294940189 to zero. In that case time_before returns 'true' leading
to the skb not being retransmitted.
The fix is to ensure that all skbs have a valid 'nxt_retr' time set for
them and this is achieved by refactoring the setting of this value into
a central function.
With this fix, transmission losses of 'bc_init' messages do not stall
the link establishment forever because the 'bc_init' message is
retransmitted and the link eventually establishes correctly.
Fixes: 382f598fb6 ("tipc: reduce duplicate packets for unicast traffic")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the tx path of l2tp, l2tp_xmit_skb() calls skb_dst_set() to set
skb's dst. However, it will eventually call inet6_csk_xmit() or
ip_queue_xmit() where skb's dst will be overwritten by:
skb_dst_set_noref(skb, dst);
without releasing the old dst in skb. Then it causes dst/dev refcnt leak:
unregister_netdevice: waiting for eth0 to become free. Usage count = 1
This can be reproduced by simply running:
# modprobe l2tp_eth && modprobe l2tp_ip
# sh ./tools/testing/selftests/net/l2tp.sh
So before going to inet6_csk_xmit() or ip_queue_xmit(), skb's dst
should be dropped. This patch is to fix it by removing skb_dst_set()
from l2tp_xmit_skb() and moving skb_dst_drop() into l2tp_xmit_core().
Fixes: 3557baabf2 ("[L2TP]: PPP over L2TP driver core")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: James Chapman <jchapman@katalix.com>
Tested-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When adding first socket to nbd, if nsock's allocation failed, the data
structure member "config->socks" was reallocated, but the data structure
member "config->num_connections" was not updated. A memory leak will occur
then because the function "nbd_config_put" will free "config->socks" only
when "config->num_connections" is not zero.
Fixes: 03bf73c315 ("nbd: prevent memory leak")
Reported-by: syzbot+934037347002901b8d2a@syzkaller.appspotmail.com
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
After entering kdb due to breakpoint, when we execute 'ss' or 'go' (will
delay installing breakpoints, do single-step first), it won't work
correctly, and it will enter kdb due to oops.
It's because the reason gotten in kdb_stub() is not as expected, and it
seems that the ex_vector for single-step should be 0, like what arch
powerpc/sh/parisc has implemented.
Before the patch:
Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0xffff8000101486cc (printk)
is enabled addr at ffff8000101486cc, hardtype=0 installed=0
[0]kdb> g
/ # echo h > /proc/sysrq-trigger
Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 due to Breakpoint @ 0xffff8000101486cc
[3]kdb> ss
Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 Oops: (null)
due to oops @ 0xffff800010082ab8
CPU: 3 PID: 266 Comm: sh Not tainted 5.7.0-rc4-13839-gf0e5ad491718 #6
Hardware name: linux,dummy-virt (DT)
pstate: 00000085 (nzcv daIf -PAN -UAO)
pc : el1_irq+0x78/0x180
lr : __handle_sysrq+0x80/0x190
sp : ffff800015003bf0
x29: ffff800015003d20 x28: ffff0000fa878040
x27: 0000000000000000 x26: ffff80001126b1f0
x25: ffff800011b6a0d8 x24: 0000000000000000
x23: 0000000080200005 x22: ffff8000101486cc
x21: ffff800015003d30 x20: 0000ffffffffffff
x19: ffff8000119f2000 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000
x9 : 0000000000000000 x8 : ffff800015003e50
x7 : 0000000000000002 x6 : 00000000380b9990
x5 : ffff8000106e99e8 x4 : ffff0000fadd83c0
x3 : 0000ffffffffffff x2 : ffff800011b6a0d8
x1 : ffff800011b6a000 x0 : ffff80001130c9d8
Call trace:
el1_irq+0x78/0x180
printk+0x0/0x84
write_sysrq_trigger+0xb0/0x118
proc_reg_write+0xb4/0xe0
__vfs_write+0x18/0x40
vfs_write+0xb0/0x1b8
ksys_write+0x64/0xf0
__arm64_sys_write+0x14/0x20
el0_svc_common.constprop.2+0xb0/0x168
do_el0_svc+0x20/0x98
el0_sync_handler+0xec/0x1a8
el0_sync+0x140/0x180
[3]kdb>
After the patch:
Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0xffff8000101486cc (printk)
is enabled addr at ffff8000101486cc, hardtype=0 installed=0
[0]kdb> g
/ # echo h > /proc/sysrq-trigger
Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc
[0]kdb> g
Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc
[0]kdb> ss
Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to SS trap @ 0xffff800010082ab8
[0]kdb>
Fixes: 44679a4f14 ("arm64: KGDB: Add step debugging support")
Signed-off-by: Wei Li <liwei391@huawei.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200509214159.19680-2-liwei391@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Continually butchering our entry code with CPU errata workarounds has
led to it looking a little scruffy. Consistently used /* */ comment
style for multi-line block comments and ensure that small numeric labels
use consecutive integers.
No functional change, but the state of things was irritating.
Signed-off-by: Will Deacon <will@kernel.org>
The current handling of erratum 1414080 has the side effect that
cntkctl_el1 can get changed for both 32 and 64bit tasks.
This isn't a problem so far, but if we ever need to mitigate another
of these errata on the 64bit side, we'd better keep the messing with
cntkctl_el1 local to 32bit tasks.
For that, make sure that on entering the kernel from a 32bit tasks,
userspace access to cntvct gets enabled, and disabled returning to
userspace, while it never gets changed for 64bit tasks.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200706163802.1836732-5-maz@kernel.org
[will: removed branch instructions per Mark's review comments]
Signed-off-by: Will Deacon <will@kernel.org>
We have a class of errata (grouped under the ARM64_WORKAROUND_1418040
banner) that force the trapping of counter access from 32bit EL0.
We would normally disable the whole vdso for such defect, except that
it would disable it for 64bit userspace as well, which is a shame.
Instead, add a new vdso_clock_mode, which signals that the vdso
isn't usable for compat tasks. This gets checked in the new
vdso_clocksource_ok() helper, now provided for the 32bit vdso.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200706163802.1836732-2-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Karsten Graul says:
====================
net/smc: fixes 2020-07-08
Please apply the following patch series for smc to netdev's net tree.
The patches fix problems found during more testing of SMC
functionality, resulting in hang conditions and unneeded link
deactivations. The clc module was hardened to be prepared for
possible future SMCD versions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
CLC proposal messages of future SMCD versions could be larger than SMCD
V1 CLC proposal messages.
To enable toleration in SMC V1 the receival of CLC proposal messages
is adapted:
* accept larger length values in CLC proposal
* check trailing eye catcher for incoming CLC proposal with V1 length only
* receive the whole CLC proposal even in cases it does not fit into the
V1 buffer
Fixes: e7b7a64a84 ("smc: support variable CLC proposal messages")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The similar smc_ib_devices spinlock has been converted to a mutex.
Protecting the smcd_dev_list by a mutex is possible as well. This
patch converts the smcd_dev_list spinlock to a mutex.
Fixes: c6ba7c9ba4 ("net/smc: add base infrastructure for SMC-D and ISM")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tests showed this BUG:
[572555.252867] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
[572555.252876] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 131031, name: smcapp
[572555.252879] INFO: lockdep is turned off.
[572555.252883] CPU: 1 PID: 131031 Comm: smcapp Tainted: G O 5.7.0-rc3uschi+ #356
[572555.252885] Hardware name: IBM 3906 M03 703 (LPAR)
[572555.252887] Call Trace:
[572555.252896] [<00000000ac364554>] show_stack+0x94/0xe8
[572555.252901] [<00000000aca1f400>] dump_stack+0xa0/0xe0
[572555.252906] [<00000000ac3c8c10>] ___might_sleep+0x260/0x280
[572555.252910] [<00000000acdc0c98>] __mutex_lock+0x48/0x940
[572555.252912] [<00000000acdc15c2>] mutex_lock_nested+0x32/0x40
[572555.252975] [<000003ff801762d0>] mlx5_lag_get_roce_netdev+0x30/0xc0 [mlx5_core]
[572555.252996] [<000003ff801fb3aa>] mlx5_ib_get_netdev+0x3a/0xe0 [mlx5_ib]
[572555.253007] [<000003ff80063848>] smc_pnet_find_roce_resource+0x1d8/0x310 [smc]
[572555.253011] [<000003ff800602f0>] __smc_connect+0x1f0/0x3e0 [smc]
[572555.253015] [<000003ff80060634>] smc_connect+0x154/0x190 [smc]
[572555.253022] [<00000000acbed8d4>] __sys_connect+0x94/0xd0
[572555.253025] [<00000000acbef620>] __s390x_sys_socketcall+0x170/0x360
[572555.253028] [<00000000acdc6800>] system_call+0x298/0x2b8
[572555.253030] INFO: lockdep is turned off.
Function smc_pnet_find_rdma_dev() might be called from
smc_pnet_find_roce_resource(). It holds the smc_ib_devices list
spinlock while calling infiniband op get_netdev(). At least for mlx5
the get_netdev operation wants mutex serialization, which conflicts
with the smc_ib_devices spinlock.
This patch switches the smc_ib_devices spinlock into a mutex to
allow sleeping when calling get_netdev().
Fixes: a4cf0443c4 ("smc: introduce SMC as an IB-client")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wait for pending sends only when smc_switch_conns() found a link to move
the connections to. Do not wait during link freeing, this can lead to
permanent hang situations. And refuse to provide a new tx slot on an
unusable link.
Fixes: c6f02ebeea ("net/smc: switch connections to alternate link")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There might be races in scenarios where both SMC link groups are on the
same system. Prevent that by creating separate wait queues for LLC flows
and messages. Switch to non-interruptable versions of wait_event() and
wake_up() for the llc flow waiter to make sure the waiters get control
sequentially. Fine tune the llc_flow_lock to include the assignment of
the message. Write to system log when an unexpected message was
dropped. And remove an extra indirection and use the existing local
variable lgr in smc_llc_enqueue().
Fixes: 555da9af82 ("net/smc: add event-based llc_flow framework")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes ip dst and ipv6 address filters.
There were 2 mistakes in the code, which led to the issue:
* invalid register was used for ipv4 dst address;
* incorrect write order of dwords for ipv6 addresses.
Fixes: 23e7a718a4 ("net: aquantia: add rx-flow filter definitions")
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update Documentation for the gcc v4.9 upgrade requirement.
Fixes: 5429ef62bc ("compiler/gcc: Raise minimum GCC version for kernel builds to 4.8")
Fixes: 6ec4476ac8 ("Raise gcc version requirement to 4.9")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull sound fixes from Takashi Iwai:
"A collection of small, mostly device-specific fixes.
The significant one is the regression fix for USB-audio implicit
feedback devices due to the incorrect frame size calculation, which
landed in 5.8 and stable trees.
In addition, a few usual HD-audio and USB-audio quirks, Intel HDMI
fixes, ASoC fsl and rt5682 fixes, as well as the fix in
compress-offload partial drain operation"
* tag 'sound-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: compress: fix partial_drain completion state
ALSA: usb-audio: Add implicit feedback quirk for RTX6001
ALSA: usb-audio: add quirk for MacroSilicon MS2109
ALSA: hda/realtek: Enable headset mic of Acer Veriton N4660G with ALC269VC
ALSA: hda/realtek: Enable headset mic of Acer C20-820 with ALC269VC
ALSA: hda/realtek - Enable audio jacks of Acer vCopperbox with ALC269VC
ALSA: hda/realtek - Fix Lenovo Thinkpad X1 Carbon 7th quirk subdevice id
ALSA: hda/hdmi: improve debug traces for stream lookups
ALSA: hda/hdmi: fix failures at PCM open on Intel ICL and later
ALSA: opl3: fix infoleak in opl3
ALSA: usb-audio: Replace s/frame/packet/ where appropriate
ALSA: usb-audio: Fix packet size calculation
AsoC: amd: add missing snd- module prefix to the acp3x-rn driver kernel module
ALSA: hda - let hs_mic be picked ahead of hp_mic
ASoC: rt5682: fix the pop noise while OMTP type headset plugin
ASoC: fsl_mqs: Fix unchecked return value for clk_prepare_enable
ASoC: fsl_mqs: Don't check clock is NULL before calling clk API
I realize that we fairly recently raised it to 4.8, but the fact is, 4.9
is a much better minimum version to target.
We have a number of workarounds for actual bugs in pre-4.9 gcc versions
(including things like internal compiler errors on ARM), but we also
have some syntactic workarounds for lacking features.
In particular, raising the minimum to 4.9 means that we can now just
assume _Generic() exists, which is likely the much better replacement
for a lot of very convoluted built-time magic with conditionals on
sizeof and/or __builtin_choose_expr() with same_type() etc.
Using _Generic also means that you will need to have a very recent
version of 'sparse', but thats easy to build yourself, and much less of
a hassle than some old gcc version can be.
The latest (in a long string) of reasons for minimum compiler version
upgrades was commit 5435f73d5c ("efi/x86: Fix build with gcc 4").
Ard points out that RHEL 7 uses gcc-4.8, but the people who stay back on
old RHEL versions persumably also don't build their own kernels anyway.
And maybe they should cross-built or just have a little side affair with
a newer compiler?
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Only triggering reclaim based on the percentage of unmapped cache
zones can fail to detect cases where reclaim is needed, e.g. if the
target has only 2 or 3 cache zones and only one unmapped cache zone,
the percentage of free cache zones is higher than
DMZ_RECLAIM_LOW_UNMAP_ZONES (30%) and reclaim does not trigger.
This problem, combined with the fact that dmz_schedule_reclaim() is
called from dmz_handle_bio() without the map lock held, leads to a
race between zone allocation and dmz_should_reclaim() result.
Depending on the workload applied, this race can lead to the write
path waiting forever for a free zone without reclaim being triggered.
Fix this by moving dmz_schedule_reclaim() inside dmz_alloc_zone()
under the map lock. This results in checking the need for zone reclaim
whenever a new data or buffer zone needs to be allocated.
Also fix dmz_reclaim_percentage() to always return 0 if the number of
unmapped cache (or random) zones is less than or equal to 1.
Suggested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Vinod writes:
phy: fixes for 5.8
*) Fix for intel combo driver for warns or errors
*) Constify symbols for am654-serdes & j721e-wiz
*) Return value fix for rockchip driver
*) Null pointer dereference fix for sun4i-usb
* tag 'phy-fixes-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
phy: sun4i-usb: fix dereference of pointer phy0 before it is null checked
phy: rockchip: Fix return value of inno_dsidphy_probe()
phy: ti: j721e-wiz: Constify structs
phy: ti: am654-serdes: Constify regmap_config
phy: intel: fix enum type mismatch warning
phy: intel: Fix compilation error on FIELD_PREP usage
Fix unused but set variable warnings:
drivers/md/dm-zoned-reclaim.c:504:42: warning:
variable nr_rnd set but not used [-Wunused-but-set-variable]
504 | unsigned int p_unmap, nr_unmap_rnd = 0, nr_rnd = 0;
| ^~~~~~
drivers/md/dm-zoned-reclaim.c:504:24: warning:
variable nr_unmap_rnd set but not used [-Wunused-but-set-variable]
504 | unsigned int p_unmap, nr_unmap_rnd = 0, nr_rnd = 0;
| ^~~~~~~~~~~~
Fixes: f97809aec5 ("dm zoned: per-device reclaim")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Johan writes:
USB-serial fixes for 5.8-rc5
Here are some new device ids for 5.8.
All have been in linux-next with no reported issues.
* tag 'usb-serial-5.8-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: option: add Quectel EG95 LTE modem
USB: serial: ch341: add new Product ID for CH340
USB: serial: option: add GosunCn GM500 series
USB: serial: cypress_m8: enable Simply Automated UPB PIM
bio_uninit is the proper API to clean up a BIO that has been allocated
on stack or inside a structure that doesn't come from the BIO allocator.
Switch dm to use that instead of bio_disassociate_blkg, which really is
an implementation detail. Note that the bio_uninit calls are also moved
to the two callers of __send_empty_flush, so that they better pair with
the bio_init calls used to initialize them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This is hopefully the last set of fixes to avoid probe errors due to
stricter checks of DAI capabilities introduced late in the 5.8 cycle.
Daniel Baluta (1):
ASoC: SOF: imx: add min/max channels for SAI/ESAI on i.MX8/i.MX8M
Pierre-Louis Bossart (2):
ASoC: soc-dai: set dai_link dpcm_ flags with a helper
ASoC: Intel: bdw-rt5677: fix non BE conversion
include/sound/soc-dai.h | 1 +
sound/soc/generic/audio-graph-card.c | 4 +--
sound/soc/generic/simple-card.c | 4 +--
sound/soc/intel/boards/bdw-rt5677.c | 1 +
sound/soc/soc-dai.c | 38 ++++++++++++++++++++++++++++
sound/soc/sof/imx/imx8.c | 8 ++++++
sound/soc/sof/imx/imx8m.c | 8 ++++++
7 files changed, 60 insertions(+), 4 deletions(-)
base-commit: a5911ac579
--
2.25.1
While experimenting and introducing errors in Baytrail topology files
until I got them right, I encountered multiple kernel oopses and
memory leaks. This is a first batch to harden the code, but we should
probably think of a tool to fuzz the topology...
Pierre-Louis Bossart (5):
ASoC: topology: fix kernel oops on route addition error
ASoC: topology: fix tlvs in error handling for widget_dmixer
ASoC: topology: use break on errors, not continue
ASoC: topology: factor kfree(se) in error handling
ASoC: topology: add more logs when topology load fails.
sound/soc/soc-topology.c | 97 ++++++++++++++++++++++++----------------
1 file changed, 58 insertions(+), 39 deletions(-)
base-commit: a5911ac579
--
2.25.1
When errors happens while loading graph components, the kernel oopses
while trying to remove all topology components. This can be
root-caused to a list pointing to memory that was already freed on
error.
remove_route() is already called on errors and will perform the
required cleanups so there's no need to free the route memory in
soc_tplg_dapm_graph_elems_load() if the route was added to the
list. We do however want to free the routes allocated but not added to
the list.
Fixes: 7df04ea7a3 ('ASoC: topology: modify dapm route loading routine and add dapm route unloading')
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/20200707203749.113883-2-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This is identical with change for Intel platforms done with
commit 8c05246c0b ("ASoC: SOF: Intel: add min/max channels for SSP on Baytrail/Broadwell")
and fixes a regression on i.MX8/i.MX8M:
[ 25.705750] esai-Codec: ASoC: no backend playback stream
[ 27.923378] esai-Codec: ASoC: no users playback at close - state
This is root-caused to the introduction of the DAI capability checks
with snd_soc_dai_stream_valid(). Its use in soc-pcm.c makes it a
requirement for all DAIs to report at least a non-zero min_channels
field.
Fixes: 9b5db05936 ("ASoC: soc-pcm: dpcm: Only allow playback/capture if supported")
Signed-off-by: Daniel Baluta <daniel.baluta@nxp.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20200707210439.115300-4-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Move the initialization of the vendor_part_id to be before calling
ib_register_device(), this is needed because the query_device() callback
is called from the context of ib_register_device() before initializing the
vendor_part_id, so the reported value is wrong.
Fixes: bdcf26bf9b ("rdma/siw: network and RDMA core interface")
Link: https://lore.kernel.org/r/20200707130931.444724-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
A typo caused the interrupt handler to branch immediately to the
common "unknown interrupt" handler and skip the special case test for
denormal cause.
This does not affect KVM softpatch handling (e.g., for POWER9 TM
assist) because the KVM test was moved to common code by commit
9600f261ac ("powerpc/64s/exception: Move KVM test to common code")
just before this bug was introduced.
Fixes: 3f7fbd97d0 ("powerpc/64s/exception: Clean up SRR specifiers")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
[mpe: Split selftest into a separate patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200708074942.1713396-1-npiggin@gmail.com
While integrating rseq into glibc and replacing glibc's sched_getcpu
implementation with rseq, glibc's tests discovered an issue with
incorrect __rseq_abi.cpu_id field value right after the first time
a newly created process issues sched_setaffinity.
For the records, it triggers after building glibc and running tests, and
then issuing:
for x in {1..2000} ; do posix/tst-affinity-static & done
and shows up as:
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
This is caused by the scheduler invoking __set_task_cpu() directly from
sched_fork() and wake_up_new_task(), thus bypassing rseq_migrate() which
is done by set_task_cpu().
Add the missing rseq_migrate() to both functions. The only other direct
use of __set_task_cpu() is done by init_idle(), which does not involve a
user-space task.
Based on my testing with the glibc test-case, just adding rseq_migrate()
to wake_up_new_task() is sufficient to fix the observed issue. Also add
it to sched_fork() to keep things consistent.
The reason why this never triggered so far with the rseq/basic_test
selftest is unclear.
The current use of sched_getcpu(3) does not typically require it to be
always accurate. However, use of the __rseq_abi.cpu_id field within rseq
critical sections requires it to be accurate. If it is not accurate, it
can cause corruption in the per-cpu data targeted by rseq critical
sections in user-space.
Reported-By: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-By: Florian Weimer <fweimer@redhat.com>
Cc: stable@vger.kernel.org # v4.18+
Link: https://lkml.kernel.org/r/20200707201505.2632-1-mathieu.desnoyers@efficios.com
The recent commit:
c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
moved these lines in ttwu():
p->sched_contributes_to_load = !!task_contributes_to_load(p);
p->state = TASK_WAKING;
up before:
smp_cond_load_acquire(&p->on_cpu, !VAL);
into the 'p->on_rq == 0' block, with the thinking that once we hit
schedule() the current task cannot change it's ->state anymore. And
while this is true, it is both incorrect and flawed.
It is incorrect in that we need at least an ACQUIRE on 'p->on_rq == 0'
to avoid weak hardware from re-ordering things for us. This can fairly
easily be achieved by relying on the control-dependency already in
place.
The second problem, which makes the flaw in the original argument, is
that while schedule() will not change prev->state, it will read it a
number of times (arguably too many times since it's marked volatile).
The previous condition 'p->on_cpu == 0' was sufficient because that
indicates schedule() has completed, and will no longer read
prev->state. So now the trick is to make this same true for the (much)
earlier 'prev->on_rq == 0' case.
Furthermore, in order to make the ordering stick, the 'prev->on_rq = 0'
assignment needs to he a RELEASE, but adding additional ordering to
schedule() is an unwelcome proposition at the best of times, doubly so
for mere accounting.
Luckily we can push the prev->state load up before rq->lock, with the
only caveat that we then have to re-read the state after. However, we
know that if it changed, we no longer have to worry about the blocking
path. This gives us the required ordering, if we block, we did the
prev->state load before an (effective) smp_mb() and the p->on_rq store
needs not change.
With this we end up with the effective ordering:
LOAD p->state LOAD-ACQUIRE p->on_rq == 0
MB
STORE p->on_rq, 0 STORE p->state, TASK_WAKING
which ensures the TASK_WAKING store happens after the prev->state
load, and all is well again.
Fixes: c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Dave Jones <davej@codemonkey.org.uk>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Link: https://lkml.kernel.org/r/20200707102957.GN117543@hirez.programming.kicks-ass.net
The HiSilicon hibmc driver triggers a splat at boot time as below
[ 14.137806] ------------[ cut here ]------------
[ 14.142405] hibmc-drm 0000:0a:00.0: Device has not been registered.
[ 14.148661] WARNING: CPU: 0 PID: 496 at drivers/gpu/drm/drm_fb_helper.c:2233 drm_fbdev_generic_setup+0x15c/0x1b8
[ 14.158787] [...]
[ 14.278307] Call trace:
[ 14.280742] drm_fbdev_generic_setup+0x15c/0x1b8
[ 14.285337] hibmc_pci_probe+0x354/0x418
[ 14.289242] local_pci_probe+0x44/0x98
[ 14.292974] work_for_cpu_fn+0x20/0x30
[ 14.296708] process_one_work+0x1c4/0x4e0
[ 14.300698] worker_thread+0x2c8/0x528
[ 14.304431] kthread+0x138/0x140
[ 14.307646] ret_from_fork+0x10/0x18
[ 14.311205] ---[ end trace a2000ec2d838af4d ]---
This turned out to be due to the fbdev device hasn't been registered when
drm_fbdev_generic_setup() is invoked. Let's fix the splat by moving it down
after drm_dev_register() which will follow the "Display driver example"
documented by commit de99f0600a ("drm/drv: DOC: Add driver example
code").
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200706144713.1123-1-yuzenghui@huawei.com
Jonathan writes:
First set of IIO and counter fixes in the 5.8 cycle.
The buffer alignment fixes continue to trickle through as we get
reviews in. The rest are the standard mixed bag of long term issues
just discovered an things we missed in this cycle.
IIO fixes
* core
- Add missing IIO_MOD_H2 and ETHANOL strings. Somehow got missed
when drivers were added using these in attribute names.
* afe4403, afe4404, ak8974, hdc100x, hts221, ms5611
- Fix a recently identified issue with alignment when using
iio_push_to_buffers_with_timestamp which assumes the timestamp
is 8 byte aligned.
* ad7780
- Fix a some premature / excess cleanup in an error path.
* adi-axi-adc
- Fix reference counting on the wrong object.
* ak8974
- Fix unbalance runtime pm.
* mma8452
- Fix missing iio_device_unregister in error path.
* zp2326
- Error handling for pm_runtime_get_sync failing.
counter fixes
* Add lock guards in 104-quad-8 to protect against races - done
in 2 patches to allow easy back porting.
* tag 'iio-fixes-for-5.8a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
iio: adc: ad7780: Fix a resource handling path in 'ad7780_probe()'
iio:pressure:ms5611 Fix buffer element alignment
iio:humidity:hts221 Fix alignment and data leak issues
iio:humidity:hdc100x Fix alignment and data leak issues
iio:magnetometer:ak8974: Fix alignment and data leak issues
iio: adc: adi-axi-adc: Fix object reference counting
iio: pressure: zpa2326: handle pm_runtime_get_sync failure
counter: 104-quad-8: Add lock guards - filter clock prescaler
counter: 104-quad-8: Add lock guards - differential encoder
iio: core: add missing IIO_MOD_H2/ETHANOL string identifiers
iio: magnetometer: ak8974: Fix runtime PM imbalance on error
iio: mma8452: Add missed iio_device_unregister() call in mma8452_probe()
iio:health:afe4404 Fix timestamp alignment and prevent data leak.
iio:health:afe4403 Fix timestamp alignment and prevent data leak.
Consolidate the two in-kernel read helpers to make upcoming changes
easier. The only difference are the missing call to rw_verify_area
in kernel_read, and an access_ok check that doesn't make sense for
kernel buffers to start with.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Consolidate the two in-kernel write helpers to make upcoming changes
easier. The only difference are the missing call to rw_verify_area
in kernel_write, and an access_ok check that doesn't make sense for
kernel buffers to start with.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Add a WARN_ON_ONCE if the file isn't actually open for write. This
matches the check done in vfs_write, but actually warn warns as a
kernel user calling write on a file not opened for writing is a pretty
obvious programming error.
Signed-off-by: Christoph Hellwig <hch@lst.de>
While pipes don't really need sb_writers projection, __kernel_write is an
interface better kept private, and the additional rw_verify_area does not
hurt here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
While pipes don't really need sb_writers projection, __kernel_write is an
interface better kept private, and the additional rw_verify_area does not
hurt here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ian Kent <raven@themaw.net>
__kernel_write doesn't take a sb_writers references, which we need here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Howells <dhowells@redhat.com>
The caller of cifs_posix_lock_set will do retry(like
fcntl_setlk64->do_lock_file_wait) if we will wait for any file_lock.
So the retry in cifs_poxis_lock_set seems duplicated, remove it to
make a cleanup.
Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: NeilBrown <neilb@suse.de>
BRM_status_show() has several error branches, but none of them record the
error in the error return.
Also while at it remove the manual mutex_unlock() of the pci_access_mutex
in case of an ongoing pci error recovery or host removal and jump to the
cleanup label instead.
Note: We can safely jump to out from here as io_unit_pg3 is initialized to
NULL and if it hasn't been allocated, kfree() skips the NULL pointer.
[mkp: compilation warning]
Link: https://lore.kernel.org/r/20200701131454.5255-1-johannes.thumshirn@wdc.com
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If system memory is migrated to device private memory and no GPU MMU
page table entry exists, the GPU will fault and call hmm_range_fault()
to get the PFN for the page. Since the .dev_private_owner pointer in
struct hmm_range is not set, hmm_range_fault returns an error which
results in the GPU program stopping with a fatal fault.
Fix this by setting .dev_private_owner appropriately.
Fixes: 08ddddda66 ("mm/hmm: check the device private page owner in hmm_range_fault()")
Cc: stable@vger.kernel.org
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The patch to add zero page migration to GPU memory inadvertently included
part of a future change which broke normal page migration to GPU memory
by copying too much data and corrupting GPU memory.
Fix this by only copying one page instead of a byte count.
Fixes: 9d4296a7d4 ("drm/nouveau/nouveau/hmm: fix migrate zero page to GPU")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tegra TRM says worst-case reply time is 1216us, and this should fix some
spurious timeouts that have been popping up.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Prevents "snd_hda_codec_hdmi hdaudioC1D0: HDMI: pin nid 5 not registered"
that occur on some configurations.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
read permission, not just read attributes permission, is required
on the directory.
See MS-SMB2 (protocol specification) section 3.3.5.19.
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org> # v5.6+
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
The queue reset pattern is used in a couple different places,
only slightly different from each other, and could cause
issues if one gets changed and the other didn't. This puts
them together so that only one version is needed, yet each
can have slighty different effects by passing in a pointer
to a work function to do whatever configuration twiddling is
needed in the middle of the reset.
This specifically addresses issues seen where under loops
of changing ring size or queue count parameters we could
occasionally bump into the netdev watchdog.
v2: added more commit message commentary
Fixes: 4d03e00a21 ("ionic: Add initial ethtool support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki pointed out that we now have two very similar functions to extract
the L3 protocol number in the presence of VLAN tags. And Daniel pointed out
that the unbounded parsing loop makes it possible for maliciously crafted
packets to loop through potentially hundreds of tags.
Fix both of these issues by consolidating the two parsing functions and
limiting the VLAN tag parsing to a max depth of 8 tags. As part of this,
switch over __vlan_get_protocol() to use skb_header_pointer() instead of
pskb_may_pull(), to avoid the possible side effects of the latter and keep
the skb pointer 'const' through all the parsing functions.
v2:
- Use limit of 8 tags instead of 32 (matching XMIT_RECURSION_LIMIT)
Reported-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: d7bf2ebebc ("sched: consistently handle layer3 header accesses in the presence of VLANs")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When generating debug dump, driver firstly collects all data in binary
form, and then performs per-feature formatting to human-readable if it
is supported.
For ethtool -d, this is roughly incorrect for two reasons. First of all,
drivers should always provide only original raw dumps to Ethtool without
any changes.
The second, and more critical, is that Ethtool's output buffer size is
strictly determined by ethtool_ops::get_regs_len(), and all data *must*
fit in it. The current version of driver always returns the size of raw
data, but the size of the formatted buffer exceeds it in most cases.
This leads to out-of-bound writes and memory corruption.
Address both issues by adding an option to return original, non-formatted
debug data, and using it for Ethtool case.
v2:
- Expand commit message to make it more clear;
- No functional changes.
Fixes: c965db4446 ("qed: Add support for debug data collection")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull perf tooling fixes from Arnaldo Carvalho de Melo:
- Intel PT fixes for PEBS-via-PT with registers
- Fixes for Intel PT python based GUI
- Avoid duplicated sideband events with Intel PT in system wide tracing
- Remove needless 'dummy' event from TUI menu, used when synthesizing
meta data events for pre-existing processes
- Fix corner case segfault when pressing enter in a screen without
entries in the TUI for report/top
- Fixes for time stamp handling in libtraceevent
- Explicitly set utf-8 encoding in perf flamegraph
- Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy',
silencing perf build warning
* tag 'perf-tools-fixes-2020-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf report TUI: Remove needless 'dummy' event from menu
perf intel-pt: Fix PEBS sample for XMM registers
perf intel-pt: Fix displaying PEBS-via-PT with registers
perf intel-pt: Fix recording PEBS-via-PT with registers
perf report TUI: Fix segmentation fault in perf_evsel__hists_browse()
tools lib traceevent: Add proper KBUFFER_TYPE_TIME_STAMP handling
tools lib traceevent: Add API to read time information from kbuffer
perf scripts python: exported-sql-viewer.py: Fix time chart call tree
perf scripts python: exported-sql-viewer.py: Fix zero id in call tree 'Find' result
perf scripts python: exported-sql-viewer.py: Fix zero id in call graph 'Find' result
perf scripts python: exported-sql-viewer.py: Fix unexpanded 'Find' result
perf record: Fix duplicated sideband events with Intel PT system wide tracing
perf scripts python: export-to-postgresql.py: Fix struct.pack() int argument
tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy'
perf flamegraph: Explicitly set utf-8 encoding
Commit e57f61858b ("net: bridge: mcast: fix stale nsrcs pointer in
igmp3/mld2 report handling") introduced a bug in the IPv6 header payload
length check which would potentially lead to rejecting a valid MLD2 Report:
The check needs to take into account the 2 bytes for the "Number of
Sources" field in the "Multicast Address Record" before reading it.
And not the size of a pointer to this field.
Fixes: e57f61858b ("net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling")
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tcf_ct_act execute the tcf_lastuse_update should
be update or the used stats never update
filter protocol ip pref 3 flower chain 0
filter protocol ip pref 3 flower chain 0 handle 0x1
eth_type ipv4
dst_ip 1.1.1.1
ip_flags frag/firstfrag
skip_hw
not_in_hw
action order 1: ct zone 1 nat pipe
index 1 ref 1 bind 1 installed 103 sec used 103 sec
Action statistics:
Sent 151500 bytes 101 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
cookie 4519c04dc64a1a295787aab13b6a50fb
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
rwlock.h should not be included directly. Instead linux/splinlock.h
should be included. Including it directly will break the RT build.
Fixes: 549c243e4e ("net/mlx5e: Extract neigh-specific code from en_rep.c to rep/neigh.c")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RFC 8684 mandates that no-data DATA FIN packets should carry
a DSS with 0 sequence number and data len equal to 1. Currently,
on FIN retransmission we re-use the existing mapping; if the previous
fin transmission was part of a partially acked data packet, we could
end-up writing in the egress packet a non-compliant DSS.
The above will be detected by a "Bad mapping" warning on the receiver
side.
This change addresses the issue explicitly checking for 0 len packet
when adding the DATA_FIN option.
Fixes: 6d0060f600 ("mptcp: Write MPTCP DSS headers to outgoing data packets")
Reported-by: syzbot+42a07faa5923cfaeb9c9@syzkaller.appspotmail.com
Tested-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv4 ping sockets don't set fl4.fl4_icmp_{type,code}, which leads to
incomplete IPsec ACQUIRE messages being sent to userspace. Currently,
both raw sockets and IPv6 ping sockets set those fields.
Expected output of "ip xfrm monitor":
acquire proto esp
sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 8 code 0 dev ens4
policy src 10.0.2.15/32 dst 8.8.8.8/32
<snip>
Currently with ping sockets:
acquire proto esp
sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 0 code 0 dev ens4
policy src 10.0.2.15/32 dst 8.8.8.8/32
<snip>
The Libreswan test suite found this problem after Fedora changed the
value for the sysctl net.ipv4.ping_group_range.
Fixes: c319b4d76b ("net: ipv4: add IPPROTO_ICMP socket kind")
Reported-by: Paul Wouters <pwouters@redhat.com>
Tested-by: Paul Wouters <pwouters@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the ISR, we poll the event register for the queues in need of
service and then enter polled mode. After this point, the event
register will never be read again until we exit polled mode.
In a scenario where a UDP flow is routed back out through the same
interface, i.e. "router-on-a-stick" we'll typically only see an rx
queue event initially. Once we start to process the incoming flow
we'll be locked polled mode, but we'll never clean the tx rings since
that event is never caught.
Eventually the netdev watchdog will trip, causing all buffers to be
dropped and then the process starts over again.
Rework the NAPI poll to keep trying to consome the entire budget as
long as new events are coming in, making sure to service all rx/tx
queues, in priority order, on each pass.
Fixes: 4d494cdc92 ("net: fec: change data structure to support multiqueue")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Tested-by: Fugang Duan <fugang.duan@nxp.com>
Reviewed-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
clang static analysis flags this garbage return
drivers/net/ethernet/marvell/sky2.c:208:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn]
return v;
^~~~~~~~
static inline u16 gm_phy_read( ...
{
u16 v;
__gm_phy_read(hw, port, reg, &v);
return v;
}
__gm_phy_read can return without setting v.
So handle similar to skge.c's gm_phy_read, initialize v.
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MTD fixes from Miquel Raynal:
"MTD:
- Set a missing master partition panic write flag
Raw NAND:
- Fix build issue in the xway driver
- Fix a wrong return code"
* tag 'mtd/fixes-for-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: xway: Fix build issue
mtd: set master partition panic write flag
nandsim: Fix return code testing of ns_find_operation()
So far, gfs2 has taken the inode glocks inside the ->readpage and
->readahead address space operations. Since commit d4388340ae ("fs:
convert mpage_readpages to mpage_readahead"), gfs2_readahead is passed
the pages to read ahead locked. With that, the current holder of the
inode glock may be trying to lock one of those pages while
gfs2_readahead is trying to take the inode glock, resulting in a
deadlock.
Fix that by moving the lock taking to the higher-level ->read_iter file
and ->fault vm operations. This also gets rid of an ugly lock inversion
workaround in gfs2_readpage.
The cache consistency model of filesystems like gfs2 is such that if
data is found in the page cache, the data is up to date and can be used
without taking any filesystem locks. If a page is not cached,
filesystem locks must be taken before populating the page cache.
To avoid taking the inode glock when the data is already cached,
gfs2_file_read_iter first tries to read the data with the IOCB_NOIO flag
set. If that fails, the inode glock is taken and the operation is
retried with the IOCB_NOIO flag cleared.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Add an IOCB_NOIO flag that indicates to generic_file_read_iter that it
shouldn't trigger any filesystem I/O for the actual request or for
readahead. This allows to do tentative reads out of the page cache as
some filesystems allow, and to take the appropriate locks and retry the
reads only if the requested pages are not cached.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Pull btrfs fixes from David Sterba:
- regression fix of a leak in global block reserve accounting
- fix a (hard to hit) race of readahead vs releasepage that could lead
to crash
- convert all remaining uses of comment fall through annotations to the
pseudo keyword
- fix crash when mounting a fuzzed image with -o recovery
* tag 'for-5.8-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: reset tree root pointer after error in init_tree_roots
btrfs: fix reclaim_size counter leak after stealing from global reserve
btrfs: fix fatal extent_buffer readahead vs releasepage race
btrfs: convert comments to fallthrough annotations
When we clone a socket in sk_clone_lock(), its sk_cgrp_data is
copied, so the cgroup refcnt must be taken too. And, unlike the
sk_alloc() path, sock_update_netprioidx() is not called here.
Therefore, it is safe and necessary to grab the cgroup refcnt
even when cgroup_sk_alloc is disabled.
sk_clone_lock() is in BH context anyway, the in_interrupt()
would terminate this function if called there. And for sk_alloc()
skcd->val is always zero. So it's safe to factor out the code
to make it more readable.
The global variable 'cgroup_sk_alloc_disabled' is used to determine
whether to take these reference counts. It is impossible to make
the reference counting correct unless we save this bit of information
in skcd->val. So, add a new bit there to record whether the socket
has already taken the reference counts. This obviously relies on
kmalloc() to align cgroup pointers to at least 4 bytes,
ARCH_KMALLOC_MINALIGN is certainly larger than that.
This bug seems to be introduced since the beginning, commit
d979a39d72 ("cgroup: duplicate cgroup reference when cloning sockets")
tried to fix it but not compeletely. It seems not easy to trigger until
the recent commit 090e28b229
("netprio_cgroup: Fix unlimited memory leak of v2 cgroups") was merged.
Fixes: bd1060a1d6 ("sock, cgroup: add sock->sk_cgroup")
Reported-by: Cameron Berkenpas <cam@neo-zeon.de>
Reported-by: Peter Geis <pgwipeout@gmail.com>
Reported-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reported-by: Daniël Sonck <dsonck92@gmail.com>
Reported-by: Zhang Qiang <qiang.zhang@windriver.com>
Tested-by: Cameron Berkenpas <cam@neo-zeon.de>
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull tpm fix from Jarkko Sakkinen:
"Revert commit e918e57041 ("tpm_tis: Remove the HID IFX0102").
Removing IFX0102 from tpm_tis was not a right move because both
tpm_tis and tpm_infineon use the same device ID.
A real fix requires quirks added to both drivers. It can probably wait
until v5.9 as the bug has existed since 2006"
* tag 'tpmdd-next-v5.8-rc5' of git://git.infradead.org/users/jjs/linux-tpmdd:
Revert commit e918e57041 ("tpm_tis: Remove the HID IFX0102")
Thomas reported a regression with IPv6 and anycast using the following
reproducer:
echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
ip -6 a add fc12::1/16 dev lo
sleep 2
echo "pinging lo"
ping6 -c 2 fc12::
The conversion of addrconf_f6i_alloc to use ip6_route_info_create missed
the use of fib6_is_reject which checks addresses added to the loopback
interface and sets the REJECT flag as needed. Update fib6_is_reject for
loopback checks to handle RTF_ANYCAST addresses.
Fixes: c7a1ce397a ("ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create")
Reported-by: thomas.gambier@nexedi.com
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder says:
====================
net: ipa: fix warning-reported errors
Building the kernel with W=1 produces numerous warnings for the IPA
code. Some of those warnings turn out to flag real problems, and
this series fixes them. The first patch fixes the most important
ones, but the second and third are problems I think are worth
treating as bugs as well.
Note: I'll happily combine any of these if someone prefers that.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Include "ipa_gsi.h" in "ipa_gsi.c", so the public functions are
defined before they are used in "ipa_gsi.c". This addresses some
warnings that are reported with a "W=1" build.
Fixes: c3f398b141 ("soc: qcom: ipa: IPA interface to GSI")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pointers to two struct types are used in "ipa_gsi.h", without those
struct types being forward-declared. Add these declarations.
Fixes: c3f398b141 ("soc: qcom: ipa: IPA interface to GSI")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Building with "W=1" did exactly what it was supposed to do, namely
point out some suspicious-looking code to be verified not to contain
bugs.
Some QMI message structures defined in "ipa_qmi_msg.c" contained
some bad field names (duplicating the "elem_size" field instead of
defining the "offset" field), almost certainly due to copy/paste
errors that weren't obvious in a scan of the code. Fix these bugs.
Fixes: 530f9216a9 ("soc: qcom: ipa: AP/modem communications")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This MIPS driver does not support COMPILE_TEST yet and failed to build
under my radar.
Replace 'mtd' chich is not defined in the scope of xway_nand_remove()
by nand_to_mtd(chip). The mistake has been added in the long series
dropping nand_release().
Tested with a 7.3.0 MIPS GCC toolchain built with Buildroot.
Fixes: 9fdd78f7bc ("mtd: rawnand: xway: Stop using nand_release()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200626065511.16424-1-miquel.raynal@bootlin.com
Given request-based DM now uses blk-mq's blk_mq_queue_inflight() to
determine if outstanding IO has completed (and DM has no control over
the blk-mq state machine used to track outstanding IO) it is unsafe to
wakeup waiter (dm_wait_for_completion) before blk-mq has cleared a
request's state bits (e.g. MQ_RQ_IN_FLIGHT or MQ_RQ_COMPLETE). As
such dm_wait_for_completion() could be left to wait indefinitely if no
other requests complete.
Fix this by eliminating request-based DM's use of waitqueue to wait
for blk-mq requests to complete in dm_wait_for_completion.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Depends-on: 3c94d83cb3 ("blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
dm-multipath is the only user of blk_mq_queue_inflight(). When
dm-multipath calls blk_mq_queue_inflight() to check if it has
outstanding IO it can get a false negative. The reason for this is
blk_mq_rq_inflight() doesn't consider requests that are no longer
MQ_RQ_IN_FLIGHT but that are now MQ_RQ_COMPLETE (->complete isn't
called or finished yet) as "inflight".
This causes request-based dm-multipath's dm_wait_for_completion() to
return before all outstanding dm-multipath requests have actually
completed. This breaks DM multipath's suspend functionality because
blk-mq requests complete after DM's suspend has finished -- which
shouldn't happen.
Fix this by considering any request not in the MQ_RQ_IDLE state
(so either MQ_RQ_COMPLETE or MQ_RQ_IN_FLIGHT) as "inflight" in
blk_mq_rq_inflight().
Fixes: 3c94d83cb3 ("blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch fixes a bug which does not let FAN mode to be changed from
sysfs(pwm1_enable). i.e pwm1_enable can not be set to 3, it will always
remain at 0.
This is caused because the device driver handles the result of
"read_u8_from_i2c(client, REG_FAN_CONF1, &conf_reg)" incorrectly. The
driver thinks an error has occurred if the (result != 0). This has been
fixed by changing the condition to (result < 0).
Signed-off-by: Vishwas M <vishwas.reddy.vr@gmail.com>
Link: https://lore.kernel.org/r/20200707142747.118414-1-vishwas.reddy.vr@gmail.com
Fixes: 9df7305b5a ("hwmon: Add driver for SMSC EMC2103 temperature monitor and fan controller")
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[BUG]
The following small test script can trigger ASSERT() at unmount time:
mkfs.btrfs -f $dev
mount $dev $mnt
mount -o remount,discard=async $mnt
umount $mnt
The call trace:
assertion failed: atomic_read(&block_group->count) == 1, in fs/btrfs/block-group.c:3431
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.h:3204!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 4 PID: 10389 Comm: umount Tainted: G O 5.8.0-rc3-custom+ #68
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
btrfs_free_block_groups.cold+0x22/0x55 [btrfs]
close_ctree+0x2cb/0x323 [btrfs]
btrfs_put_super+0x15/0x17 [btrfs]
generic_shutdown_super+0x72/0x110
kill_anon_super+0x18/0x30
btrfs_kill_super+0x17/0x30 [btrfs]
deactivate_locked_super+0x3b/0xa0
deactivate_super+0x40/0x50
cleanup_mnt+0x135/0x190
__cleanup_mnt+0x12/0x20
task_work_run+0x64/0xb0
__prepare_exit_to_usermode+0x1bc/0x1c0
__syscall_return_slowpath+0x47/0x230
do_syscall_64+0x64/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The code:
ASSERT(atomic_read(&block_group->count) == 1);
btrfs_put_block_group(block_group);
[CAUSE]
Obviously it's some btrfs_get_block_group() call doesn't get its put
call.
The offending btrfs_get_block_group() happens here:
void btrfs_mark_bg_unused(struct btrfs_block_group *bg)
{
if (list_empty(&bg->bg_list)) {
btrfs_get_block_group(bg);
list_add_tail(&bg->bg_list, &fs_info->unused_bgs);
}
}
So every call sites removing the block group from unused_bgs list should
reduce the ref count of that block group.
However for async discard, it didn't follow the call convention:
void btrfs_discard_punt_unused_bgs_list(struct btrfs_fs_info *fs_info)
{
list_for_each_entry_safe(block_group, next, &fs_info->unused_bgs,
bg_list) {
list_del_init(&block_group->bg_list);
btrfs_discard_queue_work(&fs_info->discard_ctl, block_group);
}
}
And in btrfs_discard_queue_work(), it doesn't call
btrfs_put_block_group() either.
[FIX]
Fix the problem by reducing the reference count when we grab the block
group from unused_bgs list.
Reported-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Fixes: 6e80d4f8c4 ("btrfs: handle empty block_group removal for async discard")
CC: stable@vger.kernel.org # 5.6+
Tested-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The ASoC devm_ functions that register a component
(devm_snd_soc_register_component and devm_snd_dmaengine_pcm_register) will
clean their component by running snd_soc_unregister_component.
snd_soc_unregister_component will then remove all the components for the
device that was used to register the component in the first place.
However, some drivers register several components (such as a DAI and a
dmaengine PCM) on the same device, and if the dmaengine PCM is registered
first, then the DAI will be cleaned up first and
snd_dmaengine_pcm_unregister will be called next.
snd_dmaengine_pcm_unregister will then lookup the dmaengine PCM component
on the device, and if there's one unregister that component and release its
dmaengine channels. That doesn't happen in practice though since the first
call to snd_soc_unregister_component removed all the components, so we
never get the chance to release the dmaengine channels.
In order to fix this, instead of removing all the components for a given
device, we can simply remove the component that was registered in the first
place. We should have the same number of component registration than we
have components, so it should work just fine.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20200707074237.287171-1-maxime@cerno.tech
Signed-off-by: Mark Brown <broonie@kernel.org>
These messages appear each time the mouse wakes from sleep, in my case
(Logitech M705), every minute or so.
Let's downgrade them to the "debug" level so they don't fill the kernel log
by default.
While we are at it, let's make clear that this is a wheel multiplier (and
not, for example, XY movement multiplier).
Fixes: 4435ff2f09 ("HID: logitech: Enable high-resolution scrolling on Logitech mice")
Cc: stable@vger.kernel.org
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Reviewed-by: Harry Cutts <hcutts@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Some parts of hid-logitech-dj explicitly referred to 0xff for the
receiver index. This patch changes those references to the
HIDPP_RECEIVER_INDEX definition.
Signed-off-by: Mazin Rezk <mnrzk@protonmail.com>
Reviewed-by: Filipe Laíns <lains@archlinux.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
When HDMI PCM devices are opened in a specific order, with at least one
HDMI/DP receiver connected, ALSA PCM open fails to -EBUSY on the
connected monitor, on recent Intel platforms (ICL/JSL and newer). While
this is not a typical sequence, at least Pulseaudio does this every time
when it is started, to discover the available PCMs.
The rootcause is an invalid assumption in hdmi_add_pin(), where the
total number of converters is assumed to be known at the time the
function is called. On older Intel platforms this held true, but after
ICL/JSL, the order how pins and converters are in the subnode list as
returned by snd_hda_get_sub_nodes(), was changed. As a result,
information for some converters was not stored to per_pin->mux_nids.
And this means some pins cannot be connected to all converters, and
application instead gets -EBUSY instead at open.
The assumption that converters are always before pins in the subnode
list, is not really a valid one. Fix the problem in hdmi_parse_codec()
by introducing separate loops for discovering converters and pins.
BugLink: https://github.com/thesofproject/linux/issues/1978
BugLink: https://github.com/thesofproject/linux/issues/2216
BugLink: https://github.com/thesofproject/linux/issues/2217
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/20200703153818.2808592-1-kai.vehmanen@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In commit 0146dca70b, I incorrectly adapted the code that computes
the location of the UDP or TCP encapsulation header from IPv4 to
IPv6. In esp6_input_done2, skb->transport_header points to the ESP
header, so by adding skb_network_header_len, uh and th will point to
the ESP header, not the encapsulation header that's in front of it.
Since the TCP header's size can change with options, we have to start
from the IPv6 header and walk past possible extensions.
Fixes: 0146dca70b ("xfrm: add support for UDPv6 encapsulation of ESP")
Fixes: 26333c37fc ("xfrm: add IPv6 support for espintcp")
Reported-by: Tobias Brunner <tobias@strongswan.org>
Tested-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
strsep() is neither standard C nor POSIX and used outside
the kernel code here. Using it here requires that the
build host supports it out of the box which is e.g.
not true for a Darwin build host and using a cross-compiler.
This leads to:
scripts/mod/modpost.c:145:2: warning: implicit declaration of function 'strsep' [-Wimplicit-function-declaration]
return strsep(stringp, "\n");
^
and a segfault when running MODPOST.
See also: https://stackoverflow.com/a/7219504
So let's replace this by strchr() instead of using strsep().
It does not hurt kernel size or speed since this code is run
on the build host.
Fixes: ac5100f543 ("modpost: add read_text_file() and get_line() helpers")
Co-developed-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Removing IFX0102 from tpm_tis was not a right move because both tpm_tis
and tpm_infineon use the same device ID. Revert the commit and add a
remark about a bug caused by commit 93e1b7d42e ("[PATCH] tpm: add HID
module parameter").
Fixes: e918e57041 ("tpm_tis: Remove the HID IFX0102")
Reported-by: Peter Huewe <peterhuewe@gmx.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
The current fence_y_offset calculation is broken. I think it more or
less used to do the right thing, but then I changed the plane code
to put the final x/y source offsets back into the src rectangle so
now it's just subtraacting the same value from itself. The code would
never have worked if we allowed the framebuffer to have a non-zero
offset.
Let's do this in a better way by just calculating the fence_y_offset
from the final plane surface offset. Note that we don't align the
plane surface address to fence rows so with horizontal panning there's
often a horizontal offset from the fence start to the surface address
as well. We have no way to tell the hardware about that so we just
ignore it. Based on some quick tests the invlidation still happens
correctly. I presume due to the invalidation nuking at least the full
line (or a segment of multiple lines).
Fixes: 54d4d719fa ("drm/i915: Overcome display engine stride limits via GTT remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-4-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 5331889b5f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes a compiler warning:
In file included from sync_test.c:37:
../kselftest.h: In function ‘ksft_print_cnts’:
../kselftest.h:78:16: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
if (ksft_plan != ksft_test_num())
^~
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Brian reported a crash in IPv6 code when using rpfilter with a setup
running FRR and external nexthop objects. The root cause of the crash
is fib6_select_path setting fib6_nh in the result to NULL because of
an improper check for nexthop objects.
More specifically, rpfilter invokes ip6_route_lookup with flowi6_oif
set causing fib6_select_path to be called with have_oif_match set.
fib6_select_path has early check on have_oif_match and jumps to the
out label which presumes a builtin fib6_nh. This path is invalid for
nexthop objects; for external nexthops fib6_select_path needs to just
return if the fib6_nh has already been set in the result otherwise it
returns after the call to nexthop_path_fib6_result. Update the check
on have_oif_match to not bail on external nexthops.
Update selftests for this problem.
Fixes: f88d8ea67f ("ipv6: Plumb support for nexthop object in a fib6_info")
Reported-by: Brian Rak <brak@choopa.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull kvm fixes from Paolo Bonzini:
"Bugfixes and a one-liner patch to silence a sparse warning"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART
KVM: arm64: PMU: Fix per-CPU access in preemptible context
KVM: VMX: Use KVM_POSSIBLE_CR*_GUEST_BITS to initialize guest/host masks
KVM: x86: Mark CR4.TSD as being possibly owned by the guest
KVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode
kvm: use more precise cast and do not drop __user
KVM: x86: bit 8 of non-leaf PDPEs is not reserved
KVM: X86: Fix async pf caused null-ptr-deref
KVM: arm64: vgic-v4: Plug race between non-residency and v4.1 doorbell
KVM: arm64: pvtime: Ensure task delay accounting is enabled
KVM: arm64: Fix kvm_reset_vcpu() return code being incorrect with SVE
KVM: arm64: Annotate hyp NMI-related functions as __always_inline
KVM: s390: reduce number of IO pins to 1
Huazhong Tan says:
====================
net: hns3: fixes for -net
There are some fixes about reset issue and a use-after-free
of self-test.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable promisc mode of PF, set VF link state to enable, and
run iperf of the VF, then do self test of the PF. The self test
will fail with a low frequency, and may cause a use-after-free
problem.
[ 87.142126] selftest:000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 87.159722] ==================================================================
[ 87.174187] BUG: KASAN: use-after-free in hex_dump_to_buffer+0x140/0x608
[ 87.187600] Read of size 1 at addr ffff003b22828000 by task ethtool/1186
[ 87.201012]
[ 87.203978] CPU: 7 PID: 1186 Comm: ethtool Not tainted 5.5.0-rc4-gfd51c473-dirty #4
[ 87.219306] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 2280-A CS V2.B160.01 01/15/2020
[ 87.238292] Call trace:
[ 87.243173] dump_backtrace+0x0/0x280
[ 87.250491] show_stack+0x24/0x30
[ 87.257114] dump_stack+0xe8/0x140
[ 87.263911] print_address_description.isra.8+0x70/0x380
[ 87.274538] __kasan_report+0x12c/0x230
[ 87.282203] kasan_report+0xc/0x18
[ 87.288999] __asan_load1+0x60/0x68
[ 87.295969] hex_dump_to_buffer+0x140/0x608
[ 87.304332] print_hex_dump+0x140/0x1e0
[ 87.312000] hns3_lb_check_skb_data+0x168/0x170
[ 87.321060] hns3_clean_rx_ring+0xa94/0xfe0
[ 87.329422] hns3_self_test+0x708/0x8c0
The length of packet sent by the selftest process is only
128 + 14 bytes, and the min buffer size of a BD is 256 bytes,
and the receive process will make sure the packet sent by
the selftest process is in the linear part, so only check
the linear part in hns3_lb_check_skb_data().
So fix this use-after-free by using skb_headlen() to dump
skb->data instead of skb->len.
Fixes: c39c4d98dc ("net: hns3: Add mac loopback selftest support in hns3 driver")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When unloading driver, if flag HNS3_NIC_STATE_INITED has been
already cleared, the debugfs will not be uninitialized, so fix it.
Fixes: b2292360bb ("net: hns3: Add debugfs framework registration")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When asserts VF reset fail, flag HCLGEVF_STATE_CMD_DISABLE
and handshake status should not set, otherwise the retry will
fail. So adds a check for asserting VF reset and returns
directly when fails.
Fixes: ef5f8e507e ("net: hns3: stop handling command queue while resetting VF")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there is a PF reset pending before FLR prepare, FLR's
preparatory work will not fail, but the FLR rebuild procedure
will fail for this pending. So this PF reset pending should
be handled in the FLR preparatory.
Fixes: 8627bdedc4 ("net: hns3: refactor the precedure of PF FLR")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andre Edich says:
====================
smsc95xx: fix smsc95xx_bind
The patchset fixes two problems in the function smsc95xx_bind:
- return of false success
- memory leak
Changes in v2:
- added "Fixes" tags to both patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In a case where the ID_REV register read is failed, the memory for a
private data structure has to be freed before returning error from the
function smsc95xx_bind.
Fixes: bbd9f9ee69 ("smsc95xx: add wol support for more frame types")
Signed-off-by: Andre Edich <andre.edich@microchip.com>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The return value of the function smsc95xx_reset() must be checked
to avoid returning false success from the function smsc95xx_bind().
Fixes: 2f7ca802bd ("net: Add SMSC LAN9500 USB2.0 10/100 ethernet adapter driver")
Signed-off-by: Andre Edich <andre.edich@microchip.com>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When this driver transmits data,
first this driver will remove a pseudo header of 1 byte,
then the lapb module will prepend the LAPB header of 2 or 3 bytes,
then this driver will prepend a length field of 2 bytes,
then the underlying Ethernet device will prepend its own header.
So, the header length required should be:
-1 + 3 + 2 + "the header length needed by the underlying device".
This patch fixes kernel panic when this driver is used with AF_PACKET
SOCK_DGRAM sockets.
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull s390 fixes from Heiko Carstens:
- Initialize jump labels before early command line parsing in order to
make init_on_alloc and init_on_free options work
- Fix vfio-ccw build error due to missing include
- Prevent callchain data collection with hardware sampling, since the
callchains simply do not exist
- Prevent multiple registrations of the same zPCI function
- Update defconfigs
* tag 's390-5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
vfio-ccw: Fix a build error due to missing include of linux/slab.h
s390: update defconfigs
s390/cpum_sf: prohibit callchain data collection
s390/setup: init jump labels before command line parsing
s390/maccess: add no DAT mode to kernel_write
s390/pci: fix enabling a reserved PCI function
KVM/arm fixes for 5.8, take #3
- Disable preemption on context-switching PMU EL0 state happening
on system register trap
- Don't clobber X0 when tearing down KVM via a soft reset (kexec)
A SPI transfer defines the _maximum_ speed of the SPI transfer. However the
driver doesn't take into account that the clock divider is always rounded down
(due to integer arithmetics). This results in a too high clock rate for the SPI
transfer.
E.g.: with a mclk_rate of 24 MHz and a SPI transfer speed of 10 MHz, the
original code calculates a reg of "0", which results in a effective divider of
"2" and a 12 MHz clock for the SPI transfer.
This patch fixes the issue by using DIV_ROUND_UP() instead of a plain
integer division.
While there simplify the divider calculation for the CDR1 case, use
order_base_2() instead of two ilog2() calculations.
Fixes: 3558fe900e ("spi: sunxi: Add Allwinner A31 SPI controller driver")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-2-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Add brackets to protect parameters of the recently added sg_table related
macros from side-effects.
Fixes: 709d6d73c7 ("scatterlist: add generic wrappers for iterating over sgtable objects")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The max value of blkio.bfq.weight is 1000, rather than 10000.
And 'weights' have been remove from /sys/block/XXX/queue/iosched.
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The hardware codec on the A10, A10s, A13 and A20 needs buffer in the
first 256MB of RAM. This was solved by setting the CMA pool at a fixed
address in that range.
However, in recent kernels there's something else that comes in and
reserve some range that end up conflicting with our default pool
requirement, and thus makes its reservation fail.
The video codec will then use buffers from the usual default pool,
outside of the range it can access, and will fail to decode anything.
Since we're only concerned about that 256MB, we can however relax the
allocation to just specify the range that's allowed, and not try to
enforce a specific address.
Fixes: 5949bc5602 ("ARM: dts: sun4i-a10: Add Video Engine and reserved memory nodes")
Fixes: 9604320101 ("ARM: dts: sun5i: Add Video Engine and reserved memory nodes")
Fixes: c2a641a748 ("ARM: dts: sun7i-a20: Add Video Engine and reserved memory nodes")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Link: https://lore.kernel.org/r/20200704130829.34297-1-maxime@cerno.tech
Fixing the common case of:
perf record
perf report
And getting just the cycles events.
We now have a 'dummy' event to get perf metadata events that take place
while we synthesize metadata records for pre-existing processes by
traversing procfs, so we always have this extra 'dummy' evsel, but we
don't have to offer it as there will be no samples on it, remove this
distraction.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20200706115452.GA2772@redhat.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After recording PEBS-via-PT, perf script will not accept 'iregs' field e.g.
# perf record -c 10000 -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -I -- ls -l
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.062 MB perf.data ]
# ./perf script --itrace=eop -F+iregs
Samples for 'dummy:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
Fix by using allow_user_set, which is true when recording AUX area data.
Fixes: 9e64cefe43 ("perf intel-pt: Process options for PEBS event synthesis")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When recording PEBS-via-PT, the kernel will not accept the intel_pt
event with register sampling e.g.
# perf record --kcore -c 10000 -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -I -- ls -l
Error:
intel_pt/branch=0/: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
Fix by suppressing register sampling on the intel_pt evsel.
Committer notes:
Adrian informed that this is only available from Tremont onwards, so on
older processors the error continues the same as before.
Fixes: 9e64cefe43 ("perf intel-pt: Process options for PEBS event synthesis")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 07da1ffaa1 ("KVM: arm64: Remove host_cpu_context
member from vcpu structure") has, by removing the host CPU
context pointer, exposed that kvm_vcpu_pmu_restore_guest
is called in preemptible contexts:
[ 266.932442] BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-aar/779
[ 266.939721] caller is debug_smp_processor_id+0x20/0x30
[ 266.944157] CPU: 2 PID: 779 Comm: qemu-system-aar Tainted: G E 5.8.0-rc3-00015-g8d4aa58b2fe3 #1374
[ 266.954268] Hardware name: amlogic w400/w400, BIOS 2020.04 05/22/2020
[ 266.960640] Call trace:
[ 266.963064] dump_backtrace+0x0/0x1e0
[ 266.966679] show_stack+0x20/0x30
[ 266.969959] dump_stack+0xe4/0x154
[ 266.973338] check_preemption_disabled+0xf8/0x108
[ 266.977978] debug_smp_processor_id+0x20/0x30
[ 266.982307] kvm_vcpu_pmu_restore_guest+0x2c/0x68
[ 266.986949] access_pmcr+0xf8/0x128
[ 266.990399] perform_access+0x8c/0x250
[ 266.994108] kvm_handle_sys_reg+0x10c/0x2f8
[ 266.998247] handle_exit+0x78/0x200
[ 267.001697] kvm_arch_vcpu_ioctl_run+0x2ac/0xab8
Note that the bug was always there, it is only the switch to
using percpu accessors that made it obvious.
The fix is to wrap these accesses in a preempt-disabled section,
so that we sample a coherent context on trap from the guest.
Fixes: 435e53fb5e ("arm64: KVM: Enable VHE support for :G/:H perf event modifiers")
Cc:: Andrew Murray <amurray@thegoodpenguin.co.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Due to recent fixes in m68k arch-specific I/O accessor macros, this
driver is not working anymore for ColdFire. Fix wrong tcd endianness
removing additional swaps, since edma_writex() functions should already
take care of any eventual swap if needed.
Note, i could only test the change in ColdFire mcf54415 and Vybrid
vf50 / Colibri where i don't see any issue. So, every feedback and
test for all other SoCs involved is really appreciated.
Signed-off-by: Angelo Dureghello <angelo.dureghello@timesys.com>
Reported-by: kbuild test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20200701225205.1674463-1-angelo.dureghello@timesys.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
aspeed_create_fan() reads a pwm_port value using of_property_read_u32().
If pwm_port will be more than ARRAY_SIZE(pwm_port_params), there will be
a buffer overflow in
aspeed_create_pwm_port()->aspeed_set_pwm_port_enable(). The patch fixes
the potential buffer overflow.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Link: https://lore.kernel.org/r/20200703111518.9644-1-novikov@ispras.ru
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Using a mutex for "print this warning only once" is so overdesigned as
to be actively offensive to my sensitive stomach.
Just use "pr_info_once()" that already does this, although in a
(harmlessly) racy manner that can in theory cause the message to be
printed twice if more than one CPU races on that "is this the first
time" test.
[ If somebody really cares about that harmless data race (which sounds
very unlikely indeed), that person can trivially fix printk_once() by
using a simple atomic access, preferably with an optimistic non-atomic
test first before even bothering to treat the pointless "make sure it
is _really_ just once" case.
A mutex is most definitely never the right primitive to use for
something like this. ]
Yes, this is a small and meaningless detail in a code path that hardly
matters. But let's keep some code quality standards here, and not
accept outrageously bad code.
Link: https://lore.kernel.org/lkml/CAHk-=wgV9toS7GU3KmNpj8hCS9SeF+A0voHS8F275_mgLhL4Lw@mail.gmail.com/
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xenpv_exc_nmi() and xenpv_exc_debug() are only defined on 64-bit kernels,
but they snuck into the 32-bit build via <asm/identry.h>, causing the link
to fail:
ld: arch/x86/entry/entry_32.o: in function `asm_xenpv_exc_nmi':
(.entry.text+0x817): undefined reference to `xenpv_exc_nmi'
ld: arch/x86/entry/entry_32.o: in function `asm_xenpv_exc_debug':
(.entry.text+0x827): undefined reference to `xenpv_exc_debug'
Only use them on 64-bit kernels.
Fixes: f41f082422: ("x86/entry/xen: Route #DB correctly on Xen PV")
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 fixes from Thomas Gleixner:
"A series of fixes for x86:
- Reset MXCSR in kernel_fpu_begin() to prevent using a stale user
space value.
- Prevent writing MSR_TEST_CTRL on CPUs which are not explicitly
whitelisted for split lock detection. Some CPUs which do not
support it crash even when the MSR is written to 0 which is the
default value.
- Fix the XEN PV fallout of the entry code rework
- Fix the 32bit fallout of the entry code rework
- Add more selftests to ensure that these entry problems don't come
back.
- Disable 16 bit segments on XEN PV. It's not supported because XEN
PV does not implement ESPFIX64"
* tag 'x86-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ldt: Disable 16-bit segments on Xen PV
x86/entry/32: Fix #MC and #DB wiring on x86_32
x86/entry/xen: Route #DB correctly on Xen PV
x86/entry, selftests: Further improve user entry sanity checks
x86/entry/compat: Clear RAX high bits on Xen PV SYSENTER
selftests/x86: Consolidate and fix get/set_eflags() helpers
selftests/x86/syscall_nt: Clear weird flags after each test
selftests/x86/syscall_nt: Add more flag combinations
x86/entry/64/compat: Fix Xen PV SYSENTER frame setup
x86/entry: Move SYSENTER's regs->sp and regs->flags fixups into C
x86/entry: Assert that syscalls are on the right stack
x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted
x86/fpu: Reset MXCSR to default in kernel_fpu_begin()
Pull irq fixes from Thomas Gleixner:
"A set of interrupt chip driver fixes:
- Ensure the atomicity of affinity updates in the GIC driver
- Don't try to sleep in atomic context when waiting for the GICv4.1
to respond. Use polling instead.
- Typo fixes in Kconfig and warnings"
* tag 'irq-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic: Atomically update affinity
irqchip/riscv-intc: Fix a typo in a pr_warn()
irqchip/gic-v4.1: Use readx_poll_timeout_atomic() to fix sleep in atomic
irqchip/loongson-pci-msi: Fix a typo in Kconfig
Pull rcu fixlet from Thomas Gleixner:
"A single fix for a printk format warning in RCU"
* tag 'core-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcuperf: Fix printk format warning
Pull Kbuild fixes frin Masahiro Yamada:
- fix various bugs in xconfig
- fix some issues in cross-compilation using Clang
- fix documentation
* tag 'kbuild-fixes-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
.gitignore: Do not track `defconfig` from `make savedefconfig`
kbuild: make Clang build userprogs for target architecture
kbuild: fix CONFIG_CC_CAN_LINK(_STATIC) for cross-compilation with Clang
kconfig: qconf: parse newer types at debug info
kconfig: qconf: navigate menus on hyperlinks
kconfig: qconf: don't show goback button on splitMode
kconfig: qconf: simplify the goBack() logic
kconfig: qconf: re-implement setSelected()
kconfig: qconf: make debug links work again
kconfig: qconf: make search fully work again on split mode
kconfig: qconf: cleanup includes
docs: kbuild: fix ReST formatting
gcc-plugins: fix gcc-plugins directory path in documentation
Pull SCSI fixes from James Bottomley:
"Four small fixes in three drivers.
The mptfusion one has actually caused user visible issues in certain
kernel configurations"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mptfusion: Don't use GFP_ATOMIC for larger DMA allocations
scsi: libfc: Skip additional kref updating work event
scsi: libfc: Handling of extra kref
scsi: qla2xxx: Fix a condition in qla2x00_find_all_fabric_devs()
Pull block fixes from Jens Axboe:
- NVMe fixes from Christoph:
- Fix crash in multi-path disk add (Christoph)
- Fix ignore of identify error (Sagi)
- Fix a compiler complaint that a function should be static (Wei)
* tag 'block-5.8-2020-07-05' of git://git.kernel.dk/linux-block:
block: make function __bio_integrity_free() static
nvme: fix a crash in nvme_mpath_add_disk
nvme: fix identify error status silent ignore
Pull io_uring fix from Jens Axboe:
"Andres reported a regression with the fix that was merged earlier this
week, where his setup of using signals to interrupt io_uring CQ waits
no longer worked correctly.
Fix this, and also limit our use of TWA_SIGNAL to the case where we
need it, and continue using TWA_RESUME for task_work as before.
Since the original is marked for 5.7 stable, let's flush this one out
early"
* tag 'io_uring-5.8-2020-07-05' of git://git.kernel.dk/linux-block:
io_uring: fix regression with always ignoring signals in io_cqring_wait()
Pull i2c fixes from Wolfram Sang:
"The usual driver fixes and documentation updates"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mlxcpld: check correct size of maximum RECV_LEN packet
i2c: add Kconfig help text for slave mode
i2c: slave-eeprom: update documentation
i2c: eg20t: Load module automatically if ID matches
i2c: designware: platdrv: Set class based on DMI
i2c: algo-pca: Add 0x78 as SCL stuck low status for PCA9665
Pull MIPS fixes from Thomas Bogendoerfer:
- fix for missing hazard barrier
- DT fix for ingenic
- DT fix of GPHY names for lantiq
- fix usage of smp_processor_id() while preemption is enabled
* tag 'mips_fixes_5.8_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Do not use smp_processor_id() in preemptible code
MIPS: Add missing EHB in mtc0 -> mfc0 sequence for DSPen
MIPS: ingenic: gcw0: Fix HP detection GPIO.
MIPS: lantiq: xway: sysctrl: fix the GPHY clock alias names
If 'ad7780_init_gpios()' fails, we must not release some resources that
have not been allocated yet. Return directly instead.
Fixes: 5bb30e7daf ("staging: iio: ad7780: move regulator to after GPIO init")
Fixes: 9085daa4ab ("staging: iio: ad7780: add gain & filter gpio support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Renato Lui Geh <renatogeh@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses an array of smaller elements on the stack.
Here there is no data leak possibility so use an explicit structure
on the stack to ensure alignment and nice readable fashion.
The forced alignment of ts isn't strictly necessary in this driver
as the padding will be correct anyway (there isn't any). However
it is probably less fragile to have it there and it acts as
documentation of the requirement.
Fixes: 713bbb4efb ("iio: pressure: ms5611: Add triggered buffer support")
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Tomasz Duszynski <tomasz.duszynski@octakon.com>
Cc: <Stable@vger.kernel.org>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses an array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data.
This data is allocated with kzalloc so no data can leak
apart from previous readings.
Explicit alignment of ts needed to ensure consistent padding
on all architectures (particularly x86_32 with it's 4 byte alignment
of s64)
Fixes: e4a70e3e7d ("iio: humidity: add support to hts221 rh/temp combo device")
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: <Stable@vger.kernel.org>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses an array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data.
This data is allocated with kzalloc so no data can leak apart
from previous readings.
Fixes: 16bf793f86 ("iio: humidity: hdc100x: add triggered buffer support for HDC100X")
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Matt Ranostay <matt.ranostay@konsulko.com>
Cc: Alison Schofield <amsfield22@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: <Stable@vger.kernel.org>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses an array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data.
This data is allocated with kzalloc so no data can leak appart from
previous readings.
Fixes: 7c94a8b2ee ("iio: magn: add a driver for AK8974")
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: <Stable@vger.kernel.org>
This resolves the hazard between the mtc0 in the change_c0_status() and
the mfc0 in configure_exception_vector(). Without resolving this hazard
configure_exception_vector() could read an old value and would restore
this old value again. This would revert the changes change_c0_status()
did. I checked this by printing out the read_c0_status() at the end of
per_cpu_trap_init() and the ST0_MX is not set without this patch.
The hazard is documented in the MIPS Architecture Reference Manual Vol.
III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev
6.03 table 8.1 which includes:
Producer | Consumer | Hazard
----------|----------|----------------------------
mtc0 | mfc0 | any coprocessor 0 register
I saw this hazard on an Atheros AR9344 rev 2 SoC with a MIPS 74Kc CPU.
There the change_c0_status() function would activate the DSPen by
setting ST0_MX in the c0_status register. This was reverted and then the
system got a DSP exception when the DSP registers were saved in
save_dsp() in the first process switch. The crash looks like this:
[ 0.089999] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.097796] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.107070] Kernel panic - not syncing: Unexpected DSP exception
[ 0.113470] Rebooting in 1 seconds..
We saw this problem in OpenWrt only on the MIPS 74Kc based Atheros SoCs,
not on the 24Kc based SoCs. We only saw it with kernel 5.4 not with
kernel 4.19, in addition we had to use GCC 8.4 or 9.X, with GCC 8.3 it
did not happen.
In the kernel I bisected this problem to commit 9012d01166 ("compiler:
allow all arches to enable CONFIG_OPTIMIZE_INLINING"), but when this was
reverted it also happened after commit 172dcd935c ("MIPS: Always
allocate exception vector for MIPSr2+").
Commit 0b24cae4d5 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.")
does similar changes to a different file. I am not sure if there are
more places affected by this problem.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Running `make savedefconfig` creates by default `defconfig`, which is,
currently, on git’s radar, for example, `git status` lists this file as
untracked.
So, add the file to `.gitignore`, so it’s ignored by git.
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Taehee Yoo says:
====================
net: rmnet: fix interface leak for rmnet module
There are two problems in rmnet module that they occur the leak of
a lower interface.
The symptom is the same, which is the leak of a lower interface.
But there are two different real problems.
This patchset is to fix these real problems.
1. Do not allow to have different two modes.
As a lower interface of rmnet, there are two modes that they are VND
and BRIDGE.
One interface can have only one mode.
But in the current rmnet, there is no code to prevent to have
two modes in one lower interface.
So, interface leak occurs.
2. Do not allow to add multiple bridge interfaces.
rmnet can have only two bridge interface.
If an additional bridge interface is tried to be attached,
rmnet should deny it.
But there is no code to do that.
So, interface leak occurs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
rmnet can have only two bridge interface.
One of them is a link interface and another one is added by
the master operation.
rmnet interface shouldn't allow adding additional
bridge interfaces by mater operation.
But, there is no code to deny additional interfaces.
So, interface leak occurs.
Test commands:
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link add rmnet0 link dummy0 type rmnet mux_id 1
ip link set dummy1 master rmnet0
ip link set dummy2 master rmnet0
ip link del rmnet0
In the above test command, the dummy0 was attached to rmnet as VND mode.
Then, dummy1 was attached to rmnet0 as BRIDGE mode.
At this point, dummy0 mode is switched from VND to BRIDGE automatically.
Then, dummy2 is attached to rmnet as BRIDGE mode.
At this point, rmnet0 should deny this operation.
But, rmnet0 doesn't deny this.
So that below splat occurs when the rmnet0 interface is deleted.
Splat looks like:
[ 186.684787][ C2] WARNING: CPU: 2 PID: 1009 at net/core/dev.c:8992 rollback_registered_many+0x986/0xcf0
[ 186.684788][ C2] Modules linked in: rmnet dummy openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_x
[ 186.684805][ C2] CPU: 2 PID: 1009 Comm: ip Not tainted 5.8.0-rc1+ #621
[ 186.684807][ C2] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 186.684808][ C2] RIP: 0010:rollback_registered_many+0x986/0xcf0
[ 186.684811][ C2] Code: 41 8b 4e cc 45 31 c0 31 d2 4c 89 ee 48 89 df e8 e0 47 ff ff 85 c0 0f 84 cd fc ff ff 5
[ 186.684812][ C2] RSP: 0018:ffff8880cd9472e0 EFLAGS: 00010287
[ 186.684815][ C2] RAX: ffff8880cc56da58 RBX: ffff8880ab21c000 RCX: ffffffff9329d323
[ 186.684816][ C2] RDX: 1ffffffff2be6410 RSI: 0000000000000008 RDI: ffffffff95f32080
[ 186.684818][ C2] RBP: dffffc0000000000 R08: fffffbfff2be6411 R09: fffffbfff2be6411
[ 186.684819][ C2] R10: ffffffff95f32087 R11: 0000000000000001 R12: ffff8880cd947480
[ 186.684820][ C2] R13: ffff8880ab21c0b8 R14: ffff8880cd947400 R15: ffff8880cdf10640
[ 186.684822][ C2] FS: 00007f00843890c0(0000) GS:ffff8880d4e00000(0000) knlGS:0000000000000000
[ 186.684823][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 186.684825][ C2] CR2: 000055b8ab1077b8 CR3: 00000000ab612006 CR4: 00000000000606e0
[ 186.684826][ C2] Call Trace:
[ 186.684827][ C2] ? lockdep_hardirqs_on_prepare+0x379/0x540
[ 186.684829][ C2] ? netif_set_real_num_tx_queues+0x780/0x780
[ 186.684830][ C2] ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[ 186.684831][ C2] ? __kasan_slab_free+0x126/0x150
[ 186.684832][ C2] ? kfree+0xdc/0x320
[ 186.684834][ C2] ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[ 186.684835][ C2] unregister_netdevice_many.part.135+0x13/0x1b0
[ 186.684836][ C2] rtnl_delete_link+0xbc/0x100
[ ... ]
[ 238.440071][ T1009] unregister_netdevice: waiting for rmnet0 to become free. Usage count = 1
Fixes: 037f9cdf72 ("net: rmnet: use upper/lower device infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sending mailbox in the work of aeq event, another aeq event
will be triggered. because the last aeq work is not exited and only
one work can be excuted simultaneously in the same workqueue, mailbox
sending function will return failure of timeout. We create and use
another workqueue to fix this.
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Use kvfree() to release vmalloc()'ed areas in ipset, from Eric Dumazet.
2) UAF in nfnetlink_queue from the nf_conntrack_update() path.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap says:
====================
Documentation: networking: eliminate doubled words
Drop all duplicated words in Documentation/networking/ files.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There are some `static const u8` variables that are not used, this
triggers a warning building with `make W=1`, it is safe to remove them,
so do it and make the compiler more happy.
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
The driver will be loaded by via a platform device. So we
will need to get the device_node from the parent device.
Depending on this we will set the driver data.
As all this is done later already, just delete the call to
of_device_get_match_data.
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
In function mtk_dsi_clk_hs_state, remove unnecessary conversion
to bool return, this change is to make the code a bit readable.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Pull powerpc fixes from Michael Ellerman:
"One fix for a regression in our pkey handling, which exhibits as
PROT_EXEC mappings taking continuous page faults.
Thanks to: Jan Stancek, Aneesh Kumar K.V"
* tag 'powerpc-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/pkeys: Make pkey access check work on execute_only_key
Pull arm64 fixes from Will Deacon:
"Nothing earth-shattering, really - some CPU errata workarounds (one
day they'll get it right, ha!) and a fix for a boot failure with very
large kernel images where the alternative patching gets confused when
patching relative branches using veneers.
- Fix alternative patching for very large kernel images and modules
- Hook up existing CPU errata workarounds for Qualcomm Kryo CPUs"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Add KRYO4XX silver CPU cores to erratum list 1530923 and 1024718
arm64: Add KRYO4XX gold CPU cores to erratum list 1463225 and 1418040
arm64: Add MIDR value for KRYO4XX gold CPU cores
arm64/alternatives: use subsections for replacement sequences
When switching to TWA_SIGNAL for task_work notifications, we also made
any signal based condition in io_cqring_wait() return -ERESTARTSYS.
This breaks applications that rely on using signals to abort someone
waiting for events.
Check if we have a signal pending because of queued task_work, and
repeat the signal check once we've run the task_work. This provides a
reliable way of telling the two apart.
Additionally, only use TWA_SIGNAL if we are using an eventfd. If not,
we don't have the dependency situation described in the original commit,
and we can get by with just using TWA_RESUME like we previously did.
Fixes: ce593a6c48 ("io_uring: use signal based task_work running")
Cc: stable@vger.kernel.org # v5.7
Reported-by: Andres Freund <andres@anarazel.de>
Tested-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
On Xen PV, #DB doesn't use IST. It still needs to be correctly routed
depending on whether it came from user or kernel mode.
Get rid of DECLARE/DEFINE_IDTENTRY_XEN -- it was too hard to follow the
logic. Instead, route #DB and NMI through DECLARE/DEFINE_IDTENTRY_RAW on
Xen, and do the right thing for #DB. Also add more warnings to the
exc_debug* handlers to make this type of failure more obvious.
This fixes various forms of corruption that happen when usermode
triggers #DB on Xen PV.
Fixes: 4c0dcd8350 ("x86/entry: Implement user mode C entry points for #DB and #MCE")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/4163e733cce0b41658e252c6c6b3464f33fdff17.1593795633.git.luto@kernel.org
When looking for a registered client to attach with, the wrong reference
counters are being grabbed. The idea is to increment the module and device
counters of the client device and not the counters of the axi device being
probed.
Fixes: ef04070692 (iio: adc: adi-axi-adc: add support for AXI ADC IP core)
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Acked-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Pull xen fixes from Juergen Gross:
"One small cleanup patch for ARM and two patches for the xenbus driver
fixing latent problems (large stack allocations and bad return code
settings)"
* tag 'for-linus-5.8b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: let xenbus_map_ring_valloc() return errno values only
xen/xenbus: avoid large structs and arrays on the stack
arm/xen: remove the unused macro GRANT_TABLE_PHYSADDR
I2C_SMBUS_BLOCK_MAX defines already the maximum number as defined in the
SMBus 2.0 specs. I don't see a reason to add 1 here. Also, fix the errno
to what is suggested for this error.
Fixes: c9bfdc7c16 ("i2c: mlxcpld: Add support for smbus block read transaction")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Michael Shych <michaelsh@mellanox.com>
Tested-by: Michael Shych <michaelsh@mellanox.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Pull sysctl fix from Al Viro:
"Another regression fix for sysctl changes this cycle..."
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Call sysctl_head_finish on error
The driver can't be loaded automatically because it misses
module alias to be provided. Add corresponding MODULE_DEVICE_TABLE()
call to the driver.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Current AMD's zen-based APUs use this core for some of its i2c-buses.
With this patch we re-enable autodetection of hwmon-alike devices, so
lm-sensors will be able to work automatically.
It does not affect the boot-time of embedded devices, as the class is
set based on the DMI information.
DMI is probed only on Qtechnology QT5222 Industrial Camera Platform.
DocLink: https://qtec.com/camera-technology-camera-platforms/
Fixes: 3eddad96c4 ("i2c: designware: reverts "i2c: designware: Add support for AMD I2C controller"")
Signed-off-by: Ricardo Ribalda <ribalda@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
The PCA9665 datasheet says that I2CSTA = 78h indicates that SCL is stuck
low, this differs to the PCA9564 which uses 90h for this indication.
Treat either 0x78 or 0x90 as an indication that the SCL line is stuck.
Based on looking through the PCA9564 and PCA9665 datasheets this should
be safe for both chips. The PCA9564 should not return 0x78 for any valid
state and the PCA9665 should not return 0x90.
Fixes: eff9ec95ef ("i2c-algo-pca: Add PCA9665 support")
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Pull cifs fixes from Steve French:
"Eight cifs/smb3 fixes, most when specifying the multiuser mount flag.
Five of the fixes are for stable"
* tag '5.8-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: prevent truncation from long to int in wait_for_free_credits
cifs: Fix the target file was deleted when rename failed.
SMB3: Honor 'posix' flag for multiuser mounts
SMB3: Honor 'handletimeout' flag for multiuser mounts
SMB3: Honor lease disabling for multiuser mounts
SMB3: Honor persistent/resilient handle flags for multiuser mounts
SMB3: Honor 'seal' flag for multiuser mounts
cifs: Display local UID details for SMB sessions in DebugData
Pull hwmon fixes from Guenter Roeck:
- Fix typo in Kconfig SENSORS_IR35221 option
- Fix potential memory leak in acpi_power_meter_add()
- Make sure the OVERT mask is set correctly in max6697 driver
- In PMBus core, fix page vs. register when accessing fans
- Mark is_visible functions static in bt1-pvt driver
- Define Temp- and Volt-to-N poly as maybe-unused in bt1-pvt driver
* tag 'hwmon-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pmbus) fix a typo in Kconfig SENSORS_IR35221 option
hwmon: (acpi_power_meter) Fix potential memory leak in acpi_power_meter_add()
hwmon: (max6697) Make sure the OVERT mask is set correctly
hwmon: (pmbus) Fix page vs. register when accessing fans
hwmon: (bt1-pvt) Mark is_visible functions static
hwmon: (bt1-pvt) Define Temp- and Volt-to-N poly as maybe-unused
Merge misc fixes from Andrew Morton:
"Subsystems affected by this patch series: mm/hugetlb, samples, mm/cma,
mm/vmalloc, mm/pagealloc"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/page_alloc: fix documentation error
vmalloc: fix the owner argument for the new __vmalloc_node_range callers
mm/cma.c: use exact_nid true to fix possible per-numa cma leak
samples/vfs: avoid warning in statx override
mm/hugetlb.c: fix pages per hugetlb calculation
Calling cma_declare_contiguous_nid() with false exact_nid for per-numa
reservation can easily cause cma leak and various confusion. For example,
mm/hugetlb.c is trying to reserve per-numa cma for gigantic pages. But it
can easily leak cma and make users confused when system has memoryless
nodes.
In case the system has 4 numa nodes, and only numa node0 has memory. if
we set hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma areas for 4
different numa nodes. since exact_nid=false in current code, all 4 numa
nodes will get cma successfully from node0, but hugetlb_cma[1 to 3] will
never be available to hugepage will only allocate memory from
hugetlb_cma[0].
In case the system has 4 numa nodes, both numa node0&2 has memory, other
nodes have no memory. if we set hugetlb_cma=4G in bootargs, mm/hugetlb.c
will get 4 cma areas for 4 different numa nodes. since exact_nid=false in
current code, all 4 numa nodes will get cma successfully from node0 or 2,
but hugetlb_cma[1] and [3] will never be available to hugepage as
mm/hugetlb.c will only allocate memory from hugetlb_cma[0] and
hugetlb_cma[2]. This causes permanent leak of the cma areas which are
supposed to be used by memoryless node.
Of cource we can workaround the issue by letting mm/hugetlb.c scan all cma
areas in alloc_gigantic_page() even node_mask includes node0 only. that
means when node_mask includes node0 only, we can get page from
hugetlb_cma[1] to hugetlb_cma[3]. But this will cause kernel crash in
free_gigantic_page() while it wants to free page by:
cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order)
On the other hand, exact_nid=false won't consider numa distance, it might
be not that useful to leverage cma areas on remote nodes. I feel it is
much simpler to make exact_nid true to make everything clear. After that,
memoryless nodes won't be able to reserve per-numa CMA from other nodes
which have memory.
Fixes: cf11e85fc0 ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200628074345.27228-1-song.bao.hua@hisilicon.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Something changed recently to uncover this warning:
samples/vfs/test-statx.c:24:15: warning: `struct foo' declared inside parameter list will not be visible outside of this definition or declaration
24 | #define statx foo
| ^~~
Which is due the use of "struct statx" (here, "struct foo") in a function
prototype argument list before it has been defined:
int
# 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
foo
# 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
(int __dirfd, const char *__restrict __path, int __flags,
unsigned int __mask, struct
# 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
foo
# 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
*__restrict __buf)
__attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2, 5)));
Add explicit struct before #include to avoid warning.
Fixes: f1b5618e01 ("vfs: Add a sample program for the new mount API")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/202006282213.C516EA6@keescook
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The routine hpage_nr_pages() was incorrectly used to calculate the number
of base pages in a hugetlb page. hpage_nr_pages is designed to be called
for THP pages and will return HPAGE_PMD_NR for hugetlb pages of any size.
Due to the context in which hpage_nr_pages was called, it is unlikely to
produce a user visible error. The routine with the incorrect call is only
exercised in the case of hugetlb memory error or migration. In addition,
this would need to be on an architecture which supports huge page sizes
less than PMD_SIZE. And, the vma containing the huge page would also need
to smaller than PMD_SIZE.
Fixes: c0d0381ade ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200629185003.97202-1-mike.kravetz@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull xfs fix from Darrick Wong:
"Fix a use-after-free bug when the fs shuts down"
* tag 'xfs-5.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix use-after-free on CIL context on shutdown
There are a couple of places in net/sched/ that check skb->protocol and act
on the value there. However, in the presence of VLAN tags, the value stored
in skb->protocol can be inconsistent based on whether VLAN acceleration is
enabled. The commit quoted in the Fixes tag below fixed the users of
skb->protocol to use a helper that will always see the VLAN ethertype.
However, most of the callers don't actually handle the VLAN ethertype, but
expect to find the IP header type in the protocol field. This means that
things like changing the ECN field, or parsing diffserv values, stops
working if there's a VLAN tag, or if there are multiple nested VLAN
tags (QinQ).
To fix this, change the helper to take an argument that indicates whether
the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
make sure to skip all of them, so behaviour is consistent even in QinQ
mode.
To make the helper usable from the ECN code, move it to if_vlan.h instead
of pkt_sched.h.
v3:
- Remove empty lines
- Move vlan variable definitions inside loop in skb_protocol()
- Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
bpf_skb_ecn_set_ce()
v2:
- Use eth_type_vlan() helper in skb_protocol()
- Also fix code that reads skb->protocol directly
- Change a couple of 'if/else if' statements to switch constructs to avoid
calling the helper twice
Reported-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
Fixes: d8b9605d26 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull PCI fix from Bjorn Helgaas:
"Fix a pcie_find_root_port() simplification that broke power management
because it didn't handle the edge case of finding the Root Port of a
Root Port itself (Mika Westerberg)""
* tag 'pci-v5.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Make pcie_find_root_port() work for Root Ports
Pull ACPI updates from Rafael Wysocki:
"Add a new device ID for Intel Tiger Lake to the DPTF battery
participant driver (Srinivas Pandruvada) and fix the Tiger Lake fan
device ID (Sumeet Pawnikar)"
* tag 'acpi-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: fan: Fix Tiger Lake ACPI device ID
ACPI: DPTF: Add battery participant for TigerLake
Pull gfs2 fixes from Andreas Gruenbacher:
"Various gfs2 fixes"
* tag 'gfs2-v5.8-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: The freeze glock should never be frozen
gfs2: When freezing gfs2, use GL_EXACT and not GL_NOCACHE
gfs2: read-only mounts should grab the sd_freeze_gl glock
gfs2: freeze should work on read-only mounts
gfs2: eliminate GIF_ORDERED in favor of list_empty
gfs2: Don't sleep during glock hash walk
gfs2: fix trans slab error when withdraw occurs inside log_flush
gfs2: Don't return NULL from gfs2_inode_lookup
Pull drm fixes from Dave Airlie:
"Pretty usual rc4 pull: two usual amdgpu, i915 pulls, and some misc arm
driver fixes.
The bigger bit is including the asm sources for some GPU shaders that
were contained in the i915 driver, otherwise it's pretty much business
as usual.
dma-buf:
- fix a use-after-free bug
amdgpu:
- Fix for vega20 boards without RAS support
- DC bandwidth revalidation fix
- Fix Renoir vram info fetching
- Fix hwmon freq printing
i915:
- GVT fixes
- Two missed MMIO handler fixes for SKL/CFL
- Fix mask register bits check
- Fix one lockdep error for debugfs entry access
- Include asm sources for render cache clear batches
msm:
- memleak fix
- display block fix
- address space fixes
exynos:
- error value and reference count fix
- error print removal
sun4i:
- remove HPD polling"
* tag 'drm-fixes-2020-07-03' of git://anongit.freedesktop.org/drm/drm: (22 commits)
drm/amdgpu: use %u rather than %d for sclk/mclk
drm/amdgpu/atomfirmware: fix vram_info fetching for renoir
drm/amd/display: Only revalidate bandwidth on medium and fast updates
drm: sun4i: hdmi: Remove extra HPD polling
drm/i915: Include asm sources for {ivb, hsw}_clear_kernel.c
drm/exynos: fix ref count leak in mic_pre_enable
drm/exynos: Properly propagate return value in drm_iommu_attach_device()
drm/exynos: Remove dev_err() on platform_get_irq() failure
drm/amd/powerplay: Fix NULL dereference in lock_bus() on Vega20 w/o RAS
dma-buf: Move dma_buf_release() from fops to dentry_ops
drm/msm: Fix up the rest of the messed up address sizes
drm/msm: Fix setup of a6xx create_address_space.
drm/msm: Fix address space size after refactor.
drm/i915/gvt: Use GFP_ATOMIC instead of GFP_KERNEL in atomic context
drm/i915/gvt: Fix incorrect check of enabled bits in mask registers
drm/i915/gvt: Fix two CFL MMIO handling caused by regression.
drm/i915/gvt: Add one missing MMIO handler for D_SKL_PLUS
drm/msm: Fix 0xfffflub in "Refactor address space initialization"
drm/msm/dpu: allow initialization of encoder locks during encoder init
drm/msm/dpu: fix error return code in dpu_encoder_init
...
This error path returned directly instead of calling sysctl_head_finish().
Fixes: ef9d965bc8 ("sysctl: reject gigantic reads/write to sysctl files")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Use the "common" KVM_POSSIBLE_CR*_GUEST_BITS defines to initialize the
CR0/CR4 guest host masks instead of duplicating most of the CR4 mask and
open coding the CR0 mask. SVM doesn't utilize the masks, i.e. the masks
are effectively VMX specific even if they're not named as such. This
avoids duplicate code, better documents the guest owned CR0 bit, and
eliminates the need for a build-time assertion to keep VMX and x86
synchronized.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703040422.31536-3-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mark CR4.TSD as being possibly owned by the guest as that is indeed the
case on VMX. Without TSD being tagged as possibly owned by the guest, a
targeted read of CR4 to get TSD could observe a stale value. This bug
is benign in the current code base as the sole consumer of TSD is the
emulator (for RDTSC) and the emulator always "reads" the entirety of CR4
when grabbing bits.
Add a build-time assertion in to ensure VMX doesn't hand over more CR4
bits without also updating x86.
Fixes: 52ce3c21ae ("x86,kvm,vmx: Don't trap writes to CR4.TSD")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703040422.31536-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Inject a #GP on MOV CR4 if CR4.LA57 is toggled in 64-bit mode, which is
illegal per Intel's SDM:
CR4.LA57
57-bit linear addresses (bit 12 of CR4) ... blah blah blah ...
This bit cannot be modified in IA-32e mode.
Note, the pseudocode for MOV CR doesn't call out the fault condition,
which is likely why the check was missed during initial development.
This is arguably an SDM bug and will hopefully be fixed in future
release of the SDM.
Fixes: fd8cb43373 ("KVM: MMU: Expose the LA57 feature to VM.")
Cc: stable@vger.kernel.org
Reported-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703021714.5549-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This function is used by dev_get_regmap() to retrieve a regmap for the
specified device. If the device has more than one regmap, the name parameter
can be used to specify one.
The code here uses a pointer comparison to check for equal strings. This
however will probably always fail, as the regmap->name is allocated via
kstrdup_const() from the regmap's config->name.
Fix this by using strcmp() instead.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20200703103315.267996-1-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Kernel commit dc4e2801d4 (ring-buffer: Redefine the unimplemented
RINGBUF_TYPE_TIME_STAMP) changed the way the ring buffer timestamps work
- after that commit the previously unimplemented RINGBUF_TYPE_TIME_STAMP
type causes the time delta to be used as a timestamp rather than a delta
to be added to the timestamp.
The trace-cmd code didn't get updated to handle this, so misinterprets
the event data for this case, which causes a cascade of errors,
including trace-report not being able to identify synthetic (or any
other) events generated by the histogram code (which uses TIME_STAMP
mode). For example, the following triggers along with the trace-cmd
shown cause an UNKNOWN_EVENT error and trace-cmd report crash:
# echo 'wakeup_latency u64 lat pid_t pid char comm[16]' > /sys/kernel/debug/tracing/synthetic_events
# echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="ping"' > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_wakeup).trace(wakeup_latency,$wakeup_lat,next_pid,next_comm) if next_comm=="ping"' > /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
# echo 'hist:keys=comm,pid,lat:wakeup_lat=lat:sort=lat' > /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
# trace-cmd record -e wakeup_latency -e sched_wakeup -f comm==\"ping\" ping localhost -c 5
# trace-cmd report
CPU 0 is empty
CPU 1 is empty
CPU 2 is empty
CPU 3 is empty
CPU 5 is empty
CPU 6 is empty
CPU 7 is empty
cpus=8
ug! no event found for type 0
[UNKNOWN TYPE 0]
ug! no event found for type 11520
Segmentation fault (core dumped)
After this patch we get the correct interpretation and the events are
shown properly:
# trace-cmd report
CPU 0 is empty
CPU 1 is empty
CPU 2 is empty
CPU 3 is empty
CPU 5 is empty
CPU 6 is empty
CPU 7 is empty
cpus=8
<idle>-0 [004] 23284.341392: sched_wakeup: ping:12031 [120] success=1 CPU:004
<idle>-0 [004] 23284.341464: wakeup_latency: lat=58, pid=12031, comm=ping
<idle>-0 [004] 23285.365303: sched_wakeup: ping:12031 [120] success=1 CPU:004
<idle>-0 [004] 23285.365382: wakeup_latency: lat=64, pid=12031, comm=ping
<idle>-0 [004] 23286.389290: sched_wakeup: ping:12031 [120] success=1 CPU:004
<idle>-0 [004] 23286.389378: wakeup_latency: lat=72, pid=12031, comm=ping
<idle>-0 [004] 23287.413213: sched_wakeup: ping:12031 [120] success=1 CPU:004
<idle>-0 [004] 23287.413291: wakeup_latency: lat=64, pid=12031, comm=ping
Link: http://lkml.kernel.org/r/1567628224.13841.4.camel@kernel.org
Link: http://lore.kernel.org/linux-trace-devel/20200625100516.365338-3-tz.stoyanov@gmail.com
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
[ Ported from trace-cmd.git ]
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200702185703.785094515@goodmis.org
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using Python version 3.8.2 and PySide2 version 5.14.0, time chart call tree
would not expand the tree to the result. Fix by using setExpanded().
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Charts -> Time chart by CPU
Move mouse over middle of chart
Right-click and select Show Call Tree
Before: displays Call Tree but not expanded to selected time
After: displays Call Tree expanded to selected time
Fixes: e69d5df75d ("perf scripts python: exported-sql-viewer.py: Add ability for Call tree to open at a specified task and time")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using ctrl-F ('Find') would not find 'unknown' because it matches id
zero. Fix by excluding id zero from selection.
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Call Tree
Press: Ctrl-F
Enter: unknown
Press: Enter
Before: displays 'unknown' not found
After: tree is expanded to line showing 'unknown'
Fixes: ae8b887c00 ("perf scripts python: exported-sql-viewer.py: Add call tree")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using ctrl-F ('Find') would not find 'unknown' because it matches id zero.
Fix by excluding id zero from selection.
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Context-Sensitive Call Graph
Press: Ctrl-F
Enter: unknown
Press: Enter
Before: gets stuck
After: tree is expanded to line showing 'unknown'
Fixes: 254c0d820b ("perf scripts python: exported-sql-viewer.py: Factor out CallGraphModelBase")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using Python version 3.8.2 and PySide2 version 5.14.0, ctrl-F ('Find')
would not expand the tree to the result. Fix by using setExpanded().
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Context-Sensitive Call Graph or Reports -> Call Tree
Press: Ctrl-F
Enter: main
Press: Enter
Before: line showing 'main' does not display
After: tree is expanded to line showing 'main'
Fixes: ebd70c7dc2 ("perf scripts python: exported-sql-viewer.py: Add ability to find symbols in the call-graph")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 0a892c1c94 ("perf record: Add dummy event during system wide
synthesis") reveals an issue with Intel PT system wide tracing.
Specifically that Intel PT already adds a dummy tracking event, and it
is not the first event. Adding another dummy tracking event causes
duplicated sideband events. Fix by checking for an existing dummy
tracking event first.
Example showing duplicated switch events:
Before:
# perf record -a -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.895 MB perf.data ]
# perf script --no-itrace --show-switch-events | head
swapper 0 [007] 6390.516222: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
swapper 0 [007] 6390.516222: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_sched 11 [007] 6390.516223: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_sched 11 [007] 6390.516224: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_sched 11 [007] 6390.516227: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
rcu_sched 11 [007] 6390.516227: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [007] 6390.516228: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
swapper 0 [007] 6390.516228: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
swapper 0 [002] 6390.516415: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 5556/5559
swapper 0 [002] 6390.516416: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 5556/5559
After:
# perf record -a -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.868 MB perf.data ]
# perf script --no-itrace --show-switch-events | head
swapper 0 [005] 6450.567013: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 7179/7181
perf 7181 [005] 6450.567014: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
perf 7181 [005] 6450.567028: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [005] 6450.567029: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 7179/7181
swapper 0 [005] 6450.571699: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_sched 11 [005] 6450.571700: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_sched 11 [005] 6450.571702: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [005] 6450.571703: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
swapper 0 [005] 6450.579703: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_sched 11 [005] 6450.579704: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200629091955.17090-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To bring in the change made in this cset:
e3a9e681ad ("x86/entry: Fixup bad_iret vs noinstr")
This doesn't cause any functional changes to tooling, just a rebuild.
Addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this patch, some gfs2 code locked the freeze glock with LM_FLAG_NOEXP
(Do not freeze) flag, and some did not. We never want to freeze the freeze
glock, so this patch makes it consistently use LM_FLAG_NOEXP always.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Before this patch, the freeze code in gfs2 specified GL_NOCACHE in
several places. That's wrong because we always want to know the state
of whether the file system is frozen.
There was also a problem with freeze/thaw transitioning the glock from
frozen (EX) to thawed (SH) because gfs2 will normally grant glocks in EX
to processes that request it in SH mode, unless GL_EXACT is specified.
Therefore, the freeze/thaw code, which tried to reacquire the glock in
SH mode would get the glock in EX mode, and miss the transition from EX
to SH. That made it think the thaw had completed normally, but since the
glock was still cached in EX, other nodes could not freeze again.
This patch removes the GL_NOCACHE flag to allow the freeze glock to be
cached. It also adds the GL_EXACT flag so the glock is fully transitioned
from EX to SH, thereby allowing future freeze operations.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Before this patch, only read-write mounts would grab the freeze
glock in read-only mode, as part of gfs2_make_fs_rw. So the freeze
glock was never initialized. That meant requests to freeze, which
request the glock in EX, were granted without any state transition.
That meant you could mount a gfs2 file system, which is currently
frozen on a different cluster node, in read-only mode.
This patch makes read-only mounts lock the freeze glock in SH mode,
which will block for file systems that are frozen on another node.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Before this patch, function freeze_go_sync, called when promoting
the freeze glock, was testing for the SDF_JOURNAL_LIVE superblock flag.
That's only set for read-write mounts. Read-only mounts don't use a
journal, so the bit is never set, so the freeze never happened.
This patch removes the check for SDF_JOURNAL_LIVE for freeze requests
but still checks it when deciding whether to flush a journal.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
In several places, we used the GIF_ORDERED inode flag to determine
if an inode was on the ordered writes list. However, since we always
held the sd_ordered_lock spin_lock during the manipulation, we can
just as easily check list_empty(&ip->i_ordered) instead.
This allows us to keep more than one ordered writes list to make
journal writing improvements.
This patch eliminates GIF_ORDERED in favor of checking list_empty.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
While e3a3c3a205 ("UIO: fix uio_pdrv_genirq with device tree but no
interrupt") added support for using uio_pdrv_genirq for devices without
interrupt for device tree platforms, the removal of uio_pdrv in
26dac3c49d ("uio: Remove uio_pdrv and use uio_pdrv_genirq instead")
broke the support for non device tree platforms.
This change fixes this, so that uio_pdrv_genirq can be used without
interrupt on all platforms.
This still leaves the support that uio_pdrv had for custom interrupt
handler lacking, as uio_pdrv_genirq does not handle it (yet).
Fixes: 26dac3c49d ("uio: Remove uio_pdrv and use uio_pdrv_genirq instead")
Signed-off-by: Esben Haabendal <esben@geanix.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200701145659.3978-3-esben@geanix.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull m68knommu mm fixes from Greg Ungerer:
"Two critical mm related fixes that affect booting of m68k/ColdFire
devices.
Both fix problems caused by recent system init memblock changes"
* tag 'm68knommu-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: mm: fix node memblock init
m68k: nommu: register start of the memory with memblock
Pull devicetree fixes from Rob Herring:
- Sync dtc to upstream to pick up fixes for I2C bus checks and quiet
warnings
- Various fixes for DT binding check warnings
- A couple of build fixes/improvements for binding checks
- ReST formatting improvements for writing-schema.rst
- Document reference fixes
* tag 'devicetree-fixes-for-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: clock: imx: Fix e-mail address
dt-bindings: thermal: k3: Fix the reg property
dt-bindings: thermal: Remove soc unit address
dt-bindings: display: arm: versatile: Pass the sysreg unit name
dt-bindings: usb: aspeed: Remove the leading zeroes
dt-bindings: copy process-schema-examples.yaml to process-schema.yaml
dt-bindings: do not build processed-schema.yaml for 'make dt_binding_check'
dt-bindings: fix error in 'make clean' after 'make dt_binding_check'
dt-bindings: mailbox: zynqmp_ipi: fix unit address
dt-bindings: bus: uniphier-system-bus: fix warning in example
scripts/dtc: Update to upstream version v1.6.0-11-g9d7888cbf19c
doc: devicetree: bindings: fix spelling mistake
docs: dt: minor adjustments at writing-schema.rst
dt: fix reference to olpc,xo1.75-ec.txt
dt: Fix broken references to renamed docs
dt: fix broken links due to txt->yaml renames
dt: update a reference for reneases pcar file renamed to yaml
Pull data race annotation from Christian Brauner:
"This contains an annotation patch for a data race in copy_process()
reported by KCSAN when reading and writing nr_threads.
The data race is intentional and benign. This is obvious from the
comment above the relevant code and based on general consensus when
discussing this issue. So simply using data_race() to annotate this as
an intentional race seems the best option"
* tag 'for-linus-2020-07-02' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
fork: annotate data race in copy_process()
Pull tpm fixes from Jarkko Sakkinen:
"These are just fixes for bugs found lately.
All of them are small scale things here and there, and all of them are
for previous kernel releases (the oldest appeared in v2.6.17)"
* tag 'tpmdd-next-v5.8-rc4' of git://git.infradead.org/users/jjs/linux-tpmdd:
tpm_tis: Remove the HID IFX0102
tpm_tis_spi: Prefer async probe
tpm: ibmvtpm: Wait for ready buffer before probing for TPM2 attributes
tpm/st33zp24: fix spelling mistake "drescription" -> "description"
tpm_tis: extra chip->ops check on error path in tpm_tis_core_init
tpm_tis_spi: Don't send anything during flow control
tpm: Fix TIS locality timeout problems
Pull kselftest fixes from Shuah Khan:
"tpm test fixes from Jarkko Sakkinen"
* tag 'linux-kselftest-fixes-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: tpm: Use /bin/sh instead of /bin/bash
selftests: tpm: Use 'test -e' instead of 'test -f'
Revert "tpm: selftest: cleanup after unseal with wrong auth/policy test"
Pull kunit fixes from Shuah Khan
"Fixes for build and run-times failures.
Also includes troubleshooting tips updates to kunit user
documentation"
* tag 'linux-kselftest-kunit-fixes-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
Documentation: kunit: Add some troubleshooting tips to the FAQ
kunit: kunit_tool: Fix invalid result when build fails
kunit: show error if kunit results are not present
kunit: kunit_config: Fix parsing of CONFIG options with space
Pull nfsd fixes from Bruce Fields:
"Fixes for a umask bug on exported filesystems lacking ACL support, a
leak and a module unloading bug in the /proc/fs/nfsd/clients/ code,
and a compile warning"
* tag 'nfsd-5.8-1' of git://linux-nfs.org/~bfields/linux:
SUNRPC: Add missing definition of ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
nfsd: fix nfsdfs inode reference count leak
nfsd4: fix nfsdfs reference count loop
nfsd: apply umask on fs without ACL support
Pull block fixes from Jens Axboe:
- Use kvfree_sensitive() for the block keyslot free (Eric)
- Sync blk-mq debugfs flags (Hou)
- Memory leak fix in virtio-blk error path (Hou)
* tag 'block-5.8-2020-07-01' of git://git.kernel.dk/linux-block:
virtio-blk: free vblk-vqs in error path of virtblk_probe()
block/keyslot-manager: use kvfree_sensitive()
blk-mq-debugfs: update blk_queue_flag_name[] accordingly for new flags
Pull io_uring fixes from Jens Axboe:
"One fix in here, for a regression in 5.7 where a task is waiting in
the kernel for a condition, but that condition won't become true until
task_work is run. And the task_work can't be run exactly because the
task is waiting in the kernel, so we'll never make any progress.
One example of that is registering an eventfd and queueing io_uring
work, and then the task goes and waits in eventfd read with the
expectation that it'll get woken (and read an event) when the io_uring
request completes. The io_uring request is finished through task_work,
which won't get run while the task is looping in eventfd read"
* tag 'io_uring-5.8-2020-07-01' of git://git.kernel.dk/linux-block:
io_uring: use signal based task_work running
task_work: teach task_work_add() to do signal_wake_up()
I would like that Claudiu becomes co-maintainer of the Cadence macb
driver. He's already participating to lots of reviews and enhancements
to this driver and knows the different versions of this controller.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The number of ports is incorrectly set to the maximum available for a DSA
switch. Even if the extra ports are not used, this causes some functions
to be called later, like port_disable() and port_stp_state_set(). If the
driver doesn't check the port index, it will end up modifying unknown
registers.
Fixes: b987e98e50 ("dsa: add DSA switch driver for Microchip KSZ9477")
Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Today xenbus_map_ring_valloc() can return either a negative errno
value (-ENOMEM or -EINVAL) or a grant status value. This is a mess as
e.g -ENOMEM and GNTST_eagain have the same numeric value.
Fix that by turning all grant mapping errors into -ENOENT. This is
no problem as all callers of xenbus_map_ring_valloc() only use the
return value to print an error message, and in case of mapping errors
the grant status value has already been printed by __xenbus_map_ring()
before.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20200701121638.19840-3-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
xenbus_map_ring_valloc() and its sub-functions are putting quite large
structs and arrays on the stack. This is problematic at runtime, but
might also result in build failures (e.g. with clang due to the option
-Werror,-Wframe-larger-than=... used).
Fix that by moving most of the data from the stack into a dynamically
allocated struct. Performance is no issue here, as
xenbus_map_ring_valloc() is used only when adding a new PV device to
a backend driver.
While at it move some duplicated code from pv/hvm specific mapping
functions to the single caller.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20200701121638.19840-2-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
This essentially reverts commit 7212303268 ("tcp: md5: reject TCP_MD5SIG
or TCP_MD5SIG_EXT on established sockets")
Mathieu reported that many vendors BGP implementations can
actually switch TCP MD5 on established flows.
Quoting Mathieu :
Here is a list of a few network vendors along with their behavior
with respect to TCP MD5:
- Cisco: Allows for password to be changed, but within the hold-down
timer (~180 seconds).
- Juniper: When password is initially set on active connection it will
reset, but after that any subsequent password changes no network
resets.
- Nokia: No notes on if they flap the tcp connection or not.
- Ericsson/RedBack: Allows for 2 password (old/new) to co-exist until
both sides are ok with new passwords.
- Meta-Switch: Expects the password to be set before a connection is
attempted, but no further info on whether they reset the TCP
connection on a change.
- Avaya: Disable the neighbor, then set password, then re-enable.
- Zebos: Would normally allow the change when socket connected.
We can revert my prior change because commit 9424e2e7ad ("tcp: md5: fix potential
overestimation of TCP option space") removed the leak of 4 kernel bytes to
the wire that was the main reason for my patch.
While doing my investigations, I found a bug when a MD5 key is changed, leading
to these commits that stable teams want to consider before backporting this revert :
Commit 6a2febec33 ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()")
Commit e6ced831ef ("tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers")
Fixes: 7212303268 "tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets"
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we have "ti,no-idle" specified for a module we must not disable
the the module on suspend to keep things backwards compatible.
Fixes: 386cb76681 ("bus: ti-sysc: Handle missed no-idle property in addition to no-idle-on-init")
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
There is a race condition where ib_nl_make_request() inserts the request
data into the linked list but the timer in ib_nl_request_timeout() can see
it and destroy it before ib_nl_send_msg() is done touching it. This could
happen, for instance, if there is a long delay allocating memory during
nlmsg_new()
This causes a use-after-free in the send_mad() thread:
[<ffffffffa02f43cb>] ? ib_pack+0x17b/0x240 [ib_core]
[ <ffffffffa032aef1>] ib_sa_path_rec_get+0x181/0x200 [ib_sa]
[<ffffffffa0379db0>] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm]
[<ffffffffa0374450>] ? cma_bind_port+0xa0/0xa0 [rdma_cm]
[<ffffffffa040f850>] ? rds_rdma_cm_event_handler_cmn+0x850/0x850 [rds_rdma]
[<ffffffffa040f22c>] rds_rdma_cm_event_handler_cmn+0x22c/0x850 [rds_rdma]
[<ffffffffa040f860>] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma]
[<ffffffffa037778e>] addr_handler+0x9e/0x140 [rdma_cm]
[<ffffffffa026cdb4>] process_req+0x134/0x190 [ib_addr]
[<ffffffff810a02f9>] process_one_work+0x169/0x4a0
[<ffffffff810a0b2b>] worker_thread+0x5b/0x560
[<ffffffff810a0ad0>] ? flush_delayed_work+0x50/0x50
[<ffffffff810a68fb>] kthread+0xcb/0xf0
[<ffffffff816ec49a>] ? __schedule+0x24a/0x810
[<ffffffff816ec49a>] ? __schedule+0x24a/0x810
[<ffffffff810a6830>] ? kthread_create_on_node+0x180/0x180
[<ffffffff816f25a7>] ret_from_fork+0x47/0x90
[<ffffffff810a6830>] ? kthread_create_on_node+0x180/0x180
The ownership rule is once the request is on the list, ownership transfers
to the list and the local thread can't touch it any more, just like for
the normal MAD case in send_mad().
Thus, instead of adding before send and then trying to delete after on
errors, move the entire thing under the spinlock so that the send and
update of the lists are atomic to the conurrent threads. Lightly reoganize
things so spinlock safe memory allocations are done in the final NL send
path and the rest of the setup work is done before and outside the lock.
Fixes: 3ebd2fd0d0 ("IB/sa: Put netlink request into the request list before sending")
Link: https://lore.kernel.org/r/1592964789-14533-1-git-send-email-divya.indi@oracle.com
Signed-off-by: Divya Indi <divya.indi@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Fix sparse build warning:
block/bio-integrity.c:27:6: warning:
symbol '__bio_integrity_free' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull NVMe fixes from Christoph.
* 'nvme-5.8' of git://git.infradead.org/nvme:
nvme: fix a crash in nvme_mpath_add_disk
nvme: fix identify error status silent ignore
With CONFIG_DEBUG_ATOMIC_SLEEP enabled we can see the following with RTC probe:
BUG: sleeping function called from invalid context at drivers/bus/ti-sysc.c:1736
...
(sysc_quirk_rtc) from [<c060d01c>] (sysc_write_sysconfig+0x1c/0x60)
(sysc_write_sysconfig) from [<c060d9f4>] (sysc_enable_module+0x11c/0x274)
(sysc_enable_module) from [<c060f37c>] (sysc_probe+0xe9c/0x1380)
(sysc_probe) from [<c06e9384>] (platform_drv_probe+0x48/0x98)
Fixes: e8639e1c98 ("bus: ti-sysc: Handle module unlock quirk needed for some RTC")
Signed-off-by: Tony Lindgren <tony@atomide.com>
With CONFIG_DEBUG_ATOMIC_SLEEP enabled we can see the following with
wakeirqs and serial console idled:
BUG: sleeping function called from invalid context at drivers/bus/ti-sysc.c:242
...
(sysc_wait_softreset) from [<c0606894>] (sysc_enable_module+0x48/0x274)
(sysc_enable_module) from [<c0606c5c>] (sysc_runtime_resume+0x19c/0x1d8)
(sysc_runtime_resume) from [<c0606cf0>] (sysc_child_runtime_resume+0x58/0x84)
(sysc_child_runtime_resume) from [<c06eb7bc>] (__rpm_callback+0x30/0x12c)
(__rpm_callback) from [<c06eb8d8>] (rpm_callback+0x20/0x80)
(rpm_callback) from [<c06eb434>] (rpm_resume+0x638/0x7fc)
(rpm_resume) from [<c06eb658>] (__pm_runtime_resume+0x60/0x9c)
(__pm_runtime_resume) from [<c06edc08>] (handle_threaded_wake_irq+0x24/0x60)
(handle_threaded_wake_irq) from [<c01befec>] (irq_thread_fn+0x1c/0x78)
(irq_thread_fn) from [<c01bf30c>] (irq_thread+0x140/0x26c)
We have __pm_runtime_resume() call the sysc_runtime_resume() with spinlock
held and interrupts disabled.
Fixes: d46f9fbec7 ("bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Increment *pos in the cpuinfo_op.next to fix the following warning
triggered by cat /proc/cpuinfo:
seq_file: buggy .next function c_next did not update position index
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Building xtensa kernel with gcc-10 produces the following warnings:
arch/xtensa/kernel/xtensa_ksyms.c:90:15: warning: conflicting types
for built-in function ‘__sync_fetch_and_and_4’;
expected ‘unsigned int(volatile void *, unsigned int)’
[-Wbuiltin-declaration-mismatch]
arch/xtensa/kernel/xtensa_ksyms.c:96:15: warning: conflicting types
for built-in function ‘__sync_fetch_and_or_4’;
expected ‘unsigned int(volatile void *, unsigned int)’
[-Wbuiltin-declaration-mismatch]
Fix declarations of these functions to avoid the warning.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Acer C720 running Linux v5.3 reports this in klog:
tpm_tis: 1.2 TPM (device-id 0xB, rev-id 16)
tpm tpm0: tpm_try_transmit: send(): error -5
tpm tpm0: A TPM error (-5) occurred attempting to determine the timeouts
tpm_tis tpm_tis: Could not get TPM timeouts and durations
tpm_tis 00:08: 1.2 TPM (device-id 0xB, rev-id 16)
tpm tpm0: tpm_try_transmit: send(): error -5
tpm tpm0: A TPM error (-5) occurred attempting to determine the timeouts
tpm_tis 00:08: Could not get TPM timeouts and durations
ima: No TPM chip found, activating TPM-bypass!
tpm_inf_pnp 00:08: Found TPM with ID IFX0102
% git --no-pager grep IFX0102 drivers/char/tpm
drivers/char/tpm/tpm_infineon.c: {"IFX0102", 0},
drivers/char/tpm/tpm_tis.c: {"IFX0102", 0}, /* Infineon */
Obviously IFX0102 was added to the HID table for the TCG TIS driver by
mistake.
Fixes: 93e1b7d42e ("[PATCH] tpm: add HID module parameter")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203877
Cc: stable@vger.kernel.org
Cc: Kylene Jo Hall <kjhall@us.ibm.com>
Reported-by: Ferry Toth: <ferry.toth@elsinga.info>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
On a Chromebook I'm working on I noticed a big (~1 second) delay
during bootup where nothing was happening. Right around this big
delay there were messages about the TPM:
[ 2.311352] tpm_tis_spi spi0.0: TPM ready IRQ confirmed on attempt 2
[ 3.332790] tpm_tis_spi spi0.0: Cr50 firmware version: ...
I put a few printouts in and saw that tpm_tis_spi_init() (specifically
tpm_chip_register() in that function) was taking the lion's share of
this time, though ~115 ms of the time was in cr50_print_fw_version().
Let's make a one-line change to prefer async probe for tpm_tis_spi.
There's no reason we need to block other drivers from probing while we
load.
NOTES:
* It's possible that other hardware runs through the init sequence
faster than Cr50 and this isn't such a big problem for them.
However, even if they are faster they are still doing _some_
transfers over a SPI bus so this should benefit everyone even if to
a lesser extent.
* It's possible that there are extra delays in the code that could be
optimized out. I didn't dig since once I enabled async probe they
no longer impacted me.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
The tpm2_get_cc_attrs_tbl() call will result in TPM commands being issued,
which will need the use of the internal command/response buffer. But,
we're issuing this *before* we've waited to make sure that buffer is
allocated.
This can result in intermittent failures to probe if the hypervisor / TPM
implementation doesn't respond quickly enough. I find it fails almost
every time with an 8 vcpu guest under KVM with software emulated TPM.
To fix it, just move the tpm2_get_cc_attrs_tlb() call after the
existing code to wait for initialization, which will ensure the buffer
is allocated.
Fixes: 18b3670d79 ("tpm: ibmvtpm: Add support for TPM2")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Found by smatch:
drivers/char/tpm/tpm_tis_core.c:1088 tpm_tis_core_init() warn:
variable dereferenced before check 'chip->ops' (see line 979)
'chip->ops' is assigned in the beginning of function
in tpmm_chip_alloc->tpm_chip_alloc
and is used before first possible goto to error path.
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
During flow control we are just reading from the TPM, yet our spi_xfer
has the tx_buf and rx_buf both non-NULL which means we're requesting a
full duplex transfer.
SPI is always somewhat of a full duplex protocol anyway and in theory
the other side shouldn't really be looking at what we're sending it
during flow control, but it's still a bit ugly to be sending some
"random" data when we shouldn't.
The default tpm_tis_spi_flow_control() tries to address this by
setting 'phy->iobuf[0] = 0'. This partially avoids the problem of
sending "random" data, but since our tx_buf and rx_buf both point to
the same place I believe there is the potential of us sending the
TPM's previous byte back to it if we hit the retry loop.
Another flow control implementation, cr50_spi_flow_control(), doesn't
address this at all.
Let's clean this up and just make the tx_buf NULL before we call
flow_control(). Not only does this ensure that we're not sending any
"random" bytes but it also possibly could make the SPI controller
behave in a slightly more optimal way.
NOTE: no actual observed problems are fixed by this patch--it's was
just made based on code inspection.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
It has been reported that some TIS based TPMs are giving unexpected
errors when using the O_NONBLOCK path of the TPM device. The problem
is that some TPMs don't like it when you get and then relinquish a
locality (as the tpm_try_get_ops()/tpm_put_ops() pair does) without
sending a command. This currently happens all the time in the
O_NONBLOCK write path. Fix this by moving the tpm_try_get_ops()
further down the code to after the O_NONBLOCK determination is made.
This is safe because the priv->buffer_mutex still protects the priv
state being modified.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206275
Fixes: d23d124843 ("tpm: fix invalid locking in NONBLOCKING mode")
Reported-by: Mario Limonciello <Mario.Limonciello@dell.com>
Tested-by: Alex Guzman <alex@guzman.io>
Cc: stable@vger.kernel.org
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
The following build warnings are seen with 'make dt_binding_check':
Documentation/devicetree/bindings/sound/simple-card.example.dts:209.46-211.15: Warning (unit_address_vs_reg): /example-4/sound/simple-audio-card,cpu@0: node has a unit name, but no reg or ranges property
Documentation/devicetree/bindings/sound/simple-card.example.dts:213.37-215.15: Warning (unit_address_vs_reg): /example-4/sound/simple-audio-card,cpu@1: node has a unit name, but no reg or ranges property
Documentation/devicetree/bindings/sound/simple-card.example.dts:250.42-261.15: Warning (unit_address_vs_reg): /example-5/sound/simple-audio-card,dai-link@0: node has a unit name, but no reg or ranges property
Documentation/devicetree/bindings/sound/simple-card.example.dts:263.42-288.15: Warning (unit_address_vs_reg): /example-5/sound/simple-audio-card,dai-link@1: node has a unit name, but no reg or ranges property
Documentation/devicetree/bindings/sound/simple-card.example.dts:270.32-272.19: Warning (unit_address_vs_reg): /example-5/sound/simple-audio-card,dai-link@1/cpu@0: node has a unit name, but no reg or ranges property
Documentation/devicetree/bindings/sound/simple-card.example.dts:273.23-275.19: Warning (unit_address_vs_reg): /example-5/sound/simple-audio-card,dai-link@1/cpu@1: node has a unit name, but no reg or ranges property
Documentation/devicetree/bindings/sound/simple-card.example.dts:276.23-278.19: Warning (unit_address_vs_reg): /example-5/sound/simple-audio-card,dai-link@1/cpu@2: node has a unit name, but no reg or ranges property
Documentation/devicetree/bindings/sound/simple-card.example.dts:279.23-281.19: Warning (unit_address_vs_reg): /example-5/sound/simple-audio-card,dai-link@1/cpu@3: node has a unit name, but no reg or ranges property
Documentation/devicetree/bindings/sound/simple-card.example.dts:290.42-303.15: Warning (unit_address_vs_reg): /example-5/sound/simple-audio-card,dai-link@2: node has a unit name, but no reg or ranges property
Fix them all.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20200630223020.25546-1-festevam@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
When building very large kernels, the logic that emits replacement
sequences for alternatives fails when relative branches are present
in the code that is emitted into the .altinstr_replacement section
and patched in at the original site and fixed up. The reason is that
the linker will insert veneers if relative branches go out of range,
and due to the relative distance of the .altinstr_replacement from
the .text section where its branch targets usually live, veneers
may be emitted at the end of the .altinstr_replacement section, with
the relative branches in the sequence pointed at the veneers instead
of the actual target.
The alternatives patching logic will attempt to fix up the branch to
point to its original target, which will be the veneer in this case,
but given that the patch site is likely to be far away as well, it
will be out of range and so patching will fail. There are other cases
where these veneers are problematic, e.g., when the target of the
branch is in .text while the patch site is in .init.text, in which
case putting the replacement sequence inside .text may not help either.
So let's use subsections to emit the replacement code as closely as
possible to the patch site, to ensure that veneers are only likely to
be emitted if they are required at the patch site as well, in which
case they will be in range for the replacement sequence both before
and after it is transported to the patch site.
This will prevent alternative sequences in non-init code from being
released from memory after boot, but this is tolerable given that the
entire section is only 512 KB on an allyesconfig build (which weighs in
at 500+ MB for the entire Image). Also, note that modules today carry
the replacement sequences in non-init sections as well, and any of
those that target init code will be emitted into init sections after
this change.
This fixes an early crash when booting an allyesconfig kernel on a
system where any of the alternatives sequences containing relative
branches are activated at boot (e.g., ARM64_HAS_PAN on TX2)
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Dave P Martin <dave.martin@arm.com>
Link: https://lore.kernel.org/r/20200630081921.13443-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Sparse complains on a call to get_compat_sigset, fix it. The "if"
right above explains that sigmask_arg->sigset is basically a
compat_sigset_t.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For private namespaces ns->head_disk is NULL, so add a NULL check
before updating the BDI capabilities.
Fixes: b2ce4d9069 ("nvme-multipath: set bdi capabilities once")
Reported-by: Avinash M N <Avinash.M.N@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Commit 59c7c3caaa intended to only silently ignore non retry-able
errors (DNR bit set) such that we can still identify misbehaving
controllers, and in the other hand propagate retry-able errors (DNR bit
cleared) so we don't wrongly abandon a namespace just because it happens
to be temporarily inaccessible.
The goal remains the same as the original commit where this was
introduced but unfortunately had the logic backwards.
Fixes: 59c7c3caaa ("nvme: fix possible hang when ns scanning fails during error recovery")
Reported-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The burst length is configured in VIU_OSD1_FIFO_CTRL_STAT[31] and
VIU_OSD1_FIFO_CTRL_STAT[11:10]. The public S905D3 datasheet describes
this as:
- 0x0 = up to 24 per burst
- 0x1 = up to 32 per burst
- 0x2 = up to 48 per burst
- 0x3 = up to 64 per burst
- 0x4 = up to 96 per burst
- 0x5 = up to 128 per burst
The lower two bits map to VIU_OSD1_FIFO_CTRL_STAT[11:10] while the upper
bit maps to VIU_OSD1_FIFO_CTRL_STAT[31].
Replace meson_viu_osd_burst_length_reg() with pre-defined macros which
set these values. meson_viu_osd_burst_length_reg() always returned 0
(for the two used values: 32 and 64 at least) and thus incorrectly set
the burst size to 24.
Fixes: 147ae1cbaa ("drm: meson: viu: use proper macros instead of magic constants")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Christian Hewitt <christianshewitt@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200620155752.21065-1-martin.blumenstingl@googlemail.com
Eric reported an issue where mounting -o recovery with a fuzzed fs
resulted in a kernel panic. This is because we tried to free the tree
node, except it was an error from the read. Fix this by properly
resetting the tree_root->node == NULL in this case. The panic was the
following
BTRFS warning (device loop0): failed to read tree root
BUG: kernel NULL pointer dereference, address: 000000000000001f
RIP: 0010:free_extent_buffer+0xe/0x90 [btrfs]
Call Trace:
free_root_extent_buffers.part.0+0x11/0x30 [btrfs]
free_root_pointers+0x1a/0xa2 [btrfs]
open_ctree+0x1776/0x18a5 [btrfs]
btrfs_mount_root.cold+0x13/0xfa [btrfs]
? selinux_fs_context_parse_param+0x37/0x80
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
fc_mount+0xe/0x30
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x147/0x3e0 [btrfs]
? cred_has_capability+0x7c/0x120
? legacy_get_tree+0x27/0x40
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
do_mount+0x735/0xa40
__x64_sys_mount+0x8e/0xd0
do_syscall_64+0x4d/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nik says: this is problematic only if we fail on the last iteration of
the loop as this results in init_tree_roots returning err value with
tree_root->node = -ERR. Subsequently the caller does: fail_tree_roots
which calls free_root_pointers on the bogus value.
Reported-by: Eric Sandeen <sandeen@redhat.com>
Fixes: b8522a1e5f ("btrfs: Factor out tree roots initialization during mount")
CC: stable@vger.kernel.org # 5.5+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add details how the pointer gets dereferenced ]
Signed-off-by: David Sterba <dsterba@suse.com>
Under somewhat convoluted conditions, it is possible to attempt to
release an extent_buffer that is under io, which triggers a BUG_ON in
btrfs_release_extent_buffer_pages.
This relies on a few different factors. First, extent_buffer reads done
as readahead for searching use WAIT_NONE, so they free the local extent
buffer reference while the io is outstanding. However, they should still
be protected by TREE_REF. However, if the system is doing signficant
reclaim, and simultaneously heavily accessing the extent_buffers, it is
possible for releasepage to race with two concurrent readahead attempts
in a way that leaves TREE_REF unset when the readahead extent buffer is
released.
Essentially, if two tasks race to allocate a new extent_buffer, but the
winner who attempts the first io is rebuffed by a page being locked
(likely by the reclaim itself) then the loser will still go ahead with
issuing the readahead. The loser's call to find_extent_buffer must also
race with the reclaim task reading the extent_buffer's refcount as 1 in
a way that allows the reclaim to re-clear the TREE_REF checked by
find_extent_buffer.
The following represents an example execution demonstrating the race:
CPU0 CPU1 CPU2
reada_for_search reada_for_search
readahead_tree_block readahead_tree_block
find_create_tree_block find_create_tree_block
alloc_extent_buffer alloc_extent_buffer
find_extent_buffer // not found
allocates eb
lock pages
associate pages to eb
insert eb into radix tree
set TREE_REF, refs == 2
unlock pages
read_extent_buffer_pages // WAIT_NONE
not uptodate (brand new eb)
lock_page
if !trylock_page
goto unlock_exit // not an error
free_extent_buffer
release_extent_buffer
atomic_dec_and_test refs to 1
find_extent_buffer // found
try_release_extent_buffer
take refs_lock
reads refs == 1; no io
atomic_inc_not_zero refs to 2
mark_buffer_accessed
check_buffer_tree_ref
// not STALE, won't take refs_lock
refs == 2; TREE_REF set // no action
read_extent_buffer_pages // WAIT_NONE
clear TREE_REF
release_extent_buffer
atomic_dec_and_test refs to 1
unlock_page
still not uptodate (CPU1 read failed on trylock_page)
locks pages
set io_pages > 0
submit io
return
free_extent_buffer
release_extent_buffer
dec refs to 0
delete from radix tree
btrfs_release_extent_buffer_pages
BUG_ON(io_pages > 0)!!!
We observe this at a very low rate in production and were also able to
reproduce it in a test environment by introducing some spurious delays
and by introducing probabilistic trylock_page failures.
To fix it, we apply check_tree_ref at a point where it could not
possibly be unset by a competing task: after io_pages has been
incremented. All the codepaths that clear TREE_REF check for io, so they
would not be able to clear it after this point until the io is done.
Stack trace, for reference:
[1417839.424739] ------------[ cut here ]------------
[1417839.435328] kernel BUG at fs/btrfs/extent_io.c:4841!
[1417839.447024] invalid opcode: 0000 [#1] SMP
[1417839.502972] RIP: 0010:btrfs_release_extent_buffer_pages+0x20/0x1f0
[1417839.517008] Code: ed e9 ...
[1417839.558895] RSP: 0018:ffffc90020bcf798 EFLAGS: 00010202
[1417839.570816] RAX: 0000000000000002 RBX: ffff888102d6def0 RCX: 0000000000000028
[1417839.586962] RDX: 0000000000000002 RSI: ffff8887f0296482 RDI: ffff888102d6def0
[1417839.603108] RBP: ffff88885664a000 R08: 0000000000000046 R09: 0000000000000238
[1417839.619255] R10: 0000000000000028 R11: ffff88885664af68 R12: 0000000000000000
[1417839.635402] R13: 0000000000000000 R14: ffff88875f573ad0 R15: ffff888797aafd90
[1417839.651549] FS: 00007f5a844fa700(0000) GS:ffff88885f680000(0000) knlGS:0000000000000000
[1417839.669810] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1417839.682887] CR2: 00007f7884541fe0 CR3: 000000049f609002 CR4: 00000000003606e0
[1417839.699037] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1417839.715187] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1417839.731320] Call Trace:
[1417839.737103] release_extent_buffer+0x39/0x90
[1417839.746913] read_block_for_search.isra.38+0x2a3/0x370
[1417839.758645] btrfs_search_slot+0x260/0x9b0
[1417839.768054] btrfs_lookup_file_extent+0x4a/0x70
[1417839.778427] btrfs_get_extent+0x15f/0x830
[1417839.787665] ? submit_extent_page+0xc4/0x1c0
[1417839.797474] ? __do_readpage+0x299/0x7a0
[1417839.806515] __do_readpage+0x33b/0x7a0
[1417839.815171] ? btrfs_releasepage+0x70/0x70
[1417839.824597] extent_readpages+0x28f/0x400
[1417839.833836] read_pages+0x6a/0x1c0
[1417839.841729] ? startup_64+0x2/0x30
[1417839.849624] __do_page_cache_readahead+0x13c/0x1a0
[1417839.860590] filemap_fault+0x6c7/0x990
[1417839.869252] ? xas_load+0x8/0x80
[1417839.876756] ? xas_find+0x150/0x190
[1417839.884839] ? filemap_map_pages+0x295/0x3b0
[1417839.894652] __do_fault+0x32/0x110
[1417839.902540] __handle_mm_fault+0xacd/0x1000
[1417839.912156] handle_mm_fault+0xaa/0x1c0
[1417839.921004] __do_page_fault+0x242/0x4b0
[1417839.930044] ? page_fault+0x8/0x30
[1417839.937933] page_fault+0x1e/0x30
[1417839.945631] RIP: 0033:0x33c4bae
[1417839.952927] Code: Bad RIP value.
[1417839.960411] RSP: 002b:00007f5a844f7350 EFLAGS: 00010206
[1417839.972331] RAX: 000000000000006e RBX: 1614b3ff6a50398a RCX: 0000000000000000
[1417839.988477] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002
[1417840.004626] RBP: 00007f5a844f7420 R08: 000000000000006e R09: 00007f5a94aeccb8
[1417840.020784] R10: 00007f5a844f7350 R11: 0000000000000000 R12: 00007f5a94aecc79
[1417840.036932] R13: 00007f5a94aecc78 R14: 00007f5a94aecc90 R15: 00007f5a94aecc40
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Convert fall through comments to the pseudo-keyword which is now the
preferred way.
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The wait_event_... defines evaluate to long so we should not assign it an int as this may truncate
the value.
Reported-by: Marshall Midden <marshallmidden@gmail.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
The KSZ9893 3-Port Gigabit Ethernet Switch can be controlled via SPI,
I²C or MDIO (very limited and not supported by this driver). While there
is already a compatible entry for the SPI bus, it was missing for I²C.
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever tcp_try_rmem_schedule() returns an error, we are under
trouble and should make sure to wakeup readers so that they
can drain socket queues and eventually make room.
Fixes: 03f45c883c ("tcp: avoid extra wakeups for SO_RCVLOWAT users")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When xfstest generic/035, we found the target file was deleted
if the rename return -EACESS.
In cifs_rename2, we unlink the positive target dentry if rename
failed with EACESS or EEXIST, even if the target dentry is positived
before rename. Then the existing file was deleted.
We should just delete the target file which created during the
rename.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
The flag from the primary tcon needs to be copied into the volume info
so that cifs_get_tcon will try to enable extensions on the per-user
tcon. At that point, since posix extensions must have already been
enabled on the superblock, don't try to needlessly adjust the mount
flags.
Fixes: ce558b0e17 ("smb3: Add posix create context for smb3.11 posix mounts")
Fixes: b326614ea2 ("smb3: allow "posix" mount option to enable new SMB311 protocol extensions")
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Without this:
- persistent handles will only be enabled for per-user tcons if the
server advertises the 'Continuous Availabity' capability
- resilient handles would never be enabled for per-user tcons
Signed-off-by: Paul Aurich <paul@darkrain42.org>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Ensure multiuser SMB3 mounts use encryption for all users' tcons if the
mount options are configured to require encryption. Without this, only
the primary tcon and IPC tcons are guaranteed to be encrypted. Per-user
tcons would only be encrypted if the server was configured to require
encryption.
Signed-off-by: Paul Aurich <paul@darkrain42.org>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
When no full socket is available, skbs are sent over a per-netns
control socket. Its sk_mark is temporarily adjusted to match that
of the real (request or timewait) socket or to reflect an incoming
skb, so that the outgoing skb inherits this in __ip_make_skb.
Introduction of the socket cookie mark field broke this. Now the
skb is set through the cookie and cork:
<caller> # init sockc.mark from sk_mark or cmsg
ip_append_data
ip_setup_cork # convert sockc.mark to cork mark
ip_push_pending_frames
ip_finish_skb
__ip_make_skb # set skb->mark to cork mark
But I missed these special control sockets. Update all callers of
__ip(6)_make_skb that were originally missed.
For IPv6, the same two icmp(v6) paths are affected. The third
case is not, as commit 92e55f412c ("tcp: don't annotate
mark on control socket from tcp_v6_send_response()") replaced
the ctl_sk->sk_mark with passing the mark field directly as a
function argument. That commit predates the commit that
introduced the bug.
Fixes: c6af0c227a ("ip: support SO_MARK cmsg")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reported-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever cookie_init_timestamp() has been used to encode
ECN,SACK,WSCALE options, we can not remove the TS option in the SYNACK.
Otherwise, tcp_synack_options() will still advertize options like WSCALE
that we can not deduce later when receiving the packet from the client
to complete 3WHS.
Note that modern linux TCP stacks wont use MD5+TS+SACK in a SYN packet,
but we can not know for sure that all TCP stacks have the same logic.
Before the fix a tcpdump would exhibit this wrong exchange :
10:12:15.464591 IP C > S: Flags [S], seq 4202415601, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 456965269 ecr 0,nop,wscale 8], length 0
10:12:15.464602 IP S > C: Flags [S.], seq 253516766, ack 4202415602, win 65535, options [nop,nop,md5 valid,mss 1400,nop,nop,sackOK,nop,wscale 8], length 0
10:12:15.464611 IP C > S: Flags [.], ack 1, win 256, options [nop,nop,md5 valid], length 0
10:12:15.464678 IP C > S: Flags [P.], seq 1:13, ack 1, win 256, options [nop,nop,md5 valid], length 12
10:12:15.464685 IP S > C: Flags [.], ack 13, win 65535, options [nop,nop,md5 valid], length 0
After this patch the exchange looks saner :
11:59:59.882990 IP C > S: Flags [S], seq 517075944, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 1751508483 ecr 0,nop,wscale 8], length 0
11:59:59.883002 IP S > C: Flags [S.], seq 1902939253, ack 517075945, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 1751508479 ecr 1751508483,nop,wscale 8], length 0
11:59:59.883012 IP C > S: Flags [.], ack 1, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508479], length 0
11:59:59.883114 IP C > S: Flags [P.], seq 1:13, ack 1, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508479], length 12
11:59:59.883122 IP S > C: Flags [.], ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508483], length 0
11:59:59.883152 IP S > C: Flags [P.], seq 1:13, ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508484 ecr 1751508483], length 12
11:59:59.883170 IP C > S: Flags [.], ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508484 ecr 1751508484], length 0
Of course, no SACK block will ever be added later, but nothing should break.
Technically, we could remove the 4 nops included in MD5+TS options,
but again some stacks could break seeing not conventional alignment.
Fixes: 4957faade1 ("TCPCT part 1g: Responder Cookie => Initiator")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In testing with mprds enabled, Oracle Cluster nodes after reboot were
not able to communicate with others nodes and so failed to rejoin
the cluster. Peers with lower IP address initiated connection but the
node could not respond as it choose a different path and could not
initiate a connection as it had a higher IP address.
With this patch, when a node sends out a packet and the selected path
is down, all other paths are also checked and any down paths are
re-connected.
Reviewed-by: Ka-cheong Poon <ka-cheong.poon@oracle.com>
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My prior fix went a bit too far, according to Herbert and Mathieu.
Since we accept that concurrent TCP MD5 lookups might see inconsistent
keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb()
Clearing all key->key[] is needed to avoid possible KMSAN reports,
if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.
data_race() was added in linux-5.8 and will prevent KCSAN reports,
this can safely be removed in stable backports, if data_race() is
not yet backported.
v2: use data_race() both in tcp_md5_hash_key() and tcp_md5_do_add()
Fixes: 6a2febec33 ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Marco Elver <elver@google.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
A potential deadlock can occur during registering or unregistering a
new generic netlink family between the main nl_table_lock and the
cb_lock where each thread wants the lock held by the other, as
demonstrated below.
1) Thread 1 is performing a netlink_bind() operation on a socket. As part
of this call, it will call netlink_lock_table(), incrementing the
nl_table_users count to 1.
2) Thread 2 is registering (or unregistering) a genl_family via the
genl_(un)register_family() API. The cb_lock semaphore will be taken for
writing.
3) Thread 1 will call genl_bind() as part of the bind operation to handle
subscribing to GENL multicast groups at the request of the user. It will
attempt to take the cb_lock semaphore for reading, but it will fail and
be scheduled away, waiting for Thread 2 to finish the write.
4) Thread 2 will call netlink_table_grab() during the (un)registration
call. However, as Thread 1 has incremented nl_table_users, it will not
be able to proceed, and both threads will be stuck waiting for the
other.
genl_bind() is a noop, unless a genl_family implements the mcast_bind()
function to handle setting up family-specific multicast operations. Since
no one in-tree uses this functionality as Cong pointed out, simply removing
the genl_bind() function will remove the possibility for deadlock, as there
is no attempt by Thread 1 above to take the cb_lock semaphore.
Fixes: c380d9a7af ("genetlink: pass multicast bind/unbind to families")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Johannes Berg <johannes.berg@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull hyperv fix from Wei Liu:
"One patch from Joseph to make panic reporting contain more useful
information"
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: Change flag to write log level in panic msg to false
get_dev_cap and set_resources_state functions may return a positive
value because of hardware failure, and the positive return value
can not be passed to ERR_PTR directly.
Fixes: 7dd29ee128 ("hinic: add sriov feature support")
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Renoir uses integrated_system_info table v12. The table
has the same layout as v11 with respect to this data. Just
reuse the existing code for v12 for stable.
Fixes incorrectly reported vram info in the driver output.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
CPU Measurement sampling facility on s390 does not support
perf tool collection of callchain data using --call-graph
option. The sampling facility collects samples in a ring
buffer which includes only the instruction address the
samples were taken. When the ring buffer hits a watermark,
a measurement alert interrupt is triggered and handled
by the performance measurement unit (PMU) device driver.
It collects the samples and feeds each sample to the
perf ring buffer in the common code via functions
perf_prepare_sample()/perf_output_sample(). When function
perf_prepare_sample() is called to collect sample data's
callchain, user register values or stack area, invalid
data is picked, because the context of the collected
information does not match the context when the sample
was taken.
There is currently no way to provide the callchain and other
information, because the hardware sampler does not collect this
information.
Therefore prohibit sampling when the user requests a callchain graph
from the hardware sampler. Return -EOPNOTSUPP to the user in this
case.
If call chains are really wanted, users need to specify software
event cpu-clock to get the callchain information from a
software event.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The current probe() function calls a pair of cpuhp_xxx API functions to
setup CPU hotplug handling. The hotplug lock is held for the duration of
the two calls and other CPU related code using cpus_read_lock() /
cpus_read_unlock() calls.
The problem is that on error states, goto: statements bypass the
cpus_read_unlock() call. This code has increased in complexity as the
driver has developed.
This patch introduces a pair of helper functions etm4_pm_setup_cpuslocked()
and etm4_pm_clear() which correct the issues above and group the PM code a
little better.
The two functions etm4_cpu_pm_register() and etm4_cpu_pm_unregister() are
dropped as these call cpu_pm_register_notifier() / ..unregister_notifier()
dependent on CONFIG_CPU_PM - but this define is used to nop these functions
out in the pm headers - so the wrapper functions are superfluous.
Fixes: f188b5e76a ("coresight: etm4x: Save/restore state across CPU low power states")
Fixes: e9f5d63f84 ("hwtracing/coresight-etm4x: Use cpuhp_setup_state_nocalls_cpuslocked()")
Fixes: 58eb457be0 ("hwtracing/coresight-etm4x: Convert to hotplug state machine")
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200701160852.2782823-3-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There were a couple problems with error handling in the probe function:
1) If the "drvdata" allocation failed then it lead to a NULL
dereference.
2) On several error paths we decremented "nr_cti_cpu" before it was
incremented which lead to a reference counting bug.
There were also some parts of the error handling which were not bugs but
were messy. The error handling was confusing to read. It printed some
unnecessary error messages.
The simplest way to fix these problems was to create a cti_pm_setup()
function that did all the power management setup in one go. That way
when we call cti_pm_release() we don't have to deal with the
complications of a partially configured power management config.
I reversed the "if (drvdata->ctidev.cpu >= 0)" condition in
cti_pm_release() so that it mirros the new cti_pm_setup() function.
Fixes: 6a0953ce7d ("coresight: cti: Add CPU idle pm notifer to CTI devices")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200701160852.2782823-2-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Programs added 'userprogs' should be compiled for the target
architecture i.e. the same architecture as the kernel.
GCC does this correctly since the target architecture is implied
by the toolchain prefix.
Clang builds userspace programs always for the host architecture
because the target triple is currently missing.
Fix this.
Fixes: 7f3a59db27 ("kbuild: add infrastructure to build userspace programs")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
scripts/cc-can-link.sh tests if the compiler can link userspace
programs.
When $(CC) is GCC, it is checked against the target architecture
because the toolchain prefix is specified as a part of $(CC).
When $(CC) is Clang, it is checked against the host architecture
because --target option is missing.
Pass $(CLANG_FLAGS) to scripts/cc-can-link.sh to evaluate the link
capability for the target architecture.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
There are 3 types that are not parsed by the debug info logic.
Add support for them.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Instead of just changing the helper window to show a
dependency, also navigate to it at the config and menu
widgets.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
the goback button does nothing on splitMode. So, why display
it?
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
The goBack() logic is used only for the configList, as
it only makes sense on singleMode. So, let's simplify the
code.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
The default implementation for setSelected() at QTreeWidgetItem
allows multiple items to be selected.
Well, this should never be possible for the configItem lists.
So, implement a function that will automatically clean any
previous selection. This simplifies the logic somewhat, while
making the selection logic to be applied atomically, avoiding
future issues on that.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
The Qt5 conversion broke support for debug info links.
Restore the behaviour added by changeset
ab45d190fd ("kconfig: create links in info window").
The original approach was to pass a pointer for a data struct
via an <a href>. That doesn't sound a good idea, as, if something
gets wrong, the app could crash. So, instead, pass the name of
the symbol, and validate such symbol at the hyperlink handling
logic.
Link: https://lore.kernel.org/lkml/20200628125421.12458086@coco.lan/
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
When the search dialog box finds symbols/menus that match
the search criteria, it presents all results at the window.
Clicking on a search result should make qconf to navigate
to the selected item. This works on singleMode and on
fullMode, but on splitMode, the navigation is broken.
This was partially caused by an incomplete Qt5 conversion
and by the followup patches that restored the original
behavior.
When qconf is on split mode, it has to update both the
config and the menu views. Right now, such logic is broken,
as it is not seeking using the right structures.
On qconf, the screen is split into 3 parts:
+------------+-------+
| | |
| Config | Menu |
| | |
+------------+-------+
| |
| ConfigInfo |
| |
+--------------------+
On singleMode and on fullMode, the menuView is hidden, and search
updates only the configList (which controls the ConfigView).
On SplitMode, the search logic should detect if the variable is a
leaf or not. If it is a leaf, it should be presented at the menuView,
and both configList and menuList should be updated. Otherwise, just
the configList should be updated.
Link: https://lore.kernel.org/lkml/a98b0f0ebe0c23615a76f1d23f25fd0c84835e6b.camel@redhat.com/
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
The usage of c-like include is deprecated on modern Qt
versions. Use the c++ style includes.
While here, remove uneeded and redundant ones, sorting
them on alphabetic order.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
KVM/arm fixes for 5.8, take #2
- Make sure a vcpu becoming non-resident doesn't race against the doorbell delivery
- Only advertise pvtime if accounting is enabled
- Return the correct error code if reset fails with SVE
- Make sure that pseudo-NMI functions are annotated as __always_inline
Mika writes:
thunderbolt: Fix for v5.8-rc4
This includes a single patch that corrects path indices used in USB3
tunnel discovery.
* tag 'thunderbolt-fix-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
thunderbolt: Fix path indices used in USB3 tunnel discovery
There are several copies of get_eflags() and set_eflags() and they all are
buggy. Consolidate them and fix them. The fixes are:
Add memory clobbers. These are probably unnecessary but they make sure
that the compiler doesn't move something past one of these calls when it
shouldn't.
Respect the redzone on x86_64. There has no failure been observed related
to this, but it's definitely a bug.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/982ce58ae8dea2f1e57093ee894760e35267e751.1593191971.git.luto@kernel.org
The SYSENTER frame setup was nonsense. It worked by accident because the
normal code into which the Xen asm jumped (entry_SYSENTER_32/compat) threw
away SP without touching the stack. entry_SYSENTER_compat was recently
modified such that it relied on having a valid stack pointer, so now the
Xen asm needs to invoke it with a valid stack.
Fix it up like SYSCALL: use the Xen-provided frame and skip the bare
metal prologue.
Fixes: 1c3e5d3f60 ("x86/entry: Make entry_64_compat.S objtool clean")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lkml.kernel.org/r/947880c41ade688ff4836f665d0c9fcaa9bd1201.1593191971.git.luto@kernel.org
[Why]
Changes that are fast don't require updating DLG parameters making
this call unnecessary. Considering this is an expensive call it should
not be done on every flip.
DML touches clocks, p-state support, DLG params and a few other DC
internal flags and these aren't expected during fast. A hang has been
reported with this change when called on every flip which suggests that
modifying these fields is not recommended behavior on fast updates.
[How]
Guard the validation to only happen if update type isn't FAST.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1191
Fixes: a24eaa5c51 ("drm/amd/display: Revalidate bandwidth before commiting DC updates")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
MD5 keys are read with RCU protection, and tcp_md5_do_add()
might update in-place a prior key.
Normally, typical RCU updates would allocate a new piece
of memory. In this case only key->key and key->keylen might
be updated, and we do not care if an incoming packet could
see the old key, the new one, or some intermediate value,
since changing the key on a live flow is known to be problematic
anyway.
We only want to make sure that in the case key->keylen
is changed, cpus in tcp_md5_hash_key() wont try to use
uninitialized data, or crash because key->keylen was
read twice to feed sg_init_one() and ahash_request_set_crypt()
Fixes: 9ea88a1530 ("tcp: md5: check md5 signature without socket lock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Else there will be memory leak if alloc_disk() fails.
Fixes: 6a27b656fc ("block: virtio-blk: support multi virt queues per virtio-blk device")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The flow is allocated in qrtr_tx_wait, but not freed when qrtr node
is released. (*slot) becomes NULL after radix_tree_iter_delete is
called in __qrtr_node_release. The fix is to save (*slot) to a
vairable and then free it.
This memory leak is catched when kmemleak is enabled in kernel,
the report looks like below:
unreferenced object 0xffffa0de69e08420 (size 32):
comm "kworker/u16:3", pid 176, jiffies 4294918275 (age 82858.876s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 28 84 e0 69 de a0 ff ff ........(..i....
28 84 e0 69 de a0 ff ff 03 00 00 00 00 00 00 00 (..i............
backtrace:
[<00000000e252af0a>] qrtr_node_enqueue+0x38e/0x400 [qrtr]
[<000000009cea437f>] qrtr_sendmsg+0x1e0/0x2a0 [qrtr]
[<000000008bddbba4>] sock_sendmsg+0x5b/0x60
[<0000000003beb43a>] qmi_send_message.isra.3+0xbe/0x110 [qmi_helpers]
[<000000009c9ae7de>] qmi_send_request+0x1c/0x20 [qmi_helpers]
Signed-off-by: Carl Huang <cjhuang@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
t4_prep_fw goto bye tag with positive return value when something
bad happened and which can not free resource in adap_init0.
so fix it to return negative value.
Fixes: 16e47624e7 ("cxgb4: Add new scheme to update T4/T5 firmware")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Li Heng <liheng40@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6ae72bfa65 ("PCI: Unify pcie_find_root_port() and
pci_find_pcie_root_port()") broke acpi_pci_bridge_d3() because calling
pcie_find_root_port() on a Root Port returned NULL when it should return
the Root Port, which in turn broke power management of PCIe hierarchies.
Rework pcie_find_root_port() so it returns its argument when it is already
a Root Port.
[bhelgaas: test device only once, test for PCIe]
Fixes: 6ae72bfa65 ("PCI: Unify pcie_find_root_port() and pci_find_pcie_root_port()")
Link: https://lore.kernel.org/r/20200622161248.51099-1-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Daniel Borkmann says:
====================
pull-request: bpf 2020-06-30
The following pull-request contains BPF updates for your *net* tree.
We've added 28 non-merge commits during the last 9 day(s) which contain
a total of 35 files changed, 486 insertions(+), 232 deletions(-).
The main changes are:
1) Fix an incorrect verifier branch elimination for PTR_TO_BTF_ID pointer
types, from Yonghong Song.
2) Fix UAPI for sockmap and flow_dissector progs that were ignoring various
arguments passed to BPF_PROG_{ATTACH,DETACH}, from Lorenz Bauer & Jakub Sitnicki.
3) Fix broken AF_XDP DMA hacks that are poking into dma-direct and swiotlb
internals and integrate it properly into DMA core, from Christoph Hellwig.
4) Fix RCU splat from recent changes to avoid skipping ingress policy when
kTLS is enabled, from John Fastabend.
5) Fix BPF ringbuf map to enforce size to be the power of 2 in order for its
position masking to work, from Andrii Nakryiko.
6) Fix regression from CAP_BPF work to re-allow CAP_SYS_ADMIN for loading
of network programs, from Maciej Żenczykowski.
7) Fix libbpf section name prefix for devmap progs, from Jesper Dangaard Brouer.
8) Fix formatting in UAPI documentation for BPF helpers, from Quentin Monnet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add two tests for PTR_TO_BTF_ID vs. null ptr comparison,
one for PTR_TO_BTF_ID in the ctx structure and the
other for PTR_TO_BTF_ID after one level pointer chasing.
In both cases, the test ensures condition is not
removed.
For example, for this test
struct bpf_fentry_test_t {
struct bpf_fentry_test_t *a;
};
int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
{
if (arg == 0)
test7_result = 1;
return 0;
}
Before the previous verifier change, we have xlated codes:
int test7(long long unsigned int * ctx):
; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
0: (79) r1 = *(u64 *)(r1 +0)
; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
1: (b4) w0 = 0
2: (95) exit
After the previous verifier change, we have:
int test7(long long unsigned int * ctx):
; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
0: (79) r1 = *(u64 *)(r1 +0)
; if (arg == 0)
1: (55) if r1 != 0x0 goto pc+4
; test7_result = 1;
2: (18) r1 = map[id:6][0]+48
4: (b7) r2 = 1
5: (7b) *(u64 *)(r1 +0) = r2
; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
6: (b4) w0 = 0
7: (95) exit
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630171241.2523875-1-yhs@fb.com
Wenbo reported an issue in [1] where a checking of null
pointer is evaluated as always false. In this particular
case, the program type is tp_btf and the pointer to
compare is a PTR_TO_BTF_ID.
The current verifier considers PTR_TO_BTF_ID always
reprents a non-null pointer, hence all PTR_TO_BTF_ID compares
to 0 will be evaluated as always not-equal, which resulted
in the branch elimination.
For example,
struct bpf_fentry_test_t {
struct bpf_fentry_test_t *a;
};
int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
{
if (arg == 0)
test7_result = 1;
return 0;
}
int BPF_PROG(test8, struct bpf_fentry_test_t *arg)
{
if (arg->a == 0)
test8_result = 1;
return 0;
}
In above bpf programs, both branch arg == 0 and arg->a == 0
are removed. This may not be what developer expected.
The bug is introduced by Commit cac616db39 ("bpf: Verifier
track null pointer branch_taken with JNE and JEQ"),
where PTR_TO_BTF_ID is considered to be non-null when evaluting
pointer vs. scalar comparison. This may be added
considering we have PTR_TO_BTF_ID_OR_NULL in the verifier
as well.
PTR_TO_BTF_ID_OR_NULL is added to explicitly requires
a non-NULL testing in selective cases. The current generic
pointer tracing framework in verifier always
assigns PTR_TO_BTF_ID so users does not need to
check NULL pointer at every pointer level like a->b->c->d.
We may not want to assign every PTR_TO_BTF_ID as
PTR_TO_BTF_ID_OR_NULL as this will require a null test
before pointer dereference which may cause inconvenience
for developers. But we could avoid branch elimination
to preserve original code intention.
This patch simply removed PTR_TO_BTD_ID from reg_type_not_null()
in verifier, which prevented the above branches from being eliminated.
[1]: https://lore.kernel.org/bpf/79dbb7c0-449d-83eb-5f4f-7af0cc269168@fb.com/T/
Fixes: cac616db39 ("bpf: Verifier track null pointer branch_taken with JNE and JEQ")
Reported-by: Wenbo Zhang <ethercflow@gmail.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630171240.2523722-1-yhs@fb.com
Alex Elder says:
====================
net: ipa: three bug fixes
This series contains three bug fixes for the Qualcomm IPA driver.
In practice these bugs are unlikke.y to be harmful, but they do
represent incorrect code.
Version 2 adds "Fixes" tags to two of the patches and fixes a typo
in one (found by checkpatch.pl).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a new function ipa_cmd_tag_process() that simply allocates a
transaction, adds a tag process command to it to clear the hardware
pipeline, and commits the transaction.
Call it in from ipa_endpoint_suspend(), after suspending the modem
endpoints but before suspending the AP command TX and AP LAN RX
endpoints (which are used by the tag sequence).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The AP LAN RX endpoint should not have download checksum offload
enabled.
The receive handler does properly accommodate the trailer that's
added by the hardware, but we ignore it.
Fixes: 1ed7d0c0fd ("soc: qcom: ipa: configuration data")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In gsi_channel_stop(), there's a check to see if the channel might
have entered STOPPED state since a previous call, which might have
timed out before stopping completed.
That check actually belongs in gsi_channel_stop_command(), which is
called repeatedly by gsi_channel_stop() for RX channels.
Fixes: 650d160382 ("soc: qcom: ipa: the generic software interface")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When support for short preambles was added, it incorrectly keyed its
decision off state->speed instead of state->interface. state->speed
is not guaranteed to be correct for in-band modes, which can lead to
short preambles being unexpectedly disabled.
Fix this by keying off the interface mode, which is the only way that
mvneta can operate at 2.5Gbps.
Fixes: da58a931f2 ("net: mvneta: Add support for 2500Mbps SGMII")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull exfat fixes from Namjae Jeon:
- Zero out unused characters of FileName field to avoid a complaint
from some fsck tool.
- Fix memory leak on error paths.
- Fix unnecessary VOL_DIRTY set when calling rmdir on non-empty
directory.
- Call sync_filesystem() for read-only remount (Fix generic/452 test in
xfstests)
- Add own fsync() to flush dirty metadata.
* tag 'exfat-for-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: flush dirty metadata in fsync
exfat: move setting VOL_DIRTY over exfat_remove_entries()
exfat: call sync_filesystem for read-only remount
exfat: add missing brelse() calls on error paths
exfat: Set the unused characters of FileName field to the value 0000h
Jason A. Donenfeld says:
====================
support AF_PACKET for layer 3 devices
Hans reported that packets injected by a correct-looking and trivial
libpcap-based program were not being accepted by wireguard. In
investigating that, I noticed that a few devices weren't properly
handling AF_PACKET-injected packets, and so this series introduces a bit
of shared infrastructure to support that.
The basic problem begins with socket(AF_PACKET, SOCK_RAW,
htons(ETH_P_ALL)) sockets. When sendto is called, AF_PACKET examines the
headers of the packet with this logic:
static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
{
if ((!skb->protocol || skb->protocol == htons(ETH_P_ALL)) &&
sock->type == SOCK_RAW) {
skb_reset_mac_header(skb);
skb->protocol = dev_parse_header_protocol(skb);
}
skb_probe_transport_header(skb);
}
The middle condition there triggers, and we jump to
dev_parse_header_protocol. Note that this is the only caller of
dev_parse_header_protocol in the kernel, and I assume it was designed
for this purpose:
static inline __be16 dev_parse_header_protocol(const struct sk_buff *skb)
{
const struct net_device *dev = skb->dev;
if (!dev->header_ops || !dev->header_ops->parse_protocol)
return 0;
return dev->header_ops->parse_protocol(skb);
}
Since AF_PACKET already knows which netdev the packet is going to, the
dev_parse_header_protocol function can see if that netdev has a way it
prefers to figure out the protocol from the header. This, again, is the
only use of parse_protocol in the kernel. At the moment, it's only used
with ethernet devices, via eth_header_parse_protocol. This makes sense,
as mostly people are used to AF_PACKET-injecting ethernet frames rather
than layer 3 frames. But with nothing in place for layer 3 netdevs, this
function winds up returning 0, and skb->protocol then is set to 0, and
then by the time it hits the netdev's ndo_start_xmit, the driver doesn't
know what to do with it.
This is a problem because drivers very much rely on skb->protocol being
correct, and routinely reject packets where it's incorrect. That's why
having this parsing happen for injected packets is quite important. In
wireguard, ipip, and ipip6, for example, packets from AF_PACKET are just
dropped entirely. For tun devices, it's sort of uglier, with the tun
"packet information" header being passed to userspace containing a bogus
protocol value. Some userspace programs are ill-equipped to deal with
that. (But of course, that doesn't happen with tap devices, which
benefit from the similar shared infrastructure for layer 2 netdevs,
further motiviating this patchset for layer 3 netdevs.)
This patchset addresses the issue by first adding a layer 3 header parse
function, much akin to the existing one for layer 2 packets, and then
adds a shared header_ops structure that, also much akin to the existing
one for layer 2 packets. Then it wires it up to a few immediate places
that stuck out as requiring it, and does a bit of cleanup.
This patchset seems like it's fixing real bugs, so it might be
appropriate for stable. But they're also very old bugs, so if you'd
rather not backport to stable, that'd make sense to me too.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The xfrm interface uses skb->protocol to determine packet type, and
bails out if it's not set. For AF_PACKET injection, we need to support
its call chain of:
packet_sendmsg -> packet_snd -> packet_parse_headers ->
dev_parse_header_protocol -> parse_protocol
Without a valid parse_protocol, this returns zero, and xfrmi rejects the
skb. So, this wires up the ip_tunnel handler for layer 3 packets for
that case.
Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sit uses skb->protocol to determine packet type, and bails out if it's
not set. For AF_PACKET injection, we need to support its call chain of:
packet_sendmsg -> packet_snd -> packet_parse_headers ->
dev_parse_header_protocol -> parse_protocol
Without a valid parse_protocol, this returns zero, and sit rejects the
skb. So, this wires up the ip_tunnel handler for layer 3 packets for
that case.
Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vti uses skb->protocol to determine packet type, and bails out if it's
not set. For AF_PACKET injection, we need to support its call chain of:
packet_sendmsg -> packet_snd -> packet_parse_headers ->
dev_parse_header_protocol -> parse_protocol
Without a valid parse_protocol, this returns zero, and vti rejects the
skb. So, this wires up the ip_tunnel handler for layer 3 packets for
that case.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tun driver passes up skb->protocol to userspace in the form of PI headers.
For AF_PACKET injection, we need to support its call chain of:
packet_sendmsg -> packet_snd -> packet_parse_headers ->
dev_parse_header_protocol -> parse_protocol
Without a valid parse_protocol, this returns zero, and the tun driver
then gives userspace bogus values that it can't deal with.
Note that this isn't the case with tap, because tap already benefits
from the shared infrastructure for ethernet headers. But with tun,
there's nothing.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that wg_examine_packet_protocol has been added for general
consumption as ip_tunnel_parse_protocol, it's possible to remove
wg_examine_packet_protocol and simply use the new
ip_tunnel_parse_protocol function directly.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WireGuard uses skb->protocol to determine packet type, and bails out if
it's not set or set to something it's not expecting. For AF_PACKET
injection, we need to support its call chain of:
packet_sendmsg -> packet_snd -> packet_parse_headers ->
dev_parse_header_protocol -> parse_protocol
Without a valid parse_protocol, this returns zero, and wireguard then
rejects the skb. So, this wires up the ip_tunnel handler for layer 3
packets for that case.
Reported-by: Hans Wippel <ndev@hwipl.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ipip uses skb->protocol to determine packet type, and bails out if it's
not set. For AF_PACKET injection, we need to support its call chain of:
packet_sendmsg -> packet_snd -> packet_parse_headers ->
dev_parse_header_protocol -> parse_protocol
Without a valid parse_protocol, this returns zero, and ipip rejects the
skb. So, this wires up the ip_tunnel handler for layer 3 packets for
that case.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices that take straight up layer 3 packets benefit from having a
shared header_ops so that AF_PACKET sockets can inject packets that are
recognized. This shared infrastructure will be used by other drivers
that currently can't inject packets using AF_PACKET. It also exposes the
parser function, as it is useful in standalone form too.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull security subsystem fixes from James Morris:
"Two simple fixes for v5.8:
- Fix hook iteration and default value for inode_copy_up_xattr
(KP Singh)
- Fix the key_permission LSM hook function type (Sami Tolvanen)"
* tag 'fixes-v5.8-rc3-a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: Fix hook iteration and default value for inode_copy_up_xattr
security: fix the key_permission LSM hook function type
Pull integrity updates from Mimi Zohar:
"Include PCRs 8 & 9 in per TPM 2.0 bank boot_aggregate calculation.
Prior to Linux 5.8 the SHA1 "boot_aggregate" value was padded with 0's
and extended into the other TPM 2.0 banks.
Included in the Linux 5.8 open window, TPM 2.0 PCR bank specific
"boot_aggregate" values (PCRs 0 - 7) are calculated and extended into the TPM banks.
Distro releases are now shipping grub2 with TPM support, which extend
PCRs 8 & 9. I'd like for PCRs 8 & 9 to be included in the new
"boot_aggregate" calculations.
For backwards compatibility, if the hash is SHA1, these new PCRs are
not included in the boot aggregate"
* tag 'integrity-v5.8-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: extend boot_aggregate with kernel measurements
Instead, expose the key via the input framework, as SW_MACHINE_COVER
The chip-detect GPIO is actually detecting if the cover is closed.
Technically it's possible to use the SD card with open cover. The
only downside is risk of battery falling out and user being able
to physically remove the card.
The behaviour of SD card not being available when the device is
open is unexpected and creates more problems than it solves. There
is a high chance, that more people accidentally break their rootfs
by opening the case without physically removing the card.
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Merlijn Wajer <merlijn@wizzup.org>
Link: https://lore.kernel.org/r/20200612125402.18393-3-merlijn@wizzup.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Since 5.7, we've been using task_work to trigger async running of
requests in the context of the original task. This generally works
great, but there's a case where if the task is currently blocked
in the kernel waiting on a condition to become true, it won't process
task_work. Even though the task is woken, it just checks whatever
condition it's waiting on, and goes back to sleep if it's still false.
This is a problem if that very condition only becomes true when that
task_work is run. An example of that is the task registering an eventfd
with io_uring, and it's now blocked waiting on an eventfd read. That
read could depend on a completion event, and that completion event
won't get trigged until task_work has been run.
Use the TWA_SIGNAL notification for task_work, so that we ensure that
the task always runs the work when queued.
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Jens Axboe <axboe@kernel.dk>
So that the target task will exit the wait_event_interruptible-like
loop and call task_work_run() asap.
The patch turns "bool notify" into 0,TWA_RESUME,TWA_SIGNAL enum, the
new TWA_SIGNAL flag implies signal_wake_up(). However, it needs to
avoid the race with recalc_sigpending(), so the patch also adds the
new JOBCTL_TASK_WORK bit included in JOBCTL_PENDING_MASK.
TODO: once this patch is merged we need to change all current users
of task_work_add(notify = true) to use TWA_RESUME.
Cc: stable@vger.kernel.org # v5.7
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Iterating over BPF links attached to network namespace in pre_exit hook is
not safe, even if there is just one. Once link gets auto-detached, that is
its back-pointer to net object is set to NULL, the link can be released and
freed without waiting on netns_bpf_mutex, effectively causing the list
element we are operating on to be freed.
This leads to use-after-free when trying to access the next element on the
list, as reported by KASAN. Bug can be triggered by destroying a network
namespace, while also releasing a link attached to this network namespace.
| ==================================================================
| BUG: KASAN: use-after-free in netns_bpf_pernet_pre_exit+0xd9/0x130
| Read of size 8 at addr ffff888119e0d778 by task kworker/u8:2/177
|
| CPU: 3 PID: 177 Comm: kworker/u8:2 Not tainted 5.8.0-rc1-00197-ga0c04c9d1008-dirty #776
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
| Workqueue: netns cleanup_net
| Call Trace:
| dump_stack+0x9e/0xe0
| print_address_description.constprop.0+0x3a/0x60
| ? netns_bpf_pernet_pre_exit+0xd9/0x130
| kasan_report.cold+0x1f/0x40
| ? netns_bpf_pernet_pre_exit+0xd9/0x130
| netns_bpf_pernet_pre_exit+0xd9/0x130
| cleanup_net+0x30b/0x5b0
| ? unregister_pernet_device+0x50/0x50
| ? rcu_read_lock_bh_held+0xb0/0xb0
| ? _raw_spin_unlock_irq+0x24/0x50
| process_one_work+0x4d1/0xa10
| ? lock_release+0x3e0/0x3e0
| ? pwq_dec_nr_in_flight+0x110/0x110
| ? rwlock_bug.part.0+0x60/0x60
| worker_thread+0x7a/0x5c0
| ? process_one_work+0xa10/0xa10
| kthread+0x1e3/0x240
| ? kthread_create_on_node+0xd0/0xd0
| ret_from_fork+0x1f/0x30
|
| Allocated by task 280:
| save_stack+0x1b/0x40
| __kasan_kmalloc.constprop.0+0xc2/0xd0
| netns_bpf_link_create+0xfe/0x650
| __do_sys_bpf+0x153a/0x2a50
| do_syscall_64+0x59/0x300
| entry_SYSCALL_64_after_hwframe+0x44/0xa9
|
| Freed by task 198:
| save_stack+0x1b/0x40
| __kasan_slab_free+0x12f/0x180
| kfree+0xed/0x350
| process_one_work+0x4d1/0xa10
| worker_thread+0x7a/0x5c0
| kthread+0x1e3/0x240
| ret_from_fork+0x1f/0x30
|
| The buggy address belongs to the object at ffff888119e0d700
| which belongs to the cache kmalloc-192 of size 192
| The buggy address is located 120 bytes inside of
| 192-byte region [ffff888119e0d700, ffff888119e0d7c0)
| The buggy address belongs to the page:
| page:ffffea0004678340 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
| flags: 0x2fffe0000000200(slab)
| raw: 02fffe0000000200 ffffea00045ba8c0 0000000600000006 ffff88811a80ea80
| raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
| page dumped because: kasan: bad access detected
|
| Memory state around the buggy address:
| ffff888119e0d600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
| ffff888119e0d680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
| >ffff888119e0d700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
| ^
| ffff888119e0d780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
| ffff888119e0d800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
| ==================================================================
Remove the "fast-path" for releasing a link that got auto-detached by a
dying network namespace to fix it. This way as long as link is on the list
and netns_bpf mutex is held, we have a guarantee that link memory can be
accessed.
An alternative way to fix this issue would be to safely iterate over the
list of links and ensure there is no access to link object after detaching
it. But, at the moment, optimizing synchronization overhead on link release
without a workload in mind seems like an overkill.
Fixes: ab53cad90e ("bpf, netns: Keep a list of attached bpf_link's")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630164541.1329993-1-jakub@cloudflare.com
Lorenz Bauer says:
====================
Both sockmap and flow_dissector ingnore various arguments passed to
BPF_PROG_ATTACH and BPF_PROG_DETACH. We can fix the attach case by
checking that the unused arguments are zero. I considered requiring
target_fd to be -1 instead of 0, but this leads to a lot of churn
in selftests. There is also precedent in that bpf_iter already
expects 0 for a similar field. I think that we can come up with a
work around for fd 0 should we need to in the future.
The detach case is more problematic: both cgroups and lirc2 verify
that attach_bpf_fd matches the currently attached program. This
way you need access to the program fd to be able to remove it.
Neither sockmap nor flow_dissector do this. flow_dissector even
has a check for CAP_NET_ADMIN because of this. The patch set
addresses this by implementing the desired behaviour.
There is a possibility for user space breakage: any callers that
don't provide the correct fd will fail with ENOENT. For sockmap
the risk is low: even the selftests assume that sockmap works
the way I described. For flow_dissector the story is less
straightforward, and the selftests use a variety of arguments.
I've includes fixes tags for the oldest commits that allow an easy
backport, however the behaviour dates back to when sockmap and
flow_dissector were introduced. What is the best way to handle these?
This set is based on top of Jakub's work "bpf, netns: Prepare
for multi-prog attachment" available at
https://lore.kernel.org/bpf/87k0zwmhtb.fsf@cloudflare.com/T/
Since v1:
- Adjust selftests
- Implement detach behaviour
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Calling bpf_prog_detach is incorrect, since it takes target_fd as
its argument. The intention here is to pass it as attach_bpf_fd,
so use bpf_prog_detach2 and pass zero for target_fd.
Fixes: 06716e04a0 ("selftests/bpf: Extend test_flow_dissector to cover link creation")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200629095630.7933-7-lmb@cloudflare.com
The sockmap code currently ignores the value of attach_bpf_fd when
detaching a program. This is contrary to the usual behaviour of
checking that attach_bpf_fd represents the currently attached
program.
Ensure that attach_bpf_fd is indeed the currently attached
program. It turns out that all sockmap selftests already do this,
which indicates that this is unlikely to cause breakage.
Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200629095630.7933-5-lmb@cloudflare.com
Using BPF_PROG_DETACH on a flow dissector program supports neither
attach_flags nor attach_bpf_fd. Yet no value is enforced for them.
Enforce that attach_flags are zero, and require the current program
to be passed via attach_bpf_fd. This allows us to remove the check
for CAP_SYS_ADMIN, since userspace can now no longer remove
arbitrary flow dissector programs.
Fixes: b27f7bb590 ("flow_dissector: Move out netns_bpf prog callbacks")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200629095630.7933-3-lmb@cloudflare.com
Using BPF_PROG_ATTACH on a flow dissector program supports neither
target_fd, attach_flags or replace_bpf_fd but accepts any value.
Enforce that all of them are zero. This is fine for replace_bpf_fd
since its presence is indicated by BPF_F_REPLACE. It's more
problematic for target_fd, since zero is a valid fd. Should we
want to use the flag later on we'd have to add an exception for
fd 0. The alternative is to force a value like -1. This requires
more changes to tests. There is also precedent for using 0,
since bpf_iter uses this for target_fd as well.
Fixes: b27f7bb590 ("flow_dissector: Move out netns_bpf prog callbacks")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200629095630.7933-2-lmb@cloudflare.com
Jakub Sitnicki says:
====================
This patch set prepares ground for link-based multi-prog attachment for
future netns attach types, with BPF_SK_LOOKUP attach type in mind [0].
Two changes are needed in order to attach and run a series of BPF programs:
1) an bpf_prog_array of programs to run (patch #2), and
2) a list of attached links to keep track of attachments (patch #3).
Nothing changes for BPF flow_dissector. Just as before only one program can
be attached to netns.
In v3 I've simplified patch #2 that introduces bpf_prog_array to take
advantage of the fact that it will hold at most one program for now.
In particular, I'm no longer using bpf_prog_array_copy. It turned out to be
less suitable for link operations than I thought as it fails to append the
same BPF program.
bpf_prog_array_replace_item is also gone, because we know we always want to
replace the first element in prog_array.
Naturally the code that handles bpf_prog_array will need change once
more when there is a program type that allows multi-prog attachment. But I
feel it will be better to do it gradually and present it together with
tests that actually exercise multi-prog code paths.
[0] https://lore.kernel.org/bpf/20200511185218.1422406-1-jakub@cloudflare.com/
v2 -> v3:
- Don't check if run_array is null in link update callback. (Martin)
- Allow updating the link with the same BPF program. (Andrii)
- Add patch #4 with a test for the above case.
- Kill bpf_prog_array_replace_item. Access the run_array directly.
- Switch from bpf_prog_array_copy() to bpf_prog_array_alloc(1, ...).
- Replace rcu_deref_protected & RCU_INIT_POINTER with rcu_replace_pointer.
- Drop Andrii's Ack from patch #2. Code changed.
v1 -> v2:
- Show with a (void) cast that bpf_prog_array_replace_item() return value
is ignored on purpose. (Andrii)
- Explain why bpf-cgroup cannot replace programs in bpf_prog_array based
on bpf_prog pointer comparison in patch #2 description. (Andrii)
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This case, while not particularly useful, is worth covering because we
expect the operation to succeed as opposed when re-attaching the same
program directly with PROG_ATTACH.
While at it, update the tests summary that fell out of sync when tests
extended to cover links.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200625141357.910330-5-jakub@cloudflare.com
To support multi-prog link-based attachments for new netns attach types, we
need to keep track of more than one bpf_link per attach type. Hence,
convert net->bpf.links into a list, that currently can be either empty or
have just one item.
Instead of reusing bpf_prog_list from bpf-cgroup, we link together
bpf_netns_link's themselves. This makes list management simpler as we don't
have to allocate, initialize, and later release list elements. We can do
this because multi-prog attachment will be available only for bpf_link, and
we don't need to build a list of programs attached directly and indirectly
via links.
No functional changes intended.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200625141357.910330-4-jakub@cloudflare.com
Prepare for having multi-prog attachments for new netns attach types by
storing programs to run in a bpf_prog_array, which is well suited for
iterating over programs and running them in sequence.
After this change bpf(PROG_QUERY) may block to allocate memory in
bpf_prog_array_copy_to_user() for collected program IDs. This forces a
change in how we protect access to the attached program in the query
callback. Because bpf_prog_array_copy_to_user() can sleep, we switch from
an RCU read lock to holding a mutex that serializes updaters.
Because we allow only one BPF flow_dissector program to be attached to
netns at all times, the bpf_prog_array pointed by net->bpf.run_array is
always either detached (null) or one element long.
No functional changes intended.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200625141357.910330-3-jakub@cloudflare.com
Prepare for using bpf_prog_array to store attached programs by moving out
code that updates the attached program out of flow dissector.
Managing bpf_prog_array is more involved than updating a single bpf_prog
pointer. This will let us do it all from one place, bpf/net_namespace.c, in
the subsequent patch.
No functional change intended.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200625141357.910330-2-jakub@cloudflare.com
Tiger Lake's new unique ACPI device ID for Fan is not valid
because of missing 'C' in the ID. Use correct fan device ID.
Fixes: c248dfe7e0 ("ACPI: fan: Add Tiger Lake ACPI device ID")
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Cc: 5.6+ <stable@vger.kernel.org> # 5.6+
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Adjust the reg property to fix the following warning seen with
'make dt_binding_check':
Documentation/devicetree/bindings/thermal/ti,am654-thermal.example.dt.yaml: example-0: thermal@42050000:reg:0: [0, 1107623936, 0, 604] is too long
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20200630122527.28640-1-festevam@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
Remove the soc unit address to fix the following warnings seen with
'make dt_binding_check':
Documentation/devicetree/bindings/thermal/thermal-sensor.example.dts:22.20-49.11: Warning (unit_address_vs_reg): /example-0/soc@0: node has a unit name, but no reg or ranges property
Documentation/devicetree/bindings/thermal/thermal-zones.example.dts:23.20-50.11: Warning (unit_address_vs_reg): /example-0/soc@0: node has a unit name, but no reg or ranges property
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20200630121804.27887-1-festevam@gmail.com
[robh: also fix thermal-zones.yaml example]
Signed-off-by: Rob Herring <robh@kernel.org>
Remove the leading zeroes to fix the following warning seen with
'make dt_binding_check':
Documentation/devicetree/bindings/usb/aspeed,usb-vhub.example.dts:37.33-42.23: Warning (unit_address_format): /example-0/usb-vhub@1e6a0000/vhub-strings/string@0409: unit name should not have leading 0s
Reviewed-by: Tao Ren <rentao.bupt@gmail.com>
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20200629214027.16768-1-festevam@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
There are two processed schema files:
- processed-schema-examples.yaml
Used for 'make dt_binding_check'. This is always a full schema.
- processed-schema.yaml
Used for 'make dtbs_check'. This may be a full schema, or a smaller
subset if DT_SCHEMA_FILES is given by a user.
If DT_SCHEMA_FILES is not specified, they are the same. You can copy
the former to the latter instead of running dt-mk-schema twice. This
saves the cpu time a lot when you do 'make dt_binding_check dtbs_check'
because building the full schema takes a couple of seconds.
If DT_SCHEMA_FILES is specified, processed-schema.yaml is generated
based on the specified yaml files.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20200625170434.635114-4-masahiroy@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
We are having more and more schema files.
Commit 8b6b80218b ("dt-bindings: Fix command line length limit
calling dt-mk-schema") fixed the 'Argument list too long' error of
the schema checks, but the same error happens while cleaning too.
'make clean' after 'make dt_binding_check' fails as follows:
$ make dt_binding_check
[ snip ]
$ make clean
make[2]: execvp: /bin/sh: Argument list too long
make[2]: *** [scripts/Makefile.clean:52: __clean] Error 127
make[1]: *** [scripts/Makefile.clean:66: Documentation/devicetree/bindings] Error 2
make: *** [Makefile:1763: _clean_Documentation] Error 2
'make dt_binding_check' generates so many .example.dts, .dt.yaml files,
which are passed to the 'rm' command when you run 'make clean'.
I added a small hack to use the 'find' command to clean up most of the
build artifacts before they are processed by scripts/Makefile.clean
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20200625170434.635114-2-masahiroy@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Since commit e69f5dc623 ("dt-bindings: serial: Convert 8250 to
json-schema"), the schema for "ns16550a" is checked.
'make dt_binding_check' emits the following warning:
uart@5,00200000: $nodename:0: 'uart@5,00200000' does not match '^serial(@[0-9a-f,]+)*$'
Rename the node to follow the pattern defined in
Documentation/devicetree/bindings/serial/serial.yaml
While I was here, I removed leading zeros from unit names.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Link: https://lore.kernel.org/r/20200623113242.779241-1-yamada.masahiro@socionext.com
Signed-off-by: Rob Herring <robh@kernel.org>
Sync with upstream dtc primarily to pickup the I2C bus check fixes. The
interrupt_provider check is noisy, so turn it off for now.
This adds the following commits from upstream:
9d7888cbf19c dtc: Consider one-character strings as strings
8259d59f59de checks: Improve i2c reg property checking
fdabcf2980a4 checks: Remove warning for I2C_OWN_SLAVE_ADDRESS
2478b1652c8d libfdt: add extern "C" for C++
f68bfc2668b2 libfdt: trivial typo fix
7be250b4d059 libfdt: Correct condition for reordering blocks
81e0919a3e21 checks: Add interrupt provider test
85e5d839847a Makefile: when building libfdt only, do not add unneeded deps
b28464a550c5 Fix some potential unaligned accesses in dtc
Signed-off-by: Rob Herring <robh@kernel.org>
BPF ringbuf assumes the size to be a multiple of page size and the power of
2 value. The latter is important to avoid division while calculating position
inside the ring buffer and using (N-1) mask instead. This patch fixes omission
to enforce power-of-2 size rule.
Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200630061500.1804799-1-andriin@fb.com
Choo! Choo! All aboard the Split Lock Express, with direct service to
Wreckage!
Skip split_lock_verify_msr() if the CPU isn't whitelisted as a possible
SLD-enabled CPU model to avoid writing MSR_TEST_CTRL. MSR_TEST_CTRL
exists, and is writable, on many generations of CPUs. Writing the MSR,
even with '0', can result in bizarre, undocumented behavior.
This fixes a crash on Haswell when resuming from suspend with a live KVM
guest. Because APs use the standard SMP boot flow for resume, they will
go through split_lock_init() and the subsequent RDMSR/WRMSR sequence,
which runs even when sld_state==sld_off to ensure SLD is disabled. On
Haswell (at least, my Haswell), writing MSR_TEST_CTRL with '0' will
succeed and _may_ take the SMT _sibling_ out of VMX root mode.
When KVM has an active guest, KVM performs VMXON as part of CPU onlining
(see kvm_starting_cpu()). Because SMP boot is serialized, the resulting
flow is effectively:
on_each_ap_cpu() {
WRMSR(MSR_TEST_CTRL, 0)
VMXON
}
As a result, the WRMSR can disable VMX on a different CPU that has
already done VMXON. This ultimately results in a #UD on VMPTRLD when
KVM regains control and attempt run its vCPUs.
The above voodoo was confirmed by reworking KVM's VMXON flow to write
MSR_TEST_CTRL prior to VMXON, and to serialize the sequence as above.
Further verification of the insanity was done by redoing VMXON on all
APs after the initial WRMSR->VMXON sequence. The additional VMXON,
which should VM-Fail, occasionally succeeded, and also eliminated the
unexpected #UD on VMPTRLD.
The damage done by writing MSR_TEST_CTRL doesn't appear to be limited
to VMX, e.g. after suspend with an active KVM guest, subsequent reboots
almost always hang (even when fudging VMXON), a #UD on a random Jcc was
observed, suspend/resume stability is qualitatively poor, and so on and
so forth.
kernel BUG at arch/x86/kvm/x86.c:386!
CPU: 1 PID: 2592 Comm: CPU 6/KVM Tainted: G D
Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
RIP: 0010:kvm_spurious_fault+0xf/0x20
Call Trace:
vmx_vcpu_load_vmcs+0x1fb/0x2b0
vmx_vcpu_load+0x3e/0x160
kvm_arch_vcpu_load+0x48/0x260
finish_task_switch+0x140/0x260
__schedule+0x460/0x720
_cond_resched+0x2d/0x40
kvm_arch_vcpu_ioctl_run+0x82e/0x1ca0
kvm_vcpu_ioctl+0x363/0x5c0
ksys_ioctl+0x88/0xa0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x4c/0x170
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: dbaba47085 ("x86/split_lock: Rework the initialization flow of split lock detection")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200605192605.7439-1-sean.j.christopherson@intel.com
Bit 8 would be the "global" bit, which does not quite make sense for non-leaf
page table entries. Intel ignores it; AMD ignores it in PDEs and PDPEs, but
reserves it in PML4Es.
Probably, earlier versions of the AMD manual documented it as reserved in PDPEs
as well, and that behavior made it into KVM as well as kvm-unit-tests; fix it.
Cc: stable@vger.kernel.org
Reported-by: Nadav Amit <namit@vmware.com>
Fixes: a0c0feb579 ("KVM: x86: reserve bit 8 of non-leaf PDPEs and PML4Es in 64-bit mode on AMD", 2014-09-03)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In flush_delete_work, instead of flushing each individual pending
delayed work item, cancel and re-queue them for immediate execution.
The waiting isn't needed here because we're already waiting for all
queued work items to complete in gfs2_flush_delete_work. This makes the
code more efficient, but more importantly, it avoids sleeping during a
rhashtable walk, inside rcu_read_lock().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Log flush operations (gfs2_log_flush()) can target a specific transaction.
But if the function encounters errors (e.g. io errors) and withdraws,
the transaction was only freed it if was queued to one of the ail lists.
If the withdraw occurred before the transaction was queued to the ail1
list, function ail_drain never freed it. The result was:
BUG gfs2_trans: Objects remaining in gfs2_trans on __kmem_cache_shutdown()
This patch makes log_flush() add the targeted transaction to the ail1
list so that function ail_drain() will find and free it properly.
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Callers expect gfs2_inode_lookup to return an inode pointer or ERR_PTR(error).
Commit b66648ad6d caused it to return NULL instead of ERR_PTR(-ESTALE) in
some cases. Fix that.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: b66648ad6d ("gfs2: Move inode generation number check into gfs2_inode_lookup")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
If NO_DMA=y (e.g. Sun-3 all{mod,yes}-config):
drivers/iommu/dma-iommu.o: In function `iommu_dma_mmap':
dma-iommu.c:(.text+0x92e): undefined reference to `dma_pgprot'
IOMMU_DMA must not be selected, unless HAS_DMA=y.
Hence fix this by making SUN50I_IOMMU depend on HAS_DMA.
Fixes: 4100b8c229 ("iommu: Add Allwinner H6 IOMMU driver")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200629121146.24011-1-geert@linux-m68k.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pull irqchip fixes from Marc Zyngier:
- Fix atomicity of affinity update in the GIC driver
- Don't sleep in atomic when waiting for a GICv4.1 RD to respond
- Fix a couple of typos in user-visible messages
clang static analysis flags several null function pointer problems.
drivers/scsi/scsi_transport_spi.c:374:1: warning: Called function pointer is null (null dereference) [core.CallAndMessage]
spi_transport_max_attr(offset, "%d\n");
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reviewing the store_spi_store_max macro
if (i->f->set_##field)
return -EINVAL;
should be
if (!i->f->set_##field)
return -EINVAL;
Link: https://lore.kernel.org/r/20200627133242.21618-1-trix@redhat.com
Reviewed-by: James Bottomley <jejb@linux.ibm.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
genl_family_rcv_msg_attrs_parse() reuses the global family->attrbuf
when family->parallel_ops is false. However, family->attrbuf is not
protected by any lock on the genl_family_rcv_msg_doit() code path.
This leads to several different consequences, one of them is UAF,
like the following:
genl_family_rcv_msg_doit(): genl_start():
genl_family_rcv_msg_attrs_parse()
attrbuf = family->attrbuf
__nlmsg_parse(attrbuf);
genl_family_rcv_msg_attrs_parse()
attrbuf = family->attrbuf
__nlmsg_parse(attrbuf);
info->attrs = attrs;
cb->data = info;
netlink_unicast_kernel():
consume_skb()
genl_lock_dumpit():
genl_dumpit_info(cb)->attrs
Note family->attrbuf is an array of pointers to the skb data, once
the skb is freed, any dereference of family->attrbuf will be a UAF.
Maybe we could serialize the family->attrbuf with genl_mutex too, but
that would make the locking more complicated. Instead, we can just get
rid of family->attrbuf and always allocate attrbuf from heap like the
family->parallel_ops==true code path. This may add some performance
overhead but comparing with taking the global genl_mutex, it still
looks better.
Fixes: 75cdbdd089 ("net: ieee802154: have genetlink code to parse the attrs during dumpit")
Fixes: 057af70713 ("net: tipc: have genetlink code to parse the attrs during dumpit")
Reported-and-tested-by: syzbot+3039ddf6d7b13daf3787@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+80cad1e3cb4c41cde6ff@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+736bcbcb11b60d0c0792@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+520f8704db2b68091d44@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+c96e4dfb32f8987fdeed@syzkaller.appspotmail.com
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Couple of fixes/small things:
* TX control port status check fixed to not assume frame format
* mesh control port fixes
* error handling/leak fixes when starting AP, with HE attributes
* fix broadcast packet handling with encapsulation offload
* add new AKM suites
* and a small code cleanup
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following sparse warning:
arch/arm64/xen/../../arm/xen/enlighten.c:244: warning: macro
"GRANT_TABLE_PHYSADDR" is not used [-Wunused-macros]
It is an isolated macro, and should be removed when its last user
was deleted in the following commit 3cf4095d74 ("arm/xen: Use
xen_xlate_map_ballooned_pages to setup grant table")
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
When starting at 744MHz, the Mali 450 core crashes on S805X based boards:
lima d00c0000.gpu: IRQ ppmmu3 not found
lima d00c0000.gpu: IRQ ppmmu4 not found
lima d00c0000.gpu: IRQ ppmmu5 not found
lima d00c0000.gpu: IRQ ppmmu6 not found
lima d00c0000.gpu: IRQ ppmmu7 not found
Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.2+ #492
Hardware name: Libre Computer AML-S805X-AC (DT)
pstate: 40000005 (nZcv daif -PAN -UAO)
pc : lima_gp_init+0x28/0x188
...
Call trace:
lima_gp_init+0x28/0x188
lima_device_init+0x334/0x534
lima_pdev_probe+0xa4/0xe4
...
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Reverting to a safer 666Mhz frequency on the S805X that doesn't use the
GP0 PLL makes it more stable.
Fixes: fd47716479 ("ARM64: dts: add S805X based P241 board")
Fixes: 0449b8e371 ("arm64: dts: meson: add libretech aml-s805x-ac board")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Link: https://lore.kernel.org/r/20200618132737.14243-1-narmstrong@baylibre.com
The adc_vol_tlv volume-control has a range from -17.625 dB to +30 dB,
not -176.25 dB to + 300 dB. This wrong scale is esp. a problem in userspace
apps which translate the dB scale to a linear scale. With the logarithmic
dB scale being of by a factor of 10 we loose all precision in the lower
area of the range when apps translate things to a linear scale.
E.g. the 0 dB default, which corresponds with a value of 47 of the
0 - 127 range for the control, would be shown as 0/100 in alsa-mixer.
Since the centi-dB values used in the TLV struct cannot represent the
0.375 dB step size used by these controls, change the TLV definition
for them to specify a min and max value instead of min + stepsize.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200628155231.71089-5-hdegoede@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The Lenovo Miix 2 10 has a keyboard dock with extra speakers in the dock.
Rather then the ACL5672's GPIO1 pin being used as IRQ to the CPU, it is
actually used to enable the amplifier for these speakers
(the IRQ to the CPU comes directly from the jack-detect switch).
Add a quirk for having an ext speaker-amplifier enable pin on GPIO1
and replace the Lenovo Miix 2 10's dmi_system_id table entry's wrong
GPIO_DEV quirk (which needs to be renamed to GPIO1_IS_IRQ) with the
new RT5670_GPIO1_IS_EXT_SPK_EN quirk, so that we enable the external
speaker-amplifier as necessary.
Also update the ident field for the dmi_system_id table entry, the
Miix models are not Thinkpads.
Fixes: 67e03ff3f3 ("ASoC: codecs: rt5670: add Thinkpad Tablet 10 quirk")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1786723
Link: https://lore.kernel.org/r/20200628155231.71089-4-hdegoede@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The RT5670_PWR_ANLG1 register has 3 bits to select the LDO voltage,
so the correct mask is 0x7 not 0x3.
Because of this wrong mask we were programming the ldo bits
to a setting of binary 001 (0x05 & 0x03) instead of binary 101
when moving to SND_SOC_BIAS_PREPARE.
According to the datasheet 001 is a reserved value, so no idea
what it did, since the driver was working fine before I guess we
got lucky and it does something which is ok.
Fixes: 5e8351de74 ("ASoC: add RT5670 CODEC driver")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200628155231.71089-3-hdegoede@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The default mode for SSP configuration is TDM 4 slot and so far we were
using this for the bus format on cht-bsw-rt56732 boards.
One board, the Lenovo Miix 2 10 uses not 1 but 2 codecs connected to SSP2.
The second piggy-backed, output-only codec is inside the keyboard-dock
(which has extra speakers). Unlike the main rt5672 codec, we cannot
configure this codec, it is hard coded to use 2 channel 24 bit I2S.
Using 4 channel TDM leads to the dock speakers codec (which listens in on
the data send from the SSP to the rt5672 codec) emiting horribly distorted
sound.
Since we only support 2 channels anyways, there is no need for TDM on any
cht-bsw-rt5672 designs. So we can simply use I2S 2ch everywhere.
This commit fixes the Lenovo Miix 2 10 dock speakers issue by changing
the bus format set in cht_codec_fixup() to I2S 2 channel.
This change has been tested on the following devices with a rt5672 codec:
Lenovo Miix 2 10
Lenovo Thinkpad 8
Lenovo Thinkpad 10 (gen 1)
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: stable@vger.kernel.org
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1786723
Link: https://lore.kernel.org/r/20200628155231.71089-2-hdegoede@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Make blk_ksm_destroy() use the kvfree_sensitive() function (which was
introduced in v5.8-rc1) instead of open-coding it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Even if that's only a warning, not including asm/cacheflush.h
leads to svc_flush_bvec() being empty allthough powerpc defines
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE.
CC net/sunrpc/svcsock.o
net/sunrpc/svcsock.c:227:5: warning: "ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE" is not defined [-Wundef]
#if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
^
Include linux/highmem.h so that asm/cacheflush.h will be included.
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: ca07eda33e ("SUNRPC: Refactor svc_recvfrom()")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
I don't understand this code well, but I'm seeing a warning about a
still-referenced inode on unmount, and every other similar filesystem
does a dput() here.
Fixes: e8a79fb14f ("nfsd: add nfsd/clients directory")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We don't drop the reference on the nfsdfs filesystem with
mntput(nn->nfsd_mnt) until nfsd_exit_net(), but that won't be called
until the nfsd module's unloaded, and we can't unload the module as long
as there's a reference on nfsdfs. So this prevents module unloading.
Fixes: 2c830dd720 ("nfsd: persist nfsd filesystem across mounts")
Reported-and-Tested-by: Luo Xiaogang <lxgrxd@163.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Commit f16861b12f ("regulator: rename da903x to da903x-regulator") missed
to adjust the DIALOG SEMICONDUCTOR DRIVERS section in MAINTAINERS.
Hence, ./scripts/get_maintainer.pl --self-test=patterns complains:
warning: no file matches F: drivers/regulator/da903x.c
The da903x-regulator.c file is already covered by the pattern
drivers/regulator/da9???-regulator.[ch] in the section.
So, simply remove the non-matching file entry in MAINTAINERS.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20200628180229.5068-1-lukas.bulwahn@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Pull spi fixes from Mark Brown:
"A batch of fixes for the Freescale DSPI driver fixing some serious
issues with removal of active devices and one resume case, plus a few
new PCI IDs for Intel platforms"
* tag 'spi-fix-v5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: pxa2xx: Add support for Intel Tiger Lake PCH-H
spi: spi-fsl-dspi: Initialize completion before possible interrupt
spi: spi-fsl-dspi: Fix external abort on interrupt in resume or exit paths
spi: spi-fsl-dspi: Fix lockup if device is shutdown during SPI transfer
spi: spi-fsl-dspi: Fix lockup if device is removed during SPI transfer
Pull thermal fixes from Daniel Lezcano:
- Fix undefined temperature if negative on the rcar_gen3 (Dien Pham)
- Fix wrong frequency converted from power for the cpufreq cooling
device (Finley Xiao)
- Fix compilation warnings by making functions static in the tsens
driver (Amit Kucheria)
- Fix return value of sprd_thm_probe for the Spreadtrum driver
(Tiezhu Yang)
- Fix bank number settings on the Mediatek mt8183 (Michael Kao)
- Fix missing of_node_put() at probe time i.MX (Anson Huang)
* tag 'thermal-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal/drivers/rcar_gen3: Fix undefined temperature if negative
thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from power
thermal/drivers/tsens: Fix compilation warnings by making functions static
thermal/drivers/sprd: Fix return value of sprd_thm_probe()
thermal/drivers/mediatek: Fix bank number settings on mt8183
thermal/drivers: imx: Fix missing of_node_put() at probe time
Pull crypto fixes from Herbert Xu:
"This fixes two race conditions, one in padata and one in af_alg"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
padata: upgrade smp_mb__after_atomic to smp_mb in padata_do_serial
crypto: af_alg - fix use-after-free in af_alg_accept() due to bh_lock_sock()
When building on allyesconfig kernel for a NO_DMA=y platform (e.g.
Sun-3), CONFIG_SND_SOC_QCOM_COMMON=y, but CONFIG_SND_SOC_QDSP6_AFE=n,
leading to a link failure:
sound/soc/qcom/common.o: In function `qcom_snd_parse_of':
common.c:(.text+0x2e2): undefined reference to `q6afe_is_rx_port'
While SND_SOC_QDSP6 depends on HAS_DMA, SND_SOC_MSM8996 and SND_SOC_SDM845
don't, so the following warning is seen:
WARNING: unmet direct dependencies detected for SND_SOC_QDSP6
Depends on [n]: SOUND [=y] && !UML && SND [=y] && SND_SOC [=y] && QCOM_APR [=y] && HAS_DMA [=n]
Selected by [y]:
- SND_SOC_MSM8996 [=y] && SOUND [=y] && !UML && SND [=y] && SND_SOC [=y] && QCOM_APR [=y]
- SND_SOC_SDM845 [=y] && SOUND [=y] && !UML && SND [=y] && SND_SOC [=y] && QCOM_APR [=y] && CROS_EC [=y] && I2C [=y] && SOUNDWIRE [=y]
Until recently, this warning was harmless (from a compile-testing
point-of-view), but the new user of q6afe_is_rx_port() turned this into
a hard failure.
As the QDSP6 driver itself builds fine if NO_DMA=y, and it depends on
QCOM_APR (which in turns depends on ARCH_QCOM || COMPILE_TEST), it is
safe to increase compile testing coverage. Hence fix the link failure
by dropping the HAS_DMA dependency of SND_SOC_QDSP6.
Fixes: a212008925 ("ASoC: qcom: common: set correct directions for dailinks")
Fixes: 6b1687bf76 ("ASoC: qcom: add sdm845 sound card support")
Fixes: a6f933f63f ("ASoC: qcom: apq8096: Add db820c machine driver")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20200629122443.21736-1-geert@linux-m68k.org
Signed-off-by: Mark Brown <broonie@kernel.org>
This reverts commit e9c15badbb ("fs: Do not check if there is a
fsnotify watcher on pseudo inodes"). The commit intended to eliminate
fsnotify-related overhead for pseudo inodes but it is broken in
concept. inotify can receive events of pipe files under /proc/X/fd and
chromium relies on close and open events for sandboxing. Maxim Levitsky
reported the following
Chromium starts as a white rectangle, shows few white rectangles that
resemble its notifications and then crashes.
The stdout output from chromium:
[mlevitsk@starship ~]$chromium-freeworld
mesa: for the --simplifycfg-sink-common option: may only occur zero or one times!
mesa: for the --global-isel-abort option: may only occur zero or one times!
[3379:3379:0628/135151.440930:ERROR:browser_switcher_service.cc(238)] XXX Init()
../../sandbox/linux/seccomp-bpf-helpers/sigsys_handlers.cc:**CRASHING**:seccomp-bpf failure in syscall 0072
Received signal 11 SEGV_MAPERR 0000004a9048
Crashes are not universal but even if chromium does not crash, it certainly
does not work properly. While filtering just modify and access might be
safe, the benefit is not worth the risk hence the revert.
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Fixes: e9c15badbb ("fs: Do not check if there is a fsnotify watcher on pseudo inodes")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Syzbot reported that:
CPU: 1 PID: 6780 Comm: syz-executor153 Not tainted 5.7.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__apic_accept_irq+0x46/0xb80
Call Trace:
kvm_arch_async_page_present+0x7de/0x9e0
kvm_check_async_pf_completion+0x18d/0x400
kvm_arch_vcpu_ioctl_run+0x18bf/0x69f0
kvm_vcpu_ioctl+0x46a/0xe20
ksys_ioctl+0x11a/0x180
__x64_sys_ioctl+0x6f/0xb0
do_syscall_64+0xf6/0x7d0
entry_SYSCALL_64_after_hwframe+0x49/0xb3
The testcase enables APF mechanism in MSR_KVM_ASYNC_PF_EN with ASYNC_PF_INT
enabled w/o setting MSR_KVM_ASYNC_PF_INT before, what's worse, interrupt
based APF 'page ready' event delivery depends on in kernel lapic, however,
we didn't bail out when lapic is not in kernel during guest setting
MSR_KVM_ASYNC_PF_EN which causes the null-ptr-deref in host later.
This patch fixes it.
Reported-by: syzbot+1bf777dfdde86d64b89b@syzkaller.appspotmail.com
Fixes: 2635b5c4a0 (KVM: x86: interrupt based APF 'page ready' event delivery)
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1593426391-8231-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Command line parameters might set static keys. This is true for s390 at
least since commit 6471384af2 ("mm: security: introduce init_on_alloc=1
and init_on_free=1 boot options"). To avoid the following WARN:
static_key_enable_cpuslocked(): static key 'init_on_alloc+0x0/0x40' used
before call to jump_label_init()
call jump_label_init() just before parse_early_param().
jump_label_init() is safe to call multiple times (x86 does that), doesn't
do any memory allocations and hence should be safe to call that early.
Fixes: 6471384af2 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Cc: <stable@vger.kernel.org> # 5.3: d6df52e999: s390/maccess: add no DAT mode to kernel_write
Cc: <stable@vger.kernel.org> # 5.3
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
To be able to patch kernel code before paging is initialized do plain
memcpy if DAT is off. This is required to enable early jump label
initialization.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
In usual IPL or hot plug scenarios a zPCI function transitions directly
from reserved (invisible to Linux) to configured state or is configured
by Linux itself using an SCLP, however it can also first go from
reserved to standby and then from standby to configured without
Linux initiative.
In this scenario we first get a PEC event 0x302 and then 0x301. This may
happen for example when the device is deconfigured at another LPAR and
made available for this LPAR. It may also happen under z/VM when
a device is attached while in some inconsistent state.
However when we get the 0x301 the device is already known to zPCI
so calling zpci_create() will add it twice resulting in the below
BUG. Instead we should only enable the existing device and finally
scan it through the PCI subsystem.
list_add double add: new=00000000ed5a9008, prev=00000000ed5a9008, next=0000000083502300.
kernel BUG at lib/list_debug.c:31!
Krnl PSW : 0704c00180000000 0000000082dc2db8 (__list_add_valid+0x70/0xa8)
Call Trace:
[<0000000082dc2db8>] __list_add_valid+0x70/0xa8
([<0000000082dc2db4>] __list_add_valid+0x6c/0xa8)
[<00000000828ea920>] zpci_create_device+0x60/0x1b0
[<00000000828ef04a>] zpci_event_availability+0x282/0x2f0
[<000000008315f848>] chsc_process_crw+0x2b8/0xa18
[<000000008316735c>] crw_collect_info+0x254/0x348
[<00000000829226ea>] kthread+0x14a/0x168
[<000000008319d5c0>] ret_from_fork+0x24/0x2c
Fixes: f606b3ef47 ("s390/pci: adapt events for zbus")
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
After pulling 5.7.0 (linux-next merge), mcf5441x mmu boot was
hanging silently.
memblock_add() seems not appropriate, since using MAX_NUMNODES
as node id, while memblock_add_node() sets up memory for node id 0.
Signed-off-by: Angelo Dureghello <angelo.dureghello@timesys.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
The m68k nommu setup code didn't register the beginning of the physical
memory with memblock because it was anyway occupied by the kernel. However,
commit fa3354e4ea ("mm: free_area_init: use maximal zone PFNs rather than
zone sizes") changed zones initialization to use memblock.memory to detect
the zone extents and this caused inconsistency between zone PFNs and the
actual PFNs:
BUG: Bad page state in process swapper pfn:20165
page:41fe0ca0 refcount:0 mapcount:1 mapping:00000000 index:0x0 flags: 0x0()
raw: 00000000 00000100 00000122 00000000 00000000 00000000 00000000 00000000
page dumped because: nonzero mapcount
CPU: 0 PID: 1 Comm: swapper Not tainted 5.8.0-rc1-00001-g3a38f8a60c65-dirty #1
Stack from 404c9ebc:
404c9ebc 4029ab28 4029ab28 40088470 41fe0ca0 40299e21 40299df1 404ba2a4
00020165 00000000 41fd2c10 402c7ba0 41fd2c04 40088504 41fe0ca0 40299e21
00000000 40088a12 41fe0ca0 41fe0ca4 0000020a 00000000 00000001 402ca000
00000000 41fe0ca0 41fd2c10 41fd2c10 00000000 00000000 402b2388 00000001
400a0934 40091056 404c9f44 404c9f44 40088db4 402c7ba0 00000001 41fd2c04
41fe0ca0 41fd2000 41fe0ca0 40089e02 4026ecf4 40089e4e 41fe0ca0 ffffffff
Call Trace:
[<40088470>] 0x40088470
[<40088504>] 0x40088504
[<40088a12>] 0x40088a12
[<402ca000>] 0x402ca000
[<400a0934>] 0x400a0934
Adjust the memory registration with memblock to include the beginning of
the physical memory and make sure that the area occupied by the kernel is
marked as reserved.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Else there may be magic numbers in /sys/kernel/debug/block/*/state.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When the kernel panics, one page of kmsg data may be collected and sent to
Hyper-V to aid in diagnosing the failure. The collected kmsg data typically
contains 50 to 100 lines, each of which has a log level prefix that isn't
very useful from a diagnostic standpoint. So tell kmsg_dump_get_buffer()
to not include the log level, enabling more information that *is* useful to
fit in the page.
Requesting in stable kernels, since many kernels running in production are
stable releases.
Cc: stable@vger.kernel.org
Signed-off-by: Joseph Salisbury <joseph.salisbury@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1593210497-114310-1-git-send-email-joseph.salisbury@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
As description for DIV_ROUND_CLOSEST in file include/linux/kernel.h.
"Result is undefined for negative divisors if the dividend variable
type is unsigned and for negative dividends if the divisor variable
type is unsigned."
In current code, the FIXPT_DIV uses DIV_ROUND_CLOSEST but has not
checked sign of divisor before using. It makes undefined temperature
value in case the value is negative.
This patch fixes to satisfy DIV_ROUND_CLOSEST description
and fix bug too. Note that the variable name "reg" is not good
because it should be the same type as rcar_gen3_thermal_read().
However, it's better to rename the "reg" in a further patch as
cleanup.
Signed-off-by: Van Do <van.do.xw@renesas.com>
Signed-off-by: Dien Pham <dien.pham.ry@renesas.com>
[shimoda: minor fixes, add Fixes tag]
Fixes: 564e73d283 ("thermal: rcar_gen3_thermal: Add R-Car Gen3 thermal driver")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Niklas Soderlund <niklas.soderlund+renesas@ragnatech.se>
Tested-by: Niklas Soderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1593085099-2057-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
The function cpu_power_to_freq is used to find a frequency and set the
cooling device to consume at most the power to be converted. For example,
if the power to be converted is 80mW, and the em table is as follow.
struct em_cap_state table[] = {
/* KHz mW */
{ 1008000, 36, 0 },
{ 1200000, 49, 0 },
{ 1296000, 59, 0 },
{ 1416000, 72, 0 },
{ 1512000, 86, 0 },
};
The target frequency should be 1416000KHz, not 1512000KHz.
Fixes: 349d39dc57 ("thermal: cpu_cooling: merge frequency and power tables")
Cc: <stable@vger.kernel.org> # v4.13+
Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200619090825.32747-1-finley.xiao@rock-chips.com
After merging tsens-common.c into tsens.c, we can now mark some
functions static so they don't need any prototype declarations. This
fixes the following issue reported by lkp.
>> drivers/thermal/qcom/tsens.c:385:13: warning: no previous prototype for 'tsens_critical_irq_thread' [-Wmissing-prototypes]
385 | irqreturn_t tsens_critical_irq_thread(int irq, void *data)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/thermal/qcom/tsens.c:455:13: warning: no previous prototype for 'tsens_irq_thread' [-Wmissing-prototypes]
455 | irqreturn_t tsens_irq_thread(int irq, void *data)
| ^~~~~~~~~~~~~~~~
>> drivers/thermal/qcom/tsens.c:523:5: warning: no previous prototype for 'tsens_set_trips' [-Wmissing-prototypes]
523 | int tsens_set_trips(void *_sensor, int low, int high)
| ^~~~~~~~~~~~~~~
>> drivers/thermal/qcom/tsens.c:560:5: warning: no previous prototype for 'tsens_enable_irq' [-Wmissing-prototypes]
560 | int tsens_enable_irq(struct tsens_priv *priv)
| ^~~~~~~~~~~~~~~~
>> drivers/thermal/qcom/tsens.c:573:6: warning: no previous prototype for 'tsens_disable_irq' [-Wmissing-prototypes]
573 | void tsens_disable_irq(struct tsens_priv *priv)
| ^~~~~~~~~~~~~~~~~
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/6757a26876b29922929abf64b1c11fa3b3033d03.1590579709.git.amit.kucheria@linaro.org
Alexandre Oliva has recently removed these files from Linux Libre
with concerns that the sources weren't available.
The sources are available on IGT repository, and only open source
tools are used to generate the {ivb,hsw}_clear_kernel.c files.
However, the remaining concern from Alexandre Oliva was around
GPL license and the source not been present when distributing
the code.
So, it looks like 2 alternatives are possible, the use of
linux-firmware.git repository to store the blob or making sure
that the source is also present in our tree. Since the goal
is to limit the i915 firmware to only the micro-controller blobs
let's make sure that we do include the asm sources here in our tree.
Btw, I tried to have some diligence here and make sure that the
asms that these commits are adding are truly the source for
the mentioned files:
igt$ ./scripts/generate_clear_kernel.sh -g ivb \
-m ~/mesa/build/src/intel/tools/i965_asm
Output file not specified - using default file "ivb-cb_assembled"
Generating gen7 CB Kernel assembled file "ivb_clear_kernel.c"
for i915 driver...
igt$ diff ~/i915/drm-tip/drivers/gpu/drm/i915/gt/ivb_clear_kernel.c \
ivb_clear_kernel.c
< * Generated by: IGT Gpu Tools on Fri 21 Feb 2020 05:29:32 AM UTC
> * Generated by: IGT Gpu Tools on Mon 08 Jun 2020 10:00:54 AM PDT
61c61
< };
> };
\ No newline at end of file
igt$ ./scripts/generate_clear_kernel.sh -g hsw \
-m ~/mesa/build/src/intel/tools/i965_asm
Output file not specified - using default file "hsw-cb_assembled"
Generating gen7.5 CB Kernel assembled file "hsw_clear_kernel.c"
for i915 driver...
igt$ diff ~/i915/drm-tip/drivers/gpu/drm/i915/gt/hsw_clear_kernel.c \
hsw_clear_kernel.c
5c5
< * Generated by: IGT Gpu Tools on Fri 21 Feb 2020 05:30:13 AM UTC
> * Generated by: IGT Gpu Tools on Mon 08 Jun 2020 10:01:42 AM PDT
61c61
< };
> };
\ No newline at end of file
Used IGT and Mesa master repositories from Fri Jun 5 2020)
IGT: 53e8c878a6fb ("tests/kms_chamelium: Force reprobe after replugging
the connector")
Mesa: 5d13c7477eb1 ("radv: set keep_statistic_info with
RADV_DEBUG=shaderstats")
Mesa built with: meson build -D platforms=drm,x11 -D dri-drivers=i965 \
-D gallium-drivers=iris -D prefix=/usr \
-D libdir=/usr/lib64/ -Dtools=intel \
-Dkulkan-drivers=intel && ninja -C build
v2: Header clean-up and include build instructions in a readme (Chris)
Modified commit message to respect check-patch
Reference: http://www.fsfla.org/pipermail/linux-libre/2020-June/003374.html
Reference: http://www.fsfla.org/pipermail/linux-libre/2020-June/003375.html
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Cc: <stable@vger.kernel.org> # v5.7+
Cc: Alexandre Oliva <lxoliva@fsfla.org>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200610201807.191440-1-rodrigo.vivi@intel.com
(cherry picked from commit 5a7eeb8ba1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
generic_file_fsync() exfat used could not guarantee the consistency of
a file because it has flushed not dirty metadata but only dirty data pages
for a file.
Instead of that, use exfat_file_fsync() for files and directories so that
it guarantees to commit both the metadata and data pages for a file.
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
We need to commit dirty metadata and pages to disk
before remounting exfat as read-only.
This fixes a failure in xfstests generic/452
generic/452 does the following:
cp something <exfat>/
mount -o remount,ro <exfat>
the <exfat>/something is corrupted. because while
exfat is remounted as read-only, exfat doesn't
have a chance to commit metadata and
vfs invalidates page caches in a block device.
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
If the second exfat_get_dentry() call fails then we need to release
"old_bh" before returning. There is a similar bug in exfat_move_file().
Fixes: 5f2aa07507 ("exfat: add inode operations")
Reported-by: Markus Elfring <Markus.Elfring@web.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Some fsck tool complain that padding part of the FileName field
is not set to the value 0000h. So let's maintain filesystem cleaner,
as exfat's spec. recommendation.
Signed-off-by: Hyeongseok.Kim <Hyeongseok@gmail.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
When a DMA coherent pool is depleted, allocation failures may or may not
get reported in the kernel log depending on the allocator.
The admin does have a workaround, however, by using coherent_pool= on the
kernel command line.
Provide some guidance on the failure and a recommended minimum size for
the pools (double the size).
Signed-off-by: David Rientjes <rientjes@google.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Previously, kernel floating point code would run with the MXCSR control
register value last set by userland code by the thread that was active
on the CPU core just before kernel call. This could affect calculation
results if rounding mode was changed, or a crash if a FPU/SIMD exception
was unmasked.
Restore MXCSR to the kernel's default value.
[ bp: Carve out from a bigger patch by Petteri, add feature check, add
FNINIT call too (amluto). ]
Signed-off-by: Petteri Aimonen <jpa@git.mail.kapsi.fi>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207979
Link: https://lkml.kernel.org/r/20200624114646.28953-2-bp@alien8.de
Jan reported that LTP mmap03 was getting stuck in a page fault loop
after commit c46241a370 ("powerpc/pkeys: Check vma before returning
key fault error to the user"), as well as a minimised reproducer:
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
int main(int ac, char **av)
{
int page_sz = getpagesize();
int fildes;
char *addr;
fildes = open("tempfile", O_WRONLY | O_CREAT, 0666);
write(fildes, &fildes, sizeof(fildes));
close(fildes);
fildes = open("tempfile", O_RDONLY);
unlink("tempfile");
addr = mmap(0, page_sz, PROT_EXEC, MAP_FILE | MAP_PRIVATE, fildes, 0);
printf("%d\n", *addr);
return 0;
}
And noticed that access_pkey_error() in page fault handler now always
seem to return false:
__do_page_fault
access_pkey_error(is_pkey: 1, is_exec: 0, is_write: 0)
arch_vma_access_permitted
pkey_access_permitted
if (!is_pkey_enabled(pkey))
return true
return false
pkey_access_permitted() should not check if the pkey is available in
UAMOR (using is_pkey_enabled()). The kernel needs to do that check
only when allocating keys. This also makes sure the execute_only_key
which is marked as non-manageable via UAMOR is handled correctly in
pkey_access_permitted(), and fixes the bug.
Fixes: c46241a370 ("powerpc/pkeys: Check vma before returning key fault error to the user")
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Include bug report details etc. in the change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200627070147.297535-1-aneesh.kumar@linux.ibm.com
The lockdep annotations for dev_uc_unsync() and dev_mc_unsync()
are not easy to understand, so add some comments to explain
why they are correct.
Similar for the rest netif_addr_lock_bh() cases, they don't
need nested version.
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
lockdep_set_class_and_subclass() is meant to reduce
the _nested() annotations by assigning a default subclass.
For addr_list_lock, we have to compute the subclass at
run-time as the netdevice topology changes after creation.
So, we should just get rid of these
lockdep_set_class_and_subclass() and stick with our _nested()
annotations.
Fixes: 845e0ebb44 ("net: change addr_list_lock back to static key")
Suggested-by: Taehee Yoo <ap420073@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes sparse warning:
Function parameter or member 'pbuflen' not described in 'packing'
Fixes: 554aae3500 ("lib: Add support for generic packing operations")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
in mic_pre_enable, pm_runtime_get_sync is called which
increments the counter even in case of failure, leading to incorrect
ref count. In case of failure, decrement the ref count before returning.
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Propagate the proper error codes from the called functions instead of
unconditionally returning 0.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Merge conflict so merged it manually.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
platform_get_irq() will call dev_err() itself on failure,
so there is no need for the driver to also do this.
This is detected by coccinelle.
Signed-off-by: Tamseel Shams <m.shams@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Pull ARM OMAP fixes from Arnd Bergmann:
"The OMAP developers are particularly active at hunting down
regressions, so this is a separate branch with OMAP specific
fixes for v5.8:
As Tony explains
"The recent display subsystem (DSS) related platform data changes
caused display related regressions for suspend and resume. Looks
like I only tested suspend and resume before dropping the legacy
platform data, and forgot to test it after dropping it. Turns out
the main issue was that we no longer have platform code calling
pm_runtime_suspend for DSS like we did for the legacy platform data
case, and that fix is still being discussed on the dri-devel list
and will get merged separately. The DSS related testing exposed a
pile other other display related issues that also need fixing
though":
- Fix ti-sysc optional clock handling and reset status checks for
devices that reset automatically in idle like DSS
- Ignore ti-sysc clockactivity bit unless separately requested to
avoid unexpected performance issues
- Init ti-sysc framedonetv_irq to true and disable for am4
- Avoid duplicate DSS reset for legacy mode with dts data
- Remove LCD timings for am4 as they cause warnings now that we're
using generic panels
Other OMAP changes from Tony include:
- Fix omap_prm reset deassert as we still have drivers setting the
pm_runtime_irq_safe() flag
- Flush posted write for ti-sysc enable and disable
- Fix droid4 spi related errors with spi flags
- Fix am335x USB range and a typo for softreset
- Fix dra7 timer nodes for clocks for IPU and DSP
- Drop duplicate mailboxes after mismerge for dra7
- Prevent pocketgeagle header line signal from accidentally setting
micro-SD write protection signal by removing the default mux
- Fix NFSroot flakeyness after resume for duover by switching the
smsc911x gpio interrupt to back to level sensitive
- Fix regression for omap4 clockevent source after recent system
timer changes
- Yet another ethernet regression fix for the "rgmii" vs "rgmii-rxid"
phy-mode
- One patch to convert am3/am4 DT files to use the regular sdhci-omap
driver instead of the old hsmmc driver, this was meant for the
merge window but got lost in the process"
* tag 'arm-omap-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (21 commits)
ARM: dts: am5729: beaglebone-ai: fix rgmii phy-mode
ARM: dts: Fix omap4 system timer source clocks
ARM: dts: Fix duovero smsc interrupt for suspend
ARM: dts: am335x-pocketbeagle: Fix mmc0 Write Protect
Revert "bus: ti-sysc: Increase max softreset wait"
ARM: dts: am437x-epos-evm: remove lcd timings
ARM: dts: am437x-gp-evm: remove lcd timings
ARM: dts: am437x-sk-evm: remove lcd timings
ARM: dts: dra7-evm-common: Fix duplicate mailbox nodes
ARM: dts: dra7: Fix timer nodes properly for timer_sys_ck clocks
ARM: dts: Fix am33xx.dtsi ti,sysc-mask wrong softreset flag
ARM: dts: Fix am33xx.dtsi USB ranges length
bus: ti-sysc: Increase max softreset wait
ARM: OMAP2+: Fix legacy mode dss_reset
bus: ti-sysc: Fix uninitialized framedonetv_irq
bus: ti-sysc: Ignore clockactivity unless specified as a quirk
bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit
ARM: dts: omap4-droid4: Fix spi configuration and increase rate
bus: ti-sysc: Flush posted write on enable and disable
soc: ti: omap-prm: use atomic iopoll instead of sleeping one
...
Pull ARM SoC fixes from Arnd Bergmann:
"Here are a couple of bug fixes, mostly for devicetree files
NXP i.MX:
- Use correct voltage on some i.MX8M board device trees to avoid
hardware damage
- Code fixes for a compiler warning and incorrect reference counting,
both harmless.
- Fix the i.MX8M SoC driver to correctly identify imx8mp
- Fix watchdog configuration in imx6ul-kontron device tree.
Broadcom:
- A small regression fix for the Raspberry-Pi firmware driver
- A Kconfig change to use the correct timer driver on Northstar
- A DT fix for the Luxul XWC-2000 machine
- Two more DT fixes for NSP SoCs
STmicroelectronics STI
- Revert one broken patch for L2 cache configuration
ARM Versatile Express:
- Fix a regression by reverting a broken DT cleanup
TEE drivers:
- MAINTAINERS: change tee mailing list"
* tag 'arm-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
Revert "ARM: sti: Implement dummy L2 cache's write_sec"
soc: imx8m: fix build warning
ARM: imx6: add missing put_device() call in imx6q_suspend_init()
ARM: imx5: add missing put_device() call in imx_suspend_alloc_ocram()
soc: imx8m: Correct i.MX8MP UID fuse offset
ARM: dts: imx6ul-kontron: Change WDOG_ANY signal from push-pull to open-drain
ARM: dts: imx6ul-kontron: Move watchdog from Kontron i.MX6UL/ULL board to SoM
arm64: dts: imx8mm-beacon: Fix voltages on LDO1 and LDO2
arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range
arm64: dts: imx8mm-evk: correct ldo1/ldo2 voltage range
ARM: dts: NSP: Correct FA2 mailbox node
ARM: bcm2835: Fix integer overflow in rpi_firmware_print_firmware_revision()
MAINTAINERS: change tee mailing list
ARM: dts: NSP: Disable PL330 by default, add dma-coherent property
ARM: bcm: Select ARM_TIMER_SP804 for ARCH_BCM_NSP
ARM: dts: BCM5301X: Add missing memory "device_type" for Luxul XWC-2000
arm: dts: vexpress: Move mcc node back into motherboard node
Pull timer fix from Ingo Molnar:
"A single DocBook fix"
* tag 'timers-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Fix kerneldoc system_device_crosststamp & al
Pull perf fix from Ingo Molnar:
"A single Kbuild dependency fix"
* tag 'perf-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/rapl: Fix RAPL config variable bug
Pull EFI fixes from Ingo Molnar:
- Fix build regression on v4.8 and older
- Robustness fix for TPM log parsing code
- kobject refcount fix for the ESRT parsing code
- Two efivarfs fixes to make it behave more like an ordinary file
system
- Style fixup for zero length arrays
- Fix a regression in path separator handling in the initrd loader
- Fix a missing prototype warning
- Add some kerneldoc headers for newly introduced stub routines
- Allow support for SSDT overrides via EFI variables to be disabled
- Report CPU mode and MMU state upon entry for 32-bit ARM
- Use the correct stack pointer alignment when entering from mixed mode
* tag 'efi-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub: arm: Print CPU boot mode and MMU state at boot
efi/libstub: arm: Omit arch specific config table matching array on arm64
efi/x86: Setup stack correctly for efi_pe_entry
efi: Make it possible to disable efivar_ssdt entirely
efi/libstub: Descriptions for stub helper functions
efi/libstub: Fix path separator regression
efi/libstub: Fix missing-prototype warning for skip_spaces()
efi: Replace zero-length array and use struct_size() helper
efivarfs: Don't return -EINTR when rate-limiting reads
efivarfs: Update inode modification time for successful writes
efi/esrt: Fix reference count leak in esre_create_sysfs_entry.
efi/tpm: Verify event log header before parsing
efi/x86: Fix build with gcc 4
Pull scheduler fixes from Borislav Petkov:
"The most anticipated fix in this pull request is probably the horrible
build fix for the RANDSTRUCT fail that didn't make -rc2. Also included
is the cleanup that removes those BUILD_BUG_ON()s and replaces it with
ugly unions.
Also included is the try_to_wake_up() race fix that was first
triggered by Paul's RCU-torture runs, but was independently hit by
Dave Chinner's fstest runs as well"
* tag 'sched_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/cfs: change initial value of runnable_avg
smp, irq_work: Continue smp_call_function*() and irq_work*() integration
sched/core: s/WF_ON_RQ/WQ_ON_CPU/
sched/core: Fix ttwu() race
sched/core: Fix PI boosting between RT and DEADLINE tasks
sched/deadline: Initialize ->dl_boosted
sched/core: Check cpus_mask, not cpus_ptr in __set_cpus_allowed_ptr(), to fix mask corruption
sched/core: Fix CONFIG_GCC_PLUGIN_RANDSTRUCT build fail
Pull x86 fixes from Borislav Petkov:
- AMD Memory bandwidth counter width fix, by Babu Moger.
- Use the proper length type in the 32-bit truncate() syscall variant,
by Jiri Slaby.
- Reinit IA32_FEAT_CTL during wakeup to fix the case where after
resume, VMXON would #GP due to VMX not being properly enabled, by
Sean Christopherson.
- Fix a static checker warning in the resctrl code, by Dan Carpenter.
- Add a CR4 pinning mask for bits which cannot change after boot, by
Kees Cook.
- Align the start of the loop of __clear_user() to 16 bytes, to improve
performance on AMD zen1 and zen2 microarchitectures, by Matt Fleming.
* tag 'x86_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm/64: Align start of __clear_user() loop to 16-bytes
x86/cpu: Use pinning mask for CR4 bits needing to be 0
x86/resctrl: Fix a NULL vs IS_ERR() static checker warning in rdt_cdp_peer_get()
x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup
syscalls: Fix offset type of ksys_ftruncate()
x86/resctrl: Fix memory bandwidth counter width for AMD
Pull RCU-vs-KCSAN fixes from Borislav Petkov:
"A single commit that uses "arch_" atomic operations to avoid the
instrumentation that comes with the non-"arch_" versions.
In preparation for that commit, it also has another commit that makes
these "arch_" atomic operations available to generic code.
Without these commits, KCSAN uses can see pointless errors"
* tag 'rcu_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Fixup noinstr warnings
locking/atomics: Provide the arch_atomic_ interface to generic code
Pull objtool fixes from Borislav Petkov:
"Three fixes from Peter Zijlstra suppressing KCOV instrumentation in
noinstr sections.
Peter Zijlstra says:
"Address KCOV vs noinstr. There is no function attribute to
selectively suppress KCOV instrumentation, instead teach objtool
to NOP out the calls in noinstr functions"
This cures a bunch of KCOV crashes (as used by syzcaller)"
* tag 'objtool_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix noinstr vs KCOV
objtool: Provide elf_write_{insn,reloc}()
objtool: Clean up elf_write() condition
Pull x86 entry fixes from Borislav Petkov:
"This is the x86/entry urgent pile which has accumulated since the
merge window.
It is not the smallest but considering the almost complete entry core
rewrite, the amount of fixes to follow is somewhat higher than usual,
which is to be expected.
Peter Zijlstra says:
'These patches address a number of instrumentation issues that were
found after the x86/entry overhaul. When combined with rcu/urgent
and objtool/urgent, these patches make UBSAN/KASAN/KCSAN happy
again.
Part of making this all work is bumping the minimum GCC version for
KASAN builds to gcc-8.3, the reason for this is that the
__no_sanitize_address function attribute is broken in GCC releases
before that.
No known GCC version has a working __no_sanitize_undefined, however
because the only noinstr violation that results from this happens
when an UB is found, we treat it like WARN. That is, we allow it to
violate the noinstr rules in order to get the warning out'"
* tag 'x86_entry_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry: Fix #UD vs WARN more
x86/entry: Increase entry_stack size to a full page
x86/entry: Fixup bad_iret vs noinstr
objtool: Don't consider vmlinux a C-file
kasan: Fix required compiler version
compiler_attributes.h: Support no_sanitize_undefined check with GCC 4
x86/entry, bug: Comment the instrumentation_begin() usage for WARN()
x86/entry, ubsan, objtool: Whitelist __ubsan_handle_*()
x86/entry, cpumask: Provide non-instrumented variant of cpu_is_offline()
compiler_types.h: Add __no_sanitize_{address,undefined} to noinstr
kasan: Bump required compiler version
x86, kcsan: Add __no_kcsan to noinstr
kcsan: Remove __no_kcsan_or_inline
x86, kcsan: Remove __no_kcsan_or_inline usage
John Fastabend says:
====================
Fix a splat introduced by recent changes to avoid skipping ingress policy
when kTLS is enabled. The RCU splat was introduced because in the non-TLS
case the caller is wrapped in an rcu_read_lock/unlock. But, in the TLS
case we have a reference to the psock and the caller did not wrap its
call in rcu_read_lock/unlock.
To fix extend the RCU section to include the redirect case which was
missed. From v1->v2 I changed the location a bit to simplify the code
some. See patch 1.
But, then Martin asked why it was not needed in the non-TLS case. The
answer for patch 1 was, as stated above, because the caller has the
rcu read lock. However, there was still a missing case where a BPF
user could in-theory line up a set of parameters to hit a case
where the code was entered from strparser side from a different context
then the initial caller. To hit this user would need a parser program
to return value greater than skb->len then an ENOMEM error could happen
in the strparser codepath triggering strparser to retry from a workqueue
and without rcu_read_lock original caller used. See patch 2 for details.
Finally, we don't actually have any selftests for parser returning a
value geater than skb->len so add one in patch 3. This is especially
needed because at least I don't have any code that uses the parser
to return value greater than skb->len. So I wouldn't have caught any
errors here in my own testing.
Thanks, John
v1->v2: simplify code in patch 1 some and add patches 2 and 3.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
If an ingress verdict program specifies message sizes greater than
skb->len and there is an ENOMEM error due to memory pressure we
may call the rcv_msg handler outside the strp_data_ready() caller
context. This is because on an ENOMEM error the strparser will
retry from a workqueue. The caller currently protects the use of
psock by calling the strp_data_ready() inside a rcu_read_lock/unlock
block.
But, in above workqueue error case the psock is accessed outside
the read_lock/unlock block of the caller. So instead of using
psock directly we must do a look up against the sk again to
ensure the psock is available.
There is an an ugly piece here where we must handle
the case where we paused the strp and removed the psock. On
psock removal we first pause the strparser and then remove
the psock. If the strparser is paused while an skb is
scheduled on the workqueue the skb will be dropped on the
flow and kfree_skb() is called. If the workqueue manages
to get called before we pause the strparser but runs the rcvmsg
callback after the psock is removed we will hit the unlikely
case where we run the sockmap rcvmsg handler but do not have
a psock. For now we will follow strparser logic and drop the
skb on the floor with skb_kfree(). This is ugly because the
data is dropped. To date this has not caused problems in practice
because either the application controlling the sockmap is
coordinating with the datapath so that skbs are "flushed"
before removal or we simply wait for the sock to be closed before
removing it.
This patch fixes the describe RCU bug and dropping the skb doesn't
make things worse. Future patches will improve this by allowing
the normal case where skbs are not merged to skip the strparser
altogether. In practice many (most?) use cases have no need to
merge skbs so its both a code complexity hit as seen above and
a performance issue. For example, in the Cilium case we always
set the strparser up to return sbks 1:1 without any merging and
have avoided above issues.
Fixes: e91de6afa8 ("bpf: Fix running sk_skb program types with ktls")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/159312679888.18340.15248924071966273998.stgit@john-XPS-13-9370
There are two paths to generate the below RCU splat the first and
most obvious is the result of the BPF verdict program issuing a
redirect on a TLS socket (This is the splat shown below). Unlike
the non-TLS case the caller of the *strp_read() hooks does not
wrap the call in a rcu_read_lock/unlock. Then if the BPF program
issues a redirect action we hit the RCU splat.
However, in the non-TLS socket case the splat appears to be
relatively rare, because the skmsg caller into the strp_data_ready()
is wrapped in a rcu_read_lock/unlock. Shown here,
static void sk_psock_strp_data_ready(struct sock *sk)
{
struct sk_psock *psock;
rcu_read_lock();
psock = sk_psock(sk);
if (likely(psock)) {
if (tls_sw_has_ctx_rx(sk)) {
psock->parser.saved_data_ready(sk);
} else {
write_lock_bh(&sk->sk_callback_lock);
strp_data_ready(&psock->parser.strp);
write_unlock_bh(&sk->sk_callback_lock);
}
}
rcu_read_unlock();
}
If the above was the only way to run the verdict program we
would be safe. But, there is a case where the strparser may throw an
ENOMEM error while parsing the skb. This is a result of a failed
skb_clone, or alloc_skb_for_msg while building a new merged skb when
the msg length needed spans multiple skbs. This will in turn put the
skb on the strp_wrk workqueue in the strparser code. The skb will
later be dequeued and verdict programs run, but now from a
different context without the rcu_read_lock()/unlock() critical
section in sk_psock_strp_data_ready() shown above. In practice
I have not seen this yet, because as far as I know most users of the
verdict programs are also only working on single skbs. In this case no
merge happens which could trigger the above ENOMEM errors. In addition
the system would need to be under memory pressure. For example, we
can't hit the above case in selftests because we missed having tests
to merge skbs. (Added in later patch)
To fix the below splat extend the rcu_read_lock/unnlock block to
include the call to sk_psock_tls_verdict_apply(). This will fix both
TLS redirect case and non-TLS redirect+error case. Also remove
psock from the sk_psock_tls_verdict_apply() function signature its
not used there.
[ 1095.937597] WARNING: suspicious RCU usage
[ 1095.940964] 5.7.0-rc7-02911-g463bac5f1ca79 #1 Tainted: G W
[ 1095.944363] -----------------------------
[ 1095.947384] include/linux/skmsg.h:284 suspicious rcu_dereference_check() usage!
[ 1095.950866]
[ 1095.950866] other info that might help us debug this:
[ 1095.950866]
[ 1095.957146]
[ 1095.957146] rcu_scheduler_active = 2, debug_locks = 1
[ 1095.961482] 1 lock held by test_sockmap/15970:
[ 1095.964501] #0: ffff9ea6b25de660 (sk_lock-AF_INET){+.+.}-{0:0}, at: tls_sw_recvmsg+0x13a/0x840 [tls]
[ 1095.968568]
[ 1095.968568] stack backtrace:
[ 1095.975001] CPU: 1 PID: 15970 Comm: test_sockmap Tainted: G W 5.7.0-rc7-02911-g463bac5f1ca79 #1
[ 1095.977883] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 1095.980519] Call Trace:
[ 1095.982191] dump_stack+0x8f/0xd0
[ 1095.984040] sk_psock_skb_redirect+0xa6/0xf0
[ 1095.986073] sk_psock_tls_strp_read+0x1d8/0x250
[ 1095.988095] tls_sw_recvmsg+0x714/0x840 [tls]
v2: Improve commit message to identify non-TLS redirect plus error case
condition as well as more common TLS case. In the process I decided
doing the rcu_read_unlock followed by the lock/unlock inside branches
was unnecessarily complex. We can just extend the current rcu block
and get the same effeective without the shuffling and branching.
Thanks Martin!
Fixes: e91de6afa8 ("bpf: Fix running sk_skb program types with ktls")
Reported-by: Jakub Sitnicki <jakub@cloudflare.com>
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/159312677907.18340.11064813152758406626.stgit@john-XPS-13-9370
Some performance regression on reaim benchmark have been raised with
commit 070f5e860e ("sched/fair: Take into account runnable_avg to classify group")
The problem comes from the init value of runnable_avg which is initialized
with max value. This can be a problem if the newly forked task is finally
a short task because the group of CPUs is wrongly set to overloaded and
tasks are pulled less agressively.
Set initial value of runnable_avg equals to util_avg to reflect that there
is no waiting time so far.
Fixes: 070f5e860e ("sched/fair: Take into account runnable_avg to classify group")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200624154422.29166-1-vincent.guittot@linaro.org
Paul reported rcutorture occasionally hitting a NULL deref:
sched_ttwu_pending()
ttwu_do_wakeup()
check_preempt_curr() := check_preempt_wakeup()
find_matching_se()
is_same_group()
if (se->cfs_rq == pse->cfs_rq) <-- *BOOM*
Debugging showed that this only appears to happen when we take the new
code-path from commit:
2ebb177175 ("sched/core: Offload wakee task activation if it the wakee is descheduling")
and only when @cpu == smp_processor_id(). Something which should not
be possible, because p->on_cpu can only be true for remote tasks.
Similarly, without the new code-path from commit:
c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
this would've unconditionally hit:
smp_cond_load_acquire(&p->on_cpu, !VAL);
and if: 'cpu == smp_processor_id() && p->on_cpu' is possible, this
would result in an instant live-lock (with IRQs disabled), something
that hasn't been reported.
The NULL deref can be explained however if the task_cpu(p) load at the
beginning of try_to_wake_up() returns an old value, and this old value
happens to be smp_processor_id(). Further assume that the p->on_cpu
load accurately returns 1, it really is still running, just not here.
Then, when we enqueue the task locally, we can crash in exactly the
observed manner because p->se.cfs_rq != rq->cfs_rq, because p's cfs_rq
is from the wrong CPU, therefore we'll iterate into the non-existant
parents and NULL deref.
The closest semi-plausible scenario I've managed to contrive is
somewhat elaborate (then again, actual reproduction takes many CPU
hours of rcutorture, so it can't be anything obvious):
X->cpu = 1
rq(1)->curr = X
CPU0 CPU1 CPU2
// switch away from X
LOCK rq(1)->lock
smp_mb__after_spinlock
dequeue_task(X)
X->on_rq = 9
switch_to(Z)
X->on_cpu = 0
UNLOCK rq(1)->lock
// migrate X to cpu 0
LOCK rq(1)->lock
dequeue_task(X)
set_task_cpu(X, 0)
X->cpu = 0
UNLOCK rq(1)->lock
LOCK rq(0)->lock
enqueue_task(X)
X->on_rq = 1
UNLOCK rq(0)->lock
// switch to X
LOCK rq(0)->lock
smp_mb__after_spinlock
switch_to(X)
X->on_cpu = 1
UNLOCK rq(0)->lock
// X goes sleep
X->state = TASK_UNINTERRUPTIBLE
smp_mb(); // wake X
ttwu()
LOCK X->pi_lock
smp_mb__after_spinlock
if (p->state)
cpu = X->cpu; // =? 1
smp_rmb()
// X calls schedule()
LOCK rq(0)->lock
smp_mb__after_spinlock
dequeue_task(X)
X->on_rq = 0
if (p->on_rq)
smp_rmb();
if (p->on_cpu && ttwu_queue_wakelist(..)) [*]
smp_cond_load_acquire(&p->on_cpu, !VAL)
cpu = select_task_rq(X, X->wake_cpu, ...)
if (X->cpu != cpu)
switch_to(Y)
X->on_cpu = 0
UNLOCK rq(0)->lock
However I'm having trouble convincing myself that's actually possible
on x86_64 -- after all, every LOCK implies an smp_mb() there, so if ttwu
observes ->state != RUNNING, it must also observe ->cpu != 1.
(Most of the previous ttwu() races were found on very large PowerPC)
Nevertheless, this fully explains the observed failure case.
Fix it by ordering the task_cpu(p) load after the p->on_cpu load,
which is easy since nothing actually uses @cpu before this.
Fixes: c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200622125649.GC576871@hirez.programming.kicks-ass.net
syzbot reported the following warning:
WARNING: CPU: 1 PID: 6351 at kernel/sched/deadline.c:628
enqueue_task_dl+0x22da/0x38a0 kernel/sched/deadline.c:1504
At deadline.c:628 we have:
623 static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se)
624 {
625 struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
626 struct rq *rq = rq_of_dl_rq(dl_rq);
627
628 WARN_ON(dl_se->dl_boosted);
629 WARN_ON(dl_time_before(rq_clock(rq), dl_se->deadline));
[...]
}
Which means that setup_new_dl_entity() has been called on a task
currently boosted. This shouldn't happen though, as setup_new_dl_entity()
is only called when the 'dynamic' deadline of the new entity
is in the past w.r.t. rq_clock and boosted tasks shouldn't verify this
condition.
Digging through the PI code I noticed that what above might in fact happen
if an RT tasks blocks on an rt_mutex hold by a DEADLINE task. In the
first branch of boosting conditions we check only if a pi_task 'dynamic'
deadline is earlier than mutex holder's and in this case we set mutex
holder to be dl_boosted. However, since RT 'dynamic' deadlines are only
initialized if such tasks get boosted at some point (or if they become
DEADLINE of course), in general RT 'dynamic' deadlines are usually equal
to 0 and this verifies the aforementioned condition.
Fix it by checking that the potential donor task is actually (even if
temporary because in turn boosted) running at DEADLINE priority before
using its 'dynamic' deadline value.
Fixes: 2d3d891d33 ("sched/deadline: Add SCHED_DEADLINE inheritance logic")
Reported-by: syzbot+119ba87189432ead09b4@syzkaller.appspotmail.com
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Tested-by: Daniel Wagner <dwagner@suse.de>
Link: https://lkml.kernel.org/r/20181119153201.GB2119@localhost.localdomain
syzbot reported the following warning triggered via SYSC_sched_setattr():
WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 setup_new_dl_entity /kernel/sched/deadline.c:594 [inline]
WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 enqueue_dl_entity /kernel/sched/deadline.c:1370 [inline]
WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 enqueue_task_dl+0x1c17/0x2ba0 /kernel/sched/deadline.c:1441
This happens because the ->dl_boosted flag is currently not initialized by
__dl_clear_params() (unlike the other flags) and setup_new_dl_entity()
rightfully complains about it.
Initialize dl_boosted to 0.
Fixes: 2d3d891d33 ("sched/deadline: Add SCHED_DEADLINE inheritance logic")
Reported-by: syzbot+5ac8bac25f95e8b221e7@syzkaller.appspotmail.com
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Daniel Wagner <dwagner@suse.de>
Link: https://lkml.kernel.org/r/20200617072919.818409-1-juri.lelli@redhat.com
This function is concerned with the long-term CPU mask, not the
transitory mask the task might have while migrate disabled. Before
this patch, if a task was migrate-disabled at the time
__set_cpus_allowed_ptr() was called, and the new mask happened to be
equal to the CPU that the task was running on, then the mask update
would be lost.
Signed-off-by: Scott Wood <swood@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200617121742.cpxppyi7twxmpin7@linutronix.de
i.MX fixes for 5.8:
- Fix LDO1 and LDO2 voltage range for a couple of i.MX8M board device
trees.
- Fix i.MX8MP UID fuse offset in i.MX8M SoC driver.
- Fix watchdog configuration in imx6ul-kontron device tree.
- Fix one build warning seen on building soc-imx8m driver with
x86_64-randconfig.
- Add missing put_device() call for a couple of mach-imx PM functions.
* tag 'imx-fixes-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
soc: imx8m: fix build warning
ARM: imx6: add missing put_device() call in imx6q_suspend_init()
ARM: imx5: add missing put_device() call in imx_suspend_alloc_ocram()
soc: imx8m: Correct i.MX8MP UID fuse offset
ARM: dts: imx6ul-kontron: Change WDOG_ANY signal from push-pull to open-drain
ARM: dts: imx6ul-kontron: Move watchdog from Kontron i.MX6UL/ULL board to SoM
arm64: dts: imx8mm-beacon: Fix voltages on LDO1 and LDO2
arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range
arm64: dts: imx8mm-evk: correct ldo1/ldo2 voltage range
Link: https://lore.kernel.org/r/20200624111725.GA24312@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This pull request contains Broadcom ARM/ARM64/MIPS SoCs drivers fixes
for 5.8, please pull the following:
- Andy provides a fix for the Raspberry Pi firmware driver to print the
correct time upon boot. This is a fallout from a converstion to use
the ptT format
* tag 'arm-soc/for-5.8/drivers-fixes' of https://github.com/Broadcom/stblinux:
ARM: bcm2835: Fix integer overflow in rpi_firmware_print_firmware_revision()
Link: https://lore.kernel.org/r/20200619202250.19029-2-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
5.8, please pull the following:
- Rafal adds a missing 'device_type' property to the Luxul XWC-2000
required for the memory nodes to be correctly parsed by Linux
- Matthew provides two fixes for the NSP SoCs, one to disable the PL330
DMA controller by default since it can be left in reset by the
bootloader and the second to correct the flow accelerator mailbox node
* tag 'arm-soc/for-5.8/devicetree-fixes' of https://github.com/Broadcom/stblinux:
ARM: dts: NSP: Correct FA2 mailbox node
ARM: dts: NSP: Disable PL330 by default, add dma-coherent property
ARM: dts: BCM5301X: Add missing memory "device_type" for Luxul XWC-2000
Link: https://lore.kernel.org/r/20200619202250.19029-1-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This reverts commit 7b8e0188fa.
Initially, STiH410-B2260 was supposed to be secured, that's why
l2c_write_sec was stubbed to avoid secure register access from
non secure world.
But by default, STiH410-B2260 is running in non secure mode,
so L2 cache register accesses are authorized, l2c_write_sec stub
is not needed.
With this patch, L2 cache is configured and performance are enhanced.
Link: https://lore.kernel.org/r/20200618172456.29475-1-patrice.chotard@st.com
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Cc: Alain Volmat <alain.volmat@st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Few dts fixes for omaps for v5.8
Few fixes for various devices:
- Prevent pocketgeagle header line signal from accidentally setting
micro-SD write protection signal by removing the default mux
- Fix NFSroot flakeyness after resume for duover by switching the
smsc911x gpio interrupt to back to level sensitive
- Fix regression for omap4 clockevent source after recent system
timer changes
- Yet another ethernet regression fix for the "rgmii" vs "rgmii-rxid"
phy-mode
* tag 'omap-for-v5.8/fixes-rc1-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am5729: beaglebone-ai: fix rgmii phy-mode
ARM: dts: Fix omap4 system timer source clocks
ARM: dts: Fix duovero smsc interrupt for suspend
ARM: dts: am335x-pocketbeagle: Fix mmc0 Write Protect
Link: https://lore.kernel.org/r/pull-1592499282-121092@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Missed sdhci patch for am3 and am4
I forgot to send a pull request earlier for converting am3 and am4 to
use sdhci-omap driver instead of the old omap_hsmmc driver.
There was a display subsystem related suspend and resume regression found
recently and looks like I forgot to send a pull request for this patch
while debugging the regression. This patch has been tested without the
display subsystem, and has been in Linux next for several weeks now, so
would be good to have merged for v5.8.
* tag 'omap-for-v5.8/dt-missed-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: Move am33xx and am43xx mmc nodes to sdhci-omap driver
Link: https://lore.kernel.org/r/pull-1591637467-607254@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes for omaps for v5.8
The recent display subsystem (DSS) related platform data changes caused
display related regressions for suspend and resume. Looks like I only
tested suspend and resume before dropping the legacy platform data, and
forgot to test it after dropping it. Turns out the main issue was that
we no longer have platform code calling pm_runtime_suspend for DSS like
we did for the legacy platform data case, and that fix is still being
discussed on the dri-devel list and will get merged separately. The DSS
related testing exposed a pile other other display related issues that
also need fixing though:
- Fix ti-sysc optional clock handling and reset status checks
for devices that reset automatically in idle like DSS
- Ignore ti-sysc clockactivity bit unless separately requested
to avoid unexpected performance issues
- Init ti-sysc framedonetv_irq to true and disable for am4
- Avoid duplicate DSS reset for legacy mode with dts data
- Remove LCD timings for am4 as they cause warnings now that we're
using generic panels
Then there is a pile of other fixes not related to the DSS:
- Fix omap_prm reset deassert as we still have drivers setting the
pm_runtime_irq_safe() flag
- Flush posted write for ti-sysc enable and disable
- Fix droid4 spi related errors with spi flags
- Fix am335x USB range and a typo for softreset
- Fix dra7 timer nodes for clocks for IPU and DSP
- Drop duplicate mailboxes after mismerge for dra7
* tag 'omap-for-v5.8/fixes-merge-window-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
Revert "bus: ti-sysc: Increase max softreset wait"
ARM: dts: am437x-epos-evm: remove lcd timings
ARM: dts: am437x-gp-evm: remove lcd timings
ARM: dts: am437x-sk-evm: remove lcd timings
ARM: dts: dra7-evm-common: Fix duplicate mailbox nodes
ARM: dts: dra7: Fix timer nodes properly for timer_sys_ck clocks
ARM: dts: Fix am33xx.dtsi ti,sysc-mask wrong softreset flag
ARM: dts: Fix am33xx.dtsi USB ranges length
bus: ti-sysc: Increase max softreset wait
ARM: OMAP2+: Fix legacy mode dss_reset
bus: ti-sysc: Fix uninitialized framedonetv_irq
bus: ti-sysc: Ignore clockactivity unless specified as a quirk
bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit
ARM: dts: omap4-droid4: Fix spi configuration and increase rate
bus: ti-sysc: Flush posted write on enable and disable
soc: ti: omap-prm: use atomic iopoll instead of sleeping one
Link: https://lore.kernel.org/r/pull-1591889257-410830@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The cell name stored in the afs_cell struct is a 64-char + NUL buffer -
when it needs to be able to handle up to AFS_MAXCELLNAME (256 chars) + NUL.
Fix this by changing the array to a pointer and allocating the string.
Found using Coverity.
Fixes: 989782dcdc ("afs: Overhaul cell database management")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit cd238effef ("docs: kbuild: convert docs to ReST and rename to
*.rst") missed a ReST header and a verbatim file content area.
Signed-off-by: Dov Murik <dovmurik@linux.vnet.ibm.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
We can't cast sk_buff to rtable by (struct rtable *)hint. Use skb_rtable().
Fixes: 02b2494161 ("ipv4: use dst hint for ipv4 list receive")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
intel-pinctrl for v5.8-2
* Fix output pin value handling on Intel Baytrail
The following is an automated git shortlog grouped by driver:
baytrail:
- Fix pin being driven low for a while on gpiod_get(..., GPIOD_OUT_HIGH)
Pull cifs fixes from Steve French:
"Six cifs/smb3 fixes, three of them for stable.
Fixes xfstests 451, 313 and 316"
* tag '5.8-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: misc: Use array_size() in if-statement controlling expression
cifs: update ctime and mtime during truncate
cifs/smb3: Fix data inconsistent when punch hole
cifs/smb3: Fix data inconsistent when zero file range
cifs: Fix double add page to memcg when cifs_readpages
cifs: Fix cached_fid refcnt leak in open_shroot
Pull SCSI fixes from James Bottomley:
"Six small fixes, five in drivers and one to correct another minor
regression from cc97923a5b ("block: move dma drain handling to
scsi") where we still need the drain stub to be built in to the kernel
for the modular libata, non-modular SAS driver case"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mptscsih: Fix read sense data size
scsi: zfcp: Fix panic on ERP timeout for previously dismissed ERP action
scsi: lpfc: Avoid another null dereference in lpfc_sli4_hba_unset()
scsi: libata: Fix the ata_scsi_dma_need_drain stub
scsi: qla2xxx: Keep initiator ports after RSCN
scsi: qla2xxx: Set NVMe status code for failed NVMe FCP request
Pull VFIO fixes from Alex Williamson:
- Fix double free of eventfd ctx (Alex Williamson)
- Fix duplicate use of capability ID (Alex Williamson)
- Fix SR-IOV VF memory enable handling (Alex Williamson)
* tag 'vfio-v5.8-rc3' of git://github.com/awilliam/linux-vfio:
vfio/pci: Fix SR-IOV VF handling with MMIO blocking
vfio/type1: Fix migration info capability ID
vfio/pci: Clear error and request eventfd ctx after releasing
Pull i2c fixes from Wolfram Sang:
"This contains a 5.8 regression fix for the Designware driver, a
register bitfield fix for the fsi driver, and a missing sanity check
for the I2C core"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: core: check returned size of emulated smbus block read
i2c: fsi: Fix the port number field in status register
i2c: designware: Adjust bus speed independently of ACPI
Pull staging driver fixes from Greg KH:
"Here are a small number of tiny staging driver fixes for 5.8-rc3.
Not much here, but there were some reported problems to be fixed:
- three wfx driver fixes
- rtl8723bs driver fix
All of these have been in linux-next with no reported issues"
* tag 'staging-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Staging: rtl8723bs: prevent buffer overflow in update_sta_support_rate()
staging: wfx: fix coherency of hif_scan() prototype
staging: wfx: drop useless loop
staging: wfx: fix AC priority
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 5.8-rc3 to resolve some reported
issues.
Nothing major here:
- gadget driver fixes
- cdns3 driver fixes
- xhci fixes
- renesas_usbhs driver fixes
- some new device support with ids
- documentation update
All of these have been in linux-next with no reported issues"
* tag 'usb-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (27 commits)
usb: renesas_usbhs: getting residue from callback_result
Revert "usb: dwc3: exynos: Add support for Exynos5422 suspend clk"
xhci: Poll for U0 after disabling USB2 LPM
xhci: Return if xHCI doesn't support LPM
usb: host: xhci-mtk: avoid runtime suspend when removing hcd
xhci: Fix enumeration issue when setting max packet size for FS devices.
xhci: Fix incorrect EP_STATE_MASK
usb: cdns3: ep0: add spinlock for cdns3_check_new_setup
usb: cdns3: trace: using correct dir value
usb: cdns3: ep0: fix the test mode set incorrectly
Revert "usb: dwc3: exynos: Add support for Exynos5422 suspend clk"
usb: gadget: udc: Potential Oops in error handling code
usb: phy: tegra: Fix unnecessary check in tegra_usb_phy_probe()
usb: dwc3: pci: Fix reference count leak in dwc3_pci_resume_work
usb: cdns3: ep0: add spinlock for cdns3_check_new_setup
usb: cdns3: trace: using correct dir value
usb: cdns3: ep0: fix the test mode set incorrectly
usb: typec: tcpci_rt1711h: avoid screaming irq causing boot hangs
USB: ohci-sm501: Add missed iounmap() in remove
cdc-acm: Add DISABLE_ECHO quirk for Microchip/SMSC chip
...
Pull char/misc fixes from Greg KH:
"Some tiny char/misc driver fixes for 5.8-rc3.
The "largest" changes are in the mei driver, to resolve some reported
problems and add some new device ids. There's also a binder bugfix, an
fpga driver build fix, and some assorted habanalabs fixes.
All of these, except for the habanalabs fixes, have been in linux-next
with no reported issues. The habanalabs driver changes showed up in my
tree on Friday, but as they are totally self-contained, all should be
good there"
* tag 'char-misc-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
habanalabs: increase h/w timer when checking idle
habanalabs: Correct handling when failing to enqueue CB
habanalabs: increase GAUDI QMAN ARB WDT timeout
habanalabs: rename mmu_write() to mmu_asid_va_write()
habanalabs: use PI in MMU cache invalidation
habanalabs: block scalar load_and_exe on external queue
mei: me: add tiger lake point device ids for H platforms.
mei: me: disable mei interface on Mehlow server platforms
binder: fix null deref of proc->context
fpga: zynqmp: fix modular build
Pull EDAC fix from Borislav Petkov:
"A single fix for amd64_edac restoring the reporting of the DRAM scrub
rate on family 0x15 CPUs"
* tag 'edac_urgent_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/amd64: Read back the scrub rate PCI register on F15h
Pull dma-mapping fixes from Christoph Hellwig:
- fix dma coherent mmap in nommu (me)
- more AMD SEV fallout (David Rientjes, me)
- fix alignment in dma_common_*_remap (Eric Auger)
* tag 'dma-mapping-5.8-4' of git://git.infradead.org/users/hch/dma-mapping:
dma-remap: align the size in dma_common_*_remap()
dma-mapping: DMA_COHERENT_POOL should select GENERIC_ALLOCATOR
dma-direct: add missing set_memory_decrypted() for coherent mapping
dma-direct: check return value when encrypting or decrypting memory
dma-direct: re-encrypt memory if dma_direct_alloc_pages() fails
dma-direct: always align allocation size in dma_direct_alloc_pages()
dma-direct: mark __dma_direct_alloc_pages static
dma-direct: re-enable mmap for !CONFIG_MMU
Pull NFS client bugfixes from Anna Schumaker:
"Stable Fixes:
- xprtrdma: Fix handling of RDMA_ERROR replies
- sunrpc: Fix rollback in rpc_gssd_dummy_populate()
- pNFS/flexfiles: Fix list corruption if the mirror count changes
- NFSv4: Fix CLOSE not waiting for direct IO completion
- SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment()
Other Fixes:
- xprtrdma: Fix a use-after-free with r_xprt->rx_ep
- Fix other xprtrdma races during disconnect
- NFS: Fix memory leak of export_path"
* tag 'nfs-for-5.8-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment()
NFSv4 fix CLOSE not waiting for direct IO compeletion
pNFS/flexfiles: Fix list corruption if the mirror count changes
nfs: Fix memory leak of export_path
sunrpc: fixed rollback in rpc_gssd_dummy_populate()
xprtrdma: Fix handling of RDMA_ERROR replies
xprtrdma: Clean up disconnect
xprtrdma: Clean up synopsis of rpcrdma_flush_disconnect()
xprtrdma: Use re_connect_status safely in rpcrdma_xprt_connect()
xprtrdma: Prevent dereferencing r_xprt->rx_ep after it is freed
Pull io_uring fixes from Jens Axboe:
"Three small fixes:
- Close a corner case for polled IO resubmission (Pavel)
- Toss commands when exiting (Pavel)
- Fix SQPOLL conditional reschedule on perpetually busy submit
(Xuan)"
* tag 'io_uring-5.8-2020-06-26' of git://git.kernel.dk/linux-block:
io_uring: fix current->mm NULL dereference on exit
io_uring: fix hanging iopoll in case of -EAGAIN
io_uring: fix io_sq_thread no schedule when busy
Pull block fixes from Jens Axboe:
- NVMe pull request from Christoph:
- multipath deadlock fixes (Anton)
- NUMA fixes (Max)
- RDMA completion vector fix (Max)
- IO deadlock fix (Sagi)
- multipath reference fix (Sagi)
- NS mutation fix (Sagi)
- Use right allocator when freeing bip in error path (Chengguang)
* tag 'block-5.8-2020-06-26' of git://git.kernel.dk/linux-block:
nvme-multipath: fix bogus request queue reference put
nvme-multipath: fix deadlock due to head->lock
nvme: don't protect ns mutation with ns->head->lock
nvme-multipath: fix deadlock between ana_work and scan_work
nvme: fix possible deadlock when I/O is blocked
nvme-rdma: assign completion vector correctly
nvme-loop: initialize tagset numa value to the value of the ctrl
nvme-tcp: initialize tagset numa value to the value of the ctrl
nvme-pci: initialize tagset numa value to the value of the ctrl
nvme-pci: override the value of the controller's numa node
nvme: set initial value for controller's numa node
block: release bip in a right way in error path
Pull device mapper fixes from Mike Snitzer:
- Quite a few DM zoned target fixes and a Zone append fix in DM core.
Considering the amount of dm-zoned changes that went in during the
5.8 merge window these fixes are not that surprising.
- A few DM writecache target fixes.
- A fix to Documentation index to include DM ebs target docs.
- Small cleanup to use struct_size() in DM core's retrieve_deps().
* tag 'for-5.8/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm writecache: add cond_resched to loop in persistent_memory_claim()
dm zoned: Fix reclaim zone selection
dm zoned: Fix random zone reclaim selection
dm: update original bio sector on Zone Append
dm zoned: Fix metadata zone size check
docs: device-mapper: add dm-ebs.rst to an index file
dm ioctl: use struct_size() helper in retrieve_deps()
dm writecache: skip writecache_wait when using pmem mode
dm writecache: correct uncommitted_block when discarding uncommitted entry
dm zoned: assign max_io_len correctly
dm zoned: fix uninitialized pointer dereference
Pull kgdb fixes from Daniel Thompson:
"The main change here is a fix for a number of unsafe interactions
between kdb and the console system. The fixes are specific to kdb
(pure kgdb debugging does not use the console system at all). On
systems with an NMI then kdb, if it is enabled, must get messages to
the user despite potentially running from some "difficult" calling
contexts. These fixes avoid using the console system where we have
been provided an alternative (safer) way to interact with the user
and, if using the console system in unavoidable, use oops_in_progress
for deadlock avoidance. These fixes also ensure kdb honours the
console enable flag.
Also included is a fix that wraps kgdb trap handling in an RCU read
lock to avoids triggering diagnostic warnings. This is a wide lock
scope but this is OK because kgdb is a stop-the-world debugger. When
we stop the world we put all the CPUs into holding pens and this
inhibits RCU update anyway"
* tag 'kgdb-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
kgdb: Avoid suspicious RCU usage warning
kdb: Switch to use safer dbg_io_ops over console APIs
kdb: Make kdb_printf() console handling more robust
kdb: Check status of console prior to invoking handlers
kdb: Re-factor kdb_printf() message write code
Pull powerpc fixes from Michael Ellerman:
- A fix for a crash in nested KVM when CONFIG_DEBUG_VIRTUAL=y.
- Two minor build fixes.
Thanks to: Aneesh Kumar K.V, Arseny Solokha, Harish.
* tag 'powerpc-5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Fix build failure in ebb tests
powerpc/kvm/book3s64: Fix kernel crash with nested kvm & DEBUG_VIRTUAL
powerpc/fsl_booke/32: Fix build with CONFIG_RANDOMIZE_BASE
Pull RISC-V fixes from Palmer Dabbelt:
"This contains a handful of fixes I'd like to target for rc3.
Most of them fix issues with the conversion of our vDSO to C. There is
also one fix to the SiFive PRCI driver that I picked up as it's
causing boot issues on the hardware.
- A fix to allow kernels with dynamic ftrace to use the vDSO.
- Some build fixes for the C vDSO functions.
- A fix to the PRCI driver's memory allocation, which was the cause
of some boot panics with FREELIST_RANDOM"
* tag 'riscv-for-linus-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fixup __vdso_gettimeofday broke dynamic ftrace
riscv: Add extern declarations for vDSO time-related functions
clk: sifive: allocate sufficient memory for struct __prci_data
riscv: Add -fPIC option to CFLAGS_vgettimeofday.o
Pull arm64 fixes from Will Deacon:
"The big fix here is to our vDSO sigreturn trampoline as, after a
painfully long stint of debugging, it turned out that fixing some of
our CFI directives in the merge window lit up a bunch of logic in
libgcc which has been shown to SEGV in some cases during asynchronous
pthread cancellation.
It looks like we can fix this by extending the directives to restore
most of the interrupted register state from the sigcontext, but it's
risky and hard to test so we opted to remove the CFI directives for
now and rely on the unwinder fallback path like we used to.
- Fix unwinding through vDSO sigreturn trampoline
- Fix build warnings by raising minimum LD version for PAC
- Whitelist some Kryo Cortex-A55 derivatives for Meltdown and SSB
- Fix perf register PC reporting for compat tasks
- Fix 'make clean' warning for arm64 signal selftests
- Fix ftrace when BTI is compiled in
- Avoid building the compat vDSO using GCC plugins"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Add KRYO{3,4}XX silver CPU cores to SSB safelist
arm64: perf: Report the PC value in REGS_ABI_32 mode
kselftest: arm64: Remove redundant clean target
arm64: kpti: Add KRYO{3, 4}XX silver CPU cores to kpti safelist
arm64: Don't insert a BTI instruction at inner labels
arm64: vdso: Don't use gcc plugins for building vgettimeofday.c
arm64: vdso: Only pass --no-eh-frame-hdr when linker supports it
arm64: Depend on newer binutils when building PAC
arm64: compat: Remove 32-bit sigreturn code from the vDSO
arm64: compat: Always use sigpage for sigreturn trampoline
arm64: compat: Allow 32-bit vdso and sigpage to co-exist
arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline
Commit 8e20fc3917 ("serial_core: Move sysrq functions from header
file") converted the inline sysrq helpers to exported functions which
are now called for every received character, interrupt and break signal
also on systems without CONFIG_MAGIC_SYSRQ_SERIAL instead of being
optimised away by the compiler.
Inlining these helpers again also avoids the function call overhead when
CONFIG_MAGIC_SYSRQ_SERIAL is enabled (e.g. when the port is not used as
a console).
Fixes: 8e20fc3917 ("serial_core: Move sysrq functions from header file")
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Link: https://lore.kernel.org/r/20200610152232.16925-3-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
devm_gpiod_get_index() doesn't return NULL but -ENOENT when the
requested GPIO doesn't exist, leading to the following messages:
[ 2.742468] gpiod_direction_input: invalid GPIO (errorpointer)
[ 2.748147] can't set direction for gpio #2: -2
[ 2.753081] gpiod_direction_input: invalid GPIO (errorpointer)
[ 2.758724] can't set direction for gpio #3: -2
[ 2.763666] gpiod_direction_output: invalid GPIO (errorpointer)
[ 2.769394] can't set direction for gpio #4: -2
[ 2.774341] gpiod_direction_input: invalid GPIO (errorpointer)
[ 2.779981] can't set direction for gpio #5: -2
[ 2.784545] ff000a20.serial: ttyCPM1 at MMIO 0xfff00a20 (irq = 39, base_baud = 8250000) is a CPM UART
Use devm_gpiod_get_index_optional() instead.
At the same time, handle the error case and properly exit
with an error.
Fixes: 97cbaf2c82 ("tty: serial: cpm_uart: Convert to use GPIO descriptors")
Cc: stable@vger.kernel.org
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/694a25fdce548c5ee8b060ef6a4b02746b8f25c0.1591986307.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The mpt fusion driver still uses the legacy PCI DMA API which hardcodes
atomic allocations. This caused the driver to fail to load on some powerpc
VMs with incoherent DMA and small memory sizes. Switch to use the modern
DMA API and sleeping allocations for large allocations instead. This is
not a full cleanup of the PCI DMA API usage yet, but just enough to fix the
regression caused by reducing the default atomic pool size.
Link: https://lore.kernel.org/r/20200624165724.1818496-1-hch@lst.de
Fixes: 3ee06a6d53 ("dma-pool: fix too large DMA pools on medium memory size systems")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When an rport event (RPORT_EV_READY) is updated without work being queued,
avoid taking an additional reference.
This issue was leading to memory leak. Trace from KMEMLEAK tool:
unreferenced object 0xffff8888259e8780 (size 512):
comm "kworker/2:1", jiffies 4433237386 (age 113021.971s)
hex dump (first 32 bytes):
58 0a ec cf 83 88 ff ff 00 00 00 00 00 00 00 00
01 00 00 00 08 00 00 00 13 7d f0 1e 0e 00 00 10
backtrace:
[<000000006b25760f>] fc_rport_recv_req+0x3c6/0x18f0 [libfc]
[<00000000f208d994>] fc_lport_recv_els_req+0x120/0x8a0 [libfc]
[<00000000a9c437b8>] fc_lport_recv+0xb9/0x130 [libfc]
[<00000000a9c437b8>] fc_lport_recv+0xb9/0x130 [libfc]
[<00000000ad5be37b>] qedf_ll2_process_skb+0x73d/0xad0 [qedf]
[<00000000e0eb6893>] process_one_work+0x382/0x6c0
[<000000002dfd9e21>] worker_thread+0x57/0x5c0
[<00000000b648204f>] kthread+0x1a0/0x1c0
[<0000000072f5ab20>] ret_from_fork+0x35/0x40
[<000000001d5c05d8>] 0xffffffffffffffff
Below is the log sequence which leads to memory leak. Here we get the
RPORT_EV_READY and RPORT_EV_STOP back to back, which lead to overwrite the
event RPORT_EV_READY by event RPORT_EV_STOP. Because of this, kref_count
gets incremented by 1.
kernel: host0: rport fffce5: Received PLOGI request
kernel: host0: rport fffce5: Received PLOGI in INIT state
kernel: host0: rport fffce5: Port is Ready
kernel: host0: rport fffce5: Received PRLI request while in state Ready
kernel: host0: rport fffce5: PRLI rspp type 8 active 1 passive 0
kernel: host0: rport fffce5: Received LOGO request while in state Ready
kernel: host0: rport fffce5: Delete port
kernel: host0: rport fffce5: Received PLOGI request
kernel: host0: rport fffce5: Received PLOGI in state Delete - send busy
kernel: host0: rport fffce5: work event 3
kernel: host0: rport fffce5: lld callback ev 3
kernel: host0: rport fffce5: work delete
Link: https://lore.kernel.org/r/20200626094959.32151-1-jhasan@marvell.com
Reviewed-by: Girish Basrur <gbasrur@marvell.com>
Reviewed-by: Saurav Kashyap <skashyap@marvell.com>
Reviewed-by: Shyam Sundar <ssundar@marvell.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Handling of extra kref which is done by lookup table in case rdata is
already present in list.
This issue was leading to memory leak. Trace from KMEMLEAK tool:
unreferenced object 0xffff8888259e8780 (size 512):
comm "kworker/2:1", pid 182614, jiffies 4433237386 (age 113021.971s)
hex dump (first 32 bytes):
58 0a ec cf 83 88 ff ff 00 00 00 00 00 00 00 00
01 00 00 00 08 00 00 00 13 7d f0 1e 0e 00 00 10
backtrace:
[<000000006b25760f>] fc_rport_recv_req+0x3c6/0x18f0 [libfc]
[<00000000f208d994>] fc_lport_recv_els_req+0x120/0x8a0 [libfc]
[<00000000a9c437b8>] fc_lport_recv+0xb9/0x130 [libfc]
[<00000000ad5be37b>] qedf_ll2_process_skb+0x73d/0xad0 [qedf]
[<00000000e0eb6893>] process_one_work+0x382/0x6c0
[<000000002dfd9e21>] worker_thread+0x57/0x5c0
[<00000000b648204f>] kthread+0x1a0/0x1c0
[<0000000072f5ab20>] ret_from_fork+0x35/0x40
[<000000001d5c05d8>] 0xffffffffffffffff
Below is the log sequence which leads to memory leak. Here we get the
nested "Received PLOGI request" for same port and this request leads to
call the fc_rport_create() twice for the same rport.
kernel: host1: rport fffce5: Received PLOGI request
kernel: host1: rport fffce5: Received PLOGI in INIT state
kernel: host1: rport fffce5: Port is Ready
kernel: host1: rport fffce5: Received PRLI request while in state Ready
kernel: host1: rport fffce5: PRLI rspp type 8 active 1 passive 0
kernel: host1: rport fffce5: Received LOGO request while in state Ready
kernel: host1: rport fffce5: Delete port
kernel: host1: rport fffce5: Received PLOGI request
kernel: host1: rport fffce5: Received PLOGI in state Delete - send busy
Link: https://lore.kernel.org/r/20200622101212.3922-2-jhasan@marvell.com
Reviewed-by: Girish Basrur <gbasrur@marvell.com>
Reviewed-by: Saurav Kashyap <skashyap@marvell.com>
Reviewed-by: Shyam Sundar <ssundar@marvell.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Disable the plane if it's not visible. Otherwise mtk_ovl_layer_config()
would proceed with invalid plane and we may see vblank timeout.
Fixes: 119f517362 ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.")
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
ARMv8 Juno/Vexpress/Fast Models fix for v5.8
Partial revert of some recent fixes to silence DTC warning which broke
clocks on some Vexpress platforms resulting in boot issues.
* tag 'juno-fix-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
arm: dts: vexpress: Move mcc node back into motherboard node
Link: https://lore.kernel.org/r/20200609180447.GB5732@bogus
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The rings bitmap of an interrupt vector encodes
which of the device's rings were assigned to that
interrupt vector.
Hence the iteration range of the tx rings bitmap
(for_each_set_bit()) should be the total number of
Tx rings of that netdevice instead of the number of
rings assigned to the interrupt vector.
Since there are 2 cores, and one interrupt vector for
each core, the number of rings asigned to an interrupt
vector is half the number of available rings.
The impact of this error is that the upper half of the
tx rings could still generate interrupts during napi
polling.
Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, if the kernel is configured incorrectly or if it crashes before any
kunit tests are run, kunit finishes without error, reporting
that 0 test cases were run.
To fix this, an error is shown when the tap header is not found, which
indicates that kunit was not able to run at all.
Signed-off-by: Uriel Guajardo <urielguajardo@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Commit 8b59cd81dc ("kbuild: ensure full rebuild when the compiler is
updated") introduced a new CONFIG option CONFIG_CC_VERSION_TEXT. On my
system, this is set to "gcc (GCC) 10.1.0" which breaks KUnit config
parsing which did not like the spaces in the string.
Fix this by updating the regex to allow strings containing spaces.
Fixes: 8b59cd81dc ("kbuild: ensure full rebuild when the compiler is updated")
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Pull ACPI fixes from Rafael Wysocki:
"Prevent bypassing kernel lockdown via the ACPI tables loading
interface (Jason A. Donenfeld) and fix the handling of an ACPI sysfs
attribute (Nathan Chancellor)"
* tag 'acpi-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: sysfs: Fix pm_profile_attr type
ACPI: configfs: Disallow loading ACPI tables when locked down
Pull power management fixes from Rafael Wysocki:
"These fix a recent regression that broke suspend-to-idle on some x86
systems, fix the intel_pstate driver to correctly let the platform
firmware control CPU performance in some cases and add __init
annotations to a couple of functions.
Specifics:
- Make sure that the _TIF_POLLING_NRFLAG is clear before entering the
last phase of suspend-to-idle to avoid wakeup issues on some x86
systems (Chen Yu, Rafael Wysocki).
- Cover one more case in which the intel_pstate driver should let the
platform firmware control the CPU frequency and refuse to load
(Srinivas Pandruvada).
- Add __init annotations to 2 functions in the power management core
(Christophe JAILLET)"
* tag 'pm-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: Rearrange s2idle-specific idle state entry code
PM: sleep: core: mark 2 functions as __init to save some memory
cpufreq: intel_pstate: Add one more OOB control bit
PM: s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
Pull iommu fixes from Joerg Roedel:
"A couple of Intel VT-d fixes:
- Make Intel SVM code 64bit only. The code uses pgd_t* and the IOMMU
only supports long-mode page-table formats, so its broken on 32bit
anyway.
- Make sure GFX quirks in for Intel VT-d are not applied to untrusted
devices. Those devices might gain full memory access otherwise.
- Identity mapping setup fix.
- Fix ACS enabling when Intel IOMMU is off and untrusted devices are
detected.
- Two smaller fixes for coherency and IO page-table setup"
* tag 'iommu-fixes-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Fix misuse of iommu_domain_identity_map()
iommu/vt-d: Update scalable mode paging structure coherency
iommu/vt-d: Enable PCI ACS for platform opt in hint
iommu/vt-d: Don't apply gfx quirks to untrusted devices
iommu/vt-d: Set U/S bit in first level page table by default
iommu/vt-d: Make Intel SVM code 64-bit only
Pull drm fixes from Dave Airlie:
"Usual rc3 pickup, lots of little fixes all over.
The core VT registration regression fix is probably the largest,
otherwise ttm, amdgpu and tegra are the bulk, with some minor driver
fixes.
No i915 pull this week which may or may not mean I get 2x of it next
week, we'll see how it goes.
core:
- fix VT registration regression
ttm:
- fix two fence leaks
amdgpu:
- Fix missed mutex unlock in DC error path
- Fix firmware leak for sdma5
- DC bpc property fixes
amdkfd:
- Fix memleak in an error path
radeon:
- Fix copy paste typo in NI DPM spll validation
rcar-du:
- build fix
tegra:
- add missing zpos property
- child driver registeration fix
- debugfs cleanup fix
- doc fix
mcde:
- reorder fbdev setup
panel:
- fix connector type
- fix orienation for some panels
sun4i:
- fix dma/iommu configuration
uvesafb:
- respect blank flag"
* tag 'drm-fixes-2020-06-26' of git://anongit.freedesktop.org/drm/drm: (25 commits)
drm/amd: fix potential memleak in err branch
drm/amd/display: Fix ineffective setting of max bpc property
drm/amd/display: Enable output_bpc property on all outputs
drm/amdgpu: add fw release for sdma v5_0
drm/fb-helper: Fix vt restore
drm/radeon: fix fb_div check in ni_init_smc_spll_table()
drm/amdgpu/display: Unlock mutex on error
drm/sun4i: mixer: Call of_dma_configure if there's an IOMMU
drm: panel-orientation-quirks: Use generic orientation-data for Acer S1003
drm: panel-orientation-quirks: Add quirk for Asus T101HA panel
video: fbdev: uvesafb: fix "noblank" option handling
drm/panel-simple: fix connector type for newhaven_nhd_43_480272ef_atxl
drm/panel-simple: fix connector type for LogicPD Type28 Display
drm: rcar-du: Fix build error
drm: mcde: Fix forgotten user of drm->dev_private
drm: mcde: Fix display initialization problem
drm/tegra: Add zpos property for cursor planes
gpu: host1x: Detach driver on unregister
gpu: host1x: Correct trivial kernel-doc inconsistencies
drm/tegra: hub: Register child devices
...
Let the network stack know the real number of queues that
we are using.
v2: added error checking
Fixes: 49d3b49367 ("ionic: disable the queues on link down")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
FPGA Manager fixes for 5.8-rc1
Here is one (late) fix for 5.8-rc1 merge window.
Arnd's change addresses a missing build dependency.
All patches have been reviewed on the mailing list, and have been in the
last few linux-next releases (as part of my fixes branch) without issues.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
* tag 'fpga-fixes-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga:
fpga: zynqmp: fix modular build
Oded writes:
This tag contains the following fixes for kernel 5.8-rc2:
- close security hole in GAUDI command buffer parsing by blocking an
instruction that might allow user to run command buffer that wasn't
parsed on a secured engine.
- Fix bug in GAUDI MMU cache invalidation code.
- Rename a function to resolve conflict with a static inline function in
arch/m68k/include/asm/mcfmmu.h
- Increase watchdog timeout of GAUDI QMAN arbitration H/W to prevent false
reports on timeouts
- Fix bug of dereferencing NULL pointer when an error occurs during command
submission
- Increase H/W timer for checking if PDMA engine is IDLE in GAUDI.
* tag 'misc-habanalabs-fixes-2020-06-24' of git://people.freedesktop.org/~gabbayo/linux:
habanalabs: increase h/w timer when checking idle
habanalabs: Correct handling when failing to enqueue CB
habanalabs: increase GAUDI QMAN ARB WDT timeout
habanalabs: rename mmu_write() to mmu_asid_va_write()
habanalabs: use PI in MMU cache invalidation
habanalabs: block scalar load_and_exe on external queue
Felipe writes:
usb: fixes for v5.8-rc2
A revert of Exynos5422 suspend clock support, it turns out it wasn't
ready to be merged. CDNS3 got a fix for test mode initialization.
Signed-off-by: Felipe Balbi <balbi@kernel.org>
* tag 'fixes-for-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb:
Revert "usb: dwc3: exynos: Add support for Exynos5422 suspend clk"
usb: gadget: udc: Potential Oops in error handling code
usb: phy: tegra: Fix unnecessary check in tegra_usb_phy_probe()
usb: dwc3: pci: Fix reference count leak in dwc3_pci_resume_work
usb: cdns3: ep0: add spinlock for cdns3_check_new_setup
usb: cdns3: trace: using correct dir value
usb: cdns3: ep0: fix the test mode set incorrectly
* pm-cpufreq:
cpufreq: intel_pstate: Add one more OOB control bit
* pm-cpuidle:
cpuidle: Rearrange s2idle-specific idle state entry code
PM: s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
Felipe has based his patches on that tag, so update my usb-linus branch
to it as well so that I can pull his patches in here easier.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
At times when I'm using kgdb I see a splat on my console about
suspicious RCU usage. I managed to come up with a case that could
reproduce this that looked like this:
WARNING: suspicious RCU usage
5.7.0-rc4+ #609 Not tainted
-----------------------------
kernel/pid.c:395 find_task_by_pid_ns() needs rcu_read_lock() protection!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by swapper/0/1:
#0: ffffff81b6b8e988 (&dev->mutex){....}-{3:3}, at: __device_attach+0x40/0x13c
#1: ffffffd01109e9e8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x20c/0x7ac
#2: ffffffd01109ea90 (dbg_slave_lock){....}-{2:2}, at: kgdb_cpu_enter+0x3ec/0x7ac
stack backtrace:
CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4+ #609
Hardware name: Google Cheza (rev3+) (DT)
Call trace:
dump_backtrace+0x0/0x1b8
show_stack+0x1c/0x24
dump_stack+0xd4/0x134
lockdep_rcu_suspicious+0xf0/0x100
find_task_by_pid_ns+0x5c/0x80
getthread+0x8c/0xb0
gdb_serial_stub+0x9d4/0xd04
kgdb_cpu_enter+0x284/0x7ac
kgdb_handle_exception+0x174/0x20c
kgdb_brk_fn+0x24/0x30
call_break_hook+0x6c/0x7c
brk_handler+0x20/0x5c
do_debug_exception+0x1c8/0x22c
el1_sync_handler+0x3c/0xe4
el1_sync+0x7c/0x100
rpmh_rsc_probe+0x38/0x420
platform_drv_probe+0x94/0xb4
really_probe+0x134/0x300
driver_probe_device+0x68/0x100
__device_attach_driver+0x90/0xa8
bus_for_each_drv+0x84/0xcc
__device_attach+0xb4/0x13c
device_initial_probe+0x18/0x20
bus_probe_device+0x38/0x98
device_add+0x38c/0x420
If I understand properly we should just be able to blanket kgdb under
one big RCU read lock and the problem should go away. We'll add it to
the beast-of-a-function known as kgdb_cpu_enter().
With this I no longer get any splats and things seem to work fine.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200602154729.v2.1.I70e0d4fd46d5ed2aaf0c98a355e8e1b7a5bb7e4e@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
In kgdb context, calling console handlers aren't safe due to locks used
in those handlers which could in turn lead to a deadlock. Although, using
oops_in_progress increases the chance to bypass locks in most console
handlers but it might not be sufficient enough in case a console uses
more locks (VT/TTY is good example).
Currently when a driver provides both polling I/O and a console then kdb
will output using the console. We can increase robustness by using the
currently active polling I/O driver (which should be lockless) instead
of the corresponding console. For several common cases (e.g. an
embedded system with a single serial port that is used both for console
output and debugger I/O) this will result in no console handler being
used.
In order to achieve this we need to reverse the order of preference to
use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
store "struct console" that represents debugger I/O in dbg_io_ops and
while emitting kdb messages, skip console that matches dbg_io_ops
console in order to avoid duplicate messages. After this change,
"is_console" param becomes redundant and hence removed.
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lore.kernel.org/r/1591264879-25920-5-git-send-email-sumit.garg@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
@subbuf is an output parameter of xdr_buf_subsegment(). A survey of
call sites shows that @subbuf is always uninitialized before
xdr_buf_segment() is invoked by callers.
There are some execution paths through xdr_buf_subsegment() that do
not set all of the fields in @subbuf, leaving some pointer fields
containing garbage addresses. Subsequent processing of that buffer
then results in a page fault.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Figuring out the root case for the REMOVE/CLOSE race and
suggesting the solution was done by Neil Brown.
Currently what happens is that direct IO calls hold a reference
on the open context which is decremented as an asynchronous task
in the nfs_direct_complete(). Before reference is decremented,
control is returned to the application which is free to close the
file. When close is being processed, it decrements its reference
on the open_context but since directIO still holds one, it doesn't
sent a close on the wire. It returns control to the application
which is free to do other operations. For instance, it can delete a
file. Direct IO is finally releasing its reference and triggering
an asynchronous close. Which races with the REMOVE. On the server,
REMOVE can be processed before the CLOSE, failing the REMOVE with
EACCES as the file is still opened.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Suggested-by: Neil Brown <neilb@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
If the mirror count changes in the new layout we pick up inside
ff_layout_pg_init_write(), then we can end up adding the
request to the wrong mirror and corrupting the mirror->pg_list.
Fixes: d600ad1f2b ("NFS41: pop some layoutget errors to application")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The try_location function is called within a loop by nfs_follow_referral.
try_location calls nfs4_pathname_string to created the export_path.
nfs4_pathname_string allocates the memory. export_path is stored in the
nfs_fs_context/fs_context structure similarly as hostname and source.
But whereas the ctx hostname and source are freed before assignment,
export_path is not. So if there are multiple loops, the new export_path
will overwrite the old without the old being freed.
So call kfree for export_path.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The pins on the Bay Trail SoC have separate input-buffer and output-buffer
enable bits and a read of the level bit of the value register will always
return the value from the input-buffer.
The BIOS of a device may configure a pin in output-only mode, only enabling
the output buffer, and write 1 to the level bit to drive the pin high.
This 1 written to the level bit will be stored inside the data-latch of the
output buffer.
But a subsequent read of the value register will return 0 for the level bit
because the input-buffer is disabled. This causes a read-modify-write as
done by byt_gpio_set_direction() to write 0 to the level bit, driving the
pin low!
Before this commit byt_gpio_direction_output() relied on
pinctrl_gpio_direction_output() to set the direction, followed by a call
to byt_gpio_set() to apply the selected value. This causes the pin to
go low between the pinctrl_gpio_direction_output() and byt_gpio_set()
calls.
Change byt_gpio_direction_output() to directly make the register
modifications itself instead. Replacing the 2 subsequent writes to the
value register with a single write.
Note that the pinctrl code does not keep track internally of the direction,
so not going through pinctrl_gpio_direction_output() is not an issue.
This issue was noticed on a Trekstor SurfTab Twin 10.1. When the panel is
already on at boot (no external monitor connected), then the i915 driver
does a gpiod_get(..., GPIOD_OUT_HIGH) for the panel-enable GPIO. The
temporarily going low of that GPIO was causing the panel to reset itself
after which it would not show an image until it was turned off and back on
again (until a full modeset was done on it). This commit fixes this.
This commit also updates the byt_gpio_direction_input() to use direct
register accesses instead of going through pinctrl_gpio_direction_input(),
to keep it consistent with byt_gpio_direction_output().
Note for backporting, this commit depends on:
commit e2b74419e5 ("pinctrl: baytrail: Replace WARN with dev_info_once
when setting direct-irq pin to output")
Cc: stable@vger.kernel.org
Fixes: 86e3ef812f ("pinctrl: baytrail: Update gpio chip operations")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
If the i2c bus driver ignores the I2C_M_RECV_LEN flag (as some of
them do), it is possible for an I2C_SMBUS_BLOCK_DATA read issued
on some random device to return an arbitrary value in the first
byte (and nothing else). When this happens, i2c_smbus_xfer_emulated()
will happily write past the end of the supplied data buffer, thus
causing Bad Things to happen. To prevent this, check the size
before copying the data block and return an error if it is too large.
Fixes: 209d27c3b1 ("i2c: Emulate SMBus block read over I2C")
Signed-off-by: Mans Rullgard <mans@mansr.com>
[wsa: use better errno]
Signed-off-by: Wolfram Sang <wsa@kernel.org>
When working with very large nodes, poisoning the struct pages (for which
there will be very many) can take a very long time. If the system is
using voluntary preemptions, the software watchdog will not be able to
detect forward progress. This patch addresses this issue by offering to
give up time like __remove_pages() does. This behavior was introduced in
v5.6 with: commit d33695b16a ("mm/memory_hotplug: poison memmap in
remove_pfn_range_from_zone()")
Alternately, init_page_poison could do this cond_resched(), but it seems
to me that the caller of init_page_poison() is what actually knows whether
or not it should relax its own priority.
Based on Dan's notes, I think this is perfectly safe: commit f931ab479d
("mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}")
Aside from fixing the lockup, it is also a friendlier thing to do on lower
core systems that might wipe out large chunks of hotplug memory (probably
not a very common case).
Fixes this kind of splat:
watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [daxctl:9922]
irq event stamp: 138450
hardirqs last enabled at (138449): [<ffffffffa1001f26>] trace_hardirqs_on_thunk+0x1a/0x1c
hardirqs last disabled at (138450): [<ffffffffa1001f42>] trace_hardirqs_off_thunk+0x1a/0x1c
softirqs last enabled at (138448): [<ffffffffa1e00347>] __do_softirq+0x347/0x456
softirqs last disabled at (138443): [<ffffffffa10c416d>] irq_exit+0x7d/0xb0
CPU: 46 PID: 9922 Comm: daxctl Not tainted 5.7.0-BEN-14238-g373c6049b336 #30
Hardware name: Intel Corporation PURLEY/PURLEY, BIOS PLYXCRB1.86B.0578.D07.1902280810 02/28/2019
RIP: 0010:memset_erms+0x9/0x10
Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01
Call Trace:
remove_pfn_range_from_zone+0x3a/0x380
memunmap_pages+0x17f/0x280
release_nodes+0x22a/0x260
__device_release_driver+0x172/0x220
device_driver_detach+0x3e/0xa0
unbind_store+0x113/0x130
kernfs_fop_write+0xdc/0x1c0
vfs_write+0xde/0x1d0
ksys_write+0x58/0xd0
do_syscall_64+0x5a/0x120
entry_SYSCALL_64_after_hwframe+0x49/0xb3
Built 2 zonelists, mobility grouping on. Total pages: 49050381
Policy zone: Normal
Built 3 zonelists, mobility grouping on. Total pages: 49312525
Policy zone: Normal
David said: "It really only is an issue for devmem. Ordinary
hotplugged system memory is not affected (onlined/offlined in memory
block granularity)."
Link: http://lkml.kernel.org/r/20200619231213.1160351-1-ben.widawsky@intel.com
Fixes: commit d33695b16a ("mm/memory_hotplug: poison memmap in remove_pfn_range_from_zone()")
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: "Scargall, Steve" <steve.scargall@intel.com>
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "fix a hyperv W^X violation and remove vmalloc_exec"
Dexuan reported a W^X violation due to the fact that the hyper hypercall
page due switching it to be allocated using vmalloc_exec.
The problem is that PAGE_KERNEL_EXEC as used by vmalloc_exec actually
sets writable permissions in the pte. This series fixes the issue by
switching to the low-level __vmalloc_node_range interface that allows
specifing more detailed permissions instead. It then also open codes
the other two callers and removes the somewhat confusing vmalloc_exec
interface.
Peter noted that the hyper hypercall page allocation also has another
long standing issue in that it shouldn't use the full vmalloc but just
the module space. This issue is so far theoretical as the allocation is
done early in the boot process. I plan to fix it with another bigger
series for 5.9.
This patch (of 3):
Avoid a W^X violation cause by the fact that PAGE_KERNEL_EXEC includes
the writable bit.
For this resurrect the removed PAGE_KERNEL_RX definition, but as
PAGE_KERNEL_ROX to match arm64 and powerpc.
Link: http://lkml.kernel.org/r/20200618064307.32739-2-hch@lst.de
Fixes: 78bb17f76e ("x86/hyperv: use vmalloc_exec for the hypercall page")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dexuan Cui <decui@microsoft.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tejun reports seeing rare div0 crashes in memory.low stress testing:
RIP: 0010:mem_cgroup_calculate_protection+0xed/0x150
Code: 0f 46 d1 4c 39 d8 72 57 f6 05 16 d6 42 01 40 74 1f 4c 39 d8 76 1a 4c 39 d1 76 15 4c 29 d1 4c 29 d8 4d 29 d9 31 d2 48 0f af c1 <49> f7 f1 49 01 c2 4c 89 96 38 01 00 00 5d c3 48 0f af c7 31 d2 49
RSP: 0018:ffffa14e01d6fcd0 EFLAGS: 00010246
RAX: 000000000243e384 RBX: 0000000000000000 RCX: 0000000000008f4b
RDX: 0000000000000000 RSI: ffff8b89bee84000 RDI: 0000000000000000
RBP: ffffa14e01d6fcd0 R08: ffff8b89ca7d40f8 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000006422f7 R12: 0000000000000000
R13: ffff8b89d9617000 R14: ffff8b89bee84000 R15: ffffa14e01d6fdb8
FS: 0000000000000000(0000) GS:ffff8b8a1f1c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f93b1fc175b CR3: 000000016100a000 CR4: 0000000000340ea0
Call Trace:
shrink_node+0x1e5/0x6c0
balance_pgdat+0x32d/0x5f0
kswapd+0x1d7/0x3d0
kthread+0x11c/0x160
ret_from_fork+0x1f/0x30
This happens when parent_usage == siblings_protected.
We check that usage is bigger than protected, which should imply
parent_usage being bigger than siblings_protected. However, we don't
read (or even update) these values atomically, and they can be out of
sync as the memory state changes under us. A bit of fluctuation around
the target protection isn't a big deal, but we need to handle the div0
case.
Check the parent state explicitly to make sure we have a reasonable
positive value for the divisor.
Link: http://lkml.kernel.org/r/20200615140658.601684-1-hannes@cmpxchg.org
Fixes: 8a931f8013 ("mm: memcontrol: recursive memory.low protection")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Chris Down <chris@chrisdown.name>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After mm.h was removed from the asm-generic version of cacheflush.h,
s390 allyesconfig shows several warnings of the following nature:
In file included from arch/s390/include/generated/asm/cacheflush.h:1,
from drivers/media/platform/omap3isp/isp.c:42:
include/asm-generic/cacheflush.h:16:42: warning: 'struct mm_struct' declared inside parameter list will not be visible outside of this definition or declaration
As Geert and Laurent point out, this driver does not need this header in
the two files that include it. Remove it so there are no warnings.
Link: http://lkml.kernel.org/r/20200622234740.72825-2-natechancellor@gmail.com
Fixes: e0cf615d72 ("asm-generic: don't include <linux/mm.h> in cacheflush.h")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some s390 builds get these warnings:
include/asm-generic/cacheflush.h:16:42: warning: 'struct mm_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:22:46: warning: 'struct mm_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:28:45: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:36:44: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:44:45: warning: 'struct page' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:52:50: warning: 'struct address_space' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:58:52: warning: 'struct address_space' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:75:17: warning: 'struct page' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:74:45: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:82:16: warning: 'struct page' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:81:50: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
Forward declare the named structs to get rid of these.
Link: http://lkml.kernel.org/r/20200623135714.4dae4b8a@canb.auug.org.au
Fixes: e0cf615d72 ("asm-generic: don't include <linux/mm.h> in cacheflush.h")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit 9e343b467c ("READ_ONCE: Enforce atomicity for
{READ,WRITE}_ONCE() memory accesses"), READ_ONCE() cannot be used
anymore to read complex page table entries.
This leads to:
CC mm/debug_vm_pgtable.o
In file included from ./include/asm-generic/bug.h:5,
from ./arch/powerpc/include/asm/bug.h:109,
from ./include/linux/bug.h:5,
from ./include/linux/mmdebug.h:5,
from ./include/linux/gfp.h:5,
from mm/debug_vm_pgtable.c:13:
In function 'pte_clear_tests',
inlined from 'debug_vm_pgtable' at mm/debug_vm_pgtable.c:363:2:
./include/linux/compiler.h:392:38: error: Unsupported access size for {READ,WRITE}_ONCE().
mm/debug_vm_pgtable.c:249:14: note: in expansion of macro 'READ_ONCE'
249 | pte_t pte = READ_ONCE(*ptep);
| ^~~~~~~~~
make[2]: *** [mm/debug_vm_pgtable.o] Error 1
Fix it by using the recently added ptep_get() helper.
Link: http://lkml.kernel.org/r/6ca8c972e6c920dc4ae0d4affbed9703afa4d010.1592490570.git.christophe.leroy@csgroup.eu
Fixes: 9e343b467c ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Murphy reports that a slightly overcommitted load, testing swap
and zram along with i915, splats and keeps on splatting, when it had
better fail less noisily:
gnome-shell: page allocation failure: order:0,
mode:0x400d0(__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_RECLAIMABLE),
nodemask=(null),cpuset=/,mems_allowed=0
CPU: 2 PID: 1155 Comm: gnome-shell Not tainted 5.7.0-1.fc33.x86_64 #1
Call Trace:
dump_stack+0x64/0x88
warn_alloc.cold+0x75/0xd9
__alloc_pages_slowpath.constprop.0+0xcfa/0xd30
__alloc_pages_nodemask+0x2df/0x320
alloc_slab_page+0x195/0x310
allocate_slab+0x3c5/0x440
___slab_alloc+0x40c/0x5f0
__slab_alloc+0x1c/0x30
kmem_cache_alloc+0x20e/0x220
xas_nomem+0x28/0x70
add_to_swap_cache+0x321/0x400
__read_swap_cache_async+0x105/0x240
swap_cluster_readahead+0x22c/0x2e0
shmem_swapin+0x8e/0xc0
shmem_swapin_page+0x196/0x740
shmem_getpage_gfp+0x3a2/0xa60
shmem_read_mapping_page_gfp+0x32/0x60
shmem_get_pages+0x155/0x5e0 [i915]
__i915_gem_object_get_pages+0x68/0xa0 [i915]
i915_vma_pin+0x3fe/0x6c0 [i915]
eb_add_vma+0x10b/0x2c0 [i915]
i915_gem_do_execbuffer+0x704/0x3430 [i915]
i915_gem_execbuffer2_ioctl+0x1ea/0x3e0 [i915]
drm_ioctl_kernel+0x86/0xd0 [drm]
drm_ioctl+0x206/0x390 [drm]
ksys_ioctl+0x82/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x5b/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported on 5.7, but it goes back really to 3.1: when
shmem_read_mapping_page_gfp() was implemented for use by i915, and
allowed for __GFP_NORETRY and __GFP_NOWARN flags in most places, but
missed swapin's "& GFP_KERNEL" mask for page tree node allocation in
__read_swap_cache_async() - that was to mask off HIGHUSER_MOVABLE bits
from what page cache uses, but GFP_RECLAIM_MASK is now what's needed.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208085
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2006151330070.11064@eggly.anvils
Fixes: 68da9f0557 ("tmpfs: pass gfp to shmem_getpage_gfp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Chris Murphy <lists@colorremedies.com>
Analyzed-by: Vlastimil Babka <vbabka@suse.cz>
Analyzed-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Chris Murphy <lists@colorremedies.com>
Cc: <stable@vger.kernel.org> [3.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It was found that running the LTP test on a PowerPC system could produce
erroneous values in /proc/meminfo, like:
MemTotal: 531915072 kB
MemFree: 507962176 kB
MemAvailable: 1100020596352 kB
Using bisection, the problem is tracked down to commit 9c315e4d7d ("mm:
memcg/slab: cache page number in memcg_(un)charge_slab()").
In memcg_uncharge_slab() with a "int order" argument:
unsigned int nr_pages = 1 << order;
:
mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages);
The mod_lruvec_state() function will eventually call the
__mod_zone_page_state() which accepts a long argument. Depending on the
compiler and how inlining is done, "-nr_pages" may be treated as a
negative number or a very large positive number. Apparently, it was
treated as a large positive number in that PowerPC system leading to
incorrect stat counts. This problem hasn't been seen in x86-64 yet,
perhaps the gcc compiler there has some slight difference in behavior.
It is fixed by making nr_pages a signed value. For consistency, a similar
change is applied to memcg_charge_slab() as well.
Link: http://lkml.kernel.org/r/20200620184719.10994-1-longman@redhat.com
Fixes: 9c315e4d7d ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()").
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signature verification is an important security feature, to protect
system from being attacked with a kernel of unknown origin. Kexec
rebooting is a way to replace the running kernel, hence need be secured
carefully.
In the current code of handling signature verification of kexec kernel,
the logic is very twisted. It mixes signature verification, IMA
signature appraising and kexec lockdown.
If there is no KEXEC_SIG_FORCE, kexec kernel image doesn't have one of
signature, the supported crypto, and key, we don't think this is wrong,
Unless kexec lockdown is executed. IMA is considered as another kind of
signature appraising method.
If kexec kernel image has signature/crypto/key, it has to go through the
signature verification and pass. Otherwise it's seen as verification
failure, and won't be loaded.
Seems kexec kernel image with an unqualified signature is even worse
than those w/o signature at all, this sounds very unreasonable. E.g.
If people get a unsigned kernel to load, or a kernel signed with expired
key, which one is more dangerous?
So, here, let's simplify the logic to improve code readability. If the
KEXEC_SIG_FORCE enabled or kexec lockdown enabled, signature
verification is mandated. Otherwise, we lift the bar for any kernel
image.
Link: http://lkml.kernel.org/r/20200602045952.27487-1-lijiang@redhat.com
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Reviewed-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Matthew Garrett <mjg59@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh reports:
"While stressing compaction, one run oopsed on NULL capc->cc in
__free_one_page()'s task_capc(zone): compact_zone_order() had been
interrupted, and a page was being freed in the return from interrupt.
Though you would not expect it from the source, both gccs I was using
(4.8.1 and 7.5.0) had chosen to compile compact_zone_order() with the
".cc = &cc" implemented by mov %rbx,-0xb0(%rbp) immediately before
callq compact_zone - long after the "current->capture_control =
&capc". An interrupt in between those finds capc->cc NULL (zeroed by
an earlier rep stos).
This could presumably be fixed by a barrier() before setting
current->capture_control in compact_zone_order(); but would also need
more care on return from compact_zone(), in order not to risk leaking
a page captured by interrupt just before capture_control is reset.
Maybe that is the preferable fix, but I felt safer for task_capc() to
exclude the rather surprising possibility of capture at interrupt
time"
I have checked that gcc10 also behaves the same.
The advantage of fix in compact_zone_order() is that we don't add
another test in the page freeing hot path, and that it might prevent
future problems if we stop exposing pointers to uninitialized structures
in current task.
So this patch implements the suggestion for compact_zone_order() with
barrier() (and WRITE_ONCE() to prevent store tearing) for setting
current->capture_control, and prevents page leaking with
WRITE_ONCE/READ_ONCE in the proper order.
Link: http://lkml.kernel.org/r/20200616082649.27173-1-vbabka@suse.cz
Fixes: 5e1f0f098b ("mm, compaction: capture a page under direct compaction")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Hugh Dickins <hughd@google.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Li Wang <liwang@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [5.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Short summary of fixes pull (less than what git shortlog provides):
* In mcde, set up fbdev after device registration and removde the last access
to dev->dev_private. Fixes an error message and a segmentation fault.
* Set the connector type for LogicPT Type 28 and newhaven_nhd_43_480272ef_atxl
panels.
* In uvesafb, fix the handling of the noblank option.
* Fix panel orientation for Asus T101HA and Acer S1003.
* Fix DMA configuration for sun4i if IOMMU is present.
* Fix regression in VT restoration. Unbreaks userspace (i.e., Xorg) VT handling.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200625082717.GA14856@linux-uq9g
We use OUTPUT directory as TMPOUT for checking no-pie option.
Since commit f2f02ebd8f ("kbuild: improve cc-option to clean up all
temporary files") when building powerpc/ from selftests directory, the
OUTPUT directory points to powerpc/pmu/ebb/ and gets removed when
checking for -no-pie option in try-run routine, subsequently build
fails with the following:
$ make -C powerpc
...
TARGET=ebb; BUILD_TARGET=$OUTPUT/$TARGET; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C $TARGET all
make[2]: Entering directory '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb'
make[2]: *** No rule to make target 'Makefile'.
make[2]: Failed to remake makefile 'Makefile'.
make[2]: *** No rule to make target 'ebb.c', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: *** No rule to make target 'ebb_handler.S', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: *** No rule to make target 'trace.c', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: *** No rule to make target 'busy_loop.S', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: Target 'all' not remade because of errors.
Fix this by adding a suffix to the OUTPUT directory so that the
failure is avoided.
Fixes: 9686813f6e ("selftests/powerpc: Fix try-run when source tree is not writable")
Signed-off-by: Harish <harish@linux.ibm.com>
[mpe: Mention that commit that triggered the breakage]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200625165721.264904-1-harish@linux.ibm.com
Pull networking fixes from David Miller:
1) Don't insert ESP trailer twice in IPSEC code, from Huy Nguyen.
2) The default crypto algorithm selection in Kconfig for IPSEC is out
of touch with modern reality, fix this up. From Eric Biggers.
3) bpftool is missing an entry for BPF_MAP_TYPE_RINGBUF, from Andrii
Nakryiko.
4) Missing init of ->frame_sz in xdp_convert_zc_to_xdp_frame(), from
Hangbin Liu.
5) Adjust packet alignment handling in ax88179_178a driver to match
what the hardware actually does. From Jeremy Kerr.
6) register_netdevice can leak in the case one of the notifiers fail,
from Yang Yingliang.
7) Use after free in ip_tunnel_lookup(), from Taehee Yoo.
8) VLAN checks in sja1105 DSA driver need adjustments, from Vladimir
Oltean.
9) tg3 driver can sleep forever when we get enough EEH errors, fix from
David Christensen.
10) Missing {READ,WRITE}_ONCE() annotations in various Intel ethernet
drivers, from Ciara Loftus.
11) Fix scanning loop break condition in of_mdiobus_register(), from
Florian Fainelli.
12) MTU limit is incorrect in ibmveth driver, from Thomas Falcon.
13) Endianness fix in mlxsw, from Ido Schimmel.
14) Use after free in smsc95xx usbnet driver, from Tuomas Tynkkynen.
15) Missing bridge mrp configuration validation, from Horatiu Vultur.
16) Fix circular netns references in wireguard, from Jason A. Donenfeld.
17) PTP initialization on recovery is not done properly in qed driver,
from Alexander Lobakin.
18) Endian conversion of L4 ports in filters of cxgb4 driver is wrong,
from Rahul Lakkireddy.
19) Don't clear bound device TX queue of socket prematurely otherwise we
get problems with ktls hw offloading, from Tariq Toukan.
20) ipset can do atomics on unaligned memory, fix from Russell King.
21) Align ethernet addresses properly in bridging code, from Thomas
Martitz.
22) Don't advertise ipv4 addresses on SCTP sockets having ipv6only set,
from Marcelo Ricardo Leitner.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (149 commits)
rds: transport module should be auto loaded when transport is set
sch_cake: fix a few style nits
sch_cake: don't call diffserv parsing code when it is not needed
sch_cake: don't try to reallocate or unshare skb unconditionally
ethtool: fix error handling in linkstate_prepare_data()
wil6210: account for napi_gro_receive never returning GRO_DROP
hns: do not cast return value of napi_gro_receive to null
socionext: account for napi_gro_receive never returning GRO_DROP
wireguard: receive: account for napi_gro_receive never returning GRO_DROP
vxlan: fix last fdb index during dump of fdb with nhid
sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket
tc-testing: avoid action cookies with odd length.
bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
net: dsa: sja1105: fix tc-gate schedule with single element
net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules
net: dsa: sja1105: unconditionally free old gating config
net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top
net: macb: free resources on failure path of at91ether_open()
net: macb: call pm_runtime_put_sync on failure path
...
Toke Høiland-Jørgensen says:
====================
sched: A couple of fixes for sch_cake
This series contains a couple of fixes for diffserv handling in sch_cake that
provide a nice speedup (with a somewhat pedantic nit fix tacked on to the end).
Not quite sure about whether this should go to stable; it does provide a nice
speedup, but it's not strictly a fix in the "correctness" sense. I lean towards
including this in stable as well, since our most important consumer of that
(OpenWrt) is likely to backport the series anyway.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
I spotted a few nits when comparing the in-tree version of sch_cake with
the out-of-tree one: A redundant error variable declaration shadowing an
outer declaration, and an indentation alignment issue. Fix both of these.
Fixes: 046f6fd5da ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As a further optimisation of the diffserv parsing codepath, we can skip it
entirely if CAKE is configured to neither use diffserv-based
classification, nor to zero out the diffserv bits.
Fixes: c87b4ecdbe ("sch_cake: Make sure we can write the IP header before changing DSCP bits")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cake_handle_diffserv() tries to linearize mac and network header parts of
skb and to make it writable unconditionally. In some cases it leads to full
skb reallocation, which reduces throughput and increases CPU load. Some
measurements of IPv4 forward + NAPT on MIPS router with 580 MHz single-core
CPU was conducted. It appears that on kernel 4.9 skb_try_make_writable()
reallocates skb, if skb was allocated in ethernet driver via so-called
'build skb' method from page cache (it was discovered by strange increase
of kmalloc-2048 slab at first).
Obtain DSCP value via read-only skb_header_pointer() call, and leave
linearization only for DSCP bleaching or ECN CE setting. And, as an
additional optimisation, skip diffserv parsing entirely if it is not needed
by the current configuration.
Fixes: c87b4ecdbe ("sch_cake: Make sure we can write the IP header before changing DSCP bits")
Signed-off-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
[ fix a few style issues, reflow commit message ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When getting SQI or maximum SQI value fails in linkstate_prepare_data(), we
must not return without calling ethnl_ops_complete(dev) as that could
result in imbalance between ethtool_ops ->begin() and ->complete() calls.
Fixes: 8066021915 ("ethtool: provide UAPI for PHY Signal Quality Index (SQI)")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull tracing fixes from Steven Rostedt:
"Four small fixes:
- Fix a ringbuffer bug for nested events having time go backwards
- Fix a config dependency for boot time tracing to depend on
synthetic events instead of histograms.
- Fix trigger format parsing to handle multiple spaces
- Fix bootconfig to handle failures in multiple events"
* tag 'trace-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/boottime: Fix kprobe multiple events
tracing: Fix event trigger to accept redundant spaces
tracing/boot: Fix config dependency for synthedic event
ring-buffer: Zero out time extend if it is nested and not absolute
Jason A. Donenfeld says:
====================
napi_gro_receive caller return value cleanups
In 6570bc79c0 ("net: core: use listified Rx for GRO_NORMAL in
napi_gro_receive()"), the GRO_NORMAL case stopped calling
netif_receive_skb_internal, checking its return value, and returning
GRO_DROP in case it failed. Instead, it calls into
netif_receive_skb_list_internal (after a bit of indirection), which
doesn't return any error. Therefore, napi_gro_receive will never return
GRO_DROP, making handling GRO_DROP dead code.
I emailed the author of 6570bc79c0 on netdev [1] to see if this change
was intentional, but the dlink.ru email address has been disconnected,
and looking a bit further myself, it seems somewhat infeasible to start
propagating return values backwards from the internal machinations of
netif_receive_skb_list_internal.
Taking a look at all the callers of napi_gro_receive, it appears that
three are checking the return value for the purpose of comparing it to
the now never-happening GRO_DROP, and one just casts it to (void), a
likely historical leftover. Every other of the 120 callers does not
bother checking the return value.
And it seems like these remaining 116 callers are doing the right thing:
after calling napi_gro_receive, the packet is now in the hands of the
upper layers of the newtworking, and the device driver itself has no
business now making decisions based on what the upper layers choose to
do. Incrementing stats counters on GRO_DROP seems like a mistake, made
by these three drivers, but not by the remaining 117.
It would seem, therefore, that after rectifying these four callers of
napi_gro_receive, that I should go ahead and just remove returning the
value from napi_gro_receive all together. However, napi_gro_receive has
a function event tracer, and being able to introspect into the
networking stack to see how often napi_gro_receive is returning whatever
interesting GRO status (aside from _DROP) remains an interesting
data point worth keeping for debugging.
So, this series simply gets rid of the return value checking for the
four useless places where that check never evaluates to anything
meaningful.
[1] https://lore.kernel.org/netdev/20200624210606.GA1362687@zx2c4.com/
====================
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands. In this case, too, the non-gro path didn't bother checking
the return value. Plus, this had some clunky debugging functions that
duplicated code from elsewhere and was generally pretty messy. So, this
commit cleans that all up too.
Fixes: 6570bc79c0 ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Basically no drivers care about the return value here, and there's no
__must_check that would make casting to void sensible, so remove it.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands.
Fixes: 6570bc79c0 ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Fixes: 6570bc79c0 ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes last saved fdb index in fdb dump handler when
handling fdb's with nhid.
Fixes: 1274e1cc42 ("vxlan: ecmp support for mac fdb entries")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a socket is set ipv6only, it will still send IPv4 addresses in the
INIT and INIT_ACK packets. This potentially misleads the peer into using
them, which then would cause association termination.
The fix is to not add IPv4 addresses to ipv6only sockets.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update odd length cookie hexstrings in csum.json, tunnel_key.json and
bpf.json to be even length to comply with check enforced in commit
0149dabf2a1b ("tc: m_actions: check cookie hexstring len") in iproute2.
Signed-off-by: Briana Oursler <briana.oursler@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell says:
====================
tcp_cubic: fix spurious HYSTART_DELAY on RTT decrease
This series fixes a long-standing bug in the TCP CUBIC
HYSTART_DELAY mechanim recently reported by Mirja Kuehlewind. The
code can cause a spurious exit of slow start in some particular
cases: upon an RTT decrease that happens on the 9th or later ACK
in a round trip. This series fixes the original Hystart code and
also the recent BPF implementation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Apply the fix from:
"tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT"
to the BPF implementation of TCP CUBIC congestion control.
Repeating the commit description here for completeness:
Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where
Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an
ACK when the minimum rtt of a connection goes down. From inspection it
is clear from the existing code that this could happen in an example
like the following:
o The first 8 RTT samples in a round trip are 150ms, resulting in a
curr_rtt of 150ms and a delay_min of 150ms.
o The 9th RTT sample is 100ms. The curr_rtt does not change after the
first 8 samples, so curr_rtt remains 150ms. But delay_min can be
lowered at any time, so delay_min falls to 100ms. The code executes
the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min
of 100ms, and the curr_rtt is declared far enough above delay_min to
force a (spurious) exit of Slow start.
The fix here is simple: allow every RTT sample in a round trip to
lower the curr_rtt.
Fixes: 6de4a9c430 ("bpf: tcp: Add bpf_cubic example")
Reported-by: Mirja Kuehlewind <mirja.kuehlewind@ericsson.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where
Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an
ACK when the minimum rtt of a connection goes down. From inspection it
is clear from the existing code that this could happen in an example
like the following:
o The first 8 RTT samples in a round trip are 150ms, resulting in a
curr_rtt of 150ms and a delay_min of 150ms.
o The 9th RTT sample is 100ms. The curr_rtt does not change after the
first 8 samples, so curr_rtt remains 150ms. But delay_min can be
lowered at any time, so delay_min falls to 100ms. The code executes
the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min
of 100ms, and the curr_rtt is declared far enough above delay_min to
force a (spurious) exit of Slow start.
The fix here is simple: allow every RTT sample in a round trip to
lower the curr_rtt.
Fixes: ae27e98a51 ("[TCP] CUBIC v2.3")
Reported-by: Mirja Kuehlewind <mirja.kuehlewind@ericsson.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean says:
====================
Fixes for SJA1105 DSA tc-gate action
This small series fixes 2 bugs in the tc-gate implementation:
1. The TAS state machine keeps getting rescheduled even after removing
tc-gate actions on all ports.
2. tc-gate actions with only one gate control list entry are installed
to hardware with an incorrect interval of zero, which makes the
switch erroneously drop those packets (since the configuration is
invalid).
To keep the code palatable, a forward-declaration was avoided by moving
some code around in patch 1/4. I hope that isn't too much of an issue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The sja1105_gating_cfg_time_to_interval function does this, as per the
comments:
/* The gate entries contain absolute times in their e->interval field. Convert
* that to proper intervals (i.e. "0, 5, 10, 15" to "5, 5, 5, 5").
*/
To perform that task, it iterates over gating_cfg->entries, at each step
updating the interval of the _previous_ entry. So one interval remains
to be updated at the end of the loop: the last one (since it isn't
"prev" for anyone else).
But there was an erroneous check, that the last element's interval
should not be updated if it's also the only element. I'm not quite sure
why that check was there, but it's clearly incorrect, as a tc-gate
schedule with a single element would get an e->interval of zero,
regardless of the duration requested by the user. The switch wouldn't
even consider this configuration as valid: it will just drop all traffic
that matches the rule.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, tas_data->enabled would remain true even after deleting all
tc-gate rules from the switch ports, which would cause the
sja1105_tas_state_machine to get unnecessarily scheduled.
Also, if there were any errors which would prevent the hardware from
enabling the gating schedule, the sja1105_tas_state_machine would
continuously detect and print that, spamming the kernel log, even if the
rules were subsequently deleted.
The rules themselves are _not_ active, because sja1105_init_scheduling
does enough of a job to not install the gating schedule in the static
config. But the virtual link rules themselves are still present.
So call the functions that remove the tc-gate configuration from
priv->tas_data.gating_cfg, so that tas_data->enabled can be set to
false, and sja1105_tas_state_machine will stop from being scheduled.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently sja1105_compose_gating_subschedule is not prepared to be
called for the case where we want to recompute the global tc-gate
configuration after we've deleted those actions on a port.
After deleting the tc-gate actions on the last port, max_cycle_time
would become zero, and that would incorrectly prevent
sja1105_free_gating_config from getting called.
So move the freeing function above the check for the need to apply a new
configuration.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It turns out that sja1105_compose_gating_subschedule must also be called
from sja1105_vl_delete, to recalculate the overall tc-gate
configuration. Currently this is not possible without introducing a
forward declaration. So move the function at the top of the file, along
with its dependencies.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KCSAN reported data race reading and writing nr_threads and max_threads.
The data race is intentional and benign. This is obvious from the comment
above it and based on general consensus when discussing this issue. So
there's no need for any heavy atomic or *_ONCE() machinery here.
In accordance with the newly introduced data_race() annotation consensus,
mark the offending line with data_race(). Here it's actually useful not
just to silence KCSAN but to also clearly communicate that the race is
intentional. This is especially helpful since nr_threads is otherwise
protected by tasklist_lock.
BUG: KCSAN: data-race in copy_process / copy_process
write to 0xffffffff86205cf8 of 4 bytes by task 14779 on cpu 1:
copy_process+0x2eba/0x3c40 kernel/fork.c:2273
_do_fork+0xfe/0x7a0 kernel/fork.c:2421
__do_sys_clone kernel/fork.c:2576 [inline]
__se_sys_clone kernel/fork.c:2557 [inline]
__x64_sys_clone+0x130/0x170 kernel/fork.c:2557
do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x44/0xa9
read to 0xffffffff86205cf8 of 4 bytes by task 6944 on cpu 0:
copy_process+0x94d/0x3c40 kernel/fork.c:1954
_do_fork+0xfe/0x7a0 kernel/fork.c:2421
__do_sys_clone kernel/fork.c:2576 [inline]
__se_sys_clone kernel/fork.c:2557 [inline]
__x64_sys_clone+0x130/0x170 kernel/fork.c:2557
do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Link: https://groups.google.com/forum/#!msg/syzkaller-upstream-mo
deration/thvp7AHs5Ew/aPdYLXfYBQAJ
Reported-by: syzbot+52fced2d288f8ecd2b20@syzkaller.appspotmail.com
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Weilong Chen <chenweilong@huawei.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Marco Elver <elver@google.com>
[christian.brauner@ubuntu.com: rewrite commit message]
Link: https://lore.kernel.org/r/20200623041240.154294-1-chenweilong@huawei.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
DMA buffers were not freed on failure path of at91ether_open().
Along with changes for freeing the DMA buffers the enable/disable
interrupt instructions were moved to at91ether_start()/at91ether_stop()
functions and the operations on at91ether_stop() were done in
their reverse order (compared with how is done in at91ether_start()):
before this patch the operation order on interface open path
was as follows:
1/ alloc DMA buffers
2/ enable tx, rx
3/ enable interrupts
and the order on interface close path was as follows:
1/ disable tx, rx
2/ disable interrupts
3/ free dma buffers.
Fixes: 7897b071ac ("net: macb: convert to phylink")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call pm_runtime_put_sync() on failure path of at91ether_open.
Fixes: e6a41c23df ("net: macb: ensure interface is not suspended on at91rm9200")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The port number field in the status register was not correct, so fix it.
Fixes: d6ffb63001 ("i2c: Add FSI-attached I2C master algorithm")
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Add extern declarations for vDSO time-related functions to notify the
compiler these functions will be used in somewhere to avoid
"no previous prototype" compile warning.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The (struct __prci_data).hw_clks.hws is an array with dynamic elements.
Using struct_size(pd, hw_clks.hws, ARRAY_SIZE(__prci_init_clocks))
instead of sizeof(*pd) to get the correct memory size of
struct __prci_data for sifive/fu540-prci. After applying this
modifications, the kernel runs smoothly with CONFIG_SLAB_FREELIST_RANDOM
enabled on the HiFive unleashed board.
Fixes: 30b8e27e3b ("clk: sifive: add a driver for the SiFive FU540 PRCI IP block")
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The time related vDSO functions use a variable, vdso_data, to access the
vDSO data page to get the system time information. Because the vdso_data
for CFLAGS_vgettimeofday.o is an external variable defined in vdso.o,
the CFLAGS_vgettimeofday.o should be compiled with -fPIC to ensure
that vdso_data is addressable.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Pull fsnotify fixlet from Jan Kara:
"A performance improvement to reduce impact of fsnotify for inodes
where it isn't used"
* tag 'fsnotify_for_v5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fs: Do not check if there is a fsnotify watcher on pseudo inodes
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net, they are:
1) Unaligned atomic access in ipset, from Russell King.
2) Missing module description, from Rob Gill.
3) Patches to fix a module unload causing NULL pointer dereference in
xtables, from David Wilder. For the record, I posting here his cover
letter explaining the problem:
A crash happened on ppc64le when running ltp network tests triggered by
"rmmod iptable_mangle".
See previous discussion in this thread:
https://lists.openwall.net/netdev/2020/06/03/161 .
In the crash I found in iptable_mangle_hook() that
state->net->ipv4.iptable_mangle=NULL causing a NULL pointer dereference.
net->ipv4.iptable_mangle is set to NULL in +iptable_mangle_net_exit() and
called when ip_mangle modules is unloaded. A rmmod task was found running
in the crash dump. A 2nd crash showed the same problem when running
"rmmod iptable_filter" (net->ipv4.iptable_filter=NULL).
To fix this I added .pre_exit hook in all iptable_foo.c. The pre_exit will
un-register the underlying hook and exit would do the table freeing. The
netns core does an unconditional +synchronize_rcu after the pre_exit hooks
insuring no packets are in flight that have picked up the pointer before
completing the un-register.
These patches include changes for both iptables and ip6tables.
We tested this fix with ltp running iptables01.sh and iptables01.sh -6 a
loop for 72 hours.
4) Add a selftest for conntrack helper assignment, from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull rdma fixes from Jason Gunthorpe:
"Several regression fixes from work that landed in the merge window,
particularly in the mlx5 driver:
- Various static checker and warning fixes
- General bug fixes in rvt, qedr, hns, mlx5 and hfi1
- Several regression fixes related to the ECE and QP changes in last
cycle
- Fixes for a few long standing crashers in CMA, uverbs ioctl, and
xrc"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (25 commits)
IB/hfi1: Add atomic triggered sleep/wakeup
IB/hfi1: Correct -EBUSY handling in tx code
IB/hfi1: Fix module use count flaw due to leftover module put calls
IB/hfi1: Restore kfree in dummy_netdev cleanup
IB/mad: Fix use after free when destroying MAD agent
RDMA/mlx5: Protect from kernel crash if XRC_TGT doesn't have udata
RDMA/counter: Query a counter before release
RDMA/mad: Fix possible memory leak in ib_mad_post_receive_mads()
RDMA/mlx5: Fix integrity enabled QP creation
RDMA/mlx5: Remove ECE limitation from the RAW_PACKET QPs
RDMA/mlx5: Fix remote gid value in query QP
RDMA/mlx5: Don't access ib_qp fields in internal destroy QP path
RDMA/core: Check that type_attrs is not NULL prior access
RDMA/hns: Fix an cmd queue issue when resetting
RDMA/hns: Fix a calltrace when registering MR from userspace
RDMA/mlx5: Add missed RST2INIT and INIT2INIT steps during ECE handshake
RDMA/cma: Protect bind_list and listen_list while finding matching cm id
RDMA/qedr: Fix KASAN: use-after-free in ucma_event_handler+0x532
RDMA/efa: Set maximum pkeys device attribute
RDMA/rvt: Fix potential memory leak caused by rvt_alloc_rq
...
there is a problem with the CWR flag set in an incoming ACK segment
and it leads to the situation when the ECE flag is latched forever
the following packetdrill script shows what happens:
// Stack receives incoming segments with CE set
+0.1 <[ect0] . 11001:12001(1000) ack 1001 win 65535
+0.0 <[ce] . 12001:13001(1000) ack 1001 win 65535
+0.0 <[ect0] P. 13001:14001(1000) ack 1001 win 65535
// Stack repsonds with ECN ECHO
+0.0 >[noecn] . 1001:1001(0) ack 12001
+0.0 >[noecn] E. 1001:1001(0) ack 13001
+0.0 >[noecn] E. 1001:1001(0) ack 14001
// Write a packet
+0.1 write(3, ..., 1000) = 1000
+0.0 >[ect0] PE. 1001:2001(1000) ack 14001
// Pure ACK received
+0.01 <[noecn] W. 14001:14001(0) ack 2001 win 65535
// Since CWR was sent, this packet should NOT have ECE set
+0.1 write(3, ..., 1000) = 1000
+0.0 >[ect0] P. 2001:3001(1000) ack 14001
// but Linux will still keep ECE latched here, with packetdrill
// flagging a missing ECE flag, expecting
// >[ect0] PE. 2001:3001(1000) ack 14001
// in the script
In the situation above we will continue to send ECN ECHO packets
and trigger the peer to reduce the congestion window. To avoid that
we can check CWR on pure ACKs received.
v3:
- Add a sequence check to avoid sending an ACK to an ACK
v2:
- Adjusted the comment
- move CWR check before checking for unacknowledged packets
Signed-off-by: Denis Kirjanov <denis.kirjanov@suse.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skcipher API dynamically instantiates the transformation object
on request that implements the requested algorithm optimally on the
given platform. This notion of optimality only matters for cases like
bulk network or disk encryption, where performance can be a bottleneck,
or in cases where the algorithm itself is not known at compile time.
In the mscc case, we are dealing with AES encryption of a single
block, and so neither concern applies, and we are better off using
the AES library interface, which is lightweight and safe for this
kind of use.
Note that the scatterlist API does not permit references to buffers
that are located on the stack, so the existing code is incorrect in
any case, but avoiding the skcipher and scatterlist APIs entirely is
the most straight-forward approach to fixing this.
Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Fixes: 28c5107aa9 ("net: phy: mscc: macsec support")
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull NVMe fixes from Christoph.
* 'nvme-5.8' of git://git.infradead.org/nvme:
nvme-multipath: fix bogus request queue reference put
nvme-multipath: fix deadlock due to head->lock
nvme: don't protect ns mutation with ns->head->lock
nvme-multipath: fix deadlock between ana_work and scan_work
nvme: fix possible deadlock when I/O is blocked
nvme-rdma: assign completion vector correctly
nvme-loop: initialize tagset numa value to the value of the ctrl
nvme-tcp: initialize tagset numa value to the value of the ctrl
nvme-pci: initialize tagset numa value to the value of the ctrl
nvme-pci: override the value of the controller's numa node
nvme: set initial value for controller's numa node
I updated my system with Radeon VII from kernel 5.6 to kernel 5.7, and
following started to happen on each boot:
...
BUG: kernel NULL pointer dereference, address: 0000000000000128
...
CPU: 9 PID: 1940 Comm: modprobe Tainted: G E 5.7.2-200.im0.fc32.x86_64 #1
Hardware name: System manufacturer System Product Name/PRIME X570-P, BIOS 1407 04/02/2020
RIP: 0010:lock_bus+0x42/0x60 [amdgpu]
...
Call Trace:
i2c_smbus_xfer+0x3d/0xf0
i2c_default_probe+0xf3/0x130
i2c_detect.isra.0+0xfe/0x2b0
? kfree+0xa3/0x200
? kobject_uevent_env+0x11f/0x6a0
? i2c_detect.isra.0+0x2b0/0x2b0
__process_new_driver+0x1b/0x20
bus_for_each_dev+0x64/0x90
? 0xffffffffc0f34000
i2c_register_driver+0x73/0xc0
do_one_initcall+0x46/0x200
? _cond_resched+0x16/0x40
? kmem_cache_alloc_trace+0x167/0x220
? do_init_module+0x23/0x260
do_init_module+0x5c/0x260
__do_sys_init_module+0x14f/0x170
do_syscall_64+0x5b/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
...
Error appears when some i2c device driver tries to probe for devices
using adapter registered by `smu_v11_0_i2c_eeprom_control_init()`.
Code supporting this adapter requires `adev->psp.ras.ras` to be not
NULL, which is true only when `amdgpu_ras_init()` detects HW support by
calling `amdgpu_ras_check_supported()`.
Before 9015d60c9e, adapter was registered by
-> amdgpu_device_ip_init()
-> amdgpu_ras_recovery_init()
-> amdgpu_ras_eeprom_init()
-> smu_v11_0_i2c_eeprom_control_init()
after verifying that `adev->psp.ras.ras` is not NULL in
`amdgpu_ras_recovery_init()`. Currently it is registered
unconditionally by
-> amdgpu_device_ip_init()
-> pp_sw_init()
-> hwmgr_sw_init()
-> vega20_smu_init()
-> smu_v11_0_i2c_eeprom_control_init()
Fix simply adds HW support check (ras == NULL => no support) before
calling `smu_v11_0_i2c_eeprom_control_{init,fini}()`.
Please note that there is a chance that similar fix is also required for
CHIP_ARCTURUS. I do not know whether any actual Arcturus hardware without
RAS exist, and whether calling `smu_i2c_eeprom_init()` makes any sense
when there is no HW support.
Cc: stable@vger.kernel.org
Fixes: 9015d60c9e ("drm/amdgpu: Move EEPROM I2C adapter to amdgpu_device")
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
Tested-by: Bjorn Nostvold <bjorn.nostvold@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SR-IOV VFs do not implement the memory enable bit of the command
register, therefore this bit is not set in config space after
pci_enable_device(). This leads to an unintended difference
between PF and VF in hand-off state to the user. We can correct
this by setting the initial value of the memory enable bit in our
virtualized config space. There's really no need however to
ever fault a user on a VF though as this would only indicate an
error in the user's management of the enable bit, versus a PF
where the same access could trigger hardware faults.
Fixes: abafbc551f ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pull s390 fixes from Heiko Carstens:
- Fix kernel crash on system call single stepping.
- Make sure early program check handler is executed with DAT on to
avoid an endless program check loop.
- Add __GFP_NOWARN flag to debug feature to avoid user triggerable
allocation failure messages.
* tag 's390-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/debug: avoid kernel warning on too large number of pages
s390/kasan: fix early pgm check handler execution
s390: fix system call single stepping
Pull sound fixes from Takashi Iwai:
"A collection of small fixes gathered in the last two weeks.
The major changes here are fixes for the recent DPCM regressions found
on i.MX and Qualcomm platforms and fixes for resource leaks in ASoC
DAI registrations.
Other than those are mostly device-specific fixes including the usual
USB- and HD-audio quirks, and a fix for syzkaller case and ID updates
for new Intel platforms"
* tag 'sound-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (32 commits)
ALSA: usb-audio: Fix OOB access of mixer element list
ALSA: usb-audio: add quirk for Samsung USBC Headset (AKG)
ALSA: usb-audio: Add registration quirk for Kingston HyperX Cloud Flight S
ASoC: rockchip: Fix a reference count leak.
ASoC: amd: closing specific instance.
ALSA: hda: Intel: add missing PCI IDs for ICL-H, TGL-H and EKL
ASoC: hdac_hda: fix memleak with regmap not freed on remove
ASoC: SOF: Intel: add PCI IDs for ICL-H and TGL-H
ASoC: SOF: Intel: add PCI ID for CometLake-S
ASoC: Intel: SOF: merge COMETLAKE_LP and COMETLAKE_H
ALSA: hda/realtek: Add mute LED and micmute LED support for HP systems
ALSA: usb-audio: Fix potential use-after-free of streams
ALSA: hda/realtek - Add quirk for MSI GE63 laptop
ASoC: fsl_ssi: Fix bclk calculation for mono channel
ASoC: SOF: Intel: hda: Clear RIRB status before reading WP
ASoC: rt1015: Update rt1015 default register value according to spec modification.
ASoC: qcom: common: set correct directions for dailinks
ASoc: q6afe: add support to get port direction
ASoC: soc-pcm: fix checks for multi-cpu FE dailinks
ASoC: rt5682: Let dai clks be registered whether mclk exists or not
...
When copying the setup_header into the boot_params buffer, only the data
that is actually part of the setup_header should be copied.
efi_pe_entry() currently copies the entire second sector, which
initializes some of the fields in boot_params beyond the setup_header
with garbage (i.e. part of the real-mode boot code gets copied into
those fields).
This does not cause any issues currently because the fields that are
overwritten are padding, BIOS EDD information that won't get used, and
the E820 table which will get properly filled in later.
Fix this to only copy data that is actually part of the setup_header
structure.
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Commit
987053a300 ("efi/x86: Move command-line initrd loading to efi_main")
made the ramdisk_addr/ramdisk_size variables in efi_pe_entry unused, but
neglected to delete them.
Delete these unused variables.
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
A KCSAN build revealed we have explicit annoations through atomic_*()
usage, switch to arch_atomic_*() for the respective functions.
vmlinux.o: warning: objtool: rcu_nmi_exit()+0x4d: call to __kcsan_check_access() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_dynticks_eqs_enter()+0x25: call to __kcsan_check_access() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_nmi_enter()+0x4f: call to __kcsan_check_access() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_dynticks_eqs_exit()+0x2a: call to __kcsan_check_access() leaves .noinstr.text section
vmlinux.o: warning: objtool: __rcu_is_watching()+0x25: call to __kcsan_check_access() leaves .noinstr.text section
Additionally, without the NOP in instrumentation_begin(), objtool would
not detect the lack of the 'else instrumentation_begin();' branch in
rcu_nmi_enter().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Architectures with instrumented (KASAN/KCSAN) atomic operations
natively provide arch_atomic_ variants that are not instrumented.
It turns out that some generic code also requires arch_atomic_ in
order to avoid instrumentation, so provide the arch_atomic_ interface
as a direct map into the regular atomic_ interface for
non-instrumented architectures.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
To ensure btf_ctx_access() is safe the verifier checks that the BTF
arg type is an int, enum, or pointer. When the function does the
BTF arg lookup it uses the calculation 'arg = off / 8' using the
fact that registers are 8B. This requires that the first arg is
in the first reg, the second in the second, and so on. However,
for __int128 the arg will consume two registers by default LLVM
implementation. So this will cause the arg layout assumed by the
'arg = off / 8' calculation to be incorrect.
Because __int128 is uncommon this patch applies the easiest fix and
will force int types to be sizeof(u64) or smaller so that they will
fit in a single register.
v2: remove unneeded parens per Andrii's feedback
Fixes: 9e15db6613 ("bpf: Implement accurate raw_tp context access via BTF")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159303723962.11287.13309537171132420717.stgit@john-Precision-5820-Tower
io_do_iopoll() won't do anything with a request unless
req->iopoll_completed is set. So io_complete_rw_iopoll() has to set
it, otherwise io_do_iopoll() will poll a file again and again even
though the request of interest was completed long time ago.
Also, remove -EAGAIN check from io_issue_sqe() as it races with
the changed lines. The request will take the long way and be
resubmitted from io_iopoll*().
io_kiocb's result and iopoll_completed")
Fixes: bbde017a32 ("io_uring: add memory barrier to synchronize
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We have a Dell AIO, there is neither internal speaker nor internal
mic, only a multi-function audio jack on it.
Users reported that after freshly installing the OS and plug
a headset to the audio jack, the headset can't output sound. I
reproduced this bug, at that moment, the Input Source is as below:
Simple mixer control 'Input Source',0
Capabilities: cenum
Items: 'Headphone Mic' 'Headset Mic'
Item0: 'Headphone Mic'
That is because the patch_realtek will set this audio jack as mic_in
mode if Input Source's value is hp_mic.
If it is not fresh installing, this issue will not happen since the
systemd will run alsactl restore -f /var/lib/alsa/asound.state, this
will set the 'Input Source' according to history value.
If there is internal speaker or internal mic, this issue will not
happen since there is valid sink/source in the pulseaudio, the PA will
set the 'Input Source' according to active_port.
To fix this issue, change the parser function to let the hs_mic be
stored ahead of hp_mic.
Cc: stable@vger.kernel.org
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20200625083833.11264-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Currently pointer phy0 is being dereferenced via the assignment of
phy on the call to phy_get_drvdata before phy0 is null checked, this
can lead to a null pointer dereference. Fix this by performing the
null check on phy0 before the call to phy_get_drvdata. Also replace
the phy0 == NULL check with the more usual !phy0 idiom.
Addresses-Coverity: ("Dereference before null check")
Fixes: e6f32efb1b ("phy: sun4i-usb: Make sure to disable PHY0 passby for peripheral mode")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20200625124428.83564-1-colin.king@canonical.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The USB3 discovery used wrong indices when tunnel is discovered. It
should use TB_USB3_PATH_DOWN for path that flows downstream and
TB_USB3_PATH_UP when it flows upstream. This should not affect the
functionality but better to fix it.
Fixes: e6f8185857 ("thunderbolt: Add support for USB 3.x tunnels")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org # v5.6+
Implement call_cpuidle_s2idle() in analogy with call_cpuidle()
for the s2idle-specific idle state entry and invoke it from
cpuidle_idle_call() to make the s2idle-specific idle entry code
path look more similar to the "regular" idle entry one.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Chen Yu <yu.c.chen@intel.com>
While rounding up CPUs via NMIs, its possible that a rounded up CPU
maybe holding a console port lock leading to kgdb master CPU stuck in
a deadlock during invocation of console write operations. A similar
deadlock could also be possible while using synchronous breakpoints.
So in order to avoid such a deadlock, set oops_in_progress to encourage
the console drivers to disregard their internal spin locks: in the
current calling context the risk of deadlock is a bigger problem than
risks due to re-entering the console driver. We operate directly on
oops_in_progress rather than using bust_spinlocks() because the calls
bust_spinlocks() makes on exit are not appropriate for this calling
context.
Suggested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/1591264879-25920-4-git-send-email-sumit.garg@linaro.org
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Without this patch, eapol frames cannot be received in mesh
mode, when 802.1X should be used. Initially only a MGTK is
defined, which is found and set as rx->key, when there are
no other keys set. ieee80211_drop_unencrypted would then
drop these eapol frames, as they are data frames without
encryption and there exists some rx->key.
Fix this by differentiating between mesh eapol frames and
other data frames with existing rx->key. Allow mesh mesh
eapol frames only if they are for our vif address.
With this patch in-place, ieee80211_rx_h_mesh_fwding continues
after the ieee80211_drop_unencrypted check and notices, that
these eapol frames have to be delivered locally, as they should.
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Link: https://lore.kernel.org/r/20200625104214.50319-1-markus.theil@tu-ilmenau.de
[small code cleanups]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Charan Teja reported a 'use-after-free' in dmabuffs_dname [1], which
happens if the dma_buf_release() is called while the userspace is
accessing the dma_buf pseudo fs's dmabuffs_dname() in another process,
and dma_buf_release() releases the dmabuf object when the last reference
to the struct file goes away.
I discussed with Arnd Bergmann, and he suggested that rather than tying
the dma_buf_release() to the file_operations' release(), we can tie it to
the dentry_operations' d_release(), which will be called when the last ref
to the dentry is removed.
The path exercised by __fput() calls f_op->release() first, and then calls
dput, which eventually calls d_op->d_release().
In the 'normal' case, when no userspace access is happening via dma_buf
pseudo fs, there should be exactly one fd, file, dentry and inode, so
closing the fd will kill of everything right away.
In the presented case, the dentry's d_release() will be called only when
the dentry's last ref is released.
Therefore, lets move dma_buf_release() from fops->release() to
d_ops->d_release()
Many thanks to Arnd for his FS insights :)
[1]: https://lore.kernel.org/patchwork/patch/1238278/
Fixes: bb2bb90304 ("dma-buf: add DMA_BUF_SET_NAME ioctls")
Reported-by: syzbot+3643a18836bce555bff6@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org> [5.3+]
Cc: Arnd Bergmann <arnd@arndb.de>
Reported-by: Charan Teja Reddy <charante@codeaurora.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Tested-by: Charan Teja Reddy <charante@codeaurora.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200611114418.19852-1-sumit.semwal@linaro.org
When using 802.1X over mesh networks, at first an ordinary
mesh peering is established, then the 802.1X EAPOL dialog
happens, afterwards an authenticated mesh peering exchange
(AMPE) happens, finally the peering is complete and we can
set the STA authorized flag.
As 802.1X is an intermediate step here and key material is
not yet exchanged for stations we have to skip mesh path lookup
for these EAPOL frames. Otherwise the already configure mesh
group encryption key would be used to send a mesh path request
which no one can decipher, because we didn't already establish
key material on both peers, like with SAE and directly using AMPE.
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Link: https://lore.kernel.org/r/20200617082637.22670-2-markus.theil@tu-ilmenau.de
[remove pointless braces, remove unnecessary local variable,
the list can only process one such frame (or its fragments)]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The initial control port tx status patch assumed, that
we have IEEE 802.11 frames, but actually ethernet frames
are stored in the ack skb. Fix this by checking for the
correct ethertype and skb protocol 802.3.
Also allow tx status reports for ETH_P_PREAUTH, as preauth
frames can also be send over the nl80211 control port.
Fixes: a7528198ad ("mac80211: support control port TX status reporting")
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20200622123542.173695-1-markus.theil@tu-ilmenau.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Doug Berger says:
====================
net: bcmgenet: use hardware padding of runt frames
Now that scatter-gather and tx-checksumming are enabled by default
it revealed a packet corruption issue that can occur for very short
fragmented packets.
When padding these frames to the minimum length it is possible for
the non-linear (fragment) data to be added to the end of the linear
header in an SKB. Since the number of fragments is read before the
padding and used afterward without reloading, the fragment that
should have been consumed can be tacked on in place of part of the
padding.
The third commit in this set corrects this by removing the software
padding and allowing the hardware to add the pad bytes if necessary.
The first two commits resolve warnings observed by the kbuild test
robot and are included here for simplicity of application.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When commit 474ea9cafc ("net: bcmgenet: correctly pad short
packets") added the call to skb_padto() it should have been
located before the nr_frags parameter was read since that value
could be changed when padding packets with lengths between 55
and 59 bytes (inclusive).
The use of a stale nr_frags value can cause corruption of the
pad data when tx-scatter-gather is enabled. This corruption of
the pad can cause invalid checksum computation when hardware
offload of tx-checksum is also enabled.
Since the original reason for the padding was corrected by
commit 7dd399130e ("net: bcmgenet: fix skb_len in
bcmgenet_xmit_single()") we can remove the software padding all
together and make use of hardware padding of short frames as
long as the hardware also always appends the FCS value to the
frame.
Fixes: 474ea9cafc ("net: bcmgenet: correctly pad short packets")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 16-bit value that holds a short in network byte order should
be declared as a restricted big endian type to allow type checks
to succeed during assignment.
Fixes: 3e37095228 ("net: bcmgenet: add support for ethtool rxnfc flows")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function was originally removed by Baoyou Xie in
commit e2072600a2 ("net: bcmgenet: remove unused function in
bcmgenet.c") to prevent a build warning.
Some of the functions removed by Baoyou Xie are now used for
WAKE_FILTER support so his commit was reverted, but this function
is still unused and the kbuild test robot dutifully reported the
warning.
This commit once again removes the remaining unused hfb functions.
Fixes: 14da1510fe ("Revert "net: bcmgenet: remove unused function in bcmgenet.c"")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Registers 8-9 are used to store measurements of the kernel and its
command line (e.g., grub2 bootloader with tpm module enabled). IMA
should include them in the boot aggregate. Registers 8-9 should be
only included in non-SHA1 digests to avoid ambiguity.
Signed-off-by: Maurizio Drocco <maurizio.drocco@ibm.com>
Reviewed-by: Bruno Meneguele <bmeneg@redhat.com>
Tested-by: Bruno Meneguele <bmeneg@redhat.com> (TPM 1.2, TPM 2.0)
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Pull erofs fix from Gao Xiang:
"Fix a regression which uses potential uninitialized high 32-bit value
unexpectedly recently observed with specific compiler options"
* tag 'erofs-for-5.8-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix partially uninitialized misuse in z_erofs_onlinepage_fixup
check that 'nft ... ct helper set <foo>' works:
1. configure ftp helper via nft and assign it to
connections on port 2121
2. check with 'conntrack -L' that the next connection
has the ftp helper attached to it.
Also add a test for auto-assign (old behaviour).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Using new helpers ip6t_unregister_table_pre_exit() and
ip6t_unregister_table_exit().
Fixes: b9e69e1273 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The pre_exit will un-register the underlying hook and .exit will do
the table freeing. The netns core does an unconditional synchronize_rcu
after the pre_exit hooks insuring no packets are in flight that have
picked up the pointer before completing the un-register.
Fixes: b9e69e1273 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Using new helpers ipt_unregister_table_pre_exit() and
ipt_unregister_table_exit().
Fixes: b9e69e1273 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The pre_exit will un-register the underlying hook and .exit will do the
table freeing. The netns core does an unconditional synchronize_rcu after
the pre_exit hooks insuring no packets are in flight that have picked up
the pointer before completing the un-register.
Fixes: b9e69e1273 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The user tool modinfo is used to get information on kernel modules, including a
description where it is available.
This patch adds a brief MODULE_DESCRIPTION to netfilter kernel modules
(descriptions taken from Kconfig file or code comments)
Signed-off-by: Rob Gill <rrobgill@protonmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When using ip_set with counters and comment, traffic causes the kernel
to panic on 32-bit ARM:
Alignment trap: not handling instruction e1b82f9f at [<bf01b0dc>]
Unhandled fault: alignment exception (0x221) at 0xea08133c
PC is at ip_set_match_extensions+0xe0/0x224 [ip_set]
The problem occurs when we try to update the 64-bit counters - the
faulting address above is not 64-bit aligned. The problem occurs
due to the way elements are allocated, for example:
set->dsize = ip_set_elem_len(set, tb, 0, 0);
map = ip_set_alloc(sizeof(*map) + elements * set->dsize);
If the element has a requirement for a member to be 64-bit aligned,
and set->dsize is not a multiple of 8, but is a multiple of four,
then every odd numbered elements will be misaligned - and hitting
an atomic64_add() on that element will cause the kernel to panic.
ip_set_elem_len() must return a size that is rounded to the maximum
alignment of any extension field stored in the element. This change
ensures that is the case.
Fixes: 95ad1f4a93 ("netfilter: ipset: Fix extension alignment")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The function kobject_init_and_add alloc memory like:
kobject_init_and_add->kobject_add_varg->kobject_set_name_vargs
->kvasprintf_const->kstrdup_const->kstrdup->kmalloc_track_caller
->kmalloc_slab, in err branch this memory not free. If use
kmemleak, this path maybe catched.
These changes are to add kobject_put in kobject_init_and_add
failed branch, fix potential memleak.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
[Why]
Regression was introduced where setting max bpc property has no effect
on the atomic check and final commit. It has the same effect as max bpc
being stuck at 8.
[How]
Correctly propagate max bpc with the new connector state.
Signed-off-by: Stylon Wang <stylon.wang@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The error DBG_STATUS_NO_MATCHING_FRAMING_MODE was added to the enum
enum dbg_status however there is a missing corresponding entry for
this in the array s_status_str. This causes an out-of-bounds read when
indexing into the last entry of s_status_str. Fix this by adding in
the missing entry.
Addresses-Coverity: ("Out-of-bounds read").
Fixes: 2d22bc8354 ("qed: FW 8.42.2.0 debug features")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jisheng Zhang says:
====================
net: phy: call phy_disable_interrupts() in phy_init_hw()
We face an issue with rtl8211f, a pin is shared between INTB and PMEB,
and the PHY Register Accessible Interrupt is enabled by default, so
the INTB/PMEB pin is always active in polling mode case.
As Heiner pointed out "I was thinking about calling
phy_disable_interrupts() in phy_init_hw(), to have a defined init
state as we don't know in which state the PHY is if the PHY driver is
loaded. We shouldn't assume that it's the chip power-on defaults, BIOS
or boot loader could have changed this. Or in case of dual-boot
systems the other OS could leave the PHY in whatever state."
patch1 makes phy_disable_interrupts() non-static so that it could be used
in phy_init_hw() to have a defined init state.
patch2 calls phy_disable_interrupts() in phy_init_hw() to have a
defined init state.
Since v3:
- call phy_disable_interrupts() have interrupts disabled first then
config_init, thank Florian
Since v2:
- Don't export phy_disable_interrupts() but just make it non-static
Since v1:
- EXPORT the correct symbol
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Call phy_disable_interrupts() in phy_init_hw() to "have a defined init
state as we don't know in which state the PHY is if the PHY driver is
loaded. We shouldn't assume that it's the chip power-on defaults, BIOS
or boot loader could have changed this. Or in case of dual-boot
systems the other OS could leave the PHY in whatever state." as pointed
out by Heiner.
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We face an issue with rtl8211f, a pin is shared between INTB and PMEB,
and the PHY Register Accessible Interrupt is enabled by default, so
the INTB/PMEB pin is always active in polling mode case.
As Heiner pointed out "I was thinking about calling
phy_disable_interrupts() in phy_init_hw(), to have a defined init
state as we don't know in which state the PHY is if the PHY driver is
loaded. We shouldn't assume that it's the chip power-on defaults, BIOS
or boot loader could have changed this. Or in case of dual-boot
systems the other OS could leave the PHY in whatever state."
Make phy_disable_interrupts() non-static so that it could be used in
phy_init_hw() to have a defined init state.
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When writing the serdes configuration register was moved to
mvneta_config_interface() the whole code block was removed from
mvneta_port_power_up() in the assumption that its only purpose was to
write the serdes configuration register. As mentioned by Russell King
its purpose was also to check for valid interface modes early so that
later in the driver we do not have to care for unexpected interface
modes.
Add back the test to let the driver bail out early on unhandled
interface modes.
Fixes: b4748553f5 ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
In mvneta_config_interface() the RGMII modes are catched by the default
case which is an error return. The RGMII modes are valid modes for the
driver, so instead of returning an error add a break statement to return
successfully.
This avoids this warning for non comphy SoCs which use RGMII, like
SolidRun Clearfog:
WARNING: CPU: 0 PID: 268 at drivers/net/ethernet/marvell/mvneta.c:3512 mvneta_start_dev+0x220/0x23c
Fixes: b4748553f5 ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver for Marvell switches puts all ports in IGMP snooping mode
which results in all IGMP/MLD frames that ingress on the ports to be
forwarded to the CPU only.
The bridge code in the kernel can then interpret these frames and act
upon them, for instance by updating the mdb in the switch to reflect
multicast memberships of stations connected to the ports. However,
the IGMP/MLD frames must then also be forwarded to other ports of the
bridge so external IGMP queriers can track membership reports, and
external multicast clients can receive query reports from foreign IGMP
queriers.
Currently, this is impossible as the EDSA tagger sets offload_fwd_mark
on the skb when it unwraps the tagged frames, and that will make the
switchdev layer prevent the skb from egressing on any other port of
the same switch.
To fix that, look at the To_CPU code in the DSA header and make
forwarding of the frame possible for trapped IGMP packets.
Introduce some #defines for the frame types to make the code a bit more
comprehensive.
This was tested on a Marvell 88E6352 variant.
Signed-off-by: Daniel Mack <daniel@zonque.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
ovs connection tracking module performs de-fragmentation on incoming
fragmented traffic. Take info account if traffic has been de-fragmented
in execute_check_pkt_len action otherwise we will perform the wrong
nested action considering the original packet size. This issue typically
occurs if ovs-vswitchd adds a rule in the pipeline that requires connection
tracking (e.g. OVN stateful ACLs) before execute_check_pkt_len action.
Moreover take into account GSO fragment size for GSO packet in
execute_check_pkt_len routine
Fixes: 4d5ec89fc8 ("net: openvswitch: Add a new action check_pkt_len")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull virtio fixes from Michael Tsirkin:
"Fixes all over the place.
This includes a couple of tests that I would normally defer, but since
they have already been helpful in catching some bugs, don't build for
any users at all, and having them upstream makes life easier for
everyone, I think it's ok even at this late stage"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
tools/virtio: Use tools/include/list.h instead of stubs
tools/virtio: Reset index in virtio_test --reset.
tools/virtio: Extract virtqueue initialization in vq_reset
tools/virtio: Use __vring_new_virtqueue in virtio_test.c
tools/virtio: Add --reset
tools/virtio: Add --batch=random option
tools/virtio: Add --batch option
virtio-mem: add memory via add_memory_driver_managed()
virtio-mem: silence a static checker warning
vhost_vdpa: Fix potential underflow in vhost_vdpa_mmap()
vdpa: fix typos in the comments for __vdpa_alloc_device()
Pull thread fix from Christian Brauner:
"This fixes a regression introduced with 303cc571d1 ("nsproxy: attach
to namespaces via pidfds").
The LTP testsuite reported a regression where users would now see
EBADF returned instead of EINVAL when an fd was passed that referred
to an open file but the file was not a namespace file.
Fix this by continuing to report EINVAL and add a regression test"
* tag 'for-linus-2020-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
tests: test for setns() EINVAL regression
nsproxy: restore EINVAL for non-namespace file descriptor
In the past we had a pile of hacks to orchestrate access between fbdev
emulation and native kms clients. We've tried to streamline this, by
always preferring the kms side above fbdev calls when a drm master
exists, because drm master controls access to the display resources.
Unfortunately this breaks existing userspace, specifically Xorg. When
exiting Xorg first restores the console to text mode using the KDSET
ioctl on the vt. This does nothing, because a drm master is still
around. Then it drops the drm master status, which again does nothing,
because logind is keeping additional drm fd open to be able to
orchestrate vt switches. In the past this is the point where fbdev was
restored, as part of the ->lastclose hook on the drm side.
Now to fix this regression we don't want to go back to letting fbdev
restore things whenever it feels like, or to the pile of hacks we've
had before. Instead try and go with a minimal exception to make the
KDSET case work again, and nothing else.
This means that if userspace does a KDSET call when switching between
graphical compositors, there will be some flickering with fbcon
showing up for a bit. But a) that's not a regression and b) userspace
can fix it by improving the vt switching dance - logind should have
all the information it needs.
While pondering all this I'm also wondering wheter we should have a
SWITCH_MASTER ioctl to allow race-free master status handover. But
that's for another day.
v2: Somehow forgot to cc all the fbdev people.
v3: Fix typo Alex spotted.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208179
Cc: shlomo@fastmail.com
Reported-and-Tested-by: shlomo@fastmail.com
Cc: Michel Dänzer <michel@daenzer.net>
Fixes: 64914da24e ("drm/fbdev-helper: don't force restores")
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.7+
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Qiujun Huang <hqjagain@gmail.com>
Cc: Peter Rosin <peda@axentia.se>
Cc: linux-fbdev@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200624092910.3280448-1-daniel.vetter@ffwll.ch
When running iperf in a two host configuration the following trace can
occur:
[ 319.728730] NETDEV WATCHDOG: ib0 (hfi1): transmit queue 0 timed out
The issue happens because the current implementation relies on the netif
txq being stopped to control the flushing of the tx list.
There are two resources that the transmit logic can wait on and stop the
txq:
- SDMA descriptors
- Ring space to hold completions
The ring space is tested on the sending side and relieved when the ring is
consumed in the napi tx reaping.
Unfortunately, that reaping can run conncurrently with the workqueue
flushing of the txlist. If the txq is started just before the workitem
executes, the txlist will never be flushed, leading to the txq being
stuck.
Fix by:
- Adding sleep/wakeup wrappers
* Use an atomic to control the call to the netif routines inside the
wrappers
- Use another atomic to record ring space exhaustion
* Only wakeup when the a ring space exhaustion has happened and it
relieved
Add additional wrappers to clarify the ring space resource handling.
Fixes: d99dc602e2 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20200623204327.108092.4024.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The current code mishandles -EBUSY in two ways:
- The flow change doesn't test the return from the flush and runs on to
process the current packet racing with the wakeup processing
- The -EBUSY handling for a single packet inserts the tx into the txlist
after the submit call, racing with the same wakeup processing
Fix the first by dropping the skb and returning NETDEV_TX_OK.
Fix the second by insuring the the list entry within the txreq is inited
when allocated. This enables the sleep routine to detect that the txreq
has used the non-list api and queue the packet to the txlist.
Both flaws can lead to having the flushing thread executing in causing two
threads to manipulate the txlist.
Fixes: d99dc602e2 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20200623204321.108092.83898.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Per the datasheet for max6697, OVERT mask and ALERT mask are different.
For example, the 7th bit of OVERT is the local channel but for alert
mask, the 6th bit is the local channel. Therefore, we can't apply the
same mask for both registers. In addition to that, the max6697 driver
is supposed to be compatibale with different models. I manually went over
all the listed chips and made sure all chip types have the same layout.
Testing;
mask value of 0x9 should map to 0x44 for ALERT and 0x84 for OVERT.
I used iotool to read the reg value back to verify. I only tested this
change on max6581.
Reference:
https://datasheets.maximintegrated.com/en/ds/MAX6581.pdfhttps://datasheets.maximintegrated.com/en/ds/MAX6697.pdfhttps://datasheets.maximintegrated.com/en/ds/MAX6699.pdf
Signed-off-by: Chu Lin <linchuyuan@google.com>
Fixes: 5372d2d71c ("hwmon: Driver for Maxim MAX6697 and compatibles")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The mpath disk node takes a reference on the request mpath
request queue when adding live path to the mpath gendisk.
However if we connected to an inaccessible path device_add_disk
is not called, so if we disconnect and remove the mpath gendisk
we endup putting an reference on the request queue that was
never taken [1].
Fix that to check if we ever added a live path (using
NVME_NS_HEAD_HAS_DISK flag) and if not, clear the disk->queue
reference.
[1]:
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 1 PID: 1372 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
CPU: 1 PID: 1372 Comm: nvme Tainted: G O 5.7.0-rc2+ #3
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1 04/01/2014
RIP: 0010:refcount_warn_saturate+0xa6/0xf0
RSP: 0018:ffffb29e8053bdc0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8b7a2f4fc060 RCX: 0000000000000007
RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff8b7a3ec99980
RBP: ffff8b7a2f4fc000 R08: 00000000000002e1 R09: 0000000000000004
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: fffffffffffffff2 R14: ffffb29e8053bf08 R15: ffff8b7a320e2da0
FS: 00007f135d4ca800(0000) GS:ffff8b7a3ec80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005651178c0c30 CR3: 000000003b650005 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
disk_release+0xa2/0xc0
device_release+0x28/0x80
kobject_put+0xa5/0x1b0
nvme_put_ns_head+0x26/0x70 [nvme_core]
nvme_put_ns+0x30/0x60 [nvme_core]
nvme_remove_namespaces+0x9b/0xe0 [nvme_core]
nvme_do_delete_ctrl+0x43/0x5c [nvme_core]
nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
kernfs_fop_write+0xc1/0x1a0
vfs_write+0xb6/0x1a0
ksys_write+0x5f/0xe0
do_syscall_64+0x52/0x1a0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported-by: Anton Eidelman <anton@lightbitslabs.com>
Tested-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In the following scenario scan_work and ana_work will deadlock:
When scan_work calls nvme_mpath_add_disk() this holds ana_lock
and invokes nvme_parse_ana_log(), which may issue IO
in device_add_disk() and hang waiting for an accessible path.
While nvme_mpath_set_live() only called when nvme_state_is_live(),
a transition may cause NVME_SC_ANA_TRANSITION and requeue the IO.
Since nvme_mpath_set_live() holds ns->head->lock, an ana_work on
ANY ctrl will not be able to complete nvme_mpath_set_live()
on the same ns->head, which is required in order to update
the new accessible path and remove NVME_NS_ANA_PENDING..
Therefore IO never completes: deadlock [1].
Fix:
Move device_add_disk out of the head->lock and protect it with an
atomic test_and_set for a new NVME_NS_HEAD_HAS_DISK bit.
[1]:
kernel: INFO: task kworker/u8:2:160 blocked for more than 120 seconds.
kernel: Tainted: G OE 5.3.5-050305-generic #201910071830
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: kworker/u8:2 D 0 160 2 0x80004000
kernel: Workqueue: nvme-wq nvme_ana_work [nvme_core]
kernel: Call Trace:
kernel: __schedule+0x2b9/0x6c0
kernel: schedule+0x42/0xb0
kernel: schedule_preempt_disabled+0xe/0x10
kernel: __mutex_lock.isra.0+0x182/0x4f0
kernel: __mutex_lock_slowpath+0x13/0x20
kernel: mutex_lock+0x2e/0x40
kernel: nvme_update_ns_ana_state+0x22/0x60 [nvme_core]
kernel: nvme_update_ana_state+0xca/0xe0 [nvme_core]
kernel: nvme_parse_ana_log+0xa1/0x180 [nvme_core]
kernel: nvme_read_ana_log+0x76/0x100 [nvme_core]
kernel: nvme_ana_work+0x15/0x20 [nvme_core]
kernel: process_one_work+0x1db/0x380
kernel: worker_thread+0x4d/0x400
kernel: kthread+0x104/0x140
kernel: ret_from_fork+0x35/0x40
kernel: INFO: task kworker/u8:4:439 blocked for more than 120 seconds.
kernel: Tainted: G OE 5.3.5-050305-generic #201910071830
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: kworker/u8:4 D 0 439 2 0x80004000
kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
kernel: Call Trace:
kernel: __schedule+0x2b9/0x6c0
kernel: schedule+0x42/0xb0
kernel: io_schedule+0x16/0x40
kernel: do_read_cache_page+0x438/0x830
kernel: read_cache_page+0x12/0x20
kernel: read_dev_sector+0x27/0xc0
kernel: read_lba+0xc1/0x220
kernel: efi_partition+0x1e6/0x708
kernel: check_partition+0x154/0x244
kernel: rescan_partitions+0xae/0x280
kernel: __blkdev_get+0x40f/0x560
kernel: blkdev_get+0x3d/0x140
kernel: __device_add_disk+0x388/0x480
kernel: device_add_disk+0x13/0x20
kernel: nvme_mpath_set_live+0x119/0x140 [nvme_core]
kernel: nvme_update_ns_ana_state+0x5c/0x60 [nvme_core]
kernel: nvme_mpath_add_disk+0xbe/0x100 [nvme_core]
kernel: nvme_validate_ns+0x396/0x940 [nvme_core]
kernel: nvme_scan_work+0x256/0x390 [nvme_core]
kernel: process_one_work+0x1db/0x380
kernel: worker_thread+0x4d/0x400
kernel: kthread+0x104/0x140
kernel: ret_from_fork+0x35/0x40
Fixes: 0d0b660f21 ("nvme: add ANA support")
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Right now ns->head->lock is protecting namespace mutation
which is wrong and unneeded. Move it to only protect
against head mutations. While we're at it, remove unnecessary
ns->head reference as we already have head pointer.
The problem with this is that the head->lock spans
mpath disk node I/O that may block under some conditions (if
for example the controller is disconnecting or the path
became inaccessible), The locking scheme does not allow any
other path to enable itself, preventing blocked I/O to complete
and forward-progress from there.
This is a preparation patch for the fix in a subsequent patch
where the disk I/O will also be done outside the head->lock.
Fixes: 0d0b660f21 ("nvme: add ANA support")
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When scan_work calls nvme_mpath_add_disk() this holds ana_lock
and invokes nvme_parse_ana_log(), which may issue IO
in device_add_disk() and hang waiting for an accessible path.
While nvme_mpath_set_live() only called when nvme_state_is_live(),
a transition may cause NVME_SC_ANA_TRANSITION and requeue the IO.
In order to recover and complete the IO ana_work on the same ctrl
should be able to update the path state and remove NVME_NS_ANA_PENDING.
The deadlock occurs because scan_work keeps holding ana_lock,
so ana_work hangs [1].
Fix:
Now nvme_mpath_add_disk() uses nvme_parse_ana_log() to obtain a copy
of the ANA group desc, and then calls nvme_update_ns_ana_state() without
holding ana_lock.
[1]:
kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
kernel: Call Trace:
kernel: __schedule+0x2b9/0x6c0
kernel: schedule+0x42/0xb0
kernel: io_schedule+0x16/0x40
kernel: do_read_cache_page+0x438/0x830
kernel: read_cache_page+0x12/0x20
kernel: read_dev_sector+0x27/0xc0
kernel: read_lba+0xc1/0x220
kernel: efi_partition+0x1e6/0x708
kernel: check_partition+0x154/0x244
kernel: rescan_partitions+0xae/0x280
kernel: __blkdev_get+0x40f/0x560
kernel: blkdev_get+0x3d/0x140
kernel: __device_add_disk+0x388/0x480
kernel: device_add_disk+0x13/0x20
kernel: nvme_mpath_set_live+0x119/0x140 [nvme_core]
kernel: nvme_update_ns_ana_state+0x5c/0x60 [nvme_core]
kernel: nvme_set_ns_ana_state+0x1e/0x30 [nvme_core]
kernel: nvme_parse_ana_log+0xa1/0x180 [nvme_core]
kernel: nvme_mpath_add_disk+0x47/0x90 [nvme_core]
kernel: nvme_validate_ns+0x396/0x940 [nvme_core]
kernel: nvme_scan_work+0x24f/0x380 [nvme_core]
kernel: process_one_work+0x1db/0x380
kernel: worker_thread+0x249/0x400
kernel: kthread+0x104/0x140
kernel: Workqueue: nvme-wq nvme_ana_work [nvme_core]
kernel: Call Trace:
kernel: __schedule+0x2b9/0x6c0
kernel: schedule+0x42/0xb0
kernel: schedule_preempt_disabled+0xe/0x10
kernel: __mutex_lock.isra.0+0x182/0x4f0
kernel: ? __switch_to_asm+0x34/0x70
kernel: ? select_task_rq_fair+0x1aa/0x5c0
kernel: ? kvm_sched_clock_read+0x11/0x20
kernel: ? sched_clock+0x9/0x10
kernel: __mutex_lock_slowpath+0x13/0x20
kernel: mutex_lock+0x2e/0x40
kernel: nvme_read_ana_log+0x3a/0x100 [nvme_core]
kernel: nvme_ana_work+0x15/0x20 [nvme_core]
kernel: process_one_work+0x1db/0x380
kernel: worker_thread+0x4d/0x400
kernel: kthread+0x104/0x140
kernel: ? process_one_work+0x380/0x380
kernel: ? kthread_park+0x80/0x80
kernel: ret_from_fork+0x35/0x40
Fixes: 0d0b660f21 ("nvme: add ANA support")
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Revert fab7772bfb ("nvme-multipath: revalidate nvme_ns_head gendisk
in nvme_validate_ns")
When adding a new namespace to the head disk (via nvme_mpath_set_live)
we will see partition scan which triggers I/O on the mpath device node.
This process will usually be triggered from the scan_work which holds
the scan_lock. If I/O blocks (if we got ana change currently have only
available paths but none are accessible) this can deadlock on the head
disk bd_mutex as both partition scan I/O takes it, and head disk revalidation
takes it to check for resize (also triggered from scan_work on a different
path). See trace [1].
The mpath disk revalidation was originally added to detect online disk
size change, but this is no longer needed since commit cb224c3af4
("nvme: Convert to use set_capacity_revalidate_and_notify") which already
updates resize info without unnecessarily revalidating the disk (the
mpath disk doesn't even implement .revalidate_disk fop).
[1]:
--
kernel: INFO: task kworker/u65:9:494 blocked for more than 241 seconds.
kernel: Tainted: G OE 5.3.5-050305-generic #201910071830
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: kworker/u65:9 D 0 494 2 0x80004000
kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
kernel: Call Trace:
kernel: __schedule+0x2b9/0x6c0
kernel: schedule+0x42/0xb0
kernel: schedule_preempt_disabled+0xe/0x10
kernel: __mutex_lock.isra.0+0x182/0x4f0
kernel: __mutex_lock_slowpath+0x13/0x20
kernel: mutex_lock+0x2e/0x40
kernel: revalidate_disk+0x63/0xa0
kernel: __nvme_revalidate_disk+0xfe/0x110 [nvme_core]
kernel: nvme_revalidate_disk+0xa4/0x160 [nvme_core]
kernel: ? evict+0x14c/0x1b0
kernel: revalidate_disk+0x2b/0xa0
kernel: nvme_validate_ns+0x49/0x940 [nvme_core]
kernel: ? blk_mq_free_request+0xd2/0x100
kernel: ? __nvme_submit_sync_cmd+0xbe/0x1e0 [nvme_core]
kernel: nvme_scan_work+0x24f/0x380 [nvme_core]
kernel: process_one_work+0x1db/0x380
kernel: worker_thread+0x249/0x400
kernel: kthread+0x104/0x140
kernel: ? process_one_work+0x380/0x380
kernel: ? kthread_park+0x80/0x80
kernel: ret_from_fork+0x1f/0x40
...
kernel: INFO: task kworker/u65:1:2630 blocked for more than 241 seconds.
kernel: Tainted: G OE 5.3.5-050305-generic #201910071830
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: kworker/u65:1 D 0 2630 2 0x80004000
kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
kernel: Call Trace:
kernel: __schedule+0x2b9/0x6c0
kernel: schedule+0x42/0xb0
kernel: io_schedule+0x16/0x40
kernel: do_read_cache_page+0x438/0x830
kernel: ? __switch_to_asm+0x34/0x70
kernel: ? file_fdatawait_range+0x30/0x30
kernel: read_cache_page+0x12/0x20
kernel: read_dev_sector+0x27/0xc0
kernel: read_lba+0xc1/0x220
kernel: ? kmem_cache_alloc_trace+0x19c/0x230
kernel: efi_partition+0x1e6/0x708
kernel: ? vsnprintf+0x39e/0x4e0
kernel: ? snprintf+0x49/0x60
kernel: check_partition+0x154/0x244
kernel: rescan_partitions+0xae/0x280
kernel: __blkdev_get+0x40f/0x560
kernel: blkdev_get+0x3d/0x140
kernel: __device_add_disk+0x388/0x480
kernel: device_add_disk+0x13/0x20
kernel: nvme_mpath_set_live+0x119/0x140 [nvme_core]
kernel: nvme_update_ns_ana_state+0x5c/0x60 [nvme_core]
kernel: nvme_set_ns_ana_state+0x1e/0x30 [nvme_core]
kernel: nvme_parse_ana_log+0xa1/0x180 [nvme_core]
kernel: ? nvme_update_ns_ana_state+0x60/0x60 [nvme_core]
kernel: nvme_mpath_add_disk+0x47/0x90 [nvme_core]
kernel: nvme_validate_ns+0x396/0x940 [nvme_core]
kernel: ? blk_mq_free_request+0xd2/0x100
kernel: nvme_scan_work+0x24f/0x380 [nvme_core]
kernel: process_one_work+0x1db/0x380
kernel: worker_thread+0x249/0x400
kernel: kthread+0x104/0x140
kernel: ? process_one_work+0x380/0x380
kernel: ? kthread_park+0x80/0x80
kernel: ret_from_fork+0x1f/0x40
--
Fixes: fab7772bfb ("nvme-multipath: revalidate nvme_ns_head gendisk
in nvme_validate_ns")
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The completion vector index that is given during CQ creation can't
exceed the number of support vectors by the underlying RDMA device. This
violation currently can accure, for example, in case one will try to
connect with N regular read/write queues and M poll queues and the sum
of N + M > num_supported_vectors. This will lead to failure in establish
a connection to remote target. Instead, in that case, share a completion
vector between queues.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Both admin's and drive's tagsets should be set according the numa
node of the controller.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Both admin's and drive's tagsets should be set according the numa
node of the controller.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Both admin's and drive's tagsets should be set according the numa node
of the controller.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Set the node value according to the PCI device numa node.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Initialize the node to NUMA_NO_NODE value. Transports that are aware of
numa node affinity can override it (e.g. RDMA transport set the affinity
according to the RDMA HCA).
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This driver assumed that dmaengine_tx_status() could return
the residue even if the transfer was completed. However,
this was not correct usage [1] and this caused to break getting
the residue after the commit 24461d9792 ("dmaengine:
virt-dma: Fix access after free in vchan_complete()") actually.
So, this is possible to get wrong received size if the usb
controller gets a short packet. For example, g_zero driver
causes "bad OUT byte" errors.
The usb-dmac driver will support the callback_result, so this
driver can use it to get residue correctly. Note that even if
the usb-dmac driver has not supported the callback_result yet,
this patch doesn't cause any side-effects.
[1]
https://lore.kernel.org/dmaengine/20200616165550.GP2324254@vkoul-mobl/
Reported-by: Hien Dang <hien.dang.eb@renesas.com>
Fixes: 24461d9792 ("dmaengine: virt-dma: Fix access after free in vchan_complete()")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/1592482277-19563-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
clk_s is checked twice in a row in ni_init_smc_spll_table().
fb_div should be checked instead.
Fixes: 69e0b57a91 ("drm/radeon/kms: add dpm support for cayman (v5)")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
USB2 devices with LPM enabled may interrupt the system suspend:
[ 932.510475] usb 1-7: usb suspend, wakeup 0
[ 932.510549] hub 1-0:1.0: hub_suspend
[ 932.510581] usb usb1: bus suspend, wakeup 0
[ 932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended
[ 932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended
..
[ 932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03
..
[ 932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
[ 932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
[ 932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16
During system suspend, USB core will let HC suspends the device if it
doesn't have remote wakeup enabled and doesn't have any children.
However, from the log above we can see that the usb 1-7 doesn't get bus
suspended due to not in U0. After a while the port finished U2 -> U0
transition, interrupts the suspend process.
The observation is that after disabling LPM, port doesn't transit to U0
immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
the affected device is advertised as 400us, which is still not enough
based on my testing result.
So let's use the maximum permitted latency, 10000, to poll for U0
status to solve the issue.
Cc: stable@vger.kernel.org
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20200624135949.22611-6-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Unable to complete the enumeration of a USB TV Tuner device.
Per XHCI spec (4.6.5), the EP state field of the input context shall
be cleared for a set address command. In the special case of an FS
device that has "MaxPacketSize0 = 8", the Linux XHCI driver does
not do this before evaluating the context. With an XHCI controller
that checks the EP state field for parameter context error this
causes a problem in cases such as the device getting reset again
after enumeration.
When that field is cleared, the problem does not occur.
This was found and fixed by Sasi Kumar.
Cc: stable@vger.kernel.org
Signed-off-by: Al Cooper <alcooperx@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20200624135949.22611-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It should use the correct direction value from register, not depends
on previous software setting. It fixed the EP number wrong issue at
trace when the TRBERR interrupt occurs for EP0IN.
When the EP0IN IOC has finished, software prepares the setup packet
request, the expected direction is OUT, but at that time, the TRBERR
for EP0IN may occur since it is DMULT mode, the DMA does not stop
until TRBERR has met.
Fixes: 7733f6c32e ("usb: cdns3: Add Cadence USB3 DRD Driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Pawel Laszczak <pawell@cadence.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Link: https://lore.kernel.org/r/20200623030918.8409-3-peter.chen@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The arm64 signal tests generate warnings during build since both they and
the toplevel lib.mk define a clean target:
Makefile:25: warning: overriding recipe for target 'clean'
../../lib.mk:126: warning: ignoring old recipe for target 'clean'
Since the inclusion of lib.mk is in the signal Makefile there is no
situation where this warning could be avoided so just remove the redundant
clean target.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20200624104933.21125-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Some ftrace features are broken since commit 714a8d02ca ("arm64: asm:
Override SYM_FUNC_START when building the kernel with BTI"). For example
the function_graph tracer:
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
[ 36.107016] WARNING: CPU: 0 PID: 115 at kernel/trace/ftrace.c:2691 ftrace_modify_all_code+0xc8/0x14c
When ftrace_modify_graph_caller() attempts to write a branch at
ftrace_graph_call, it finds the "BTI J" instruction inserted by
SYM_INNER_LABEL() instead of a NOP, and aborts.
It turns out we don't currently need the BTI landing pads inserted by
SYM_INNER_LABEL:
* ftrace_call and ftrace_graph_call are only used for runtime patching
of the active tracer. The patched code is not reached from a branch.
* install_el2_stub is reached from a CBZ instruction, which doesn't
change PSTATE.BTYPE.
* __guest_exit is reached from B instructions in the hyp-entry vectors,
which aren't subject to BTI checks either.
Remove the BTI annotation from SYM_INNER_LABEL.
Fixes: 714a8d02ca ("arm64: asm: Override SYM_FUNC_START when building the kernel with BTI")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20200624112253.1602786-1-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
clk_div_table and wiz_regmap_config are not modified and can therefore
be made const to allow the compiler to put them in read-only memory.
Before:
text data bss dec hex filename
20265 7044 64 27373 6aed drivers/phy/ti/phy-j721e-wiz.o
After:
text data bss dec hex filename
20649 6660 64 27373 6aed drivers/phy/ti/phy-j721e-wiz.o
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20200524095516.25227-3-rikard.falkeborn@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
regmap_config is not modified and can be made static to allow the compiler
to put it in read-only memory.
Before:
text data bss dec hex filename
12328 3644 64 16036 3ea4 drivers/phy/ti/phy-am654-serdes.o
After:
text data bss dec hex filename
12648 3324 64 16036 3ea4 drivers/phy/ti/phy-am654-serdes.o
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20200524095516.25227-2-rikard.falkeborn@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The USB-audio mixer code holds a linked list of usb_mixer_elem_list,
and several operations are performed for each mixer element. A few of
them (snd_usb_mixer_notify_id() and snd_usb_mixer_interrupt_v2())
assume each mixer element being a usb_mixer_elem_info object that is a
subclass of usb_mixer_elem_list, cast via container_of() and access it
members. This may result in an out-of-bound access when a
non-standard list element has been added, as spotted by syzkaller
recently.
This patch adds a new field, is_std_info, in usb_mixer_elem_list to
indicate that the element is the usb_mixer_elem_info type or not, and
skip the access to such an element if needed.
Reported-by: syzbot+fb14314433463ad51625@syzkaller.appspotmail.com
Reported-by: syzbot+2405ca3401e943c538b5@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200624122340.9615-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit 87676cfca1 ("arm64: vdso: Disable dwarf unwinding through the
sigreturn trampoline") unconditionally passes the '--no-eh-frame-hdr'
option to the linker when building the native vDSO in an attempt to
prevent generation of the .eh_frame_hdr section, the presence of which
has been implicated in segfaults originating from the libgcc unwinder.
Unfortunately, not all versions of binutils support this option, which
has been shown to cause build failures in linux-next:
| CALL scripts/atomic/check-atomics.sh
| CALL scripts/checksyscalls.sh
| LD arch/arm64/kernel/vdso/vdso.so.dbg
| ld: unrecognized option '--no-eh-frame-hdr'
| ld: use the --help option for usage information
| arch/arm64/kernel/vdso/Makefile:64: recipe for target
| 'arch/arm64/kernel/vdso/vdso.so.dbg' failed
| make[1]: *** [arch/arm64/kernel/vdso/vdso.so.dbg] Error 1
| arch/arm64/Makefile:175: recipe for target 'vdso_prepare' failed
| make: *** [vdso_prepare] Error 2
Only link the vDSO with '--no-eh-frame-hdr' when the linker supports it.
If we end up with the section due to linker defaults, the absence of CFI
information in the sigreturn trampoline will prevent the unwinder from
breaking.
Link: https://lore.kernel.org/r/7a7e31a8-9a7b-2428-ad83-2264f20bdc2d@hisilicon.com
Fixes: 87676cfca1 ("arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline")
Reported-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
clang points out that a local variable is initialized with
an enum value of the wrong type:
drivers/phy/intel/phy-intel-combo.c:202:34: error: implicit conversion from enumeration type 'enum intel_phy_mode' to different enumeration type 'enum intel_combo_mode' [-Werror,-Wenum-conversion]
enum intel_combo_mode cb_mode = PHY_PCIE_MODE;
~~~~~~~ ^~~~~~~~~~~~~
>From reading the code, it seems that this was not only the
wrong type, but not even supposed to be a code path that can
happen in practice.
Change the code to have no default phy mode but instead return an
error for invalid input.
Fixes: ac0a95a3ea ("phy: intel: Add driver support for ComboPhy")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Dilip Kota <eswara.kota@linux.intel.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://lore.kernel.org/r/20200527134518.908624-1-arnd@arndb.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
FIELD_PREP expects constant arguments. Istead of doing FIELD_PREP
operation on the arguments of combo_phy_w32_off_mask(), pass the
final FIELD_PREP value as an argument.
Error reported as:
In file included from include/linux/build_bug.h:5,
from include/linux/bitfield.h:10,
from drivers/phy/intel/phy-intel-combo.c:8:
drivers/phy/intel/phy-intel-combo.c: In function 'combo_phy_w32_off_mask':
include/linux/bitfield.h:52:28: warning: comparison is always false due to limited range of data type [-Wtype-limits]
include/linux/compiler.h:350:38: error: call to '__compiletime_assert_37' declared with attribute error: FIELD_PREP: mask is not constant
94 | __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: "); | ^~~~~~~~~~~~~~~~
drivers/phy/intel/phy-intel-combo.c:137:13: note: in expansion of macro 'FIELD_PREP'
137 | reg_val |= FIELD_PREP(mask, val);
| ^~~~~~~~~~
../include/linux/compiler.h:392:38: error: call to__compiletime_assert_137
declared with attribute error:
BUILD_BUG_ON failed: (((mask) + (1ULL << (__builtin_ffsll(mask) - 1))) & (((mask) + (1ULL << (__builtin_ffsll(mask) - 1))) - 1)) != 0
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
../include/linux/bitfield.h:94:3: note: in expansion of macro __BF_FIELD_CHECK
__BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: "); \
^~~~~~~~~~~~~~~~
../drivers/phy/intel/phy-intel-combo.c:137:13: note: in expansion of macro FIELD_PREP
reg_val |= FIELD_PREP(mask, val);
^~~~~~~~~~
Fixes: ac0a95a3ea ("phy: intel: Add driver support for ComboPhy")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Dilip Kota <eswara.kota@linux.intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/8a309dd3c238efbaa59d1649704255d6f8b6c9c5.1590575358.git.eswara.kota@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In GAUDI the current timer value for the hardware to check if it is
in IDLE state is too low. As a result, there are occasions where the H/W
wrongly reports it is not IDLE. The driver checks that before submitting
work on behalf of the driver during initialization, so a false report might
cause the driver to fail during device initialization.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
As this is a cypress HID->COM RS232 style device that is handled
by the cypress_M8 driver we also need to add it to the ignore list
in hid-quirks.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: James Hilliard <james.hilliard1@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The Maxxter KB-BT-001 Bluetooth keyboard, which looks somewhat like the
Apple Wireless Keyboard, is using the vendor and product IDs (05AC:0239)
of the Apple Wireless Keyboard (2009 ANSI version) <sigh>.
But its F1 - F10 keys are marked as sending F1 - F10, not the special
functions hid-apple.c maps them too; and since its descriptors do not
contain the HID_UP_CUSTOM | 0x0003 usage apple-hid looks for for the
Fn-key, apple_setup_input() never gets called, so F1 - F6 are mapped
to key-codes which have not been set in the keybit array causing them
to not send any events at all.
The lack of a usage code matching the Fn key in the clone is actually
useful as this allows solving this problem in a generic way.
This commits adds a fn_found flag and it adds a input_configured
callback which checks if this flag is set once all usages have been
mapped. If it is not set, then assume this is a clone and clear the
quirks bitmap so that the hid-apple code does not add any special
handling to this keyboard.
This fixes F1 - F6 not sending anything at all and F7 - F12 sending
the wrong codes on the Maxxter KB-BT-001 Bluetooth keyboard and on
similar clones.
Cc: Joao Moreno <mail@joaomoreno.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
My last name changed to "Rheinsberg", so update the maintainer entries
and adjust the emails while at it.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
On Toradex Colibri VF50 (Vybrid VF5xx) with fsl-edma driver NULL pointer
exception happens occasionally on serial output initiated by login
timeout.
This was reproduced only if kernel was built with significant debugging
options and EDMA driver is used with serial console.
Issue looks like a race condition between interrupt handler
fsl_edma_tx_handler() (called as a result of fsl_edma_xfer_desc()) and
terminating the transfer with fsl_edma_terminate_all().
The fsl_edma_tx_handler() handles interrupt for a transfer with already
freed edesc and idle==true.
The mcf-edma driver shares design and lot of code with fsl-edma. It
looks like being affected by same problem. Fix this pattern the same
way as fix for fsl-edma driver.
Fixes: e7a3ff92ea ("dmaengine: fsl-edma: add ColdFire mcf5441x edma support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Robin Gong <yibin.gong@nxp.com>
Link: https://lore.kernel.org/r/1591881665-25592-1-git-send-email-krzk@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
NULL pointer exception happens occasionally on serial output initiated
by login timeout. This was reproduced only if kernel was built with
significant debugging options and EDMA driver is used with serial
console.
col-vf50 login: root
Password:
Login timed out after 60 seconds.
Unable to handle kernel NULL pointer dereference at virtual address 00000044
Internal error: Oops: 5 [#1] ARM
CPU: 0 PID: 157 Comm: login Not tainted 5.7.0-next-20200610-dirty #4
Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
(fsl_edma_tx_handler) from [<8016eb10>] (__handle_irq_event_percpu+0x64/0x304)
(__handle_irq_event_percpu) from [<8016eddc>] (handle_irq_event_percpu+0x2c/0x7c)
(handle_irq_event_percpu) from [<8016ee64>] (handle_irq_event+0x38/0x5c)
(handle_irq_event) from [<801729e4>] (handle_fasteoi_irq+0xa4/0x160)
(handle_fasteoi_irq) from [<8016ddcc>] (generic_handle_irq+0x34/0x44)
(generic_handle_irq) from [<8016e40c>] (__handle_domain_irq+0x54/0xa8)
(__handle_domain_irq) from [<80508bc8>] (gic_handle_irq+0x4c/0x80)
(gic_handle_irq) from [<80100af0>] (__irq_svc+0x70/0x98)
Exception stack(0x8459fe80 to 0x8459fec8)
fe80: 72286b00 e3359f64 00000001 0000412d a0070013 85c98840 85c98840 a0070013
fea0: 8054e0d4 00000000 00000002 00000000 00000002 8459fed0 8081fbe8 8081fbec
fec0: 60070013 ffffffff
(__irq_svc) from [<8081fbec>] (_raw_spin_unlock_irqrestore+0x30/0x58)
(_raw_spin_unlock_irqrestore) from [<8056cb48>] (uart_flush_buffer+0x88/0xf8)
(uart_flush_buffer) from [<80554e60>] (tty_ldisc_hangup+0x38/0x1ac)
(tty_ldisc_hangup) from [<8054c7f4>] (__tty_hangup+0x158/0x2bc)
(__tty_hangup) from [<80557b90>] (disassociate_ctty.part.1+0x30/0x23c)
(disassociate_ctty.part.1) from [<8011fc18>] (do_exit+0x580/0xba0)
(do_exit) from [<801214f8>] (do_group_exit+0x3c/0xb4)
(do_group_exit) from [<80121580>] (__wake_up_parent+0x0/0x14)
Issue looks like race condition between interrupt handler fsl_edma_tx_handler()
(called as result of fsl_edma_xfer_desc()) and terminating the transfer with
fsl_edma_terminate_all().
The fsl_edma_tx_handler() handles interrupt for a transfer with already freed
edesc and idle==true.
Fixes: d6be34fbd3 ("dma: Add Freescale eDMA engine driver support")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Robin Gong <yibin.gong@nxp.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1591877861-28156-2-git-send-email-krzk@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In commit ed17b8d377 ("xfrm: fix a warning in xfrm_policy_insert_list"),
it would take 'priority' to make a policy unique, and allow duplicated
policies with different 'priority' to be added, which is not expected
by userland, as Tobias reported in strongswan.
To fix this duplicated policies issue, and also fix the issue in
commit ed17b8d377 ("xfrm: fix a warning in xfrm_policy_insert_list"),
when doing add/del/get/update on user interfaces, this patch is to change
to look up a policy with both mark and mask by doing:
mark.v == pol->mark.v && mark.m == pol->mark.m
and leave the check:
(mark & pol->mark.m) == pol->mark.v
for tx/rx path only.
As the userland expects an exact mark and mask match to manage policies.
v1->v2:
- make xfrm_policy_mark_match inline and fix the changelog as
Tobias suggested.
Fixes: 295fae5688 ("xfrm: Allow user space manipulation of SPD mark")
Fixes: ed17b8d377 ("xfrm: fix a warning in xfrm_policy_insert_list")
Reported-by: Tobias Brunner <tobias@strongswan.org>
Tested-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
If this is in "transceiver" mode the the ->qwork isn't required and is
a NULL pointer. This can lead to a NULL dereference when we call
destroy_workqueue(udc->qwork).
Fixes: 3517c31a8e ("usb: gadget: mv_udc: use devm_xxx for probe")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
The current number of KVM_IRQCHIP_NUM_PINS results in an order 3
allocation (32kb) for each guest start/restart which can result in OOM
killer activity when kernel memory is fragmented enough.
This fix reduces the number of iopins as s390 doesn't use them, hence
reducing the memory footprint.
In the function tegra_usb_phy_probe(), if usb_add_phy_dev() failed,
the return value will be given to err, and if usb_add_phy_dev() succeed,
the return value will be zero. Thus it is unnecessary to repeated check
here.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
dwc3_pci_resume_work() calls pm_runtime_get_sync() that increments
the reference counter. In case of failure, decrement the reference
before returning.
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
The other thread may access other endpoints when the cdns3_check_new_setup
is handling, add spinlock to protect it.
Cc: <stable@vger.kernel.org>
Fixes: 7733f6c32e ("usb: cdns3: Add Cadence USB3 DRD Driver")
Reviewed-by: Pawel Laszczak <pawell@cadence.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
It should use the correct direction value from register, not depends
on previous software setting. It fixed the EP number wrong issue at
trace when the TRBERR interrupt occurs for EP0IN.
When the EP0IN IOC has finished, software prepares the setup packet
request, the expected direction is OUT, but at that time, the TRBERR
for EP0IN may occur since it is DMULT mode, the DMA does not stop
until TRBERR has met.
Cc: <stable@vger.kernel.org>
Fixes: 7733f6c32e ("usb: cdns3: Add Cadence USB3 DRD Driver")
Reviewed-by: Pawel Laszczak <pawell@cadence.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
The fence release flow is different if the CS was never submitted. In that
case, we don't have an hw_sob object attached that we need to "put". While
if the CS was aborted, we do need to "put" the hw_sob.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The current timeout is too low for some of the workloads and we see false
errors as a result.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The function name conflicts with a static inline function in
arch/m68k/include/asm/mcfmmu.h
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The PS flow for MMU cache invalidation caused timeouts in stress tests.
Use PS + PI flow so no timeouts should happen whatsoever.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
In Gaudi, the user can't execute scalar load_and_exe on external queue
because it can be a security hole. The driver doesn't parse the commands
being loaded and it can be msg_prot, which the user isn't allowed to use.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
A client driver (renesas_usbhs) assumed that
dmaengine_tx_status() could return the residue even if
the transfer was completed. However, this was not correct
usage [1] and this caused to break getting the residue after
the commit 24461d9792 ("dmaengine: virt-dma: Fix access after
free in vchan_complete()") actually. So, this is possible to get
wrong received size if the usb controller gets a short packet.
For example, g_zero driver causes "bad OUT byte" errors.
To use the tx_result from the renesas_usbhs driver when
the transfer is completed, set the tx_result parameters.
Notes that the renesas_usbhs driver needs to update for it.
[1]
https://lore.kernel.org/dmaengine/20200616165550.GP2324254@vkoul-mobl/
Reported-by: Hien Dang <hien.dang.eb@renesas.com>
Fixes: 24461d9792 ("dmaengine: virt-dma: Fix access after free in vchan_complete()")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/1592482053-19433-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
event_id0 is defined as 'unsigned int', so it is always greater or
equal to zero.
Remove the unneeded comparisons to fix the following W=1 build
warning:
drivers/dma/imx-sdma.c: In function 'sdma_free_chan_resources':
drivers/dma/imx-sdma.c:1334:23: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
1334 | if (sdmac->event_id0 >= 0)
| ^~
drivers/dma/imx-sdma.c: In function 'sdma_config':
drivers/dma/imx-sdma.c:1635:23: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
1635 | if (sdmac->event_id0 >= 0) {
|
Fixes: 25962e1a7f ("dmaengine: imx-sdma: Fix the event id check to include RX event for UART6")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Link: https://lore.kernel.org/r/20200621155730.28766-1-festevam@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The sense data buffer in sense_buf_pool is allocated with size of
MPT_SENSE_BUFFER_ALLOC(64) (multiplied by req_depth) while SNS_LEN(sc)(96)
is used when reading the data. That may lead to a read from unallocated
area, sometimes from another (unallocated) page. To fix this, limit the
read size to MPT_SENSE_BUFFER_ALLOC.
Link: https://lore.kernel.org/r/20200616150446.4840-1-thenzl@redhat.com
Co-developed-by: Stanislav Saner <ssaner@redhat.com>
Signed-off-by: Stanislav Saner <ssaner@redhat.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Suppose that, for unrelated reasons, FSF requests on behalf of recovery are
very slow and can run into the ERP timeout.
In the case at hand, we did adapter recovery to a large degree. However
due to the slowness a LUN open is pending so the corresponding fc_rport
remains blocked. After fast_io_fail_tmo we trigger close physical port
recovery for the port under which the LUN should have been opened. The new
higher order port recovery dismisses the pending LUN open ERP action and
dismisses the pending LUN open FSF request. Such dismissal decouples the
ERP action from the pending corresponding FSF request by setting
zfcp_fsf_req->erp_action to NULL (among other things)
[zfcp_erp_strategy_check_fsfreq()].
If now the ERP timeout for the pending open LUN request runs out, we must
not use zfcp_fsf_req->erp_action in the ERP timeout handler. This is a
problem since v4.15 commit 75492a5156 ("s390/scsi: Convert timers to use
timer_setup()"). Before that we intentionally only passed zfcp_erp_action
as context argument to zfcp_erp_timeout_handler().
Note: The lifetime of the corresponding zfcp_fsf_req object continues until
a (late) response or an (unrelated) adapter recovery.
Just like the regular response path ignores dismissed requests
[zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the
ERP timeout handler now needs to ignore dismissed requests. So simply
return early in the ERP timeout handler if the FSF request is marked as
dismissed in its status flags. To protect against the race where
zfcp_erp_strategy_check_fsfreq() dismisses and sets
zfcp_fsf_req->erp_action to NULL after our previous status flag check,
return early if zfcp_fsf_req->erp_action is NULL. After all, the former
ERP action does not need to be woken up as that was already done as part of
the dismissal above [zfcp_erp_action_dismiss()].
This fixes the following panic due to kernel page fault in IRQ context:
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d
Oops: 0004 ilc:2 [#1] SMP
Modules linked in: ...
CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G E X ...
Hardware name: IBM 8561 T01 701 (LPAR)
Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp])
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000
000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02
0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000
0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8
Krnl Code: 001fffff80549bd4: a7180000 lhi %r1,0
001fffff80549bd8: 4120a0f0 la %r2,240(%r10)
#001fffff80549bdc: a53e0003 llilh %r3,3
>001fffff80549be0: ba132000 cs %r1,%r3,0(%r2)
001fffff80549be4: a7740037 brc 7,1fffff80549c52
001fffff80549be8: e320b0180004 lg %r2,24(%r11)
001fffff80549bee: e31020e00004 lg %r1,224(%r2)
001fffff80549bf4: 412020e0 la %r2,224(%r2)
Call Trace:
[<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp]
[<00000985915e26f0>] call_timer_fn+0x38/0x190
[<00000985915e2944>] expire_timers+0xfc/0x190
[<00000985915e2ac4>] run_timer_softirq+0xec/0x218
[<0000098591ca7c4c>] __do_softirq+0x144/0x398
[<00000985915110aa>] do_softirq_own_stack+0x72/0x88
[<0000098591551b58>] irq_exit+0xb0/0xb8
[<0000098591510c6a>] do_IRQ+0x82/0xb0
[<0000098591ca7140>] ext_int_handler+0x128/0x12c
[<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60
([<000009859172ae4c>] clear_huge_page+0xec/0x250)
[<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768
[<000009859172a712>] __handle_mm_fault+0x88a/0x900
[<000009859172a860>] handle_mm_fault+0xd8/0x1b0
[<0000098591529ef6>] do_dat_exception+0x136/0x3e8
[<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220
Last Breaking-Event-Address:
[<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp]
Kernel panic - not syncing: Fatal exception in interrupt
Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com
Fixes: 75492a5156 ("s390/scsi: Convert timers to use timer_setup()")
Cc: <stable@vger.kernel.org> #4.15+
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit cdb42becdd ("scsi: lpfc: Replace io_channels for nvme and fcp with
general hdw_queues per cpu") has introduced static checker warnings for
potential null dereferences in 'lpfc_sli4_hba_unset()' and commit 1ffdd2c044
("scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset") has
tried to fix it. However, yet another potential null dereference is
remaining. This commit fixes it.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Link: https://lore.kernel.org/r/20200623084122.30633-1-sjpark@amazon.com
Fixes: 1ffdd2c044 ("scsi: lpfc: resolve static checker warning inlpfc_sli4_hba_unset")
Fixes: cdb42becdd ("scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu")
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Russell King says:
====================
Two phylink pause fixes
While testing, I discovered two issues with ethtool -A with phylink.
First, if there is a PHY bound to the network device, we hit a
deadlock when phylib tries to notify us of the link changing as a
result of triggering a renegotiation.
Second, when we are manually forcing the pause settings, and there
is no renegotiation triggered, we do not update the MAC via the new
mac_link_up approach.
These two patches solve both problems, and will need to be backported
to v5.7; they do not apply cleanly there due to the introduction of
PCS in the v5.8 merge window.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We have been relying on link events and mac_config() when the manual
pause modes are changed. With recent developments, such as moving
the programming of link state to mac_link_up(), this no longer works.
To ensure that we update the MAC, we must generate a link-down followed
by a link-up event; we can do that by setting mac_link_dropped and
triggering a resolve.
Fixes: 91a208f218 ("net: phylink: propagate resolved link config via mac_link_up()")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a phylink's ethtool set_pauseparam support deadlock caused by phylib
interacting with phylink: we must not hold the state lock while calling
phylib functions that may call into phylink_phy_change().
Fixes: f904f15ea9 ("net: phylink: allow ethtool -A to change flow control advertisement")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver performs SCR (state change registration) in all modes including
pure target mode.
For each RSCN, scan_needed flag is set in qla2x00_handle_rscn() for the
port mentioned in the RSCN and fabric rescan is scheduled. During the
rescan, GNN_FT handler, qla24xx_async_gnnft_done() deletes session of the
port that caused the RSCN.
In target mode, the session deletion has an impact on ATIO handler,
qlt_24xx_atio_pkt(). Target responds with SAM STATUS BUSY to I/O incoming
from the deleted session. qlt_handle_cmd_for_atio() and
qlt_handle_task_mgmt() return -EFAULT if they are not able to find session
of the command/TMF, and that results in invocation of qlt_send_busy():
qlt_24xx_atio_pkt_all_vps: qla_target(0): type 6 ox_id 0014
qla_target(0): Unable to send command to target, sending BUSY status
Such response causes command timeout on the initiator. Error handler thread
on the initiator will be spawned to abort the commands:
scsi 23:0:0:0: tag#0 abort scheduled
scsi 23:0:0:0: tag#0 aborting command
qla2xxx [0000:af:00.0]-188c:23: Entered qla24xx_abort_command.
qla2xxx [0000:af:00.0]-801c:23: Abort command issued nexus=23:0:0 -- 0 2003.
Command abort is rejected by target and fails (2003), error handler then
tries to perform DEVICE RESET and TARGET RESET but they're also doomed to
fail because TMFs are ignored for the deleted sessions.
Then initiator makes BUS RESET that resets the link via
qla2x00_full_login_lip(). BUS RESET succeeds and brings initiator port up,
SAN switch detects that and sends RSCN to the target port and it fails
again the same way as described above. It never goes out of the loop.
The change breaks the RSCN loop by keeping initiator sessions mentioned in
RSCN payload in all modes, including dual and pure target mode.
Link: https://lore.kernel.org/r/20200605144435.27023-1-r.bolshakov@yadro.com
Fixes: 2037ce49d3 ("scsi: qla2xxx: Fix stale session")
Cc: Quinn Tran <qutran@marvell.com>
Cc: Arun Easi <aeasi@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: stable@vger.kernel.org # v5.4+
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Shyam Sundar <ssundar@marvell.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Clearing the sock TX queue in sk_set_socket() might cause unexpected
out-of-order transmit when called from sock_orphan(), as outstanding
packets can pick a different TX queue and bypass the ones already queued.
This is undesired in general. More specifically, it breaks the in-order
scheduling property guarantee for device-offloaded TLS sockets.
Remove the call to sk_tx_queue_clear() in sk_set_socket(), and add it
explicitly only where needed.
Fixes: e022f0b4a0 ("net: Introduce sk_tx_queue_mapping")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The qla2xxx driver knows when request was processed successfully or
not. But it always sets the NVMe status code to 0/NVME_SC_SUCCESS. The
upper layer needs to figure out from the rcv_rsplen and transferred_length
variables if the request was transferred successfully. This is not always
possible, e.g. when the request data length is 0, the transferred_length is
also set 0 which is interpreted as success in nvme_fc_fcpio_done(). Let's
inform the upper layer (nvme_fc_fcpio_done()) when something went wrong.
nvme_fc_fcpio_done() maps all non-NVME_SC_SUCCESS status codes to
NVME_SC_HOST_PATH_ERROR. There isn't any benefit to map the QLA status code
to the NVMe status code. Therefore, use NVME_SC_INTERNAL to indicate an
error which aligns it with the lpfc driver.
Link: https://lore.kernel.org/r/20200604100745.89250-1-dwagner@suse.de
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A last minute change put the TDR cable test parameters into a nest.
The validation is not sufficient, resulting in an oops if the nest is
missing. Set default values first, then update them if the nest is
provided.
Fixes: f2bc8ad31a ("net: ethtool: Allow PHY cable test TDR data to configured")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Bug fixes.
The first patch stores the firmware version code which is needed by the
next 2 patches to determine some worarounds based on the firmware version.
The workarounds are to disable legacy TX push mode and to clear the
hardware statistics during ifdown. The last patch checks that it is
a PF before reading the VPD.
Please also queue these for -stable. Thanks.
====================
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Virtual functions does not have VPD information. This patch modifies
calling bnxt_read_vpd_info() only for PFs and avoids an unnecessary
error log.
Fixes: a0d0fd70fe ("bnxt_en: Read partno and serialno of the board from VPD")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On older firmware, the hardware statistics are not cleared when the
driver frees the hardware stats contexts during ifdown. The driver
expects these stats to be cleared and saves a copy before freeing
the stats contexts. During the next ifup, the driver will likely
allocate the same hardware stats contexts and this will cause a big
increase in the counters as the old counters are added back to the
saved counters.
We fix it by making an additional firmware call to clear the counters
before freeing the hw stats contexts when the firmware is the older
20.x firmware.
Fixes: b8875ca356 ("bnxt_en: Save ring statistics before reset.")
Reported-by: Jakub Kicinski <kicinski@fb.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Tested-by: Jakub Kicinski <kicinski@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Older firmware may not support legacy TX push properly and may not
be disabling it. So we check certain firmware versions that may
have this problem and disable legacy TX push unconditionally.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently only store the firmware version as a string for ethtool
and devlink info. Store it also as a version code. The next 2
patches will need to check the firmware major version to determine
some workarounds.
We also use the 16-bit firmware version fields if the firmware is newer
and provides the 16-bit fields.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix boottime kprobe events to report and abort after each failure when
adding probes.
As an example, when we try to set multiprobe kprobe events in
bootconfig like this:
ftrace.event.kprobes.vfsevents {
probes = "vfs_read $arg1 $arg2,,
!error! not reported;?", // leads to error
"vfs_write $arg1 $arg2"
}
This will not work as expected. After
commit da0f1f4167 ("tracing/boottime: Fix kprobe event API usage"),
the function trace_boot_add_kprobe_event will not produce any error
message when adding a probe fails at kprobe_event_gen_cmd_start.
Furthermore, we continue to add probes when kprobe_event_gen_cmd_end fails
(and kprobe_event_gen_cmd_start did not fail). In this case the function
even returns successfully when the last call to kprobe_event_gen_cmd_end
is successful.
The behaviour of reporting and aborting after failures is not
consistent.
The function trace_boot_add_kprobe_event now reports each failure and
stops adding probes immediately.
Link: https://lkml.kernel.org/r/20200618163301.25854-1-sascha.ortmann@stud.uni-hannover.de
Cc: stable@vger.kernel.org
Cc: linux-kernel@i4.cs.fau.de
Co-developed-by: Maximilian Werner <maximilian.werner96@gmail.com>
Fixes: da0f1f4167 ("tracing/boottime: Fix kprobe event API usage")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Maximilian Werner <maximilian.werner96@gmail.com>
Signed-off-by: Sascha Ortmann <sascha.ortmann@stud.uni-hannover.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix the event trigger to accept redundant spaces in
the trigger input.
For example, these return -EINVAL
echo " traceon" > events/ftrace/print/trigger
echo "traceon if common_pid == 0" > events/ftrace/print/trigger
echo "disable_event:kmem:kmalloc " > events/ftrace/print/trigger
But these are hard to find what is wrong.
To fix this issue, use skip_spaces() to remove spaces
in front of actual tokens, and set NULL if there is no
token.
Link: http://lkml.kernel.org/r/159262476352.185015.5261566783045364186.stgit@devnote2
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 85f2b08268 ("tracing: Add basic event trigger framework")
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Hongyu reported "id != index" in z_erofs_onlinepage_fixup() with
specific aarch64 environment easily, which wasn't shown before.
After digging into that, I found that high 32 bits of page->private
was set to 0xaaaaaaaa rather than 0 (due to z_erofs_onlinepage_init
behavior with specific compiler options). Actually we only use low
32 bits to keep the page information since page->private is only 4
bytes on most 32-bit platforms. However z_erofs_onlinepage_fixup()
uses the upper 32 bits by mistake.
Let's fix it now.
Reported-and-tested-by: Hongyu Jin <hongyu.jin@unisoc.com>
Fixes: 3883a79abd ("staging: erofs: introduce VLE decompression support")
Cc: <stable@vger.kernel.org> # 4.19+
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20200618234349.22553-1-hsiangkao@aol.com
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
if of_find_device_by_node() succeed, imx6q_suspend_init() doesn't have a
corresponding put_device(). Thus add a jump target to fix the exception
handling for this function implementation.
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
if of_find_device_by_node() succeed, imx_suspend_alloc_ocram() doesn't
have a corresponding put_device(). Thus add a jump target to fix the
exception handling for this function implementation.
Fixes: 1579c7b9fe ("ARM: imx53: Set DDR pins to high impedance when in suspend to RAM.")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
When producing the bpf-helpers.7 man page from the documentation from
the BPF user space header file, rst2man complains:
<stdin>:2636: (ERROR/3) Unexpected indentation.
<stdin>:2640: (WARNING/2) Block quote ends without a blank line; unexpected unindent.
Let's fix formatting for the relevant chunk (item list in
bpf_ringbuf_query()'s description), and for a couple other functions.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200623153935.6215-1-quentin@isovalent.com
This is a fix for a regression in commit 2c78ee898d ("bpf: Implement CAP_BPF").
Before the above commit it was possible to load network bpf programs
with just the CAP_SYS_ADMIN privilege.
The Android bpfloader happens to run in such a configuration (it has
SYS_ADMIN but not NET_ADMIN) and creates maps and loads bpf programs
for later use by Android's netd (which has NET_ADMIN but not SYS_ADMIN).
Fixes: 2c78ee898d ("bpf: Implement CAP_BPF")
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Link: https://lore.kernel.org/bpf/20200620212616.93894-1-zenczykowski@gmail.com
Currently, if a bpf program has more than one subprograms, each program will be
jitted separately. For programs with bpf-to-bpf calls the
prog->aux->num_exentries is not setup properly. For example, with
bpf_iter_netlink.c modified to force one function to be not inlined and with
CONFIG_BPF_JIT_ALWAYS_ON the following error is seen:
$ ./test_progs -n 3/3
...
libbpf: failed to load program 'iter/netlink'
libbpf: failed to load object 'bpf_iter_netlink'
libbpf: failed to load BPF skeleton 'bpf_iter_netlink': -4007
test_netlink:FAIL:bpf_iter_netlink__open_and_load skeleton open_and_load failed
#3/3 netlink:FAIL
The dmesg shows the following errors:
ex gen bug
which is triggered by the following code in arch/x86/net/bpf_jit_comp.c:
if (excnt >= bpf_prog->aux->num_exentries) {
pr_err("ex gen bug\n");
return -EFAULT;
}
This patch fixes the issue by computing proper num_exentries for each
subprogram before calling JIT.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Use array_size() instead of the open-coded version in the controlling
expression of the if statement.
Also, while there, use the preferred form for passing a size of a struct.
The alternative form where struct name is spelled out hurts readability
and introduces an opportunity for a bug when the pointer variable type is
changed but the corresponding sizeof that is passed as argument is not.
This issue was found with the help of Coccinelle and, audited and fixed
manually.
Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
As the man description of the truncate, if the size changed,
then the st_ctime and st_mtime fields should be updated. But
in cifs, we doesn't do it.
It lead the xfstests generic/313 failed.
So, add the ATTR_MTIME|ATTR_CTIME flags on attrs when change
the file size
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
When punch hole success, we also can read old data from file:
# strace -e trace=pread64,fallocate xfs_io -f -c "pread 20 40" \
-c "fpunch 20 40" -c"pread 20 40" file
pread64(3, " version 5.8.0-rc1+"..., 40, 20) = 40
fallocate(3, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 20, 40) = 0
pread64(3, " version 5.8.0-rc1+"..., 40, 20) = 40
CIFS implements the fallocate(FALLOCATE_FL_PUNCH_HOLE) with send SMB
ioctl(FSCTL_SET_ZERO_DATA) to server. It just set the range of the
remote file to zero, but local page caches not updated, then the
local page caches inconsistent with server.
Also can be found by xfstests generic/316.
So, we need to remove the page caches before send the SMB
ioctl(FSCTL_SET_ZERO_DATA) to server.
Fixes: 31742c5a33 ("enable fallocate punch hole ("fallocate -p") for SMB3")
Suggested-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Cc: stable@vger.kernel.org # v3.17
Signed-off-by: Steve French <stfrench@microsoft.com>
CIFS implements the fallocate(FALLOC_FL_ZERO_RANGE) with send SMB
ioctl(FSCTL_SET_ZERO_DATA) to server. It just set the range of the
remote file to zero, but local page cache not update, then the data
inconsistent with server, which leads the xfstest generic/008 failed.
So we need to remove the local page caches before send SMB
ioctl(FSCTL_SET_ZERO_DATA) to server. After next read, it will
re-cache it.
Fixes: 30175628bf ("[SMB3] Enable fallocate -z support for SMB3 mounts")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Cc: stable@vger.kernel.org # v3.17
Signed-off-by: Steve French <stfrench@microsoft.com>
bpf_object__find_program_by_title(), used by CO-RE relocation code, doesn't
return .text "BPF program", if it is a function storage for sub-programs.
Because of that, any CO-RE relocation in helper non-inlined functions will
fail. Fix this by searching for .text-corresponding BPF program manually.
Adjust one of bpf_iter selftest to exhibit this pattern.
Fixes: ddc7c30426 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200619230423.691274-1-andriin@fb.com
inode_copy_up_xattr returns 0 to indicate the acceptance of the xattr
and 1 to reject it. If the LSM does not know about the xattr, it's
expected to return -EOPNOTSUPP, which is the correct default value for
this hook. BPF LSM, currently, uses 0 as the default value and thereby
falsely allows all overlay fs xattributes to be copied up.
The iteration logic is also updated from the "bail-on-fail"
call_int_hook to continue on the non-decisive -EOPNOTSUPP and bail out
on other values.
Fixes: 98e828a065 ("security: Refactor declaration of LSM hooks")
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: James Morris <jmorris@namei.org>
Rahul Lakkireddy says:
====================
cxgb4/cxgb4vf: fix warnings reported by sparse
This series of patches fix various warnings reported by the sparse
tool.
Patches 1 and 2 fix lock context imbalance warnings.
Patch 3 fixes cast to restricted __be64 warning when fetching
timestamp in PTP path.
Patch 4 fixes several cast to restricted __be32 warnings in TC-U32
offload parser.
Patch 5 fixes several cast from restricted __be16 warnings in parsing
L4 ports for filters.
Patch 6 fixes several restricted __be32 degrades to integer warnings
when comparing IP address masks for exact-match filters.
Patch 7 fixes cast to restricted __be64 warning when fetching SGE
queue contexts in device dump collection.
Patch 8 fixes cast from restricted __sum16 warning when saving IPv4
partial checksum.
Patch 9 fixes issue with string array scope in DCB path.
Patch 10 fixes a set but unused variable warning when DCB is disabled.
Patch 11 fixes several kernel-doc comment warnings in cxgb4 driver.
Patch 12 fixes several kernel-doc comment warnings in cxgb4vf driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Update several kernel-doc line comments to fix warnings reported by
make W=1.
Fixes following class of warnings reported by make W=1 in several
places:
cxgb4vf_main.c:275: warning: Function parameter or member 'persistent'
not described in 'cxgb4vf_change_mac'
cxgb4vf_main.c:275: warning: Excess function parameter 'persist'
description in 'cxgb4vf_change_mac'
Fixes: 16f8bd4be7 ("cxgb4vf: Add core T4 PCI-E SR-IOV Virtual Function hardware definitions and device communication code")
Fixes: c6e0d91464 ("cxgb4vf: Add T4 Virtual Function Scatter-Gather Engine DMA code")
Fixes: e0a8b34a9c ("cxgb4vf: Add and initialize some sge params for VF driver")
Fixes: c3168cabe1 ("cxgb4/cxgbvf: Handle 32-bit fw port capabilities")
Fixes: 0e23daeb64 ("drivers/net: chelsio/cxgb*: Convert timers to use timer_setup()")
Fixes: 3f8cfd0d95 ("cxgb4/cxgb4vf: Program hash region for {t4/t4vf}_change_mac()")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update several kernel-doc line comments to fix warnings reported by
make W=1.
Fixes following class of warnings reported by make W=1 in several
places:
l2t.c:616: warning: Cannot understand * @dev: net_device pointer
t4_hw.c:3175: warning: Function parameter or member 'adap' not
described in 't4_get_exprom_version'
t4_hw.c:3175: warning: Excess function parameter 'adapter' description
in 't4_get_exprom_version'
Fixes: 56d36be4dd ("cxgb4: Add HW and FW support code")
Fixes: fd3a47900b ("cxgb4: Add packet queues and packet DMA code")
Fixes: 26f7cbc0a5 ("cxgb4: Don't attempt to upgrade T4 firmware when cxgb4 will end up as a slave")
Fixes: 793dad94e7 ("RDMA/cxgb4: Fix bug for active and passive LE hash collision path")
Fixes: ba3f8cd55f ("cxgb4: Add support in cxgb4 to get expansion rom version via ethtool")
Fixes: f7502659ce ("cxgb4: Add API to alloc l2t entry; also update existing ones")
Fixes: ddc7740d9a ("cxgb4: Decode link down reason code obtained from firmware")
Fixes: 193c4c2845 ("cxgb4: Update T6 Buffer Group and Channel Mappings")
Fixes: 8f46d46715 ("cxgb4: Use Firmware params to get buffer-group map")
Fixes: a456950445 ("cxgb4: time stamping interface for PTP")
Fixes: 9c33e4208b ("cxgb4: Add PTP Hardware Clock (PHC) support")
Fixes: c3168cabe1 ("cxgb4/cxgbvf: Handle 32-bit fw port capabilities")
Fixes: 5ccf9d0496 ("cxgb4: update API for TP indirect register access")
Fixes: 3bdb376e69 ("cxgb4: introduce SMT ops to prepare for SMAC rewrite support")
Fixes: 736c3b9447 ("cxgb4: collect egress and ingress SGE queue contexts")
Fixes: f56ec6766d ("cxgb4: Add support for ethtool i2c dump")
Fixes: 9d5fd927d2 ("cxgb4/cxgb4vf: add support for ndo_set_vf_vlan")
Fixes: 98f3697f8d ("cxgb4: add tc flower match support for tunnel VNI")
Fixes: 02d805dc5f ("cxgb4: use new fw interface to get the VIN and smt index")
Fixes: 3f8cfd0d95 ("cxgb4/cxgb4vf: Program hash region for {t4/t4vf}_change_mac()")
Fixes: d429005fdf ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer")
Fixes: 0e395b3cb1 ("cxgb4: add FLOWC based QoS offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the set but unused variable when DCB is disabled. Instead,
do the calculation directly inline.
Fixes following warning in make W=1:
cxgb4_main.c: In function 'cfg_queues':
cxgb4_main.c:5380:29: warning: variable 'n1g' set but not used
[-Wunused-but-set-variable]
u32 i, n10g = 0, qidx = 0, n1g = 0;
^
Fixes: 116ca924ae ("cxgb4: fix checks for max queues to allocate")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the DCB version string array extern to header file.
Fixes following sparse warning:
cxgb4_dcb.c:13:12: warning: symbol 'dcb_ver_array' was not declared.
Should it be static?
Fixes: ebddd97afb ("cxgb4: add support to display DCB info")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The checksum field in IPv4 header is in __sum16 and ip_fast_csum()
also returns __sum16. So, no need to cast it to u16.
Fixes following sparse warning:
sge.c:1539:47: warning: cast from restricted __sum16
sge.c:1539:44: warning: incorrect type in assignment (different base types)
sge.c:1539:44: expected restricted __sum16 [usertype] check
sge.c:1539:44: got unsigned short [usertype]
Fixes: d0a1299c6b ("cxgb4: add support for vxlan segmentation offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The data in destination buffer is expected to be be parsed in big
endian. So, use the right context.
Fixes following sparse warning:
cudbg_lib.c:2041:44: warning: incorrect type in assignment (different
base types)
cudbg_lib.c:2041:44: expected unsigned long long [usertype]
cudbg_lib.c:2041:44: got restricted __be64 [usertype]
Fixes: 736c3b9447 ("cxgb4: collect egress and ingress SGE queue contexts")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use correct type to check for all-mask exact match IP addresses.
Fixes following sparse warnings due to big endian value checks
against 0xffffffff in is_addr_all_mask():
cxgb4_filter.c:977:25: warning: restricted __be32 degrades to integer
cxgb4_filter.c:983:37: warning: restricted __be32 degrades to integer
cxgb4_filter.c:984:37: warning: restricted __be32 degrades to integer
cxgb4_filter.c:985:37: warning: restricted __be32 degrades to integer
cxgb4_filter.c:986:37: warning: restricted __be32 degrades to integer
Fixes: 3eb8b62d5a ("cxgb4: add support to create hash-filters via tc-flower offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The source and destination L4 ports in filter offload need to be
in CPU endian. They will finally be converted to Big Endian after
all operations are done and before giving them to hardware. The
L4 ports for NAT are expected to be passed as a byte stream TCB.
So, treat them as such.
Fixes following sparse warnings in several places:
cxgb4_tc_flower.c:159:33: warning: cast from restricted __be16
cxgb4_tc_flower.c:159:33: warning: incorrect type in argument 1 (different
base types)
cxgb4_tc_flower.c:159:33: expected unsigned short [usertype] val
cxgb4_tc_flower.c:159:33: got restricted __be16 [usertype] dst
Fixes: dca4faeb81 ("cxgb4: Add LE hash collision bug fix path in LLD driver")
Fixes: 62488e4b53 ("cxgb4: add basic tc flower offload support")
Fixes: 557ccbf9df ("cxgb4: add tc flower support for L3/L4 rewrite")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TC-U32 passes all keys values and masks in __be32 format. The parser
already expects this and hence pass the value and masks in __be32
natively to the parser.
Fixes following sparse warnings in several places:
cxgb4_tc_u32.c:57:21: warning: incorrect type in assignment (different base
types)
cxgb4_tc_u32.c:57:21: expected unsigned int [usertype] val
cxgb4_tc_u32.c:57:21: got restricted __be32 [usertype] val
cxgb4_tc_u32_parse.h:48:24: warning: cast to restricted __be32
Fixes: 2e8aad7bf2 ("cxgb4: add parser to translate u32 filters to internal spec")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use get_unaligned_be64() to fetch the timestamp needed for ns_to_ktime()
conversion.
Fixes following sparse warning:
sge.c:3282:43: warning: cast to restricted __be64
Fixes: a456950445 ("cxgb4: time stamping interface for PTP")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check for whether PTP is enabled or not at the caller and perform
locking/unlocking at the caller.
Fixes following sparse warning:
sge.c:1641:26: warning: context imbalance in 'cxgb4_eth_xmit' -
different lock contexts for basic block
Fixes: a456950445 ("cxgb4: time stamping interface for PTP")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move code handling L2T ARP failures to the only caller.
Fixes following sparse warning:
skbuff.h:2091:29: warning: context imbalance in
'handle_failed_resolution' - unexpected unlock
Fixes: 749cb5fe48 ("cxgb4: Replace arpq_head/arpq_tail with SKB double link-list code")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Lobakin says:
====================
net: qed/qede: various stability fixes
This set addresses several near-critical issues that were observed
and reproduced on different test and production configurations.
v2:
- don't split the "Fixes:" tag across several lines in patch 9;
- no functional changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Variable 'abs_ppfid' in qed_dev.c:qed_llh_add_mac_filter() always gets
printed, but is initialized only under 'ref_cnt == 1' condition. This
results in:
In file included from ./include/linux/kernel.h:15:0,
from ./include/asm-generic/bug.h:19,
from ./arch/x86/include/asm/bug.h:86,
from ./include/linux/bug.h:5,
from ./include/linux/io.h:11,
from drivers/net/ethernet/qlogic/qed/qed_dev.c:35:
drivers/net/ethernet/qlogic/qed/qed_dev.c: In function 'qed_llh_add_mac_filter':
./include/linux/printk.h:358:2: warning: 'abs_ppfid' may be used uninitialized
in this function [-Wmaybe-uninitialized]
printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~
drivers/net/ethernet/qlogic/qed/qed_dev.c:983:17: note: 'abs_ppfid' was declared
here
u8 filter_idx, abs_ppfid;
^~~~~~~~~
...under W=1+.
Fix this by initializing it with zero.
Fixes: 79284adeb9 ("qed: Add llh ppfid interface and 100g support for offload protocols")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sizes of all ILT blocks must be reset before ILT recomputing when
disabling clients, or memory allocation may exceed ILT shadow array
and provoke system crashes.
Fixes: 1408cc1fa4 ("qed: Introduce VFs")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently PTP cyclecounter and timecounter are initialized only on
the first probing and are cleaned up during removal. This means that
PTP becomes non-functional after device recovery.
Fix this by unconditional PTP initialization on probing and clearing
Tx pending bit on exiting.
Fixes: ccc67ef50b ("qede: Error recovery process")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is likely a copy'n'paste mistake. The amount of ILT lines to
reserve for a single VF was being multiplied by the total VFs count.
This led to a huge redundancy in reservation and potential lines
drainouts.
Fixes: 1408cc1fa4 ("qed: Introduce VFs")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
25ms sleep cycles in waiting for PF response are excessive and may lead
to different timeout failures.
Start to wait with short udelays, and in most cases polling will end
here. If the time was not sufficient, switch to msleeps.
usleep_range() may go far beyond 100us depending on platform and tick
configuration, hence atomic udelays for consistency.
Also add explicit DMA barriers since 'done' always comes from a shared
request-response DMA pool, and note that in the comment nearby.
Fixes: 1408cc1fa4 ("qed: Introduce VFs")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set rdma_wq pointer to NULL after destroying the workqueue and check
for it when adding new events to fix crashes on driver unload.
Fixes: cee9fbd8e2 ("qede: Add qedr framework")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qed_spq_unregister_async_cb() should be called before
qed_rdma_info_free() to avoid crash-spawning uses-after-free.
Instead of calling it from each subsystem exit code, do it in one place
on PF down.
Fixes: 291d57f67d ("qed: Fix rdma_info structure allocation")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qed_chain_get_element_left{,_u32} returned 0 when the difference
between producer and consumer page count was equal to the total
page count.
Fix this by conditional expanding of producer value (vs
unconditional). This allowed to eliminate normalizaton against
total page count, which was the cause of this bug.
Misc: replace open-coded constants with common defines.
Fixes: a91eb52abb ("qed: Revisit chain implementation")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit e585f23636 ("udp: Changes to udp_offload to support remote
checksum offload") added new GSO type and a corresponding netdev
feature, but missed Ethtool's 'netdev_features_strings' table.
Give it a name so it will be exposed to userspace and become available
for manual configuration.
v3:
- decouple from "netdev_features_strings[] cleanup" series;
- no functional changes.
v2:
- don't split the "Fixes:" tag across lines;
- no functional changes.
Fixes: e585f23636 ("udp: Changes to udp_offload to support remote checksum offload")
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason A. Donenfeld says:
====================
wireguard fixes for 5.8-rc3
This series contains two fixes, one cosmetic and one quite important:
1) Avoid the `if ((x = f()) == y)` pattern, from Frank
Werner-Krippendorf.
2) Mitigate a potential memory leak by creating circular netns
references, while also making the netns semantics a bit more
robust.
Patch (2) has a "Fixes:" line and should be backported to stable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Before, we took a reference to the creating netns if the new netns was
different. This caused issues with circular references, with two
wireguard interfaces swapping namespaces. The solution is to rather not
take any extra references at all, but instead simply invalidate the
creating netns pointer when that netns is deleted.
In order to prevent this from happening again, this commit improves the
rough object leak tracking by allowing it to account for created and
destroyed interfaces, aside from just peers and keys. That then makes it
possible to check for the object leak when having two interfaces take a
reference to each others' namespaces.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes an error condition reported by checkpatch.pl which caused by
assigning a variable in an if condition in wg_noise_handshake_consume_
initiation().
Signed-off-by: Frank Werner-Krippendorf <mail@hb9fxq.ch>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur says:
====================
bridge: mrp: Update MRP_PORT_ROLE
This patch series does the following:
- fixes the enum br_mrp_port_role_type. It removes the port role none(0x2)
because this is in conflict with the standard. The standard defines the
interconnect port role as value 0x2.
- adds checks regarding current defined port roles: primary(0x0) and
secondary(0x1).
v2:
- add the validation code when setting the port role.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds specific checks for primary(0x0) and secondary(0x1) when
setting the port role. For any other value the function
'br_mrp_set_port_role' will return -EINVAL.
Fixes: 20f6a05ef6 ("bridge: mrp: Rework the MRP netlink interface")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the MRP_PORT_ROLE_NONE has the value 0x2 but this is in conflict
with the IEC 62439-2 standard. The standard defines the following port
roles: primary (0x0), secondary(0x1), interconnect(0x2).
Therefore remove the port role none.
Fixes: 4714d13791 ("bridge: uapi: mrp: Add mrp attributes.")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Stultz reported that commit f9288fcc5c ("i2c: designware: Move
ACPI parts into common module") caused a regression on the HiKey board
where adv7511 HDMI bridge driver wasn't probing anymore due the I2C bus
failed to start.
It seems the change caused the bus speed being zero when CONFIG_ACPI
not set and neither speed based on "clock-frequency" device property
or default fast mode is set.
Fix this by splitting i2c_dw_acpi_adjust_bus_speed() to
i2c_dw_acpi_round_bus_speed() and i2c_dw_adjust_bus_speed(), where
the latter one has the code that runs independently of ACPI.
Fixes: f9288fcc5c ("i2c: designware: Move ACPI parts into common module")
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Pull kvm fixes from Paolo Bonzini:
"All bugfixes except for a couple cleanup patches"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: VMX: Remove vcpu_vmx's defunct copy of host_pkru
KVM: x86: allow TSC to differ by NTP correction bounds without TSC scaling
KVM: X86: Fix MSR range of APIC registers in X2APIC mode
KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL
KVM: nVMX: Plumb L2 GPA through to PML emulation
KVM: x86/mmu: Avoid mixing gpa_t with gfn_t in walk_addr_generic()
KVM: LAPIC: ensure APIC map is up to date on concurrent update requests
kvm: lapic: fix broken vcpu hotplug
Revert "KVM: VMX: Micro-optimize vmexit time when not exposing PMU"
KVM: VMX: Add helpers to identify interrupt type from intr_info
kvm/svm: disable KCSAN for svm_vcpu_run()
KVM: MIPS: Fix a build error for !CPU_LOONGSON64
When the user consumes and generates sqe at a fast rate,
io_sqring_entries can always get sqe, and ret will not be equal to -EBUSY,
so that io_sq_thread will never call cond_resched or schedule, and then
we will get the following system error prompt:
rcu: INFO: rcu_sched self-detected stall on CPU
or
watchdog: BUG: soft lockup-CPU#23 stuck for 112s! [io_uring-sq:1863]
This patch checks whether need to call cond_resched() by checking
the need_resched() function every cycle.
Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull btrfs fixes from David Sterba:
"A number of fixes, located in two areas, one performance fix and one
fixup for better integration with another patchset.
- bug fixes in nowait aio:
- fix snapshot creation hang after nowait-aio was used
- fix failure to write to prealloc extent past EOF
- don't block when extent range is locked
- block group fixes:
- relocation failure when scrub runs in parallel
- refcount fix when removing fails
- fix race between removal and creation
- space accounting fixes
- reinstante fast path check for log tree at unlink time, fixes
performance drop up to 30% in REAIM
- kzfree/kfree fixup to ease treewide patchset renaming kzfree"
* tag 'for-5.8-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: use kfree() in btrfs_ioctl_get_subvol_info()
btrfs: fix RWF_NOWAIT writes blocking on extent locks and waiting for IO
btrfs: fix RWF_NOWAIT write not failling when we need to cow
btrfs: fix failure of RWF_NOWAIT write into prealloc extent beyond eof
btrfs: fix hang on snapshot creation after RWF_NOWAIT write
btrfs: check if a log root exists before locking the log_mutex on unlink
btrfs: fix bytes_may_use underflow when running balance and scrub in parallel
btrfs: fix data block group relocation failure due to concurrent scrub
btrfs: fix race between block group removal and block group creation
btrfs: fix a block group ref counter leak after failure to remove block group
'early_resume_init()' and 'late_resume_init() 'are only called respectively
via 'early_resume_init' and 'late_resume_init'.
They can be marked as __init to save a few bytes of memory.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add one more bit for OOB (Out Of Band) enabling of P-states.
If OOB handling of P-states is enabled, intel_pstate shouldn't load.
Currently, only "BIT(8) == 1" of the MSR MSR_MISC_PWR_MGMT is
considered as OOB, but "BIT(18) == 1" needs to be taken into
consideration as OOB condition too.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Add an empty code line, edit subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently the ring buffer makes events that happen in interrupts that preempt
another event have a delta of zero. (Hopefully we can change this soon). But
this is to deal with the races of updating a global counter with lockless
and nesting functions updating deltas.
With the addition of absolute time stamps, the time extend didn't follow
this rule. A time extend can happen if two events happen longer than 2^27
nanoseconds appart, as the delta time field in each event is only 27 bits.
If that happens, then a time extend is injected with 2^59 bits of
nanoseconds to use (18 years). But if the 2^27 nanoseconds happen between
two events, and as it is writing the event, an interrupt triggers, it will
see the 2^27 difference as well and inject a time extend of its own. But a
recent change made the time extend logic not take into account the nesting,
and this can cause two time extend deltas to happen moving the time stamp
much further ahead than the current time. This gets all reset when the ring
buffer moves to the next page, but that can cause time to appear to go
backwards.
This was observed in a trace-cmd recording, and since the data is saved in a
file, with trace-cmd report --debug, it was possible to see that this indeed
did happen!
bash-52501 110d... 81778.908247: sched_switch: bash:52501 [120] S ==> swapper/110:0 [120] [12770284:0x2e8:64]
<idle>-0 110d... 81778.908757: sched_switch: swapper/110:0 [120] R ==> bash:52501 [120] [509947:0x32c:64]
TIME EXTEND: delta:306454770 length:0
bash-52501 110.... 81779.215212: sched_swap_numa: src_pid=52501 src_tgid=52388 src_ngid=52501 src_cpu=110 src_nid=2 dst_pid=52509 dst_tgid=52388 dst_ngid=52501 dst_cpu=49 dst_nid=1 [0:0x378:48]
TIME EXTEND: delta:306458165 length:0
bash-52501 110dNh. 81779.521670: sched_wakeup: migration/110:565 [0] success=1 CPU:110 [0:0x3b4:40]
and at the next page, caused the time to go backwards:
bash-52504 110d... 81779.685411: sched_switch: bash:52504 [120] S ==> swapper/110:0 [120] [8347057:0xfb4:64]
CPU:110 [SUBBUFFER START] [81779379165886:0x1320000]
<idle>-0 110dN.. 81779.379166: sched_wakeup: bash:52504 [120] success=1 CPU:110 [0:0x10:40]
<idle>-0 110d... 81779.379167: sched_switch: swapper/110:0 [120] R ==> bash:52504 [120] [1168:0x3c:64]
Link: https://lkml.kernel.org/r/20200622151815.345d1bf5@oasis.local.home
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes: dc4e2801d4 ("ring-buffer: Redefine the unimplemented RINGBUF_TYPE_TIME_STAMP")
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Versions of binutils prior to 2.33.1 don't understand the ELF notes that
are added by modern compilers to indicate the PAC and BTI options used
to build the code. This causes them to emit large numbers of warnings in
the form:
aarch64-linux-gnu-nm: warning: .tmp_vmlinux.kallsyms2: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000
during the kernel build which is currently causing quite a bit of
disruption for automated build testing using clang.
In commit 15cd0e675f (arm64: Kconfig: ptrauth: Add binutils version
check to fix mismatch) we added a dependency on binutils to avoid this
issue when building with versions of GCC that emit the notes but did not
do so for clang as it was believed that the existing check for
.cfi_negate_ra_state was already requiring a new enough binutils. This
does not appear to be the case for some versions of binutils (eg, the
binutils in Debian 10) so instead refactor so we require a new enough
GNU binutils in all cases other than when we are using an old GCC
version that does not emit notes.
Other, more exotic, combinations of tools are possible such as using
clang, lld and gas together are possible and may have further problems
but rather than adding further version checks it looks like the most
robust thing will be to just test that we can build cleanly with the
configured tools but that will require more review and discussion so do
this for now to address the immediate problem disrupting build testing.
Reported-by: KernelCI <bot@kernelci.org>
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1054
Link: https://lore.kernel.org/r/20200619123550.48098-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Suspend to idle was found to not work on Goldmont CPU recently.
The issue happens due to:
1. On Goldmont the CPU in idle can only be woken up via IPIs,
not POLLING mode, due to commit 08e237fa56 ("x86/cpu: Add
workaround for MONITOR instruction erratum on Goldmont based
CPUs")
2. When the CPU is entering suspend to idle process, the
_TIF_POLLING_NRFLAG remains on, because cpuidle_enter_s2idle()
doesn't match call_cpuidle() exactly.
3. Commit b2a02fc43a ("smp: Optimize send_call_function_single_ipi()")
makes use of _TIF_POLLING_NRFLAG to avoid sending IPIs to idle
CPUs.
4. As a result, some IPIs related functions might not work
well during suspend to idle on Goldmont. For example, one
suspected victim:
tick_unfreeze() -> timekeeping_resume() -> hrtimers_resume()
-> clock_was_set() -> on_each_cpu() might wait forever,
because the IPIs will not be sent to the CPUs which are
sleeping with _TIF_POLLING_NRFLAG set, and Goldmont CPU
could not be woken up by only setting _TIF_NEED_RESCHED
on the monitor address.
To avoid that, clear the _TIF_POLLING_NRFLAG flag before invoking
enter_s2idle_proper() in cpuidle_enter_s2idle() in analogy with the
call_cpuidle() code flow.
Fixes: b2a02fc43a ("smp: Optimize send_call_function_single_ipi()")
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
[ rjw: Subject / changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The 32-bit sigreturn trampoline in the compat sigpage matches the binary
representation of the arch/arm/ sigpage exactly. This is important for
debuggers (e.g. GDB) and unwinders (e.g. libunwind) since they rely
on matching the instruction sequence in order to identify that they are
unwinding through a signal. The same cannot be said for the sigreturn
trampoline in the compat vDSO, which defeats the unwinder heuristics and
instead attempts to use unwind directives for the unwinding. This is in
contrast to arch/arm/, which never uses the vDSO for sigreturn.
Ensure compatibility with arch/arm/ and existing unwinders by always
using the sigpage for the sigreturn trampoline, regardless of the
presence of the compat vDSO.
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
In preparation for removing the signal trampoline from the compat vDSO,
allow the sigpage and the compat vDSO to co-exist.
For the moment the vDSO signal trampoline will still be used when built.
Subsequent patches will move to the sigpage consistently.
Acked-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Commit 7e9f5e6629 ("arm64: vdso: Add --eh-frame-hdr to ldflags") results
in a .eh_frame_hdr section for the vDSO, which in turn causes the libgcc
unwinder to unwind out of signal handlers using the .eh_frame information
populated by our .cfi directives. In conjunction with a4eb355a3f
("arm64: vdso: Fix CFI directives in sigreturn trampoline"), this has
been shown to cause segmentation faults originating from within the
unwinder during thread cancellation:
| Thread 14 "virtio-net-rx" received signal SIGSEGV, Segmentation fault.
| 0x0000000000435e24 in uw_frame_state_for ()
| (gdb) bt
| #0 0x0000000000435e24 in uw_frame_state_for ()
| #1 0x0000000000436e88 in _Unwind_ForcedUnwind_Phase2 ()
| #2 0x00000000004374d8 in _Unwind_ForcedUnwind ()
| #3 0x0000000000428400 in __pthread_unwind (buf=<optimized out>) at unwind.c:121
| #4 0x0000000000429808 in __do_cancel () at ./pthreadP.h:304
| #5 sigcancel_handler (sig=32, si=0xffff33c743f0, ctx=<optimized out>) at nptl-init.c:200
| #6 sigcancel_handler (sig=<optimized out>, si=0xffff33c743f0, ctx=<optimized out>) at nptl-init.c:165
| #7 <signal handler called>
| #8 futex_wait_cancelable (private=0, expected=0, futex_word=0x3890b708) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
After considerable bashing of heads, it appears that our CFI directives
for unwinding out of the sigreturn trampoline are only processed by libgcc
when both a .eh_frame_hdr section is present *and* the mysterious NOP is
covered by an entry in .eh_frame. With both of these now in place, it has
highlighted that our CFI directives are not comprehensive enough to
restore the stack pointer of the interrupted context. This results in libgcc
falling back to an arm64-specific unwinder after computing a bogus PC value
from the unwind tables. The unwinder promptly dereferences this bogus address
in an attempt to see if the pointed-to instruction sequence looks like
the sigreturn trampoline.
Restore the old unwind behaviour, which relied solely on heuristics in
the unwinder, by removing the .eh_frame_hdr section from the vDSO and
commenting out the insufficient CFI directives for now. Add comments to
explain the current, miserable state of affairs.
Cc: Tamas Zsoldos <tamas.zsoldos@arm.com>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Acked-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Clang-based kernel building with W=1 warns that some static const
variables are unused:
drivers/hwmon/bt1-pvt.c:67:30: warning: unused variable 'poly_temp_to_N' [-Wunused-const-variable]
static const struct pvt_poly poly_temp_to_N = {
^
drivers/hwmon/bt1-pvt.c:99:30: warning: unused variable 'poly_volt_to_N' [-Wunused-const-variable]
static const struct pvt_poly poly_volt_to_N = {
^
Indeed these polynomials are utilized only when the PVT sensor alarms are
enabled. In that case they are used to convert the temperature and
voltage alarm limits from normal quantities (Volts and degree Celsius) to
the sensor data representation N = [0, 1023]. Otherwise when alarms are
disabled the driver only does the detected data conversion to the human
readable form and doesn't need that polynomials defined. So let's mark the
Temp-to-N and Volt-to-N polynomials with __maybe_unused attribute.
Note gcc with W=1 doesn't notice the problem.
Fixes: 87976ce282 ("hwmon: Add Baikal-T1 PVT sensor driver")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Maxim Kaurkin <Maxim.Kaurkin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Link: https://lore.kernel.org/r/20200603000753.391-1-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Running a guest with a virtio-iommu protecting virtio devices
is broken since commit 515e5b6d90 ("dma-mapping: use vmap insted
of reimplementing it"). Before the conversion, the size was
page aligned in __get_vm_area_node(). Doing so fixes the
regression.
Fixes: 515e5b6d90 ("dma-mapping: use vmap insted of reimplementing it")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The dma coherent pool code needs genalloc. Move the select over
from DMA_REMAP, which doesn't actually need it.
Fixes: dbed452a07 ("dma-pool: decouple DMA_REMAP from DMA_COHERENT_POOL")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Rientjes <rientjes@google.com>
When a coherent mapping is created in dma_direct_alloc_pages(), it needs
to be decrypted if the device requires unencrypted DMA before returning.
Fixes: 3acac06550 ("dma-mapping: merge the generic remapping helpers into dma-direct")
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When specifying insanely large debug buffers a kernel warning is
printed. The debug code does handle the error gracefully, though.
Instead of duplicating the check let us silence the warning to
avoid crashes when panic_on_warn is used.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Currently if early_pgm_check_handler is called it ends up in pgm check
loop. The problem is that early_pgm_check_handler is instrumented by
KASAN but executed without DAT flag enabled which leads to addressing
exception when KASAN checks try to access shadow memory.
Fix that by executing early handlers with DAT flag on under KASAN as
expected.
Reported-and-tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
When single stepping an svc instruction on s390, the kernel is entered
with a PER program check interruption. The program check handler than
jumps to the system call handler by reloading the PSW. The code didn't
set GPR13 to the thread pointer in struct task_struct. This made the
kernel access invalid memory while trying to fetch the syscall function
address. Fix this by always assigned GPR13 after .Lsysc_per.
Fixes: 0b0ed657fe ("s390: remove critical section cleanup from entry.S")
Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
When making a vPE non-resident because it has hit a blocking WFI,
the doorbell can fire at any time after the write to the RD.
Crucially, it can fire right between the write to GICR_VPENDBASER
and the write to the pending_last field in the its_vpe structure.
This means that we would overwrite pending_last with stale data,
and potentially not wakeup until some unrelated event (such as
a timer interrupt) puts the vPE back on the CPU.
GICv4 isn't affected by this as we actively mask the doorbell on
entering the guest, while GICv4.1 automatically manages doorbell
delivery without any hypervisor-driven masking.
Use the vpe_lock to synchronize such update, which solves the
problem altogether.
Fixes: ae699ad348 ("irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
The Linux TSC calibration procedure is subject to small variations
(its common to see +-1 kHz difference between reboots on a given CPU, for example).
So migrating a guest between two hosts with identical processor can fail, in case
of a small variation in calibrated TSC between them.
Without TSC scaling, the current kernel interface will either return an error
(if user_tsc_khz <= tsc_khz) or enable TSC catchup mode.
This change enables the following TSC tolerance check to
accept KVM_SET_TSC_KHZ within tsc_tolerance_ppm (which is 250ppm by default).
/*
* Compute the variation in TSC rate which is acceptable
* within the range of tolerance and decide if the
* rate being applied is within that bounds of the hardware
* rate. If so, no scaling or compensation need be done.
*/
thresh_lo = adjust_tsc_khz(tsc_khz, -tsc_tolerance_ppm);
thresh_hi = adjust_tsc_khz(tsc_khz, tsc_tolerance_ppm);
if (user_tsc_khz < thresh_lo || user_tsc_khz > thresh_hi) {
pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n", user_tsc_khz, thresh_lo, thresh_hi);
use_scaling = 1;
}
NTP daemon in the guest can correct this difference (NTP can correct upto 500ppm).
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20200616114741.GA298183@fuller.cnet>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add PID for CH340 that's found on some ESP8266 dev boards made by
LilyGO. The specific device that contains such serial converter can be
seen here: https://github.com/LilyGO/LILYGO-T-OI.
Apparently, it's a regular CH340, but I've confirmed with others that
also bought this board that the PID found on this device (0x7522)
differs from other devices with the "same" converter (0x7523).
Simply adding its PID to the driver and rebuilding it made it work
as expected.
Signed-off-by: Igor Moura <imphilippini@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
mt76 patches for 5.8
* tx queueing fixes for mt7615/22/63
* locking fix
# gpg: Signature made Sun 07 Jun 2020 06:17:47 PM EEST using DSA key ID 02A76EF5
# gpg: Good signature from "Felix Fietkau <nbd@nbd.name>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 75D1 1A7D 91A7 710F 4900 42EF D77D 141D 02A7 6EF5
The Scalable-mode Page-walk Coherency (SMPWC) field in the VT-d extended
capability register indicates the hardware coherency behavior on paging
structures accessed through the pasid table entry. This is ignored in
current code and using ECAP.C instead which is only valid in legacy mode.
Fix this so that paging structure updates could be manually flushed from
the cache line if hardware page walking is not snooped.
Fixes: 765b6a98c1 ("iommu/vt-d: Enumerate the scalable mode capability")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
PCI ACS is disabled if Intel IOMMU is off by default or intel_iommu=off
is used in command line. Unfortunately, Intel IOMMU will be forced on if
there're devices sitting on an external facing PCI port that is marked
as untrusted (for example, thunderbolt peripherals). That means, PCI ACS
is disabled while Intel IOMMU is forced on to isolate those devices. As
the result, the devices of an MFD will be grouped by a single group even
the ACS is supported on device.
[ 0.691263] pci 0000:00:07.1: Adding to iommu group 3
[ 0.691277] pci 0000:00:07.2: Adding to iommu group 3
[ 0.691292] pci 0000:00:07.3: Adding to iommu group 3
Fix it by requesting PCI ACS when Intel IOMMU is detected with platform
opt in hint.
Fixes: 89a6079df7 ("iommu/vt-d: Force IOMMU on for platform opt in hint")
Co-developed-by: Lalithambika Krishnakumar <lalithambika.krishnakumar@intel.com>
Signed-off-by: Lalithambika Krishnakumar <lalithambika.krishnakumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When using first-level translation for IOVA, currently the U/S bit in the
page table is cleared which implies DMA requests with user privilege are
blocked. As the result, following error messages might be observed when
passing through a device to user level:
DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Read] Request device [41:00.0] PASID 1 fault addr 7ecdcd000
[fault reason 129] SM: U/S set 0 for first-level translation
with user privilege
This fixes it by setting U/S bit in the first level page table and makes
IOVA over first level compatible with previous second-level translation.
Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Reported-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Current Intel SVM is designed by setting the pgd_t of the processor page
table to FLPTR field of the PASID entry. The first level translation only
supports 4 and 5 level paging structures, hence it's infeasible for the
IOMMU to share a processor's page table when it's running in 32-bit mode.
Let's disable 32bit support for now and claim support only when all the
missing pieces are ready in the future.
Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Suggested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This is a UPB (Universal Powerline Bus) PIM (Powerline Interface Module)
which allows for controlling multiple UPB compatible devices from Linux
using the standard serial interface.
Based on vendor application source code there are two different models
of USB based PIM devices in addition to a number of RS232 based PIM's.
The vendor UPB application source contains the following USB ID's:
#define USB_PCS_VENDOR_ID 0x04b4
#define USB_PCS_PIM_PRODUCT_ID 0x5500
#define USB_SAI_VENDOR_ID 0x17dd
#define USB_SAI_PIM_PRODUCT_ID 0x5500
The first set of ID's correspond to the PIM variant sold by Powerline
Control Systems while the second corresponds to the Simply Automated
Incorporated PIM. As the product ID for both of these match the default
cypress HID->COM RS232 product ID it assumed that they both use an
internal variant of this HID->COM RS232 converter hardware. However
as the vendor ID for the Simply Automated variant is different we need
to also add it to the cypress_M8 driver so that it is properly
detected.
Signed-off-by: James Hilliard <james.hilliard1@gmail.com>
Link: https://lore.kernel.org/r/20200616220403.1807003-1-james.hilliard1@gmail.com
Cc: stable@vger.kernel.org
[ johan: amend VID define entry ]
Signed-off-by: Johan Hovold <johan@kernel.org>
The binder driver makes the assumption proc->context pointer is invariant after
initialization (as documented in the kerneldoc header for struct proc).
However, in commit f0fe2c0f05 ("binder: prevent UAF for binderfs devices II")
proc->context is set to NULL during binder_deferred_release().
Another proc was in the middle of setting up a transaction to the dying
process and crashed on a NULL pointer deref on "context" which is a local
set to &proc->context:
new_ref->data.desc = (node == context->binder_context_mgr_node) ? 0 : 1;
Here's the stack:
[ 5237.855435] Call trace:
[ 5237.855441] binder_get_ref_for_node_olocked+0x100/0x2ec
[ 5237.855446] binder_inc_ref_for_node+0x140/0x280
[ 5237.855451] binder_translate_binder+0x1d0/0x388
[ 5237.855456] binder_transaction+0x2228/0x3730
[ 5237.855461] binder_thread_write+0x640/0x25bc
[ 5237.855466] binder_ioctl_write_read+0xb0/0x464
[ 5237.855471] binder_ioctl+0x30c/0x96c
[ 5237.855477] do_vfs_ioctl+0x3e0/0x700
[ 5237.855482] __arm64_sys_ioctl+0x78/0xa4
[ 5237.855488] el0_svc_common+0xb4/0x194
[ 5237.855493] el0_svc_handler+0x74/0x98
[ 5237.855497] el0_svc+0x8/0xc
The fix is to move the kfree of the binder_device to binder_free_proc()
so the binder_device is freed when we know there are no references
remaining on the binder_proc.
Fixes: f0fe2c0f05 ("binder: prevent UAF for binderfs devices II")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200622200715.114382-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In RFC 8684, we don't need to send sndr_key in SYN package anymore, so drop
it.
Fixes: cc7972ea19 ("mptcp: parse and emit MP_CAPABLE option according to v1 spec")
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix check in ethtool_rx_flow_rule_create
Fixes: eca4205f9e ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator")
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an interface is being deleted, "/proc/net/dev_snmp6/<interface name>"
is deleted.
The function for this is addrconf_ifdown() in the addrconf_notify() and
it is called by notification, which is NETDEV_UNREGISTER.
But, if NETDEV_CHANGEMTU is triggered after NETDEV_UNREGISTER,
this proc file will be created again.
This recreated proc file will be deleted by netdev_wati_allrefs().
Before netdev_wait_allrefs() is called, creating a new HSR interface
routine can be executed and It tries to create a proc file but it will
find an un-deleted proc file.
At this point, it warns about it.
To avoid this situation, it can use ->dellink() instead of
->ndo_uninit() to release resources because ->dellink() is called
before NETDEV_UNREGISTER.
So, a proc file will not be recreated.
Test commands
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link set dummy0 mtu 1300
#SHELL1
while :
do
ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1
done
#SHELL2
while :
do
ip link del hsr0
done
Splat looks like:
[ 9888.980852][ T2752] proc_dir_entry 'dev_snmp6/hsr0' already registered
[ 9888.981797][ C2] WARNING: CPU: 2 PID: 2752 at fs/proc/generic.c:372 proc_register+0x2d5/0x430
[ 9888.981798][ C2] Modules linked in: hsr dummy veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6x
[ 9888.981814][ C2] CPU: 2 PID: 2752 Comm: ip Tainted: G W 5.8.0-rc1+ #616
[ 9888.981815][ C2] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 9888.981816][ C2] RIP: 0010:proc_register+0x2d5/0x430
[ 9888.981818][ C2] Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 65 01 00 00 49 8b b5 e0 00 00 00 48 89 ea 40
[ 9888.981819][ C2] RSP: 0018:ffff8880628dedf0 EFLAGS: 00010286
[ 9888.981821][ C2] RAX: dffffc0000000008 RBX: ffff888028c69170 RCX: ffffffffaae09a62
[ 9888.981822][ C2] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806c9f75ac
[ 9888.981823][ C2] RBP: ffff888028c693f4 R08: ffffed100d9401bd R09: ffffed100d9401bd
[ 9888.981824][ C2] R10: ffffffffaddf406f R11: 0000000000000001 R12: ffff888028c69308
[ 9888.981825][ C2] R13: ffff8880663584c8 R14: dffffc0000000000 R15: ffffed100518d27e
[ 9888.981827][ C2] FS: 00007f3876b3b0c0(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000
[ 9888.981828][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9888.981829][ C2] CR2: 00007f387601a8c0 CR3: 000000004101a002 CR4: 00000000000606e0
[ 9888.981830][ C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9888.981831][ C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9888.981832][ C2] Call Trace:
[ 9888.981833][ C2] ? snmp6_seq_show+0x180/0x180
[ 9888.981834][ C2] proc_create_single_data+0x7c/0xa0
[ 9888.981835][ C2] snmp6_register_dev+0xb0/0x130
[ 9888.981836][ C2] ipv6_add_dev+0x4b7/0xf60
[ 9888.981837][ C2] addrconf_notify+0x684/0x1ca0
[ 9888.981838][ C2] ? __mutex_unlock_slowpath+0xd0/0x670
[ 9888.981839][ C2] ? kasan_unpoison_shadow+0x30/0x40
[ 9888.981840][ C2] ? wait_for_completion+0x250/0x250
[ 9888.981841][ C2] ? inet6_ifinfo_notify+0x100/0x100
[ 9888.981842][ C2] ? dropmon_net_event+0x227/0x410
[ 9888.981843][ C2] ? notifier_call_chain+0x90/0x160
[ 9888.981844][ C2] ? inet6_ifinfo_notify+0x100/0x100
[ 9888.981845][ C2] notifier_call_chain+0x90/0x160
[ 9888.981846][ C2] register_netdevice+0xbe5/0x1070
[ ... ]
Reported-by: syzbot+1d51c8b74efa4c44adeb@syzkaller.appspotmail.com
Fixes: e0a4b99773 ("hsr: use upper/lower device infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The WDOG_ANY signal is connected to the RESET_IN signal of the SoM
and baseboard. It is currently configured as push-pull, which means
that if some external device like a programmer wants to assert the
RESET_IN signal by pulling it to ground, it drives against the high
level WDOG_ANY output of the SoC.
To fix this we set the WDOG_ANY signal to open-drain configuration.
That way we make sure that the RESET_IN can be asserted by the
watchdog as well as by external devices.
Fixes: 1ea4b76cdf ("ARM: dts: imx6ul-kontron-n6310: Add Kontron i.MX6UL N6310 SoM and boards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The watchdog's WDOG_ANY signal is used to trigger a POR of the SoC,
if a soft reset is issued. As the SoM hardware connects the WDOG_ANY
and the POR signals, the watchdog node itself and the pin
configuration should be part of the common SoM devicetree.
Let's move it from the baseboard's devicetree to its proper place.
Fixes: 1ea4b76cdf ("ARM: dts: imx6ul-kontron-n6310: Add Kontron i.MX6UL N6310 SoM and boards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
xlog_wait() on the CIL context can reference a freed context if the
waiter doesn't get scheduled before the CIL context is freed. This
can happen when a task is on the hard throttle and the CIL push
aborts due to a shutdown. This was detected by generic/019:
thread 1 thread 2
__xfs_trans_commit
xfs_log_commit_cil
<CIL size over hard throttle limit>
xlog_wait
schedule
xlog_cil_push_work
wake_up_all
<shutdown aborts commit>
xlog_cil_committed
kmem_free
remove_wait_queue
spin_lock_irqsave --> UAF
Fix it by moving the wait queue to the CIL rather than keeping it in
in the CIL context that gets freed on push completion. Because the
wait queue is now independent of the CIL context and we might have
multiple contexts in flight at once, only wake the waiters on the
push throttle when the context we are pushing is over the hard
throttle size threshold.
Fixes: 0e7ab7efe7 ("xfs: Throttle commits on delayed background CIL push")
Reported-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Remove support for context switching between the guest's and host's
desired UMWAIT_CONTROL. Propagating the guest's value to hardware isn't
required for correct functionality, e.g. KVM intercepts reads and writes
to the MSR, and the latency effects of the settings controlled by the
MSR are not architecturally visible.
As a general rule, KVM should not allow the guest to control power
management settings unless explicitly enabled by userspace, e.g. see
KVM_CAP_X86_DISABLE_EXITS. E.g. Intel's SDM explicitly states that C0.2
can improve the performance of SMT siblings. A devious guest could
disable C0.2 so as to improve the performance of their workloads at the
detriment to workloads running in the host or on other VMs.
Wholesale removal of UMWAIT_CONTROL context switching also fixes a race
condition where updates from the host may cause KVM to enter the guest
with the incorrect value. Because updates are are propagated to all
CPUs via IPI (SMP function callback), the value in hardware may be
stale with respect to the cached value and KVM could enter the guest
with the wrong value in hardware. As above, the guest can't observe the
bad value, but it's a weird and confusing wart in the implementation.
Removal also fixes the unnecessary usage of VMX's atomic load/store MSR
lists. Using the lists is only necessary for MSRs that are required for
correct functionality immediately upon VM-Enter/VM-Exit, e.g. EFER on
old hardware, or for MSRs that need to-the-uop precision, e.g. perf
related MSRs. For UMWAIT_CONTROL, the effects are only visible in the
kernel via TPAUSE/delay(), and KVM doesn't do any form of delay in
vcpu_vmx_run(). Using the atomic lists is undesirable as they are more
expensive than direct RDMSR/WRMSR.
Furthermore, even if giving the guest control of the MSR is legitimate,
e.g. in pass-through scenarios, it's not clear that the benefits would
outweigh the overhead. E.g. saving and restoring an MSR across a VMX
roundtrip costs ~250 cycles, and if the guest diverged from the host
that cost would be paid on every run of the guest. In other words, if
there is a legitimate use case then it should be enabled by a new
per-VM capability.
Note, KVM still needs to emulate MSR_IA32_UMWAIT_CONTROL so that it can
correctly expose other WAITPKG features to the guest, e.g. TPAUSE,
UMWAIT and UMONITOR.
Fixes: 6e3ba4abce ("KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL")
Cc: stable@vger.kernel.org
Cc: Jingqi Liu <jingqi.liu@intel.com>
Cc: Tao Xu <tao3.xu@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200623005135.10414-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Syzbot reports an use-after-free in workqueue context:
BUG: KASAN: use-after-free in mutex_unlock+0x19/0x40 kernel/locking/mutex.c:737
mutex_unlock+0x19/0x40 kernel/locking/mutex.c:737
__smsc95xx_mdio_read drivers/net/usb/smsc95xx.c:217 [inline]
smsc95xx_mdio_read+0x583/0x870 drivers/net/usb/smsc95xx.c:278
check_carrier+0xd1/0x2e0 drivers/net/usb/smsc95xx.c:644
process_one_work+0x777/0xf90 kernel/workqueue.c:2274
worker_thread+0xa8f/0x1430 kernel/workqueue.c:2420
kthread+0x2df/0x300 kernel/kthread.c:255
It looks like that smsc95xx_unbind() is freeing the structures that are
still in use by the concurrently running workqueue callback. Thus switch
to using cancel_delayed_work_sync() to ensure the work callback really
is no longer active.
Reported-by: syzbot+29dc7d4ae19b703ff947@syzkaller.appspotmail.com
Signed-off-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
The second commit cited below performed a cast of 'u32 buffsize' to
'(u16 *)' when calling mlxsw_sp_port_headroom_8x_adjust():
mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, (u16 *) &buffsize);
Colin noted that this will behave differently on big endian
architectures compared to little endian architectures.
Fix this by following Colin's suggestion and have the function accept
and return 'u32' instead of passing the current size by reference.
Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Fixes: 60833d54d5 ("mlxsw: spectrum: Adjust headroom buffers for 8x ports")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Colin Ian King <colin.king@canonical.com>
Suggested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7ae7ad2f11 ("net: phy: smsc: use phy_read_poll_timeout()
to simplify the code") will print a lot of logs as follows when Ethernet
cable is not connected:
[ 4.473105] SMSC LAN8710/LAN8720 2188000.ethernet-1:00: lan87xx_read_status failed: -110
When wait 640 ms for check ENERGYON bit, the timeout should not be
regarded as an actual error and an error message also should not be
printed. due to a hardware bug in LAN87XX device, it leads to unstable
detection of plugging in Ethernet cable when LAN87xx is in Energy Detect
Power-Down mode. the workaround for it involves, when the link is down,
and at each read_status() call:
- disable EDPD mode, forcing the PHY out of low-power mode
- waiting 640ms to see if we have any energy detected from the media
- re-enable entry to EDPD mode
This is presumably enough to allow the PHY to notice that a cable is
connected, and resume normal operations to negotiate with the partner.
The problem is that when no media is detected, the 640ms wait times
out and this commit was modified to prints an error message. it is an
inappropriate conversion by used phy_read_poll_timeout() to introduce
this bug. so fix this issue by use read_poll_timeout() to replace
phy_read_poll_timeout().
Fixes: 7ae7ad2f11 ("net: phy: smsc: use phy_read_poll_timeout() to simplify the code")
Reported-by: Kevin Groeneveld <kgroeneveld@gmail.com>
Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Explicitly pass the L2 GPA to kvm_arch_write_log_dirty(), which for all
intents and purposes is vmx_write_pml_buffer(), instead of having the
latter pull the GPA from vmcs.GUEST_PHYSICAL_ADDRESS. If the dirty bit
update is the result of KVM emulation (rare for L2), then the GPA in the
VMCS may be stale and/or hold a completely unrelated GPA.
Fixes: c5f983f6e8 ("nVMX: Implement emulated Page Modification Logging")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200622215832.22090-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
msm_gem_address_space_create() changed to take a start/length instead
of a start/end for the iova space but all of the callers were just
cut and pasted from the old usage. Most of the mistakes have been fixed
up so just catch up the rest.
Fixes: ccac7ce373 ("drm/msm: Refactor address space initialization")
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Currently, when RMPP MADs are processed while the MAD agent is destroyed,
it could result in use after free of rmpp_recv, as decribed below:
cpu-0 cpu-1
----- -----
ib_mad_recv_done()
ib_mad_complete_recv()
ib_process_rmpp_recv_wc()
unregister_mad_agent()
ib_cancel_rmpp_recvs()
cancel_delayed_work()
process_rmpp_data()
start_rmpp()
queue_delayed_work(rmpp_recv->cleanup_work)
destroy_rmpp_recv()
free_rmpp_recv()
cleanup_work()[1]
spin_lock_irqsave(&rmpp_recv->agent->lock) <-- use after free
[1] cleanup_work() == recv_cleanup_handler
Fix it by waiting for the MAD agent reference count becoming zero before
calling to ib_cancel_rmpp_recvs().
Fixes: 9a41e38a46 ("IB/mad: Use IDR for agent IDs")
Link: https://lore.kernel.org/r/20200621104738.54850-2-leon@kernel.org
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
translate_gpa() returns a GPA, assigning it to 'real_gfn' seems obviously
wrong. There is no real issue because both 'gpa_t' and 'gfn_t' are u64 and
we don't use the value in 'real_gfn' as a GFN, we do
real_gfn = gpa_to_gfn(real_gfn);
instead. 'If you see a "buffalo" sign on an elephant's cage, do not trust
your eyes', but let's fix it for good.
No functional change intended.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200622151435.752560-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The following race can cause lost map update events:
cpu1 cpu2
apic_map_dirty = true
------------------------------------------------------------
kvm_recalculate_apic_map:
pass check
mutex_lock(&kvm->arch.apic_map_lock);
if (!kvm->arch.apic_map_dirty)
and in process of updating map
-------------------------------------------------------------
other calls to
apic_map_dirty = true might be too late for affected cpu
-------------------------------------------------------------
apic_map_dirty = false
-------------------------------------------------------------
kvm_recalculate_apic_map:
bail out on
if (!kvm->arch.apic_map_dirty)
To fix it, record the beginning of an update of the APIC map in
apic_map_dirty. If another APIC map change switches apic_map_dirty
back to DIRTY during the update, kvm_recalculate_apic_map should not
make it CLEAN, and the other caller will go through the slow path.
Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 8c0637e950 ("keys: Make the KEY_NEED_* perms an enum rather than
a mask") changed the type of the key_permission callback functions, but
didn't change the type of the hook, which trips indirect call checking with
Control-Flow Integrity (CFI). This change fixes the issue by changing the
hook type to match the functions.
Fixes: 8c0637e950 ("keys: Make the KEY_NEED_* perms an enum rather than a mask")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <jmorris@namei.org>
When adding a quirk for IRQ on Intel Galileo Gen 2 the commit ba8c90c618
("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2")
missed GPIO resource release. We can safely do this in the same quirk, since
IRQ will be locked by GPIO framework when requested and unlocked on freeing.
Fixes: ba8c90c618 ("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Pull spi fixes from Mark Brown:
"Quite a lot of fixes here for no single reason.
There's a collection of the usual sort of device specific fixes and
also a bunch of people have been working on spidev and the userspace
test program spidev_test so they've got an unusually large collection
of small fixes"
* tag 'spi-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spidev: fix a potential use-after-free in spidev_release()
spi: spidev: fix a race between spidev_release and spidev_remove
spi: stm32-qspi: Fix error path in case of -EPROBE_DEFER
spi: uapi: spidev: Use TABs for alignment
spi: spi-fsl-dspi: Free DMA memory with matching function
spi: tools: Add macro definitions to fix build errors
spi: tools: Make default_tx/rx and input_tx static
spi: dt-bindings: amlogic, meson-gx-spicc: Fix schema for meson-g12a
spi: rspi: Use requested instead of maximum bit rate
spi: spidev_test: Use %u to format unsigned numbers
spi: sprd: switch the sequence of setting WDG_LOAD_LOW and _HIGH
Guest fails to online hotplugged CPU with error
smpboot: do_boot_cpu failed(-1) to wakeup CPU#4
It's caused by the fact that kvm_apic_set_state(), which used to call
recalculate_apic_map() unconditionally and pulled hotplugged CPU into
apic map, is updating map conditionally on state changes. In this case
the APIC map is not considered dirty and the is not updated.
Fix the issue by forcing unconditional update from kvm_apic_set_state(),
like it used to be.
Fixes: 4abaffce4d ("KVM: LAPIC: Recalculate apic map in batch")
Cc: stable@vger.kernel.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200622160830.426022-1-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull regulator fixes from Mark Brown:
"This has a fix for the refactoring out of the pickable ranges
functionality, plus the removal of a BROKEN dependency on mt6358 now
that the dependencies were merged in -rc1 and a couple of device
specific fixes"
* tag 'regulator-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: mt6358: Remove BROKEN dependency
regualtor: pfuze100: correct sw1a/sw2 on pfuze3000
regulator: Fix pickable ranges mapping
regulator: da9063: fix LDO9 suspend and warning.
Pull regmap fixes from Mark Brown:
"A few small fixes, none of which are likely to have any substantial
impact here - the most substantial one is a fix for a long standing
memory leak on devices that use register patching which will only have
an impact if the device is removed and re-added"
* tag 'regmap-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Fix memory leak from regmap_register_patch
regmap: fix the kerneldoc for regmap_test_bits()
regmap: fix alignment issue
Virtio-mem managed memory is always detected and added by the virtio-mem
driver, never using something like the firmware-provided memory map.
This is the case after an ordinary system reboot, and has to be guaranteed
after kexec. Especially, virtio-mem added memory resources can contain
inaccessible parts ("unblocked memory blocks"), blindly forwarding them
to a kexec kernel is dangerous, as unplugged memory will get accessed
(esp. written).
Let's use the new way of adding special driver-managed memory introduced
in commit 7b7b27214b ("mm/memory_hotplug: introduce
add_memory_driver_managed()").
This will result in no entries in /sys/firmware/memmap ("raw firmware-
provided memory map"), the memory resource will be flagged
IORESOURCE_MEM_DRIVER_MANAGED (esp., kexec_file_load() will not place
kexec images on this memory), and it is exposed as "System RAM
(virtio_mem)" in /proc/iomem, so esp. kexec-tools can properly handle it.
Example /proc/iomem before this change:
[...]
140000000-333ffffff : virtio0
140000000-147ffffff : System RAM
334000000-533ffffff : virtio1
338000000-33fffffff : System RAM
340000000-347ffffff : System RAM
348000000-34fffffff : System RAM
[...]
Example /proc/iomem after this change:
[...]
140000000-333ffffff : virtio0
140000000-147ffffff : System RAM (virtio_mem)
334000000-533ffffff : virtio1
338000000-33fffffff : System RAM (virtio_mem)
340000000-347ffffff : System RAM (virtio_mem)
348000000-34fffffff : System RAM (virtio_mem)
[...]
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Fixes: 5f1f79bbc9 ("virtio-mem: Paravirtualized memory hotplug")
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200611093518.5737-1-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Smatch complains that "rc" can be uninitialized if we hit the "break;"
statement on the first iteration through the loop. I suspect that this
can't happen in real life, but returning a zero literal is cleaner and
silence the static checker warning.
Fixes: 5f1f79bbc9 ("virtio-mem: Paravirtualized memory hotplug")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20200610085911.GC5439@mwanda
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The "vma->vm_pgoff" variable is an unsigned long so if it's larger than
INT_MAX then "index" can be negative leading to an underflow. Fix this
by changing the type of "index" to "unsigned long".
Fixes: ddd89d0a05 ("vhost_vdpa: support doorbell mapping via mmap")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20200610085852.GB5439@mwanda
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Like other vectors already patched, this one here allows the root
user to load ACPI tables, which enables arbitrary physical address
writes, which in turn makes it possible to disable lockdown.
Prevents this by checking the lockdown status before allowing a new
ACPI table to be installed. The link in the trailer shows a PoC of
how this might be used.
Link: https://git.zx2c4.com/american-unsigned-language/tree/american-unsigned-language-2.sh
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If SVE is enabled then 'ret' can be assigned the return value of
kvm_vcpu_enable_sve() which may be 0 causing future "goto out" sites to
erroneously return 0 on failure rather than -EINVAL as expected.
Remove the initialisation of 'ret' and make setting the return value
explicit to avoid this situation in the future.
Fixes: 9a3cdf26e3 ("KVM: arm64/sve: Allow userspace to enable SVE for vcpus")
Cc: stable@vger.kernel.org
Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200617105456.28245-1-steven.price@arm.com
The "inline" keyword is a hint for the compiler to inline a function. The
functions system_uses_irq_prio_masking() and gic_write_pmr() are used by
the code running at EL2 on a non-VHE system, so mark them as
__always_inline to make sure they'll always be part of the .hyp.text
section.
This fixes the following splat when trying to run a VM:
[ 47.625273] Kernel panic - not syncing: HYP panic:
[ 47.625273] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006
[ 47.625273] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000
[ 47.625273] VCPU:0000000000000000
[ 47.647261] CPU: 1 PID: 217 Comm: kvm-vcpu-0 Not tainted 5.8.0-rc1-ARCH+ #61
[ 47.654508] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[ 47.661139] Call trace:
[ 47.663659] dump_backtrace+0x0/0x1cc
[ 47.667413] show_stack+0x18/0x24
[ 47.670822] dump_stack+0xb8/0x108
[ 47.674312] panic+0x124/0x2f4
[ 47.677446] panic+0x0/0x2f4
[ 47.680407] SMP: stopping secondary CPUs
[ 47.684439] Kernel Offset: disabled
[ 47.688018] CPU features: 0x240402,20002008
[ 47.692318] Memory Limit: none
[ 47.695465] ---[ end Kernel panic - not syncing: HYP panic:
[ 47.695465] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006
[ 47.695465] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000
[ 47.695465] VCPU:0000000000000000 ]---
The instruction abort was caused by the code running at EL2 trying to fetch
an instruction which wasn't mapped in the EL2 translation tables. Using
objdump showed the two functions as separate symbols in the .text section.
Fixes: 85738e05dc ("arm64: kvm: Unmask PMR before entering guest")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200618171254.1596055-1-alexandru.elisei@arm.com
The RPC client currently doesn't handle ERR_CHUNK replies correctly.
rpcrdma_complete_rqst() incorrectly passes a negative number to
xprt_complete_rqst() as the number of bytes copied. Instead, set
task->tk_status to the error value, and return zero bytes copied.
In these cases, return -EIO rather than -EREMOTEIO. The RPC client's
finite state machine doesn't know what to do with -EREMOTEIO.
Additional clean ups:
- Don't double-count RDMA_ERROR replies
- Remove a stale comment
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@kernel.vger.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
1. Ensure that only rpcrdma_cm_event_handler() modifies
ep->re_connect_status to avoid racy changes to that field.
2. Ensure that xprt_force_disconnect() is invoked only once as a
transport is closed or destroyed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Clean up: Sometimes creating a fresh rpcrdma_ep can fail. That's why
xprt_rdma_connect() always checks if the r_xprt->rx_ep pointer is
valid before dereferencing it. Instead, xprt_rdma_connect() can
simply check rpcrdma_xprt_connect()'s return value.
Also, there's no need to set re_connect_status to zero just after
the rpcrdma_ep is created, since it is allocated with kzalloc.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
r_xprt->rx_ep is known to be good while the transport's send lock is
held. Otherwise additional references on rx_ep must be held when it
is used outside of that lock's critical sections.
For now, bump the rx_ep reference count once whenever there is at
least one outstanding Receive WR. This avoids the memory bandwidth
overhead of taking and releasing the reference count for every
ib_post_recv() and Receive completion.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
If shared interrupt comes late, during probe error path or device remove
(could be triggered with CONFIG_DEBUG_SHIRQ), the interrupt handler
dspi_interrupt() will access registers with the clock being disabled.
This leads to external abort on non-linefetch on Toradex Colibri VF50
module (with Vybrid VF5xx):
$ echo 4002d000.spi > /sys/devices/platform/soc/40000000.bus/4002d000.spi/driver/unbind
Unhandled fault: external abort on non-linefetch (0x1008) at 0x8887f02c
Internal error: : 1008 [#1] ARM
Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
Backtrace:
(regmap_mmio_read32le)
(regmap_mmio_read)
(_regmap_bus_reg_read)
(_regmap_read)
(regmap_read)
(dspi_interrupt)
(free_irq)
(devm_irq_release)
(release_nodes)
(devres_release_all)
(device_release_driver_internal)
The resource-managed framework should not be used for shared interrupt
handling, because the interrupt handler might be called after releasing
other resources and disabling clocks.
Similar bug could happen during suspend - the shared interrupt handler
could be invoked after suspending the device. Each device sharing this
interrupt line should disable the IRQ during suspend so handler will be
invoked only in following cases:
1. None suspended,
2. All devices resumed.
Fixes: 349ad66c0a ("spi:Add Freescale DSPI driver for Vybrid VF610 platform")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200622110543.5035-3-krzk@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
During device removal, the driver should unregister the SPI controller
and stop the hardware. Otherwise the dspi_transfer_one_message() could
wait on completion infinitely.
Additionally, calling spi_unregister_controller() first in device
removal reverse-matches the probe function, where SPI controller is
registered at the end.
Fixes: 05209f4570 ("spi: fsl-dspi: add missing clk_disable_unprepare() in dspi_remove()")
Reported-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200622110543.5035-1-krzk@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
With CONFIG_DEBUG_VIRTUAL=y, __pa() checks for addr value and if it's
less than PAGE_OFFSET it leads to a BUG().
#define __pa(x)
({
VIRTUAL_BUG_ON((unsigned long)(x) < PAGE_OFFSET);
(unsigned long)(x) & 0x0fffffffffffffffUL;
})
kernel BUG at arch/powerpc/kvm/book3s_64_mmu_radix.c:43!
cpu 0x70: Vector: 700 (Program Check) at [c0000018a2187360]
pc: c000000000161b30: __kvmhv_copy_tofrom_guest_radix+0x130/0x1f0
lr: c000000000161d5c: kvmhv_copy_from_guest_radix+0x3c/0x80
...
kvmhv_copy_from_guest_radix+0x3c/0x80
kvmhv_load_from_eaddr+0x48/0xc0
kvmppc_ld+0x98/0x1e0
kvmppc_load_last_inst+0x50/0x90
kvmppc_hv_emulate_mmio+0x288/0x2b0
kvmppc_book3s_radix_page_fault+0xd8/0x2b0
kvmppc_book3s_hv_page_fault+0x37c/0x1050
kvmppc_vcpu_run_hv+0xbb8/0x1080
kvmppc_vcpu_run+0x34/0x50
kvm_arch_vcpu_ioctl_run+0x2fc/0x410
kvm_vcpu_ioctl+0x2b4/0x8f0
ksys_ioctl+0xf4/0x150
sys_ioctl+0x28/0x80
system_call_exception+0x104/0x1d0
system_call_common+0xe8/0x214
kvmhv_copy_tofrom_guest_radix() uses a NULL value for to/from to
indicate direction of copy.
Avoid calling __pa() if the value is NULL to avoid the BUG().
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Massage change log a bit to mention CONFIG_DEBUG_VIRTUAL]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200611120159.680284-1-aneesh.kumar@linux.ibm.com
ASoC: Fixes for v5.8
This is a collection of mostly small fixes, mostly fixing fallout from
some of the DPCM changes that went in last time around which shook out
some issues on i.MX and Qualcomm platforms. The addition of a managed
version of snd_soc_register_dai() is to fix resource leaks.
There's also a few new device IDs for x86 systems.
Building the current 5.8 kernel for an e500 machine with
CONFIG_RANDOMIZE_BASE=y and CONFIG_BLOCK=n yields the following
failure:
arch/powerpc/mm/nohash/kaslr_booke.c: In function 'kaslr_early_init':
arch/powerpc/mm/nohash/kaslr_booke.c:387:2: error: implicit
declaration of function 'flush_icache_range'; did you mean 'flush_tlb_range'?
Indeed, including asm/cacheflush.h into kaslr_booke.c fixes the build.
Fixes: 2b0e86cc5d ("powerpc/fsl_booke/32: implement KASLR infrastructure")
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Acked-by: Scott Wood <oss@buserror.net>
[mpe: Tweak change log to mention CONFIG_BLOCK=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200613162801.1946619-1-asolokha@kb.kras.ru
This reverts commit 8ece3b3eb5.
This commit broke userspace. Bash uses ESPIPE to determine whether or
not the file should be read using "unbuffered I/O", which means reading
1 byte at a time instead of 128 bytes at a time. I used to use bash to
read through kmsg in a really quite nasty way:
while read -t 0.1 -r line 2>/dev/null || [[ $? -ne 142 ]]; do
echo "SARU $line"
done < /dev/kmsg
This will show all lines that can fit into the 128 byte buffer, and skip
lines that don't. That's pretty awful, but at least it worked.
With this change, bash now tries to do 1-byte reads, which means it
skips all the lines, which is worse than before.
Now, I don't really care very much about this, and I'm already look for
a workaround. But I did just spend an hour trying to figure out why my
scripts were broken. Either way, it makes no difference to me personally
whether this is reverted, but it might be something to consider. If you
declare that "trying to read /dev/kmsg with bash is terminally stupid
anyway," I might be inclined to agree with you. But do note that bash
uses lseek(fd, 0, SEEK_CUR)==>ESPIPE to determine whether or not it's
reading from a pipe.
Cc: Bruno Meneguele <bmeneg@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
open_shroot() invokes kref_get(), which increases the refcount of the
"tcon->crfid" object. When open_shroot() returns not zero, it means the
open operation failed and close_shroot() will not be called to decrement
the refcount of the "tcon->crfid".
The reference counting issue happens in one normal path of
open_shroot(). When the cached root have been opened successfully in a
concurrent process, the function increases the refcount and jump to
"oshr_free" to return. However the current return value "rc" may not
equal to 0, thus the increased refcount will not be balanced outside the
function, causing a refcnt leak.
Fix this issue by setting the value of "rc" to 0 before jumping to
"oshr_free" label.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Pull SELinux fixes from Paul Moore:
"Three small patches to fix problems in the SELinux code, all found via
clang.
Two patches fix potential double-free conditions and one fixes an
undefined return value"
* tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix undefined return of cond_evaluate_expr
selinux: fix a double free in cond_read_node()/cond_read_list()
selinux: fix double free
Pull pin control fixes from Linus Walleij:
"Some early fixes collected during the first week after the merge
window, all pretty self-evident, with the details below. The revert is
the crucial thing.
- Fix a warning on the Qualcomm SPMI GPIO chip being instatiated
twice without a unique irqchip struct
- Use the noirq variants of the suspend and resume callbacks in the
Tegra driver
- Clean up the errorpath on the MCP23s08 driver
- Revert the use of devm_of_iomap() in the Freescale driver as it was
regressing the platform
- Add some missing pins in the Qualcomm IPQ6018 driver
- Fix a simple documentation bug in the pinctrl-single driver"
* tag 'pinctrl-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: single: fix function name in documentation
pinctrl: qcom: ipq6018 Add missing pins in qpic pin group
Revert "pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'"
pinctrl: mcp23s08: Split to three parts: fix ptr_ret.cocci warnings
pinctrl: tegra: Use noirq suspend/resume callbacks
pinctrl: qcom: spmi-gpio: fix warning about irq chip reusage
Pull Kbuild fixes from Masahiro Yamada:
- fix -gz=zlib compiler option test for CONFIG_DEBUG_INFO_COMPRESSED
- improve cc-option in scripts/Kbuild.include to clean up temp files
- improve cc-option in scripts/Kconfig.include for more reliable
compile option test
- do not copy modules.builtin by 'make install' because it would break
existing systems
- use 'userprogs' syntax for watch_queue sample
* tag 'kbuild-fixes-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
samples: watch_queue: build sample program for target architecture
Revert "Makefile: install modules.builtin even if CONFIG_MODULES=n"
scripts: Fix typo in headers_install.sh
kconfig: unify cc-option and as-option
kbuild: improve cc-option to clean up all temporary files
Makefile: Improve compressed debug info support detection
Pull powerpc fixes from Michael Ellerman:
- One fix for the interrupt rework we did last release which broke
KVM-PR
- Three commits fixing some fallout from the READ_ONCE() changes
interacting badly with our 8xx 16K pages support, which uses a pte_t
that is a structure of 4 actual PTEs
- A cleanup of the 8xx pte_update() to use the newly added pmd_off()
- A fix for a crash when handling an oops if CONFIG_DEBUG_VIRTUAL is
enabled
- A minor fix for the SPU syscall generation
Thanks to Aneesh Kumar K.V, Christian Zigotzky, Christophe Leroy, Mike
Rapoport, Nicholas Piggin.
* tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/8xx: Provide ptep_get() with 16k pages
mm: Allow arches to provide ptep_get()
mm/gup: Use huge_ptep_get() in gup_hugepte()
powerpc/syscalls: Use the number when building SPU syscall table
powerpc/8xx: use pmd_off() to access a PMD entry in pte_update()
powerpc/64s: Fix KVM interrupt using wrong save area
powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL
This userspace program includes UAPI headers exported to usr/include/.
'make headers' always works for the target architecture (i.e. the same
architecture as the kernel), so the sample program should be built for
the target as well. Kbuild now supports 'userprogs' for that.
I also guarded the CONFIG option by 'depends on CC_CAN_LINK' because
$(CC) may not provide libc.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
This reverts commit e0b250b57d,
which broke build systems that need to install files to a certain
path, but do not set INSTALL_MOD_PATH when invoking 'make install'.
$ make INSTALL_PATH=/tmp/destdir install
mkdir: cannot create directory ‘/lib/modules/5.8.0-rc1+/’: Permission denied
Makefile:1342: recipe for target '_builtin_inst_' failed
make: *** [_builtin_inst_] Error 1
While modules.builtin is useful also for CONFIG_MODULES=n, this change
in the behavior is quite unexpected. Maybe "make modules_install"
can install modules.builtin irrespective of CONFIG_MODULES as Jonas
originally suggested.
Anyway, that commit should be reverted ASAP.
Reported-by: Douglas Anderson <dianders@chromium.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
The GIC driver uses a RMW sequence to update the affinity, and
relies on the gic_lock_irqsave/gic_unlock_irqrestore sequences
to update it atomically.
But these sequences only expand into anything meaningful if
the BL_SWITCHER option is selected, which almost never happens.
It also turns out that using a RMW and locks is just as silly,
as the GIC distributor supports byte accesses for the GICD_TARGETRn
registers, which when used make the update atomic by definition.
Drop the terminally broken code and replace it by a byte write.
Fixes: 04c8b0f82c ("irqchip/gic: Make locking a BL_SWITCHER only feature")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
readx_poll_timeout() can sleep if @sleep_us is specified by the caller,
and is therefore unsafe to be used inside the atomic context, which is
this case when we use it to poll the GICR_VPENDBASER.Dirty bit in
irq_set_vcpu_affinity() callback.
Let's convert to its atomic version instead which helps to get the v4.1
board back to life!
Fixes: 96806229ca ("irqchip/gic-v4.1: Add support for VPENDBASER's Dirty+Valid signaling")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200605052345.1494-1-yuzenghui@huawei.com
The user tool modinfo is used to get information on kernel modules, including a
description where it is available.
This patch adds a brief MODULE_DESCRIPTION to the following modules:
9p
drop_monitor
esp4_offload
esp6_offload
fou
fou6
ila
sch_fq
sch_fq_codel
sch_hhf
Signed-off-by: Rob Gill <rrobgill@protonmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Schmidt says:
====================
pull-request: ieee802154 for net 2020-06-19
An update from ieee802154 for your *net* tree.
Just two small maintenance fixes to update references to the new project
homepage.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull SCSI fixes from James Bottomley:
"One minor fix and two patches reworking the ata dma drain for the
!CONFIG_LIBATA case. The latter is a 5.7 regression fix"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: Wire up ata_scsi_dma_need_drain for SAS HBA drivers
scsi: libata: Provide an ata_scsi_dma_need_drain stub for !CONFIG_ATA
scsi: ufs-bsg: Fix runtime PM imbalance on error
Pull i2c fixes from Wolfram Sang:
- a small collection of remaining API conversion patches (all acked)
which allow to finally remove the deprecated API
- some documentation fixes and a MAINTAINERS addition
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
MAINTAINERS: Add robert and myself as qcom i2c cci maintainers
i2c: smbus: Fix spelling mistake in the comments
Documentation/i2c: SMBus start signal is S not A
i2c: remove deprecated i2c_new_device API
Documentation: media: convert to use i2c_new_client_device()
video: backlight: tosa_lcd: convert to use i2c_new_client_device()
x86/platform/intel-mid: convert to use i2c_new_client_device()
drm: encoder_slave: use new I2C API
drm: encoder_slave: fix refcouting error for modules
Since iproute2 commit f72c3ad00f3b ("tc: m_tunnel_key: add options
support for vxlan"), the geneve opt output use key word "geneve_opts"
instead of "geneve_opt". To make compatibility for both old and new
iproute2, let's accept both "geneve_opt" and "geneve_opts".
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Typically the firmware takes care that tp->ocp_base is reset to its
default value. That's not the case (at least) for RTL8117.
As a result subsequent PHY access reads/writes the wrong page and
the link is broken. Fix this be resetting tp->ocp_base explicitly.
Fixes: 229c1e0dfd ("r8169: load firmware for RTL8168fp/RTL8117")
Reported-by: Aaron Ma <mapengyu@gmail.com>
Tested-by: Aaron Ma <mapengyu@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Continue the reset path when partner adapter is not ready or H_CLOSED is
returned from reset crq. This patch allows the CRQ init to proceed to
establish a valid CRQ for traffic to flow after reset.
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even with moving netif_tx_disable() to an earlier point when
taking down the queues for a reconfiguration, we still end
up with the occasional netdev watchdog Tx Timeout complaint.
The old method of using netif_trans_update() works fine for
queue 0, but has no effect on the remaining queues. Using
netif_device_detach() allows us to signal to the watchdog to
ignore us for the moment.
Fixes: beead698b1 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the correct the function name in the documentation for
"pcs_parse_one_pinctrl_entry()".
"smux_parse_one_pinctrl_entry()" appears to be an artifact from the
development of a prior patch series ("simple pinmux driver") which
transformed into pinctrl-single.
Signed-off-by: Drew Fustini <drew@beagleboard.org>
Link: https://lore.kernel.org/r/20200612112758.GA3407886@x1
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull tracing fixes from Steven Rostedt:
- Have recordmcount work with > 64K sections (to support LTO)
- kprobe RCU fixes
- Correct a kprobe critical section with missing mutex
- Remove redundant arch_disarm_kprobe() call
- Fix lockup when kretprobe triggers within kprobe_flush_task()
- Fix memory leak in fetch_op_data operations
- Fix sleep in atomic in ftrace trace array sample code
- Free up memory on failure in sample trace array code
- Fix incorrect reporting of function_graph fields in format file
- Fix quote within quote parsing in bootconfig
- Fix return value of bootconfig tool
- Add testcases for bootconfig tool
- Fix maybe uninitialized warning in ftrace pid file code
- Remove unused variable in tracing_iter_reset()
- Fix some typos
* tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Fix maybe-uninitialized compiler warning
tools/bootconfig: Add testcase for show-command and quotes test
tools/bootconfig: Fix to return 0 if succeeded to show the bootconfig
tools/bootconfig: Fix to use correct quotes for value
proc/bootconfig: Fix to use correct quotes for value
tracing: Remove unused event variable in tracing_iter_reset
tracing/probe: Fix memleak in fetch_op_data operations
trace: Fix typo in allocate_ftrace_ops()'s comment
tracing: Make ftrace packed events have align of 1
sample-trace-array: Remove trace_array 'sample-instance'
sample-trace-array: Fix sleeping function called from invalid context
kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
kprobes: Remove redundant arch_disarm_kprobe() call
kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex
kprobes: Use non RCU traversal APIs on kprobe_tables if possible
kprobes: Suppress the suspicious RCU warning on kprobes
recordmcount: support >64k sections
Pull libnvdimm updates from Dan Williams:
"A feature (papr_scm health retrieval) and a fix (sysfs attribute
visibility) for v5.8.
Vaibhav explains in the merge commit below why missing v5.8 would be
painful and I agreed to try a -rc2 pull because only cosmetics kept
this out of -rc1 and his initial versions were posted in more than
enough time for v5.8 consideration:
'These patches are tied to specific features that were committed to
customers in upcoming distros releases (RHEL and SLES) whose
time-lines are tied to 5.8 kernel release.
Being able to track the health of an nvdimm is critical for our
customers that are running workloads leveraging papr-scm nvdimms.
Missing the 5.8 kernel would mean missing the distro timelines and
shifting forward the availability of this feature in distro kernels
by at least 6 months'
Summary:
- Fix the visibility of the region 'align' attribute.
The new unit tests for region alignment handling caught a corner
case where the alignment cannot be specified if the region is
converted from static to dynamic provisioning at runtime.
- Add support for device health retrieval for the persistent memory
supported by the papr_scm driver.
This includes both the standard sysfs "health flags" that the nfit
persistent memory driver publishes and a mechanism for the ndctl
tool to retrieve a health-command payload"
* tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nvdimm/region: always show the 'align' attribute
powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
powerpc/papr_scm: Fetch nvdimm health information from PHYP
seq_buf: Export seq_buf_printf
powerpc: Document details on H_SCM_HEALTH hcall
This reverts commit ba40324261.
After commit 26d8cde526 ("pinctrl: freescale: imx: add shared
input select reg support"). i.MX7D has two iomux controllers
iomuxc and iomuxc-lpsr which share select_input register for
daisy chain settings.
If use 'devm_of_iomap()', when probe the iomuxc-lpsr, will call
devm_request_mem_region() for the region <0x30330000-0x3033ffff>
for the first time. Then, next time when probe the iomuxc, API
devm_platform_ioremap_resource() will also use the API
devm_request_mem_region() for the share region <0x30330000-0x3033ffff>
again, then cause issue, log like below:
[ 0.179561] imx7d-pinctrl 302c0000.iomuxc-lpsr: initialized IMX pinctrl driver
[ 0.191742] imx7d-pinctrl 30330000.pinctrl: can't request region for resource [mem 0x30330000-0x3033ffff]
[ 0.191842] imx7d-pinctrl: probe of 30330000.pinctrl failed with error -16
Fixes: ba40324261 ("pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Link: https://lore.kernel.org/r/1591673223-1680-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull s390 fixes from Vasily Gorbik:
- a few ptrace fixes mostly for strace and seccomp_bpf kernel tests
findings
- cleanup unused pm callbacks in virtio ccw
- replace kmalloc + memset with kzalloc in crypto
- use $(LD) for vDSO linkage to make clang happy
- fix vDSO clock_getres() to preserve the same behaviour as
posix_get_hrtimer_res()
- fix workqueue cpumask warning when NUMA=n and nr_node_ids=2
- reduce SLSB writes during input processing, improve warnings and
cleanup qdio_data usage in qdio
- a few fixes to use scnprintf() instead of snprintf()
* tag 's390-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: fix syscall_get_error for compat processes
s390/qdio: warn about unexpected SLSB states
s390/qdio: clean up usage of qdio_data
s390/numa: let NODES_SHIFT depend on NEED_MULTIPLE_NODES
s390/vdso: fix vDSO clock_getres()
s390/vdso: Use $(LD) instead of $(CC) to link vDSO
s390/protvirt: use scnprintf() instead of snprintf()
s390: use scnprintf() in sys_##_prefix##_##_name##_show
s390/crypto: use scnprintf() instead of snprintf()
s390/zcrypt: use kzalloc
s390/virtio: remove unused pm callbacks
s390/qdio: reduce SLSB writes during Input Queue processing
selftests/seccomp: s390 shares the syscall and return value register
s390/ptrace: fix setting syscall number
s390/ptrace: pass invalid syscall numbers to tracing
s390/ptrace: return -ENOSYS when invalid syscall is supplied
s390/seccomp: pass syscall arguments via seccomp_data
s390/qdio: fine-tune SLSB update
Pull RISC-V fixes from Palmer Dabbelt:
- a workaround for a compiler surprise related to the "r" inline
assembly that allows LLVM to boot.
- a fix to avoid WX-only mappings, which the ISA does not allow. While
this probably manifests in many ways, the bug was found in stress-ng.
- a missing lock in set_direct_map_*(), which due to a recent lockdep
change started asserting.
* tag 'riscv-for-linus-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Acquire mmap lock before invoking walk_page_range
RISC-V: Don't allow write+exec only page mapping request in mmap
riscv/atomic: Fix sign extension for RV64I
Pull kselftest cleanups from Shuah Khan:
- ftrace "requires:" list for simplifying and unifying requirement
checks for each test case, adding "requires:" line instead of
checking required ftrace interfaces in each test case.
- a minor spelling correction patch
* tag 'linux-kselftest-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/ftrace: Support ":README" suffix for requires
selftests/ftrace: Support ":tracer" suffix for requires
selftests/ftrace: Convert check_filter_file() with requires list
selftests/ftrace: Convert required interface checks into requires list
selftests/ftrace: Add "requires:" list support
selftests/ftrace: Return unsupported for the unconfigured features
selftests/ftrace: Allow ":" in description
tools: testing: ftrace: trigger: fix spelling mistake
The fileserver probe timer, net->fs_probe_timer, isn't cancelled when
the kafs module is being removed and so the count it holds on
net->servers_outstanding doesn't get dropped..
This causes rmmod to wait forever. The hung process shows a stack like:
afs_purge_servers+0x1b5/0x23c [kafs]
afs_net_exit+0x44/0x6e [kafs]
ops_exit_list+0x72/0x93
unregister_pernet_operations+0x14c/0x1ba
unregister_pernet_subsys+0x1d/0x2a
afs_exit+0x29/0x6f [kafs]
__do_sys_delete_module.isra.0+0x1a2/0x24b
do_syscall_64+0x51/0x95
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by:
(1) Attempting to cancel the probe timer and, if successful, drop the
count that the timer was holding.
(2) Make the timer function just drop the count and not schedule the
prober if the afs portion of net namespace is being destroyed.
Also, whilst we're at it, make the following changes:
(3) Initialise net->servers_outstanding to 1 and decrement it before
waiting on it so that it doesn't generate wake up events by being
decremented to 0 until we're cleaning up.
(4) Switch the atomic_dec() on ->servers_outstanding for ->fs_timer in
afs_purge_servers() to use the helper function for that.
Fixes: f6cbb368bc ("afs: Actively poll fileservers to maintain NAT or firewall openings")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix afs_do_lookup()'s fallback case for when FS.InlineBulkStatus isn't
supported by the server.
In the fallback, it calls FS.FetchStatus for the specific vnode it's
meant to be looking up. Commit b6489a49f7 broke this by renaming one
of the two identically-named afs_fetch_status_operation descriptors to
something else so that one of them could be made non-static. The site
that used the renamed one, however, wasn't renamed and didn't produce
any warning because the other was declared in a header.
Fix this by making afs_do_lookup() use the renamed variant.
Note that there are two variants of the success method because one is
called from ->lookup() where we may or may not have an inode, but can't
call iget until after we've talked to the server - whereas the other is
called from within iget where we have an inode, but it may or may not be
initialised.
The latter variant expects there to be an inode, but because it's being
called from there former case, there might not be - resulting in an oops
like the following:
BUG: kernel NULL pointer dereference, address: 00000000000000b0
...
RIP: 0010:afs_fetch_status_success+0x27/0x7e
...
Call Trace:
afs_wait_for_operation+0xda/0x234
afs_do_lookup+0x2fe/0x3c1
afs_lookup+0x3c5/0x4bd
__lookup_slow+0xcd/0x10f
walk_component+0xa2/0x10c
path_lookupat.isra.0+0x80/0x110
filename_lookup+0x81/0x104
vfs_statx+0x76/0x109
__do_sys_newlstat+0x39/0x6b
do_syscall_64+0x4c/0x78
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: b6489a49f7 ("afs: Fix silly rename")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
READ_ONCE() now enforces atomic read, which leads to:
CC mm/gup.o
In file included from ./include/linux/kernel.h:11:0,
from mm/gup.c:2:
In function 'gup_hugepte.constprop',
inlined from 'gup_huge_pd.isra.79' at mm/gup.c:2465:8:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_222' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
prefix ## suffix(); \
^
./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert'
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
^
./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
compiletime_assert_rwonce_type(x); \
^
mm/gup.c:2428:8: note: in expansion of macro 'READ_ONCE'
pte = READ_ONCE(*ptep);
^
In function 'gup_get_pte',
inlined from 'gup_pte_range' at mm/gup.c:2228:9,
inlined from 'gup_pmd_range' at mm/gup.c:2613:15,
inlined from 'gup_pud_range' at mm/gup.c:2641:15,
inlined from 'gup_p4d_range' at mm/gup.c:2666:15,
inlined from 'gup_pgd_range' at mm/gup.c:2694:15,
inlined from 'internal_get_user_pages_fast' at mm/gup.c:2795:3:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_219' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
prefix ## suffix(); \
^
./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert'
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
^
./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
compiletime_assert_rwonce_type(x); \
^
mm/gup.c:2199:9: note: in expansion of macro 'READ_ONCE'
return READ_ONCE(*ptep);
^
make[2]: *** [mm/gup.o] Error 1
Define ptep_get() on 8xx when using 16k pages.
Fixes: 9e343b467c ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/341688399c1b102756046d19ea6ce39db1ae4742.1592225558.git.christophe.leroy@csgroup.eu
The ETF qdisc can queue skbs that it could not pace on the errqueue.
Address a few issues in the selftest
- recv buffer size was too small, and incorrectly calculated
- compared errno to ee_code instead of ee_errno
- missed invalid request error type
v2:
- fix a few checkpatch --strict indentation warnings
Fixes: ea6a547669 ("selftests/net: make so_txtime more robust to timer variance")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The max MTU limit defined for ibmveth is not accounting for
virtual ethernet buffer overhead, which is twenty-two additional
bytes set aside for the ethernet header and eight additional bytes
of an opaque handle reserved for use by the hypervisor. Update the
max MTU to reflect this overhead.
Fixes: d894be57ca ("ethernet: use net core MTU range checking in more drivers")
Fixes: 110447f826 ("ethernet: fix min/max MTU typos")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
wenxu says:
====================
several fixes for indirect flow_blocks offload
v2:
patch2: store the cb_priv of representor to the flow_block_cb->indr.cb_priv
in the driver. And make the correct check with the statments
this->indr.cb_priv == cb_priv
patch4: del the driver list only in the indriect cleanup callbacks
v3:
add the cover letter and changlogs.
v4:
collapsed 1/4, 2/4, 4/4 in v3 to one fix
Add the prepare patch 1 and 2
v5:
patch1: place flow_indr_block_cb_alloc() right before
flow_indr_dev_setup_offload() to avoid moving flow_block_indr_init()
This series fixes commit 1fac52da59 ("net: flow_offload: consolidate
indirect flow_block infrastructure") that revists the flow_block
infrastructure.
patch #1#2: prepare for fix patch #3
add and use flow_indr_block_cb_alloc/remove function
patch #3: fix flow_indr_dev_unregister path
If the representor is removed, then identify the indirect flow_blocks
that need to be removed by the release callback and the port representor
structure. To identify the port representor structure, a new
indr.cb_priv field needs to be introduced. The flow_block also needs to
be removed from the driver list from the cleanup path
patch#4 fix block->nooffloaddevcnt warning dmesg log.
When a indr device add in offload success. The block->nooffloaddevcnt
should be 0. After the representor go away. When the dir device go away
the flow_block UNBIND operation with -EOPNOTSUPP which lead the warning
demesg log.
The block->nooffloaddevcnt should always count for indr block.
even the indr block offload successful. The representor maybe
gone away and the ingress qdisc can work in software mode.
====================
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the representor is removed, then identify the indirect flow_blocks
that need to be removed by the release callback and the port representor
structure. To identify the port representor structure, a new
indr.cb_priv field needs to be introduced. The flow_block also needs to
be removed from the driver list from the cleanup path.
Fixes: 1fac52da59 ("net: flow_offload: consolidate indirect flow_block infrastructure")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prepare fix the bug in the next patch. use flow_indr_block_cb_alloc/remove
function and remove the __flow_block_indr_binding.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add flow_indr_block_cb_alloc/remove function for next fix patch.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, trying to change the DF parameter of a geneve device does
nothing:
# ip -d link show geneve1
14: geneve1: <snip>
link/ether <snip>
geneve id 1 remote 10.0.0.1 ttl auto df set dstport 6081 <snip>
# ip link set geneve1 type geneve id 1 df unset
# ip -d link show geneve1
14: geneve1: <snip>
link/ether <snip>
geneve id 1 remote 10.0.0.1 ttl auto df set dstport 6081 <snip>
We just need to update the value in geneve_changelink.
Fixes: a025fb5f49 ("geneve: Allow configuration of DF behaviour")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VLAN tag insertion/extraction offload is correctly
activated at probe time but deactivation of this feature
(i.e. via ethtool) is broken. Toggling works only for
Tx/Rx ring 0 of a PF, and is ignored for the other rings,
including the VF rings.
To fix this, the existing VLAN offload toggling code
was extended to all the rings assigned to a netdevice,
instead of the default ring 0 (likely a leftover from the
early validation days of this feature). And the code was
moved to the common set_features() function to fix toggling
for the VF driver too.
Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Undo previously done operation in case macb_phylink_connect()
fails. Since macb_reset_hw() is the 1st undo operation the
napi_exit label was renamed to reset_hw.
Fixes: 7897b071ac ("net: macb: convert to phylink")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells says:
====================
rxrpc: Performance drop fix and other fixes
Here are three fixes for rxrpc:
(1) Fix a trace symbol mapping. It doesn't seem to let you map to "".
(2) Fix the handling of the remote receive window size when it increases
beyond the size we can support for our transmit window.
(3) Fix a performance drop caused by retransmitted packets being
accidentally marked as already ACK'd.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Include the papr_scm health retrieval feature for v5.8-rc2. The
functionality was initially posted well in advance of the merge window,
but review comments and a late build-bot warning kept them out of the
v5.8-rc1 libnvdimm pull request.
Vaibhav notes:
These patches are tied to specific features that were committed to
customers in upcoming distros releases (RHEL and SLES) whose time-lines
are tied to 5.8 kernel release.
Being able to track the health of an nvdimm is critical for our
customers that are running workloads leveraging papr-scm nvdimms.
Missing the 5.8 kernel would mean missing the distro timelines and
shifting forward the availability of this feature in distro kernels by
at least 6 months.
Florian Fainelli says:
====================
net: phy: MDIO bus scanning fixes
This patch series fixes two problems with the current MDIO bus scanning
logic which was identified while moving from 4.9 to 5.4 on devices that
do rely on scanning the MDIO bus at runtime because they use pluggable
cards.
Changes in v2:
- added comment explaining the special value of -ENODEV
- added Andrew's Reviewed-by tag
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 02a6efcab6 ("net: phy: allow scanning busses with missing
phys") added a special condition to return -ENODEV in case -ENODEV or
-EIO was returned from the first read of the MII_PHYSID1 register.
In case the MDIO bus data line pull-up is not strong enough, the MDIO
bus controller will not flag this as a read error. This can happen when
a pluggable daughter card is not connected and weak internal pull-ups
are used (since that is the only option, otherwise the pins are
floating).
The second read of MII_PHYSID2 will be correctly flagged an error
though, but now we will return -EIO which will be treated as a hard
error, thus preventing MDIO bus scanning loops to continue succesfully.
Apply the same logic to both register reads, thus allowing the scanning
logic to proceed.
Fixes: 02a6efcab6 ("net: phy: allow scanning busses with missing phys")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 209c65b61d ("drivers/of/of_mdio.c:fix of_mdiobus_register()")
introduced a break of the loop on the premise that a successful
registration should exit the loop. The premise is correct but not to
code, because rc && rc != -ENODEV is just a special error condition,
that means we would exit the loop even with rc == -ENODEV which is
absolutely not correct since this is the error code to indicate to the
MDIO bus layer that scanning should continue.
Fix this by explicitly checking for rc = 0 as the only valid condition
to break out of the loop.
Fixes: 209c65b61d ("drivers/of/of_mdio.c:fix of_mdiobus_register()")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull io_uring fixes from Jens Axboe:
- Catch a case where io_sq_thread() didn't do proper mm acquire
- Ensure poll completions are reaped on shutdown
- Async cancelation and run fixes (Pavel)
- io-poll race fixes (Xiaoguang)
- Request cleanup race fix (Xiaoguang)
* tag 'io_uring-5.8-2020-06-19' of git://git.kernel.dk/linux-block:
io_uring: fix possible race condition against REQ_F_NEED_CLEANUP
io_uring: reap poll completions while waiting for refs to drop on exit
io_uring: acquire 'mm' for task_work for SQPOLL
io_uring: add memory barrier to synchronize io_kiocb's result and iopoll_completed
io_uring: don't fail links for EAGAIN error in IOPOLL mode
io_uring: cancel by ->task not pid
io_uring: lazy get task
io_uring: batch cancel in io_uring_cancel_files()
io_uring: cancel all task's requests on exit
io-wq: add an option to cancel all matched reqs
io-wq: reorder cancellation pending -> running
io_uring: fix lazy work init
Pull block fixes from Jens Axboe:
- Use import_uuid() where appropriate (Andy)
- bcache fixes (Coly, Mauricio, Zhiqiang)
- blktrace sparse warnings fix (Jan)
- blktrace concurrent setup fix (Luis)
- blkdev_get use-after-free fix (Jason)
- Ensure all blk-mq maps are updated (Weiping)
- Loop invalidate bdev fix (Zheng)
* tag 'block-5.8-2020-06-19' of git://git.kernel.dk/linux-block:
block: make function 'kill_bdev' static
loop: replace kill_bdev with invalidate_bdev
partitions/ldm: Replace uuid_copy() with import_uuid() where it makes sense
block: update hctx map when use multiple maps
blktrace: Avoid sparse warnings when assigning q->blk_trace
blktrace: break out of blktrace setup on concurrent calls
block: Fix use-after-free in blkdev_get()
trace/events/block.h: drop kernel-doc for dropped function parameter
blk-mq: Remove redundant 'return' statement
bcache: pr_info() format clean up in bcache_device_init()
bcache: use delayed kworker fo asynchronous devices registration
bcache: check and adjust logical block size for backing devices
bcache: fix potential deadlock problem in btree_gc_coalesce
Pull libata fixes from Jens Axboe:
"A few minor changes that should go into this release"
* tag 'libata-5.8-2020-06-19' of git://git.kernel.dk/linux-block:
libata: Use per port sync for detach
ata/libata: Fix usage of page address by page_address in ata_scsi_mode_select_xlat function
sata_rcar: handle pm_runtime_get_sync failure cases
Steffen Klassert says:
====================
pull request (net): ipsec 2020-06-19
1) Fix double ESP trailer insertion in IPsec crypto offload if
netif_xmit_frozen_or_stopped is true. From Huy Nguyen.
2) Merge fixup for "remove output_finish indirection from
xfrm_state_afinfo". From Stephen Rothwell.
3) Select CRYPTO_SEQIV for ESP as this is needed for GCM and several
other encryption algorithms. Also modernize the crypto algorithm
selections for ESP and AH, remove those that are maked as "MUST NOT"
and add those that are marked as "MUST" be implemented in RFC 8221.
From Eric Biggers.
Please note the merge conflict between commit:
a7f7f6248d ("treewide: replace '---help---' in Kconfig files with 'help'")
from Linus' tree and commits:
7d4e391959 ("esp, ah: consolidate the crypto algorithm selections")
be01369859 ("esp, ah: modernize the crypto algorithm selections")
from the ipsec tree.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2020-06-18
This series contains fixes to ixgbe, i40e and ice driver.
Ciara fixes up the ixgbe, i40e and ice drivers to protect access when
allocating and freeing the rings. In addition, made use of READ_ONCE
when reading the rings prior to accessing the statistics pointer.
Björn fixes a crash where the receive descriptor ring allocation was
moved to a different function, which broke the ethtool set_ringparam()
hook.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull drm fixes from Dave Airlie:
"Just i915 and amd here.
i915 has some workaround movement so they get applied at the right
times, and a timeslicing fix, along with some display fixes.
AMD has a few display floating point fix and a devcgroup fix for
amdkfd.
i915:
- Fix for timeslicing and virtual engines/unpremptable requests (+ 1
dependency patch)
- Fixes into TypeC register programming and interrupt storm detecting
- Disable DIP on MST ports with the transcoder clock still on
- Avoid missing GT workarounds at reset for HSW and older gens
- Fix for unwinding multiple requests missing force restore
- Fix encoder type check for DDI vswing sequence
- Build warning fixes
amdgpu:
- Fix kvfree/kfree mixup
- Fix hawaii device id in powertune configuration
- Display FP fixes
- Documentation fixes
amdkfd:
- devcgroup check fix"
* tag 'drm-fixes-2020-06-19' of git://anongit.freedesktop.org/drm/drm: (23 commits)
drm/amdgpu: fix documentation around busy_percentage
drm/amdgpu/pm: update comment to clarify Overdrive interfaces
drm/amdkfd: Use correct major in devcgroup check
drm/i915/display: Fix the encoder type check
drm/i915/icl+: Fix hotplug interrupt disabling after storm detection
drm/i915/gt: Move gen4 GT workarounds from init_clock_gating to workarounds
drm/i915/gt: Move ilk GT workarounds from init_clock_gating to workarounds
drm/i915/gt: Move snb GT workarounds from init_clock_gating to workarounds
drm/i915/gt: Move vlv GT workarounds from init_clock_gating to workarounds
drm/i915/gt: Move ivb GT workarounds from init_clock_gating to workarounds
drm/i915/gt: Move hsw GT workarounds from init_clock_gating to workarounds
drm/i915/icl: Disable DIP on MST ports with the transcoder clock still on
drm/i915/gt: Incrementally check for rewinding
drm/i915/tc: fix the reset of ln0
drm/i915/gt: Prevent timeslicing into unpreemptable requests
drm/i915/selftests: Restore to default heartbeat
drm/i915: work around false-positive maybe-uninitialized warning
drm/i915/pmu: avoid an maybe-uninitialized warning
drm/i915/gt: Incorporate the virtual engine into timeslicing
drm/amd/display: Rework dsc to isolate FPU operations
...
Pull ceph fixes from Ilya Dryomov:
"An important follow-up for replica reads support that went into -rc1
and two target_copy() fixups"
* tag 'ceph-for-5.8-rc2' of git://github.com/ceph/ceph-client:
libceph: don't omit used_replica in target_copy()
libceph: don't omit recovery_deletes in target_copy()
libceph: move away from global osd_req_flags
Pull arm64 fixes from Will Deacon:
"Unfortunately, we still have a number of outstanding issues so there
will be more fixes to come, but this lot are a good start.
- Fix handling of watchpoints triggered by uaccess routines
- Fix initialisation of gigantic pages for CMA buffers
- Raise minimum clang version for BTI to avoid miscompilation
- Fix data race in SVE vector length configuration code
- Ensure address tags are ignored in kern_addr_valid()
- Dump register state on fatal BTI exception
- kexec_file() cleanup to use struct_size() macro"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hw_breakpoint: Don't invoke overflow handler on uaccess watchpoints
arm64: kexec_file: Use struct_size() in kmalloc()
arm64: mm: reserve hugetlb CMA after numa_init
arm64: bti: Require clang >= 10.0.1 for in-kernel BTI support
arm64: sve: Fix build failure when ARM64_SVE=y and SYSCTL=n
arm64: pgtable: Clear the GP bit for non-executable kernel pages
arm64: mm: reset address tag set by kasan sw tagging
arm64: traps: Dump registers prior to panic() in bad_mode()
arm64/sve: Eliminate data races on sve_default_vl
docs/arm64: Fix typo'd #define in sve.rst
arm64: remove TEXT_OFFSET randomization
Pull flex-array size helper from Kees Cook:
"During the treewide clean-ups of zero-length "flexible arrays", the
struct_size() helper was heavily used, but it was noticed that many
times it would have been nice to have an additional helper to get the
size of just the flexible array itself.
This need appears to be even more common when cleaning up the 1-byte
array "flexible arrays", so Gustavo implemented it.
I'd love to get this landed early so it can be used during the v5.9
dev cycle to ease the 1-byte array cleanups."
* tag 'overflow-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
overflow.h: Add flex_array_size() helper
Pull perf tooling fixes from Arnaldo Carvalho de Melo:
- Update various UAPI headers, some automatically adding support for a
new MSR and the faccess2 syscall.
- Fix corner case NULL deref in the histograms code.
- Fix corner case NULL deref in 'perf stat' aggregation code.
- Fix array pointer deref and old style declaration in the parsing of
events.
- Fix segfault when processing ZSTD compressed perf.data files in 'perf
script' due to lack of initialization of the ZSTD library.
- Handle __attribute__((user)) in libtraceevent fixing the parsing of
syscall tracepoints with user buffers.
- Make libtraevent aware of __builtin_expect() appearing in tracepoint
fields.
- Make the BPF prologue generation use bpf_probe_read_{user,kernel}().
- Fix the '@user' attribute parsing in kprobes variables in 'perf
probe'.
- Fix error message when asking for -fsanitize=address without required
libraries.
* tag 'perf-tools-fixes-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (22 commits)
perf build: Fix error message when asking for -fsanitize=address without required libraries
tools lib traceevent: Add handler for __builtin_expect()
tools lib traceevent: Handle __attribute__((user)) in field names
tools lib traceevent: Add append() function helper for appending strings
tools headers UAPI: Sync linux/fs.h with the kernel sources
tools include UAPI: Sync linux/vhost.h with the kernel sources
tools arch x86: Sync the msr-index.h copy with the kernel sources
perf script: Initialize zstd_data
perf pmu: Remove unused declaration
perf parse-events: Fix an old style declaration
perf parse-events: Fix an incompatible pointer
perf bpf: Fix bpf prologue generation
perf probe: Fix user attribute access in kprobes
perf stat: Fix NULL pointer dereference
perf report: Fix NULL pointer dereference in hists__fprintf_nr_sample_events()
tools headers UAPI: Sync kvm.h headers with the kernel sources
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
perf beauty: Add support to STATX_MNT_ID in the 'statx' syscall 'mask' argument
tools headers uapi: Sync linux/stat.h with the kernel sources
...
Add cond_resched() to a loop that fills in the mapper memory area
because the loop can be executed many times.
Fixes: 48debafe4f ("dm: add writecache target")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
x86 CPUs can suffer severe performance drops if a tight loop, such as
the ones in __clear_user(), straddles a 16-byte instruction fetch
window, or worse, a 64-byte cacheline. This issues was discovered in the
SUSE kernel with the following commit,
1153933703 ("x86/asm/64: Micro-optimize __clear_user() - Use immediate constants")
which increased the code object size from 10 bytes to 15 bytes and
caused the 8-byte copy loop in __clear_user() to be split across a
64-byte cacheline.
Aligning the start of the loop to 16-bytes makes this fit neatly inside
a single instruction fetch window again and restores the performance of
__clear_user() which is used heavily when reading from /dev/zero.
Here are some numbers from running libmicro's read_z* and pread_z*
microbenchmarks which read from /dev/zero:
Zen 1 (Naples)
libmicro-file
5.7.0-rc6 5.7.0-rc6 5.7.0-rc6
revert-1153933703d9+ align16+
Time mean95-pread_z100k 9.9195 ( 0.00%) 5.9856 ( 39.66%) 5.9938 ( 39.58%)
Time mean95-pread_z10k 1.1378 ( 0.00%) 0.7450 ( 34.52%) 0.7467 ( 34.38%)
Time mean95-pread_z1k 0.2623 ( 0.00%) 0.2251 ( 14.18%) 0.2252 ( 14.15%)
Time mean95-pread_zw100k 9.9974 ( 0.00%) 6.0648 ( 39.34%) 6.0756 ( 39.23%)
Time mean95-read_z100k 9.8940 ( 0.00%) 5.9885 ( 39.47%) 5.9994 ( 39.36%)
Time mean95-read_z10k 1.1394 ( 0.00%) 0.7483 ( 34.33%) 0.7482 ( 34.33%)
Note that this doesn't affect Haswell or Broadwell microarchitectures
which seem to avoid the alignment issue by executing the loop straight
out of the Loop Stream Detector (verified using perf events).
Fixes: 1153933703 ("x86/asm/64: Micro-optimize __clear_user() - Use immediate constants")
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.19+
Link: https://lkml.kernel.org/r/20200618102002.30034-1-matt@codeblueprint.co.uk
When dm zoned has multiple devices, random zones are never selected for
reclaim if all reserved sequential write zones are in use and no
sequential write required zones can be selected for reclaim. This can
lead to deadlocks as selecting a cache zone allows reclaiming a
sequential zone, ensuring forward progress.
Fix this by always defaulting to selecting a random zone when no
sequential write required zone can be selected.
[Damien: fix commit message]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Commit 2094045fe5 ("dm zoned: prefer full zones for reclaim")
modified dmz_get_rnd_zone_for_reclaim() to add a search for the buffer
zone with the heaviest weight as an optimal candidate for reclaim. This
modification uses the zone pointer variabl "last" which is set only once
and never modified as zones are scanned, resulting in the search being
inefective. Furthermore, if the selected buffer zone at the end of the
search loop is active or already locked for reclaim,
dmz_get_rnd_zone_for_reclaim() returns NULL even if other random zones
with a lesser weight can be reclaimed.
To fix the search and to guarantee that reclaim can make forward
progress, fix dmz_get_rnd_zone_for_reclaim() loop to correctly find
the buffer zone with the heaviest weight using the variable maxw_z.
Also make sure to fallback to finding the first random zone that can
be reclaimed if this best candidate zone cannot be reclaimed.
While at it, also fix the device index check to consider only random
zones, ignoring cache zones belonging to the cache device if one is
used as that device does not have a reclaim process.
Fixes: 2094045fe5 ("dm zoned: prefer full zones for reclaim")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Naohiro reported that issuing zone-append bios to a zoned block device
underneath a dm-linear device does not work as expected.
This because we forgot to reverse-map the sector the device wrote to the
original bio.
For zone-append bios, get the offset in the zone of the written sector
from the clone bio and add that to the original bio's sector position.
Fixes: 0512a75b98 ("block: Introduce REQ_OP_ZONE_APPEND")
Cc: stable@vger.kernel.org
Reported-by: Naohiro Aota <Naohiro.Aota@wdc.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
When dm zoned has multiple devices, metadata is on the cache device, not
in random zones of the zoned devices. Then the number of metadata zones
shall be checked with the number of cache zones, not random zones.
Fixes: 34f5affd04 ("dm zoned: separate random and cache zones")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Solves this Sphinx warning:
Documentation/admin-guide/device-mapper/dm-ebs.rst: WARNING: document isn't included in any toctree
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
If ib_dma_mapping_error() returns non-zero value,
ib_mad_post_receive_mads() will jump out of loops and return -ENOMEM
without freeing mad_priv. Fix this memory-leak problem by freeing mad_priv
in this case.
Fixes: 2c34e68f42 ("IB/mad: Check and handle potential DMA mapping errors")
Link: https://lore.kernel.org/r/20200612063824.180611-1-guofan5@huawei.com
Signed-off-by: Fan Guo <guofan5@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Guest crashes are observed on a Cascade Lake system when 'perf top' is
launched on the host, e.g.
BUG: unable to handle kernel paging request at fffffe0000073038
PGD 7ffa7067 P4D 7ffa7067 PUD 7ffa6067 PMD 7ffa5067 PTE ffffffffff120
Oops: 0000 [#1] SMP PTI
CPU: 1 PID: 1 Comm: systemd Not tainted 4.18.0+ #380
...
Call Trace:
serial8250_console_write+0xfe/0x1f0
call_console_drivers.constprop.0+0x9d/0x120
console_unlock+0x1ea/0x460
Call traces are different but the crash is imminent. The problem was
blindly bisected to the commit 041bc42ce2 ("KVM: VMX: Micro-optimize
vmexit time when not exposing PMU"). It was also confirmed that the
issue goes away if PMU is exposed to the guest.
With some instrumentation of the guest we can see what is being switched
(when we do atomic_switch_perf_msrs()):
vmx_vcpu_run: switching 2 msrs
vmx_vcpu_run: switching MSR38f guest: 70000000d host: 70000000f
vmx_vcpu_run: switching MSR3f1 guest: 0 host: 2
The current guess is that PEBS (MSR_IA32_PEBS_ENABLE, 0x3f1) is to blame.
Regardless of whether PMU is exposed to the guest or not, PEBS needs to
be disabled upon switch.
This reverts commit 041bc42ce2.
Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200619094046.654019-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix spelling mistake in the comments with help of `codespell`.
seperate ==> separate
Signed-off-by: Keyur Patel <iamkeyur96@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Just like all other I2C/SMBus commands, the start signal for the SMBus
Quick Command is S, not A.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Daniel Schaefer <git@danielschaefer.me>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Using uhid and KASAN this driver crashed because it was getting
several connection events where it only expected one. Then the
device was added several times to the static device list and it got
corrupted.
This patch checks if the device is already in the list before adding
it.
Signed-off-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com>
Tested-by: Siarhei Vishniakou <svv@google.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
All in-tree users have been converted to the new i2c_new_client_device
function, so remove this deprecated one.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
i2c_new_client() is deprecated, use the replacement
i2c_new_client_device(). Also, we have a helper to check if a driver is
bound. Use it to simplify the code. Note that this changes the errno for
a failed device creation from ENOMEM to ENODEV. No callers currently
interpret this errno, though, so we use this condensed error check.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
When the AF_XDP buffer allocator was introduced, the Rx SW ring
"rx_bi" allocation was moved from i40e_setup_rx_descriptors()
function, and was instead done in the i40e_configure_rx_ring()
function.
This broke the ethtool set_ringparam() hook for changing the Rx
descriptor count, which was relying on i40e_setup_rx_descriptors() to
handle the allocation.
Fix this by adding an explicit i40e_alloc_rx_bi() call to
i40e_set_ringparam().
Fixes: be1222b585 ("i40e: Separate kernel allocated rx_bi rings from AF_XDP rings")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The READ_ONCE macro is used when reading rings prior to accessing the
statistics pointer. The corresponding WRITE_ONCE usage when allocating and
freeing the rings to ensure protected access was not in place. Introduce
this.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
READ_ONCE should be used when reading rings prior to accessing the
statistics pointer. Introduce this as well as the corresponding WRITE_ONCE
usage when allocating and freeing the rings, to ensure protected access.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
READ_ONCE should be used when reading rings prior to accessing the
statistics pointer. Introduce this as well as the corresponding WRITE_ONCE
usage when allocating and freeing the rings, to ensure protected access.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Commit 3b33583265 ("net: Add fraglist GRO/GSO feature flags") missed
an entry for NETIF_F_GSO_FRAGLIST in netdev_features_strings array. As
a result, fraglist GSO feature is not shown in 'ethtool -k' output and
can't be toggled on/off.
The fix is trivial.
Fixes: 3b33583265 ("net: Add fraglist GRO/GSO feature flags")
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver function tg3_io_error_detected() calls napi_disable twice,
without an intervening napi_enable, when the number of EEH errors exceeds
eeh_max_freezes, resulting in an indefinite sleep while holding rtnl_lock.
Add check for pcierr_recovery which skips code already executed for the
"Frozen" state.
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann says:
====================
s390/qeth: fixes 2020-06-17
please apply the following patch series for qeth to netdev's net tree.
The first patch fixes a regression in the error handling for a specific
cmd type. I have some follow-ups queued up for net-next to clean this
up properly...
The second patch fine-tunes the HW offload restrictions that went in
with this merge window. In some setups we don't need to apply them.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When a device is configured with ISOLATION_MODE_FWD, traffic never goes
through the internal switch. Don't apply the offload restrictions in
this case.
Fixes: c619e9a6f5 ("s390/qeth: don't use restricted offloads for local traffic")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current(?) OSA devices also store their cmd-specific return codes for
SET_ACCESS_CONTROL cmds into the top-level cmd->hdr.return_code.
So once we added stricter checking for the top-level field a while ago,
none of the error logic that rolls back the user's configuration to its
old state is applied any longer.
For this specific cmd, go back to the old model where we peek into the
cmd structure even though the top-level field indicated an error.
Fixes: 686c97ee29 ("s390/qeth: fix error handling in adapter command callbacks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni says:
====================
mptcp: cope with syncookie on MP_JOINs
Currently syncookies on MP_JOIN connections are not handled correctly: the
connections fallback to TCP and are kept alive instead of resetting them at
fallback time.
The first patch propagates the required information up to syn_recv_sock time,
and the 2nd patch addresses the unifying the error path for all MP_JOIN
requests.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently any MPTCP socket using syn cookies will fallback to
TCP at 3rd ack time. In case of MP_JOIN requests, the RFC mandate
closing the child and sockets, but the existing error paths
do not handle the syncookie scenario correctly.
Address the issue always forcing the child shutdown in case of
MP_JOIN fallback.
Fixes: ae2dd71649 ("mptcp: handle tcp fallback when using syn cookies")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The msk ownership is transferred to the child socket at
3rd ack time, so that we avoid more lookups later. If the
request does not reach the 3rd ack, the MSK reference is
dropped at request sock release time.
As a side effect, fallback is now tracked by a NULL msk
reference instead of zeroed 'mp_join' field. This will
simplify the next patch.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ie.,
$ ifconfig eth0 6.6.6.6 netmask 255.255.255.0
$ ip rule add from 6.6.6.6 table 6666
$ ip route add 9.9.9.9 via 6.6.6.6
$ ping -I 6.6.6.6 9.9.9.9
PING 9.9.9.9 (9.9.9.9) from 6.6.6.6 : 56(84) bytes of data.
3 packets transmitted, 0 received, 100% packet loss, time 2079ms
$ arp
Address HWtype HWaddress Flags Mask Iface
6.6.6.6 (incomplete) eth0
The arp request address is error, this is because fib_table_lookup in
fib_check_nh lookup the destnation 9.9.9.9 nexthop, the scope of
the fib result is RT_SCOPE_LINK,the correct scope is RT_SCOPE_HOST.
Here I add a check of whether this is RT_TABLE_MAIN to solve this problem.
Fixes: 3bfd847203 ("net: Use passed in table for nexthop lookups")
Signed-off-by: guodeqing <geffrey.guo@huawei.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean says:
====================
Fix VLAN checks for SJA1105 DSA tc-flower filters
This fixes a ridiculous situation where the driver, in VLAN-unaware
mode, would refuse accepting any tc filter:
tc filter replace dev sw1p3 ingress flower skip_sw \
dst_mac 42:be:24:9b:76:20 \
action gate (...)
Error: sja1105: Can only gate based on {DMAC, VID, PCP}.
tc filter replace dev sw1p3 ingress protocol 802.1Q flower skip_sw \
vlan_id 1 vlan_prio 0 dst_mac 42:be:24:9b:76:20 \
action gate (...)
Error: sja1105: Can only gate based on DMAC.
So, without changing the VLAN awareness state, it says it doesn't want
VLAN-aware rules, and it doesn't want VLAN-unaware rules either. One
would say it's in Schrodinger's state...
Now, the situation has been made worse by commit 7f14937fac ("net:
dsa: sja1105: keep the VLAN awareness state in a driver variable"),
which made VLAN awareness a ternary attribute, but after inspecting the
code from before that patch with a truth table, it looks like the
logical bug was there even before.
While attempting to fix this, I also noticed some leftover debugging
code in one of the places that needed to be fixed. It would have
appeared in the context of patch 3/3 anyway, so I decided to create a
patch that removes it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This action requires the VLAN awareness state of the switch to be of the
same type as the key that's being added:
- If the switch is unaware of VLAN, then the tc filter key must only
contain the destination MAC address.
- If the switch is VLAN-aware, the key must also contain the VLAN ID and
PCP.
But this check doesn't work unless we verify the VLAN awareness state on
both the "if" and the "else" branches.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This action requires the VLAN awareness state of the switch to be of the
same type as the key that's being added:
- If the switch is unaware of VLAN, then the tc filter key must only
contain the destination MAC address.
- If the switch is VLAN-aware, the key must also contain the VLAN ID and
PCP.
But this check doesn't work unless we verify the VLAN awareness state on
both the "if" and the "else" branches.
Fixes: dfacc5a23e ("net: dsa: sja1105: support flow-based redirection via virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This shouldn't be there.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti says:
====================
two fixes for 'act_gate' control plane
- patch 1/2 attempts to fix the error path of tcf_gate_init() when users
try to configure 'act_gate' rules with wrong parameters
- patch 2/2 is a follow-up of a recent fix for NULL dereference in
the error path of tcf_gate_init()
further work will introduce a tdc test for 'act_gate'.
changes since v2:
- fix undefined behavior in patch 1/2
- improve comment in patch 2/2
changes since v1:
coding style fixes in patch 1/2 and 2/2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
assigning a dummy value of 'clock_id' to avoid cancellation of the cycle
timer before its initialization was a temporary solution, and we still
need to handle the case where act_gate timer parameters are changed by
commands like the following one:
# tc action replace action gate <parameters>
the fix consists in the following items:
1) remove the workaround assignment of 'clock_id', and init the list of
entries before the first error path after IDR atomic check/allocation
2) validate 'clock_id' earlier: there is no need to do IDR atomic
check/allocation if we know that 'clock_id' is a bad value
3) use a dedicated function, 'gate_setup_timer()', to ensure that the
timer is cancelled and re-initialized on action overwrite, and also
ensure we initialize the timer in the error path of tcf_gate_init()
v3: improve comment in the error path of tcf_gate_init() (thanks to
Vladimir Oltean)
v2: avoid 'goto' in gate_setup_timer (thanks to Cong Wang)
CC: Ivan Vecera <ivecera@redhat.com>
Fixes: a01c245438 ("net/sched: fix a couple of splats in the error path of tfc_gate_init()")
Fixes: a51c328df3 ("net: qos: introduce a gate control flow action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
it is possible to see a KASAN use-after-free, immediately followed by a
NULL dereference crash, with the following command:
# tc action add action gate index 3 cycle-time 100000000ns \
> cycle-time-ext 100000000ns clockid CLOCK_TAI
BUG: KASAN: use-after-free in tcf_action_init_1+0x8eb/0x960
Write of size 1 at addr ffff88810a5908bc by task tc/883
CPU: 0 PID: 883 Comm: tc Not tainted 5.7.0+ #188
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
Call Trace:
dump_stack+0x75/0xa0
print_address_description.constprop.6+0x1a/0x220
kasan_report.cold.9+0x37/0x7c
tcf_action_init_1+0x8eb/0x960
tcf_action_init+0x157/0x2a0
tcf_action_add+0xd9/0x2f0
tc_ctl_action+0x2a3/0x39d
rtnetlink_rcv_msg+0x5f3/0x920
netlink_rcv_skb+0x120/0x380
netlink_unicast+0x439/0x630
netlink_sendmsg+0x714/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5b4/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x9a/0x370
entry_SYSCALL_64_after_hwframe+0x44/0xa9
[...]
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 0 PID: 883 Comm: tc Tainted: G B 5.7.0+ #188
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: 0010:tcf_action_fill_size+0xa3/0xf0
[....]
RSP: 0018:ffff88813a48f250 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: 0000000000000094 RCX: ffffffffa47c3eb6
RDX: 000000000000000e RSI: 0000000000000008 RDI: 0000000000000070
RBP: ffff88810a590800 R08: 0000000000000004 R09: ffffed1027491e03
R10: 0000000000000003 R11: ffffed1027491e03 R12: 0000000000000000
R13: 0000000000000000 R14: dffffc0000000000 R15: ffff88810a590800
FS: 00007f62cae8ce40(0000) GS:ffff888147c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f62c9d20a10 CR3: 000000013a52a000 CR4: 0000000000340ef0
Call Trace:
tcf_action_init+0x172/0x2a0
tcf_action_add+0xd9/0x2f0
tc_ctl_action+0x2a3/0x39d
rtnetlink_rcv_msg+0x5f3/0x920
netlink_rcv_skb+0x120/0x380
netlink_unicast+0x439/0x630
netlink_sendmsg+0x714/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5b4/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x9a/0x370
entry_SYSCALL_64_after_hwframe+0x44/0xa9
this is caused by the test on 'cycletime_ext', that is still unassigned
when the action is newly created. This makes the action .init() return 0
without calling tcf_idr_insert(), hence the UAF + crash.
rework the logic that prevents zero values of cycle-time, as follows:
1) 'tcfg_cycletime_ext' seems to be unused in the action software path,
and it was already possible by other means to obtain non-zero
cycletime and zero cycletime-ext. So, removing that test should not
cause any damage.
2) while at it, we must prevent overwriting configuration data with wrong
ones: use a temporary variable for 'tcfg_cycletime', and validate it
preserving the original semantic (that allowed computing the cycle
time as the sum of all intervals, when not specified by
TCA_GATE_CYCLE_TIME).
3) remove the test on 'tcfg_cycletime', no more useful, and avoid
returning -EFAULT, which did not seem an appropriate return value for
a wrong netlink attribute.
v3: fix uninitialized 'cycletime' (thanks to Vladimir Oltean)
v2: remove useless 'return;' at the end of void gate_get_start_time()
Fixes: a51c328df3 ("net: qos: introduce a gate control flow action")
CC: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the datapath, the ip_tunnel_lookup() is used and it internally uses
fallback tunnel device pointer, which is fb_tunnel_dev.
This pointer variable should be set to NULL when a fb interface is deleted.
But there is no routine to set fb_tunnel_dev pointer to NULL.
So, this pointer will be still used after interface is deleted and
it eventually results in the use-after-free problem.
Test commands:
ip netns add A
ip netns add B
ip link add eth0 type veth peer name eth1
ip link set eth0 netns A
ip link set eth1 netns B
ip netns exec A ip link set lo up
ip netns exec A ip link set eth0 up
ip netns exec A ip link add gre1 type gre local 10.0.0.1 \
remote 10.0.0.2
ip netns exec A ip link set gre1 up
ip netns exec A ip a a 10.0.100.1/24 dev gre1
ip netns exec A ip a a 10.0.0.1/24 dev eth0
ip netns exec B ip link set lo up
ip netns exec B ip link set eth1 up
ip netns exec B ip link add gre1 type gre local 10.0.0.2 \
remote 10.0.0.1
ip netns exec B ip link set gre1 up
ip netns exec B ip a a 10.0.100.2/24 dev gre1
ip netns exec B ip a a 10.0.0.2/24 dev eth1
ip netns exec A hping3 10.0.100.2 -2 --flood -d 60000 &
ip netns del B
Splat looks like:
[ 77.793450][ C3] ==================================================================
[ 77.794702][ C3] BUG: KASAN: use-after-free in ip_tunnel_lookup+0xcc4/0xf30
[ 77.795573][ C3] Read of size 4 at addr ffff888060bd9c84 by task hping3/2905
[ 77.796398][ C3]
[ 77.796664][ C3] CPU: 3 PID: 2905 Comm: hping3 Not tainted 5.8.0-rc1+ #616
[ 77.797474][ C3] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 77.798453][ C3] Call Trace:
[ 77.798815][ C3] <IRQ>
[ 77.799142][ C3] dump_stack+0x9d/0xdb
[ 77.799605][ C3] print_address_description.constprop.7+0x2cc/0x450
[ 77.800365][ C3] ? ip_tunnel_lookup+0xcc4/0xf30
[ 77.800908][ C3] ? ip_tunnel_lookup+0xcc4/0xf30
[ 77.801517][ C3] ? ip_tunnel_lookup+0xcc4/0xf30
[ 77.802145][ C3] kasan_report+0x154/0x190
[ 77.802821][ C3] ? ip_tunnel_lookup+0xcc4/0xf30
[ 77.803503][ C3] ip_tunnel_lookup+0xcc4/0xf30
[ 77.804165][ C3] __ipgre_rcv+0x1ab/0xaa0 [ip_gre]
[ 77.804862][ C3] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 77.805621][ C3] gre_rcv+0x304/0x1910 [ip_gre]
[ 77.806293][ C3] ? lock_acquire+0x1a9/0x870
[ 77.806925][ C3] ? gre_rcv+0xfe/0x354 [gre]
[ 77.807559][ C3] ? erspan_xmit+0x2e60/0x2e60 [ip_gre]
[ 77.808305][ C3] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 77.809032][ C3] ? rcu_read_lock_held+0x90/0xa0
[ 77.809713][ C3] gre_rcv+0x1b8/0x354 [gre]
[ ... ]
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: c544193214 ("GRE: Refactor GRE tunneling code.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the datapath, the ip6gre_tunnel_lookup() is used and it internally uses
fallback tunnel device pointer, which is fb_tunnel_dev.
This pointer variable should be set to NULL when a fb interface is deleted.
But there is no routine to set fb_tunnel_dev pointer to NULL.
So, this pointer will be still used after interface is deleted and
it eventually results in the use-after-free problem.
Test commands:
ip netns add A
ip netns add B
ip link add eth0 type veth peer name eth1
ip link set eth0 netns A
ip link set eth1 netns B
ip netns exec A ip link set lo up
ip netns exec A ip link set eth0 up
ip netns exec A ip link add ip6gre1 type ip6gre local fc:0::1 \
remote fc:0::2
ip netns exec A ip -6 a a fc:100::1/64 dev ip6gre1
ip netns exec A ip link set ip6gre1 up
ip netns exec A ip -6 a a fc:0::1/64 dev eth0
ip netns exec A ip link set ip6gre0 up
ip netns exec B ip link set lo up
ip netns exec B ip link set eth1 up
ip netns exec B ip link add ip6gre1 type ip6gre local fc:0::2 \
remote fc:0::1
ip netns exec B ip -6 a a fc:100::2/64 dev ip6gre1
ip netns exec B ip link set ip6gre1 up
ip netns exec B ip -6 a a fc:0::2/64 dev eth1
ip netns exec B ip link set ip6gre0 up
ip netns exec A ping fc:100::2 -s 60000 &
ip netns del B
Splat looks like:
[ 73.087285][ C1] BUG: KASAN: use-after-free in ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.088361][ C1] Read of size 4 at addr ffff888040559218 by task ping/1429
[ 73.089317][ C1]
[ 73.089638][ C1] CPU: 1 PID: 1429 Comm: ping Not tainted 5.7.0+ #602
[ 73.090531][ C1] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 73.091725][ C1] Call Trace:
[ 73.092160][ C1] <IRQ>
[ 73.092556][ C1] dump_stack+0x96/0xdb
[ 73.093122][ C1] print_address_description.constprop.6+0x2cc/0x450
[ 73.094016][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.094894][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.095767][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.096619][ C1] kasan_report+0x154/0x190
[ 73.097209][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.097989][ C1] ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.098750][ C1] ? gre_del_protocol+0x60/0x60 [gre]
[ 73.099500][ C1] gre_rcv+0x1c5/0x1450 [ip6_gre]
[ 73.100199][ C1] ? ip6gre_header+0xf00/0xf00 [ip6_gre]
[ 73.100985][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 73.101830][ C1] ? ip6_input_finish+0x5/0xf0
[ 73.102483][ C1] ip6_protocol_deliver_rcu+0xcbb/0x1510
[ 73.103296][ C1] ip6_input_finish+0x5b/0xf0
[ 73.103920][ C1] ip6_input+0xcd/0x2c0
[ 73.104473][ C1] ? ip6_input_finish+0xf0/0xf0
[ 73.105115][ C1] ? rcu_read_lock_held+0x90/0xa0
[ 73.105783][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 73.106548][ C1] ipv6_rcv+0x1f1/0x300
[ ... ]
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: c12b395a46 ("gre: Support GRE over IPv6")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current code, ->ndo_start_xmit() can be executed recursively only
10 times because of stack memory.
But, in the case of the vxlan, 10 recursion limit value results in
a stack overflow.
In the current code, the nested interface is limited by 8 depth.
There is no critical reason that the recursion limitation value should
be 10.
So, it would be good to be the same value with the limitation value of
nesting interface depth.
Test commands:
ip link add vxlan10 type vxlan vni 10 dstport 4789 srcport 4789 4789
ip link set vxlan10 up
ip a a 192.168.10.1/24 dev vxlan10
ip n a 192.168.10.2 dev vxlan10 lladdr fc:22:33:44:55:66 nud permanent
for i in {9..0}
do
let A=$i+1
ip link add vxlan$i type vxlan vni $i dstport 4789 srcport 4789 4789
ip link set vxlan$i up
ip a a 192.168.$i.1/24 dev vxlan$i
ip n a 192.168.$i.2 dev vxlan$i lladdr fc:22:33:44:55:66 nud permanent
bridge fdb add fc:22:33:44:55:66 dev vxlan$A dst 192.168.$i.2 self
done
hping3 192.168.10.2 -2 -d 60000
Splat looks like:
[ 103.814237][ T1127] =============================================================================
[ 103.871955][ T1127] BUG kmalloc-2k (Tainted: G B ): Padding overwritten. 0x00000000897a2e4f-0x000
[ 103.873187][ T1127] -----------------------------------------------------------------------------
[ 103.873187][ T1127]
[ 103.874252][ T1127] INFO: Slab 0x000000005cccc724 objects=5 used=5 fp=0x0000000000000000 flags=0x10000000001020
[ 103.881323][ T1127] CPU: 3 PID: 1127 Comm: hping3 Tainted: G B 5.7.0+ #575
[ 103.882131][ T1127] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 103.883006][ T1127] Call Trace:
[ 103.883324][ T1127] dump_stack+0x96/0xdb
[ 103.883716][ T1127] slab_err+0xad/0xd0
[ 103.884106][ T1127] ? _raw_spin_unlock+0x1f/0x30
[ 103.884620][ T1127] ? get_partial_node.isra.78+0x140/0x360
[ 103.885214][ T1127] slab_pad_check.part.53+0xf7/0x160
[ 103.885769][ T1127] ? pskb_expand_head+0x110/0xe10
[ 103.886316][ T1127] check_slab+0x97/0xb0
[ 103.886763][ T1127] alloc_debug_processing+0x84/0x1a0
[ 103.887308][ T1127] ___slab_alloc+0x5a5/0x630
[ 103.887765][ T1127] ? pskb_expand_head+0x110/0xe10
[ 103.888265][ T1127] ? lock_downgrade+0x730/0x730
[ 103.888762][ T1127] ? pskb_expand_head+0x110/0xe10
[ 103.889244][ T1127] ? __slab_alloc+0x3e/0x80
[ 103.889675][ T1127] __slab_alloc+0x3e/0x80
[ 103.890108][ T1127] __kmalloc_node_track_caller+0xc7/0x420
[ ... ]
Fixes: 11a766ce91 ("net: Increase xmit RECURSION_LIMIT to 10.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The older SoCs like Armada XP support a 2500BaseX mode in the datasheets
referred to as DR-SGMII (Double rated SGMII) or HS-SGMII (High Speed
SGMII). This is an upclocked 1000BaseX mode, thus
PHY_INTERFACE_MODE_2500BASEX is the appropriate mode define for it.
adding support for it merely means writing the correct magic value into
the MVNETA_SERDES_CFG register.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MVNETA_SERDES_CFG register is only available on older SoCs like the
Armada XP. On newer SoCs like the Armada 38x the fields are moved to
comphy. This patch moves the writes to this register next to the comphy
initialization, so that depending on the SoC either comphy or
MVNETA_SERDES_CFG is configured.
With this we no longer write to the MVNETA_SERDES_CFG on SoCs where it
doesn't exist.
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per the table 4.4 of version "20190608-Priv-MSU-Ratified" of the
RISC-V instruction set manual[0], the PTE permission bit combination of
"write+exec only" is reserved for future use. Hence, don't allow such
mapping request in mmap call.
An issue is been reported by David Abdurachmanov, that while running
stress-ng with "sysbadaddr" argument, RCU stalls are observed on RISC-V
specific kernel.
This issue arises when the stress-sysbadaddr request for pages with
"write+exec only" permission bits and then passes the address obtain
from this mmap call to various system call. For the riscv kernel, the
mmap call should fail for this particular combination of permission bits
since it's not valid.
[0]: http://dabbelt.com/~palmer/keep/riscv-isa-manual/riscv-privileged-20190608-1.pdf
Signed-off-by: Yash Shah <yash.shah@sifive.com>
Reported-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
[Palmer: Refer to the latest ISA specification at the only link I could
find, and update the terminology.]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
- Fix for timeslicing and virtual engines/unpremptable requests
(+ 1 dependency patch)
- Fixes into TypeC register programming and interrupt storm detecting
- Disable DIP on MST ports with the transcoder clock still on
- Avoid missing GT workarounds at reset for HSW and older gens
- Fix for unwinding multiple requests missing force restore
- Fix encoder type check for DDI vswing sequence
- Build warning fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200618124659.GA12342@jlahtine-desk.ger.corp.intel.com
On HS cores, loop buffer (LPB) is programmable in runtime and can
be optionally disabled.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Merge non-faulting memory access cleanups from Christoph Hellwig:
"Andrew and I decided to drop the patches implementing your suggested
rename of the probe_kernel_* and probe_user_* helpers from -mm as
there were way to many conflicts.
After -rc1 might be a good time for this as all the conflicts are
resolved now"
This also adds a type safety checking patch on top of the renaming
series to make the subtle behavioral difference between 'get_user()' and
'get_kernel_nofault()' less potentially dangerous and surprising.
* emailed patches from Christoph Hellwig <hch@lst.de>:
maccess: make get_kernel_nofault() check for minimal type compatibility
maccess: rename probe_kernel_address to get_kernel_nofault
maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault
maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault
Now that we've renamed probe_kernel_address() to get_kernel_nofault()
and made it look and behave more in line with get_user(), some of the
subtle type behavior differences end up being more obvious and possibly
dangerous.
When you do
get_user(val, user_ptr);
the type of the access comes from the "user_ptr" part, and the above
basically acts as
val = *user_ptr;
by design (except, of course, for the fact that the actual dereference
is done with a user access).
Note how in the above case, the type of the end result comes from the
pointer argument, and then the value is cast to the type of 'val' as
part of the assignment.
So the type of the pointer is ultimately the more important type both
for the access itself.
But 'get_kernel_nofault()' may now _look_ similar, but it behaves very
differently. When you do
get_kernel_nofault(val, kernel_ptr);
it behaves like
val = *(typeof(val) *)kernel_ptr;
except, of course, for the fact that the actual dereference is done with
exception handling so that a faulting access is suppressed and returned
as the error code.
But note how different the casting behavior of the two superficially
similar accesses are: one does the actual access in the size of the type
the pointer points to, while the other does the access in the size of
the target, and ignores the pointer type entirely.
Actually changing get_kernel_nofault() to act like get_user() is almost
certainly the right thing to do eventually, but in the meantime this
patch adds logit to at least verify that the pointer type is compatible
with the type of the result.
In many cases, this involves just casting the pointer to 'void *' to
make it obvious that the type of the pointer is not the important part.
It's not how 'get_user()' acts, but at least the behavioral difference
is now obvious and explicit.
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ID 1 is already used by the IOVA range capability, use ID 2.
Reported-by: Liu Yi L <yi.l.liu@intel.com>
Fixes: ad721705d0 ("vfio iommu: Add migration capability to report supported features")
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Commit:
da92110dfd ("EDAC, amd64_edac: Extend scrub rate support to F15hM60h")
added support for F15h, model 0x60 CPUs but in doing so, missed to read
back SCRCTRL PCI config register on F15h CPUs which are *not* model
0x60. Add that read so that doing
$ cat /sys/devices/system/edac/mc/mc0/sdram_scrub_rate
can show the previously set DRAM scrub rate.
Fixes: da92110dfd ("EDAC, amd64_edac: Extend scrub rate support to F15hM60h")
Reported-by: Anders Andersson <pipatron@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> #v4.4..
Link: https://lkml.kernel.org/r/CAKkunMbNWppx_i6xSdDHLseA2QQmGJqj_crY=NF-GZML5np4Vw@mail.gmail.com
Better describe what this helper does, and match the naming of
copy_from_kernel_nofault.
Also switch the argument order around, so that it acts and looks
like get_user().
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
destroy_qp_common is called for flows where QP is already created by
HW. While it is called from IB/core, the ibqp.* fields will be fully
initialized, but it is not the case if this function is called during QP
creation.
Don't rely on ibqp fields as much as possible and initialize
send_cq/recv_cq as temporal solution till all drivers will be converted to
IB/core QP allocation scheme.
refcount_t: underflow; use-after-free.
WARNING: CPU: 1 PID: 5372 at lib/refcount.c:28 refcount_warn_saturate+0xfe/0x1a0
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 5372 Comm: syz-executor.2 Not tainted 5.5.0-rc5 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
mlx5_core_put_rsc+0x70/0x80
destroy_resource_common+0x8e/0xb0
mlx5_core_destroy_qp+0xaf/0x1d0
mlx5_ib_destroy_qp+0xeb0/0x1460
ib_destroy_qp_user+0x2d5/0x7d0
create_qp+0xed3/0x2130
ib_uverbs_create_qp+0x13e/0x190
? ib_uverbs_ex_create_qp
ib_uverbs_write+0xaa5/0xdf0
__vfs_write+0x7c/0x100
ksys_write+0xc8/0x200
do_syscall_64+0x9c/0x390
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 08d5397660 ("RDMA/mlx5: Copy response to the user in one place")
Link: https://lore.kernel.org/r/20200617130148.2846643-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Currently, address spaces in warnings are displayed as '<asn:X>' with
'X' being the address space's arbitrary number.
But since sparse v0.6.0-rc1 (late December 2018), sparse allows you to
define the address spaces using an identifier instead of a number. This
identifier is then directly used in the warnings.
So, use the identifiers '__user', '__iomem', '__percpu' & '__rcu' for
the corresponding address spaces. The default address space, __kernel,
being not displayed in warnings, stays defined as '0'.
With this change, warnings that used to be displayed as:
cast removes address space '<asn:1>' of expression
... void [noderef] <asn:2> *
will now be displayed as:
cast removes address space '__user' of expression
... void [noderef] __iomem *
This also moves the __kernel annotation to be the first one, since it is
quite different from the others because it's the default one, and so:
- it's never displayed
- it's normally not needed, nor in type annotations, nor in cast
between address spaces. The only time it's needed is when it's
combined with a typeof to express "the same type as this one but
without the address space"
- it can't be defined with a name, '0' must be used.
So, it seemed strange to me to have it in the middle of the other
ones.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since many compilers cannot disable KCOV with a function attribute,
help it to NOP out any __sanitizer_cov_*() calls injected in noinstr
code.
This turns:
12: e8 00 00 00 00 callq 17 <lockdep_hardirqs_on+0x17>
13: R_X86_64_PLT32 __sanitizer_cov_trace_pc-0x4
into:
12: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
13: R_X86_64_NONE __sanitizer_cov_trace_pc-0x4
Just like recordmcount does.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
This provides infrastructure to rewrite instructions; this is
immediately useful for helping out with KCOV-vs-noinstr, but will
also come in handy for a bunch of variable sized jump-label patches
that are still on ice.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
With there being multiple ways to change the ELF data, let's more
concisely track modification.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
When a filesystem is mounted on a loop device and on a loop ioctl
LOOP_SET_STATUS64, because of kill_bdev, buffer_head mappings are getting
destroyed.
kill_bdev
truncate_inode_pages
truncate_inode_pages_range
do_invalidatepage
block_invalidatepage
discard_buffer -->clear BH_Mapped flag
sb_bread
__bread_gfp
bh = __getblk_gfp
-->discard_buffer clear BH_Mapped flag
__bread_slow
submit_bh
submit_bh_wbc
BUG_ON(!buffer_mapped(bh)) --> hit this BUG_ON
Fixes: 5db470e229 ("loop: drop caches if offset or block_size are changed")
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit 130f4caf14 ("libata: Ensure ata_port probe has completed before
detach") may cause system freeze during suspend.
Using async_synchronize_full() in PM callbacks is wrong, since async
callbacks that are already scheduled may wait for not-yet-scheduled
callbacks, causes a circular dependency.
Instead of using big hammer like async_synchronize_full(), use async
cookie to make sure port probe are synced, without affecting other
scheduled PM callbacks.
Fixes: 130f4caf14 ("libata: Ensure ata_port probe has completed before detach")
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: John Garry <john.garry@huawei.com>
BugLink: https://bugs.launchpad.net/bugs/1867983
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is a specific API to treat raw data as UUID, i.e. import_uuid().
Use it instead of uuid_copy() with explicit casting.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
LDO1 and LDO2 settings are wrong and case the voltage to go above the
maximum level of 2.15V permitted by the SoC to 3.0V.
This patch is based on work done on the i.MX8M Mini-EVK which utilizes
the same fix.
Fixes: 593816fa2f ("arm64: dts: imx: Add Beacon i.MX8m-Mini development kit")
Signed-off-by: Adam Ford <aford173@gmail.com>
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
In io_read() or io_write(), when io request is submitted successfully,
it'll go through the below sequence:
kfree(iovec);
req->flags &= ~REQ_F_NEED_CLEANUP;
return ret;
But clearing REQ_F_NEED_CLEANUP might be unsafe. The io request may
already have been completed, and then io_complete_rw_iopoll()
and io_complete_rw() will be called, both of which will also modify
req->flags if needed. This causes a race condition, with concurrent
non-atomic modification of req->flags.
To eliminate this race, in io_read() or io_write(), if io request is
submitted successfully, we don't remove REQ_F_NEED_CLEANUP flag. If
REQ_F_NEED_CLEANUP is set, we'll leave __io_req_aux_free() to the
iovec cleanup work correspondingly.
Cc: stable@vger.kernel.org
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If a IMP reset caused by some hardware errors and hns RoCE driver reset
occurred at the same time, there is a possiblity that the IMP will stop
dealing with command and users can't use the hardware. The logs are as
follows:
hns3 0000:fd:00.1: cleaned 0, need to clean 1
hns3 0000:fd:00.1: firmware version query failed -11
hns3 0000:fd:00.1: Cmd queue init failed
hns3 0000:fd:00.1: Upgrade reset level
hns3 0000:fd:00.1: global reset interrupt
The hns NIC driver divides the reset process into 3 status:
initialization, hardware resetting and softwaring restting. RoCE driver
gets reset status by interfaces provided by NIC driver and commands will
not be sent to the IMP if the driver is in any above status. The main
reason for this issue is that there is a time gap between status 1 and 2,
if the RoCE driver sends commands to the IMP during this gap, the IMP will
stop working because it is not ready.
To eliminate the time gap, the hns NIC driver has added a new interface in
commit a4de02287a ("net: hns3: provide .get_cmdq_stat interface for the
client"), so RoCE driver can ensure that no commands will be sent during
resetting.
Link: https://lore.kernel.org/r/1592314778-52822-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
ibmr.device is assigned after MR is successfully registered, but both
write_mtpt() and frmr_write_mtpt() accesses it during the mr registration
process, which may cause the following error when trying to register MR in
userspace and pbl_hop_num is set to 0.
pc : hns_roce_mtr_find+0xa0/0x200 [hns_roce]
lr : set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2]
sp : ffff00023e73ba20
x29: ffff00023e73ba20 x28: ffff00023e73bad8
x27: 0000000000000000 x26: 0000000000000000
x25: 0000000000000002 x24: 0000000000000000
x23: ffff00023e73bad0 x22: 0000000000000000
x21: ffff0000094d9000 x20: 0000000000000000
x19: ffff8020a6bdb2c0 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0140000000000000 x12: 0040000000000041
x11: ffff000240000000 x10: 0000000000001000
x9 : 0000000000000000 x8 : ffff802fb7558480
x7 : ffff802fb7558480 x6 : 000000000003483d
x5 : ffff00023e73bad0 x4 : 0000000000000002
x3 : ffff00023e73bad8 x2 : 0000000000000000
x1 : 0000000000000000 x0 : ffff0000094d9708
Call trace:
hns_roce_mtr_find+0xa0/0x200 [hns_roce]
set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2]
hns_roce_v2_write_mtpt+0x14c/0x168 [hns_roce_hw_v2]
hns_roce_mr_enable+0x6c/0x148 [hns_roce]
hns_roce_reg_user_mr+0xd8/0x130 [hns_roce]
ib_uverbs_reg_mr+0x14c/0x2e0 [ib_uverbs]
ib_uverbs_write+0x27c/0x3e8 [ib_uverbs]
__vfs_write+0x60/0x190
vfs_write+0xac/0x1c0
ksys_write+0x6c/0xd8
__arm64_sys_write+0x24/0x30
el0_svc_common+0x78/0x130
el0_svc_handler+0x38/0x78
el0_svc+0x8/0xc
Solve above issue by adding a pointer of structure hns_roce_dev as a
parameter of write_mtpt() and frmr_write_mtpt(), so that both of these
functions can access it before finishing MR's registration.
Fixes: 9b2cf76c9f ("RDMA/hns: Optimize PBL buffer allocation process")
Link: https://lore.kernel.org/r/1592314629-51715-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When build perf with ASan or UBSan, if libasan or libubsan can not find,
the feature-glibc is 0 and there exists the following error log which is
wrong, because we can find gnu/libc-version.h in /usr/include,
glibc-devel is also installed.
[yangtiezhu@linux perf]$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address'
BUILD: Doing 'make -j4' parallel build
HOSTCC fixdep.o
HOSTLD fixdep-in.o
LINK fixdep
<stdin>:1:0: warning: -fsanitize=address and -fsanitize=kernel-address are not supported for this target
<stdin>:1:0: warning: -fsanitize=address not supported for this target
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libaudit: [ OFF ]
... libbfd: [ OFF ]
... libcap: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
... libaio: [ OFF ]
... libzstd: [ OFF ]
... disassembler-four-args: [ OFF ]
Makefile.config:393: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
Makefile.perf:224: recipe for target 'sub-make' failed
make[1]: *** [sub-make] Error 2
Makefile:69: recipe for target 'all' failed
make: *** [all] Error 2
[yangtiezhu@linux perf]$ ls /usr/include/gnu/libc-version.h
/usr/include/gnu/libc-version.h
After install libasan and libubsan, the feature-glibc is 1 and the build
process is success, so the cause is related with libasan or libubsan, we
should check them and print an error log to reflect the reality.
Committer testing:
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libbfd: [ OFF ]
... libcap: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
... libaio: [ OFF ]
... libzstd: [ OFF ]
... disassembler-four-args: [ OFF ]
Makefile.config:401: *** No libasan found, please install libasan. Stop.
make[1]: *** [Makefile.perf:231: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
$
$ sudo dnf install libasan
<SNIP>
Installed:
libasan-9.3.1-2.fc31.x86_64
$
$
$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libbfd: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
<SNIP>
CC /tmp/build/perf/util/pmu-flex.o
FLEX /tmp/build/perf/util/expr-flex.c
CC /tmp/build/perf/util/expr-bison.o
CC /tmp/build/perf/util/expr.o
CC /tmp/build/perf/util/expr-flex.o
CC /tmp/build/perf/util/parse-events-flex.o
CC /tmp/build/perf/util/parse-events.o
LD /tmp/build/perf/util/intel-pt-decoder/perf-in.o
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
INSTALL python-scripts
INSTALL perf_completion-script
INSTALL perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
$ ldd ~/bin/perf | grep asan
libasan.so.5 => /lib64/libasan.so.5 (0x00007f0904164000)
$
And if we rebuild without -fsanitize-address:
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
$ make O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libbfd: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
GEN /tmp/build/perf/common-cmds.h
CC /tmp/build/perf/exec-cmd.o
<SNIP>
INSTALL perf_completion-script
INSTALL perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
$ ldd ~/bin/perf | grep asan
$
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: tiezhu yang <yangtiezhu@loongson.cn>
Cc: xuefeng li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1592445961-28044-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Private data passed to iwarp_cm_handler is copied for connection request /
response, but ignored otherwise. If junk is passed, it is stored in the
event and used later in the event processing.
The driver passes an old junk pointer during connection close which leads
to a use-after-free on event processing. Set private data to NULL for
events that don 't have private data.
BUG: KASAN: use-after-free in ucma_event_handler+0x532/0x560 [rdma_ucm]
kernel: Read of size 4 at addr ffff8886caa71200 by task kworker/u128:1/5250
kernel:
kernel: Workqueue: iw_cm_wq cm_work_handler [iw_cm]
kernel: Call Trace:
kernel: dump_stack+0x8c/0xc0
kernel: print_address_description.constprop.0+0x1b/0x210
kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm]
kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm]
kernel: __kasan_report.cold+0x1a/0x33
kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm]
kernel: kasan_report+0xe/0x20
kernel: check_memory_region+0x130/0x1a0
kernel: memcpy+0x20/0x50
kernel: ucma_event_handler+0x532/0x560 [rdma_ucm]
kernel: ? __rpc_execute+0x608/0x620 [sunrpc]
kernel: cma_iw_handler+0x212/0x330 [rdma_cm]
kernel: ? iw_conn_req_handler+0x6e0/0x6e0 [rdma_cm]
kernel: ? enqueue_timer+0x86/0x140
kernel: ? _raw_write_lock_irq+0xd0/0xd0
kernel: cm_work_handler+0xd3d/0x1070 [iw_cm]
Fixes: e411e0587e ("RDMA/qedr: Add iWARP connection management functions")
Link: https://lore.kernel.org/r/20200616093408.17827-1-michal.kalderon@marvell.com
Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Fix the following sparse error by adding annotation to
cm_queue_work_unlock() that it releases cm_id_priv->lock lock.
drivers/infiniband/core/cm.c:936:24: warning: context imbalance in
'cm_queue_work_unlock' - unexpected unlock
Fixes: e83f195aa4 ("RDMA/cm: Pull duplicated code into cm_queue_work_unlock()")
Link: https://lore.kernel.org/r/20200611130045.1994026-1-leon@kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Unprivileged memory accesses generated by the so-called "translated"
instructions (e.g. STTR) at EL1 can cause EL0 watchpoints to fire
unexpectedly if kernel debugging is enabled. In such cases, the
hw_breakpoint logic will invoke the user overflow handler which will
typically raise a SIGTRAP back to the current task. This is futile when
returning back to the kernel because (a) the signal won't have been
delivered and (b) userspace can't handle the thing anyway.
Avoid invoking the user overflow handler for watchpoints triggered by
kernel uaccess routines, and instead single-step over the faulting
instruction as we would if no overflow handler had been installed.
(Fixes tag identifies the introduction of unprivileged memory accesses,
which exposed this latent bug in the hw_breakpoint code)
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Fixes: 57f4959bad ("arm64: kernel: Add support for User Access Override")
Reported-by: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
A Synopsys USB2.0 core used in Huawei Kunpeng920 SoC has a bug which
might cause the host controller not issuing ping.
Bug description:
After indicating an Interrupt on Async Advance, the software uses the
doorbell mechanism to delete the Next Link queue head of the last
executed queue head. At this time, the host controller still references
the removed queue head(the queue head is NULL). NULL reference causes
the host controller to lose the USB device.
Solution:
After deleting the Next Link queue head, when has_synopsys_hc_bug set
to 1,the software can write one of the valid queue head addresses to
the ASYNCLISTADDR register to allow the host controller to get
the valid queue head. in order to solve that problem, this patch set
the flag for Huawei Kunpeng920
There are detailed instructions and solutions in this patch:
commit 2f7ac6c199 ("USB: ehci: add workaround for Synopsys HC bug")
Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/1591588019-44284-1-git-send-email-liulongfang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current number of KVM_IRQCHIP_NUM_PINS results in an order 3
allocation (32kb) for each guest start/restart. This can result in OOM
killer activity even with free swap when the memory is fragmented
enough:
kernel: qemu-system-s39 invoked oom-killer: gfp_mask=0x440dc0(GFP_KERNEL_ACCOUNT|__GFP_COMP|__GFP_ZERO), order=3, oom_score_adj=0
kernel: CPU: 1 PID: 357274 Comm: qemu-system-s39 Kdump: loaded Not tainted 5.4.0-29-generic #33-Ubuntu
kernel: Hardware name: IBM 8562 T02 Z06 (LPAR)
kernel: Call Trace:
kernel: ([<00000001f848fe2a>] show_stack+0x7a/0xc0)
kernel: [<00000001f8d3437a>] dump_stack+0x8a/0xc0
kernel: [<00000001f8687032>] dump_header+0x62/0x258
kernel: [<00000001f8686122>] oom_kill_process+0x172/0x180
kernel: [<00000001f8686abe>] out_of_memory+0xee/0x580
kernel: [<00000001f86e66b8>] __alloc_pages_slowpath+0xd18/0xe90
kernel: [<00000001f86e6ad4>] __alloc_pages_nodemask+0x2a4/0x320
kernel: [<00000001f86b1ab4>] kmalloc_order+0x34/0xb0
kernel: [<00000001f86b1b62>] kmalloc_order_trace+0x32/0xe0
kernel: [<00000001f84bb806>] kvm_set_irq_routing+0xa6/0x2e0
kernel: [<00000001f84c99a4>] kvm_arch_vm_ioctl+0x544/0x9e0
kernel: [<00000001f84b8936>] kvm_vm_ioctl+0x396/0x760
kernel: [<00000001f875df66>] do_vfs_ioctl+0x376/0x690
kernel: [<00000001f875e304>] ksys_ioctl+0x84/0xb0
kernel: [<00000001f875e39a>] __s390x_sys_ioctl+0x2a/0x40
kernel: [<00000001f8d55424>] system_call+0xd8/0x2c8
As far as I can tell s390x does not use the iopins as we bail our for
anything other than KVM_IRQ_ROUTING_S390_ADAPTER and the chip/pin is
only used for KVM_IRQ_ROUTING_IRQCHIP. So let us use a small number to
reduce the memory footprint.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200617083620.5409-1-borntraeger@de.ibm.com
Correct ldo1 voltage range from wrong high group(3.0V~3.3V) to low group
(1.6V~1.9V) because the ldo1 should be 1.8V. Actually, two voltage groups
have been supported at bd718x7-regulator driver, hence, just corrrect the
voltage range to 1.6V~3.3V. For ldo2@0.8V, correct voltage range too.
Otherwise, ldo1 would be kept @3.0V and ldo2@0.9V which violate i.mx8mn
datasheet as the below warning log in kernel:
[ 0.995524] LDO1: Bringing 1800000uV into 3000000-3000000uV
[ 0.999196] LDO2: Bringing 800000uV into 900000-900000uV
Fixes: 3e44dd0973 ("arm64: dts: imx8mn-ddr4-evk: Add rohm,bd71847 PMIC support")
Cc: stable@vger.kernel.org
Signed-off-by: Robin Gong <yibin.gong@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Correct ldo1 voltage range from wrong high group(3.0V~3.3V) to low group
(1.6V~1.9V) because the ldo1 should be 1.8V. Actually, two voltage groups
have been supported at bd718x7-regulator driver, hence, just corrrect the
voltage range to 1.6V~3.3V. For ldo2@0.8V, correct voltage range too.
Otherwise, ldo1 would be kept @3.0V and ldo2@0.9V which violate i.mx8mm
datasheet as the below warning log in kernel:
[ 0.995524] LDO1: Bringing 1800000uV into 3000000-3000000uV
[ 0.999196] LDO2: Bringing 800000uV into 900000-900000uV
Fixes: 78cc25fa26 ("arm64: dts: imx8mm-evk: Add BD71847 PMIC")
Cc: stable@vger.kernel.org
Signed-off-by: Robin Gong <yibin.gong@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
A 5.7 kernel hangs during a tcrypt test of padata that waits for an AEAD
request to finish. This is only seen on large machines running many
concurrent requests.
The issue is that padata never serializes the request. The removal of
the reorder_objects atomic missed that the memory barrier in
padata_do_serial() depends on it.
Upgrade the barrier from smp_mb__after_atomic to smp_mb to get correct
ordering again.
Fixes: 3facced7ae ("padata: remove reorder_objects")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The locking in af_alg_release_parent is broken as the BH socket
lock can only be taken if there is a code-path to handle the case
where the lock is owned by process-context. Instead of adding
such handling, we can fix this by changing the ref counts to
atomic_t.
This patch also modifies the main refcnt to include both normal
and nokey sockets. This way we don't have to fudge the nokey
ref count when a socket changes from nokey to normal.
Credits go to Mauricio Faria de Oliveira who diagnosed this bug
and sent a patch for it:
https://lore.kernel.org/linux-crypto/20200605161657.535043-1-mfo@canonical.com/
Reported-by: Brian Moyles <bmoyles@netflix.com>
Reported-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Fixes: 37f96694cf ("crypto: af_alg - Use bh_lock_sock in...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There is an issue when tune the number for read and write queues,
if the total queue count was not changed. The hctx->type cannot
be updated, since __blk_mq_update_nr_hw_queues will return directly
if the total queue count has not been changed.
Reproduce:
dmesg | grep "default/read/poll"
[ 2.607459] nvme nvme0: 48/0/0 default/read/poll queues
cat /sys/kernel/debug/block/nvme0n1/hctx*/type | sort | uniq -c
48 default
tune the write queues to 24:
echo 24 > /sys/module/nvme/parameters/write_queues
echo 1 > /sys/block/nvme0n1/device/reset_controller
dmesg | grep "default/read/poll"
[ 433.547235] nvme nvme0: 24/24/0 default/read/poll queues
cat /sys/kernel/debug/block/nvme0n1/hctx*/type | sort | uniq -c
48 default
The driver's hardware queue mapping is not same as block layer.
Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We don't want it under CONFIG_DRM_MSM_GPU_STATE, we need it all the
time (like the other GPUs do).
Fixes: ccac7ce373 ("drm/msm: Refactor address space initialization")
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Previously the address space went from 16M to ~0u, but with the
refactor one of the 'f's was dropped, limiting us to 256MB.
Additionally, the new interface takes a start and size, not start and
end, so we can't just copy and paste.
Fixes regressions in dEQP-VK.memory.allocation.random.*
Fixes: ccac7ce373 ("drm/msm: Refactor address space initialization")
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Set up vlan_features for use by any vlans above us.
Fixes: beead698b1 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the driver is busy resetting queues after a change in
MTU or queue parameters, don't bother checking the link,
wait until the next watchdog cycle.
Fixes: 987c0871e8 ("ionic: check for linkup in watchdog")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 2ad6691d98, which moved the modification of the status annotation
for a packet in the Tx buffer prior to the retransmission moved the state
clearance, but managed to lose the bit that set it to UNACK.
Consequently, if a retransmission occurs, the packet is accidentally
changed to the ACK state (ie. 0) by masking it off, which means that the
packet isn't counted towards the tally of newly-ACK'd packets if it gets
hard-ACK'd. This then prevents the congestion control algorithm from
recovering properly.
Fix by reinstating the change of state to UNACK.
Spotted by the generic/460 xfstest.
Fixes: 2ad6691d98 ("rxrpc: Fix race between incoming ACK parser and retransmitter")
Signed-off-by: David Howells <dhowells@redhat.com>
The handling of the receive window size (rwind) from a received ACK packet
is not correct. The rxrpc_input_ackinfo() function currently checks the
current Tx window size against the rwind from the ACK to see if it has
changed, but then limits the rwind size before storing it in the tx_winsize
member and, if it increased, wake up the transmitting process. This means
that if rwind > RXRPC_RXTX_BUFF_SIZE - 1, this path will always be
followed.
Fix this by limiting rwind before we compare it to tx_winsize.
The effect of this can be seen by enabling the rxrpc_rx_rwind_change
tracepoint.
Fixes: 702f2ac87a ("rxrpc: Wake up the transmitter if Rx window size increases on the peer")
Signed-off-by: David Howells <dhowells@redhat.com>
Using a AX88179 device (0b95:1790), I see two bytes of appended data on
every RX packet. For example, this 48-byte ping, using 0xff as a
payload byte:
04:20:22.528472 IP 192.168.1.1 > 192.168.1.2: ICMP echo request, id 2447, seq 1, length 64
0x0000: 000a cd35 ea50 000a cd35 ea4f 0800 4500
0x0010: 0054 c116 4000 4001 f63e c0a8 0101 c0a8
0x0020: 0102 0800 b633 098f 0001 87ea cd5e 0000
0x0030: 0000 dcf2 0600 0000 0000 ffff ffff ffff
0x0040: ffff ffff ffff ffff ffff ffff ffff ffff
0x0050: ffff ffff ffff ffff ffff ffff ffff ffff
0x0060: ffff 961f
Those last two bytes - 96 1f - aren't part of the original packet.
In the ax88179 RX path, the usbnet rx_fixup function trims a 2-byte
'alignment pseudo header' from the start of the packet, and sets the
length from a per-packet field populated by hardware. It looks like that
length field *includes* the 2-byte header; the current driver assumes
that it's excluded.
This change trims the 2-byte alignment header after we've set the packet
length, so the resulting packet length is correct. While we're moving
the comment around, this also fixes the spelling of 'pseudo'.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The trace symbol printer (__print_symbolic()) ignores symbols that map to
an empty string and prints the hex value instead.
Fix the symbol for rxrpc_cong_no_change to " -" instead of "" to avoid
this.
Fixes: b54a134a7d ("rxrpc: Fix handling of enums-to-string translation in tracing")
Signed-off-by: David Howells <dhowells@redhat.com>
Add rename the gpu busy percentage for consistency and
add the mem busy percentage documentation.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The existing code used the major version number of the DRM driver
instead of the device major number of the DRM subsystem for
validating access for a devices cgroup.
This meant that accesses allowed by the devices cgroup weren't
permitted and certain accesses denied by the devices cgroup were
permitted (if they matched the wrong major device number).
Signed-off-by: Lorenz Brun <lorenz@brun.one>
Fixes: 6b855f7b83 ("drm/amdkfd: Check against device cgroup")
Reviewed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
clang static analysis reports an undefined return
security/selinux/ss/conditional.c:79:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn]
return s[0];
^~~~~~~~~~~
static int cond_evaluate_expr( ...
{
u32 i;
int s[COND_EXPR_MAXDEPTH];
for (i = 0; i < expr->len; i++)
...
return s[0];
When expr->len is 0, the loop which sets s[0] never runs.
So return -1 if the loop never runs.
Cc: stable@vger.kernel.org
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
It is possible that a platform that is capable of 'namespace labels'
comes up without the labels properly initialized. In this case, the
region's 'align' attribute is hidden. Howerver, once the user does
initialize he labels, the 'align' attribute still stays hidden, which is
unexpected.
The sysfs_update_group() API is meant to address this, and could be
called during region probe, but it has entanglements with the device
'lockdep_mutex'. Therefore, simply make the 'align' attribute always
visible. It doesn't matter what it says for label-less namespaces, since
it is not possible to change their allocation anyway.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20200520225026.29426-1-vishal.l.verma@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If we're doing polled IO and end up having requests being submitted
async, then completions can come in while we're waiting for refs to
drop. We need to reap these manually, as nobody else will be looking
for them.
Break the wait into 1/20th of a second time waits, and check for done
poll completions if we time out. Otherwise we can have done poll
completions sitting in ctx->poll_list, which needs us to reap them but
we're just waiting for them.
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If both the tracer and the tracee are compat processes, and gprs[2]
is assigned a value by __poke_user_compat, then the higher 32 bits
of gprs[2] are cleared, IS_ERR_VALUE() always returns false, and
syscall_get_error() always returns 0.
Fix the implementation by sign-extending the value for compat processes
the same way as x86 implementation does.
The bug was exposed to user space by commit 201766a20e ("ptrace: add
PTRACE_GET_SYSCALL_INFO request") and detected by strace test suite.
This change fixes strace syscall tampering on s390.
Link: https://lkml.kernel.org/r/20200602180051.GA2427@altlinux.org
Fixes: 753c4dd6a2 ("[S390] ptrace changes")
Cc: Elvira Khabirova <lineprinter@altlinux.org>
Cc: stable@vger.kernel.org # v2.6.28+
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The way we produce SBALs to the device (first update q->nr_buf_used,
then update the SLSB) should ensure that we never see some of the
SLSB states when scanning the queue for progress.
So make some noise if we do, this implies a bug in our SBAL tracking.
Also tweak the WARN msg to provide more information.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
This removes the last remaining accesses to ->qdio_data from internal
code. Just pass the qdio_irq struct where needed instead.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The FA2 mailbox is specified at 0x18025000 but should actually be
0x18025c00, length 0x400 according to socregs_nsp.h and board_bu.c. Also
the interrupt was off by one and should be GIC SPI 151 instead of 150.
Fixes: 17d5171723 ("ARM: dts: NSP: Add mailbox (PDC) to NSP")
Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Alexei Starovoitov says:
====================
pull-request: bpf 2020-06-17
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 2 day(s) which contain
a total of 14 files changed, 158 insertions(+), 59 deletions(-).
The main changes are:
1) Important fix for bpf_probe_read_kernel_str() return value, from Andrii.
2) [gs]etsockopt fix for large optlen, from Stanislav.
3) devmap allocation fix, from Toke.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If we're unlucky with timing, we could be running task_work after
having dropped the memory context in the sq thread. Since dropping
the context requires a runnable task state, we cannot reliably drop
it as part of our check-for-work loop in io_sq_thread(). Instead,
abstract out the mm acquire for the sq thread into a helper, and call
it from the async task work handler.
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In io_complete_rw_iopoll(), stores to io_kiocb's result and iopoll
completed are two independent store operations, to ensure that once
iopoll_completed is ture and then req->result must been perceived by
the cpu executing io_do_iopoll(), proper memory barrier should be used.
And in io_do_iopoll(), we check whether req->result is EAGAIN, if it is,
we'll need to issue this io request using io-wq again. In order to just
issue a single smp_rmb() on the completion side, move the re-submit work
to io_iopoll_complete().
Cc: stable@vger.kernel.org
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
[axboe: don't set ->iopoll_completed for -EAGAIN retry]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In IOPOLL mode, for EAGAIN error, we'll try to submit io request
again using io-wq, so don't fail rest of links if this io request
has links.
Cc: stable@vger.kernel.org
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull dma-mapping fixes from Christoph Hellwig:
"Fixes for the SEV atomic pool (Geert Uytterhoeven and David Rientjes)"
* tag 'dma-mapping-5.8-3' of git://git.infradead.org/users/hch/dma-mapping:
dma-pool: decouple DMA_REMAP from DMA_COHERENT_POOL
dma-pool: fix too large DMA pools on medium memory size systems
We are relying on the fact, that we can pass > sizeof(int) optvals
to the SOL_IP+IP_FREEBIND option (the kernel will take first 4 bytes).
In the BPF program we check that we can only touch PAGE_SIZE bytes,
but the real optlen is PAGE_SIZE * 2. In both cases, we override it to
some predefined value and trim the optlen.
Also, let's modify exiting IP_TOS usecase to test optlen=0 case
where BPF program just bypasses the data as is.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200617010416.93086-2-sdf@google.com
Attaching to these hooks can break iptables because its optval is
usually quite big, or at least bigger than the current PAGE_SIZE limit.
David also mentioned some SCTP options can be big (around 256k).
For such optvals we expose only the first PAGE_SIZE bytes to
the BPF program. BPF program has two options:
1. Set ctx->optlen to 0 to indicate that the BPF's optval
should be ignored and the kernel should use original userspace
value.
2. Set ctx->optlen to something that's smaller than the PAGE_SIZE.
v5:
* use ctx->optlen == 0 with trimmed buffer (Alexei Starovoitov)
* update the docs accordingly
v4:
* use temporary buffer to avoid optval == optval_end == NULL;
this removes the corner case in the verifier that might assume
non-zero PTR_TO_PACKET/PTR_TO_PACKET_END.
v3:
* don't increase the limit, bypass the argument
v2:
* proper comments formatting (Jakub Kicinski)
Fixes: 0d01da6afc ("bpf: implement getsockopt and setsockopt hooks")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Link: https://lore.kernel.org/bpf/20200617010416.93086-1-sdf@google.com
Syzkaller discovered that creating a hash of type devmap_hash with a large
number of entries can hit the memory allocator limit for allocating
contiguous memory regions. There's really no reason to use kmalloc_array()
directly in the devmap code, so just switch it to the existing
bpf_map_area_alloc() function that is used elsewhere.
Fixes: 6f9d451ab1 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200616142829.114173-1-toke@redhat.com
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct dm_target_deps {
...
__u64 dev[0]; /* out */
};
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The array bio_in_progress is only used with ssd mode. So skip
writecache_wait_for_ios in writecache_discard when pmem mode.
Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
To pick the changes from:
b383a73f2b ("fs/ext4: Introduce DAX inode flag")
And silence this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h
It causes various beautifiers for things like fspick, fsmount, etc (see
below) to get rebuilt, but this specific change doesn't make 'perf
trace' be capable of decoding anything new, as we still don't decode
what comes from ioctls, just its cmds.
Details about the update:
$ cp include/uapi/linux/fs.h tools/include/uapi/linux/fs.h
$ git diff
diff --git a/tools/include/uapi/linux/fs.h b/tools/include/uapi/linux/fs.h
index 379a612f8f1d..f44eb0a04afd 100644
--- a/tools/include/uapi/linux/fs.h
+++ b/tools/include/uapi/linux/fs.h
@@ -262,6 +262,7 @@ struct fsxattr {
#define FS_EA_INODE_FL 0x00200000 /* Inode used for large EA */
#define FS_EOFBLOCKS_FL 0x00400000 /* Reserved for ext4 */
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
+#define FS_DAX_FL 0x02000000 /* Inode is DAX */
#define FS_INLINE_DATA_FL 0x10000000 /* Reserved for ext4 */
#define FS_PROJINHERIT_FL 0x20000000 /* Create with parents projid */
#define FS_CASEFOLD_FL 0x40000000 /* Folder is case insensitive */
$ m
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j8' parallel build
INSTALL GTK UI
CC /tmp/build/perf/builtin-trace.o
DESCEND plugins
CC /tmp/build/perf/trace/beauty/fsmount.o
CC /tmp/build/perf/trace/beauty/fspick.o
CC /tmp/build/perf/trace/beauty/mount_flags.o
CC /tmp/build/perf/trace/beauty/move_mount.o
CC /tmp/build/perf/trace/beauty/renameat.o
CC /tmp/build/perf/trace/beauty/sync_file_range.o
INSTALL trace_plugins
LD /tmp/build/perf/trace/beauty/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
7e5b3c267d ("x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation")
Addressing these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
With this one will be able to use these new AMD MSRs in filters, by
name, e.g.:
# perf trace -e msr:* --filter "msr==IA32_MCU_OPT_CTRL"
^C#
Using -v we can see how it sets up the tracepoint filters, converting
from the string in the filter to the numeric value:
# perf trace -v -e msr:* --filter "msr==IA32_MCU_OPT_CTRL"
Using CPUID GenuineIntel-6-8E-A
0x123
New filter for msr:read_msr: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
0x123
New filter for msr:write_msr: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
0x123
New filter for msr:rdpmc: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
mmap size 528384B
^C#
The updating process shows how this affects tooling in more detail:
$ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
--- tools/arch/x86/include/asm/msr-index.h 2020-06-03 10:36:09.959910238 -0300
+++ arch/x86/include/asm/msr-index.h 2020-06-17 10:04:20.235052901 -0300
@@ -128,6 +128,10 @@
#define TSX_CTRL_RTM_DISABLE BIT(0) /* Disable RTM feature */
#define TSX_CTRL_CPUID_CLEAR BIT(1) /* Disable TSX enumeration */
+/* SRBDS support */
+#define MSR_IA32_MCU_OPT_CTRL 0x00000123
+#define RNGDS_MITG_DIS BIT(0)
+
#define MSR_IA32_SYSENTER_CS 0x00000174
#define MSR_IA32_SYSENTER_ESP 0x00000175
#define MSR_IA32_SYSENTER_EIP 0x00000176
$ set -o vi
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
$ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
$ diff -u before after
--- before 2020-06-17 10:05:49.653114752 -0300
+++ after 2020-06-17 10:06:01.777258731 -0300
@@ -51,6 +51,7 @@
[0x0000011e] = "IA32_BBL_CR_CTL3",
[0x00000120] = "IDT_MCR_CTRL",
[0x00000122] = "IA32_TSX_CTRL",
+ [0x00000123] = "IA32_MCU_OPT_CTRL",
[0x00000140] = "MISC_FEATURES_ENABLES",
[0x00000174] = "IA32_SYSENTER_CS",
[0x00000175] = "IA32_SYSENTER_ESP",
$
The related change to cpu-features.h affects this:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
This shouldn't be affecting that 'perf bench' entry:
$ find tools/perf/ -type f | xargs grep SRBDS
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Gross <mgross@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get some newer headers that got out of sync with the copies in tools/
so that we can try to have the tools/perf/ build clean for v5.8 with
fewer pull requests.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes segmentation fault when trying to interpret zstd-compressed data
with perf script:
```
$ perf record -z ls
...
[ perf record: Captured and wrote 0,010 MB perf.data, compressed (original 0,001 MB, ratio is 2,190) ]
$ memcheck perf script
...
==67911== Invalid read of size 4
==67911== at 0x5568188: ZSTD_decompressStream (in /usr/lib/libzstd.so.1.4.5)
==67911== by 0x6E726B: zstd_decompress_stream (zstd.c:100)
==67911== by 0x65729C: perf_session__process_compressed_event (session.c:72)
==67911== by 0x6598E8: perf_session__process_user_event (session.c:1583)
==67911== by 0x65BA59: reader__process_events (session.c:2177)
==67911== by 0x65BA59: __perf_session__process_events (session.c:2234)
==67911== by 0x65BA59: perf_session__process_events (session.c:2267)
==67911== by 0x5A7397: __cmd_script (builtin-script.c:2447)
==67911== by 0x5A7397: cmd_script (builtin-script.c:3840)
==67911== by 0x5FE9D2: run_builtin (perf.c:312)
==67911== by 0x711627: handle_internal_command (perf.c:364)
==67911== by 0x711627: run_argv (perf.c:408)
==67911== by 0x711627: main (perf.c:538)
==67911== Address 0x71d8 is not stack'd, malloc'd or (recently) free'd
```
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LPU-Reference: 20200612230333.72140-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make sure that the local variable rzone in dmz_do_reclaim() is always
initialized before being used for printing debug messages.
Fixes: f97809aec5 ("dm zoned: per-device reclaim")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
During recent refactorings, bpf_probe_read_kernel_str() started returning 0 on
success, instead of amount of data successfully read. This majorly breaks
applications relying on bpf_probe_read_kernel_str() and bpf_probe_read_str()
and their results. Fix this by returning actual number of bytes read.
Fixes: 8d92db5c04 ("bpf: rework the compat kernel probe handling")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200616050432.1902042-1-andriin@fb.com
Mostly for historical reasons, q->blk_trace is assigned through xchg()
and cmpxchg() atomic operations. Although this is correct, sparse
complains about this because it violates rcu annotations since commit
c780e86dd4 ("blktrace: Protect q->blk_trace with RCU") which started
to use rcu for accessing q->blk_trace. Furthermore there's no real need
for atomic operations anymore since all changes to q->blk_trace happen
under q->blk_trace_mutex and since it also makes more sense to check if
q->blk_trace is set with the mutex held earlier.
So let's just replace xchg() with rcu_replace_pointer() and cmpxchg()
with explicit check and rcu_assign_pointer(). This makes the code more
efficient and sparse happy.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We use one blktrace per request_queue, that means one per the entire
disk. So we cannot run one blktrace on say /dev/vda and then /dev/vda1,
or just two calls on /dev/vda.
We check for concurrent setup only at the very end of the blktrace setup though.
If we try to run two concurrent blktraces on the same block device the
second one will fail, and the first one seems to go on. However when
one tries to kill the first one one will see things like this:
The kernel will show these:
```
debugfs: File 'dropped' in directory 'nvme1n1' already present!
debugfs: File 'msg' in directory 'nvme1n1' already present!
debugfs: File 'trace0' in directory 'nvme1n1' already present!
``
And userspace just sees this error message for the second call:
```
blktrace /dev/nvme1n1
BLKTRACESETUP(2) /dev/nvme1n1 failed: 5/Input/output error
```
The first userspace process #1 will also claim that the files
were taken underneath their nose as well. The files are taken
away form the first process given that when the second blktrace
fails, it will follow up with a BLKTRACESTOP and BLKTRACETEARDOWN.
This means that even if go-happy process #1 is waiting for blktrace
data, we *have* been asked to take teardown the blktrace.
This can easily be reproduced with break-blktrace [0] run_0005.sh test.
Just break out early if we know we're already going to fail, this will
prevent trying to create the files all over again, which we know still
exist.
[0] https://github.com/mcgrof/break-blktrace
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The server is failing to apply the umask when creating new objects on
filesystems without ACL support.
To reproduce this, you need to use NFSv4.2 and a client and server
recent enough to support umask, and you need to export a filesystem that
lacks ACL support (for example, ext4 with the "noacl" mount option).
Filesystems with ACL support are expected to take care of the umask
themselves (usually by calling posix_acl_create).
For filesystems without ACL support, this is up to the caller of
vfs_create(), vfs_mknod(), or vfs_mkdir().
Reported-by: Elliott Mitchell <ehem+debian@m5p.com>
Reported-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Fixes: 47057abde5 ("nfsd: add support for the umask attribute")
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
On 32-bit ARM, we may boot at HYP mode, or with the MMU and caches off
(or both), even though the EFI spec does not actually support this.
While booting at HYP mode is something we might tolerate, fiddling
with the caches is a more serious issue, as disabling the caches is
tricky to do safely from C code, and running without the Dcache makes
it impossible to support unaligned memory accesses, which is another
explicit requirement imposed by the EFI spec.
So take note of the CPU mode and MMU state in the EFI stub diagnostic
output so that we can easily diagnose any issues that may arise from
this. E.g.,
EFI stub: Entering in SVC mode with MMU enabled
Also, capture the CPSR and SCTLR system register values at EFI stub
entry, and after ExitBootServices() returns, and check whether the
MMU and Dcache were disabled at any point. If this is the case, a
diagnostic message like the following will be emitted:
efi: [Firmware Bug]: EFI stub was entered with MMU and Dcache disabled, please fix your firmware!
efi: CPSR at EFI stub entry : 0x600001d3
efi: SCTLR at EFI stub entry : 0x00c51838
efi: CPSR after ExitBootServices() : 0x600001d3
efi: SCTLR after ExitBootServices(): 0x00c50838
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
On arm64, the EFI stub is built into the kernel proper, and so the stub
can refer to its symbols directly. Therefore, the practice of using EFI
configuration tables to pass information between them is never needed,
so we can omit any code consuming such tables when building for arm64.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
Commit
17054f492d ("efi/x86: Implement mixed mode boot without the handover protocol")
introduced a new entry point for the EFI stub to be booted in mixed mode
on 32-bit firmware.
When entered via efi32_pe_entry, control is first transferred to
startup_32 to setup for the switch to long mode, and then the EFI stub
proper is entered via efi_pe_entry. efi_pe_entry is an MS ABI function,
and the ABI requires 32 bytes of shadow stack space to be allocated by
the caller, as well as the stack being aligned to 8 mod 16 on entry.
Allocate 40 bytes on the stack before switching to 64-bit mode when
calling efi_pe_entry to account for this.
For robustness, explicitly align boot_stack_end to 16 bytes. It is
currently implicitly aligned since .bss is cacheline-size aligned,
head_64.o is the first object file with a .bss section, and the heap and
boot sizes are aligned.
Fixes: 17054f492d ("efi/x86: Implement mixed mode boot without the handover protocol")
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://lore.kernel.org/r/20200617131957.2507632-1-nivedita@alum.mit.edu
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Currently the macro that inserts entries into the SPU syscall table
doesn't actually use the "nr" (syscall number) parameter.
This does work, but it relies on the exact right number of syscall
entries being emitted in order for the syscal numbers to line up with
the array entries. If for example we had two entries with the same
syscall number we wouldn't get an error, it would just cause all
subsequent syscalls to be off by one in the spu_syscall_table.
So instead change the macro to assign to the specific entry of the
array, meaning any numbering overlap will be caught by the compiler.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200616135617.2937252-1-mpe@ellerman.id.au
The pte_update() implementation for PPC_8xx unfolds page table from the PGD
level to access a PMD entry. Since 8xx has only 2-level page table this can
be simplified with pmd_off() shortcut.
Replace explicit unfolding with pmd_off() and drop defines of pgd_index()
and pgd_offset() that are no longer needed.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200615092229.23142-1-rppt@kernel.org
In case of -EPROBE_DEFER, stm32_qspi_release() was called
in any case which unregistered driver from pm_runtime framework
even if it has not been registered yet to it. This leads to:
stm32-qspi 58003000.spi: can't setup spi0.0, status -13
spi_master spi0: spi_device register error /soc/spi@58003000/mx66l51235l@0
spi_master spi0: Failed to create SPI device for /soc/spi@58003000/mx66l51235l@0
stm32-qspi 58003000.spi: can't setup spi0.1, status -13
spi_master spi0: spi_device register error /soc/spi@58003000/mx66l51235l@1
spi_master spi0: Failed to create SPI device for /soc/spi@58003000/mx66l51235l@1
On v5.7 kernel,this issue was not "visible", qspi driver was probed
successfully.
Fixes: 9d282c17b0 ("spi: stm32-qspi: Add pm_runtime support")
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Link: https://lore.kernel.org/r/20200616113035.4514-1-patrice.chotard@st.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Unfortunately, most versions of clang that support BTI are capable of
miscompiling the kernel when converting a switch statement into a jump
table. As an example, attempting to spawn a KVM guest results in a panic:
[ 56.253312] Kernel panic - not syncing: bad mode
[ 56.253834] CPU: 0 PID: 279 Comm: lkvm Not tainted 5.8.0-rc1 #2
[ 56.254225] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[ 56.254712] Call trace:
[ 56.254952] dump_backtrace+0x0/0x1d4
[ 56.255305] show_stack+0x1c/0x28
[ 56.255647] dump_stack+0xc4/0x128
[ 56.255905] panic+0x16c/0x35c
[ 56.256146] bad_el0_sync+0x0/0x58
[ 56.256403] el1_sync_handler+0xb4/0xe0
[ 56.256674] el1_sync+0x7c/0x100
[ 56.256928] kvm_vm_ioctl_check_extension_generic+0x74/0x98
[ 56.257286] __arm64_sys_ioctl+0x94/0xcc
[ 56.257569] el0_svc_common+0x9c/0x150
[ 56.257836] do_el0_svc+0x84/0x90
[ 56.258083] el0_sync_handler+0xf8/0x298
[ 56.258361] el0_sync+0x158/0x180
This is because the switch in kvm_vm_ioctl_check_extension_generic()
is executed as an indirect branch to tail-call through a jump table:
ffff800010032dc8: 3869694c ldrb w12, [x10, x9]
ffff800010032dcc: 8b0c096b add x11, x11, x12, lsl #2
ffff800010032dd0: d61f0160 br x11
However, where the target case uses the stack, the landing pad is elided
due to the presence of a paciasp instruction:
ffff800010032e14: d503233f paciasp
ffff800010032e18: a9bf7bfd stp x29, x30, [sp, #-16]!
ffff800010032e1c: 910003fd mov x29, sp
ffff800010032e20: aa0803e0 mov x0, x8
ffff800010032e24: 940017c0 bl ffff800010038d24 <kvm_vm_ioctl_check_extension>
ffff800010032e28: 93407c00 sxtw x0, w0
ffff800010032e2c: a8c17bfd ldp x29, x30, [sp], #16
ffff800010032e30: d50323bf autiasp
ffff800010032e34: d65f03c0 ret
Unfortunately, this results in a fatal exception because paciasp is
compatible only with branch-and-link (call) instructions and not simple
indirect branches.
A fix is being merged into Clang 10.0.1 so that a 'bti j' instruction is
emitted as an explicit landing pad in this situation. Make in-kernel
BTI depend on that compiler version when building with clang.
Cc: Tom Stellard <tstellar@redhat.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20200615105524.GA2694@willie-the-truck
Link: https://lore.kernel.org/r/20200616183630.2445-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The callers don't expect *d_cdp to be set to an error pointer, they only
check for NULL. This leads to a static checker warning:
arch/x86/kernel/cpu/resctrl/rdtgroup.c:2648 __init_one_rdt_domain()
warn: 'd_cdp' could be an error pointer
This would not trigger a bug in this specific case because
__init_one_rdt_domain() calls it with a valid domain that would not have
a negative id and thus not trigger the return of the ERR_PTR(). If this
was a negative domain id then the call to rdt_find_domain() in
domain_add_cpu() would have returned the ERR_PTR() much earlier and the
creation of the domain with an invalid id would have been prevented.
Even though a bug is not triggered currently the right and safe thing to
do is to set the pointer to NULL because that is what can be checked for
when the caller is handling the CDP and non-CDP cases.
Fixes: 52eb74339a ("x86/resctrl: Fix rdt_find_domain() return value and checks")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lkml.kernel.org/r/20200602193611.GA190851@mwanda
With the recent full-duplex support of implicit feedback streams, an
endpoint can be still running after closing the capture stream as long
as the playback stream with the sync-endpoint is running. In such a
state, the URBs are still be handled and they may call retire_data_urb
callback, which tries to transfer the data from the PCM buffer. Since
the PCM stream gets closed, this may lead to use-after-free.
This patch adds the proper clearance of the callback at stopping the
capture stream for addressing the possible UAF above.
Fixes: 10ce77e481 ("ALSA: usb-audio: Add duplex sound support for USB devices using implicit feedback")
Link: https://lore.kernel.org/r/20200616120921.12249-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
__change_page_attr() can fail which will cause set_memory_encrypted() and
set_memory_decrypted() to return non-zero.
If the device requires unencrypted DMA memory and decryption fails, simply
free the memory and fail.
If attempting to re-encrypt in the failure path and that encryption fails,
there is no alternative other than to leak the memory.
Fixes: c10f07aa27 ("dma/direct: Handle force decryption for DMA coherent buffers in common code")
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
If arch_dma_set_uncached() fails after memory has been decrypted, it needs
to be re-encrypted before freeing.
Fixes: fa7e2247c5 ("dma-direct: make uncached_kernel_address more general")
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
dma_alloc_contiguous() does size >> PAGE_SHIFT and set_memory_decrypted()
works at page granularity. It's necessary to page align the allocation
size in dma_direct_alloc_pages() for consistent behavior.
This also fixes an issue when arch_dma_prep_coherent() is called on an
unaligned allocation size for dma_alloc_need_uncached() when
CONFIG_DMA_DIRECT_REMAP is disabled but CONFIG_ARCH_HAS_DMA_SET_UNCACHED
is enabled.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
nommu configfs can trivially map the coherent allocations to user space,
as no actual page table setup is required and the kernel and the user
space programs share the same address space.
Fixes: 62fcee9a3b ("dma-mapping: remove CONFIG_ARCH_NO_COHERENT_DMA_MMAP")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: dillon min <dillon.minfei@gmail.com>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: dillon min <dillon.minfei@gmail.com>
Using _MASKED_BIT_ENABLE macro to set mask register bits is straight
forward and not likely to go wrong. However when checking which bit(s)
is(are) enabled, simply bitwise AND value and _MASKED_BIT_ENABLE() won't
output expected result. Suppose the register write is disabling bit 1
by setting 0xFFFF0000, however "& _MASKED_BIT_ENABLE(1)" outputs
0x00010000, and the non-zero check will pass which cause the old code
consider the new value set as an enabling operation.
We found guest set 0x80008000 on boot, and set 0xffff8000 during resume.
Both are legal settings but old code will block latter and force vgpu
enter fail-safe mode.
Introduce two new macro and make proper masked bit check in mmio handler:
IS_MASKED_BITS_ENABLED()
IS_MASKED_BITS_DISABLED()
V2: Rebase.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200601030721.17129-1-colin.xu@intel.com
Add flex_array_size() helper for the calculation of the size, in bytes,
of a flexible array member contained within an enclosing structure.
Example of usage:
struct something {
size_t count;
struct foo items[];
};
struct something *instance;
instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
instance->count = count;
memcpy(instance->items, src, flex_array_size(instance, items, instance->count));
The helper returns SIZE_MAX on overflow instead of wrapping around.
Additionally replaces parameter "n" with "count" in struct_size() helper
for greater clarity and unification.
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20200609012233.GA3371@embeddedor
Signed-off-by: Kees Cook <keescook@chromium.org>
cc-option and as-option are almost the same; both pass the flag to
$(CC). The main difference is the cc-option stops before the assemble
stage (-S option) whereas as-option stops after (-c option).
I chose -S because it is slightly faster, but $(cc-option,-gz=zlib)
returns a wrong result (https://lkml.org/lkml/2020/6/9/1529).
It has been fixed by commit 7b16994437 ("Makefile: Improve compressed
debug info support detection"), but the assembler should always be
invoked for more reliable compiler option tests.
However, you cannot simply replace -S with -c because the following
code in lib/Kconfig.debug would break:
depends on $(cc-option,-gsplit-dwarf)
The combination of -c and -gsplit-dwarf does not accept /dev/null as
output.
$ cat /dev/null | gcc -gsplit-dwarf -S -x c - -o /dev/null
$ echo $?
0
$ cat /dev/null | gcc -gsplit-dwarf -c -x c - -o /dev/null
objcopy: Warning: '/dev/null' is not an ordinary file
$ echo $?
1
$ cat /dev/null | gcc -gsplit-dwarf -c -x c - -o tmp.o
$ echo $?
0
There is another flag that creates an separate file based on the
object file path:
$ cat /dev/null | gcc -ftest-coverage -c -x c - -o /dev/null
<stdin>:1: error: cannot open /dev/null.gcno
So, we cannot use /dev/null to sink the output.
Align the cc-option implementation with scripts/Kbuild.include.
With -c option used in cc-option, as-option is unneeded.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Fix /proc/bootconfig to select double or single quotes
corrctly according to the value.
If a bootconfig value includes a double quote character,
we must use single-quotes to quote that value.
This modifies if() condition and blocks for avoiding
double-quote in value check in 2 places. Anyway, since
xbc_array_for_each_value() can handle the array which
has a single node correctly.
Thus,
if (vnode && xbc_node_is_array(vnode)) {
xbc_array_for_each_value(vnode) /* vnode->next != NULL */
...
} else {
snprintf(val); /* val is an empty string if !vnode */
}
is equivalent to
if (vnode) {
xbc_array_for_each_value(vnode) /* vnode->next can be NULL */
...
} else {
snprintf(""); /* value is always empty */
}
Link: http://lkml.kernel.org/r/159230244786.65555.3763894451251622488.stgit@devnote2
Cc: stable@vger.kernel.org
Fixes: c1a3c36017 ("proc: bootconfig: Add /proc/bootconfig to show boot config list")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
kmemleak report:
[<57dcc2ca>] __kmalloc_track_caller+0x139/0x2b0
[<f1c45d0f>] kstrndup+0x37/0x80
[<f9761eb0>] parse_probe_arg.isra.7+0x3cc/0x630
[<055bf2ba>] traceprobe_parse_probe_arg+0x2f5/0x810
[<655a7766>] trace_kprobe_create+0x2ca/0x950
[<4fc6a02a>] create_or_delete_trace_kprobe+0xf/0x30
[<6d1c8a52>] trace_run_command+0x67/0x80
[<be812cc0>] trace_parse_run_command+0xa7/0x140
[<aecfe401>] probes_write+0x10/0x20
[<2027641c>] __vfs_write+0x30/0x1e0
[<6a4aeee1>] vfs_write+0x96/0x1b0
[<3517fb7d>] ksys_write+0x53/0xc0
[<dad91db7>] __ia32_sys_write+0x15/0x20
[<da347f64>] do_syscall_32_irqs_on+0x3d/0x260
[<fd0b7e7d>] do_fast_syscall_32+0x39/0xb0
[<ea5ae810>] entry_SYSENTER_32+0xaf/0x102
Post parse_probe_arg(), the FETCH_OP_DATA operation type is overwritten
to FETCH_OP_ST_STRING, as a result memory is never freed since
traceprobe_free_probe_arg() iterates only over SYMBOL and DATA op types
Setup fetch string operation correctly after fetch_op_data operation.
Link: https://lkml.kernel.org/r/20200615143034.GA1734@cosmos
Cc: stable@vger.kernel.org
Fixes: a42e3c4de9 ("tracing/probe: Add immediate string parameter support")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Vamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
When using trace-cmd on 5.6-rt for the function graph tracer, the output was
corrupted. It gave output like this:
funcgraph_entry: func=0xffffffff depth=38982
funcgraph_entry: func=0x1ffffffff depth=16044
funcgraph_exit: func=0xffffffff overrun=0x92539aaf00000000 calltime=0x92539c9900000072 rettime=0x100000072 depth=11084
funcgraph_exit: func=0xffffffff overrun=0x9253946e00000000 calltime=0x92539e2100000072 rettime=0x72 depth=26033702
funcgraph_entry: func=0xffffffff depth=85798
funcgraph_entry: func=0x1ffffffff depth=12044
The reason was because the tracefs/events/ftrace/funcgraph_entry/exit format
file was incorrect. The -rt kernel adds more common fields to the trace
events. Namely, common_migrate_disable and common_preempt_lazy_count. Each
is one byte in size. This changes the alignment of the normal payload. Most
events are aligned normally, but the function and function graph events are
defined with a "PACKED" macro, that packs their payload. As the offsets
displayed in the format files are now calculated by an aligned field, the
aligned field for function and function graph events should be 1, not their
normal alignment.
With aligning of the funcgraph_entry event, the format file has:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:unsigned char common_migrate_disable; offset:8; size:1; signed:0;
field:unsigned char common_preempt_lazy_count; offset:9; size:1; signed:0;
field:unsigned long func; offset:16; size:8; signed:0;
field:int depth; offset:24; size:4; signed:1;
But the actual alignment is:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:unsigned char common_migrate_disable; offset:8; size:1; signed:0;
field:unsigned char common_preempt_lazy_count; offset:9; size:1; signed:0;
field:unsigned long func; offset:12; size:8; signed:0;
field:int depth; offset:20; size:4; signed:1;
Link: https://lkml.kernel.org/r/20200609220041.2a3b527f@oasis.local.home
Cc: stable@vger.kernel.org
Fixes: 04ae87a520 ("ftrace: Rework event_create_dir()")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave.
My test was also able to trigger lockdep output:
============================================
WARNING: possible recursive locking detected
5.6.0-rc6+ #6 Not tainted
--------------------------------------------
sched-messaging/2767 is trying to acquire lock:
ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0
but task is already holding lock:
ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(kretprobe_table_locks[i].lock));
lock(&(kretprobe_table_locks[i].lock));
*** DEADLOCK ***
May be due to missing lock nesting notation
1 lock held by sched-messaging/2767:
#0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
stack backtrace:
CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
Call Trace:
dump_stack+0x96/0xe0
__lock_acquire.cold.57+0x173/0x2b7
? native_queued_spin_lock_slowpath+0x42b/0x9e0
? lockdep_hardirqs_on+0x590/0x590
? __lock_acquire+0xf63/0x4030
lock_acquire+0x15a/0x3d0
? kretprobe_hash_lock+0x52/0xa0
_raw_spin_lock_irqsave+0x36/0x70
? kretprobe_hash_lock+0x52/0xa0
kretprobe_hash_lock+0x52/0xa0
trampoline_handler+0xf8/0x940
? kprobe_fault_handler+0x380/0x380
? find_held_lock+0x3a/0x1c0
kretprobe_trampoline+0x25/0x50
? lock_acquired+0x392/0xbc0
? _raw_spin_lock_irqsave+0x50/0x70
? __get_valid_kprobe+0x1f0/0x1f0
? _raw_spin_unlock_irqrestore+0x3b/0x40
? finish_task_switch+0x4b9/0x6d0
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
The code within the kretprobe handler checks for probe reentrancy,
so we won't trigger any _raw_spin_lock_irqsave probe in there.
The problem is in outside kprobe_flush_task, where we call:
kprobe_flush_task
kretprobe_table_lock
raw_spin_lock_irqsave
_raw_spin_lock_irqsave
where _raw_spin_lock_irqsave triggers the kretprobe and installs
kretprobe_trampoline handler on _raw_spin_lock_irqsave return.
The kretprobe_trampoline handler is then executed with already
locked kretprobe_table_locks, and first thing it does is to
lock kretprobe_table_locks ;-) the whole lockup path like:
kprobe_flush_task
kretprobe_table_lock
raw_spin_lock_irqsave
_raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed
---> kretprobe_table_locks locked
kretprobe_trampoline
trampoline_handler
kretprobe_hash_lock(current, &head, &flags); <--- deadlock
Adding kprobe_busy_begin/end helpers that mark code with fake
probe installed to prevent triggering of another kprobe within
this code.
Using these helpers in kprobe_flush_task, so the probe recursion
protection check is hit and the probe is never set to prevent
above lockup.
Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2
Fixes: ef53d9c5e4 ("kprobes: improve kretprobe scalability with hashed locking")
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Reported-by: "Ziqian SUN (Zamir)" <zsun@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix to remove redundant arch_disarm_kprobe() call in
force_unoptimize_kprobe(). This arch_disarm_kprobe()
will be invoked if the kprobe is optimized but disabled,
but that means the kprobe (optprobe) is unused (and
unoptimized) state.
In that case, unoptimize_kprobe() puts it in freeing_list
and kprobe_optimizer (do_unoptimize_kprobes()) automatically
disarm it. Thus this arch_disarm_kprobe() is redundant.
Link: http://lkml.kernel.org/r/158927058719.27680.17183632908465341189.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Anders reported that the lockdep warns that suspicious
RCU list usage in register_kprobe() (detected by
CONFIG_PROVE_RCU_LIST.) This is because get_kprobe()
access kprobe_table[] by hlist_for_each_entry_rcu()
without rcu_read_lock.
If we call get_kprobe() from the breakpoint handler context,
it is run with preempt disabled, so this is not a problem.
But in other cases, instead of rcu_read_lock(), we locks
kprobe_mutex so that the kprobe_table[] is not updated.
So, current code is safe, but still not good from the view
point of RCU.
Joel suggested that we can silent that warning by passing
lockdep_is_held() to the last argument of
hlist_for_each_entry_rcu().
Add lockdep_is_held(&kprobe_mutex) at the end of the
hlist_for_each_entry_rcu() to suppress the warning.
Link: http://lkml.kernel.org/r/158927055350.27680.10261450713467997503.stgit@devnote2
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Suggested-by: Joel Fernandes <joel@joelfernandes.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
When cc-option and friends evaluate compiler flags, the temporary file
$$TMP is created as an output object, and automatically cleaned up.
The actual file path of $$TMP is .<pid>.tmp, here <pid> is the process
ID of $(shell ...) invoked from cc-option. (Please note $$$$ is the
escape sequence of $$).
Such garbage files are cleaned up in most cases, but some compiler flags
create additional output files.
For example, -gsplit-dwarf creates a .dwo file.
When CONFIG_DEBUG_INFO_SPLIT=y, you will see a bunch of .<pid>.dwo files
left in the top of build directories. You may not notice them unless you
do 'ls -a', but the garbage files will increase every time you run 'make'.
This commit changes the temporary object path to .tmp_<pid>/tmp, and
removes .tmp_<pid> directory when exiting. Separate build artifacts such
as *.dwo will be cleaned up all together because their file paths are
usually determined based on the base name of the object.
Another example is -ftest-coverage, which outputs the coverage data into
<base-name-of-object>.gcno
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull networking fixes from David Miller:
1) Don't get per-cpu pointer with preemption enabled in nft_set_pipapo,
fix from Stefano Brivio.
2) Fix memory leak in ctnetlink, from Pablo Neira Ayuso.
3) Multiple definitions of MPTCP_PM_MAX_ADDR, from Geliang Tang.
4) Accidently disabling NAPI in non-error paths of macb_open(), from
Charles Keepax.
5) Fix races between alx_stop and alx_remove, from Zekun Shen.
6) We forget to re-enable SRIOV during resume in bnxt_en driver, from
Michael Chan.
7) Fix memory leak in ipv6_mc_destroy_dev(), from Wang Hai.
8) rxtx stats use wrong index in mvpp2 driver, from Sven Auhagen.
9) Fix memory leak in mptcp_subflow_create_socket error path, from Wei
Yongjun.
10) We should not adjust the TCP window advertised when sending dup acks
in non-SACK mode, because it won't be counted as a dup by the sender
if the window size changes. From Eric Dumazet.
11) Destroy the right number of queues during remove in mvpp2 driver,
from Sven Auhagen.
12) Various WOL and PM fixes to e1000 driver, from Chen Yu, Vaibhav
Gupta, and Arnd Bergmann.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
e1000e: fix unused-function warning
e1000: use generic power management
e1000e: Do not wake up the system via WOL if device wakeup is disabled
lan743x: add MODULE_DEVICE_TABLE for module loading alias
mlxsw: spectrum: Adjust headroom buffers for 8x ports
bareudp: Fixed configuration to avoid having garbage values
mvpp2: remove module bugfix
tcp: grow window for OOO packets only for SACK flows
mptcp: fix memory leak in mptcp_subflow_create_socket()
netfilter: flowtable: Make nf_flow_table_offload_add/del_cb inline
net/sched: act_ct: Make tcf_ct_flow_table_restore_skb inline
net: dsa: sja1105: fix PTP timestamping with large tc-taprio cycles
mvpp2: ethtool rxtx stats fix
MAINTAINERS: switch to my private email for Renesas Ethernet drivers
rocker: fix incorrect error handling in dma_rings_init
test_objagg: Fix potential memory leak in error handling
net: ethernet: mtk-star-emac: simplify interrupt handling
mld: fix memory leak in ipv6_mc_destroy_dev()
bnxt_en: Return from timer if interface is not in open state.
bnxt_en: Fix AER reset logic on 57500 chips.
...
Pull AFS fixes from David Howells:
"I've managed to get xfstests kind of working with afs. Here are a set
of patches that fix most of the bugs found.
There are a number of primary issues:
- Incorrect handling of mtime and non-handling of ctime. It might be
argued, that the latter isn't a bug since the AFS protocol doesn't
support ctime, but I should probably still update it locally.
- Shared-write mmap, truncate and writeback bugs. This includes not
changing i_size under the callback lock, overwriting local i_size
with the reply from the server after a partial writeback, not
limiting the writeback from an mmapped page to EOF.
- Checks for an abort code indicating that the primary vnode in an
operation was deleted by a third-party are done in the wrong place.
- Silly rename bugs. This includes an incomplete conversion to the
new operation handling, duplicate nlink handling, nlink changing
not being done inside the callback lock and insufficient handling
of third-party conflicting directory changes.
And some secondary ones:
- The UAEOVERFLOW abort code should map to EOVERFLOW not EREMOTEIO.
- Remove a couple of unused or incompletely used bits.
- Remove a couple of redundant success checks.
These seem to fix all the data-corruption bugs found by
./check -afs -g quick
along with the obvious silly rename bugs and time bugs.
There are still some test failures, but they seem to fall into two
classes: firstly, the authentication/security model is different to
the standard UNIX model and permission is arbitrated by the server and
cached locally; and secondly, there are a number of features that AFS
does not support (such as mknod). But in these cases, the tests
themselves need to be adapted or skipped.
Using the in-kernel afs client with xfstests also found a bug in the
AuriStor AFS server that has been fixed for a future release"
* tag 'afs-fixes-20200616' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Fix silly rename
afs: afs_vnode_commit_status() doesn't need to check the RPC error
afs: Fix use of afs_check_for_remote_deletion()
afs: Remove afs_operation::abort_code
afs: Fix yfs_fs_fetch_status() to honour vnode selector
afs: Remove yfs_fs_fetch_file_status() as it's not used
afs: Fix the mapping of the UAEOVERFLOW abort code
afs: Fix truncation issues and mmap writeback size
afs: Concoct ctimes
afs: Fix EOF corruption
afs: afs_write_end() should change i_size under the right lock
afs: Fix non-setting of mtime when writing into mmap
Clang static analysis reports this double free error
security/selinux/ss/conditional.c:139:2: warning: Attempt to free released memory [unix.Malloc]
kfree(node->expr.nodes);
^~~~~~~~~~~~~~~~~~~~~~~
When cond_read_node fails, it calls cond_node_destroy which frees the
node but does not poison the entry in the node list. So when it
returns to its caller cond_read_list, cond_read_list deletes the
partial list. The latest entry in the list will be deleted twice.
So instead of freeing the node in cond_read_node, let list freeing in
code_read_list handle the freeing the problem node along with all of the
earlier nodes.
Because cond_read_node no longer does any error handling, the goto's
the error case are redundant. Instead just return the error code.
Cc: stable@vger.kernel.org
Fixes: 60abd3181d ("selinux: convert cond_list to array")
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
[PM: subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Pull flexible-array member conversions from Gustavo A. R. Silva:
"Replace zero-length arrays with flexible-array members.
Notice that all of these patches have been baking in linux-next for
two development cycles now.
There is a regular need in the kernel to provide a way to declare
having a dynamically sized set of trailing elements in a structure.
Kernel code should always use “flexible array members”[1] for these
cases. The older style of one-element or zero-length arrays should no
longer be used[2].
C99 introduced “flexible array members”, which lacks a numeric size
for the array declaration entirely:
struct something {
size_t count;
struct foo items[];
};
This is the way the kernel expects dynamically sized trailing elements
to be declared. It allows the compiler to generate errors when the
flexible array does not occur last in the structure, which helps to
prevent some kind of undefined behavior[3] bugs from being
inadvertently introduced to the codebase.
It also allows the compiler to correctly analyze array sizes (via
sizeof(), CONFIG_FORTIFY_SOURCE, and CONFIG_UBSAN_BOUNDS). For
instance, there is no mechanism that warns us that the following
application of the sizeof() operator to a zero-length array always
results in zero:
struct something {
size_t count;
struct foo items[0];
};
struct something *instance;
instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
instance->count = count;
size = sizeof(instance->items) * instance->count;
memcpy(instance->items, source, size);
At the last line of code above, size turns out to be zero, when one
might have thought it represents the total size in bytes of the
dynamic memory recently allocated for the trailing array items. Here
are a couple examples of this issue[4][5].
Instead, flexible array members have incomplete type, and so the
sizeof() operator may not be applied[6], so any misuse of such
operators will be immediately noticed at build time.
The cleanest and least error-prone way to implement this is through
the use of a flexible array member:
struct something {
size_t count;
struct foo items[];
};
struct something *instance;
instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
instance->count = count;
size = sizeof(instance->items[0]) * instance->count;
memcpy(instance->items, source, size);
instead"
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
[4] commit f2cd32a443 ("rndis_wlan: Remove logically dead code")
[5] commit ab91c2a89f ("tpm: eventlog: Replace zero-length array with flexible-array member")
[6] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
* tag 'flex-array-conversions-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (41 commits)
w1: Replace zero-length array with flexible-array
tracing/probe: Replace zero-length array with flexible-array
soc: ti: Replace zero-length array with flexible-array
tifm: Replace zero-length array with flexible-array
dmaengine: tegra-apb: Replace zero-length array with flexible-array
stm class: Replace zero-length array with flexible-array
Squashfs: Replace zero-length array with flexible-array
ASoC: SOF: Replace zero-length array with flexible-array
ima: Replace zero-length array with flexible-array
sctp: Replace zero-length array with flexible-array
phy: samsung: Replace zero-length array with flexible-array
RxRPC: Replace zero-length array with flexible-array
rapidio: Replace zero-length array with flexible-array
media: pwc: Replace zero-length array with flexible-array
firmware: pcdp: Replace zero-length array with flexible-array
oprofile: Replace zero-length array with flexible-array
block: Replace zero-length array with flexible-array
tools/testing/nvdimm: Replace zero-length array with flexible-array
libata: Replace zero-length array with flexible-array
kprobes: Replace zero-length array with flexible-array
...
The purgatory Makefile removes -fstack-protector options if they were
configured in, but does not currently add -fno-stack-protector.
If gcc was configured with the --enable-default-ssp configure option,
this results in the stack protector still being enabled for the
purgatory (absent distro-specific specs files that might disable it
again for freestanding compilations), if the main kernel is being
compiled with stack protection enabled (if it's disabled for the main
kernel, the top-level Makefile will add -fno-stack-protector).
This will break the build since commit
e4160b2e4b ("x86/purgatory: Fail the build if purgatory.ro has missing symbols")
and prior to that would have caused runtime failure when trying to use
kexec.
Explicitly add -fno-stack-protector to avoid this, as done in other
Makefiles that need to disable the stack protector.
Reported-by: Gabriel C <nix.or.die@googlemail.com>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2020-06-16
This series contains fixes to e1000 and e1000e.
Chen fixes an e1000e issue where systems could be waken via WoL, even
though the user has disabled the wakeup bit via sysfs.
Vaibhav Gupta updates the e1000 driver to clean up the legacy Power
Management hooks.
Arnd Bergmann cleans up the inconsistent use CONFIG_PM_SLEEP
preprocessor tags, which also resolves the compiler warnings about the
possibility of unused structure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The CONFIG_PM_SLEEP #ifdef checks in this file are inconsistent,
leading to a warning about sometimes unused function:
drivers/net/ethernet/intel/e1000e/netdev.c:137:13: error: unused function 'e1000e_check_me' [-Werror,-Wunused-function]
Rather than adding more #ifdefs, just remove them completely
and mark the PM functions as __maybe_unused to let the compiler
work it out on it own.
Fixes: e086ba2fcc ("e1000e: disable s0ix entry and exit flows for ME systems")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
With legacy PM hooks, it was the responsibility of a driver to manage PCI
states and also the device's power state. The generic approach is to let PCI
core handle the work.
e1000_suspend() calls __e1000_shutdown() to perform intermediate tasks.
__e1000_shutdown() modifies the value of "wake" (device should be wakeup
enabled or not), responsible for controlling the flow of legacy PM.
Since, PCI core has no idea about the value of "wake", new code for generic
PM may produce unexpected results. Thus, use "device_set_wakeup_enable()"
to wakeup-enable the device accordingly.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the system will be woken up via WOL(Wake On LAN) even if the
device wakeup ability has been disabled via sysfs:
cat /sys/devices/pci0000:00/0000:00:1f.6/power/wakeup
disabled
The system should not be woken up if the user has explicitly
disabled the wake up ability for this device.
This patch clears the WOL ability of this network device if the
user has disabled the wake up ability in sysfs.
Fixes: bc7f75fa97 ("[E1000E]: New pci-express e1000 driver")
Reported-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
kernel build system used to add -mcpu for each ARC ISA as default.
These days there are versions and varaints of ARC HS cores some of which
have specific -mcpu options to fine tune / optimize generated code.
So allow users/external build systems to specify their own -mcpu
This will be used in future patches for HSDK-4xD board support which
uses specific -mcpu to utilize dual issue scheduling of the core.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[abrodkin/vgupta: rewrote changelog]
Without a MODULE_DEVICE_TABLE the attributes are missing that create
an alias for auto-loading the module in userspace via hotplug.
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix AFS's silly rename by the following means:
(1) Set the destination directory in afs_do_silly_rename() so as to avoid
misbehaviour and indicate that the directory data version will
increment by 1 so as to avoid warnings about unexpected changes in the
DV. Also indicate that the ctime should be updated to avoid xfstest
grumbling.
(2) Note when the server indicates that a directory changed more than we
expected (AFS_OPERATION_DIR_CONFLICT), indicating a conflict with a
third party change, checking on successful completion of unlink and
rename.
The problem is that the FS.RemoveFile RPC op doesn't report the status
of the unlinked file, though YFS.RemoveFile2 does. This can be
mitigated by the assumption that if the directory DV cranked by
exactly 1, we can be sure we removed one link from the file; further,
ordinarily in AFS, files cannot be hardlinked across directories, so
if we reduce nlink to 0, the file is deleted.
However, if the directory DV jumps by more than 1, we cannot know if a
third party intervened by adding or removing a link on the file we
just removed a link from.
The same also goes for any vnode that is at the destination of the
FS.Rename RPC op.
(3) Make afs_vnode_commit_status() apply the nlink drop inside the cb_lock
section along with the other attribute updates if ->op_unlinked is set
on the descriptor for the appropriate vnode.
(4) Issue a follow up status fetch to the unlinked file in the event of a
third party conflict that makes it impossible for us to know if we
actually deleted the file or not.
(5) Provide a flag, AFS_VNODE_SILLY_DELETED, to make afs_getattr() lie to
the user about the nlink of a silly deleted file so that it appears as
0, not 1.
Found with the generic/035 and generic/084 xfstests.
Fixes: e49c7b2f6d ("afs: Build an abstraction around an "operation" concept")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
The port's headroom buffers are used to store packets while they
traverse the device's pipeline and also to store packets that are egress
mirrored.
On Spectrum-3, ports with eight lanes use two headroom buffers between
which the configured headroom size is split.
In order to prevent packet loss, multiply the calculated headroom size
by two for 8x ports.
Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Code to initialize the conf structure while gathering the configuration
of the device was missing.
Fixes: 571912c69f ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Martin <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The remove function does not destroy all
BM Pools when per cpu pool is active.
When reloading the mvpp2 as a module the BM Pools
are still active in hardware and due to the bug
have twice the size now old + new.
This eventually leads to a kernel crash.
v2:
* add Fixes tag
Fixes: 7d04b0b13b ("mvpp2: percpu buffers")
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Back in 2013, we made a change that broke fast retransmit
for non SACK flows.
Indeed, for these flows, a sender needs to receive three duplicate
ACK before starting fast retransmit. Sending ACK with different
receive window do not count.
Even if enabling SACK is strongly recommended these days,
there still are some cases where it has to be disabled.
Not increasing the window seems better than having to
rely on RTO.
After the fix, following packetdrill test gives :
// Initialize connection
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 8>
+0 < . 1:1(0) ack 1 win 514
+0 accept(3, ..., ...) = 4
+0 < . 1:1001(1000) ack 1 win 514
// Quick ack
+0 > . 1:1(0) ack 1001 win 264
+0 < . 2001:3001(1000) ack 1 win 514
// DUPACK : Normally we should not change the window
+0 > . 1:1(0) ack 1001 win 264
+0 < . 3001:4001(1000) ack 1 win 514
// DUPACK : Normally we should not change the window
+0 > . 1:1(0) ack 1001 win 264
+0 < . 4001:5001(1000) ack 1 win 514
// DUPACK : Normally we should not change the window
+0 > . 1:1(0) ack 1001 win 264
+0 < . 1001:2001(1000) ack 1 win 514
// Hole is repaired.
+0 > . 1:1(0) ack 5001 win 272
Fixes: 4e4f1fc226 ("tcp: properly increase rcv_ssthresh for ofo packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unable to handle kernel NULL pointer dereference at virtual address 00000918
pgd = (ptrval)
[00000918] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-15001-gfa384b50b96b-dirty #514
Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
PC is at mcde_display_enable+0x78/0x7c0
LR is at mcde_display_enable+0x78/0x7c0
Fix this by using to_mcde() as in other functions.
Fixes: fd7ee85cfe ("drm/mcde: Don't use drm_device->dev_private")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200613223027.4189309-2-linus.walleij@linaro.org
The following bug appeared in the MCDE driver/display
initialization during the recent merge window.
First the place we call drm_fbdev_generic_setup() in the
wrong place: this needs to be called AFTER calling
drm_dev_register() else we get this splat:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at ../drivers/gpu/drm/drm_fb_helper.c:2198 drm_fbdev_generic_setup+0x164/0x1a8
mcde a0350000.mcde: Device has not been registered.
Modules linked in:
Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
[<c010e704>] (unwind_backtrace) from [<c010a86c>] (show_stack+0x10/0x14)
[<c010a86c>] (show_stack) from [<c0414f38>] (dump_stack+0x9c/0xb0)
[<c0414f38>] (dump_stack) from [<c0121c8c>] (__warn+0xb8/0xd0)
[<c0121c8c>] (__warn) from [<c0121d18>] (warn_slowpath_fmt+0x74/0xb8)
[<c0121d18>] (warn_slowpath_fmt) from [<c04b154c>] (drm_fbdev_generic_setup+0x164/0x1a8)
[<c04b154c>] (drm_fbdev_generic_setup) from [<c04ed278>] (mcde_drm_bind+0xc4/0x160)
[<c04ed278>] (mcde_drm_bind) from [<c04f06b8>] (try_to_bring_up_master+0x15c/0x1a4)
(...)
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200613223027.4189309-1-linus.walleij@linaro.org
Trap handler for syscall tracing reads EFA (Exception Fault Address),
in case strace wants PC of trap instruction (EFA is not part of pt_regs
as of current code).
However this EFA read is racy as it happens after dropping to pure
kernel mode (re-enabling interrupts). A taken interrupt could
context-switch, trigger a different task's trap, clobbering EFA for this
execution context.
Fix this by reading EFA early, before re-enabling interrupts. A slight
side benefit is de-duplication of FAKE_RET_FROM_EXCPN in trap handler.
The trap handler is common to both ARCompact and ARCv2 builds too.
This just came out of code rework/review and no real problem was reported
but is clearly a potential problem specially for strace.
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The function hif_scan() return the timeout for the completion of the
scan request. It is the only function from hif_tx.c that return another
thing than just an error code. This behavior is not coherent with the
rest of file. Worse, if value returned is positive, the caller can't
make say if it is a timeout or the value returned by the hardware.
Uniformize API with other HIF functions, only return the error code and
pass timeout with parameters.
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200529121256.1045521-1-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In order to work properly all the queues of the device must be filled (the
device chooses itself the queue to use depending of AC parameters and
other things). It is the job of wfx_tx_queues_get_skb() to choose which
queue must be filled. However, the sorting algorithm was inverted, so it
prioritized the already filled queue! Consequently, the AC priorities was
badly broken.
Fixes: 6bf418c50f ("staging: wfx: change the way to choose frame to send")
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200529121603.1050891-1-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull MFD fix from Lee Jones:
"Fix NULL pointer dereference in mt6360 driver"
* tag 'mfd-fixes-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: mt6360: Fix register driver NULL pointer by adding driver name
When I squashed the 'allnoconfig' compiler warning about the
set_sve_default_vl() function being defined but not used in commit
1e570f512c ("arm64/sve: Eliminate data races on sve_default_vl"), I
accidentally broke the build for configs where ARM64_SVE is enabled, but
SYSCTL is not.
Fix this by only compiling the SVE sysctl support if both CONFIG_SVE=y
and CONFIG_SYSCTL=y.
Cc: Dave Martin <Dave.Martin@arm.com>
Reported-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/20200616131808.GA1040@lca.pw
Signed-off-by: Will Deacon <will@kernel.org>
In btrfs_ioctl_get_subvol_info(), there is a classic case where kzalloc()
was incorrectly paired with kzfree(). According to David Sterba, there
isn't any sensitive information in the subvol_info that needs to be
cleared before freeing. So kzfree() isn't really needed, use kfree()
instead.
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
A RWF_NOWAIT write is not supposed to wait on filesystem locks that can be
held for a long time or for ongoing IO to complete.
However when calling check_can_nocow(), if the inode has prealloc extents
or has the NOCOW flag set, we can block on extent (file range) locks
through the call to btrfs_lock_and_flush_ordered_range(). Such lock can
take a significant amount of time to be available. For example, a fiemap
task may be running, and iterating through the entire file range checking
all extents and doing backref walking to determine if they are shared,
or a readpage operation may be in progress.
Also at btrfs_lock_and_flush_ordered_range(), called by check_can_nocow(),
after locking the file range we wait for any existing ordered extent that
is in progress to complete. Another operation that can take a significant
amount of time and defeat the purpose of RWF_NOWAIT.
So fix this by trying to lock the file range and if it's currently locked
return -EAGAIN to user space. If we are able to lock the file range without
waiting and there is an ordered extent in the range, return -EAGAIN as
well, instead of waiting for it to complete. Finally, don't bother trying
to lock the snapshot lock of the root when attempting a RWF_NOWAIT write,
as that is only important for buffered writes.
Fixes: edf064e7c6 ("btrfs: nowait aio support")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If we attempt to do a RWF_NOWAIT write against a file range for which we
can only do NOCOW for a part of it, due to the existence of holes or
shared extents for example, we proceed with the write as if it were
possible to NOCOW the whole range.
Example:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ touch /mnt/sdj/bar
$ chattr +C /mnt/sdj/bar
$ xfs_io -d -c "pwrite -S 0xab -b 256K 0 256K" /mnt/bar
wrote 262144/262144 bytes at offset 0
256 KiB, 1 ops; 0.0003 sec (694.444 MiB/sec and 2777.7778 ops/sec)
$ xfs_io -c "fpunch 64K 64K" /mnt/bar
$ sync
$ xfs_io -d -c "pwrite -N -V 1 -b 128K -S 0xfe 0 128K" /mnt/bar
wrote 131072/131072 bytes at offset 0
128 KiB, 1 ops; 0.0007 sec (160.051 MiB/sec and 1280.4097 ops/sec)
This last write should fail with -EAGAIN since the file range from 64K to
128K is a hole. On xfs it fails, as expected, but on ext4 it currently
succeeds because apparently it is expensive to check if there are extents
allocated for the whole range, but I'll check with the ext4 people.
Fix the issue by checking if check_can_nocow() returns a number of
NOCOW'able bytes smaller then the requested number of bytes, and if it
does return -EAGAIN.
Fixes: edf064e7c6 ("btrfs: nowait aio support")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If we attempt to write to prealloc extent located after eof using a
RWF_NOWAIT write, we always fail with -EAGAIN.
We do actually check if we have an allocated extent for the write at
the start of btrfs_file_write_iter() through a call to check_can_nocow(),
but later when we go into the actual direct IO write path we simply
return -EAGAIN if the write starts at or beyond EOF.
Trivial to reproduce:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ touch /mnt/foo
$ chattr +C /mnt/foo
$ xfs_io -d -c "pwrite -S 0xab 0 64K" /mnt/foo
wrote 65536/65536 bytes at offset 0
64 KiB, 16 ops; 0.0004 sec (135.575 MiB/sec and 34707.1584 ops/sec)
$ xfs_io -c "falloc -k 64K 1M" /mnt/foo
$ xfs_io -d -c "pwrite -N -V 1 -S 0xfe -b 64K 64K 64K" /mnt/foo
pwrite: Resource temporarily unavailable
On xfs and ext4 the write succeeds, as expected.
Fix this by removing the wrong check at btrfs_direct_IO().
Fixes: edf064e7c6 ("btrfs: nowait aio support")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If we do a successful RWF_NOWAIT write we end up locking the snapshot lock
of the inode, through a call to check_can_nocow(), but we never unlock it.
This means the next attempt to create a snapshot on the subvolume will
hang forever.
Trivial reproducer:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ touch /mnt/foobar
$ chattr +C /mnt/foobar
$ xfs_io -d -c "pwrite -S 0xab 0 64K" /mnt/foobar
$ xfs_io -d -c "pwrite -N -V 1 -S 0xfe 0 64K" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/snap
--> hangs
Fix this by unlocking the snapshot lock if check_can_nocow() returned
success.
Fixes: edf064e7c6 ("btrfs: nowait aio support")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This brings back an optimization that commit e678934cbe ("btrfs:
Remove unnecessary check from join_running_log_trans") removed, but in
a different form. So it's almost equivalent to a revert.
That commit removed an optimization where we avoid locking a root's
log_mutex when there is no log tree created in the current transaction.
The affected code path is triggered through unlink operations.
That commit was based on the assumption that the optimization was not
necessary because we used to have the following checks when the patch
was authored:
int btrfs_del_dir_entries_in_log(...)
{
(...)
if (dir->logged_trans < trans->transid)
return 0;
ret = join_running_log_trans(root);
(...)
}
int btrfs_del_inode_ref_in_log(...)
{
(...)
if (inode->logged_trans < trans->transid)
return 0;
ret = join_running_log_trans(root);
(...)
}
However before that patch was merged, another patch was merged first which
replaced those checks because they were buggy.
That other patch corresponds to commit 803f0f64d1 ("Btrfs: fix fsync
not persisting dentry deletions due to inode evictions"). The assumption
that if the logged_trans field of an inode had a smaller value then the
current transaction's generation (transid) meant that the inode was not
logged in the current transaction was only correct if the inode was not
evicted and reloaded in the current transaction. So the corresponding bug
fix changed those checks and replaced them with the following helper
function:
static bool inode_logged(struct btrfs_trans_handle *trans,
struct btrfs_inode *inode)
{
if (inode->logged_trans == trans->transid)
return true;
if (inode->last_trans == trans->transid &&
test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags) &&
!test_bit(BTRFS_FS_LOG_RECOVERING, &trans->fs_info->flags))
return true;
return false;
}
So if we have a subvolume without a log tree in the current transaction
(because we had no fsyncs), every time we unlink an inode we can end up
trying to lock the log_mutex of the root through join_running_log_trans()
twice, once for the inode being unlinked (by btrfs_del_inode_ref_in_log())
and once for the parent directory (with btrfs_del_dir_entries_in_log()).
This means if we have several unlink operations happening in parallel for
inodes in the same subvolume, and the those inodes and/or their parent
inode were changed in the current transaction, we end up having a lot of
contention on the log_mutex.
The test robots from intel reported a -30.7% performance regression for
a REAIM test after commit e678934cbe ("btrfs: Remove unnecessary check
from join_running_log_trans").
So just bring back the optimization to join_running_log_trans() where we
check first if a log root exists before trying to lock the log_mutex. This
is done by checking for a bit that is set on the root when a log tree is
created and removed when a log tree is freed (at transaction commit time).
Commit e678934cbe ("btrfs: Remove unnecessary check from
join_running_log_trans") was merged in the 5.4 merge window while commit
803f0f64d1 ("Btrfs: fix fsync not persisting dentry deletions due to
inode evictions") was merged in the 5.3 merge window. But the first
commit was actually authored before the second commit (May 23 2019 vs
June 19 2019).
Reported-by: kernel test robot <rong.a.chen@intel.com>
Link: https://lore.kernel.org/lkml/20200611090233.GL12456@shao2-debian/
Fixes: e678934cbe ("btrfs: Remove unnecessary check from join_running_log_trans")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When balance and scrub are running in parallel it is possible to end up
with an underflow of the bytes_may_use counter of the data space_info
object, which triggers a warning like the following:
[134243.793196] BTRFS info (device sdc): relocating block group 1104150528 flags data
[134243.806891] ------------[ cut here ]------------
[134243.807561] WARNING: CPU: 1 PID: 26884 at fs/btrfs/space-info.h:125 btrfs_add_reserved_bytes+0x1da/0x280 [btrfs]
[134243.808819] Modules linked in: btrfs blake2b_generic xor (...)
[134243.815779] CPU: 1 PID: 26884 Comm: kworker/u8:8 Tainted: G W 5.6.0-rc7-btrfs-next-58 #5
[134243.816944] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[134243.818389] Workqueue: writeback wb_workfn (flush-btrfs-108483)
[134243.819186] RIP: 0010:btrfs_add_reserved_bytes+0x1da/0x280 [btrfs]
[134243.819963] Code: 0b f2 85 (...)
[134243.822271] RSP: 0018:ffffa4160aae7510 EFLAGS: 00010287
[134243.822929] RAX: 000000000000c000 RBX: ffff96159a8c1000 RCX: 0000000000000000
[134243.823816] RDX: 0000000000008000 RSI: 0000000000000000 RDI: ffff96158067a810
[134243.824742] RBP: ffff96158067a800 R08: 0000000000000001 R09: 0000000000000000
[134243.825636] R10: ffff961501432a40 R11: 0000000000000000 R12: 000000000000c000
[134243.826532] R13: 0000000000000001 R14: ffffffffffff4000 R15: ffff96158067a810
[134243.827432] FS: 0000000000000000(0000) GS:ffff9615baa00000(0000) knlGS:0000000000000000
[134243.828451] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[134243.829184] CR2: 000055bd7e414000 CR3: 00000001077be004 CR4: 00000000003606e0
[134243.830083] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[134243.830975] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[134243.831867] Call Trace:
[134243.832211] find_free_extent+0x4a0/0x16c0 [btrfs]
[134243.832846] btrfs_reserve_extent+0x91/0x180 [btrfs]
[134243.833487] cow_file_range+0x12d/0x490 [btrfs]
[134243.834080] fallback_to_cow+0x82/0x1b0 [btrfs]
[134243.834689] ? release_extent_buffer+0x121/0x170 [btrfs]
[134243.835370] run_delalloc_nocow+0x33f/0xa30 [btrfs]
[134243.836032] btrfs_run_delalloc_range+0x1ea/0x6d0 [btrfs]
[134243.836725] ? find_lock_delalloc_range+0x221/0x250 [btrfs]
[134243.837450] writepage_delalloc+0xe8/0x150 [btrfs]
[134243.838059] __extent_writepage+0xe8/0x4c0 [btrfs]
[134243.838674] extent_write_cache_pages+0x237/0x530 [btrfs]
[134243.839364] extent_writepages+0x44/0xa0 [btrfs]
[134243.839946] do_writepages+0x23/0x80
[134243.840401] __writeback_single_inode+0x59/0x700
[134243.841006] writeback_sb_inodes+0x267/0x5f0
[134243.841548] __writeback_inodes_wb+0x87/0xe0
[134243.842091] wb_writeback+0x382/0x590
[134243.842574] ? wb_workfn+0x4a2/0x6c0
[134243.843030] wb_workfn+0x4a2/0x6c0
[134243.843468] process_one_work+0x26d/0x6a0
[134243.843978] worker_thread+0x4f/0x3e0
[134243.844452] ? process_one_work+0x6a0/0x6a0
[134243.844981] kthread+0x103/0x140
[134243.845400] ? kthread_create_worker_on_cpu+0x70/0x70
[134243.846030] ret_from_fork+0x3a/0x50
[134243.846494] irq event stamp: 0
[134243.846892] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[134243.847682] hardirqs last disabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
[134243.848687] softirqs last enabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
[134243.849913] softirqs last disabled at (0): [<0000000000000000>] 0x0
[134243.850698] ---[ end trace bd7c03622e0b0a96 ]---
[134243.851335] ------------[ cut here ]------------
When relocating a data block group, for each extent allocated in the
block group we preallocate another extent with the same size for the
data relocation inode (we do it at prealloc_file_extent_cluster()).
We reserve space by calling btrfs_check_data_free_space(), which ends
up incrementing the data space_info's bytes_may_use counter, and
then call btrfs_prealloc_file_range() to allocate the extent, which
always decrements the bytes_may_use counter by the same amount.
The expectation is that writeback of the data relocation inode always
follows a NOCOW path, by writing into the preallocated extents. However,
when starting writeback we might end up falling back into the COW path,
because the block group that contains the preallocated extent was turned
into RO mode by a scrub running in parallel. The COW path then calls the
extent allocator which ends up calling btrfs_add_reserved_bytes(), and
this function decrements the bytes_may_use counter of the data space_info
object by an amount corresponding to the size of the allocated extent,
despite we haven't previously incremented it. When the counter currently
has a value smaller then the allocated extent we reset the counter to 0
and emit a warning, otherwise we just decrement it and slowly mess up
with this counter which is crucial for space reservation, the end result
can be granting reserved space to tasks when there isn't really enough
free space, and having the tasks fail later in critical places where
error handling consists of a transaction abort or hitting a BUG_ON().
Fix this by making sure that if we fallback to the COW path for a data
relocation inode, we increment the bytes_may_use counter of the data
space_info object. The COW path will then decrement it at
btrfs_add_reserved_bytes() on success or through its error handling part
by a call to extent_clear_unlock_delalloc() (which ends up calling
btrfs_clear_delalloc_extent() that does the decrement operation) in case
of an error.
Test case btrfs/061 from fstests could sporadically trigger this.
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When running relocation of a data block group while scrub is running in
parallel, it is possible that the relocation will fail and abort the
current transaction with an -EINVAL error:
[134243.988595] BTRFS info (device sdc): found 14 extents, stage: move data extents
[134243.999871] ------------[ cut here ]------------
[134244.000741] BTRFS: Transaction aborted (error -22)
[134244.001692] WARNING: CPU: 0 PID: 26954 at fs/btrfs/ctree.c:1071 __btrfs_cow_block+0x6a7/0x790 [btrfs]
[134244.003380] Modules linked in: btrfs blake2b_generic xor raid6_pq (...)
[134244.012577] CPU: 0 PID: 26954 Comm: btrfs Tainted: G W 5.6.0-rc7-btrfs-next-58 #5
[134244.014162] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[134244.016184] RIP: 0010:__btrfs_cow_block+0x6a7/0x790 [btrfs]
[134244.017151] Code: 48 c7 c7 (...)
[134244.020549] RSP: 0018:ffffa41607863888 EFLAGS: 00010286
[134244.021515] RAX: 0000000000000000 RBX: ffff9614bdfe09c8 RCX: 0000000000000000
[134244.022822] RDX: 0000000000000001 RSI: ffffffffb3d63980 RDI: 0000000000000001
[134244.024124] RBP: ffff961589e8c000 R08: 0000000000000000 R09: 0000000000000001
[134244.025424] R10: ffffffffc0ae5955 R11: 0000000000000000 R12: ffff9614bd530d08
[134244.026725] R13: ffff9614ced41b88 R14: ffff9614bdfe2a48 R15: 0000000000000000
[134244.028024] FS: 00007f29b63c08c0(0000) GS:ffff9615ba600000(0000) knlGS:0000000000000000
[134244.029491] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[134244.030560] CR2: 00007f4eb339b000 CR3: 0000000130d6e006 CR4: 00000000003606f0
[134244.031997] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[134244.033153] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[134244.034484] Call Trace:
[134244.034984] btrfs_cow_block+0x12b/0x2b0 [btrfs]
[134244.035859] do_relocation+0x30b/0x790 [btrfs]
[134244.036681] ? do_raw_spin_unlock+0x49/0xc0
[134244.037460] ? _raw_spin_unlock+0x29/0x40
[134244.038235] relocate_tree_blocks+0x37b/0x730 [btrfs]
[134244.039245] relocate_block_group+0x388/0x770 [btrfs]
[134244.040228] btrfs_relocate_block_group+0x161/0x2e0 [btrfs]
[134244.041323] btrfs_relocate_chunk+0x36/0x110 [btrfs]
[134244.041345] btrfs_balance+0xc06/0x1860 [btrfs]
[134244.043382] ? btrfs_ioctl_balance+0x27c/0x310 [btrfs]
[134244.045586] btrfs_ioctl_balance+0x1ed/0x310 [btrfs]
[134244.045611] btrfs_ioctl+0x1880/0x3760 [btrfs]
[134244.049043] ? do_raw_spin_unlock+0x49/0xc0
[134244.049838] ? _raw_spin_unlock+0x29/0x40
[134244.050587] ? __handle_mm_fault+0x11b3/0x14b0
[134244.051417] ? ksys_ioctl+0x92/0xb0
[134244.052070] ksys_ioctl+0x92/0xb0
[134244.052701] ? trace_hardirqs_off_thunk+0x1a/0x1c
[134244.053511] __x64_sys_ioctl+0x16/0x20
[134244.054206] do_syscall_64+0x5c/0x280
[134244.054891] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[134244.055819] RIP: 0033:0x7f29b51c9dd7
[134244.056491] Code: 00 00 00 (...)
[134244.059767] RSP: 002b:00007ffcccc1dd08 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[134244.061168] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f29b51c9dd7
[134244.062474] RDX: 00007ffcccc1dda0 RSI: 00000000c4009420 RDI: 0000000000000003
[134244.063771] RBP: 0000000000000003 R08: 00005565cea4b000 R09: 0000000000000000
[134244.065032] R10: 0000000000000541 R11: 0000000000000202 R12: 00007ffcccc2060a
[134244.066327] R13: 00007ffcccc1dda0 R14: 0000000000000002 R15: 00007ffcccc1dec0
[134244.067626] irq event stamp: 0
[134244.068202] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[134244.069351] hardirqs last disabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
[134244.070909] softirqs last enabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
[134244.072392] softirqs last disabled at (0): [<0000000000000000>] 0x0
[134244.073432] ---[ end trace bd7c03622e0b0a99 ]---
The -EINVAL error comes from the following chain of function calls:
__btrfs_cow_block() <-- aborts the transaction
btrfs_reloc_cow_block()
replace_file_extents()
get_new_location() <-- returns -EINVAL
When relocating a data block group, for each allocated extent of the block
group, we preallocate another extent (at prealloc_file_extent_cluster()),
associated with the data relocation inode, and then dirty all its pages.
These preallocated extents have, and must have, the same size that extents
from the data block group being relocated have.
Later before we start the relocation stage that updates pointers (bytenr
field of file extent items) to point to the the new extents, we trigger
writeback for the data relocation inode. The expectation is that writeback
will write the pages to the previously preallocated extents, that it
follows the NOCOW path. That is generally the case, however, if a scrub
is running it may have turned the block group that contains those extents
into RO mode, in which case writeback falls back to the COW path.
However in the COW path instead of allocating exactly one extent with the
expected size, the allocator may end up allocating several smaller extents
due to free space fragmentation - because we tell it at cow_file_range()
that the minimum allocation size can match the filesystem's sector size.
This later breaks the relocation's expectation that an extent associated
to a file extent item in the data relocation inode has the same size as
the respective extent pointed by a file extent item in another tree - in
this case the extent to which the relocation inode poins to is smaller,
causing relocation.c:get_new_location() to return -EINVAL.
For example, if we are relocating a data block group X that has a logical
address of X and the block group has an extent allocated at the logical
address X + 128KiB with a size of 64KiB:
1) At prealloc_file_extent_cluster() we allocate an extent for the data
relocation inode with a size of 64KiB and associate it to the file
offset 128KiB (X + 128KiB - X) of the data relocation inode. This
preallocated extent was allocated at block group Z;
2) A scrub running in parallel turns block group Z into RO mode and
starts scrubing its extents;
3) Relocation triggers writeback for the data relocation inode;
4) When running delalloc (btrfs_run_delalloc_range()), we try first the
NOCOW path because the data relocation inode has BTRFS_INODE_PREALLOC
set in its flags. However, because block group Z is in RO mode, the
NOCOW path (run_delalloc_nocow()) falls back into the COW path, by
calling cow_file_range();
5) At cow_file_range(), in the first iteration of the while loop we call
btrfs_reserve_extent() to allocate a 64KiB extent and pass it a minimum
allocation size of 4KiB (fs_info->sectorsize). Due to free space
fragmentation, btrfs_reserve_extent() ends up allocating two extents
of 32KiB each, each one on a different iteration of that while loop;
6) Writeback of the data relocation inode completes;
7) Relocation proceeds and ends up at relocation.c:replace_file_extents(),
with a leaf which has a file extent item that points to the data extent
from block group X, that has a logical address (bytenr) of X + 128KiB
and a size of 64KiB. Then it calls get_new_location(), which does a
lookup in the data relocation tree for a file extent item starting at
offset 128KiB (X + 128KiB - X) and belonging to the data relocation
inode. It finds a corresponding file extent item, however that item
points to an extent that has a size of 32KiB, which doesn't match the
expected size of 64KiB, resuling in -EINVAL being returned from this
function and propagated up to __btrfs_cow_block(), which aborts the
current transaction.
To fix this make sure that at cow_file_range() when we call the allocator
we pass it a minimum allocation size corresponding the desired extent size
if the inode belongs to the data relocation tree, otherwise pass it the
filesystem's sector size as the minimum allocation size.
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There is a race between block group removal and block group creation
when the removal is completed by a task running fitrim or scrub. When
this happens we end up failing the block group creation with an error
-EEXIST since we attempt to insert a duplicate block group item key
in the extent tree. That results in a transaction abort.
The race happens like this:
1) Task A is doing a fitrim, and at btrfs_trim_block_group() it freezes
block group X with btrfs_freeze_block_group() (until very recently
that was named btrfs_get_block_group_trimming());
2) Task B starts removing block group X, either because it's now unused
or due to relocation for example. So at btrfs_remove_block_group(),
while holding the chunk mutex and the block group's lock, it sets
the 'removed' flag of the block group and it sets the local variable
'remove_em' to false, because the block group is currently frozen
(its 'frozen' counter is > 0, until very recently this counter was
named 'trimming');
3) Task B unlocks the block group and the chunk mutex;
4) Task A is done trimming the block group and unfreezes the block group
by calling btrfs_unfreeze_block_group() (until very recently this was
named btrfs_put_block_group_trimming()). In this function we lock the
block group and set the local variable 'cleanup' to true because we
were able to decrement the block group's 'frozen' counter down to 0 and
the flag 'removed' is set in the block group.
Since 'cleanup' is set to true, it locks the chunk mutex and removes
the extent mapping representing the block group from the mapping tree;
5) Task C allocates a new block group Y and it picks up the logical address
that block group X had as the logical address for Y, because X was the
block group with the highest logical address and now the second block
group with the highest logical address, the last in the fs mapping tree,
ends at an offset corresponding to block group X's logical address (this
logical address selection is done at volumes.c:find_next_chunk()).
At this point the new block group Y does not have yet its item added
to the extent tree (nor the corresponding device extent items and
chunk item in the device and chunk trees). The new group Y is added to
the list of pending block groups in the transaction handle;
6) Before task B proceeds to removing the block group item for block
group X from the extent tree, which has a key matching:
(X logical offset, BTRFS_BLOCK_GROUP_ITEM_KEY, length)
task C while ending its transaction handle calls
btrfs_create_pending_block_groups(), which finds block group Y and
tries to insert the block group item for Y into the exten tree, which
fails with -EEXIST since logical offset is the same that X had and
task B hasn't yet deleted the key from the extent tree.
This failure results in a transaction abort, producing a stack like
the following:
------------[ cut here ]------------
BTRFS: Transaction aborted (error -17)
WARNING: CPU: 2 PID: 19736 at fs/btrfs/block-group.c:2074 btrfs_create_pending_block_groups+0x1eb/0x260 [btrfs]
Modules linked in: btrfs blake2b_generic xor raid6_pq (...)
CPU: 2 PID: 19736 Comm: fsstress Tainted: G W 5.6.0-rc7-btrfs-next-58 #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_create_pending_block_groups+0x1eb/0x260 [btrfs]
Code: ff ff ff 48 8b 55 50 f0 48 (...)
RSP: 0018:ffffa4160a1c7d58 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff961581909d98 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffffb3d63990 RDI: 0000000000000001
RBP: ffff9614f3356a58 R08: 0000000000000000 R09: 0000000000000001
R10: ffff9615b65b0040 R11: 0000000000000000 R12: ffff961581909c10
R13: ffff9615b0c32000 R14: ffff9614f3356ab0 R15: ffff9614be779000
FS: 00007f2ce2841e80(0000) GS:ffff9615bae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555f18780000 CR3: 0000000131d34005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_start_dirty_block_groups+0x398/0x4e0 [btrfs]
btrfs_commit_transaction+0xd0/0xc50 [btrfs]
? btrfs_attach_transaction_barrier+0x1e/0x50 [btrfs]
? __ia32_sys_fdatasync+0x20/0x20
iterate_supers+0xdb/0x180
ksys_sync+0x60/0xb0
__ia32_sys_sync+0xa/0x10
do_syscall_64+0x5c/0x280
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f2ce1d4d5b7
Code: 83 c4 08 48 3d 01 (...)
RSP: 002b:00007ffd8b558c58 EFLAGS: 00000202 ORIG_RAX: 00000000000000a2
RAX: ffffffffffffffda RBX: 000000000000002c RCX: 00007f2ce1d4d5b7
RDX: 00000000ffffffff RSI: 00000000186ba07b RDI: 000000000000002c
RBP: 0000555f17b9e520 R08: 0000000000000012 R09: 000000000000ce00
R10: 0000000000000078 R11: 0000000000000202 R12: 0000000000000032
R13: 0000000051eb851f R14: 00007ffd8b558cd0 R15: 0000555f1798ec20
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
softirqs last enabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace bd7c03622e0b0a9c ]---
Fix this simply by making btrfs_remove_block_group() remove the block
group's item from the extent tree before it flags the block group as
removed. Also make the free space deletion from the free space tree
before flagging the block group as removed, to avoid a similar race
with adding and removing free space entries for the free space tree.
Fixes: 04216820fe ("Btrfs: fix race between fs trimming and block group remove/allocation")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When removing a block group, if we fail to delete the block group's item
from the extent tree, we jump to the 'out' label and end up decrementing
the block group's reference count once only (by 1), resulting in a counter
leak because the block group at that point was already removed from the
block group cache rbtree - so we have to decrement the reference count
twice, once for the rbtree and once for our lookup at the start of the
function.
There is a second bug where if removing the free space tree entries (the
call to remove_block_group_free_space()) fails we end up jumping to the
'out_put_group' label but end up decrementing the reference count only
once, when we should have done it twice, since we have already removed
the block group from the block group cache rbtree. This happens because
the reference count decrement for the rbtree reference happens after
attempting to remove the free space tree entries, which is far away from
the place where we remove the block group from the rbtree.
To make things less error prone, decrement the reference count for the
rbtree immediately after removing the block group from it. This also
eleminates the need for two different exit labels on error, renaming
'out_put_label' to just 'out' and removing the old 'out'.
Fixes: f6033c5e33 ("btrfs: fix block group leak when removing fails")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As of commit 4dc55525b0 ("drm: plane: Verify that no or all planes
have a zpos property") a warning is emitted if there's a mix of planes
with and without a zpos property.
On Tegra, cursor planes are always composited on top of all other
planes, which is why they never had a zpos property attached to them.
However, since the composition order is fixed, this is trivial to
remedy by simply attaching an immutable zpos property to them.
v3: do not hardcode zpos for overlay planes used as cursor (Dmitry)
v2: hardcode cursor plane zpos to 255 instead of 0 (Ville)
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Currently when a host1x device driver is unregistered, it is not
detached from the host1x controller, which means that the device
will stay around and when the driver is registered again, it may
bind to the old, stale device rather than the new one that was
created from scratch upon driver registration. This in turn can
cause various weird crashes within the driver core because it is
confronted with a device that was already deleted.
Fix this by detaching the driver from the host1x controller when
it is unregistered. This ensures that the deleted device also is
no longer present in the device list that drivers will bind to.
Reported-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Silence documentation build warnings by adding kernel-doc fields.
./include/linux/host1x.h:69: warning: Function parameter or member 'parent' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'usecount' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'lock' not described in 'host1x_client'
Signed-off-by: Colton Lewis <colton.w.lewis@protonmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add ":README" suffix support for the requires list, so that
the testcase can list up the required string for README file
to the requires list.
Note that the required string is treated as a fixed string,
instead of regular expression. Also, the testcase can specify
a string containing spaces with quotes. E.g.
# requires: "place: [<module>:]<symbol>":README
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add ":tracer" suffix support for the requires list, so that
the testcase can list up the required tracer (e.g. function)
to the requires list.
For example, if the testcase requires function_graph tracer,
it can write requires list as below instead of checking
available_tracers.
# requires: function_graph:tracer
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Since check_filter_file() is basically checking the filter
tracefs file, we can convert it into requires list.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Commit cca98e9f8b ("mm: enforce that vmap can't map pages executable")
introduced 'pgprot_nx(prot)' for arm64 but collided silently with the
BTI support during the merge window, which endeavours to clear the GP
bit for non-executable kernel mappings in set_memory_nx().
For consistency between the two APIs, clear the GP bit in pgprot_nx().
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20200615154642.3579-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Since commit cd28d1d6e5 ("net: phy: at803x: Disable phy delay for
RGMII mode") the networking is broken on the BeagleBone AI which has
the AR8035 PHY for Gigabit Ethernet [0]. The fix is to switch from
phy-mode = "rgmii" to phy-mode = "rgmii-rxid".
Note: Grygorii made a similar DT fix for other AM57xx boards with a
different phy in commit 820f8a870f ("ARM: dts: am57xx: fix networking
on boards with ksz9031 phy").
[0] https://git.io/Jf7PX
Fixes: 520557d485 ("ARM: dts: am5729: beaglebone-ai: adding device tree")
Cc: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Drew Fustini <drew@beagleboard.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
I accidentally flipped the system timer to use system clock instead of
the 32k source clock.
Fixes: 14b1925a72 ("ARM: dts: Configure system timers for omap4")
Signed-off-by: Tony Lindgren <tony@atomide.com>
While testing the recent suspend and resume regressions I noticed that
duovero can still end up losing edge gpio interrupts on runtime
suspend. This causes NFSroot easily stopping working after resume on
duovero.
Let's fix the issue by using gpio level interrupts for smsc as then
the gpio interrupt state is seen by the gpio controller on resume.
Fixes: 731b409878 ("ARM: dts: Configure duovero for to allow core retention during idle")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Add support for devices which that have reports with id == 2
Signed-off-by: Caiyuan Xie <caiyuan.xie@cn.alps.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
afs_vnode_commit_status() is only ever called if op->error is 0, so remove
the op->error checks from the function.
Fixes: e49c7b2f6d ("afs: Build an abstraction around an "operation" concept")
Signed-off-by: David Howells <dhowells@redhat.com>
afs_check_for_remote_deletion() checks to see if error ENOENT is returned
by the server in response to an operation and, if so, marks the primary
vnode as having been deleted as the FID is no longer valid.
However, it's being called from the operation success functions, where no
abort has happened - and if an inline abort is recorded, it's handled by
afs_vnode_commit_status().
Fix this by actually calling the operation aborted method if provided and
having that point to afs_check_for_remote_deletion().
Fixes: e49c7b2f6d ("afs: Build an abstraction around an "operation" concept")
Signed-off-by: David Howells <dhowells@redhat.com>
Fix yfs_fs_fetch_status() to honour the vnode selector in
op->fetch_status.which as does afs_fs_fetch_status() that allows
afs_do_lookup() to use this as an alternative to the InlineBulkStatus RPC
call if not implemented by the server.
This doesn't matter in the current code as YFS servers always implement
InlineBulkStatus, but a subsequent will call it on YFS servers too in some
circumstances.
Fixes: e49c7b2f6d ("afs: Build an abstraction around an "operation" concept")
Signed-off-by: David Howells <dhowells@redhat.com>
Introduce "requires:" list to check required ftrace interface
for each test. This will simplify the interface checking code
and unify the error message. Another good point is, it can
skip the ftrace initializing.
Note that this requires list must be written as a shell
comment.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The Mediacom FlexBook edge13 uses the SIPODEV SP1064 touchpad, which does not
supply descriptors, so it has to be added to the override list.
Signed-off-by: Federico Ricchiuto <fed.ricchiuto@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
As same as other test cases, return unsupported if kprobe_events
or argument access feature are not found.
There can be a new arch which does not port those features yet,
and an older kernel which doesn't support it.
Those can not enable the features.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Allow ":" in the description line. Currently if there is ":"
in the test description line, the description is cut at that
point, but that was unintended.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Currently target_copy() is used only for sending linger pings, so
this doesn't come up, but generally omitting used_replica can hang
the client as we wouldn't notice the acting set change (legacy_change
in calc_target()) or trigger a warning in handle_reply().
Fixes: 117d96a04f ("libceph: support for balanced and localized reads")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Currently target_copy() is used only for sending linger pings, so
this doesn't come up, but generally omitting recovery_deletes can
result in unneeded resends (force_resend in calc_target()).
Fixes: ae78dd8139 ("libceph: make RECOVERY_DELETES feature create a new interval")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
osd_req_flags is overly general and doesn't suit its only user
(read_from_replica option) well:
- applying osd_req_flags in account_request() affects all OSD
requests, including linger (i.e. watch and notify). However,
linger requests should always go to the primary even though
some of them are reads (e.g. notify has side effects but it
is a read because it doesn't result in mutation on the OSDs).
- calls to class methods that are reads are allowed to go to
the replica, but most such calls issued for "rbd map" and/or
exclusive lock transitions are requested to be resent to the
primary via EAGAIN, doubling the latency.
Get rid of global osd_req_flags and set read_from_replica flag
only on specific OSD requests instead.
Fixes: 8ad44d5e0d ("libceph: read_from_replica option")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
UBSAN is supported since GCC 4.9, which unfortunately did not yet have
__has_attribute(). To work around, the __GCC4_has_attribute workaround
requires defining which compiler version supports the given attribute.
In the case of no_sanitize_undefined, it is the first version that
supports UBSAN, which is GCC 4.9.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Link: https://lkml.kernel.org/r/20200615231529.GA119644@google.com
Memset on the pointer right after malloc can cause a NULL pointer
deference if it failed to allocate memory. A simple fix is to
replace malloc()/memset() pair with a simple call to calloc().
Fixes: 0fca931a6f ("samples/bpf: program demonstrating access to xdp_rxq_info")
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
In order to remove the dependency on the simple-bus compatible string,
which causes the OF driver core to register all child devices, make the
display-hub driver explicitly register the display controller children.
Signed-off-by: Thierry Reding <treding@nvidia.com>
In order to remove the dependency on the simple-bus compatible string,
which causes the OF driver core to register all child devices, make the
host1x driver explicitly register its children.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Though the unconditional enable/disable code is not a final solution,
we don't want to run into a NULL pointer situation when window group
doesn't link to its DC parent if the DC is disabled in Device Tree.
So this patch simply adds a check to make sure that window group has
a valid parent before running into tegra_windowgroup_enable/disable.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
host1x_debug_init() must be reverted in an error handling path.
This is already fixed in the remove function since commit 44156eee91
("gpu: host1x: Clean up debugfs on removal")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Qian Cai reported:
"""
When NUMA=n and nr_node_ids=2, in apply_wqattrs_prepare(), it has,
for_each_node(node) {
if (wq_calc_node_cpumask(...
where it will trigger a booting warning,
WARNING: workqueue cpumask: online intersect > possible intersect
because it found 2 nodes and wq_numa_possible_cpumask[1] is an empty
cpumask.
"""
Let NODES_SHIFT depend on NEED_MULTIPLE_NODES like it is done
on other architectures in order to fix this.
Fixes: 701dc81e74 ("s390/mm: remove fake numa support")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
clock_getres in the vDSO library has to preserve the same behaviour
of posix_get_hrtimer_res().
In particular, posix_get_hrtimer_res() does:
sec = 0;
ns = hrtimer_resolution;
and hrtimer_resolution depends on the enablement of the high
resolution timers that can happen either at compile or at run time.
Fix the s390 vdso implementation of clock_getres keeping a copy of
hrtimer_resolution in vdso data and using that directly.
Link: https://lkml.kernel.org/r/20200324121027.21665-1-vincenzo.frascino@arm.com
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[heiko.carstens@de.ibm.com: use llgf for proper zero extension]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Currently, the VDSO is being linked through $(CC). This does not match
how the rest of the kernel links objects, which is through the $(LD)
variable.
When clang is built in a default configuration, it first attempts to use
the target triple's default linker, which is just ld. However, the user
can override this through the CLANG_DEFAULT_LINKER cmake define so that
clang uses another linker by default, such as LLVM's own linker, ld.lld.
This can be useful to get more optimized links across various different
projects.
However, this is problematic for the s390 vDSO because ld.lld does not
have any s390 emulatiom support:
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.1-rc1/lld/ELF/Driver.cpp#L132-L150
Thus, if a user is using a toolchain with ld.lld as the default, they
will see an error, even if they have specified ld.bfd through the LD
make variable:
$ make -j"$(nproc)" -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- LLVM=1 \
LD=s390x-linux-gnu-ld \
defconfig arch/s390/kernel/vdso64/
ld.lld: error: unknown emulation: elf64_s390
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
Normally, '-fuse-ld=bfd' could be used to get around this; however, this
can be fragile, depending on paths and variable naming. The cleaner
solution for the kernel is to take advantage of the fact that $(LD) can
be invoked directly, which bypasses the heuristics of $(CC) and respects
the user's choice. Similar changes have been done for ARM, ARM64, and
MIPS.
Link: https://lkml.kernel.org/r/20200602192523.32758-1-natechancellor@gmail.com
Link: https://github.com/ClangBuiltLinux/linux/issues/1041
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
[heiko.carstens@de.ibm.com: add --build-id flag]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Streamline the processing of QDIO Input Queues, and remove some
intermittent SLSB updates (no deleting of old ACKs, no redundant
transitions through NOT_INIT).
Rather than counting ACKs, we now keep track of the whole batch of
SBALs that were completed during the current polling cycle.
Most completed SBALs stay in their initial state (ie. PRIMED or ERROR),
except that the most recent SBAL in each sub-run is ACKed for
IRQ reduction.
The only logic changes happen in inbound_handle_work(), the other
delta is just a renaming of the variables that track the SBAL batch.
Note that in particular we don't need to flip the _oldest_ SBAL to
an idle state (eg. NOT_INIT or ACKed) as a guard against catching our
own tail. Since get_inbound_buffer_frontier() will never scan more than
the remaining nr_buf_used SBALs, this scenario just doesn't occur.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
s390 cannot set syscall number and reture code at the same time,
so set the appropriate flag to indicate it.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
When strace wants to update the syscall number, it sets GPR2
to the desired number and updates the GPR via PTRACE_SETREGSET.
It doesn't update regs->int_code which would cause the old syscall
executed on syscall restart. As we cannot change the ptrace ABI and
don't have a field for the interruption code, check whether the tracee
is in a syscall and the last instruction was svc. In that case assume
that the tracer wants to update the syscall number and copy the GPR2
value to regs->int_code.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
tracing expects to see invalid syscalls, so pass it through.
The syscall path in entry.S checks the syscall number before
looking up the handler, so it is still safe.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The current code returns the syscall number which an invalid
syscall number is supplied and tracing is enabled. This makes
the strace testsuite fail.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Use __secure_computing() and pass the register data via
seccomp_data so secure computing doesn't have to fetch it
again.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
xchg() for a single-byte location assembles to a 4-byte Compare&Swap,
wrapped into a non-trivial amount of retry code that deals with
concurrent modifications to the unaffected bytes.
Change it to a simple byte-store, but preserve the memory ordering
semantics that the CS provided.
This simplifies the generated code for a hot path, and in theory also
allows us to amortize the memory barriers over multiple SLSB updates.
CC: Andreas Krebbel <krebbel@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
For mono channel, SSI will switch to Normal mode.
In Normal mode and Network mode, the Word Length Control bits
control the word length divider in clock generator, which is
different with I2S Master mode (the word length is fixed to
32bit), it should be the value of params_width(hw_params).
The condition "slots == 2" is not good for I2S Master mode,
because for Network mode and Normal mode, the slots can also
be 2. Then we need to use (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK)
to check if it is I2S Master mode.
So we refine the formula for mono channel, otherwise there
will be sound issue for S24_LE.
Fixes: b0a7043d5c ("ASoC: fsl_ssi: Caculate bit clock rate using slot number and width")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>
Link: https://lore.kernel.org/r/034eff1435ff6ce300b6c781130cefd9db22ab9a.1592276147.git.shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The actual max_segs computation leads to failure while using the broadcom
sdio brcmfmac/bcmsdh driver, since the driver tries to make usage of
scatter gather.
But with the dram-access-quirk we use a 1,5K SRAM bounce buffer, and the
max_segs current value of 3 leads to max transfers to 4,5k, which doesn't
work.
This patch sets max_segs to 1 to better describe the hardware limitation,
and fix the SDIO functionality with the brcmfmac/bcmsdh driver on Amlogic
G12A/G12B SoCs on boards like SEI510 or Khadas VIM3.
Reported-by: Art Nikpal <art@khadas.com>
Reported-by: Christian Hewitt <christianshewitt@gmail.com>
Fixes: acdc8e71d9 ("mmc: meson-gx: add dram-access-quirk")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200608084458.32014-1-narmstrong@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
It's a repetition of the commit aa58a21ae3
("gpio: pca953x: disable regmap locking")
which states the following:
This driver uses its own locking but regmap silently uses
a mutex for all operations too. Add the option to disable
locking to the regmap config struct.
Fixes: bcf41dc480 ("gpio: pca953x: fix handling of automatic address incrementing")
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
The commit 0f25fda840 ("gpio: pca953x: Zap ad-hoc reg_direction cache")
seems inadvertently made a typo in pca953x_irq_bus_sync_unlock().
When the direction bit is 1 it means input, and the piece of code in question
was looking for output ones that should be turned to inputs.
Fix direction setting when configure an IRQ by injecting a bitmap complement
operation.
Fixes: 0f25fda840 ("gpio: pca953x: Zap ad-hoc reg_direction cache")
Depends-on: 35d13d9489 ("gpio: pca953x: convert to use bitmap API")
Cc: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
ACPI table on Intel Galileo Gen 2 has wrong pin number for IRQ resource
of one of the I²C GPIO expanders. Since we know what that number is and
luckily have GPIO bases fixed for SoC's controllers, we may use a simple
DMI quirk to match the platform and retrieve GpioInt() pin on it for
the expander in question.
Mika suggested the way to avoid a quirk in the GPIO ACPI library and
here is the second, almost rewritten version of it.
Fixes: f32517bf1a ("gpio: pca953x: support ACPI devices found on Galileo Gen2")
Depends-on: 25e3ef894e ("gpio: acpi: Split out acpi_gpio_get_irq_resource() helper")
Suggested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
In most cases, such as CONFIG_ACPI_CUSTOM_DSDT and
CONFIG_ACPI_TABLE_UPGRADE, boot-time modifications to firmware tables
are tied to specific Kconfig options. Currently this is not the case
for modifying the ACPI SSDT via the efivar_ssdt kernel command line
option and associated EFI variable.
This patch adds CONFIG_EFI_CUSTOM_SSDT_OVERLAYS, which defaults
disabled, in order to allow enabling or disabling that feature during
the build.
Cc: <stable@vger.kernel.org>
Signed-off-by: Peter Jones <pjones@redhat.com>
Link: https://lore.kernel.org/r/20200615202408.2242614-1-pjones@redhat.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
According to BSpec the Data Island Packet should be disabled after
disabling the transcoder, but before the transcoder clock select is set
to none. On an ICL RVP, daisy-chained MST config not following this
leads to a hang with the following MCE when disabling the output:
[ 870.948739] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 6: ba00000011000402
[ 871.019212] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff81aca652> {poll_idle+0x92/0xb0}
[ 871.019212] mce: [Hardware Error]: TSC 135a261fe61
[ 871.019212] mce: [Hardware Error]: PROCESSOR 0:706e5 TIME 1591739604 SOCKET 0 APIC 0 microcode 20
[ 871.019212] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[ 871.019212] mce: [Hardware Error]: Machine check: Processor context corrupt
[ 871.019212] Kernel panic - not syncing: Fatal machine check
[ 871.019212] Kernel Offset: disabled
Bspec: 4287
Fixes: fa37a21327 ("drm/i915: Stop sending DP SDPs on ddi disable")
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609220616.6015-1-imre.deak@intel.com
(cherry picked from commit c980216dd2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In commit 5ba32c7be8 ("drm/i915/execlists: Always force a context
reload when rewinding RING_TAIL"), we placed the check for rewinding a
context on actually submitting the next request in that context. This
was so that we only had to check once, and could do so with precision
avoiding as many forced restores as possible. For example, to ensure
that we can resubmit the same request a couple of times, we include a
small wa_tail such that on the next submission, the ring->tail will
appear to move forwards when resubmitting the same request. This is very
common as it will happen for every lite-restore to fill the second port
after a context switch.
However, intel_ring_direction() is limited in precision to movements of
upto half the ring size. The consequence being that if we tried to
unwind many requests, we could exceed half the ring and flip the sense
of the direction, so missing a force restore. As no request can be
greater than half the ring (i.e. 2048 bytes in the smallest case), we
can check for rollback incrementally. As we check against the tail that
would be submitted, we do not lose any sensitivity and allow lite
restores for the simple case. We still need to double check upon
submitting the context, to allow for multiple preemptions and
resubmissions.
Fixes: 5ba32c7be8 ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609151723.12971-1-chris@chris-wilson.co.uk
(cherry picked from commit e36ba817fa)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We have a I915_REQUEST_NOPREEMPT flag that we set when we must prevent
the HW from preempting during the course of this request. We need to
honour this flag and protect the HW even if we have a heartbeat request,
or other maximum priority barrier, pending. As such, restrict the
timeslicing check to avoid preempting into the topmost priority band,
leaving the unpreemptable requests in blissful peace running
uninterrupted on the HW.
v2: Set the I915_PRIORITY_BARRIER to be less than
I915_PRIORITY_UNPREEMPTABLE so that we never submit a request
(heartbeat or barrier) that can legitimately preempt the current
non-premptable request.
Fixes: 2a98f4e65b ("drm/i915: add infrastructure to hold off preemption on a request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527162418.24755-1-chris@chris-wilson.co.uk
(cherry picked from commit b72f02d78e)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fix the following warnings caused by reusage of the same irq_chip
instance for all spmi-gpio gpio_irq_chip instances. Instead embed
irq_chip into pmic_gpio_state struct.
gpio gpiochip2: (c440000.qcom,spmi:pmic@2:gpio@c000): detected irqchip that is shared with multiple gpiochips: please fix the driver.
gpio gpiochip3: (c440000.qcom,spmi:pmic@4:gpio@c000): detected irqchip that is shared with multiple gpiochips: please fix the driver.
gpio gpiochip4: (c440000.qcom,spmi:pmic@a:gpio@c000): detected irqchip that is shared with multiple gpiochips: please fix the driver.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20200604002817.667160-1-dmitry.baryshkov@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The kernel uses internal mounts created by kern_mount() and populated
with files with no lookup path by alloc_file_pseudo() for a variety of
reasons. An example of such a mount is for anonymous pipes. For pipes,
every vfs_write() regardless of filesystem, calls fsnotify_modify()
to notify of any changes which incurs a small amount of overhead in
fsnotify even when there are no watchers. It can also trigger for reads
and readv and writev, it was simply vfs_write() that was noticed first.
A patch is pending that reduces, but does not eliminate, the overhead of
fsnotify but for files that cannot be looked up via a path, even that
small overhead is unnecessary. The user API for all notification
subsystems (inotify, fanotify, ...) is based on the pathname and a dirfd
and proc entries appear to be the only visible representation of the
files. Proc does not have the same pathname as the internal entry and
the proc inode is not the same as the internal inode so even if fanotify
is used on a file under /proc/XX/fd, no useful events are notified.
This patch changes alloc_file_pseudo() to always opt out of fsnotify by
setting FMODE_NONOTIFY flag so that no check is made for fsnotify
watchers on pseudo files. This should be safe as the underlying helper
for the dentry is d_alloc_pseudo() which explicitly states that no
lookups are ever performed meaning that fanotify should have nothing
useful to attach to.
The test motivating this was "perf bench sched messaging --pipe". On
a single-socket machine using threads the difference of the patch was
as follows.
5.7.0 5.7.0
vanilla nofsnotify-v1r1
Amean 1 1.3837 ( 0.00%) 1.3547 ( 2.10%)
Amean 3 3.7360 ( 0.00%) 3.6543 ( 2.19%)
Amean 5 5.8130 ( 0.00%) 5.7233 * 1.54%*
Amean 7 8.1490 ( 0.00%) 7.9730 * 2.16%*
Amean 12 14.6843 ( 0.00%) 14.1820 ( 3.42%)
Amean 18 21.8840 ( 0.00%) 21.7460 ( 0.63%)
Amean 24 28.8697 ( 0.00%) 29.1680 ( -1.03%)
Amean 30 36.0787 ( 0.00%) 35.2640 * 2.26%*
Amean 32 38.0527 ( 0.00%) 38.1223 ( -0.18%)
The difference is small but in some cases it's outside the noise so
while marginal, there is still some small benefit to ignoring fsnotify
for files allocated via alloc_file_pseudo() in some cases.
Link: https://lore.kernel.org/r/20200615121358.GF3183@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Currently the PL330 is enabled by default. However if left in IDM reset, as is
the case with the Meraki and Synology NSP devices, the system will hang when
probing for the PL330's AMBA peripheral ID. We therefore should be able to
disable it in these cases.
The PL330 is also included among of the list of peripherals put into coherent
mode, so "dma-coherent" has been added here as well.
Fixes: 5fa1026a3e ("ARM: dts: NSP: Add PL330 support")
Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.
The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a 'struct nd_pkg_pdsm' and a maximal union named
'nd_pdsm_payload'. These new structs together with 'struct nd_cmd_pkg'
for a pdsm envelop thats sent by libndctl to libnvdimm and serviced by
papr_scm in 'papr_scm_service_pdsm()'. The PDSM request is
communicated by member 'struct nd_cmd_pkg.nd_command' together with
other information on the pdsm payload (size-in, size-out).
The patch also introduces 'struct pdsm_cmd_desc' instances of which
are stored in an array __pdsm_cmd_descriptors[] indexed with PDSM cmd
and corresponding access function pdsm_cmd_desc() is
introduced. 'struct pdsm_cdm_desc' holds the service function for a
given PDSM and corresponding payload in/out sizes.
A new function papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm. The function performs validation on the PDSM
payload based on info present in corresponding PDSM descriptor and if
valid calls the 'struct pdcm_cmd_desc.service' function to service the
PDSM.
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200615124407.32596-6-vaibhav@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit bitmap, bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via newly introduced dimm specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried, 60s after the previous successful hcall.
The patch also adds a documentation text describing flags reported by
the the new sysfs attribute 'papr/flags' is also introduced at
Documentation/ABI/testing/sysfs-bus-papr-pmem.
[1] commit 58b278f568 ("powerpc: Provide initial documentation for
PAPR hcalls")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200615124407.32596-4-vaibhav@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it over-flowing. However even
though the API has been stable for couple of years now its still not
exported to kernel loadable modules limiting its usage.
Hence this patch proposes update to 'seq_buf.c' to mark
seq_buf_printf() which is part of the seq_buf API to be exported to
kernel loadable GPL modules. This symbol will be used in later parts
of this patch-set to simplify content creation for a sysfs attribute.
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Piotr Maziarz <piotrx.maziarz@linux.intel.com>
Cc: Cezary Rojewski <cezary.rojewski@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lore.kernel.org/r/20200615124407.32596-3-vaibhav@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Roi Dayan says:
====================
remove dependency between mlx5, act_ct, nf_flow_table
Some exported functions from act_ct and nf_flow_table being used in mlx5_core.
This leads that mlx5 module always require act_ct and nf_flow_table modules.
Those small exported functions can be moved to the header files to
avoid this module dependency.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, nf_flow_table_offload_add/del_cb are exported by nf_flow_table
module, therefore modules using them will have hard-dependency
on nf_flow_table and will require loading it all the time.
This can lead to an unnecessary overhead on systems that do not
use this API.
To relax the hard-dependency between the modules, we unexport these
functions and make them static inline.
Fixes: 978703f425 ("netfilter: flowtable: Add API for registering to flow table events")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, tcf_ct_flow_table_restore_skb is exported by act_ct
module, therefore modules using it will have hard-dependency
on act_ct and will require loading it all the time.
This can lead to an unnecessary overhead on systems that do not
use hardware connection tracking action (ct_metadata action) in
the first place.
To relax the hard-dependency between the modules, we unexport this
function and make it a static inline one.
Fixes: 30b0cf90c6 ("net/sched: act_ct: Support restoring conntrack info on skbs")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel-doc warning: the parameter was removed, so also remove
the kernel-doc notation for it.
../include/trace/events/block.h:278: warning: Excess function parameter 'error' description in 'trace_block_bio_complete'
Fixes: d24de76af8 ("block: remove the error argument to the block_bio_complete tracepoint")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When a user tries to parse a symbol located inside a module he must have
modpath set. Otherwise, decode_stacktrace won't be able to parse the
symbol correctly.
Right now the failure is silent and easily missed by the user. What's
worse is that by the time the user realizes what happened (or someone on
LKML asks him to add the modpath and re-run), he might have already got
rid of the vmlinux/modules.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It isn't actually described clearly at all in UM10944.pdf, but on TX of
a management frame (such as PTP), this needs to happen:
- The destination MAC address (i.e. 01-80-c2-00-00-0e), along with the
desired destination port, need to be installed in one of the 4
management slots of the switch, over SPI.
- The host can poll over SPI for that management slot's ENFPORT field.
That gets unset when the switch has matched the slot to the frame.
And therein lies the problem. ENFPORT does not mean that the packet has
been transmitted. Just that it has been received over the CPU port, and
that the mgmt slot is yet again available.
This is relevant because of what we are doing in sja1105_ptp_txtstamp_skb,
which is called right after sja1105_mgmt_xmit. We are in a hard
real-time deadline, since the hardware only gives us 24 bits of TX
timestamp, so we need to read the full PTP clock to reconstruct it.
Because we're in a hurry (in an attempt to make sure that we have a full
64-bit PTP time which is as close as possible to the actual transmission
time of the frame, to avoid 24-bit wraparounds), first we read the PTP
clock, then we poll for the TX timestamp to become available.
But of course, we don't know for sure that the frame has been
transmitted when we read the full PTP clock. We had assumed that ENFPORT
means it has, but the assumption is incorrect. And while in most
real-life scenarios this has never been caught due to software delays,
nowhere is this fact more obvious than with a tc-taprio offload, where
PTP traffic gets a small timeslot very rarely (example: 1 packet per 10
ms). In that case, we will be reading the PTP clock for timestamp
reconstruction too early (before the packet has been transmitted), and
this renders the reconstruction procedure incorrect (see the assumptions
described in the comments found on function sja1105_tstamp_reconstruct).
So the PTP TX timestamps will be off by 1<<24 clock ticks, or 135 ms
(1 tick is 8 ns).
So fix this case of premature optimization by simply reordering the
sja1105_ptpegr_ts_poll and the sja1105_ptpclkval_read function calls. It
turns out that in practice, the 135 ms hard deadline for PTP timestamp
wraparound is not so hard, since even the most bandwidth-intensive PTP
profiles, such as 802.1AS-2011, have a sync frame interval of 125 ms.
So if we couldn't deliver a timestamp in 135 ms (which we can), we're
toast and have much bigger problems anyway.
Fixes: 47ed985e97 ("net: dsa: sja1105: Add logic for TX timestamping")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool rx and tx queue statistics are reporting wrong values.
Fix reading out the correct ones.
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
I no longer work for Cogent Embedded (but my old email still works :-)),
and still would like to continue looking after the Renesas Ethernet drivers
and bindings. Let's switch to my private email.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
In rocker_dma_rings_init, the goto blocks in case of errors
caused by the functions rocker_dma_cmd_ring_waits_alloc() and
rocker_dma_ring_create() are incorrect. The patch fixes the
order consistent with cleanup in rocker_dma_rings_fini().
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of failure of check_expect_hints_stats(), the resources
allocated by objagg_hints_get should be freed. The patch fixes
this issue.
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
During development we tried to make the interrupt handling as fine-grained
as possible with TX and RX interrupts being disabled/enabled independently
and the counter registers reset from workqueue context.
Unfortunately after thorough testing of current mainline, we noticed the
driver has become unstable under heavy load. While this is hard to
reproduce, it's quite consistent in the driver's current form.
This patch proposes to go back to the previous approach of doing all
processing in napi context with all interrupts masked in order to make the
driver usable in mainline linux. This doesn't impact the performance on
pumpkin boards at all and it's in line with what many ethernet drivers do
in mainline linux anyway.
At the same time we're adding a FIXME comment about the need to improve
the interrupt handling.
Fixes: 8c7bd5a454 ("net: ethernet: mtk-star-emac: new driver")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Bug fixes.
Four fixes related to the bnxt_en driver's resume path, AER reset, and
the timer function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
AER reset should follow the same steps as suspend/resume. We need to
free context memory during AER reset and allocate new context memory
during recovery by calling bnxt_hwrm_func_qcaps(). We also need
to call bnxt_reenable_sriov() to restore the VFs.
Fixes: bae361c54f ("bnxt_en: Improve AER slot reset.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If VFs are enabled, we need to re-configure them during resume because
firmware has been reset while resuming. Otherwise, the VFs won't
work after resume.
Fixes: c16d4ee0e3 ("bnxt_en: Refactor logic to re-enable SRIOV after firmware reset detected.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The separate steps we do in bnxt_resume() can be done more simply by
calling bnxt_hwrm_func_qcaps(). This change will add an extra
__bnxt_hwrm_func_qcaps() call which is needed anyway on older
firmware.
Fixes: f9b69d7f62 ("bnxt_en: Fix suspend/resume path on 57500 chips")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Fix bogus EEXIST on element insertions to the rbtree with timeouts,
from Stefano Brivio.
2) Preempt BUG splat in the pipapo element insertion path, also from
Stefano.
3) Release filter from the ctnetlink error path.
4) Release flowtable hooks from the deletion path.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The ocelot switchdev driver also provides a set of library functions for
the felix DSA driver, which in practice means that most of the patches
will be of interest to both groups of driver maintainers.
So, as also suggested in the discussion here, let's merge the 2 entries
into a single larger one:
https://www.spinics.net/lists/netdev/msg657412.html
Note that the entry has been renamed into "OCELOT SWITCH" since neither
Vitesse nor Microsemi exist any longer as company names, instead they
are now named Microchip (which again might be subject to change in the
future), so use the device family name instead.
Suggested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a race condition exist during termination. The path is
alx_stop and then alx_remove. An alx_schedule_link_check could be called
before alx_stop by interrupt handler and invoke alx_link_check later.
Alx_stop frees the napis, and alx_remove cancels any pending works.
If any of the work is scheduled before termination and invoked before
alx_remove, a null-ptr-deref occurs because both expect alx->napis[i].
This patch fix the race condition by moving cancel_work_sync functions
before alx_free_napis inside alx_stop. Because interrupt handler can call
alx_schedule_link_check again, alx_free_irq is moved before
cancel_work_sync calls too.
Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VNIC driver's "login" command sequence is the final step
in the driver's initialization process with device firmware,
confirming the available device queue resources to be utilized
by the driver. Under high system load, firmware may not respond
to the request in a timely manner or may abort the request. In
such cases, the driver should reattempt the login command
sequence. In case of a device error, the number of retries
is bounded.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A recent change added a disable to NAPI into macb_open, this was
intended to only happen on the error path but accidentally applies
to all paths. This causes NAPI to be disabled on the success path, which
leads to the network to no longer functioning.
Fixes: 014406babc ("net: cadence: macb: disable NAPI on error")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Tested-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 9302c1bb8e ("efi/libstub: Rewrite file I/O routine") introduced a
regression that made a couple of (badly configured) systems fail to
boot [1]: Until 5.6, we silently accepted Unix-style file separators in
EFI paths, which might violate the EFI standard, but are an easy to make
mistake. This fix restores the pre-5.7 behaviour.
[1] https://bbs.archlinux.org/viewtopic.php?id=256273
Fixes: 9302c1bb8e ("efi/libstub: Rewrite file I/O routine")
Signed-off-by: Philipp Fent <fent@in.tum.de>
Link: https://lore.kernel.org/r/20200615115109.7823-1-fent@in.tum.de
[ardb: rewrite as chained if/else statements]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Pull more ext4 updates from Ted Ts'o:
"This is the second round of ext4 commits for 5.8 merge window [1].
It includes the per-inode DAX support, which was dependant on the DAX
infrastructure which came in via the XFS tree, and a number of
regression and bug fixes; most notably the "BUG: using
smp_processor_id() in preemptible code in ext4_mb_new_blocks" reported
by syzkaller"
[1] The pull request actually came in 15 minutes after I had tagged the
rc1 release. Tssk, tssk, late.. - Linus
* tag 'ext4-for-linus-5.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4, jbd2: ensure panic by fix a race between jbd2 abort and ext4 error handlers
ext4: support xattr gnu.* namespace for the Hurd
ext4: mballoc: Use this_cpu_read instead of this_cpu_ptr
ext4: avoid utf8_strncasecmp() with unstable name
ext4: stop overwrite the errcode in ext4_setup_super
ext4: fix partial cluster initialization when splitting extent
ext4: avoid race conditions when remounting with options that change dax
Documentation/dax: Update DAX enablement for ext4
fs/ext4: Introduce DAX inode flag
fs/ext4: Remove jflag variable
fs/ext4: Make DAX mount option a tri-state
fs/ext4: Only change S_DAX on inode load
fs/ext4: Update ext4_should_use_dax()
fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
fs/ext4: Disallow verity if inode is DAX
fs/ext4: Narrow scope of DAX check in setflags
Add is_intr_type() and is_intr_type_n() to consolidate the boilerplate
code for querying a specific type of interrupt given an encoded value
from VMCS.VM_{ENTER,EXIT}_INTR_INFO, with and without an associated
vector respectively.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200609014518.26756-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
sve_default_vl can be modified via the /proc/sys/abi/sve_default_vl
sysctl concurrently with use, and modified concurrently by multiple
threads.
Adding a lock for this seems overkill, and I don't want to think any
more than necessary, so just define wrappers using READ_ONCE()/
WRITE_ONCE().
This will avoid the possibility of torn accesses and repeated loads
and stores.
There's no evidence yet that this is going wrong in practice: this
is just hygiene. For generic sysctl users, it would be better to
build this kind of thing into the sysctl common code somehow.
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/1591808590-20210-3-git-send-email-Dave.Martin@arm.com
[will: move set_sve_default_vl() inside #ifdef to squash allnoconfig warning]
Signed-off-by: Will Deacon <will@kernel.org>
For an exiting process it tries to cancel all its inflight requests. Use
req->task to match such instead of work.pid. We always have req->task
set, and it will be valid because we're matching only current exiting
task.
Also, remove work.pid and everything related, it's useless now.
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There will be multiple places where req->task is used, so refcount-pin
it lazily with introduced *io_{get,put}_req_task(). We need to always
have valid ->task for cancellation reasons, but don't care about pinning
it in some cases. That's why it sets req->task in io_req_init() and
implements get/put laziness with a flag.
This also removes using @current from polling io_arm_poll_handler(),
etc., but doesn't change observable behaviour.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Instead of waiting for each request one by one, first try to cancel all
of them in a batched manner, and then go over inflight_list/etc to reap
leftovers.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If a process is going away, io_uring_flush() will cancel only 1
request with a matching pid. Cancel all of them
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This adds support for cancelling all io-wq works matching a predicate.
It isn't used yet, so no change in observable behaviour.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Go all over all pending lists and cancel works there, and only then
try to match running requests. No functional changes here, just a
preparation for bulk cancellation.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Abort code UAEOVERFLOW is returned when we try and set a time that's out of
range, but it's currently mapped to EREMOTEIO by the default case.
Fix UAEOVERFLOW to map instead to EOVERFLOW.
Found with the generic/258 xfstest. Note that the test is wrong as it
assumes that the filesystem will support a pre-UNIX-epoch date.
Fixes: 1eda8bab70 ("afs: Add support for the UAE error table")
Signed-off-by: David Howells <dhowells@redhat.com>
Fix the following issues:
(1) Fix writeback to reduce the size of a store operation to i_size,
effectively discarding the extra data.
The problem comes when afs_page_mkwrite() records that a page is about
to be modified by mmap(). It doesn't know what bits of the page are
going to be modified, so it records the whole page as being dirty
(this is stored in page->private as start and end offsets).
Without this, the marshalling for the store to the server extends the
size of the file to the end of the page (in afs_fs_store_data() and
yfs_fs_store_data()).
(2) Fix setattr to actually truncate the pagecache, thereby clearing
the discarded part of a file.
(3) Fix setattr to check that the new size is okay and to disable
ATTR_SIZE if i_size wouldn't change.
(4) Force i_size to be updated as the result of a truncate.
(5) Don't truncate if ATTR_SIZE is not set.
(6) Call pagecache_isize_extended() if the file was enlarged.
Note that truncate_set_size() isn't used because the setting of i_size is
done inside afs_vnode_commit_status() under the vnode->cb_lock.
Found with the generic/029 and generic/393 xfstests.
Fixes: 31143d5d51 ("AFS: implement basic file write support")
Fixes: 4343d00872 ("afs: Get rid of the afs_writeback record")
Signed-off-by: David Howells <dhowells@redhat.com>
The in-kernel afs filesystem ignores ctime because the AFS fileserver
protocol doesn't support ctimes. This, however, causes various xfstests to
fail.
Work around this by:
(1) Setting ctime to attr->ia_ctime in afs_setattr().
(2) Not ignoring ATTR_MTIME_SET, ATTR_TIMES_SET and ATTR_TOUCH settings.
(3) Setting the ctime from the server mtime when on the target file when
creating a hard link to it.
(4) Setting the ctime on directories from their revised mtimes when
renaming/moving a file.
Found by the generic/221 and generic/309 xfstests.
Signed-off-by: David Howells <dhowells@redhat.com>
When doing a partial writeback, afs_write_back_from_locked_page() may
generate an FS.StoreData RPC request that writes out part of a file when a
file has been constructed from pieces by doing seek, write, seek, write,
... as is done by ld.
The FS.StoreData RPC is given the current i_size as the file length, but
the server basically ignores it unless the data length is 0 (in which case
it's just a truncate operation). The revised file length returned in the
result of the RPC may then not reflect what we suggested - and this leads
to i_size getting moved backwards - which causes issues later.
Fix the client to take account of this by ignoring the returned file size
unless the data version number jumped unexpectedly - in which case we're
going to have to clear the pagecache and reload anyway.
This can be observed when doing a kernel build on an AFS mount. The
following pair of commands produce the issue:
ld -m elf_x86_64 -z max-page-size=0x200000 --emit-relocs \
-T arch/x86/realmode/rm/realmode.lds \
arch/x86/realmode/rm/header.o \
arch/x86/realmode/rm/trampoline_64.o \
arch/x86/realmode/rm/stack.o \
arch/x86/realmode/rm/reboot.o \
-o arch/x86/realmode/rm/realmode.elf
arch/x86/tools/relocs --realmode \
arch/x86/realmode/rm/realmode.elf \
>arch/x86/realmode/rm/realmode.relocs
This results in the latter giving:
Cannot read ELF section headers 0/18: Success
as the realmode.elf file got corrupted.
The sequence of events can also be driven with:
xfs_io -t -f \
-c "pwrite -S 0x58 0 0x58" \
-c "pwrite -S 0x59 10000 1000" \
-c "close" \
/afs/example.com/scratch/a
Fixes: 31143d5d51 ("AFS: implement basic file write support")
Signed-off-by: David Howells <dhowells@redhat.com>
Fix afs_write_end() to change i_size under vnode->cb_lock rather than
->wb_lock so that it doesn't race with afs_vnode_commit_status() and
afs_getattr().
The ->wb_lock is only meant to guard access to ->wb_keys which isn't
accessed by that piece of code.
Fixes: 4343d00872 ("afs: Get rid of the afs_writeback record")
Signed-off-by: David Howells <dhowells@redhat.com>
The mtime on an inode needs to be updated when a write is made into an
mmap'ed section. There are three ways in which this could be done: update
it when page_mkwrite is called, update it when a page is changed from dirty
to writeback or leave it to the server and fix the mtime up from the reply
to the StoreData RPC.
Found with the generic/215 xfstest.
Fixes: 1cf7a1518a ("afs: Implement shared-writeable mmap")
Signed-off-by: David Howells <dhowells@redhat.com>
PFUZE100_SWB_REG is not proper for sw1a/sw2, because enable_mask/enable_reg
is not correct. On PFUZE3000, sw1a/sw2 should be the same as sw1a/sw2 on
pfuze100 except that voltages are not linear, so add new PFUZE3000_SW_REG
and pfuze3000_sw_regulator_ops which like the non-linear PFUZE100_SW_REG
and pfuze100_sw_regulator_ops.
Fixes: 1dced996ee ("regulator: pfuze100: update voltage setting for pfuze3000 sw1a")
Reported-by: Christophe Meynard <Christophe.Meynard@ign.fr>
Signed-off-by: Robin Gong <yibin.gong@nxp.com>
Link: https://lore.kernel.org/r/1592171648-8752-1-git-send-email-yibin.gong@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The blk_mq_all_tag_iter() is a void function, thus remove
the redundant 'return' statement in this function.
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patchset fixes a memory allocation issue and removes a 100%
reproducible use-after-free report thrown by KASAN in automated module
removal tests across multiple platforms.
All the credit goes to Bard Liao for root-causing the issue. DAIs may
be registered at the same time as a component, or when the topology is
loaded. This two-step registration causes the memory for
topology-based DAIs to allocated last, and conversely to be released
first by devres, before the component is released and the DAIs removed
from the component DAI list with snd_soc_unregister_dais().
When we remove a component, by the time we walk through its dai list
to unregister all dais, the dais allocated by the topology have been
freed already by devres and the list is corrupted with pointers that
are no longer valid.
The suggestion is to add an explicit devm_ based registration for
topology-based dais, so that each dai is cleanly removed from the
component dai list in the release operation before devres releases the
allocated memory.
Pierre-Louis Bossart (2):
ASoC: soc-devres: add devm_snd_soc_register_dai()
ASoC: soc-topology: use devm_snd_soc_register_dai()
include/sound/soc.h | 4 ++++
sound/soc/soc-devres.c | 37 +++++++++++++++++++++++++++++++++++++
sound/soc/soc-topology.c | 3 +--
3 files changed, 42 insertions(+), 2 deletions(-)
--
2.20.1
For some reasons, running a simple qemu-kvm command with KCSAN will
reset AMD hosts. It turns out svm_vcpu_run() could not be instrumented.
Disable it for now.
# /usr/libexec/qemu-kvm -name ubuntu-18.04-server-cloudimg -cpu host
-smp 2 -m 2G -hda ubuntu-18.04-server-cloudimg.qcow2
=== console output ===
Kernel 5.6.0-next-20200408+ on an x86_64
hp-dl385g10-05 login:
<...host reset...>
HPE ProLiant System BIOS A40 v1.20 (03/09/2018)
(C) Copyright 1982-2018 Hewlett Packard Enterprise Development LP
Early system initialization, please wait...
Signed-off-by: Qian Cai <cai@lca.pw>
Message-Id: <20200415153709.1559-1-cai@lca.pw>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.
Lastly, make use of the sizeof_field() helper instead of an open-coded
version.
This issue was found with the help of Coccinelle and audited _manually_.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200527171425.GA4053@embeddedor
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Applications that read EFI variables may see a return
value of -EINTR if they exceed the rate limit and a
signal delivery is attempted while the process is sleeping.
This is quite surprising to the application, which probably
doesn't have code to handle it.
Change the interruptible sleep to a non-interruptible one.
Reported-by: Lennart Poettering <mzxreary@0pointer.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20200528194905.690-3-tony.luck@intel.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
kobject_init_and_add() takes reference even when it fails.
If this function returns an error, kobject_put() must be called to
properly clean up the memory associated with the object. Previous
commit "b8eb718348b8" fixed a similar problem.
Fixes: 0bb549052d ("efi: Add esrt support")
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Link: https://lore.kernel.org/r/20200528183804.4497-1-wu000273@umn.edu
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
With CONFIG_DEBUG_VIRTUAL=y, we can hit a BUG() if we take a hard
lockup watchdog interrupt when in OPAL mode.
This happens in show_instructions() if the kernel takes the watchdog
NMI IPI, or any other interrupt, with MSR_IR == 0. show_instructions()
updates the variable pc in the loop and the second iteration will
result in BUG().
We hit the BUG_ON due the below check in __va()
#define __va(x)
({
VIRTUAL_BUG_ON((unsigned long)(x) >= PAGE_OFFSET);
(void *)(unsigned long)((phys_addr_t)(x) | PAGE_OFFSET);
})
Fix it by moving the check out of the loop. Also update nip so that
the nip == pc check still matches.
Fixes: 4dd7554a64 ("powerpc/64: Add VIRTUAL_BUG_ON checks for __va and __pa addresses")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Use IS_ENABLED(), massage change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200524093822.423487-1-aneesh.kumar@linux.ibm.com
It is possible that the first event in the event log is not actually a
log header at all, but rather a normal event. This leads to the cast in
__calc_tpm2_event_size being an invalid conversion, which means that
the values read are effectively garbage. Depending on the first event's
contents, this leads either to apparently normal behaviour, a crash or
a freeze.
While this behaviour of the firmware is not in accordance with the
TCG Client EFI Specification, this happens on a Dell Precision 5510
with the TPM enabled but hidden from the OS ("TPM On" disabled, state
otherwise untouched). The EFI firmware claims that the TPM is present
and active and that it supports the TCG 2.0 event log format.
Fortunately, this can be worked around by simply checking the header
of the first event and the event log header signature itself.
Commit b4f1874c62 ("tpm: check event log version before reading final
events") addressed a similar issue also found on Dell models.
Fixes: 6b03261902 ("efi: Attempt to get the TCG2 event log in the boot stub")
Signed-off-by: Fabian Vogt <fvogt@suse.de>
Link: https://lore.kernel.org/r/1927248.evlx2EsYKh@linux-e202.suse.de
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1165773
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reinitialize IA32_FEAT_CTL on the BSP during wakeup to handle the case
where firmware doesn't initialize or save/restore across S3. This fixes
a bug where IA32_FEAT_CTL is left uninitialized and results in VMXON
taking a #GP due to VMX not being fully enabled, i.e. breaks KVM.
Use init_ia32_feat_ctl() to "restore" IA32_FEAT_CTL as it already deals
with the case where the MSR is locked, and because APs already redo
init_ia32_feat_ctl() during suspend by virtue of the SMP boot flow being
used to reinitialize APs upon wakeup. Do the call in the early wakeup
flow to avoid dependencies in the syscore_ops chain, e.g. simply adding
a resume hook is not guaranteed to work, as KVM does VMXON in its own
resume hook, kvm_resume(), when KVM has active guests.
Fixes: 21bd3467a5 ("KVM: VMX: Drop initialization of IA32_FEAT_CTL MSR")
Reported-by: Brad Campbell <lists2009@fnarfbargle.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Tested-by: Brad Campbell <lists2009@fnarfbargle.com>
Cc: stable@vger.kernel.org # v5.6
Link: https://lkml.kernel.org/r/20200608174134.11157-1-sean.j.christopherson@intel.com
TEXT_OFFSET was recently changed to 0x0, in preparation for its removal
at a later stage, and a warning is emitted into the kernel log when the
bootloader appears to have failed to take the TEXT_OFFSET image header
value into account.
Ironically, this warning itself fails to take TEXT_OFFSET into account,
and compares the kernel image's alignment modulo 2M against a hardcoded
value of 0x0, and so the warning will trigger spuriously when TEXT_OFFSET
randomization is enabled.
Given the intent to get rid of TEXT_OFFSET entirely, let's fix this
oversight by just removing support for TEXT_OFFSET randomization.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200615101939.634391-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Explain the rationale for annotating WARN(), even though, strictly
speaking printk() and friends are very much not safe in many of the
places we put them.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
The UBSAN instrumentation only inserts external CALLs when things go
'BAD', much like WARN(). So treat them similar to WARN()s for noinstr,
that is: allow them, at the risk of taking the machine down, to get
their message out.
Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Adds config variable CC_HAS_WORKING_NOSANITIZE_ADDRESS, which will be
true if we have a compiler that does not fail builds due to
no_sanitize_address functions. This does not yet mean they work as
intended, but for automated build-tests, this is the minimum
requirement.
For example, we require that __always_inline functions used from
no_sanitize_address functions do not generate instrumentation. On GCC <=
7 this fails to build entirely, therefore we make the minimum version
GCC 8.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Link: https://lkml.kernel.org/r/20200602175859.GC2604@hirez.programming.kicks-ass.net
The 'noinstr' function attribute means no-instrumentation, this should
very much include *SAN. Because lots of that is broken at present,
only include KCSAN for now, as that is limited to clang11, which has
sane function attribute behaviour.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
There are no more user of this function attribute, also, with us now
actively supporting '__no_kcsan inline' it doesn't make sense to have
in any case.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Now that KCSAN relies on -tsan-distinguish-volatile we no longer need
the annotation for constant_test_bit(). Remove it.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
The dt-bindings for the GSWIP describe that the node should be named
"switch". Use the same name in sysctrl.c so the GSWIP driver can
actually find the "gphy0" and "gphy1" clocks.
Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Commit
bbf8e8b0fe ("efi/libstub: Optimize for size instead of speed")
changed the optimization level for the EFI stub to -Os from -O2.
Andrey Ignatov reports that this breaks the build with gcc 4.8.5.
Testing on godbolt.org, the combination of -Os,
-fno-asynchronous-unwind-tables, and ms_abi functions doesn't work,
failing with the error:
sorry, unimplemented: ms_abi attribute requires
-maccumulate-outgoing-args or subtarget optimization implying it
This does appear to work with gcc 4.9 onwards.
Add -maccumulate-outgoing-args explicitly to unbreak the build with
pre-4.9 versions of gcc.
Reported-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://lore.kernel.org/r/20200605150638.1011637-1-nivedita@alum.mit.edu
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
gcc-9 gets confused by the code flow in check_dirty_whitelist:
drivers/gpu/drm/i915/gt/selftest_workarounds.c: In function 'check_dirty_whitelist':
drivers/gpu/drm/i915/gt/selftest_workarounds.c:492:17: error: 'rsvd' may be used uninitialized in this function [-Werror=maybe-uninitialized]
I could not figure out a good way to do this in a way that gcc
understands better, so initialize the variable to zero, as last
resort.
Fixes: aee20aaed8 ("drm/i915: Implement read-only support in whitelist selftest")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527140526.1458215-2-arnd@arndb.de
(cherry picked from commit cc649a9eaf)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Conditional spinlocks make it hard for gcc and for lockdep to
follow the code flow. This one causes a warning with at least
gcc-9 and higher:
In file included from include/linux/irq.h:14,
from drivers/gpu/drm/i915/i915_pmu.c:7:
drivers/gpu/drm/i915/i915_pmu.c: In function 'i915_sample':
include/linux/spinlock.h:289:3: error: 'flags' may be used uninitialized in this function [-Werror=maybe-uninitialized]
289 | _raw_spin_unlock_irqrestore(lock, flags); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:288:17: note: 'flags' was declared here
288 | unsigned long flags;
| ^~~~~
Split out the part between the locks into a separate function
for readability and to let the compiler figure out what the
logic actually is.
Fixes: d79e1bd676 ("drm/i915/pmu: Only use exclusive mmio access for gen7")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527140526.1458215-1-arnd@arndb.de
(cherry picked from commit 6ec81b8273)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
ttm_bo_add_move_fence() invokes dma_fence_get(), which returns a
reference of the specified dma_fence object to "fence" with increased
refcnt.
When ttm_bo_add_move_fence() returns, local variable "fence" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling path of
ttm_bo_add_move_fence(). When no_wait_gpu flag is equals to true, the
function forgets to decrease the refcnt increased by dma_fence_get(),
causing a refcnt leak.
Fix this issue by calling dma_fence_put() when no_wait_gpu flag is
equals to true.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/370221/
Signed-off-by: Christian König <christian.koenig@amd.com>
ttm_bo_vm_fault_reserved() invokes dma_fence_get(), which returns a
reference of the specified dma_fence object to "moving" with increased
refcnt.
When ttm_bo_vm_fault_reserved() returns, local variable "moving" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in several exception handling paths
of ttm_bo_vm_fault_reserved(). When those error scenarios occur such as
"err" equals to -EBUSY, the function forgets to decrease the refcnt
increased by dma_fence_get(), causing a refcnt leak.
Fix this issue by calling dma_fence_put() when no_wait_gpu flag is
equals to true.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/370219/
Signed-off-by: Christian König <christian.koenig@amd.com>
Smatch reports that:
drivers/crypto/marvell/octeontx/otx_cptvf_algs.c:132 otx_cpt_aead_callback()
warn: variable dereferenced before check 'cpt_info' (see line 121)
This function is called from process_pending_queue() as:
drivers/crypto/marvell/octeontx/otx_cptvf_reqmgr.c
599 /*
600 * Call callback after current pending entry has been
601 * processed, we don't do it if the callback pointer is
602 * invalid.
603 */
604 if (callback)
605 callback(res_code, areq, cpt_info);
It does appear to me that "cpt_info" can be NULL so this could lead to
a NULL dereference.
Fixes: 10b4f09491 ("crypto: marvell - add the Virtual Function driver for CPT")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When a crypto template needs to be instantiated, CRYPTO_MSG_ALG_REQUEST
is sent to crypto_chain. cryptomgr_schedule_probe() handles this by
starting a thread to instantiate the template, then waiting for this
thread to complete via crypto_larval::completion.
This can deadlock because instantiating the template may require loading
modules, and this (apparently depending on userspace) may need to wait
for the crc-t10dif module (lib/crc-t10dif.c) to be loaded. But
crc-t10dif's module_init function uses crypto_register_notifier() and
therefore takes crypto_chain.rwsem for write. That can't proceed until
the notifier callback has finished, as it holds this semaphore for read.
Fix this by removing the wait on crypto_larval::completion from within
cryptomgr_schedule_probe(). It's actually unnecessary because
crypto_alg_mod_lookup() calls crypto_larval_wait() itself after sending
CRYPTO_MSG_ALG_REQUEST.
This only actually became a problem in v4.20 due to commit b76377543b
("crc-t10dif: Pick better transform if one becomes available"), but the
unnecessary wait was much older.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207159
Reported-by: Mike Gerow <gerow@google.com>
Fixes: 398710379f ("crypto: algapi - Move larval completion into algboss")
Cc: <stable@vger.kernel.org> # v3.6+
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reported-by: Kai Lüke <kai@kinvolk.io>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The function hisi_acc_create_sg_pool may allocate a block of
memory of size PAGE_SIZE * 2^(MAX_ORDER - 1). This value may
exceed 2^31 on ia64, which would overflow the u32.
This patch caps it at 2^31.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: d8ac7b8523 ("crypto: hisilicon - fix large sgl memory...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
pm_runtime_get_sync() increments the runtime PM usage counter even
the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Memory bandwidth is calculated reading the monitoring counter
at two intervals and calculating the delta. It is the software’s
responsibility to read the count often enough to avoid having
the count roll over _twice_ between reads.
The current code hardcodes the bandwidth monitoring counter's width
to 24 bits for AMD. This is due to default base counter width which
is 24. Currently, AMD does not implement the CPUID 0xF.[ECX=1]:EAX
to adjust the counter width. But, the AMD hardware supports much
wider bandwidth counter with the default width of 44 bits.
Kernel reads these monitoring counters every 1 second and adjusts the
counter value for overflow. With 24 bits and scale value of 64 for AMD,
it can only measure up to 1GB/s without overflowing. For the rates
above 1GB/s this will fail to measure the bandwidth.
Fix the issue setting the default width to 44 bits by adjusting the
offset.
AMD future products will implement CPUID 0xF.[ECX=1]:EAX.
[ bp: Let the line stick out and drop {}-brackets around a single
statement. ]
Fixes: 4d05bf71f1 ("x86/resctrl: Introduce AMD QOS feature")
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/159129975546.62538.5656031125604254041.stgit@naples-babu.amd.com
fix error "clock source 41 is not valid, cannot use"
[] New USB device found, idVendor=154e, idProduct=1002, bcdDevice= 1.00
[] New USB device strings: Mfr=1, Product=2, SerialNumber=0
[] Product: DCD-1500RE
[] Manufacturer: D & M Holdings Inc.
[]
[] clock source 41 is not valid, cannot use
[] usbcore: registered new interface driver snd-usb-audio
Signed-off-by: Yick W. Tse <y_w_tse@yahoo.com.hk>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1373857985.210365.1592048406997@mail.yahoo.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
DMA_REMAP is an unnecessary requirement for AMD SEV, which requires
DMA_COHERENT_POOL, so avoid selecting it when it is otherwise unnecessary.
The only other requirement for DMA coherent pools is DMA_DIRECT_REMAP, so
ensure that properly selects the config option when needed.
Fixes: 82fef0ad81 ("x86/mm: unencrypted non-blocking DMA allocations use coherent pools")
Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: David Rientjes <rientjes@google.com>
Tested-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The crypto algorithms selected by the ESP and AH kconfig options are
out-of-date with the guidance of RFC 8221, which lists the legacy
algorithms MD5 and DES as "MUST NOT" be implemented, and some more
modern algorithms like AES-GCM and HMAC-SHA256 as "MUST" be implemented.
But the options select the legacy algorithms, not the modern ones.
Therefore, modify these options to select the MUST algorithms --
and *only* the MUST algorithms.
Also improve the help text.
Note that other algorithms may still be explicitly enabled in the
kconfig, and the choice of which to actually use is still controlled by
userspace. This change only modifies the list of algorithms for which
kernel support is guaranteed to be present.
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Suggested-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Corentin Labbe <clabbe@baylibre.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Commit f23efcbcc5 ("crypto: ctr - no longer needs CRYPTO_SEQIV") made
CRYPTO_CTR stop selecting CRYPTO_SEQIV. This breaks IPsec for most
users since GCM and several other encryption algorithms require "seqiv"
-- and RFC 8221 lists AES-GCM as "MUST" be implemented.
Just make XFRM_ESP select CRYPTO_SEQIV.
Fixes: f23efcbcc5 ("crypto: ctr - no longer needs CRYPTO_SEQIV")
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Corentin Labbe <clabbe@baylibre.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Commit
10e68b02c8 ("Makefile: support compressed debug info")
added support for compressed debug sections.
Support is detected by checking
- does the compiler support -gz=zlib
- does the assembler support --compressed-debug-sections=zlib
- does the linker support --compressed-debug-sections=zlib
However, the gcc driver's support for this option is somewhat
convoluted. The driver's builtin specs are set based on the version of
binutils that it was configured with. It reports an error if the
configure-time linker/assembler (i.e., not necessarily the actual
assembler that will be run) do not support the option, but only if the
assembler (or linker) is actually invoked when -gz=zlib is passed.
The cc-option check in scripts/Kconfig.include does not invoke the
assembler, so the gcc driver reports success even if it does not support
the option being passed to the assembler.
Because the as-option check passes the option directly to the assembler
via -Wa,--compressed-debug-sections=zlib, the gcc driver does not see
this option and will never report an error.
Combined with an installed version of binutils that is more recent than
the one the compiler was built with, it is possible for all three tests
to succeed, yet an actual compilation with -gz=zlib to fail.
Moreover, it is unnecessary to explicitly pass
--compressed-debug-sections=zlib to the assembler via -Wa, since the
driver will do that automatically when it supports -gz=zlib.
Convert the as-option to just -gz=zlib, simplifying it as well as
performing a better test of the gcc driver's capabilities.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
scripts/checkpatch.pl reports following warning for patch
("bcache: check and adjust logical block size for backing devices"),
WARNING: quoted string split across lines
#146: FILE: drivers/md/bcache/super.c:896:
+ pr_info("%s: sb/logical block size (%u) greater than page size "
+ "(%lu) falling back to device logical block size (%u)",
There are two things to fix up,
- The kernel message print should be in a single line.
- pr_info() won't automatically add new line since v5.8, a '\n' should
be added.
This patch just does the above cleanup in bcache_device_init().
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch changes the asynchronous registration kworker to a delayed
kworker. There is probability queue_work() queues the async registration
kworker to the same CPU (even though very little), then the process
which writing sysfs interface to reigster bcache device may won't return
immeidately. queue_delayed_work() in this patch will delay 10 jiffies
before insert the kworker to run queue, which makes sure the registering
process may always returns to user space in time.
Fixes: 9e23ccf8f0 ("bcache: asynchronous devices registration")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It's possible for a block driver to set logical block size to
a value greater than page size incorrectly; e.g. bcache takes
the value from the superblock, set by the user w/ make-bcache.
This causes a BUG/NULL pointer dereference in the path:
__blkdev_get()
-> set_init_blocksize() // set i_blkbits based on ...
-> bdev_logical_block_size()
-> queue_logical_block_size() // ... this value
-> bdev_disk_changed()
...
-> blkdev_readpage()
-> block_read_full_page()
-> create_page_buffers() // size = 1 << i_blkbits
-> create_empty_buffers() // give size/take pointer
-> alloc_page_buffers() // return NULL
.. BUG!
Because alloc_page_buffers() is called with size > PAGE_SIZE,
thus it initializes head = NULL, skips the loop, return head;
then create_empty_buffers() gets (and uses) the NULL pointer.
This has been around longer than commit ad6bf88a6c ("block:
fix an integer overflow in logical block size"); however, it
increased the range of values that can trigger the issue.
Previously only 8k/16k/32k (on x86/4k page size) would do it,
as greater values overflow unsigned short to zero, and queue_
logical_block_size() would then use the default of 512.
Now the range with unsigned int is much larger, and users w/
the 512k value, which happened to be zero'ed previously and
work fine, started to hit this issue -- as the zero is gone,
and queue_logical_block_size() does return 512k (>PAGE_SIZE.)
Fix this by checking the bcache device's logical block size,
and if it's greater than page size, fallback to the backing/
cached device's logical page size.
This doesn't affect cache devices as those are still checked
for block/page size in read_super(); only the backing/cached
devices are not.
Apparently it's a regression from commit 2903381fce ("bcache:
Take data offset from the bdev superblock."), moving the check
into BCACHE_SB_VERSION_CDEV only. Now that we have superblocks
of backing devices out there with this larger value, we cannot
refuse to load them (i.e., have a similar check in _BDEV.)
Ideally perhaps bcache should use all values from the backing
device (physical/logical/io_min block size)? But for now just
fix the problematic case.
Test-case:
# IMG=/root/disk.img
# dd if=/dev/zero of=$IMG bs=1 count=0 seek=1G
# DEV=$(losetup --find --show $IMG)
# make-bcache --bdev $DEV --block 8k
< see dmesg >
Before:
# uname -r
5.7.0-rc7
[ 55.944046] BUG: kernel NULL pointer dereference, address: 0000000000000000
...
[ 55.949742] CPU: 3 PID: 610 Comm: bcache-register Not tainted 5.7.0-rc7 #4
...
[ 55.952281] RIP: 0010:create_empty_buffers+0x1a/0x100
...
[ 55.966434] Call Trace:
[ 55.967021] create_page_buffers+0x48/0x50
[ 55.967834] block_read_full_page+0x49/0x380
[ 55.972181] do_read_cache_page+0x494/0x610
[ 55.974780] read_part_sector+0x2d/0xaa
[ 55.975558] read_lba+0x10e/0x1e0
[ 55.977904] efi_partition+0x120/0x5a6
[ 55.980227] blk_add_partitions+0x161/0x390
[ 55.982177] bdev_disk_changed+0x61/0xd0
[ 55.982961] __blkdev_get+0x350/0x490
[ 55.983715] __device_add_disk+0x318/0x480
[ 55.984539] bch_cached_dev_run+0xc5/0x270
[ 55.986010] register_bcache.cold+0x122/0x179
[ 55.987628] kernfs_fop_write+0xbc/0x1a0
[ 55.988416] vfs_write+0xb1/0x1a0
[ 55.989134] ksys_write+0x5a/0xd0
[ 55.989825] do_syscall_64+0x43/0x140
[ 55.990563] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 55.991519] RIP: 0033:0x7f7d60ba3154
...
After:
# uname -r
5.7.0.bcachelbspgsz
[ 31.672460] bcache: bcache_device_init() bcache0: sb/logical block size (8192) greater than page size (4096) falling back to device logical block size (512)
[ 31.675133] bcache: register_bdev() registered backing device loop0
# grep ^ /sys/block/bcache0/queue/*_block_size
/sys/block/bcache0/queue/logical_block_size:512
/sys/block/bcache0/queue/physical_block_size:8192
Reported-by: Ryan Finnie <ryan@finnie.org>
Reported-by: Sebastian Marsching <sebastian@marsching.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
coccicheck reports:
drivers/md//bcache/btree.c:1538:1-7: preceding lock on line 1417
In btree_gc_coalesce func, if the coalescing process fails, we will goto
to out_nocoalesce tag directly without releasing new_nodes[i]->write_lock.
Then, it will cause a deadlock when trying to acquire new_nodes[i]->
write_lock for freeing new_nodes[i] before return.
btree_gc_coalesce func details as follows:
if alloc new_nodes[i] fails:
goto out_nocoalesce;
// obtain new_nodes[i]->write_lock
mutex_lock(&new_nodes[i]->write_lock)
// main coalescing process
for (i = nodes - 1; i > 0; --i)
[snipped]
if coalescing process fails:
// Here, directly goto out_nocoalesce
// tag will cause a deadlock
goto out_nocoalesce;
[snipped]
// release new_nodes[i]->write_lock
mutex_unlock(&new_nodes[i]->write_lock)
// coalesing succ, return
return;
out_nocoalesce:
btree_node_free(new_nodes[i]) // free new_nodes[i]
// obtain new_nodes[i]->write_lock
mutex_lock(&new_nodes[i]->write_lock);
// set flag for reuse
clear_bit(BTREE_NODE_dirty, &ew_nodes[i]->flags);
// release new_nodes[i]->write_lock
mutex_unlock(&new_nodes[i]->write_lock);
To fix the problem, we add a new tag 'out_unlock_nocoalesce' for
releasing new_nodes[i]->write_lock before out_nocoalesce tag. If
coalescing process fails, we will go to out_unlock_nocoalesce tag
for releasing new_nodes[i]->write_lock before free new_nodes[i] in
out_nocoalesce tag.
(Coly Li helps to clean up commit log format.)
Fixes: 2a285686c1 ("bcache: btree locking rework")
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The NSP SoC includes an SP804 timer so should be enabled here.
Fixes: a0efb0d28b ("ARM: dts: NSP: Add SP804 Support to DT")
Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Add lock protection from race conditions to the 104-quad-8 counter
driver for filter clock prescaler code changes. Mutex calls used for
protection.
Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
Fixes: de65d05563 ("counter: 104-quad-8: Support Filter Clock Prescaler")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Add lock protection from race conditions to 104-quad-8 counter driver
for differential encoder status code changes. Mutex lock calls used for
protection.
Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
Fixes: 954ab5cc5f ("counter: 104-quad-8: Support Differential Encoder Cable Status")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
When devm_regmap_init_i2c() returns an error code, a pairing
runtime PM usage counter decrement is needed to keep the
counter balanced. For error paths after ak8974_set_power(),
ak8974_detect() and ak8974_reset(), things are the same.
However, When iio_triggered_buffer_setup() returns an error
code, there will be two PM usgae counter decrements.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Fixes: 7c94a8b2ee ("iio: magn: add a driver for AK8974")
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The function iio_device_register() was called in mma8452_probe().
But the function iio_device_unregister() was not called after
a call of the function mma8452_set_freefall_mode() failed.
Thus add the missed function call for one error case.
Fixes: 1a965d405f ("drivers:iio:accel:mma8452: added cleanup provision in case of failure.")
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses a 40 byte array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data with alignment
explicitly requested. This data is allocated with kzalloc so no
data can leak appart from previous readings.
Fixes: 87aec56e27 ("iio: health: Add driver for the TI AFE4404 heart monitor")
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses a 32 byte array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data with alignment
explicitly requested. This data is allocated with kzalloc so no
data can leak appart from previous readings.
Fixes: eec96d1e2d ("iio: health: Add driver for the TI AFE4403 heart monitor")
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Andrew F. Davis <afd@ti.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
This week I started seeing GPU crashes on my DragonBoard 845c
which I narrowed down to being caused by commit ccac7ce373
("drm/msm: Refactor address space initialization").
Looking through the patch, Jordan and I couldn't find anything
obviously wrong, so I ended up breaking that change up into a
number of smaller logical steps so I could figure out which part
was causing the trouble.
Ends up, visually counting 'f's is hard, esp across a number
of lines:
0xfffffff != 0xffffffff
This patch corrects the end value we pass in to
msm_gem_address_space_create() in
adreno_iommu_create_address_space() so that it matches the value
used before the problematic patch landed.
With this change, I no longer see the GPU crashes that were
affecting me.
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Jordan Crouse <jcrouse@codeaurora.org>
Cc: freedreno@lists.freedesktop.org
Fixes: ccac7ce373 ("drm/msm: Refactor address space initialization")
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
The argument passed to cmpxchg is not guaranteed to be sign
extended, but lr.w sign extends on RV64I. This makes cmpxchg
fail on clang built kernels when __old is negative.
To fix this, we just cast __old to long which sign extends on
RV64I. With this fix, clang built RISC-V kernels now boot.
Link: https://github.com/ClangBuiltLinux/linux/issues/867
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
In the ext4 filesystem with errors=panic, if one process is recording
errno in the superblock when invoking jbd2_journal_abort() due to some
error cases, it could be raced by another __ext4_abort() which is
setting the SB_RDONLY flag but missing panic because errno has not been
recorded.
jbd2_journal_commit_transaction()
jbd2_journal_abort()
journal->j_flags |= JBD2_ABORT;
jbd2_journal_update_sb_errno()
| ext4_journal_check_start()
| __ext4_abort()
| sb->s_flags |= SB_RDONLY;
| if (!JBD2_REC_ERR)
| return;
journal->j_flags |= JBD2_REC_ERR;
Finally, it will no longer trigger panic because the filesystem has
already been set read-only. Fix this by introduce j_abort_mutex to make
sure journal abort is completed before panic, and remove JBD2_REC_ERR
flag.
Fixes: 4327ba52af ("ext4, jbd2: ensure entering into panic after recording an error in superblock")
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200609073540.3810702-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
After looking up for the flowtable hooks that need to be removed,
release the hook objects in the deletion list. The error path needs to
released these hook objects too.
Fixes: abadb2f865 ("netfilter: nf_tables: delete devices from flowtable")
Reported-by: syzbot+eb9d5924c51d6d59e094@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Commit 99f75ce666 ("regulator: da9063: fix suspend") converted
the regulators to use a common (corrected) suspend bit setting but
one of regulators (LDO9) slipped through the crack.
This means that the original problem was not fixed for LDO9 and
also leads to a warning found by the test robot.
da9063-regulator.c:515:3: warning: initialized field overwritten
Fix this by converting that regulator too like the others.
Fixes: 99f75ce666 ("regulator: da9063: fix suspend")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Link: https://lore.kernel.org/r/1591959073-16792-1-git-send-email-martin.fuzzey@flowbird.group
Signed-off-by: Mark Brown <broonie@kernel.org>
With EDMA, there is two dma channels can be used for dev_to_dev,
one is from ASRC, one is from another peripheral (ESAI or SAI).
If we select the dma channel of ASRC, there is an issue for ideal
ratio case, the speed of copy data is faster than sample
frequency, because ASRC output data is very fast in ideal ratio
mode.
So it is reasonable to use the dma channel of Back-End peripheral.
then copying speed of DMA is controlled by data consumption
speed in the peripheral FIFO,
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>
Link: https://lore.kernel.org/r/424ed6c249bafcbe30791c9de0352821c5ea67e2.1591947428.git.shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The dma channel has been requested by Back-End cpu dai driver already.
If fsl_asrc_dma requests dma chan with same dma:tx symlink, then
there will be below warning with SDMA.
[ 48.174236] fsl-esai-dai 2024000.esai: Cannot create DMA dma:tx symlink
So if we can reuse the dma channel of Back-End, then the issue can be
fixed.
In order to get the dma channel which is already requested in Back-End.
we use the exported two functions (snd_soc_lookup_component_nolocked
and soc_component_to_pcm). If we can get the dma channel, then reuse it,
if can't, then request a new one.
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>
Link: https://lore.kernel.org/r/3a79f0442cb4930c633cf72145cfe95a45b9c78e.1591947428.git.shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
In the current implementation, mutex initialization
for encoder mutex locks are done during encoder
setup. This can lead to scenarios where the lock
is used before it is initialized. Move mutex_init
to dpu_encoder_init to avoid this.
Signed-off-by: Krishna Manikandan <mkrishn@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Fix to return negative error code -ENOMEM with the use of
ERR_PTR from dpu_encoder_init.
Fixes: 25fdd5933e ("drm/msm: Add SDM845 DPU support")
Signed-off-by: Chen Tao <chentao107@huawei.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
In function msm_submitqueue_create, the queue is a local
variable, in return -EINVAL branch, queue didn`t add to ctx`s
list yet, and also didn`t kfree, this maybe bring in potential
memleak.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
[trivial commit msg fixup]
Signed-off-by: Rob Clark <robdclark@chromium.org>
Request for color processing blocks only if they are
available in the display hw catalog and they are
sufficient in number for the selection.
Signed-off-by: Kalyan Thota <kalyan_t@codeaurora.org>
Fixes: e47616df00 ("drm/msm/dpu: add support for color processing
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
When we want to use float point operation on Linux
we need to use within special kernel protection
(`kernel_fpu_{begin,end}()`.), otherwise the kernel
can clobber userspace FPU register state. For detecting
these issues we use a tool named objtool (with -Ffa
flags) to highlight the FPU problems, all warnings can
be summed up as follows:
./tools/objtool/objtool check -Ffa
drivers/gpu/drm/amd/display/dc/dml/dml_common_defs.o
[..] dc/dsc/rc_calc.o: warning: objtool: get_qp_set()+0x2f8:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: dsc_roundf()+0x5:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: dsc_ceil()+0x5:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: get_ofs_set()+0x3eb:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: calc_rc_params()+0x3c:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/dc_dsc.o: warning: objtool:
get_dsc_bandwidth_range.isra.0()+0x8d:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/dc_dsc.o: warning: objtool: setup_dsc_config()+0x2ef:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc_dpi.o: warning: objtool:copy_pps_fields()+0xbb:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc_dpi.o: warning: objtool:
dscc_compute_dsc_parameters()+0x7b:
FPU instruction outside of kernel_fpu_{begin,end}()
This commit fixes the above issues by rework DSC as described:
1. Isolate all FPU operations in a single file;
2. Use FPU flags only in the file that handles FPU operations;
3. Isolate all functions that require float point operation in static
functions;
4. Add a mid-layer function that does not use any float point operation,
and that could be safely invoked in other parts of the code.
5. Keep float point operation under DC_FP_{START/END} macro.
CC: Christian König <christian.koenig@amd.com>
CC: Alexander Deucher <Alexander.Deucher@amd.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Tony Cheng <tony.cheng@amd.com>
CC: Harry Wentland <hwentlan@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Initializes Powertune data for a specific Hawaii card by fixing what
looks like a typo in the code. The device ID 66B1 is not a supported
device ID for this driver, and is not mentioned elsewhere. 67B1 is a
valid device ID, and is a Hawaii Pro GPU.
I have tested on my R9 390 which has device ID 67B1, and it works
fine without problems.
Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Driver allocates DMA memory with dma_alloc_coherent() but frees it with
dma_unmap_single().
This causes DMA warning during system shutdown (with DMA debugging) on
Toradex Colibri VF50 module:
WARNING: CPU: 0 PID: 1 at ../kernel/dma/debug.c:1036 check_unmap+0x3fc/0xb04
DMA-API: fsl-edma 40098000.dma-controller: device driver frees DMA memory with wrong function
[device address=0x0000000087040000] [size=8 bytes] [mapped as coherent] [unmapped as single]
Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
(unwind_backtrace) from [<8010bb34>] (show_stack+0x10/0x14)
(show_stack) from [<8011ced8>] (__warn+0xf0/0x108)
(__warn) from [<8011cf64>] (warn_slowpath_fmt+0x74/0xb8)
(warn_slowpath_fmt) from [<8017d170>] (check_unmap+0x3fc/0xb04)
(check_unmap) from [<8017d900>] (debug_dma_unmap_page+0x88/0x90)
(debug_dma_unmap_page) from [<80601d68>] (dspi_release_dma+0x88/0x110)
(dspi_release_dma) from [<80601e4c>] (dspi_shutdown+0x5c/0x80)
(dspi_shutdown) from [<805845f8>] (device_shutdown+0x17c/0x220)
(device_shutdown) from [<80143ef8>] (kernel_restart+0xc/0x50)
(kernel_restart) from [<801441cc>] (__do_sys_reboot+0x18c/0x210)
(__do_sys_reboot) from [<80100060>] (ret_fast_syscall+0x0/0x28)
DMA-API: Mapped at:
dma_alloc_attrs+0xa4/0x130
dspi_probe+0x568/0x7b4
platform_drv_probe+0x6c/0xa4
really_probe+0x208/0x348
driver_probe_device+0x5c/0xb4
Fixes: 90ba37033c ("spi: spi-fsl-dspi: Add DMA support for Vybrid")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1591803717-11218-1-git-send-email-krzk@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Add SPI_TX_OCTAL and SPI_RX_OCTAL to fix the following build errors:
CC spidev_test.o
spidev_test.c: In function ‘transfer’:
spidev_test.c:131:13: error: ‘SPI_TX_OCTAL’ undeclared (first use in this function)
if (mode & SPI_TX_OCTAL)
^
spidev_test.c:131:13: note: each undeclared identifier is reported only once for each function it appears in
spidev_test.c:137:13: error: ‘SPI_RX_OCTAL’ undeclared (first use in this function)
if (mode & SPI_RX_OCTAL)
^
spidev_test.c: In function ‘parse_opts’:
spidev_test.c:290:12: error: ‘SPI_TX_OCTAL’ undeclared (first use in this function)
mode |= SPI_TX_OCTAL;
^
spidev_test.c:308:12: error: ‘SPI_RX_OCTAL’ undeclared (first use in this function)
mode |= SPI_RX_OCTAL;
^
LD spidev_test-in.o
ld: cannot find spidev_test.o: No such file or directory
Additionally, maybe SPI_CS_WORD and SPI_3WIRE_HIZ will be used in the future,
so add them too.
Fixes: 896fa73508 ("spi: spidev_test: Add support for Octal mode data transfers")
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/1591880212-13479-2-git-send-email-zhangqing@loongson.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
Fix the following sparse warning:
./spidev_test.c:50:9: warning: symbol 'default_tx' was not declared. Should it be static?
./spidev_test.c:59:9: warning: symbol 'default_rx' was not declared. Should it be static?
./spidev_test.c:60:6: warning: symbol 'input_tx' was not declared. Should it be static?
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Link: https://lore.kernel.org/r/1591880212-13479-1-git-send-email-zhangqing@loongson.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
If the dentry name passed to ->d_compare() fits in dentry::d_iname, then
it may be concurrently modified by a rename. This can cause undefined
behavior (possibly out-of-bounds memory accesses or crashes) in
utf8_strncasecmp(), since fs/unicode/ isn't written to handle strings
that may be concurrently modified.
Fix this by first copying the filename to a stack buffer if needed.
This way we get a stable snapshot of the filename.
Fixes: b886ee3e77 ("ext4: Support case-insensitive file name lookups")
Cc: <stable@vger.kernel.org> # v5.2+
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Daniel Rosenberg <drosen@google.com>
Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20200601200543.59417-1-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Successful send of EOS command does not indicate that EOS is actually
finished, correct event to wait EOS is finished is EOS_RENDERED event.
EOS_RENDERED means that the DSP has finished processing all the buffers
for that particular session and stream.
This patch fixes EOS handling!
Fixes: 68fd8480bb ("ASoC: qdsp6: q6asm: Add support to audio stream apis")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200611124159.20742-3-srinivas.kandagatla@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Fix the bug when calculating the physical block number of the first
block in the split extent.
This bug will cause xfstests shared/298 failure on ext4 with bigalloc
enabled occasionally. Ext4 error messages indicate that previously freed
blocks are being freed again, and the following fsck will fail due to
the inconsistency of block bitmap and bg descriptor.
The following is an example case:
1. First, Initialize a ext4 filesystem with cluster size '16K', block size
'4K', in which case, one cluster contains four blocks.
2. Create one file (e.g., xxx.img) on this ext4 filesystem. Now the extent
tree of this file is like:
...
36864:[0]4:220160
36868:[0]14332:145408
51200:[0]2:231424
...
3. Then execute PUNCH_HOLE fallocate on this file. The hole range is
like:
..
ext4_ext_remove_space: dev 254,16 ino 12 since 49506 end 49506 depth 1
ext4_ext_remove_space: dev 254,16 ino 12 since 49544 end 49546 depth 1
ext4_ext_remove_space: dev 254,16 ino 12 since 49605 end 49607 depth 1
...
4. Then the extent tree of this file after punching is like
...
49507:[0]37:158047
49547:[0]58:158087
...
5. Detailed procedure of punching hole [49544, 49546]
5.1. The block address space:
```
lblk ~49505 49506 49507~49543 49544~49546 49547~
---------+------+-------------+----------------+--------
extent | hole | extent | hole | extent
---------+------+-------------+----------------+--------
pblk ~158045 158046 158047~158083 158084~158086 158087~
```
5.2. The detailed layout of cluster 39521:
```
cluster 39521
<------------------------------->
hole extent
<----------------------><--------
lblk 49544 49545 49546 49547
+-------+-------+-------+-------+
| | | | |
+-------+-------+-------+-------+
pblk 158084 1580845 158086 158087
```
5.3. The ftrace output when punching hole [49544, 49546]:
- ext4_ext_remove_space (start 49544, end 49546)
- ext4_ext_rm_leaf (start 49544, end 49546, last_extent [49507(158047), 40], partial [pclu 39522 lblk 0 state 2])
- ext4_remove_blocks (extent [49507(158047), 40], from 49544 to 49546, partial [pclu 39522 lblk 0 state 2]
- ext4_free_blocks: (block 158084 count 4)
- ext4_mballoc_free (extent 1/6753/1)
5.4. Ext4 error message in dmesg:
EXT4-fs error (device vdb): mb_free_blocks:1457: group 1, block 158084:freeing already freed block (bit 6753); block bitmap corrupt.
EXT4-fs error (device vdb): ext4_mb_generate_buddy:747: group 1, block bitmap and bg descriptor inconsistent: 19550 vs 19551 free clusters
In this case, the whole cluster 39521 is freed mistakenly when freeing
pblock 158084~158086 (i.e., the first three blocks of this cluster),
although pblock 158087 (the last remaining block of this cluster) has
not been freed yet.
The root cause of this isuue is that, the pclu of the partial cluster is
calculated mistakenly in ext4_ext_remove_space(). The correct
partial_cluster.pclu (i.e., the cluster number of the first block in the
next extent, that is, lblock 49597 (pblock 158086)) should be 39521 rather
than 39522.
Fixes: f4226d9ea4 ("ext4: fix partial cluster initialization")
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Eric Whitney <enwlinux@gmail.com>
Cc: stable@kernel.org # v3.19+
Link: https://lore.kernel.org/r/1590121124-37096-1-git-send-email-jefflexu@linux.alibaba.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Trying to change dax mount options when remounting could allow mount
options to be enabled for a small amount of time, and then the mount
option change would be reverted.
In the case of "mount -o remount,dax", this can cause a race where
files would temporarily treated as DAX --- and then not.
Cc: stable@kernel.org
Reported-by: syzbot+bca9799bf129256190da@syzkaller.appspotmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This adds the same per-file/per-directory DAX support for ext4 as was
done for xfs, now that we finally have consensus over what the
interface should be.
Clang's static analysis tool reports these double free memory errors.
security/selinux/ss/services.c:2987:4: warning: Attempt to free released memory [unix.Malloc]
kfree(bnames[i]);
^~~~~~~~~~~~~~~~
security/selinux/ss/services.c:2990:2: warning: Attempt to free released memory [unix.Malloc]
kfree(bvalues);
^~~~~~~~~~~~~~
So improve the security_get_bools error handling by freeing these variables
and setting their return pointers to NULL and the return len to 0
Cc: stable@vger.kernel.org
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
This reverts commit 636338d796.
This patch is not a proper fixes the i2c2 timeouts are still
happening in some cases.
Signed-off-by: Tony Lindgren <tony@atomide.com>
LCD timings now come from panel-simple. Having timings in the DT will
cause a WARN.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
LCD timings now come from panel-simple. Having timings in the DT will
cause a WARN.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Use kfree() instead of kvfree() to free rgb_user in
calculate_user_regamma_ramp() because the memory is allocated with
kcalloc().
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use kvfree() instead of kfree() to free coeff in build_regamma()
because the memory is allocated with kvzalloc().
Fixes: e752058b86 ("drm/amd/display: Optimize gamma calculations")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
spi@13000: clock-names: Additional items are not allowed ('pclk' was unexpected)
spi@13000: clock-names: ['core', 'pclk'] is too long
spi@13000: clocks: [[2, 23], [2, 258]] is too long
spi@15000: clock-names: Additional items are not allowed ('pclk' was unexpected)
spi@15000: clock-names: ['core', 'pclk'] is too long
spi@15000: clocks: [[2, 29], [2, 261]] is too long
Conditional schema properties don't overwrite others. Instead of
restrictions have to be validated. So general clock amount is 1-2 and
depending on the actual device type limit the mount to 1 or 2.
Signed-off-by: Alexander Stein <alexander.stein@mailbox.org>
Link: https://lore.kernel.org/r/20200609165527.55183-1-alexander.stein@mailbox.org
Signed-off-by: Mark Brown <broonie@kernel.org>
LCD timings now come from panel-simple. Having timings in the DT will
cause a WARN.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
To pick the changes in:
f97f5a56f5 ("x86/kvm/hyper-v: Add support for synthetic debugger interface")
850448f35a ("KVM: nVMX: Fix VMX preemption timer migration")
2c4c413255 ("KVM: x86: Print symbolic names of VMX VM-Exit flags in traces")
cc440cdad5 ("KVM: nSVM: implement KVM_GET_NESTED_STATE and KVM_SET_NESTED_STATE")
f7d31e6536 ("x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit")
72de5fa4c1 ("KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT")
acd05785e4 ("kvm: add capability for halt polling")
3ecad8c2c1 ("docs: fix broken references for ReST files that moved around")
That do not result in any change in tooling, as the additions are not
being used in any table generator.
This silences these perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/vmx.h' differs from latest version at 'arch/x86/include/uapi/asm/vmx.h'
diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Matlack <dmatlack@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jon Doron <arilou@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Shier <pshier@google.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the change in:
4ef10fe05b ("drm/i915/perf: add new open param to configure polling of OA buffer")
11ecbdddf2 ("drm/i915/perf: introduce global sseu pinning")
That don't result in any changes in tooling, just silences this perf
build warning:
Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
e3b1078bed ("fscrypt: add support for IV_INO_LBLK_32 policies")
That don't trigger any changes in tooling.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fscrypt.h' differs from latest version at 'include/uapi/linux/fscrypt.h'
diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
80340fe360 ("statx: add mount_root")
fa2fcf4f1d ("statx: add mount ID")
581701b7ef ("uapi: deprecate STATX_ALL")
712b2698e4 ("fs/stat: Define DAX statx attribute")
These add some constants that will have to be manually added in a
followup cset, at some point this should move to the shell based
automated way.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/stat.h' differs from latest version at 'include/uapi/linux/stat.h'
diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the change in:
700d3a5a66 ("x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"")
That doesn't trigger any changes in tooling and silences this perf build
warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/unistd.h' differs from latest version at 'arch/x86/include/uapi/asm/unistd.h'
diff -u tools/arch/x86/include/uapi/asm/unistd.h arch/x86/include/uapi/asm/unistd.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the copies of files affected by:
c8ffd8bcdd ("vfs: add faccessat2 syscall")
To address this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fcntl.h' differs from latest version at 'include/uapi/linux/fcntl.h'
diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Which results in 'perf trace' gaining support for the 'faccessat2'
syscall, now one can use:
# perf trace -e faccessat2
And have system wide tracing of this syscall. And this also will include
it;
# perf trace -e faccess*
Together with the other variants.
How it affects building/usage (on an x86_64 system):
$ cp /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp/syscalls_64.c.before
$
[root@five ~]# perf trace -e faccessat2
event syntax error: 'faccessat2'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
[root@five ~]#
$ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
$ git diff
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index 37b844f839bc..78847b32e137 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -359,6 +359,7 @@
435 common clone3 sys_clone3
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
+439 common faccessat2 sys_faccessat2
#
# x32-specific system call numbers start at 512 to avoid cache impact
$
$ make -C tools/perf O=/tmp/build/perf/ install-bin
<SNIP>
CC /tmp/build/perf/util/syscalltbl.o
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
[root@five ~]# perf trace -e faccessat2
^C[root@five ~]#
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Two symbols need to be exported to allow the zynqmp-fpga module
to get loaded dynamically:
ERROR: modpost: "zynqmp_pm_fpga_load" [drivers/fpga/zynqmp-fpga.ko] undefined!
ERROR: modpost: "zynqmp_pm_fpga_get_status" [drivers/fpga/zynqmp-fpga.ko] undefined!
To ensure this is done correctly, also fix the Kconfig dependency
to only allow building the fpga driver when the firmware driver is
either disabled, or when it is reachable. With that, the dependency
on the SoC itself can be removed, and there are no surprises when
the fpga driver is built-in but the firmware a module.
Fixes: 4db8180ffe ("firmware: xilinx: Remove eemi ops for fpga related APIs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
On systems with at least 32 MiB, but less than 32 GiB of RAM, the DMA
memory pools are much larger than intended (e.g. 2 MiB instead of 128
KiB on a 256 MiB system).
Fix this by correcting the calculation of the number of GiBs of RAM in
the system. Invert the order of the min/max operations, to keep on
calculating in pages until the last step, which aids readability.
Fixes: 1d659236fb ("dma-pool: scale the default DMA coherent pool size with memory capacity")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Currently, the RSPI driver always tries to use the maximum configured
bit rate for communicating with a slave device, even if the transfer(s)
in the current message specify a lower rate.
Use the mininum rate specified in the message instead.
Rename rspi_data.max_speed_hz accordingly.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20200608095940.30516-3-geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
While checking the validity of insertion in __nft_rbtree_insert(),
we currently ignore conflicting elements and intervals only if they
are not active within the next generation.
However, if we consider expired elements and intervals as
potentially conflicting and overlapping, we'll return error for
entries that should be added instead. This is particularly visible
with garbage collection intervals that are comparable with the
element timeout itself, as reported by Mike Dillinger.
Other than the simple issue of denying insertion of valid entries,
this might also result in insertion of a single element (opening or
closing) out of a given interval. With single entries (that are
inserted as intervals of size 1), this leads in turn to the creation
of new intervals. For example:
# nft add element t s { 192.0.2.1 }
# nft list ruleset
[...]
elements = { 192.0.2.1-255.255.255.255 }
Always ignore expired elements active in the next generation, while
checking for conflicts.
It might be more convenient to introduce a new macro that covers
both inactive and expired items, as this type of check also appears
quite frequently in other set back-ends. This is however beyond the
scope of this fix and can be deferred to a separate patch.
Other than the overlap detection cases introduced by commit
7c84d41416 ("netfilter: nft_set_rbtree: Detect partial overlaps
on insertion"), we also have to cover the original conflict check
dealing with conflicts between two intervals of size 1, which was
introduced before support for timeout was introduced. This won't
return an error to the user as -EEXIST is masked by nft if
NLM_F_EXCL is not given, but would result in a silent failure
adding the entry.
Reported-by: Mike Dillinger <miked@softtalker.com>
Cc: <stable@vger.kernel.org> # 5.6.x
Fixes: 8d8540c4f5 ("netfilter: nft_set_rbtree: add timeout support")
Fixes: 7c84d41416 ("netfilter: nft_set_rbtree: Detect partial overlaps on insertion")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The mailbox nodes defined in various dts files have been moved to
common dra7-ipu-dsp-common.dtsi and dra74-ipu-dsp-common.dtsi files
in commit a11a2f73b3 ("ARM: dts: dra7-ipu-dsp-common: Move mailboxes
into common files"), but the nodes were erroneously left out in the
dra7-evm-common.dtsi file. Fix this by removing these duplicate nodes.
Fixes: a11a2f73b3 ("ARM: dts: dra7-ipu-dsp-common: Move mailboxes into common files")
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The commit 5390130f3b ("ARM: dts: dra7: add timer_sys_ck entries
for IPU/DSP timers") was added to allow the OMAP clocksource timer
driver to use the clock aliases when reconfiguring the parent clock
source for the timer functional clocks after the timer_sys_ck clock
aliases got cleaned up in commit a8202cd517 ("clk: ti: dra7: drop
unnecessary clock aliases").
The above patch however has missed adding the entries for couple of
timers (14, 15 and 16), and also added erroneously in the parent
ti-sysc nodes for couple of clocks (timers 4, 5 and 6). Fix these
properly, so that any of these timers can be used with OMAP remoteproc
IPU and DSP devices. The always-on timers 1 and 12 are not expected
to use this clock source, so they are not modified.
Fixes: 5390130f3b ("ARM: dts: dra7: add timer_sys_ck entries for IPU/DSP timers")
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
AM335x TRM: Figure 16-23 define sysconfig register and soft_reset
are in first position corresponding to SYSC_OMAP4_SOFTRESET defined
in ti-sysc.h.
Fixes: 0782e8572c ("ARM: dts: Probe am335x musb with ti-sysc")
Signed-off-by: Oskar Holmlund <oskar@ohdata.se>
Signed-off-by: Tony Lindgren <tony@atomide.com>
AM335x TRM: Table 2-1 defines USBSS - USB Queue Manager in memory region
0x4740 0000 to 0x4740 7FFF.
Looks like the older TRM revisions list the range from 0x5000 to 0x8000
as reserved.
Fixes: 0782e8572c ("ARM: dts: Probe am335x musb with ti-sysc")
Signed-off-by: Oskar Holmlund <oskar@ohdata.se>
[tony@atomide.com: updated comments]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Consistently use %u to format unsigned numbers.
For "bits" this doesn't matter that much, as it is "uint8_t".
However, "speed" is "uint32_t", so in case people use "-s -1" to force
the maximum, they would see:
max speed: -1 Hz (4294967 KHz)
While at it, use "k" (kilo) instead of "K" (kelvin) in "kHz".
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20200608100049.30648-1-geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
Overwrite hw queue id for non-bufferable management frames if the hw
support always txq (altxq) in order to be in sync with mac txwi code
Fixes: cdad487405 ("mt76: mt7615: add dma and tx queue initialization for MT7622")
Fixes: f40ac0f3d3 ("mt76: mt7615: introduce mt7663e support")
Suggested-by: Felix Fietkau <nbd@nbd.name>
Tested-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
mt7622/mt7663 chipsets rely on a fixed reverse queue map order respect
to mac80211 one:
- q(0): IEEE80211_AC_BK
- q(1): IEEE80211_AC_BE
- q(2): IEEE80211_AC_VI
- q(3): IEEE80211_AC_VO
Fixes: cdad487405 ("mt76: mt7615: add dma and tx queue initialization for MT7622")
Fixes: f40ac0f3d3 ("mt76: mt7615: introduce mt7663e support")
Co-developed-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Co-developed-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
acs and wmm index are swapped in mt7615_queues_acq respect to the hw
design
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Coverage class callback can potentially run in parallel with other
routines (e.g. mt7615_set_channel) that configures timing registers.
Run coverage class callback holding mt76 mutex
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Calling pm_runtime_get_sync increments the counter even in case of
failure, causing incorrect ref count. Call pm_runtime_put if
pm_runtime_get_sync fails.
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit d9258898ad ("arm64: dts: arm: vexpress: Move fixed devices
out of bus node") moved the "mcc" DT node into the root node, because
it does not have any children using "reg" properties, so does violate
some dtc checks about "simple-bus" nodes.
However this broke the vexpress config-bus code, which walks up the
device tree to find the first node with an "arm,vexpress,site" property.
This gave the wrong result (matching the root node instead of the
motherboard node), so broke the clocks and some other devices for
VExpress boards.
Move the whole node back into its original position. This re-introduces
the dtc warning, but is conceptually the right thing to do. The dtc
warning seems to be overzealous here, there are discussions on fixing or
relaxing this check instead.
Link: https://lore.kernel.org/r/20200603162237.16319-1-andre.przywara@arm.com
Fixes: d9258898ad ("arm64: dts: vexpress: Move fixed devices out of bus node")
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
During IPsec performance testing, we see bad ICMP checksum. The error packet
has duplicated ESP trailer due to double validate_xmit_xfrm calls. The first call
is from ip_output, but the packet cannot be sent because
netif_xmit_frozen_or_stopped is true and the packet gets dev_requeue_skb. The second
call is from NET_TX softirq. However after the first call, the packet already
has the ESP trailer.
Fix by marking the skb with XFRM_XMIT bit after the packet is handled by
validate_xmit_xfrm to avoid duplicate ESP trailer insertion.
Fixes: f6e27114a6 ("net: Add a xfrm validate function to validate_xmit_skb")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Otherwise we can get "OCP softreset timed out" warnings occasionally
at least for i2c2 on omap4 now that we check the OCP softreset completed
bit on enable.
Reported-by: Merlijn Wajer <merlijn@wizzup.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
We must check for "dss_core" instead of "dss" to avoid also matching
also "dss_dispc". This only matters for the mixed case of data
configured in device tree but with legacy booting ti,hwmods property
still enabled.
Fixes: 8b30919a4e ("ARM: OMAP2+: Handle reset quirks for dynamically allocated modules")
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
We are currently only setting the framedonetv_irq disabled for the SoCs
that don't have it. But we are never setting it enabled for the SoCs that
have it. Let's initialized it to true by default.
Fixes: 7324a7a0d5 ("bus: ti-sysc: Implement display subsystem reset quirk")
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
We must ignore the clockactivity bit for most modules and not set it
unless specified for the module with SYSC_QUIRK_USE_CLOCKACT. Otherwise
the interface clock can be automatically gated constantly causing
unexpected performance issues.
Fixes: ae9ae12e9d ("bus: ti-sysc: Handle clockactivity for enable and disable")
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Some modules reset automatically when idled, and when re-enabled, we must
wait for the automatic OCP softreset to complete. And if optional clocks
are configured, we need to keep the clocks on while waiting for the reset
to complete.
Let's fix the issue by moving the OCP softreset code to a separate
function sysc_wait_softreset(), and call it also from sysc_enable_module()
with the optional clocks enabled.
This is based on what we're already doing for legacy platform data booting
in _enable_sysc().
Fixes: 7324a7a0d5 ("bus: ti-sysc: Implement display subsystem reset quirk")
Reported-by: Faiz Abbas <faiz_abbas@ti.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
We can currently sometimes get "RXS timed out" errors and "EOT timed out"
errors with spi transfers.
These errors can be made easy to reproduce by reading the cpcap iio
values in a loop while keeping the CPUs busy by also reading /dev/urandom.
The "RXS timed out" errors we can fix by adding spi-cpol and spi-cpha
in addition to the spi-cs-high property we already have.
The "EOT timed out" errors we can fix by increasing the spi clock rate
to 9.6 MHz. Looks similar MC13783 PMIC says it works at spi clock rates
up to 20 MHz, so let's assume we can pick any rate up to 20 MHz also
for cpcap.
Cc: maemo-leste@lists.dyne.org
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Using "%zu" to fix following warning,
kernel/rcu/rcuperf.c: In function ‘kfree_perf_init’:
include/linux/kern_levels.h:5:18: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘unsigned int’ [-Wformat=]
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The assembly and disassembly of data to be sent to or received from
a device invoke functions regmap_format_XX() and regmap_parse_XX()
that extract or insert data items from or into a buffer, using
assignments. In some cases the functions are called with a buffer
pointer with an odd address. On architectures with strict alignment
requirements this can result in a kernel crash. The assignments
have been replaced by functions that take alignment into account.
Signed-off-by: Jens Thoms Toerring <jt@toerring.de>
Link: https://lore.kernel.org/r/20200531095300.GA27570@toerring.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Looks like we're missing flush of posted write after module enable and
disable. I've seen occasional errors accessing various modules, and it
is suspected that the lack of posted writes can also cause random reboots.
The errors we can see are similar to the one below from spi for example:
44000000.ocp:L3 Custom Error: MASTER MPU TARGET L4CFG (Read): Data Access
in User mode during Functional access
...
mcspi_wait_for_reg_bit
omap2_mcspi_transfer_one
spi_transfer_one_message
...
We also want to also flush posted write for disable. The clkctrl clock
disable happens after module disable, and we don't want to have the
module potentially stay active while we're trying to disable the clock.
Fixes: d59b60564c ("bus: ti-sysc: Add generic enable/disable functions")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Add a flag ([EXT4|FS]_DAX_FL) to preserve FS_XFLAG_DAX in the ext4
inode.
Set the flag to be user visible and changeable. Set the flag to be
inherited. Allow applications to change the flag at any time except if
it conflicts with the set of mutually exclusive flags (Currently VERITY,
ENCRYPT, JOURNAL_DATA).
Furthermore, restrict setting any of the exclusive flags if DAX is set.
While conceptually possible, we do not allow setting EXT4_DAX_FL while
at the same time clearing exclusion flags (or vice versa) for 2 reasons:
1) The DAX flag does not take effect immediately which
introduces quite a bit of complexity
2) There is no clear use case for being this flexible
Finally, on regular files, flag the inode to not be cached to facilitate
changing S_DAX on the next creation of the inode.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200528150003.828793-9-ira.weiny@intel.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We add 'always', 'never', and 'inode' (default). '-o dax' continues to
operate the same which is equivalent to 'always'. This new
functionality is limited to ext4 only.
Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
it and EXT4_MOUNT_DAX_ALWAYS appropriately for the mode.
We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.
Finally, EXT4_MOUNT2_DAX_INODE is used solely to detect if the user
specified that option for printing.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200528150003.828793-7-ira.weiny@intel.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
To prevent complications with in memory inodes we only set S_DAX on
inode load. FS_XFLAG_DAX can be changed at any time and S_DAX will
change after inode eviction and reload.
Add init bool to ext4_set_inode_flags() to indicate if the inode is
being newly initialized.
Assert that S_DAX is not set on an inode which is just being loaded.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200528150003.828793-6-ira.weiny@intel.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
S_DAX should only be enabled when the underlying block device supports
dax.
Cache the underlying support for DAX in the super block and modify
ext4_should_use_dax() to check for device support prior to the over
riding mount option.
While we are at it change the function to ext4_should_enable_dax() as
this better reflects the ask as well as matches xfs.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200528150003.828793-5-ira.weiny@intel.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Verity and DAX are incompatible. Changing the DAX mode due to a verity
flag change is wrong without a corresponding address_space_operations
update.
Make the 2 options mutually exclusive by returning an error if DAX was
set first.
(Setting DAX is already disabled if Verity is set first.)
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200528150003.828793-3-ira.weiny@intel.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The reset handling APIs for omap-prm can be invoked PM runtime which
runs in atomic context. For this to work properly, switch to atomic
iopoll version instead of the current which can sleep. Otherwise,
this throws a "BUG: scheduling while atomic" warning. Issue is seen
rather easily when CONFIG_PREEMPT is enabled.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Move mmc nodes to be compatible with the sdhci-omap driver. The following
modifications are required for omap_hsmmc specific properties:
ti,non-removable: convert to the generic mmc non-removable
ti,needs-special-reset: co-opted into the sdhci-omap driver
ti,dual-volt: removed. Legacy property not used in am335x or am43xx
ti,needs-special-hs-handling: removed. Legacy property not used in am335x
or am43xx
Also since the sdhci-omap driver does not support runtime PM, explicitly
disable the mmc3 instance in the dtsi.
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-05-19 08:54:42 -07:00
2145 changed files with 21055 additions and 12959 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.