Compare commits

..

917 Commits

Author SHA1 Message Date
Linus Torvalds
f443e374ae Linux 5.17
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-20 13:14:17 -07:00
Linus Torvalds
7445b2dcd7 Merge tag 'for-linus-5.17' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fix from Paolo Bonzini:
 "Fix for the SLS mitigation, which makes a 'SETcc/RET' pair grow
  to 'SETcc/RET/INT3'.

  This doesn't fit in 4 bytes any more, so the alignment has to
  change to 8 for this case"

* tag 'for-linus-5.17' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm/emulate: Fix SETcc emulation function offsets with SLS
2022-03-20 09:46:52 -07:00
Linus Torvalds
1e0e7a6a28 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
 "Two driver fixes:

   - a fix for zinitix touchscreen to properly report contacts

   - a fix for aiptek tablet driver to be more resilient to devices with
     incorrect descriptors"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: aiptek - properly check endpoint type
  Input: zinitix - do not report shadow fingers
2022-03-20 09:27:52 -07:00
Borislav Petkov
fe83f5eae4 kvm/emulate: Fix SETcc emulation function offsets with SLS
The commit in Fixes started adding INT3 after RETs as a mitigation
against straight-line speculation.

The fastop SETcc implementation in kvm's insn emulator uses macro magic
to generate all possible SETcc functions and to jump to them when
emulating the respective instruction.

However, it hardcodes the size and alignment of those functions to 4: a
three-byte SETcc insn and a single-byte RET. BUT, with SLS, there's an
INT3 that gets slapped after the RET, which brings the whole scheme out
of alignment:

  15:   0f 90 c0                seto   %al
  18:   c3                      ret
  19:   cc                      int3
  1a:   0f 1f 00                nopl   (%rax)
  1d:   0f 91 c0                setno  %al
  20:   c3                      ret
  21:   cc                      int3
  22:   0f 1f 00                nopl   (%rax)
  25:   0f 92 c0                setb   %al
  28:   c3                      ret
  29:   cc                      int3

and this explodes like this:

  int3: 0000 [#1] PREEMPT SMP PTI
  CPU: 0 PID: 2435 Comm: qemu-system-x86 Not tainted 5.17.0-rc8-sls #1
  Hardware name: Dell Inc. Precision WorkStation T3400  /0TP412, BIOS A14 04/30/2012
  RIP: 0010:setc+0x5/0x8 [kvm]
  Code: 00 00 0f 1f 00 0f b6 05 43 24 06 00 c3 cc 0f 1f 80 00 00 00 00 0f 90 c0 c3 cc 0f \
	  1f 00 0f 91 c0 c3 cc 0f 1f 00 0f 92 c0 c3 cc <0f> 1f 00 0f 93 c0 c3 cc 0f 1f 00 \
	  0f 94 c0 c3 cc 0f 1f 00 0f 95 c0
  Call Trace:
   <TASK>
   ? x86_emulate_insn [kvm]
   ? x86_emulate_instruction [kvm]
   ? vmx_handle_exit [kvm_intel]
   ? kvm_arch_vcpu_ioctl_run [kvm]
   ? kvm_vcpu_ioctl [kvm]
   ? __x64_sys_ioctl
   ? do_syscall_64
   ? entry_SYSCALL_64_after_hwframe
   </TASK>

Raise the alignment value when SLS is enabled and use a macro for that
instead of hard-coding naked numbers.

Fixes: e463a09af2 ("x86: Add straight-line-speculation mitigation")
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Jamie Heilman <jamie@audible.transient.net>
Link: https://lore.kernel.org/r/YjGzJwjrvxg5YZ0Z@audible.transient.net
[Add a comment and a bit of safety checking, since this is going to be changed
 again for IBT support. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-20 14:55:46 +01:00
Linus Torvalds
14702b3b24 Merge tag 'soc-fixes-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fix from Arnd Bergmann:
 "Here is one last regression fix for 5.17, reverting a patch that went
  into 5.16 as a cleanup that ended up breaking external interrupts on
  Layerscape chips.

  The revert makes it work again, but also reintroduces a build time
  warning about the nonstandard DT binding that will have to be dealt
  with in the future"

* tag 'soc-fixes-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  Revert "arm64: dts: freescale: Fix 'interrupt-map' parent address cells"
2022-03-19 16:36:32 -07:00
Linus Torvalds
f76da4d5ad Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Two small(ish) fixes, both in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: fnic: Finish scsi_cmnd before dropping the spinlock
  scsi: mpt3sas: Page fault in reply q processing
2022-03-19 15:56:43 -07:00
Linus Torvalds
97e9c8eb4b Merge tag 'perf-tools-fixes-for-v5.17-2022-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Avoid iterating empty evlist, fixing a segfault with 'perf stat --null'

 - Ignore case in topdown.slots check, fixing issue with Intel Icelake
   JSON metrics.

 - Fix symbol size calculation condition for fixing up corner case
   symbol end address obtained from Kallsyms.

* tag 'perf-tools-fixes-for-v5.17-2022-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf parse-events: Ignore case in topdown.slots check
  perf evlist: Avoid iteration for empty evlist.
  perf symbols: Fix symbol size calculation condition
2022-03-19 11:04:10 -07:00
Linus Torvalds
ba6354f614 Merge tag 'char-misc-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fix from Greg KH:
 "Here is a single driver fix for 5.17-final that has been submitted
  many times but I somehow missed it in my patch queue:

   - fix for counter sysfs code for reported problem

  This has been in linux-next all week with no reported issues"

* tag 'char-misc-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  counter: Stop using dev_get_drvdata() to get the counter device
2022-03-19 10:21:34 -07:00
Linus Torvalds
6aa61c12a4 Merge tag 'usb-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are some small remaining USB fixes for 5.17-final.

  They include:

   - two USB gadget driver fixes for reported problems

   - usbtmc driver fix for syzbot found issues

   - musb patch partial revert to resolve a reported regression.

  All of these have been in linux-next this week with no reported
  problems"

* tag 'usb-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: gadget: Fix use-after-free bug by not setting udc->dev.driver
  usb: usbtmc: Fix bug in pipe direction for control transfers
  partially Revert "usb: musb: Set the DT node on the child device"
  usb: gadget: rndis: prevent integer overflow in rndis_set_response()
2022-03-19 10:16:33 -07:00
Ian Rogers
7bd1da15d2 perf parse-events: Ignore case in topdown.slots check
An issue with icelakex metrics:

  https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json?h=perf/core&id=65eab2bc7dab326ee892ec5a4c749470b368b51a#n48

That causes the slots not to be first.

Fixes: 94dbfd6781 ("perf parse-events: Architecture specific leader override")
Reported-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220317224309.543736-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 18:39:09 -03:00
Ian Rogers
8b464eac97 perf evlist: Avoid iteration for empty evlist.
As seen with 'perf stat --null ..' and reported in:
https://lore.kernel.org/lkml/YjCLcpcX2peeQVCH@kernel.org/

v2. Avoids setting evsel in the empty list case as suggested by Jiri Olsa.

    Committer testing:

Before:

  $  perf stat --null sleep 1
  Segmentation fault (core dumped)
  $

After:

  $  perf stat --null sleep 1

   Performance counter stats for 'sleep 1':

         1.010340646 seconds time elapsed

         0.001420000 seconds user
         0.000000000 seconds sys
  $

Fixes: 472832d2c0 ("perf evlist: Refactor evlist__for_each_cpu()")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220317231643.550902-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 18:39:09 -03:00
Michael Petlan
3cf6a32f3f perf symbols: Fix symbol size calculation condition
Before this patch, the symbol end address fixup to be called, needed two
conditions being met:

  if (prev->end == prev->start && prev->end != curr->start)

Where
  "prev->end == prev->start" means that prev is zero-long
                             (and thus needs a fixup)
and
  "prev->end != curr->start" means that fixup hasn't been applied yet

However, this logic is incorrect in the following situation:

*curr  = {rb_node = {__rb_parent_color = 278218928,
  rb_right = 0x0, rb_left = 0x0},
  start = 0xc000000000062354,
  end = 0xc000000000062354, namelen = 40, type = 2 '\002',
  binding = 0 '\000', idle = 0 '\000', ignore = 0 '\000',
  inlined = 0 '\000', arch_sym = 0 '\000', annotate2 = false,
  name = 0x1159739e "kprobe_optinsn_page\t[__builtin__kprobes]"}

*prev = {rb_node = {__rb_parent_color = 278219041,
  rb_right = 0x109548b0, rb_left = 0x109547c0},
  start = 0xc000000000062354,
  end = 0xc000000000062354, namelen = 12, type = 2 '\002',
  binding = 1 '\001', idle = 0 '\000', ignore = 0 '\000',
  inlined = 0 '\000', arch_sym = 0 '\000', annotate2 = false,
  name = 0x1095486e "optinsn_slot"}

In this case, prev->start == prev->end == curr->start == curr->end,
thus the condition above thinks that "we need a fixup due to zero
length of prev symbol, but it has been probably done, since the
prev->end == curr->start", which is wrong.

After the patch, the execution path proceeds to arch__symbols__fixup_end
function which fixes up the size of prev symbol by adding page_size to
its end offset.

Fixes: 3b01a413c1 ("perf symbols: Improve kallsyms symbol end addr calculation")
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20220317135536.805-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 18:39:09 -03:00
Linus Torvalds
34e047aa16 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
 "Fix two compiler warnings introduced by recent commits: pointer
  arithmetic and double initialisation of struct field"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: errata: avoid duplicate field initializer
  arm64: fix clang warning about TRAMP_VALIAS
2022-03-18 12:32:59 -07:00
Linus Torvalds
6e4069881a Merge tag '5.17-rc8-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fix from Steve French:
 "Small fix for regression in multiuser mounts.

  The additional improvements suggested by Ronnie to make the server and
  session status handling code easier to read can wait for the 5.18
  merge window."

* tag '5.17-rc8-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: fix incorrect session setup check for multiuser mounts
2022-03-18 12:22:15 -07:00
Linus Torvalds
6c4bcd8140 Merge tag 'block-5.17-2022-03-18' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Revert of a nvme target feature (Hannes)

 - Fix a memory leak with rq-qos (Ming)

* tag 'block-5.17-2022-03-18' of git://git.kernel.dk/linux-block:
  nvmet: revert "nvmet: make discovery NQN configurable"
  block: release rq qos structures for queue without disk
2022-03-18 12:15:56 -07:00
Linus Torvalds
cced5148a1 Merge tag 'drm-fixes-2022-03-18' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "A few minor changes to finish things off, one mgag200 regression, imx
  fix and couple of panel changes.

  imx:
   - Don't test bus flags in atomic check

  mgag200:
   - Fix PLL setup on some models

  panel:
   - Fix bpp settings on Innolux G070Y2-L01
   - Fix DRM_PANEL_EDP Kconfig dependencies"

* tag 'drm-fixes-2022-03-18' of git://anongit.freedesktop.org/drm/drm:
  drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS
  drm/panel: simple: Fix Innolux G070Y2-L01 BPP settings
  drm/imx: parallel-display: Remove bus flags check in imx_pd_bridge_atomic_check()
  drm/mgag200: Fix PLL setup for g200wb and g200ew
2022-03-18 12:01:19 -07:00
Arnd Bergmann
316e46f65a arm64: errata: avoid duplicate field initializer
The '.type' field is initialized both in place and in the macro
as reported by this W=1 warning:

arch/arm64/include/asm/cpufeature.h:281:9: error: initialized field overwritten [-Werror=override-init]
  281 |         (ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU)
      |         ^
arch/arm64/kernel/cpu_errata.c:136:17: note: in expansion of macro 'ARM64_CPUCAP_LOCAL_CPU_ERRATUM'
  136 |         .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,                         \
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/arm64/kernel/cpu_errata.c:145:9: note: in expansion of macro 'ERRATA_MIDR_RANGE'
  145 |         ERRATA_MIDR_RANGE(m, var, r_min, var, r_max)
      |         ^~~~~~~~~~~~~~~~~
arch/arm64/kernel/cpu_errata.c:613:17: note: in expansion of macro 'ERRATA_MIDR_REV_RANGE'
  613 |                 ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2),
      |                 ^~~~~~~~~~~~~~~~~~~~~
arch/arm64/include/asm/cpufeature.h:281:9: note: (near initialization for 'arm64_errata[18].type')
  281 |         (ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU)
      |         ^

Remove the extranous initializer.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 1dd498e5e2 ("KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata")
Link: https://lore.kernel.org/r/20220316183800.1546731-1-arnd@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-03-18 14:09:18 +00:00
Arnd Bergmann
7f34b43e07 arm64: fix clang warning about TRAMP_VALIAS
The newly introduced TRAMP_VALIAS definition causes a build warning
with clang-14:

arch/arm64/include/asm/vectors.h:66:31: error: arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension [-Werror,-Wnull-pointer-arithmetic]
                return (char *)TRAMP_VALIAS + SZ_2K * slot;

Change the addition to something clang does not complain about.

Fixes: bd09128d16 ("arm64: Add percpu vectors for EL1")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20220316183833.1563139-1-arnd@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-03-18 13:48:28 +00:00
Dave Airlie
ca5a5761ac Merge tag 'drm-misc-fixes-2022-03-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
* drm/imx: Don't test bus flags in atomic check
 * drm/mgag200: Fix PLL setup on some models
 * drm/panel: Fix bpp settings on Innolux G070Y2-L01; Fix DRM_PANEL_EDP
   Kconfig dependencies

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YjMNcqOuDFDoe+EN@linux-uq9g
2022-03-18 13:32:54 +10:00
Linus Torvalds
551acdc3c3 Merge tag 'net-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, ipsec, and wireless.

  A few last minute revert / disable and fix patches came down from our
  sub-trees. We're not waiting for any fixes at this point.

  Current release - regressions:

   - Revert "netfilter: nat: force port remap to prevent shadowing
     well-known ports", restore working conntrack on asymmetric paths

   - Revert "ath10k: drop beacon and probe response which leak from
     other channel", restore working AP and mesh mode on QCA9984

   - eth: intel: fix hang during reboot/shutdown

  Current release - new code bugs:

   - netfilter: nf_tables: disable register tracking, it needs more work
     to cover all corner cases

  Previous releases - regressions:

   - ipv6: fix skb_over_panic in __ip6_append_data when (admin-only)
     extension headers get specified

   - esp6: fix ESP over TCP/UDP, interpret ipv6_skip_exthdr's return
     value more selectively

   - bnx2x: fix driver load failure when FW not present in initrd

  Previous releases - always broken:

   - vsock: stop destroying unrelated sockets in nested virtualization

   - packet: fix slab-out-of-bounds access in packet_recvmsg()

  Misc:

   - add Paolo Abeni to networking maintainers!"

* tag 'net-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (26 commits)
  iavf: Fix hang during reboot/shutdown
  net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload
  net: bcmgenet: skip invalid partial checksums
  bnx2x: fix built-in kernel driver load failure
  net: phy: mscc: Add MODULE_FIRMWARE macros
  net: dsa: Add missing of_node_put() in dsa_port_parse_of
  net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit()
  Revert "ath10k: drop beacon and probe response which leak from other channel"
  hv_netvsc: Add check for kvmalloc_array
  iavf: Fix double free in iavf_reset_task
  ice: destroy flow director filter mutex after releasing VSIs
  ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats()
  Add Paolo Abeni to networking maintainers
  atm: eni: Add check for dma_map_single
  net/packet: fix slab-out-of-bounds access in packet_recvmsg()
  net: mdio: mscc-miim: fix duplicate debugfs entry
  net: phy: marvell: Fix invalid comparison in the resume and suspend functions
  esp6: fix check on ipv6_skip_exthdr's return value
  net: dsa: microchip: add spi_device_id tables
  netfilter: nf_tables: disable register tracking
  ...
2022-03-17 12:55:26 -07:00
Linus Torvalds
c81801eb7f Merge tag 'acpi-5.17-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
 "Revert recent commit that caused multiple systems to misbehave due to
  firmware issues"

* tag 'acpi-5.17-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI: scan: Do not add device IDs from _CID if _HID is not valid"
2022-03-17 12:40:59 -07:00
Linus Torvalds
2ab99e5458 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "Four patches.

  Subsystems affected by this patch series: mm/swap, kconfig, ocfs2, and
  selftests"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  selftests: vm: fix clang build error multiple output files
  ocfs2: fix crash when initialize filecheck kobj fails
  configs/debug: restore DEBUG_INFO=y for overriding
  mm: swap: get rid of livelock in swapin readahead
2022-03-17 12:36:47 -07:00
Yosry Ahmed
1c4debc443 selftests: vm: fix clang build error multiple output files
When building the vm selftests using clang, some errors are seen due to
having headers in the compilation command:

  clang -Wall -I ../../../../usr/include  -no-pie    gup_test.c ../../../../mm/gup_test.h -lrt -lpthread -o .../tools/testing/selftests/vm/gup_test
  clang: error: cannot specify -o when generating multiple output files
  make[1]: *** [../lib.mk:146: .../tools/testing/selftests/vm/gup_test] Error 1

Rework to add the header files to LOCAL_HDRS before including ../lib.mk,
since the dependency is evaluated in '$(OUTPUT)/%:%.c $(LOCAL_HDRS)' in
file lib.mk.

Link: https://lkml.kernel.org/r/20220304000645.1888133-1-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-17 11:02:13 -07:00
Joseph Qi
7b0b1332cf ocfs2: fix crash when initialize filecheck kobj fails
Once s_root is set, genric_shutdown_super() will be called if
fill_super() fails.  That means, we will call ocfs2_dismount_volume()
twice in such case, which can lead to kernel crash.

Fix this issue by initializing filecheck kobj before setting s_root.

Link: https://lkml.kernel.org/r/20220310081930.86305-1-joseph.qi@linux.alibaba.com
Fixes: 5f483c4abb ("ocfs2: add kobject for online file check")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-17 11:02:13 -07:00
Qian Cai
8208257d2d configs/debug: restore DEBUG_INFO=y for overriding
Previously, I failed to realize that Kees' patch [1] has not been merged
into the mainline yet, and dropped DEBUG_INFO=y too eagerly from the
mainline.  As the results, "make debug.config" won't be able to flip
DEBUG_INFO=n from the existing .config.  This should close the gaps of a
few weeks before Kees' patch is there, and work regardless of their
merging status anyway.

Link: https://lore.kernel.org/all/20220125075126.891825-1-keescook@chromium.org/ [1]
Link: https://lkml.kernel.org/r/20220308153524.8618-1-quic_qiancai@quicinc.com
Signed-off-by: Qian Cai <quic_qiancai@quicinc.com>
Reported-by: Daniel Thompson <daniel.thompson@linaro.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-17 11:02:13 -07:00
Guo Ziliang
029c4628b2 mm: swap: get rid of livelock in swapin readahead
In our testing, a livelock task was found.  Through sysrq printing, same
stack was found every time, as follows:

  __swap_duplicate+0x58/0x1a0
  swapcache_prepare+0x24/0x30
  __read_swap_cache_async+0xac/0x220
  read_swap_cache_async+0x58/0xa0
  swapin_readahead+0x24c/0x628
  do_swap_page+0x374/0x8a0
  __handle_mm_fault+0x598/0xd60
  handle_mm_fault+0x114/0x200
  do_page_fault+0x148/0x4d0
  do_translation_fault+0xb0/0xd4
  do_mem_abort+0x50/0xb0

The reason for the livelock is that swapcache_prepare() always returns
EEXIST, indicating that SWAP_HAS_CACHE has not been cleared, so that it
cannot jump out of the loop.  We suspect that the task that clears the
SWAP_HAS_CACHE flag never gets a chance to run.  We try to lower the
priority of the task stuck in a livelock so that the task that clears
the SWAP_HAS_CACHE flag will run.  The results show that the system
returns to normal after the priority is lowered.

In our testing, multiple real-time tasks are bound to the same core, and
the task in the livelock is the highest priority task of the core, so
the livelocked task cannot be preempted.

Although cond_resched() is used by __read_swap_cache_async, it is an
empty function in the preemptive system and cannot achieve the purpose
of releasing the CPU.  A high-priority task cannot release the CPU
unless preempted by a higher-priority task.  But when this task is
already the highest priority task on this core, other tasks will not be
able to be scheduled.  So we think we should replace cond_resched() with
schedule_timeout_uninterruptible(1), schedule_timeout_interruptible will
call set_current_state first to set the task state, so the task will be
removed from the running queue, so as to achieve the purpose of giving
up the CPU and prevent it from running in kernel mode for too long.

(akpm: ugly hack becomes uglier.  But it fixes the issue in a
backportable-to-stable fashion while we hopefully work on something
better)

Link: https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com
Signed-off-by: Guo Ziliang <guo.ziliang@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Reviewed-by: Jiang Xuexin <jiang.xuexin@zte.com.cn>
Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roger Quadros <rogerq@kernel.org>
Cc: Ziliang Guo <guo.ziliang@zte.com.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-17 11:02:13 -07:00
Ivan Vecera
b04683ff8f iavf: Fix hang during reboot/shutdown
Recent commit 974578017f ("iavf: Add waiting so the port is
initialized in remove") adds a wait-loop at the beginning of
iavf_remove() to ensure that port initialization is finished
prior unregistering net device. This causes a regression
in reboot/shutdown scenario because in this case callback
iavf_shutdown() is called and this callback detaches the device,
makes it down if it is running and sets its state to __IAVF_REMOVE.
Later shutdown callback of associated PF driver (e.g. ice_shutdown)
is called. That callback calls among other things sriov_disable()
that calls indirectly iavf_remove() (see stack trace below).
As the adapter state is already __IAVF_REMOVE then the mentioned
loop is end-less and shutdown process hangs.

The patch fixes this by checking adapter's state at the beginning
of iavf_remove() and skips the rest of the function if the adapter
is already in remove state (shutdown is in progress).

Reproducer:
1. Create VF on PF driven by ice or i40e driver
2. Ensure that the VF is bound to iavf driver
3. Reboot

[52625.981294] sysrq: SysRq : Show Blocked State
[52625.988377] task:reboot          state:D stack:    0 pid:17359 ppid:     1 f2
[52625.996732] Call Trace:
[52625.999187]  __schedule+0x2d1/0x830
[52626.007400]  schedule+0x35/0xa0
[52626.010545]  schedule_hrtimeout_range_clock+0x83/0x100
[52626.020046]  usleep_range+0x5b/0x80
[52626.023540]  iavf_remove+0x63/0x5b0 [iavf]
[52626.027645]  pci_device_remove+0x3b/0xc0
[52626.031572]  device_release_driver_internal+0x103/0x1f0
[52626.036805]  pci_stop_bus_device+0x72/0xa0
[52626.040904]  pci_stop_and_remove_bus_device+0xe/0x20
[52626.045870]  pci_iov_remove_virtfn+0xba/0x120
[52626.050232]  sriov_disable+0x2f/0xe0
[52626.053813]  ice_free_vfs+0x7c/0x340 [ice]
[52626.057946]  ice_remove+0x220/0x240 [ice]
[52626.061967]  ice_shutdown+0x16/0x50 [ice]
[52626.065987]  pci_device_shutdown+0x34/0x60
[52626.070086]  device_shutdown+0x165/0x1c5
[52626.074011]  kernel_restart+0xe/0x30
[52626.077593]  __do_sys_reboot+0x1d2/0x210
[52626.093815]  do_syscall_64+0x5b/0x1a0
[52626.097483]  entry_SYSCALL_64_after_hwframe+0x65/0xca

Fixes: 974578017f ("iavf: Add waiting so the port is initialized in remove")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://lore.kernel.org/r/20220317104524.2802848-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 09:37:37 -07:00
Vladimir Oltean
8e0341aefc net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload
ACL rules can be offloaded to VCAP IS2 either through chain 0, or, since
the blamed commit, through a chain index whose number encodes a specific
PAG (Policy Action Group) and lookup number.

The chain number is translated through ocelot_chain_to_pag() into a PAG,
and through ocelot_chain_to_lookup() into a lookup number.

The problem with the blamed commit is that the above 2 functions don't
have special treatment for chain 0. So ocelot_chain_to_pag(0) returns
filter->pag = 224, which is in fact -32, but the "pag" field is an u8.

So we end up programming the hardware with VCAP IS2 entries having a PAG
of 224. But the way in which the PAG works is that it defines a subset
of VCAP IS2 filters which should match on a packet. The default PAG is
0, and previous VCAP IS1 rules (which we offload using 'goto') can
modify it. So basically, we are installing filters with a PAG on which
no packet will ever match. This is the hardware equivalent of adding
filters to a chain which has no 'goto' to it.

Restore the previous functionality by making ACL filters offloaded to
chain 0 go to PAG 0 and lookup number 0. The choice of PAG is clearly
correct, but the choice of lookup number isn't "as before" (which was to
leave the lookup a "don't care"). However, lookup 0 should be fine,
since even though there are ACL actions (policers) which have a
requirement to be used in a specific lookup, that lookup is 0.

Fixes: 226e9cd82a ("net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220316192117.2568261-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 09:34:52 -07:00
Doug Berger
0f643c88c8 net: bcmgenet: skip invalid partial checksums
The RXCHK block will return a partial checksum of 0 if it encounters
a problem while receiving a packet. Since a 1's complement sum can
only produce this result if no bits are set in the received data
stream it is fair to treat it as an invalid partial checksum and
not pass it up the stack.

Fixes: 8101553978 ("net: bcmgenet: use CHECKSUM_COMPLETE for NETIF_F_RXCSUM")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220317012812.1313196-1-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 09:34:24 -07:00
Manish Chopra
424e7834e2 bnx2x: fix built-in kernel driver load failure
Commit b7a49f7305 ("bnx2x: Utilize firmware 7.13.21.0")
added request_firmware() logic in probe() which caused
load failure when firmware file is not present in initrd (below),
as access to firmware file is not feasible during probe.

  Direct firmware load for bnx2x/bnx2x-e2-7.13.15.0.fw failed with error -2
  Direct firmware load for bnx2x/bnx2x-e2-7.13.21.0.fw failed with error -2

This patch fixes this issue by -

1. Removing request_firmware() logic from the probe()
   such that .ndo_open() handle it as it used to handle
   it earlier

2. Given request_firmware() is removed from probe(), so
   driver has to relax FW version comparisons a bit against
   the already loaded FW version (by some other PFs of same
   adapter) to allow different compatible/close enough FWs with which
   multiple PFs may run with (in different environments), as the
   given PF who is in probe flow has no idea now with which firmware
   file version it is going to initialize the device in ndo_open()

Link: https://lore.kernel.org/all/46f2d9d9-ae7f-b332-ddeb-b59802be2bab@molgen.mpg.de/
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Fixes: b7a49f7305 ("bnx2x: Utilize firmware 7.13.21.0")
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Link: https://lore.kernel.org/r/20220316214613.6884-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 09:30:04 -07:00
Juerg Haefliger
f1858c277b net: phy: mscc: Add MODULE_FIRMWARE macros
The driver requires firmware so define MODULE_FIRMWARE so that modinfo
provides the details.

Fixes: fa164e40c5 ("net: phy: mscc: split the driver into separate files")
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Link: https://lore.kernel.org/r/20220316151835.88765-1-juergh@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 09:06:09 -07:00
Miaoqian Lin
cb0b430b4e net: dsa: Add missing of_node_put() in dsa_port_parse_of
The device_node pointer is returned by of_parse_phandle()  with refcount
incremented. We should use of_node_put() on it when done.

Fixes: 6d4e5c570c ("net: dsa: get port type at parse time")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220316082602.10785-1-linmq006@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17 13:13:27 +01:00
Thomas Zimmermann
3c3384050d drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS
Fix a number of undefined references to drm_kms_helper.ko in
drm_dp_helper.ko:

  arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_mst_duplicate_state':
  drm_dp_mst_topology.c:(.text+0x2df0): undefined reference to `__drm_atomic_helper_private_obj_duplicate_state'
  arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_delayed_destroy_work':
  drm_dp_mst_topology.c:(.text+0x370c): undefined reference to `drm_kms_helper_hotplug_event'
  arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_mst_up_req_work':
  drm_dp_mst_topology.c:(.text+0x7938): undefined reference to `drm_kms_helper_hotplug_event'
  arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_mst_link_probe_work':
  drm_dp_mst_topology.c:(.text+0x82e0): undefined reference to `drm_kms_helper_hotplug_event'

This happens if panel-edp.ko has been configured with

  DRM_PANEL_EDP=y
  DRM_DP_HELPER=y
  DRM_KMS_HELPER=m

which builds DP helpers into the kernel and KMS helpers sa a module.
Making DRM_PANEL_EDP select DRM_KMS_HELPER resolves this problem.

To avoid a resulting cyclic dependency with DRM_PANEL_BRIDGE, don't
make the latter depend on DRM_KMS_HELPER and fix the one DRM bridge
drivers that doesn't already select DRM_KMS_HELPER. As KMS helpers
cannot be selected directly by the user, config symbols should avoid
depending on it anyway.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 3755d35ee1 ("drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP")
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Tested-by: Brian Masney <bmasney@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Linux Kernel Functional Testing <lkft@linaro.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Cc: Dave Airlie <airlied@redhat.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/478296/
2022-03-17 11:07:57 +01:00
Thomas Zimmermann
a8253684eb Merge drm/drm-fixes into drm-misc-fixes
Backmerging drm/drm-fixes for commit 3755d35ee1 ("drm/panel: Select
DRM_DP_HELPER for DRM_PANEL_EDP").

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2022-03-17 11:03:28 +01:00
Steve French
e3ee9fb226 smb3: fix incorrect session setup check for multiuser mounts
A recent change to how the SMB3 server (socket) and session status
is managed regressed multiuser mounts by changing the check
for whether session setup is needed to the socket (TCP_Server_info)
structure instead of the session struct (cifs_ses). Add additional
check in cifs_setup_sesion to fix this.

Fixes: 73f9bfbe3d ("cifs: maintain a state machine for tcp/smb/tcon sessions")
Reported-by: Ronnie Sahlberg <lsahlber@redhat.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-03-16 22:48:55 -05:00
Nicolas Dichtel
4ee06de772 net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit()
This kind of interface doesn't have a mac header. This patch fixes
bpf_redirect() to a PIM interface.

Fixes: 27b29f6305 ("bpf: add bpf_redirect() helper")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20220315092008.31423-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-16 19:38:41 -07:00
Linus Torvalds
a46310bfae Merge tag 'efi-urgent-for-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
 "Avoid spurious warnings about unknown boot parameters"

* tag 'efi-urgent-for-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: fix return value of __setup handlers
2022-03-16 11:57:46 -07:00
Linus Torvalds
d34c58247f Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This fixes a bug where qcom-rng can return a buffer that is not
  completely filled with random data"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: qcom-rng - ensure buffer for generate is completely filled
2022-03-16 11:50:35 -07:00
Vladimir Oltean
1447c63580 Revert "arm64: dts: freescale: Fix 'interrupt-map' parent address cells"
This reverts commit 869f0ec048. That
updated the expected device tree binding format for the ls-extirq
driver, without also updating the parsing code (ls_extirq_parse_map)
to the new format.

The context is that the ls-extirq driver uses the standard
"interrupt-map" OF property in a non-standard way, as suggested by
Rob Herring during review:
https://lore.kernel.org/lkml/20190927161118.GA19333@bogus/

This has turned out to be problematic, as Marc Zyngier discovered
through commit 0412841812 ("of/irq: Allow matching of an interrupt-map
local to an interrupt controller"), later fixed through commit
de4adddcbc ("of/irq: Add a quirk for controllers with their own
definition of interrupt-map"). Marc's position, expressed on multiple
opportunities, is that:

(a) [ making private use of the reserved "interrupt-map" name in a
    driver ] "is wrong, by the very letter of what an interrupt-map
    means. If the interrupt map points to an interrupt controller,
    that's the target for the interrupt."
https://lore.kernel.org/lkml/87k0g8jlmg.wl-maz@kernel.org/

(b) [ updating the driver's bindings to accept a non-reserved name for
    this property, as an alternative, is ] "is totally pointless. These
    machines have been in the wild for years, and existing DTs will be
    there *forever*."
https://lore.kernel.org/lkml/87ilvrk1r0.wl-maz@kernel.org/

Considering the above, the Linux kernel has quirks in place to deal with
the ls-extirq's non-standard use of the "interrupt-map". These quirks
may be needed in other operating systems that consume this device tree,
yet this is seen as the only viable solution.

Therefore, the premise of the patch being reverted here is invalid.
It doesn't matter whether the driver, in its non-standard use of the
property, complies to the standard format or not, since this property
isn't expected to be used for interrupt translation by the core.

This change restores LS1088A, LS2088A/LS2085A and LX2160A to their
previous bindings, which allows these systems to continue to use
external interrupt lines with the correct polarity.

Fixes: 869f0ec048 ("arm64: dts: freescale: Fix 'interrupt-map' parent address cells")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-16 19:41:14 +01:00
Jakub Kicinski
186abea8a8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2022-03-16

1) Fix a kernel-info-leak in pfkey.
   From Haimin Zhang.

2) Fix an incorrect check of the return value of ipv6_skip_exthdr.
   From Sabrina Dubroca.

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  esp6: fix check on ipv6_skip_exthdr's return value
  af_key: add __GFP_ZERO flag for compose_sadb_supported in function pfkey_register
====================

Link: https://lore.kernel.org/r/20220316121142.3142336-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-16 11:39:37 -07:00
Jakub Kicinski
1bbdcbaeda Merge tag 'wireless-2022-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Kalle Valo says:

====================
wireless fixes for v5.17

Third set of fixes for v5.17. We have only one revert to fix an ath10k
regression.

* tag 'wireless-2022-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  Revert "ath10k: drop beacon and probe response which leak from other channel"
====================

Link: https://lore.kernel.org/r/20220316130249.B5225C340EC@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-16 11:08:09 -07:00
Marek Vasut
fc1b6ef7bf drm/panel: simple: Fix Innolux G070Y2-L01 BPP settings
The Innolux G070Y2-L01 supports two modes of operation:
1) FRC=Low/NC ... MEDIA_BUS_FMT_RGB666_1X7X3_SPWG ... BPP=6
2) FRC=High ..... MEDIA_BUS_FMT_RGB888_1X7X4_SPWG ... BPP=8

Currently the panel description mixes both, BPP from 1) and bus
format from 2), which triggers a warning at panel-simple.c:615.

Pick the later, set bpp=8, fix the warning.

Fixes: a5d2ade627 ("drm/panel: simple: Add support for Innolux G070Y2-L01")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Christoph Fritz <chf.fritz@googlemail.com>
Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com>
Cc: Maxime Ripard <maxime@cerno.tech>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220220040718.532866-1-marex@denx.de
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2022-03-16 16:35:05 +01:00
Christoph Niedermaier
6061806a86 drm/imx: parallel-display: Remove bus flags check in imx_pd_bridge_atomic_check()
If display timings were read from the devicetree using
of_get_display_timing() and pixelclk-active is defined
there, the flag DISPLAY_FLAGS_SYNC_POSEDGE/NEGEDGE is
automatically generated. Through the function
drm_bus_flags_from_videomode() e.g. called in the
panel-simple driver this flag got into the bus flags,
but then in imx_pd_bridge_atomic_check() the bus flag
check failed and will not initialize the display. The
original commit fe141cedc4 does not explain why this
check was introduced. So remove the bus flags check,
because it stops the initialization of the display with
valid bus flags.

Fixes: fe141cedc4 ("drm/imx: pd: Use bus format/flags provided by the bridge when available")
Signed-off-by: Christoph Niedermaier <cniedermaier@dh-electronics.com>
Cc: Marek Vasut <marex@denx.de>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
To: dri-devel@lists.freedesktop.org
Tested-by: Max Krummenacher <max.krummenacher@toradex.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220201113643.4638-1-cniedermaier@dh-electronics.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2022-03-16 16:35:04 +01:00
Jens Axboe
f6189589fa Merge tag 'nvme-5.17-2022-03-16' of git://git.infradead.org/nvme into block-5.17
Pull NVMe fix from Christoph:

"nvme fix for Linux 5.17

 - last minute revert of a nvmet feature added in Linux 5.16
   (Hannes Reinecke)"

* tag 'nvme-5.17-2022-03-16' of git://git.infradead.org/nvme:
  nvmet: revert "nvmet: make discovery NQN configurable"
2022-03-16 05:43:25 -06:00
Kalle Valo
45b4eb7ee6 Revert "ath10k: drop beacon and probe response which leak from other channel"
This reverts commit 3bf2537ec2.

I was reported privately that this commit breaks AP and mesh mode on QCA9984
(firmware 10.4-3.9.0.2-00156). So revert the commit to fix the regression.

There was a conflict due to cfg80211 API changes but that was easy to fix.

Fixes: 3bf2537ec2 ("ath10k: drop beacon and probe response which leak from other channel")
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220315155455.20446-1-kvalo@kernel.org
2022-03-16 13:34:52 +02:00
Rafael J. Wysocki
462ccc35a7 Revert "ACPI: scan: Do not add device IDs from _CID if _HID is not valid"
Revert commit e38f9ff63e ("ACPI: scan: Do not add device IDs from _CID
if _HID is not valid"), because it has introduced regressions on
multiple systems, even though it only has effect on clearly invalid
firmware.

Reported-by: Pierre-Louis Bossart <notifications@github.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-16 11:23:05 +01:00
David S. Miller
dea2d93a8b Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
====================
Intel Wired LAN Driver Updates 2022-03-15

This series contains updates to ice and iavf drivers.

Maciej adjusts null check logic on Tx ring to prevent possible NULL
pointer dereference for ice.

Sudheer moves destruction of Flow Director lock as it was being accessed
after destruction for ice.

Przemyslaw removes an excess mutex unlock as it was being double
unlocked for iavf.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-16 10:07:43 +00:00
Jiasheng Jiang
886e44c929 hv_netvsc: Add check for kvmalloc_array
As the potential failure of the kvmalloc_array(),
it should be better to check and restore the 'data'
if fails in order to avoid the dereference of the
NULL pointer.

Fixes: 6ae7467112 ("hv_netvsc: Add per-cpu ethtool stats for netvsc")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Link: https://lore.kernel.org/r/20220314020125.2365084-1-jiasheng@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-15 21:57:53 -07:00
Przemyslaw Patynowski
16b2dd8cdf iavf: Fix double free in iavf_reset_task
Fix double free possibility in iavf_disable_vf, as crit_lock is
freed in caller, iavf_reset_task. Add kernel-doc for iavf_disable_vf.
Remove mutex_unlock in iavf_disable_vf.
Without this patch there is double free scenario, when calling
iavf_reset_task.

Fixes: e85ff9c631 ("iavf: Fix deadlock in iavf_reset_task")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15 13:36:13 -07:00
Sudheer Mogilappagari
1b4ae7d925 ice: destroy flow director filter mutex after releasing VSIs
Currently fdir_fltr_lock is accessed in ice_vsi_release_all() function
after it is destroyed. Instead destroy mutex after ice_vsi_release_all.

Fixes: 40319796b7 ("ice: Add flow director support for channel mode")
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15 13:36:13 -07:00
Maciej Fijalkowski
f153546913 ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats()
It is possible to do NULL pointer dereference in routine that updates
Tx ring stats. Currently only stats and bytes are updated when ring
pointer is valid, but later on ring is accessed to propagate gathered Tx
stats onto VSI stats.

Change the existing logic to move to next ring when ring is NULL.

Fixes: e72bba2135 ("ice: split ice_ring onto Tx/Rx separate structs")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-15 13:36:13 -07:00
Jakub Kicinski
e9c14b59ea Add Paolo Abeni to networking maintainers
Growing the network maintainers team from 2 to 3.

Signed-off-by: David S. Miller <davem@davemloft.net>
Link: https://lore.kernel.org/r/20220314222819.958428-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-15 12:16:10 -07:00
Uwe Kleine-König
01b44ef2bf counter: Stop using dev_get_drvdata() to get the counter device
dev_get_drvdata() returns NULL since commit b56346ddbd ("counter: Use
container_of instead of drvdata to track counter_device") which wrongly
claimed there were no users of drvdata. Convert to container_of() to
fix a null pointer dereference.

Reported-by: Oleksij Rempel <o.rempel@pengutronix.de>
Fixes: b56346ddbd ("counter: Use container_of instead of drvdata to track counter_device")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Link: https://lore.kernel.org/all/20220204082556.370348-1-u.kleine-koenig@pengutronix.de/
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Link: https://lore.kernel.org/r/4a14311a3b935b62b33e665a97ecaaf2f078228a.1646957732.git.vilhelm.gray@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-15 19:24:13 +01:00
David Jeffery
733ab7e1b5 scsi: fnic: Finish scsi_cmnd before dropping the spinlock
When aborting a SCSI command through fnic, there is a race with the fnic
interrupt handler which can result in the SCSI command and its request
being completed twice. If the interrupt handler claims the command by
setting CMD_SP to NULL first, the abort handler assumes the interrupt
handler has completed the command and returns SUCCESS, causing the request
for the scsi_cmnd to be re-queued.

But the interrupt handler may not have finished the command yet. After it
drops the spinlock protecting CMD_SP, it does memory cleanup before finally
calling scsi_done() to complete the scsi_cmnd. If the call to scsi_done
occurs after the abort handler finishes and re-queues the request, the
completion of the scsi_cmnd will advance and try to double complete a
request already queued for retry.

This patch fixes the issue by moving scsi_done() and any other use of
scsi_cmnd to before the spinlock is released by the interrupt handler.

Link: https://lore.kernel.org/r/20220311184359.2345319-1-djeffery@redhat.com
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15 14:01:28 -04:00
Alan Stern
16b1941eac usb: gadget: Fix use-after-free bug by not setting udc->dev.driver
The syzbot fuzzer found a use-after-free bug:

BUG: KASAN: use-after-free in dev_uevent+0x712/0x780 drivers/base/core.c:2320
Read of size 8 at addr ffff88802b934098 by task udevd/3689

CPU: 2 PID: 3689 Comm: udevd Not tainted 5.17.0-rc4-syzkaller-00229-g4f12b742eb2b #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0x8d/0x303 mm/kasan/report.c:255
 __kasan_report mm/kasan/report.c:442 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
 dev_uevent+0x712/0x780 drivers/base/core.c:2320
 uevent_show+0x1b8/0x380 drivers/base/core.c:2391
 dev_attr_show+0x4b/0x90 drivers/base/core.c:2094

Although the bug manifested in the driver core, the real cause was a
race with the gadget core.  dev_uevent() does:

	if (dev->driver)
		add_uevent_var(env, "DRIVER=%s", dev->driver->name);

and between the test and the dereference of dev->driver, the gadget
core sets dev->driver to NULL.

The race wouldn't occur if the gadget core registered its devices on
a real bus, using the standard synchronization techniques of the
driver core.  However, it's not necessary to make such a large change
in order to fix this bug; all we need to do is make sure that
udc->dev.driver is always NULL.

In fact, there is no reason for udc->dev.driver ever to be set to
anything, let alone to the value it currently gets: the address of the
gadget's driver.  After all, a gadget driver only knows how to manage
a gadget, not how to manage a UDC.

This patch simply removes the statements in the gadget core that touch
udc->dev.driver.

Fixes: 2ccea03a8f ("usb: gadget: introduce UDC Class")
CC: <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+348b571beb5eeb70a582@syzkaller.appspotmail.com
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/YiQgukfFFbBnwJ/9@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-15 18:46:01 +01:00
Alan Stern
e9b667a82c usb: usbtmc: Fix bug in pipe direction for control transfers
The syzbot fuzzer reported a minor bug in the usbtmc driver:

usb 5-1: BOGUS control dir, pipe 80001e80 doesn't match bRequestType 0
WARNING: CPU: 0 PID: 3813 at drivers/usb/core/urb.c:412
usb_submit_urb+0x13a5/0x1970 drivers/usb/core/urb.c:410
Modules linked in:
CPU: 0 PID: 3813 Comm: syz-executor122 Not tainted
5.17.0-rc5-syzkaller-00306-g2293be58d6a1 #0
...
Call Trace:
 <TASK>
 usb_start_wait_urb+0x113/0x530 drivers/usb/core/message.c:58
 usb_internal_control_msg drivers/usb/core/message.c:102 [inline]
 usb_control_msg+0x2a5/0x4b0 drivers/usb/core/message.c:153
 usbtmc_ioctl_request drivers/usb/class/usbtmc.c:1947 [inline]

The problem is that usbtmc_ioctl_request() uses usb_rcvctrlpipe() for
all of its transfers, whether they are in or out.  It's easy to fix.

CC: <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+a48e3d1a875240cab5de@syzkaller.appspotmail.com
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/YiEsYTPEE6lOCOA5@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-15 18:45:31 +01:00
Bartosz Golaszewski
56e337f2cf Revert "gpio: Revert regression in sysfs-gpio (gpiolib.c)"
This reverts commit fc328a7d1f.

This commit - while attempting to fix a regression - has caused a number
of other problems. As the fallout from it is more significant than the
initial problem itself, revert it for now before we find a correct
solution.

Link: https://lore.kernel.org/all/20220314192522.GA3031157@roeck-us.net/
Link: https://lore.kernel.org/stable/20220314155509.552218-1-michael@walle.cc/
Link: https://lore.kernel.org/all/20211217153555.9413-1-marcelo.jimenez@gmail.com/
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Reported-and-bisected-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Michael Walle <michael@walle.cc>
Cc: Thorsten Leemhuis <linux@leemhuis.info>
Cc: Marcelo Roberto Jimenez <marcelo.jimenez@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-15 09:59:08 -07:00
H. Nikolaus Schaller
2390710647 partially Revert "usb: musb: Set the DT node on the child device"
This reverts the omap2430 changes of

commit cf081d009c ("usb: musb: Set the DT node on the child device")

Since v5.17-rc1, musb is broken on the gta04 and openpandora devices
(omap3530/dm3730). BeagleBone Black (am335x) seems to work.

Symptoms of this bug are

a) main symptom

[   21.336517] using random host ethernet address
[   21.341430] using host ethernet address: 32:70:05:18:ff:78
[   21.341461] using self ethernet address: 46:10:3a:b3:af:d9
[   21.358184] usb0: HOST MAC 32:70:05:18:ff:78
[   21.376678] usb0: MAC 46:10:3a:b3:af:d9
[   21.388305] using random self ethernet address
[   21.393371] using random host ethernet address
[   21.398162] g_ether gadget: Ethernet Gadget, version: Memorial Day 2008
[   21.421081] g_ether gadget: g_ether ready
[   21.492156] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   21.691345] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   21.803192] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   21.819427] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   22.124450] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   22.168518] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   22.179382] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   23.213592] musb-hdrc musb-hdrc.1.auto: pm runtime get failed in musb_gadget_queue
[   23.221832] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   23.227905] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   23.239440] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   23.401000] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   23.407073] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   23.426361] musb-hdrc musb-hdrc.1.auto: Could not enable: -22
[   23.734466] musb-hdrc musb-hdrc.1.auto: pm runtime get failed in musb_gadget_queue
[   23.742462] musb-hdrc musb-hdrc.1.auto: pm runtime get failed in musb_gadget_queue
[   23.750396] musb-hdrc musb-hdrc.1.auto: pm runtime get failed in musb_gadget_queue
... (repeats with high frequency)

This stops if the USB cable is unplugged and restarts if it is plugged in again.

b) also found in the log

[    6.498107] ------------[ cut here ]------------
[    6.502960] WARNING: CPU: 0 PID: 868 at arch/arm/mach-omap2/omap_hwmod.c:1885 _enable+0x50/0x234
[    6.512207] omap_hwmod: usb_otg_hs: enabled state can only be entered from initialized, idle, or disabled state
[    6.522766] Modules linked in: omap2430(+) bmp280_i2c bmp280 itg3200 at24 tsc2007 leds_tca6507 bma180 hmc5843_i2c hmc5843_core industrialio_triggered_buffer lis3lv02d_i2c kfifo_buf lis3lv02d phy_twl4030_usb snd_soc_omap_mcbsp snd_soc_ti_sdma musb_hdrc snd_soc_twl4030 gnss_sirf twl4030_vibra twl4030_madc twl4030_charger twl4030_pwrbutton gnss industrialio ehci_omap omapdrm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm drm_panel_orientation_quirks cec
[    6.566436] CPU: 0 PID: 868 Comm: udevd Not tainted 5.16.0-rc5-letux+ #8251
[    6.573730] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[    6.580322] [<c010ed30>] (unwind_backtrace) from [<c010a1d0>] (show_stack+0x10/0x14)
[    6.588470] [<c010a1d0>] (show_stack) from [<c0897c14>] (dump_stack_lvl+0x40/0x4c)
[    6.596405] [<c0897c14>] (dump_stack_lvl) from [<c0130cc4>] (__warn+0xb4/0xdc)
[    6.604003] [<c0130cc4>] (__warn) from [<c0130d5c>] (warn_slowpath_fmt+0x70/0x9c)
[    6.611846] [<c0130d5c>] (warn_slowpath_fmt) from [<c011f4d4>] (_enable+0x50/0x234)
[    6.619903] [<c011f4d4>] (_enable) from [<c012081c>] (omap_hwmod_enable+0x28/0x40)
[    6.627838] [<c012081c>] (omap_hwmod_enable) from [<c0120ff4>] (omap_device_enable+0x4c/0x78)
[    6.636779] [<c0120ff4>] (omap_device_enable) from [<c0121030>] (_od_runtime_resume+0x10/0x3c)
[    6.645812] [<c0121030>] (_od_runtime_resume) from [<c05c688c>] (__rpm_callback+0x3c/0xf4)
[    6.654510] [<c05c688c>] (__rpm_callback) from [<c05c6994>] (rpm_callback+0x50/0x54)
[    6.662628] [<c05c6994>] (rpm_callback) from [<c05c66b0>] (rpm_resume+0x448/0x4e4)
[    6.670593] [<c05c66b0>] (rpm_resume) from [<c05c6784>] (__pm_runtime_resume+0x38/0x50)
[    6.678985] [<c05c6784>] (__pm_runtime_resume) from [<bf14ab20>] (musb_init_controller+0x350/0xa5c [musb_hdrc])
[    6.689727] [<bf14ab20>] (musb_init_controller [musb_hdrc]) from [<c05bccb8>] (platform_probe+0x58/0xa8)
[    6.699737] [<c05bccb8>] (platform_probe) from [<c05badf0>] (really_probe+0x170/0x2fc)
[    6.708068] [<c05badf0>] (really_probe) from [<c05bb040>] (__driver_probe_device+0xc4/0xd8)
[    6.716827] [<c05bb040>] (__driver_probe_device) from [<c05bb084>] (driver_probe_device+0x30/0xac)
[    6.726226] [<c05bb084>] (driver_probe_device) from [<c05bb3d0>] (__device_attach_driver+0x94/0xb4)
[    6.735717] [<c05bb3d0>] (__device_attach_driver) from [<c05b93f8>] (bus_for_each_drv+0xa0/0xb4)
[    6.744934] [<c05b93f8>] (bus_for_each_drv) from [<c05bb248>] (__device_attach+0xc0/0x134)
[    6.753631] [<c05bb248>] (__device_attach) from [<c05b9fcc>] (bus_probe_device+0x28/0x80)
[    6.762207] [<c05b9fcc>] (bus_probe_device) from [<c05b7e40>] (device_add+0x5fc/0x788)
[    6.770507] [<c05b7e40>] (device_add) from [<c05bd240>] (platform_device_add+0x70/0x1bc)
[    6.779022] [<c05bd240>] (platform_device_add) from [<bf177830>] (omap2430_probe+0x260/0x2d4 [omap2430])
[    6.789001] [<bf177830>] (omap2430_probe [omap2430]) from [<c05bccb8>] (platform_probe+0x58/0xa8)
[    6.798309] [<c05bccb8>] (platform_probe) from [<c05badf0>] (really_probe+0x170/0x2fc)
[    6.806610] [<c05badf0>] (really_probe) from [<c05bb040>] (__driver_probe_device+0xc4/0xd8)
[    6.815399] [<c05bb040>] (__driver_probe_device) from [<c05bb084>] (driver_probe_device+0x30/0xac)
[    6.824798] [<c05bb084>] (driver_probe_device) from [<c05bb4b4>] (__driver_attach+0xc4/0xd8)
[    6.833648] [<c05bb4b4>] (__driver_attach) from [<c05b9308>] (bus_for_each_dev+0x64/0xa0)
[    6.842224] [<c05b9308>] (bus_for_each_dev) from [<c05ba248>] (bus_add_driver+0x148/0x1a4)
[    6.850891] [<c05ba248>] (bus_add_driver) from [<c05bbd1c>] (driver_register+0xb4/0xf8)
[    6.859313] [<c05bbd1c>] (driver_register) from [<c0101f54>] (do_one_initcall+0x90/0x1c8)
[    6.867889] [<c0101f54>] (do_one_initcall) from [<c0893968>] (do_init_module+0x4c/0x204)
[    6.876373] [<c0893968>] (do_init_module) from [<c01b4c30>] (load_module+0x13f0/0x1928)
[    6.884796] [<c01b4c30>] (load_module) from [<c01b53a0>] (sys_finit_module+0xa0/0xc0)
[    6.893005] [<c01b53a0>] (sys_finit_module) from [<c0100080>] (ret_fast_syscall+0x0/0x54)
[    6.901580] Exception stack(0xc2807fa8 to 0xc2807ff0)
[    6.906890] 7fa0:                   b6e517d4 00052068 00000006 b6e509f8 00000000 b6e5131c
[    6.915466] 7fc0: b6e517d4 00052068 cd718000 0000017b 00020000 00037f78 00050048 00063368
[    6.924011] 7fe0: bed8fef0 bed8fee0 b6e4ac4b b6f55a42
[    6.929321] ---[ end trace d715ff121b58763c ]---

c) git bisect result on testing for "musb-hdrc" in the console log:

cf081d009c is the first bad commit
commit cf081d009c
Author: Rob Herring <robh@kernel.org>
Date:   Wed Dec 15 17:07:57 2021 -0600

  usb: musb: Set the DT node on the child device

  The musb glue drivers just copy the glue resources to the musb child device.
  Instead, set the musb child device's DT node pointer to the parent device's
  node so that platform_get_irq_byname() can find the resources in the DT.
  This removes the need for statically populating the IRQ resources from the
  DT which has been deprecated for some time.

  Signed-off-by: Rob Herring <robh@kernel.org>
  Link: https://lore.kernel.org/r/20211215230756.2009115-3-robh@kernel.org
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drivers/usb/musb/am35x.c    | 2 ++
drivers/usb/musb/da8xx.c    | 2 ++
drivers/usb/musb/jz4740.c   | 1 +
drivers/usb/musb/mediatek.c | 2 ++
drivers/usb/musb/omap2430.c | 1 +
drivers/usb/musb/ux500.c    | 1 +
6 files changed, 9 insertions(+)

Reverting this patch makes musb work again as before.

Fixes: cf081d009c ("usb: musb: Set the DT node on the child device")
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Link: https://lore.kernel.org/r/f62f5fc11f9ecae7e57f3fd66939e051bd3b11fc.1646744166.git.hns@goldelico.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-15 15:49:15 +01:00
Dan Carpenter
65f3324f4b usb: gadget: rndis: prevent integer overflow in rndis_set_response()
If "BufOffset" is very large the "BufOffset + 8" operation can have an
integer overflow.

Cc: stable@kernel.org
Fixes: 38ea1eac7d ("usb: gadget: rndis: check size of RNDIS_MSG_SET command")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220301080424.GA17208@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-15 15:48:57 +01:00
Jiasheng Jiang
0f74b29a4f atm: eni: Add check for dma_map_single
As the potential failure of the dma_map_single(),
it should be better to check it and return error
if fails.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-15 11:01:52 +00:00
Hannes Reinecke
0c48645a7f nvmet: revert "nvmet: make discovery NQN configurable"
Revert commit 626851e922 ("nvmet: make discovery NQN configurable");
the interface was deemed incorrect and will be replaced with a different
one.

Fixes: 626851e922 ("nvmet: make discovery NQN configurable")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-03-15 10:39:26 +01:00
Eric Dumazet
c700525fcc net/packet: fix slab-out-of-bounds access in packet_recvmsg()
syzbot found that when an AF_PACKET socket is using PACKET_COPY_THRESH
and mmap operations, tpacket_rcv() is queueing skbs with
garbage in skb->cb[], triggering a too big copy [1]

Presumably, users of af_packet using mmap() already gets correct
metadata from the mapped buffer, we can simply make sure
to clear 12 bytes that might be copied to user space later.

BUG: KASAN: stack-out-of-bounds in memcpy include/linux/fortify-string.h:225 [inline]
BUG: KASAN: stack-out-of-bounds in packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
Write of size 165 at addr ffffc9000385fb78 by task syz-executor233/3631

CPU: 0 PID: 3631 Comm: syz-executor233 Not tainted 5.17.0-rc7-syzkaller-02396-g0b3660695e80 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0xf/0x336 mm/kasan/report.c:255
 __kasan_report mm/kasan/report.c:442 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
 check_region_inline mm/kasan/generic.c:183 [inline]
 kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
 memcpy+0x39/0x60 mm/kasan/shadow.c:66
 memcpy include/linux/fortify-string.h:225 [inline]
 packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
 sock_recvmsg_nosec net/socket.c:948 [inline]
 sock_recvmsg net/socket.c:966 [inline]
 sock_recvmsg net/socket.c:962 [inline]
 ____sys_recvmsg+0x2c4/0x600 net/socket.c:2632
 ___sys_recvmsg+0x127/0x200 net/socket.c:2674
 __sys_recvmsg+0xe2/0x1a0 net/socket.c:2704
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fdfd5954c29
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffcf8e71e48 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fdfd5954c29
RDX: 0000000000000000 RSI: 0000000020000500 RDI: 0000000000000005
RBP: 0000000000000000 R08: 000000000000000d R09: 000000000000000d
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffcf8e71e60
R13: 00000000000f4240 R14: 000000000000c1ff R15: 00007ffcf8e71e54
 </TASK>

addr ffffc9000385fb78 is located in stack of task syz-executor233/3631 at offset 32 in frame:
 ____sys_recvmsg+0x0/0x600 include/linux/uio.h:246

this frame has 1 object:
 [32, 160) 'addr'

Memory state around the buggy address:
 ffffc9000385fa80: 00 04 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00
 ffffc9000385fb00: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
>ffffc9000385fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f3
                                                                ^
 ffffc9000385fc00: f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 f1
 ffffc9000385fc80: f1 f1 f1 00 f2 f2 f2 00 f2 f2 f2 00 00 00 00 00
==================================================================

Fixes: 0fb375fb9b ("[AF_PACKET]: Allow for > 8 byte hardware addresses.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220312232958.3535620-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-14 22:08:34 -07:00
Michael Walle
0f8946ae70 net: mdio: mscc-miim: fix duplicate debugfs entry
This driver can have up to two regmaps. If the second one is registered
its debugfs entry will have the same name as the first one and the
following error will be printed:

[    3.833521] debugfs: Directory 'e200413c.mdio' with parent 'regmap' already present!

Give the second regmap a name to avoid this.

Fixes: a27a762828 ("net: mdio: mscc-miim: convert to a regmap implementation")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220312224140.4173930-1-michael@walle.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-14 22:04:15 -07:00
Matt Lupfer
69ad4ef868 scsi: mpt3sas: Page fault in reply q processing
A page fault was encountered in mpt3sas on a LUN reset error path:

[  145.763216] mpt3sas_cm1: Task abort tm failed: handle(0x0002),timeout(30) tr_method(0x0) smid(3) msix_index(0)
[  145.778932] scsi 1:0:0:0: task abort: FAILED scmd(0x0000000024ba29a2)
[  145.817307] scsi 1:0:0:0: attempting device reset! scmd(0x0000000024ba29a2)
[  145.827253] scsi 1:0:0:0: [sg1] tag#2 CDB: Receive Diagnostic 1c 01 01 ff fc 00
[  145.837617] scsi target1:0:0: handle(0x0002), sas_address(0x500605b0000272b9), phy(0)
[  145.848598] scsi target1:0:0: enclosure logical id(0x500605b0000272b8), slot(0)
[  149.858378] mpt3sas_cm1: Poll ReplyDescriptor queues for completion of smid(0), task_type(0x05), handle(0x0002)
[  149.875202] BUG: unable to handle page fault for address: 00000007fffc445d
[  149.885617] #PF: supervisor read access in kernel mode
[  149.894346] #PF: error_code(0x0000) - not-present page
[  149.903123] PGD 0 P4D 0
[  149.909387] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  149.917417] CPU: 24 PID: 3512 Comm: scsi_eh_1 Kdump: loaded Tainted: G S         O      5.10.89-altav-1 #1
[  149.934327] Hardware name: DDN           200NVX2             /200NVX2-MB          , BIOS ATHG2.2.02.01 09/10/2021
[  149.951871] RIP: 0010:_base_process_reply_queue+0x4b/0x900 [mpt3sas]
[  149.961889] Code: 0f 84 22 02 00 00 8d 48 01 49 89 fd 48 8d 57 38 f0 0f b1 4f 38 0f 85 d8 01 00 00 49 8b 45 10 45 31 e4 41 8b 55 0c 48 8d 1c d0 <0f> b6 03 83 e0 0f 3c 0f 0f 85 a2 00 00 00 e9 e6 01 00 00 0f b7 ee
[  149.991952] RSP: 0018:ffffc9000f1ebcb8 EFLAGS: 00010246
[  150.000937] RAX: 0000000000000055 RBX: 00000007fffc445d RCX: 000000002548f071
[  150.011841] RDX: 00000000ffff8881 RSI: 0000000000000001 RDI: ffff888125ed50d8
[  150.022670] RBP: 0000000000000000 R08: 0000000000000000 R09: c0000000ffff7fff
[  150.033445] R10: ffffc9000f1ebb68 R11: ffffc9000f1ebb60 R12: 0000000000000000
[  150.044204] R13: ffff888125ed50d8 R14: 0000000000000080 R15: 34cdc00034cdea80
[  150.054963] FS:  0000000000000000(0000) GS:ffff88dfaf200000(0000) knlGS:0000000000000000
[  150.066715] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  150.076078] CR2: 00000007fffc445d CR3: 000000012448a006 CR4: 0000000000770ee0
[  150.086887] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  150.097670] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  150.108323] PKRU: 55555554
[  150.114690] Call Trace:
[  150.120497]  ? printk+0x48/0x4a
[  150.127049]  mpt3sas_scsih_issue_tm.cold.114+0x2e/0x2b3 [mpt3sas]
[  150.136453]  mpt3sas_scsih_issue_locked_tm+0x86/0xb0 [mpt3sas]
[  150.145759]  scsih_dev_reset+0xea/0x300 [mpt3sas]
[  150.153891]  scsi_eh_ready_devs+0x541/0x9e0 [scsi_mod]
[  150.162206]  ? __scsi_host_match+0x20/0x20 [scsi_mod]
[  150.170406]  ? scsi_try_target_reset+0x90/0x90 [scsi_mod]
[  150.178925]  ? blk_mq_tagset_busy_iter+0x45/0x60
[  150.186638]  ? scsi_try_target_reset+0x90/0x90 [scsi_mod]
[  150.195087]  scsi_error_handler+0x3a5/0x4a0 [scsi_mod]
[  150.203206]  ? __schedule+0x1e9/0x610
[  150.209783]  ? scsi_eh_get_sense+0x210/0x210 [scsi_mod]
[  150.217924]  kthread+0x12e/0x150
[  150.224041]  ? kthread_worker_fn+0x130/0x130
[  150.231206]  ret_from_fork+0x1f/0x30

This is caused by mpt3sas_base_sync_reply_irqs() using an invalid reply_q
pointer outside of the list_for_each_entry() loop. At the end of the full
list traversal the pointer is invalid.

Move the _base_process_reply_queue() call inside of the loop.

Link: https://lore.kernel.org/r/d625deae-a958-0ace-2ba3-0888dd0a415b@ddn.com
Fixes: 711a923c14 ("scsi: mpt3sas: Postprocessing of target and LUN reset")
Cc: stable@vger.kernel.org
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Matt Lupfer <mlupfer@ddn.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14 23:45:19 -04:00
Pavel Skripkin
5600f69866 Input: aiptek - properly check endpoint type
Syzbot reported warning in usb_submit_urb() which is caused by wrong
endpoint type. There was a check for the number of endpoints, but not
for the type of endpoint.

Fix it by replacing old desc.bNumEndpoints check with
usb_find_common_endpoints() helper for finding endpoints

Fail log:

usb 5-1: BOGUS urb xfer, pipe 1 != type 3
WARNING: CPU: 2 PID: 48 at drivers/usb/core/urb.c:502 usb_submit_urb+0xed2/0x18a0 drivers/usb/core/urb.c:502
Modules linked in:
CPU: 2 PID: 48 Comm: kworker/2:2 Not tainted 5.17.0-rc6-syzkaller-00226-g07ebd38a0da2 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Workqueue: usb_hub_wq hub_event
...
Call Trace:
 <TASK>
 aiptek_open+0xd5/0x130 drivers/input/tablet/aiptek.c:830
 input_open_device+0x1bb/0x320 drivers/input/input.c:629
 kbd_connect+0xfe/0x160 drivers/tty/vt/keyboard.c:1593

Fixes: 8e20cf2bce ("Input: aiptek - fix crash on detecting device without endpoints")
Reported-and-tested-by: syzbot+75cccf2b7da87fb6f84b@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Link: https://lore.kernel.org/r/20220308194328.26220-1-paskripkin@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-03-14 18:15:11 -07:00
Jakub Kicinski
15d703921f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net coming late
in the 5.17-rc process:

1) Revert port remap to mitigate shadowing service ports, this is causing
   problems in existing setups and this mitigation can be achieved with
   explicit ruleset, eg.

	... tcp sport < 16386 tcp dport >= 32768 masquerade random

  This patches provided a built-in policy similar to the one described above.

2) Disable register tracking infrastructure in nf_tables. Florian reported
   two issues:

   - Existing expressions with no implemented .reduce interface
     that causes data-store on register should cancel the tracking.
   - Register clobbering might be possible storing data on registers that
     are larger than 32-bits.

   This might lead to generating incorrect ruleset bytecode. These two
   issues are scheduled to be addressed in the next release cycle.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: disable register tracking
  Revert "netfilter: conntrack: tag conntracks picked up in local out hook"
  Revert "netfilter: nat: force port remap to prevent shadowing well-known ports"
====================

Link: https://lore.kernel.org/r/20220312220315.64531-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-14 15:51:10 -07:00
Kurt Cancemi
837d9e4940 net: phy: marvell: Fix invalid comparison in the resume and suspend functions
This bug resulted in only the current mode being resumed and suspended when
the PHY supported both fiber and copper modes and when the PHY only supported
copper mode the fiber mode would incorrectly be attempted to be resumed and
suspended.

Fixes: 3758be3dc1 ("Marvell phy: add functions to suspend and resume both interfaces: fiber and copper links.")
Signed-off-by: Kurt Cancemi <kurt@x64architecture.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220312201512.326047-1-kurt@x64architecture.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-14 15:08:37 -07:00
Ming Lei
daaca3522a block: release rq qos structures for queue without disk
blkcg_init_queue() may add rq qos structures to request queue, previously
blk_cleanup_queue() calls rq_qos_exit() to release them, but commit
8e141f9eb8 ("block: drain file system I/O on del_gendisk")
moves rq_qos_exit() into del_gendisk(), so memory leak is caused
because queues may not have disk, such as un-present scsi luns, nvme
admin queue, ...

Fixes the issue by adding rq_qos_exit() to blk_cleanup_queue() back.

BTW, v5.18 won't need this patch any more since we move
blkcg_init_queue()/blkcg_exit_queue() into disk allocation/release
handler, and patches have been in for-5.18/block.

Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Fixes: 8e141f9eb8 ("block: drain file system I/O on del_gendisk")
Reported-by: syzbot+b42749a851a47a0f581b@syzkaller.appspotmail.com
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220314043018.177141-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-14 14:05:41 -06:00
Linus Torvalds
6665ca1574 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fix from Michael Tsirkin:
 "A last minute regression fix.

  I thought we did a lot of testing, but a regression still managed to
  sneak in. The fix seems trivial"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: allow batching hint without size
2022-03-14 11:21:52 -07:00
Sabrina Dubroca
4db4075f92 esp6: fix check on ipv6_skip_exthdr's return value
Commit 5f9c55c806 ("ipv6: check return value of ipv6_skip_exthdr")
introduced an incorrect check, which leads to all ESP packets over
either TCPv6 or UDPv6 encapsulation being dropped. In this particular
case, offset is negative, since skb->data points to the ESP header in
the following chain of headers, while skb->network_header points to
the IPv6 header:

    IPv6 | ext | ... | ext | UDP | ESP | ...

That doesn't seem to be a problem, especially considering that if we
reach esp6_input_done2, we're guaranteed to have a full set of headers
available (otherwise the packet would have been dropped earlier in the
stack). However, it means that the return value will (intentionally)
be negative. We can make the test more specific, as the expected
return value of ipv6_skip_exthdr will be the (negated) size of either
a UDP header, or a TCP header with possible options.

In the future, we should probably either make ipv6_skip_exthdr
explicitly accept negative offsets (and adjust its return value for
error cases), or make ipv6_skip_exthdr only take non-negative
offsets (and audit all callers).

Fixes: 5f9c55c806 ("ipv6: check return value of ipv6_skip_exthdr")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-03-14 11:42:27 +01:00
Claudiu Beznea
e981bc74ae net: dsa: microchip: add spi_device_id tables
Add spi_device_id tables to avoid logs like "SPI driver ksz9477-switch
has no spi_device_id".

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-14 10:08:08 +00:00
Brian Masney
a680b1832c crypto: qcom-rng - ensure buffer for generate is completely filled
The generate function in struct rng_alg expects that the destination
buffer is completely filled if the function returns 0. qcom_rng_read()
can run into a situation where the buffer is partially filled with
randomness and the remaining part of the buffer is zeroed since
qcom_rng_generate() doesn't check the return value. This issue can
be reproduced by running the following from libkcapi:

    kcapi-rng -b 9000000 > OUTFILE

The generated OUTFILE will have three huge sections that contain all
zeros, and this is caused by the code where the test
'val & PRNG_STATUS_DATA_AVAIL' fails.

Let's fix this issue by ensuring that qcom_rng_read() always returns
with a full buffer if the function returns success. Let's also have
qcom_rng_generate() return the correct value.

Here's some statistics from the ent project
(https://www.fourmilab.ch/random/) that shows information about the
quality of the generated numbers:

    $ ent -c qcom-random-before
    Value Char Occurrences Fraction
      0           606748   0.067416
      1            33104   0.003678
      2            33001   0.003667
    ...
    253   �        32883   0.003654
    254   �        33035   0.003671
    255   �        33239   0.003693

    Total:       9000000   1.000000

    Entropy = 7.811590 bits per byte.

    Optimum compression would reduce the size
    of this 9000000 byte file by 2 percent.

    Chi square distribution for 9000000 samples is 9329962.81, and
    randomly would exceed this value less than 0.01 percent of the
    times.

    Arithmetic mean value of data bytes is 119.3731 (127.5 = random).
    Monte Carlo value for Pi is 3.197293333 (error 1.77 percent).
    Serial correlation coefficient is 0.159130 (totally uncorrelated =
    0.0).

Without this patch, the results of the chi-square test is 0.01%, and
the numbers are certainly not random according to ent's project page.
The results improve with this patch:

    $ ent -c qcom-random-after
    Value Char Occurrences Fraction
      0            35432   0.003937
      1            35127   0.003903
      2            35424   0.003936
    ...
    253   �        35201   0.003911
    254   �        34835   0.003871
    255   �        35368   0.003930

    Total:       9000000   1.000000

    Entropy = 7.999979 bits per byte.

    Optimum compression would reduce the size
    of this 9000000 byte file by 0 percent.

    Chi square distribution for 9000000 samples is 258.77, and randomly
    would exceed this value 42.24 percent of the times.

    Arithmetic mean value of data bytes is 127.5006 (127.5 = random).
    Monte Carlo value for Pi is 3.141277333 (error 0.01 percent).
    Serial correlation coefficient is 0.000468 (totally uncorrelated =
    0.0).

This change was tested on a Nexus 5 phone (msm8974 SoC).

Signed-off-by: Brian Masney <bmasney@redhat.com>
Fixes: ceec5f5b59 ("crypto: qcom-rng - Add Qcom prng driver")
Cc: stable@vger.kernel.org # 4.19+
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-03-14 14:41:04 +12:00
Linus Torvalds
09688c0166 Linux 5.17-rc8 2022-03-13 13:23:37 -07:00
Jocelyn Falempe
40ce1121c1 drm/mgag200: Fix PLL setup for g200wb and g200ew
commit f86c3ed559 ("drm/mgag200: Split PLL setup into compute and
 update functions") introduced a regression for g200wb and g200ew.
The PLLs are not set up properly, and VGA screen stays
black, or displays "out of range" message.

MGA1064_WB_PIX_PLLC_N/M/P was mistakenly replaced with
MGA1064_PIX_PLLC_N/M/P which have different addresses.

Patch tested on a Dell T310 with g200wb

Fixes: f86c3ed559 ("drm/mgag200: Split PLL setup into compute and update functions")
Cc: stable@vger.kernel.org
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220308174321.225606-1-jfalempe@redhat.com
2022-03-13 20:35:06 +01:00
Linus Torvalds
f0e18b03fc Merge tag 'x86_urgent_for_v5.17_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Free shmem backing storage for SGX enclave pages when those are
   swapped back into EPC memory

 - Prevent do_int3() from being kprobed, to avoid recursion

 - Remap setup_data and setup_indirect structures properly when
   accessing their members

 - Correct the alternatives patching order for modules too

* tag 'x86_urgent_for_v5.17_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Free backing memory after faulting the enclave page
  x86/traps: Mark do_int3() NOKPROBE_SYMBOL
  x86/boot: Add setup_indirect support in early_memremap_is_setup_data()
  x86/boot: Fix memremap of setup_indirect structures
  x86/module: Fix the paravirt vs alternative order
2022-03-13 10:36:38 -07:00
Linus Torvalds
aad611a868 Merge tag 'perf-tools-fixes-for-v5.17-2022-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix event parser error for hybrid systems

 - Fix NULL check against wrong variable in 'perf bench' and in the
   parsing code

 - Update arm64 KVM headers from the kernel sources

 - Sync cpufeatures header with the kernel sources

* tag 'perf-tools-fixes-for-v5.17-2022-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf parse: Fix event parser error for hybrid systems
  perf bench: Fix NULL check against wrong variable
  perf parse-events: Fix NULL check against wrong variable
  tools headers cpufeatures: Sync with the kernel sources
  tools kvm headers arm64: Update KVM headers from the kernel sources
2022-03-12 10:29:25 -08:00
Linus Torvalds
1518a4f636 Merge tag 'drm-fixes-2022-03-12' of git://anongit.freedesktop.org/drm/drm
Pull drm kconfig fix from Dave Airlie:
 "Thorsten pointed out this had fallen down the cracks and was in -next
  only, I've picked it out, fixed up it's Fixes: line.

   - fix regression in Kconfig"

* tag 'drm-fixes-2022-03-12' of git://anongit.freedesktop.org/drm/drm:
  drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP
2022-03-12 10:22:43 -08:00
Pablo Neira Ayuso
ed5f85d422 netfilter: nf_tables: disable register tracking
The register tracking infrastructure is incomplete, it might lead to
generating incorrect ruleset bytecode, disable it by now given we are
late in the release process.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-03-12 16:07:38 +01:00
Zhengjun Xing
91c9923a47 perf parse: Fix event parser error for hybrid systems
This bug happened on hybrid systems when both cpu_core and cpu_atom
have the same event name such as "UOPS_RETIRED.MS" while their event
terms are different, then during perf stat, the event for cpu_atom
will parse fail and then no output for cpu_atom.

UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/

It is because event terms in the "head" of parse_events_multi_pmu_add
will be changed to event terms for cpu_core after parsing UOPS_RETIRED.MS
for cpu_core, then when parsing the same event for cpu_atom, it still
uses the event terms for cpu_core, but event terms for cpu_atom are
different with cpu_core, the event parses for cpu_atom will fail. This
patch fixes it, the event terms should be parsed from the original
event.

This patch can work for the hybrid systems that have the same event
in more than 2 PMUs. It also can work in non-hybrid systems.

Before:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 2737845 16068518485 16068518485

 Performance counter stats for 'system wide':

         2,737,845      cpu_core/UOPS_RETIRED.MS/

       1.002553850 seconds time elapsed

After:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 1977555 16076950711 16076950711
  UOPS_RETIRED.MS: 568684 8038694234 8038694234

 Performance counter stats for 'system wide':

         1,977,555      cpu_core/UOPS_RETIRED.MS/
           568,684      cpu_atom/UOPS_RETIRED.MS/

       1.004758259 seconds time elapsed

Fixes: fb0811535e ("perf parse-events: Allow config on kernel PMU events")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220307151627.30049-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 11:04:55 -03:00
Weiguo Li
073a15c351 perf bench: Fix NULL check against wrong variable
We did a NULL check after "epollfdp = calloc(...)", but we checked
"epollfd" instead of "epollfdp".

Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/tencent_B5D64530EB9C7DBB8D2C88A0C790F1489D0A@qq.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:49:13 -03:00
Weiguo Li
a7a72631f6 perf parse-events: Fix NULL check against wrong variable
We did a null check after "tmp->symbol = strdup(...)", but we checked
"list->symbol" other than "tmp->symbol".

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/tencent_DF39269807EC9425E24787E6DB632441A405@qq.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:47:52 -03:00
Arnaldo Carvalho de Melo
ec9d50ace3 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  d45476d983 ("x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE")

Its just a comment fixup.

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YiyiHatGaJQM7l/Y@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:38:05 -03:00
Arnaldo Carvalho de Melo
3ec94eeaff tools kvm headers arm64: Update KVM headers from the kernel sources
To pick the changes from:

  a5905d6af4 ("KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated")

That don't causes any changes in tooling (when built on x86), only
addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h

Cc: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/lkml/YiyhAK6sVPc83FaI@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:33:13 -03:00
Thomas Zimmermann
3755d35ee1 drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP
As reported in [1], DRM_PANEL_EDP depends on DRM_DP_HELPER. Select
the option to fix the build failure. The error message is shown
below.

  arm-linux-gnueabihf-ld: drivers/gpu/drm/panel/panel-edp.o: in function
    `panel_edp_probe': panel-edp.c:(.text+0xb74): undefined reference to
    `drm_panel_dp_aux_backlight'
  make[1]: *** [/builds/linux/Makefile:1222: vmlinux] Error 1

The issue has been reported before, when DisplayPort helpers were
hidden behind the option CONFIG_DRM_KMS_HELPER. [2]

v2:
	* fix and expand commit description (Arnd)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 9d6366e743 ("drm: fb_helper: improve CONFIG_FB dependency")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://lore.kernel.org/dri-devel/CA+G9fYvN0NyaVkRQmA1O6rX7H8PPaZrUAD7=RDy33QY9rUU-9g@mail.gmail.com/ # [1]
Link: https://lore.kernel.org/all/20211117062704.14671-1-rdunlap@infradead.org/ # [2]
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220203093922.20754-1-tzimmermann@suse.de
Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-03-12 17:41:30 +10:00
Jiyong Park
8e6ed96376 vsock: each transport cycles only on its own sockets
When iterating over sockets using vsock_for_each_connected_socket, make
sure that a transport filters out sockets that don't belong to the
transport.

There actually was an issue caused by this; in a nested VM
configuration, destroying the nested VM (which often involves the
closing of /dev/vhost-vsock if there was h2g connections to the nested
VM) kills not only the h2g connections, but also all existing g2h
connections to the (outmost) host which are totally unrelated.

Tested: Executed the following steps on Cuttlefish (Android running on a
VM) [1]: (1) Enter into an `adb shell` session - to have a g2h
connection inside the VM, (2) open and then close /dev/vhost-vsock by
`exec 3< /dev/vhost-vsock && exec 3<&-`, (3) observe that the adb
session is not reset.

[1] https://android.googlesource.com/device/google/cuttlefish/

Fixes: c0cfa2d8a7 ("vsock: add multi-transports support")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jiyong Park <jiyong@google.com>
Link: https://lore.kernel.org/r/20220311020017.1509316-1-jiyong@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-11 23:14:19 -08:00
Niels Dossche
46b348fd2d alx: acquire mutex for alx_reinit in alx_change_mtu
alx_reinit has a lockdep assertion that the alx->mtx mutex must be held.
alx_reinit is called from two places: alx_reset and alx_change_mtu.
alx_reset does acquire alx->mtx before calling alx_reinit.
alx_change_mtu does not acquire this mutex, nor do its callers or any
path towards alx_change_mtu.
Acquire the mutex in alx_change_mtu.

The issue was introduced when the fine-grained locking was introduced
to the code to replace the RTNL. The same commit also introduced the
lockdep assertion.

Fixes: 4a5fe57e77 ("alx: use fine-grained locking instead of RTNL")
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Link: https://lore.kernel.org/r/20220310232707.44251-1-dossche.niels@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-11 22:53:14 -08:00
Tadeusz Struk
5e34af4142 net: ipv6: fix skb_over_panic in __ip6_append_data
Syzbot found a kernel bug in the ipv6 stack:
LINK: https://syzkaller.appspot.com/bug?id=205d6f11d72329ab8d62a610c44c5e7e25415580
The reproducer triggers it by sending a crafted message via sendmmsg()
call, which triggers skb_over_panic, and crashes the kernel:

skbuff: skb_over_panic: text:ffffffff84647fb4 len:65575 put:65575
head:ffff888109ff0000 data:ffff888109ff0088 tail:0x100af end:0xfec0
dev:<NULL>

Update the check that prevents an invalid packet with MTU equal
to the fregment header size to eat up all the space for payload.

The reproducer can be found here:
LINK: https://syzkaller.appspot.com/text?tag=ReproC&x=1648c83fb00000

Reported-by: syzbot+e223cf47ec8ae183f2a0@syzkaller.appspotmail.com
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20220310232538.1044947-1-tadeusz.struk@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-11 17:11:38 -08:00
Randy Dunlap
6845376713 ARM: Spectre-BHB: provide empty stub for non-config
When CONFIG_GENERIC_CPU_VULNERABILITIES is not set, references
to spectre_v2_update_state() cause a build error, so provide an
empty stub for that function when the Kconfig option is not set.

Fixes this build error:

  arm-linux-gnueabi-ld: arch/arm/mm/proc-v7-bugs.o: in function `cpu_v7_bugs_init':
  proc-v7-bugs.c:(.text+0x52): undefined reference to `spectre_v2_update_state'
  arm-linux-gnueabi-ld: proc-v7-bugs.c:(.text+0x82): undefined reference to `spectre_v2_update_state'

Fixes: b9baf5c8c5 ("ARM: Spectre-BHB workaround")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: patches@armlinux.org.uk
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 12:42:49 -08:00
Linus Torvalds
77fe1ba902 Merge tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - prevent users from enabling the alternatives framework (and thus
   errata handling) on XIP kernels, where runtime code patching does not
   function correctly.

 - properly detect offset overflow for AUIPC-based relocations in
   modules. This may manifest as modules calling arbitrary invalid
   addresses, depending on the address allocated when a module is
   loaded.

* tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix auipc+jalr relocation range checks
  riscv: alternative only works on !XIP_KERNEL
2022-03-11 12:28:21 -08:00
Linus Torvalds
878409ecde Merge tag 'powerpc-5.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
 "Fix STACKTRACE=n build, in particular for skiroot_defconfig"

* tag 'powerpc-5.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Fix STACKTRACE=n build
2022-03-11 11:50:36 -08:00
Russell King (Oracle)
6c7cb60bff ARM: fix Thumb2 regression with Spectre BHB
When building for Thumb2, the vectors make use of a local label. Sadly,
the Spectre BHB code also uses a local label with the same number which
results in the Thumb2 reference pointing at the wrong place. Fix this
by changing the number used for the Spectre BHB local label.

Fixes: b9baf5c8c5 ("ARM: Spectre-BHB workaround")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 11:40:08 -08:00
Linus Torvalds
3977a3fb67 Merge tag 'mmc-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Restore (mostly) the busy polling for MMC_SEND_OP_COND

  MMC host:
   - meson-gx: Fix DMA usage of meson_mmc_post_req()"

* tag 'mmc-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: Restore (almost) the busy polling for MMC_SEND_OP_COND
  mmc: meson: Fix usage of meson_mmc_post_req()
2022-03-11 11:24:58 -08:00
Jarkko Sakkinen
08999b2489 x86/sgx: Free backing memory after faulting the enclave page
There is a limited amount of SGX memory (EPC) on each system.  When that
memory is used up, SGX has its own swapping mechanism which is similar
in concept but totally separate from the core mm/* code.  Instead of
swapping to disk, SGX swaps from EPC to normal RAM.  That normal RAM
comes from a shared memory pseudo-file and can itself be swapped by the
core mm code.  There is a hierarchy like this:

	EPC <-> shmem <-> disk

After data is swapped back in from shmem to EPC, the shmem backing
storage needs to be freed.  Currently, the backing shmem is not freed.
This effectively wastes the shmem while the enclave is running.  The
memory is recovered when the enclave is destroyed and the backing
storage freed.

Sort this out by freeing memory with shmem_truncate_range(), as soon as
a page is faulted back to the EPC.  In addition, free the memory for
PCMD pages as soon as all PCMD's in a page have been marked as unused
by zeroing its contents.

Cc: stable@vger.kernel.org
Fixes: 1728ab54b4 ("x86/sgx: Add a page reclaimer")
Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220303223859.273187-1-jarkko@kernel.org
2022-03-11 10:31:06 -08:00
Linus Torvalds
93ce93587d Merge branch 'davidh' (fixes from David Howells)
Merge misc fixes from David Howells:
 "A set of patches for watch_queue filter issues noted by Jann. I've
  added in a cleanup patch from Christophe Jaillet to convert to using
  formal bitmap specifiers for the note allocation bitmap.

  Also two filesystem fixes (afs and cachefiles)"

* emailed patches from David Howells <dhowells@redhat.com>:
  cachefiles: Fix volume coherency attribute
  afs: Fix potential thrashing in afs writeback
  watch_queue: Make comment about setting ->defunct more accurate
  watch_queue: Fix lack of barrier/sync/lock between post and read
  watch_queue: Free the alloc bitmap when the watch_queue is torn down
  watch_queue: Fix the alloc bitmap size to reflect notes allocated
  watch_queue: Use the bitmap API when applicable
  watch_queue: Fix to always request a pow-of-2 pipe ring size
  watch_queue: Fix to release page in ->release()
  watch_queue, pipe: Free watchqueue state after clearing pipe ring
  watch_queue: Fix filter limit check
2022-03-11 10:28:32 -08:00
David Howells
413a4a6b0b cachefiles: Fix volume coherency attribute
A network filesystem may set coherency data on a volume cookie, and if
given, cachefiles will store this in an xattr on the directory in the
cache corresponding to the volume.

The function that sets the xattr just stores the contents of the volume
coherency buffer directly into the xattr, with nothing added; the
checking function, on the other hand, has a cut'n'paste error whereby it
tries to interpret the xattr contents as would be the xattr on an
ordinary file (using the cachefiles_xattr struct).  This results in a
failure to match the coherency data because the buffer ends up being
shifted by 18 bytes.

Fix this by defining a structure specifically for the volume xattr and
making both the setting and checking functions use it.

Since the volume coherency doesn't work if used, take the opportunity to
insert a reserved field for future use, set it to 0 and check that it is
0.  Log mismatch through the appropriate tracepoint.

Note that this only affects cifs; 9p, afs, ceph and nfs don't use the
volume coherency data at the moment.

Fixes: 32e150037d ("fscache, cachefiles: Store the volume coherency data")
Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: Steve French <smfrench@gmail.com>
cc: linux-cifs@vger.kernel.org
cc: linux-cachefs@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:24:37 -08:00
David Howells
173ce1ca47 afs: Fix potential thrashing in afs writeback
In afs_writepages_region(), if the dirty page we find is undergoing
writeback or write to cache, but the sync_mode is WB_SYNC_NONE, we go
round the loop trying the same page again and again with no pausing or
waiting unless and until another thread manages to clear the writeback
and fscache flags.

Fix this with three measures:

 (1) Advance start to after the page we found.

 (2) Break out of the loop and return if rescheduling is requested.

 (3) Arbitrarily give up after a maximum of 5 skips.

Fixes: 31143d5d51 ("AFS: implement basic file write support")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Acked-by: Marc Dionne <marc.dionne@auristor.com>
Link: https://lore.kernel.org/r/164692725757.2097000.2060513769492301854.stgit@warthog.procyon.org.uk/ # v1
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:24:37 -08:00
Li Huafei
a365a65f9c x86/traps: Mark do_int3() NOKPROBE_SYMBOL
Since kprobe_int3_handler() is called in do_int3(), probing do_int3()
can cause a breakpoint recursion and crash the kernel. Therefore,
do_int3() should be marked as NOKPROBE_SYMBOL.

Fixes: 21e28290b3 ("x86/traps: Split int3 handler up")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220310120915.63349-1-lihuafei1@huawei.com
2022-03-11 19:19:30 +01:00
David Howells
4edc076041 watch_queue: Make comment about setting ->defunct more accurate
watch_queue_clear() has a comment stating that setting ->defunct to true
preventing new additions as well as preventing notifications.  Whilst
the latter is true, the first bit is superfluous since at the time this
function is called, the pipe cannot be accessed to add new event
sources.

Remove the "new additions" bit from the comment.

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:13 -08:00
David Howells
2ed147f015 watch_queue: Fix lack of barrier/sync/lock between post and read
There's nothing to synchronise post_one_notification() versus
pipe_read().  Whilst posting is done under pipe->rd_wait.lock, the
reader only takes pipe->mutex which cannot bar notification posting as
that may need to be made from contexts that cannot sleep.

Fix this by setting pipe->head with a barrier in post_one_notification()
and reading pipe->head with a barrier in pipe_read().

If that's not sufficient, the rd_wait.lock will need to be taken,
possibly in a ->confirm() op so that it only applies to notifications.
The lock would, however, have to be dropped before copy_page_to_iter()
is invoked.

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:13 -08:00
David Howells
7ea1a0124b watch_queue: Free the alloc bitmap when the watch_queue is torn down
Free the watch_queue note allocation bitmap when the watch_queue is
destroyed.

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:13 -08:00
David Howells
3b4c037192 watch_queue: Fix the alloc bitmap size to reflect notes allocated
Currently, watch_queue_set_size() sets the number of notes available in
wqueue->nr_notes according to the number of notes allocated, but sets
the size of the bitmap to the unrounded number of notes originally asked
for.

Fix this by setting the bitmap size to the number of notes we're
actually going to make available (ie. the number allocated).

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:12 -08:00
Christophe JAILLET
a66bd7575b watch_queue: Use the bitmap API when applicable
Use bitmap_alloc() to simplify code, improve the semantic and reduce
some open-coded arithmetic in allocator arguments.

Also change a memset(0xff) into an equivalent bitmap_fill() to keep
consistency.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:12 -08:00
David Howells
96a4d8912b watch_queue: Fix to always request a pow-of-2 pipe ring size
The pipe ring size must always be a power of 2 as the head and tail
pointers are masked off by AND'ing with the size of the ring - 1.
watch_queue_set_size(), however, lets you specify any number of notes
between 1 and 511.  This number is passed through to pipe_resize_ring()
without checking/forcing its alignment.

Fix this by rounding the number of slots required up to the nearest
power of two.  The request is meant to guarantee that at least that many
notifications can be generated before the queue is full, so rounding
down isn't an option, but, alternatively, it may be better to give an
error if we aren't allowed to allocate that much ring space.

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:12 -08:00
David Howells
c1853fbadc watch_queue: Fix to release page in ->release()
When a pipe ring descriptor points to a notification message, the
refcount on the backing page is incremented by the generic get function,
but the release function, which marks the bitmap, doesn't drop the page
ref.

Fix this by calling generic_pipe_buf_release() at the end of
watch_queue_pipe_buf_release().

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:12 -08:00
David Howells
db8facfc9f watch_queue, pipe: Free watchqueue state after clearing pipe ring
In free_pipe_info(), free the watchqueue state after clearing the pipe
ring as each pipe ring descriptor has a release function, and in the
case of a notification message, this is watch_queue_pipe_buf_release()
which tries to mark the allocation bitmap that was previously released.

Fix this by moving the put of the pipe's ref on the watch queue to after
the ring has been cleared.  We still need to call watch_queue_clear()
before doing that to make sure that the pipe is disconnected from any
notification sources first.

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:12 -08:00
David Howells
c993ee0f9f watch_queue: Fix filter limit check
In watch_queue_set_filter(), there are a couple of places where we check
that the filter type value does not exceed what the type_filter bitmap
can hold.  One place calculates the number of bits by:

   if (tf[i].type >= sizeof(wfilter->type_filter) * 8)

which is fine, but the second does:

   if (tf[i].type >= sizeof(wfilter->type_filter) * BITS_PER_LONG)

which is not.  This can lead to a couple of out-of-bounds writes due to
a too-large type:

 (1) __set_bit() on wfilter->type_filter
 (2) Writing more elements in wfilter->filters[] than we allocated.

Fix this by just using the proper WATCH_TYPE__NR instead, which is the
number of types we actually know about.

The bug may cause an oops looking something like:

  BUG: KASAN: slab-out-of-bounds in watch_queue_set_filter+0x659/0x740
  Write of size 4 at addr ffff88800d2c66bc by task watch_queue_oob/611
  ...
  Call Trace:
   <TASK>
   dump_stack_lvl+0x45/0x59
   print_address_description.constprop.0+0x1f/0x150
   ...
   kasan_report.cold+0x7f/0x11b
   ...
   watch_queue_set_filter+0x659/0x740
   ...
   __x64_sys_ioctl+0x127/0x190
   do_syscall_64+0x43/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

  Allocated by task 611:
   kasan_save_stack+0x1e/0x40
   __kasan_kmalloc+0x81/0xa0
   watch_queue_set_filter+0x23a/0x740
   __x64_sys_ioctl+0x127/0x190
   do_syscall_64+0x43/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

  The buggy address belongs to the object at ffff88800d2c66a0
   which belongs to the cache kmalloc-32 of size 32
  The buggy address is located 28 bytes inside of
   32-byte region [ffff88800d2c66a0, ffff88800d2c66c0)

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:12 -08:00
Linus Torvalds
79b00034e9 Merge tag 'drm-fixes-2022-03-11' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "As expected at this stage its pretty quiet, one sun4i mixer fix and
  one i915 display flicker fix:

  i915:
   - fix psr screen flicker

  sun4i:
   - mixer format fix"

* tag 'drm-fixes-2022-03-11' of git://anongit.freedesktop.org/drm/drm:
  drm/sun4i: mixer: Fix P010 and P210 format numbers
  drm/i915/psr: Set "SF Partial Frame Enable" also on full update
2022-03-10 21:15:42 -08:00
Emil Renner Berthing
0966d38583 riscv: Fix auipc+jalr relocation range checks
RISC-V can do PC-relative jumps with a 32bit range using the following
two instructions:

	auipc	t0, imm20	; t0 = PC + imm20 * 2^12
	jalr	ra, t0, imm12	; ra = PC + 4, PC = t0 + imm12

Crucially both the 20bit immediate imm20 and the 12bit immediate imm12
are treated as two's-complement signed values. For this reason the
immediates are usually calculated like this:

	imm20 = (offset + 0x800) >> 12
	imm12 = offset & 0xfff

..where offset is the signed offset from the auipc instruction. When
the 11th bit of offset is 0 the addition of 0x800 doesn't change the top
20 bits and imm12 considered positive. When the 11th bit is 1 the carry
of the addition by 0x800 means imm20 is one higher, but since imm12 is
then considered negative the two's complement representation means it
all cancels out nicely.

However, this addition by 0x800 (2^11) means an offset greater than or
equal to 2^31 - 2^11 would overflow so imm20 is considered negative and
result in a backwards jump. Similarly the lower range of offset is also
moved down by 2^11 and hence the true 32bit range is

	[-2^31 - 2^11, 2^31 - 2^11)

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Fixes: e2c0cdfba7 ("RISC-V: User-facing API")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-10 20:37:44 -08:00
Dave Airlie
30eb13a260 Merge tag 'drm-intel-fixes-2022-03-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix PSR2 when selective fetch is enabled and cursor at (-1, -1) (Jouni Högander)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YinTFSFg++HvuFpZ@tursulin-mobl2
2022-03-11 13:26:19 +10:00
Dave Airlie
1f37299bb4 Merge tag 'drm-misc-fixes-2022-03-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
* drm/sun4i: Fix P010 and P210 format numbers

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YipS65Iuu7RMMlAa@linux-uq9g
2022-03-11 13:26:03 +10:00
Linus Torvalds
dda64ead7e Merge tag 'trace-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
 "Minor tracing fixes:

   - Fix unregistering the same event twice. A user could disable the
     same event that osnoise will disable on unregistering.

   - Inform RCU of a quiescent state in the osnoise testing thread.

   - Fix some kerneldoc comments"

* tag 'trace-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix some W=1 warnings in kernel doc comments
  tracing/osnoise: Force quiescent states while tracing
  tracing/osnoise: Do not unregister events twice
2022-03-10 17:23:08 -08:00
Linus Torvalds
186d32bbf0 Merge tag 'net-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, and ipsec.

  Current release - regressions:

   - Bluetooth: fix unbalanced unlock in set_device_flags()

   - Bluetooth: fix not processing all entries on cmd_sync_work, make
     connect with qualcomm and intel adapters reliable

   - Revert "xfrm: state and policy should fail if XFRMA_IF_ID 0"

   - xdp: xdp_mem_allocator can be NULL in trace_mem_connect()

   - eth: ice: fix race condition and deadlock during interface enslave

  Current release - new code bugs:

   - tipc: fix incorrect order of state message data sanity check

  Previous releases - regressions:

   - esp: fix possible buffer overflow in ESP transformation

   - dsa: unlock the rtnl_mutex when dsa_master_setup() fails

   - phy: meson-gxl: fix interrupt handling in forced mode

   - smsc95xx: ignore -ENODEV errors when device is unplugged

  Previous releases - always broken:

   - xfrm: fix tunnel mode fragmentation behavior

   - esp: fix inter address family tunneling on GSO

   - tipc: fix null-deref due to race when enabling bearer

   - sctp: fix kernel-infoleak for SCTP sockets

   - eth: macb: fix lost RX packet wakeup race in NAPI receive

   - eth: intel stop disabling VFs due to PF error responses

   - eth: bcmgenet: don't claim WOL when its not available"

* tag 'net-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
  xdp: xdp_mem_allocator can be NULL in trace_mem_connect().
  ice: Fix race condition during interface enslave
  net: phy: meson-gxl: improve link-up behavior
  net: bcmgenet: Don't claim WOL when its not available
  net: arc_emac: Fix use after free in arc_mdio_probe()
  sctp: fix kernel-infoleak for SCTP sockets
  net: phy: correct spelling error of media in documentation
  net: phy: DP83822: clear MISR2 register to disable interrupts
  gianfar: ethtool: Fix refcount leak in gfar_get_ts_info
  selftests: pmtu.sh: Kill nettest processes launched in subshell.
  selftests: pmtu.sh: Kill tcpdump processes launched by subshell.
  NFC: port100: fix use-after-free in port100_send_complete
  net/mlx5e: SHAMPO, reduce TIR indication
  net/mlx5e: Lag, Only handle events from highest priority multipath entry
  net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
  net/mlx5: Fix a race on command flush flow
  net/mlx5: Fix size field in bufferx_reg struct
  ax25: Fix NULL pointer dereference in ax25_kill_by_device
  net: marvell: prestera: Add missing of_node_put() in prestera_switch_set_base_mac_addr
  net: ethernet: lpc_eth: Handle error for clk_enable
  ...
2022-03-10 16:47:58 -08:00
Sebastian Andrzej Siewior
e0ae713023 xdp: xdp_mem_allocator can be NULL in trace_mem_connect().
Since the commit mentioned below __xdp_reg_mem_model() can return a NULL
pointer. This pointer is dereferenced in trace_mem_connect() which leads
to segfault.

The trace points (mem_connect + mem_disconnect) were put in place to
pair connect/disconnect using the IDs. The ID is only assigned if
__xdp_reg_mem_model() does not return NULL. That connect trace point is
of no use if there is no ID.

Skip that connect trace point if xdp_alloc is NULL.

[ Toke Høiland-Jørgensen delivered the reasoning for skipping the trace
  point ]

Fixes: 4a48ef70b9 ("xdp: Allow registering memory model without rxq reference")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/YikmmXsffE+QajTB@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 16:09:29 -08:00
Ivan Vecera
5cb1ebdbc4 ice: Fix race condition during interface enslave
Commit 5dbbbd01cb ("ice: Avoid RTNL lock when re-creating
auxiliary device") changes a process of re-creation of aux device
so ice_plug_aux_dev() is called from ice_service_task() context.
This unfortunately opens a race window that can result in dead-lock
when interface has left LAG and immediately enters LAG again.

Reproducer:
```
#!/bin/sh

ip link add lag0 type bond mode 1 miimon 100
ip link set lag0

for n in {1..10}; do
        echo Cycle: $n
        ip link set ens7f0 master lag0
        sleep 1
        ip link set ens7f0 nomaster
done
```

This results in:
[20976.208697] Workqueue: ice ice_service_task [ice]
[20976.213422] Call Trace:
[20976.215871]  __schedule+0x2d1/0x830
[20976.219364]  schedule+0x35/0xa0
[20976.222510]  schedule_preempt_disabled+0xa/0x10
[20976.227043]  __mutex_lock.isra.7+0x310/0x420
[20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
[20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
[20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
[20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
[20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
[20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
[20976.285691]  auxiliary_bus_probe+0x45/0x70
[20976.289790]  really_probe+0x1f2/0x480
[20976.298509]  driver_probe_device+0x49/0xc0
[20976.302609]  bus_for_each_drv+0x79/0xc0
[20976.306448]  __device_attach+0xdc/0x160
[20976.310286]  bus_probe_device+0x9d/0xb0
[20976.314128]  device_add+0x43c/0x890
[20976.321287]  __auxiliary_device_add+0x43/0x60
[20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
[20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
[20976.342591]  process_one_work+0x1a7/0x360
[20976.350536]  worker_thread+0x30/0x390
[20976.358128]  kthread+0x10a/0x120
[20976.365547]  ret_from_fork+0x1f/0x40
...
[20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627 flags:0x00004084
[20976.446469] Call Trace:
[20976.448921]  __schedule+0x2d1/0x830
[20976.452414]  schedule+0x35/0xa0
[20976.455559]  schedule_preempt_disabled+0xa/0x10
[20976.460090]  __mutex_lock.isra.7+0x310/0x420
[20976.464364]  device_del+0x36/0x3c0
[20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
[20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
[20976.477288]  notifier_call_chain+0x47/0x70
[20976.481386]  __netdev_upper_dev_link+0x18b/0x280
[20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
[20976.494475]  do_setlink+0x336/0xf50
[20976.502517]  __rtnl_newlink+0x529/0x8b0
[20976.543441]  rtnl_newlink+0x43/0x60
[20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
[20976.559238]  netlink_rcv_skb+0x4c/0x120
[20976.563079]  netlink_unicast+0x196/0x230
[20976.567005]  netlink_sendmsg+0x204/0x3d0
[20976.570930]  sock_sendmsg+0x4c/0x50
[20976.574423]  ____sys_sendmsg+0x1eb/0x250
[20976.586807]  ___sys_sendmsg+0x7c/0xc0
[20976.606353]  __sys_sendmsg+0x57/0xa0
[20976.609930]  do_syscall_64+0x5b/0x1a0
[20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca

1. Command 'ip link ... set nomaster' causes that ice_plug_aux_dev()
   is called from ice_service_task() context, aux device is created
   and associated device->lock is taken.
2. Command 'ip link ... set master...' calls ice's notifier under
   RTNL lock and that notifier calls ice_unplug_aux_dev(). That
   function tries to take aux device->lock but this is already taken
   by ice_plug_aux_dev() in step 1
3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already
   taken in step 2
4. Dead-lock

The patch fixes this issue by following changes:
- Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev()
  call in ice_service_task()
- The bit is checked in ice_clear_rdma_cap() and only if it is not set
  then ice_unplug_aux_dev() is called. If it is set (in other words
  plugging of aux device was requested and ice_plug_aux_dev() is
  potentially running) then the function only clears the bit
- Once ice_plug_aux_dev() call (in ice_service_task) is finished
  the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked
  whether it was already cleared by ice_clear_rdma_cap(). If so then
  aux device is unplugged.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Co-developed-by: Petr Oros <poros@redhat.com>
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Dave Ertman <david.m.ertman@intel.com>
Link: https://lore.kernel.org/r/20220310171641.3863659-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 15:07:46 -08:00
Heiner Kallweit
2c87c6f9fb net: phy: meson-gxl: improve link-up behavior
Sometimes the link comes up but no data flows. This patch fixes
this behavior. It's not clear what's the root cause of the issue.

According to the tests one other link-up issue remains.
In very rare cases the link isn't even reported as up.

Fixes: 84c8f773d2 ("net: phy: meson-gxl: remove the use of .ack_callback()")
Tested-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/e3473452-a1f9-efcf-5fdd-02b6f44c3fcd@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 14:57:02 -08:00
Jeremy Linton
00b022f8f8 net: bcmgenet: Don't claim WOL when its not available
Some of the bcmgenet platforms don't correctly support WOL, yet
ethtool returns:

"Supports Wake-on: gsf"

which is false.

Ideally if there isn't a wol_irq, or there is something else that
keeps the device from being able to wakeup it should display:

"Supports Wake-on: d"

This patch checks whether the device can wakup, before using the
hard-coded supported flags. This corrects the ethtool reporting, as
well as the WOL configuration because ethtool verifies that the mode
is supported before attempting it.

Fixes: c51de7f397 ("net: bcmgenet: add Wake-on-LAN support code")
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Tested-by: Peter Robinson <pbrobinson@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220310045535.224450-1-jeremy.linton@arm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 14:54:32 -08:00
Jianglei Nie
bc0e610a6e net: arc_emac: Fix use after free in arc_mdio_probe()
If bus->state is equal to MDIOBUS_ALLOCATED, mdiobus_free(bus) will free
the "bus". But bus->name is still used in the next line, which will lead
to a use after free.

We can fix it by putting the name in a local variable and make the
bus->name point to the rodata section "name",then use the name in the
error message without referring to bus to avoid the uaf.

Fixes: 95b5fc03c1 ("net: arc_emac: Make use of the helper function dev_err_probe()")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Link: https://lore.kernel.org/r/20220309121824.36529-1-niejianglei2021@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 14:49:21 -08:00
Eric Dumazet
633593a808 sctp: fix kernel-infoleak for SCTP sockets
syzbot reported a kernel infoleak [1] of 4 bytes.

After analysis, it turned out r->idiag_expires is not initialized
if inet_sctp_diag_fill() calls inet_diag_msg_common_fill()

Make sure to clear idiag_timer/idiag_retrans/idiag_expires
and let inet_diag_msg_sctpasoc_fill() fill them again if needed.

[1]

BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline]
BUG: KMSAN: kernel-infoleak in copyout lib/iov_iter.c:154 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x6ef/0x25a0 lib/iov_iter.c:668
 instrument_copy_to_user include/linux/instrumented.h:121 [inline]
 copyout lib/iov_iter.c:154 [inline]
 _copy_to_iter+0x6ef/0x25a0 lib/iov_iter.c:668
 copy_to_iter include/linux/uio.h:162 [inline]
 simple_copy_to_iter+0xf3/0x140 net/core/datagram.c:519
 __skb_datagram_iter+0x2d5/0x11b0 net/core/datagram.c:425
 skb_copy_datagram_iter+0xdc/0x270 net/core/datagram.c:533
 skb_copy_datagram_msg include/linux/skbuff.h:3696 [inline]
 netlink_recvmsg+0x669/0x1c80 net/netlink/af_netlink.c:1977
 sock_recvmsg_nosec net/socket.c:948 [inline]
 sock_recvmsg net/socket.c:966 [inline]
 __sys_recvfrom+0x795/0xa10 net/socket.c:2097
 __do_sys_recvfrom net/socket.c:2115 [inline]
 __se_sys_recvfrom net/socket.c:2111 [inline]
 __x64_sys_recvfrom+0x19d/0x210 net/socket.c:2111
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Uninit was created at:
 slab_post_alloc_hook mm/slab.h:737 [inline]
 slab_alloc_node mm/slub.c:3247 [inline]
 __kmalloc_node_track_caller+0xe0c/0x1510 mm/slub.c:4975
 kmalloc_reserve net/core/skbuff.c:354 [inline]
 __alloc_skb+0x545/0xf90 net/core/skbuff.c:426
 alloc_skb include/linux/skbuff.h:1158 [inline]
 netlink_dump+0x3e5/0x16c0 net/netlink/af_netlink.c:2248
 __netlink_dump_start+0xcf8/0xe90 net/netlink/af_netlink.c:2373
 netlink_dump_start include/linux/netlink.h:254 [inline]
 inet_diag_handler_cmd+0x2e7/0x400 net/ipv4/inet_diag.c:1341
 sock_diag_rcv_msg+0x24a/0x620
 netlink_rcv_skb+0x40c/0x7e0 net/netlink/af_netlink.c:2494
 sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:277
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x1093/0x1360 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0x14d9/0x1720 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg net/socket.c:725 [inline]
 sock_write_iter+0x594/0x690 net/socket.c:1061
 do_iter_readv_writev+0xa7f/0xc70
 do_iter_write+0x52c/0x1500 fs/read_write.c:851
 vfs_writev fs/read_write.c:924 [inline]
 do_writev+0x645/0xe00 fs/read_write.c:967
 __do_sys_writev fs/read_write.c:1040 [inline]
 __se_sys_writev fs/read_write.c:1037 [inline]
 __x64_sys_writev+0xe5/0x120 fs/read_write.c:1037
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Bytes 68-71 of 2508 are uninitialized
Memory access of size 2508 starts at ffff888114f9b000
Data copied to user address 00007f7fe09ff2e0

CPU: 1 PID: 3478 Comm: syz-executor306 Not tainted 5.17.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 8f840e47f1 ("sctp: add the sctp_diag.c file")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20220310001145.297371-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 14:46:42 -08:00
Colin Foster
26183cfe47 net: phy: correct spelling error of media in documentation
The header file incorrectly referenced "median-independant interface"
instead of media. Correct this typo.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Fixes: 4069a572d4 ("net: phy: Document core PHY structures")
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220309062544.3073-1-colin.foster@in-advantage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 14:40:59 -08:00
Jakub Kicinski
55c4bf4d93 Merge tag 'mlx5-fixes-2022-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5 fixes 2022-03-09

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2022-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: SHAMPO, reduce TIR indication
  net/mlx5e: Lag, Only handle events from highest priority multipath entry
  net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
  net/mlx5: Fix a race on command flush flow
  net/mlx5: Fix size field in bufferx_reg struct
====================

Link: https://lore.kernel.org/r/20220309201517.589132-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 14:32:32 -08:00
Linus Torvalds
3bcb6451cc Merge tag 'block-5.17-2022-03-10' of git://git.kernel.dk/linux-block
Pull block fix from Jens Axboe:
 "Just a single fix for a regression that occured in this merge window"

* tag 'block-5.17-2022-03-10' of git://git.kernel.dk/linux-block:
  block: fix blk_mq_attempt_bio_merge and rq_qos_throttle protection
2022-03-10 12:56:36 -08:00
Linus Torvalds
c30b5b8cfb Merge tag 'staging-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
 "Here are three small fixes for staging drivers for 5.17-rc8 or -final,
  which ever comes next.

  They resolve some reported problems:

   - rtl8723bs wifi driver deadlock fix for reported problem that is a
     revert of a previous patch. Also a documentation fix is added so
     that the same problem hopefully can not come back again.

   - gdm724x driver use-after-free fix for a reported problem.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'staging-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: rtl8723bs: Improve the comment explaining the locking rules
  staging: rtl8723bs: Fix access-point mode deadlock
  staging: gdm724x: fix use after free in gdm_lte_rx()
2022-03-10 12:43:06 -08:00
Clément Léger
37c9d66c95 net: phy: DP83822: clear MISR2 register to disable interrupts
MISR1 was cleared twice but the original author intention was probably
to clear MISR1 & MISR2 to completely disable interrupts. Fix it to
clear MISR2.

Fixes: 87461f7a58 ("net: phy: DP83822 initial driver submission")
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220309142228.761153-1-clement.leger@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:24:40 -08:00
Miaoqian Lin
2ac5b58e64 gianfar: ethtool: Fix refcount leak in gfar_get_ts_info
The of_find_compatible_node() function returns a node pointer with
refcount incremented, We should use of_node_put() on it when done
Add the missing of_node_put() to release the refcount.

Fixes: 7349a74ea7 ("net: ethernet: gianfar_ethtool: get phc index through drvdata")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20220310015313.14938-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:20:54 -08:00
Linus Torvalds
55b4083b44 Merge tag 'soc-fixes-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "Here is a third set of fixes for the soc tree, well within the
  expected set of changes.

  Maintainer list changes:
   - Krzysztof Kozlowski and Jisheng Zhang both have new email addresses
   - Broadcom iProc has a new git tree

  Regressions:
   - Robert Foss sends a revert for a Mediatek DPI bridge patch that
     caused an inadvertent break in the DT binding
   - mstar timers need to be included in Kconfig

  Devicetree fixes for:
   - Aspeed ast2600 spi pinmux
   - Tegra eDP panels on Nyan FHD
   - Tegra display IOMMU
   - Qualcomm sm8350 UFS clocks
   - minor DT changes for Marvell Armada, Qualcomm sdx65, Qualcomm
     sm8450, and Broadcom BCM2711"

* tag 'soc-fixes-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: dts: marvell: armada-37xx: Remap IO space to bus address 0x0
  MAINTAINERS: Update Jisheng's email address
  Revert "arm64: dts: mt8183: jacuzzi: Fix bus properties in anx's DSI endpoint"
  dt-bindings: drm/bridge: anx7625: Revert DPI support
  ARM: dts: aspeed: Fix AST2600 quad spi group
  MAINTAINERS: update Krzysztof Kozlowski's email
  MAINTAINERS: Update git tree for Broadcom iProc SoCs
  ARM: tegra: Move Nyan FHD panels to AUX bus
  arm64: dts: armada-3720-turris-mox: Add missing ethernet0 alias
  ARM: mstar: Select HAVE_ARM_ARCH_TIMER
  soc: mediatek: mt8192-mmsys: Fix dither to dsi0 path's input sel
  arm64: dts: mt8183: jacuzzi: Fix bus properties in anx's DSI endpoint
  ARM: boot: dts: bcm2711: Fix HVS register range
  arm64: dts: qcom: c630: disable crypto due to serror
  arm64: dts: qcom: sm8450: fix apps_smmu interrupts
  arm64: dts: qcom: sm8450: enable GCC_USB3_0_CLKREF_EN for usb
  arm64: dts: qcom: sm8350: Correct UFS symbol clocks
  arm64: tegra: Disable ISO SMMU for Tegra194
  Revert "dt-bindings: arm: qcom: Document SDX65 platform and boards"
2022-03-10 11:43:01 -08:00
Linus Torvalds
fe673d3f5b mm: gup: make fault_in_safe_writeable() use fixup_user_fault()
Instead of using GUP, make fault_in_safe_writeable() actually force a
'handle_mm_fault()' using the same fixup_user_fault() machinery that
futexes already use.

Using the GUP machinery meant that fault_in_safe_writeable() did not do
everything that a real fault would do, ranging from not auto-expanding
the stack segment, to not updating accessed or dirty flags in the page
tables (GUP sets those flags on the pages themselves).

The latter causes problems on architectures (like s390) that do accessed
bit handling in software, which meant that fault_in_safe_writeable()
didn't actually do all the fault handling it needed to, and trying to
access the user address afterwards would still cause faults.

Reported-and-tested-by: Andreas Gruenbacher <agruenba@redhat.com>
Fixes: cdd591fc86 ("iov_iter: Introduce fault_in_iov_iter_writeable")
Link: https://lore.kernel.org/all/CAHc6FU5nP+nziNGG0JAF1FUx-GV7kKFvM7aZuU_XD2_1v4vnvg@mail.gmail.com/
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-10 10:48:53 -08:00
Jisheng Zhang
c80ee64a80 riscv: alternative only works on !XIP_KERNEL
The alternative mechanism needs runtime code patching, it can't work
on XIP_KERNEL. And the errata workarounds are implemented via the
alternative mechanism. So add !XIP_KERNEL dependency for alternative
and erratas.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: 44c9225729 ("RISC-V: enable XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-10 10:05:19 -08:00
Arnd Bergmann
7e606edaa0 Merge tag 'mvebu-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into arm/fixes
mvebu fixes for 5.17 (part 2)

Allow using old PCIe card on Armada 37xx

* tag 'mvebu-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu:
  arm64: dts: marvell: armada-37xx: Remap IO space to bus address 0x0

Link: https://lore.kernel.org/r/87bkydj4fn.fsf@BL-laptop
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-10 15:25:46 +01:00
Pali Rohár
a1cc1697bb arm64: dts: marvell: armada-37xx: Remap IO space to bus address 0x0
Legacy and old PCI I/O based cards do not support 32-bit I/O addressing.

Since commit 64f160e19e ("PCI: aardvark: Configure PCIe resources from
'ranges' DT property") kernel can set different PCIe address on CPU and
different on the bus for the one A37xx address mapping without any firmware
support in case the bus address does not conflict with other A37xx mapping.

So remap I/O space to the bus address 0x0 to enable support for old legacy
I/O port based cards which have hardcoded I/O ports in low address space.

Note that DDR on A37xx is mapped to bus address 0x0. And mapping of I/O
space can be set to address 0x0 too because MEM space and I/O space are
separate and so do not conflict.

Remapping IO space on Turris Mox to different address is not possible to
due bootloader bug.

Signed-off-by: Pali Rohár <pali@kernel.org>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 76f6386b25 ("arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700")
Cc: stable@vger.kernel.org # 64f160e19e ("PCI: aardvark: Configure PCIe resources from 'ranges' DT property")
Cc: stable@vger.kernel.org # 514ef1e62d ("arm64: dts: marvell: armada-37xx: Extend PCIe MEM space")
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2022-03-10 14:49:10 +01:00
Jason Wang
95932ab2ea vhost: allow batching hint without size
Commit e2ae38cf3d ("vhost: fix hung thread due to erroneous iotlb
entries") tries to reject the IOTLB message whose size is zero. But
the size is not necessarily meaningful, one example is the batching
hint, so the commit breaks that.

Fixing this be reject zero size message only if the message is used to
update/invalidate the IOTLB.

Fixes: e2ae38cf3d ("vhost: fix hung thread due to erroneous iotlb entries")
Reported-by: Eli Cohen <elic@nvidia.com>
Cc: Anirudh Rayabharam <mail@anirudhrb.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220310075211.4801-1-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eli Cohen <elic@nvidia.com>
2022-03-10 08:12:04 -05:00
Linus Torvalds
1db333d9a5 Merge tag 'spi-fix-v5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fix from Mark Brown:
 "One fix for type conversion issues when working out maximum
  scatter/gather segment sizes.

  It caused problems for some systems where the limits overflow
  due to the type conversion"

* tag 'spi-fix-v5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: Fix invalid sgs value
2022-03-10 04:15:09 -08:00
Russell King (Oracle)
b1a384d2cb ARM: fix build warning in proc-v7-bugs.c
The kernel test robot discovered that building without
HARDEN_BRANCH_PREDICTOR issues a warning due to a missing
argument to pr_info().

Add the missing argument.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 9dd78194a3 ("ARM: report Spectre v2 status through sysfs")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-10 04:03:17 -08:00
Linus Torvalds
cef06913a0 Merge tag 'gpio-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix a probe failure for Tegra241 GPIO controller in gpio-tegra186

 - revert changes that caused a regression in the sysfs user-space
   interface

 - correct the debounce time conversion in GPIO ACPI

 - statify a struct in gpio-sim and fix a typo

 - update registers in correct order (hardware quirk) in gpio-ts4900

* tag 'gpio-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: sim: fix a typo
  gpio: ts4900: Do not set DAT and OE together
  gpio: sim: Declare gpio_sim_hog_config_item_ops static
  gpiolib: acpi: Convert ACPI value of debounce to microseconds
  gpio: Revert regression in sysfs-gpio (gpiolib.c)
  gpio: tegra186: Add IRQ per bank for Tegra241
2022-03-10 03:55:33 -08:00
Bartosz Golaszewski
55d01c98a8 gpio: sim: fix a typo
Just noticed this when applying Andy's patch. s/childred/children/

Fixes: cb8c474e79 ("gpio: sim: new testing module")
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2022-03-10 10:02:19 +01:00
Mark Featherston
03fe003547 gpio: ts4900: Do not set DAT and OE together
This works around an issue with the hardware where both OE and
DAT are exposed in the same register. If both are updated
simultaneously, the harware makes no guarantees that OE or DAT
will actually change in any given order and may result in a
glitch of a few ns on a GPIO pin when changing direction and value
in a single write.

Setting direction to input now only affects OE bit. Setting
direction to output updates DAT first, then OE.

Fixes: 9c6686322d ("gpio: add Technologic I2C-FPGA gpio support")
Signed-off-by: Mark Featherston <mark@embeddedTS.com>
Signed-off-by: Kris Bahnsen <kris@embeddedTS.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-03-10 10:00:27 +01:00
Haimin Zhang
9a564bccb7 af_key: add __GFP_ZERO flag for compose_sadb_supported in function pfkey_register
Add __GFP_ZERO flag for compose_sadb_supported in function pfkey_register
to initialize the buffer of supp_skb to fix a kernel-info-leak issue.
1) Function pfkey_register calls compose_sadb_supported to request
a sk_buff. 2) compose_sadb_supported calls alloc_sbk to allocate
a sk_buff, but it doesn't zero it. 3) If auth_len is greater 0, then
compose_sadb_supported treats the memory as a struct sadb_supported and
begins to initialize. But it just initializes the field sadb_supported_len
and field sadb_supported_exttype without field sadb_supported_reserved.

Reported-by: TCS Robot <tcs_robot@tencent.com>
Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-03-10 07:39:47 +01:00
Linus Torvalds
9c674947f6 Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "One more small batch of clk driver fixes:

   - A fix for the Qualcomm GDSC power domain delays that avoids black
     screens at boot on some more recent SoCs that use a different delay
     than the hard-coded delays in the driver.

   - A build fix LAN966X clk driver that let it be built on
     architectures that didn't have IOMEM"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: lan966x: Fix linking error
  clk: qcom: dispcc: Update the transition delay for MDSS GDSC
  clk: qcom: gdsc: Add support to update GDSC transition delay
2022-03-09 20:58:29 -08:00
Linus Torvalds
b5521fe9a9 Merge tag 'xsa396-5.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
 "Several Linux PV device frontends are using the grant table interfaces
  for removing access rights of the backends in ways being subject to
  race conditions, resulting in potential data leaks, data corruption by
  malicious backends, and denial of service triggered by malicious
  backends:

   - blkfront, netfront, scsifront and the gntalloc driver are testing
     whether a grant reference is still in use. If this is not the case,
     they assume that a following removal of the granted access will
     always succeed, which is not true in case the backend has mapped
     the granted page between those two operations.

     As a result the backend can keep access to the memory page of the
     guest no matter how the page will be used after the frontend I/O
     has finished. The xenbus driver has a similar problem, as it
     doesn't check the success of removing the granted access of a
     shared ring buffer.

   - blkfront, netfront, scsifront, usbfront, dmabuf, xenbus, 9p,
     kbdfront, and pvcalls are using a functionality to delay freeing a
     grant reference until it is no longer in use, but the freeing of
     the related data page is not synchronized with dropping the granted
     access.

     As a result the backend can keep access to the memory page even
     after it has been freed and then re-used for a different purpose.

   - netfront will fail a BUG_ON() assertion if it fails to revoke
     access in the rx path.

     This will result in a Denial of Service (DoS) situation of the
     guest which can be triggered by the backend"

* tag 'xsa396-5.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
  xen/gnttab: fix gnttab_end_foreign_access() without page specified
  xen/pvcalls: use alloc/free_pages_exact()
  xen/9p: use alloc/free_pages_exact()
  xen/usb: don't use gnttab_end_foreign_access() in xenhcd_gnttab_done()
  xen: remove gnttab_query_foreign_access()
  xen/gntalloc: don't use gnttab_query_foreign_access()
  xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
  xen/netfront: don't use gnttab_query_foreign_access() for mapped status
  xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
  xen/grant-table: add gnttab_try_end_foreign_access()
  xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
2022-03-09 20:44:17 -08:00
Jakub Kicinski
5f14747605 Merge branch 'selftests-pmtu-sh-fix-cleanup-of-processes-launched-in-subshell'
Guillaume Nault says:

====================
selftests: pmtu.sh: Fix cleanup of processes launched in subshell.

Depending on the options used, pmtu.sh may launch tcpdump and nettest
processes in the background. However it fails to clean them up after
the tests complete.

Patch 1 allows the cleanup() function to read the list of PIDs launched
by the tests.
Patch 2 fixes the way the nettest PIDs are retrieved.
====================

Link: https://lore.kernel.org/r/cover.1646776561.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09 20:23:38 -08:00
Guillaume Nault
94a4a4fe4c selftests: pmtu.sh: Kill nettest processes launched in subshell.
When using "run_cmd <command> &", then "$!" refers to the PID of the
subshell used to run <command>, not the command itself. Therefore
nettest_pids actually doesn't contain the list of the nettest commands
running in the background. So cleanup() can't kill them and the nettest
processes run until completion (fortunately they have a 5s timeout).

Fix this by defining a new command for running processes in the
background, for which "$!" really refers to the PID of the command run.

Also, double quote variables on the modified lines, to avoid shellcheck
warnings.

Fixes: ece1278a9b ("selftests: net: add ESP-in-UDP PMTU test")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09 20:23:32 -08:00
Guillaume Nault
18dfc66755 selftests: pmtu.sh: Kill tcpdump processes launched by subshell.
The cleanup() function takes care of killing processes launched by the
test functions. It relies on variables like ${tcpdump_pids} to get the
relevant PIDs. But tests are run in their own subshell, so updated
*_pids values are invisible to other shells. Therefore cleanup() never
sees any process to kill:

$ ./tools/testing/selftests/net/pmtu.sh -t pmtu_ipv4_exception
TEST: ipv4: PMTU exceptions                                         [ OK ]
TEST: ipv4: PMTU exceptions - nexthop objects                       [ OK ]

$ pgrep -af tcpdump
6084 tcpdump -s 0 -i veth_A-R1 -w pmtu_ipv4_exception_veth_A-R1.pcap
6085 tcpdump -s 0 -i veth_R1-A -w pmtu_ipv4_exception_veth_R1-A.pcap
6086 tcpdump -s 0 -i veth_R1-B -w pmtu_ipv4_exception_veth_R1-B.pcap
6087 tcpdump -s 0 -i veth_B-R1 -w pmtu_ipv4_exception_veth_B-R1.pcap
6088 tcpdump -s 0 -i veth_A-R2 -w pmtu_ipv4_exception_veth_A-R2.pcap
6089 tcpdump -s 0 -i veth_R2-A -w pmtu_ipv4_exception_veth_R2-A.pcap
6090 tcpdump -s 0 -i veth_R2-B -w pmtu_ipv4_exception_veth_R2-B.pcap
6091 tcpdump -s 0 -i veth_B-R2 -w pmtu_ipv4_exception_veth_B-R2.pcap
6228 tcpdump -s 0 -i veth_A-R1 -w pmtu_ipv4_exception_veth_A-R1.pcap
6229 tcpdump -s 0 -i veth_R1-A -w pmtu_ipv4_exception_veth_R1-A.pcap
6230 tcpdump -s 0 -i veth_R1-B -w pmtu_ipv4_exception_veth_R1-B.pcap
6231 tcpdump -s 0 -i veth_B-R1 -w pmtu_ipv4_exception_veth_B-R1.pcap
6232 tcpdump -s 0 -i veth_A-R2 -w pmtu_ipv4_exception_veth_A-R2.pcap
6233 tcpdump -s 0 -i veth_R2-A -w pmtu_ipv4_exception_veth_R2-A.pcap
6234 tcpdump -s 0 -i veth_R2-B -w pmtu_ipv4_exception_veth_R2-B.pcap
6235 tcpdump -s 0 -i veth_B-R2 -w pmtu_ipv4_exception_veth_B-R2.pcap

Fix this by running cleanup() in the context of the test subshell.
Now that each test cleans the environment after completion, there's no
need for calling cleanup() again when the next test starts. So let's
drop it from the setup() function. This is okay because cleanup() is
also called when pmtu.sh starts, so even the first test starts in a
clean environment.

Also, use tcpdump's immediate mode. Otherwise it might not have time to
process buffered packets, resulting in missing packets or even empty
pcap files for short tests.

Note: PAUSE_ON_FAIL is still evaluated before cleanup(), so one can
still inspect the test environment upon failure when using -p.

Fixes: a92a0a7b8e ("selftests: pmtu: Simplify cleanup and namespace names")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09 20:23:15 -08:00
Pavel Skripkin
f80cfe2f26 NFC: port100: fix use-after-free in port100_send_complete
Syzbot reported UAF in port100_send_complete(). The root case is in
missing usb_kill_urb() calls on error handling path of ->probe function.

port100_send_complete() accesses devm allocated memory which will be
freed on probe failure. We should kill this urbs before returning an
error from probe function to prevent reported use-after-free

Fail log:

BUG: KASAN: use-after-free in port100_send_complete+0x16e/0x1a0 drivers/nfc/port100.c:935
Read of size 1 at addr ffff88801bb59540 by task ksoftirqd/2/26
...
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0x8d/0x303 mm/kasan/report.c:255
 __kasan_report mm/kasan/report.c:442 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
 port100_send_complete+0x16e/0x1a0 drivers/nfc/port100.c:935
 __usb_hcd_giveback_urb+0x2b0/0x5c0 drivers/usb/core/hcd.c:1670

...

Allocated by task 1255:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:45 [inline]
 set_alloc_info mm/kasan/common.c:436 [inline]
 ____kasan_kmalloc mm/kasan/common.c:515 [inline]
 ____kasan_kmalloc mm/kasan/common.c:474 [inline]
 __kasan_kmalloc+0xa6/0xd0 mm/kasan/common.c:524
 alloc_dr drivers/base/devres.c:116 [inline]
 devm_kmalloc+0x96/0x1d0 drivers/base/devres.c:823
 devm_kzalloc include/linux/device.h:209 [inline]
 port100_probe+0x8a/0x1320 drivers/nfc/port100.c:1502

Freed by task 1255:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track+0x21/0x30 mm/kasan/common.c:45
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370
 ____kasan_slab_free mm/kasan/common.c:366 [inline]
 ____kasan_slab_free+0xff/0x140 mm/kasan/common.c:328
 kasan_slab_free include/linux/kasan.h:236 [inline]
 __cache_free mm/slab.c:3437 [inline]
 kfree+0xf8/0x2b0 mm/slab.c:3794
 release_nodes+0x112/0x1a0 drivers/base/devres.c:501
 devres_release_all+0x114/0x190 drivers/base/devres.c:530
 really_probe+0x626/0xcc0 drivers/base/dd.c:670

Reported-and-tested-by: syzbot+16bcb127fb73baeecb14@syzkaller.appspotmail.com
Fixes: 0347a6ab30 ("NFC: port100: Commands mechanism implementation")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20220308185007.6987-1-paskripkin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09 19:59:34 -08:00
Linus Torvalds
3bf7edc84a Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 build fix from Catalin Marinas:
 "Fix kernel build with clang LTO after the inclusion of the Spectre BHB
  arm64 mitigations"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Do not include __READ_ONCE() block in assembly files
2022-03-09 14:30:09 -08:00
Nathan Chancellor
36168e387f ARM: Do not use NOCROSSREFS directive with ld.lld
ld.lld does not support the NOCROSSREFS directive at the moment, which
breaks the build after commit b9baf5c8c5 ("ARM: Spectre-BHB
workaround"):

  ld.lld: error: ./arch/arm/kernel/vmlinux.lds:34: AT expected, but got NOCROSSREFS

Support for this directive will eventually be implemented, at which
point a version check can be added. To avoid breaking the build in the
meantime, just define NOCROSSREFS to nothing when using ld.lld, with a
link to the issue for tracking.

Cc: stable@vger.kernel.org
Fixes: b9baf5c8c5 ("ARM: Spectre-BHB workaround")
Link: https://github.com/ClangBuiltLinux/linux/issues/1609
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-09 14:14:16 -08:00
Nathan Chancellor
52c9f93a9c arm64: Do not include __READ_ONCE() block in assembly files
When building arm64 defconfig + CONFIG_LTO_CLANG_{FULL,THIN}=y after
commit 558c303c97 ("arm64: Mitigate spectre style branch history side
channels"), the following error occurs:

  <instantiation>:4:2: error: invalid fixup for movz/movk instruction
   mov w0, #ARM_SMCCC_ARCH_WORKAROUND_3
   ^

Marc figured out that moving "#include <linux/init.h>" in
include/linux/arm-smccc.h into a !__ASSEMBLY__ block resolves it. The
full include chain with CONFIG_LTO=y from include/linux/arm-smccc.h:

include/linux/init.h
include/linux/compiler.h
arch/arm64/include/asm/rwonce.h
arch/arm64/include/asm/alternative-macros.h
arch/arm64/include/asm/assembler.h

The asm/alternative-macros.h include in asm/rwonce.h only happens when
CONFIG_LTO is set, which ultimately casues asm/assembler.h to be
included before the definition of ARM_SMCCC_ARCH_WORKAROUND_3. As a
result, the preprocessor does not expand ARM_SMCCC_ARCH_WORKAROUND_3 in
__mitigate_spectre_bhb_fw, which results in the error above.

Avoid this problem by just avoiding the CONFIG_LTO=y __READ_ONCE() block
in asm/rwonce.h with assembly files, as nothing in that block is useful
to assembly files, which allows ARM_SMCCC_ARCH_WORKAROUND_3 to be
properly expanded with CONFIG_LTO=y builds.

Fixes: e35123d83e ("arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y")
Cc: <stable@vger.kernel.org> # 5.11.x
Link: https://lore.kernel.org/r/20220309155716.3988480-1-maz@kernel.org/
Reported-by: Marc Zyngier <maz@kernel.org>
Acked-by: James Morse <james.morse@arm.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20220309191633.2307110-1-nathan@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-03-09 21:56:50 +00:00
Linus Torvalds
37c333a5de Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:

 - sysfs attributes leak fix for Google Vivaldi driver (Dmitry Torokhov)

 - fix for potential out-of-bounds read in Thrustmaster driver (Pavel
   Skripkin)

 - error handling reference leak in Elo driver (Jiri Kosina)

 - a few new device IDs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: nintendo: check the return value of alloc_workqueue()
  HID: vivaldi: fix sysfs attributes leak
  HID: hid-thrustmaster: fix OOB read in thrustmaster_interrupts
  HID: elo: Revert USB reference counting
  HID: Add support for open wheel and no attachment to T300
  HID: logitech-dj: add new lightspeed receiver id
2022-03-09 13:47:12 -08:00
Linus Torvalds
e7e19defa5 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - Fix compilation of eBPF object files that indirectly include
   mte-kasan.h.

 - Fix test for execute-only permissions with EPAN (Enhanced Privileged
   Access Never, ARMv8.7 feature).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kasan: fix include error in MTE functions
  arm64: Ensure execute-only permissions are not allowed without EPAN
2022-03-09 12:59:21 -08:00
Russell King (Oracle)
33970b031d ARM: fix co-processor register typo
In the recent Spectre BHB patches, there was a typo that is only
exposed in certain configurations: mcr p15,0,XX,c7,r5,4 should have
been mcr p15,0,XX,c7,c5,4

Reported-by: kernel test robot <lkp@intel.com>
Fixes: b9baf5c8c5 ("ARM: Spectre-BHB workaround")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-09 12:23:28 -08:00
Ben Ben-Ishay
99a2b9be07 net/mlx5e: SHAMPO, reduce TIR indication
SHAMPO is an RQ / WQ feature, an indication was added to the TIR in the
first place to enforce suitability between connected TIR and RQ, this
enforcement does not exist in current the Firmware implementation and was
redundant in the first place.

Fixes: 83439f3c37 ("net/mlx5e: Add HW-GRO offload")
Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09 11:39:35 -08:00
Roi Dayan
ad11c4f1d8 net/mlx5e: Lag, Only handle events from highest priority multipath entry
There could be multiple multipath entries but changing the port affinity
for each one doesn't make much sense and there should be a default one.
So only track the entry with lowest priority value.
The commit doesn't affect existing users with a single entry.

Fixes: 544fe7c2e6 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09 11:39:35 -08:00
Dima Chumak
39bab83b11 net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
Only prio 1 is supported for nic mode when there is no ignore flow level
support in firmware. But for switchdev mode, which supports fixed number
of statically pre-allocated prios, this restriction is not relevant so
it can be relaxed.

Fixes: d671e109bd ("net/mlx5: Fix tc max supported prio for nic mode")
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09 11:39:35 -08:00
Moshe Shemesh
063bd35559 net/mlx5: Fix a race on command flush flow
Fix a refcount use after free warning due to a race on command entry.
Such race occurs when one of the commands releases its last refcount and
frees its index and entry while another process running command flush
flow takes refcount to this command entry. The process which handles
commands flush may see this command as needed to be flushed if the other
process released its refcount but didn't release the index yet. Fix it
by adding the needed spin lock.

It fixes the following warning trace:

refcount_t: addition on 0; use-after-free.
WARNING: CPU: 11 PID: 540311 at lib/refcount.c:25 refcount_warn_saturate+0x80/0xe0
...
RIP: 0010:refcount_warn_saturate+0x80/0xe0
...
Call Trace:
 <TASK>
 mlx5_cmd_trigger_completions+0x293/0x340 [mlx5_core]
 mlx5_cmd_flush+0x3a/0xf0 [mlx5_core]
 enter_error_state+0x44/0x80 [mlx5_core]
 mlx5_fw_fatal_reporter_err_work+0x37/0xe0 [mlx5_core]
 process_one_work+0x1be/0x390
 worker_thread+0x4d/0x3d0
 ? rescuer_thread+0x350/0x350
 kthread+0x141/0x160
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x1f/0x30
 </TASK>

Fixes: 50b2412b7e ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09 11:39:34 -08:00
Mohammad Kabat
ac77998b7a net/mlx5: Fix size field in bufferx_reg struct
According to HW spec the field "size" should be 16 bits
in bufferx register.

Fixes: e281682bf2 ("net/mlx5_core: HW data structs/types definitions cleanup")
Signed-off-by: Mohammad Kabat <mohammadkab@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09 11:39:34 -08:00
Jiapeng Chong
78cbc65132 ftrace: Fix some W=1 warnings in kernel doc comments
Clean up the following clang-w1 warning:

kernel/trace/ftrace.c:7827: warning: Function parameter or member 'ops'
not described in 'unregister_ftrace_function'.

kernel/trace/ftrace.c:7805: warning: Function parameter or member 'ops'
not described in 'register_ftrace_function'.

Link: https://lkml.kernel.org/r/20220307004303.26399-1-jiapeng.chong@linux.alibaba.com

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-09 11:52:21 -05:00
Nicolas Saenz Julienne
caf4c86bf1 tracing/osnoise: Force quiescent states while tracing
At the moment running osnoise on a nohz_full CPU or uncontested FIFO
priority and a PREEMPT_RCU kernel might have the side effect of
extending grace periods too much. This will entice RCU to force a
context switch on the wayward CPU to end the grace period, all while
introducing unwarranted noise into the tracer. This behaviour is
unavoidable as overly extending grace periods might exhaust the system's
memory.

This same exact problem is what extended quiescent states (EQS) were
created for, conversely, rcu_momentary_dyntick_idle() emulates them by
performing a zero duration EQS. So let's make use of it.

In the common case rcu_momentary_dyntick_idle() is fairly inexpensive:
atomically incrementing a local per-CPU counter and doing a store. So it
shouldn't affect osnoise's measurements (which has a 1us granularity),
so we'll call it unanimously.

The uncommon case involve calling rcu_momentary_dyntick_idle() after
having the osnoise process:

 - Receive an expedited quiescent state IPI with preemption disabled or
   during an RCU critical section. (activates rdp->cpu_no_qs.b.exp
   code-path).

 - Being preempted within in an RCU critical section and having the
   subsequent outermost rcu_read_unlock() called with interrupts
   disabled. (t->rcu_read_unlock_special.b.blocked code-path).

Neither of those are possible at the moment, and are unlikely to be in
the future given the osnoise's loop design. On top of this, the noise
generated by the situations described above is unavoidable, and if not
exposed by rcu_momentary_dyntick_idle() will be eventually seen in
subsequent rcu_read_unlock() calls or schedule operations.

Link: https://lkml.kernel.org/r/20220307180740.577607-1-nsaenzju@redhat.com

Cc: stable@vger.kernel.org
Fixes: bce29ac9ce ("trace: Add osnoise tracer")
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-09 11:51:42 -05:00
Daniel Bristot de Oliveira
f0cfe17bcc tracing/osnoise: Do not unregister events twice
Nicolas reported that using:

 # trace-cmd record -e all -M 10 -p osnoise --poll

Resulted in the following kernel warning:

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 1217 at kernel/tracepoint.c:404 tracepoint_probe_unregister+0x280/0x370
 [...]
 CPU: 0 PID: 1217 Comm: trace-cmd Not tainted 5.17.0-rc6-next-20220307-nico+ #19
 RIP: 0010:tracepoint_probe_unregister+0x280/0x370
 [...]
 CR2: 00007ff919b29497 CR3: 0000000109da4005 CR4: 0000000000170ef0
 Call Trace:
  <TASK>
  osnoise_workload_stop+0x36/0x90
  tracing_set_tracer+0x108/0x260
  tracing_set_trace_write+0x94/0xd0
  ? __check_object_size.part.0+0x10a/0x150
  ? selinux_file_permission+0x104/0x150
  vfs_write+0xb5/0x290
  ksys_write+0x5f/0xe0
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7ff919a18127
 [...]
 ---[ end trace 0000000000000000 ]---

The warning complains about an attempt to unregister an
unregistered tracepoint.

This happens on trace-cmd because it first stops tracing, and
then switches the tracer to nop. Which is equivalent to:

  # cd /sys/kernel/tracing/
  # echo osnoise > current_tracer
  # echo 0 > tracing_on
  # echo nop > current_tracer

The osnoise tracer stops the workload when no trace instance
is actually collecting data. This can be caused both by
disabling tracing or disabling the tracer itself.

To avoid unregistering events twice, use the existing
trace_osnoise_callback_enabled variable to check if the events
(and the workload) are actually active before trying to
deactivate them.

Link: https://lore.kernel.org/all/c898d1911f7f9303b7e14726e7cc9678fbfb4a0e.camel@redhat.com/
Link: https://lkml.kernel.org/r/938765e17d5a781c2df429a98f0b2e7cc317b022.1646823913.git.bristot@kernel.org

Cc: stable@vger.kernel.org
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Fixes: 2fac8d6486 ("tracing/osnoise: Allow multiple instances of the same tracer")
Reported-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-09 11:15:24 -05:00
Paul Semel
b859ebedd1 arm64: kasan: fix include error in MTE functions
Fix `error: expected string literal in 'asm'`.
This happens when compiling an ebpf object file that includes
`net/net_namespace.h` from linux kernel headers.

Include trace:
     include/net/net_namespace.h:10
     include/linux/workqueue.h:9
     include/linux/timer.h:8
     include/linux/debugobjects.h:6
     include/linux/spinlock.h:90
     include/linux/workqueue.h:9
     arch/arm64/include/asm/spinlock.h:9
     arch/arm64/include/generated/asm/qrwlock.h:1
     include/asm-generic/qrwlock.h:14
     arch/arm64/include/asm/processor.h:33
     arch/arm64/include/asm/kasan.h:9
     arch/arm64/include/asm/mte-kasan.h:45
     arch/arm64/include/asm/mte-def.h:14

Signed-off-by: Paul Semel <paul.semel@datadoghq.com>
Fixes: 2cb3427642 ("arm64: kasan: simplify and inline MTE functions")
Cc: <stable@vger.kernel.org> # 5.12.x
Link: https://lore.kernel.org/r/bacb5387-2992-97e4-0c48-1ed925905bee@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-03-09 15:27:21 +00:00
David S. Miller
cc7e2f596e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2022-03-09

1) Fix IPv6 PMTU discovery for xfrm interfaces.
   From Lina Wang.

2) Revert failing for policies and states that are
   configured with XFRMA_IF_ID 0. It broke a
   user configuration. From Kai Lueke.

3) Fix a possible buffer overflow in the ESP output path.

4) Fix ESP GSO for tunnel and BEET mode on inter address
   family tunnels.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 14:48:11 +00:00
Duoming Zhou
71171ac8eb ax25: Fix NULL pointer dereference in ax25_kill_by_device
When two ax25 devices attempted to establish connection, the requester use ax25_create(),
ax25_bind() and ax25_connect() to initiate connection. The receiver use ax25_rcv() to
accept connection and use ax25_create_cb() in ax25_rcv() to create ax25_cb, but the
ax25_cb->sk is NULL. When the receiver is detaching, a NULL pointer dereference bug
caused by sock_hold(sk) in ax25_kill_by_device() will happen. The corresponding
fail log is shown below:

===============================================================
BUG: KASAN: null-ptr-deref in ax25_device_event+0xfd/0x290
Call Trace:
...
ax25_device_event+0xfd/0x290
raw_notifier_call_chain+0x5e/0x70
dev_close_many+0x174/0x220
unregister_netdevice_many+0x1f7/0xa60
unregister_netdevice_queue+0x12f/0x170
unregister_netdev+0x13/0x20
mkiss_close+0xcd/0x140
tty_ldisc_release+0xc0/0x220
tty_release_struct+0x17/0xa0
tty_release+0x62d/0x670
...

This patch add condition check in ax25_kill_by_device(). If s->sk is
NULL, it will goto if branch to kill device.

Fixes: 4e0f718daf ("ax25: improve the incomplete fix to avoid UAF and NPD bugs")
Reported-by: Thomas Osterried <thomas@osterried.de>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 12:45:02 +00:00
Miaoqian Lin
c9ffa3e2bc net: marvell: prestera: Add missing of_node_put() in prestera_switch_set_base_mac_addr
This node pointer is returned by of_find_compatible_node() with
refcount incremented. Calling of_node_put() to aovid the refcount leak.

Fixes: 501ef3066c ("net: marvell: prestera: Add driver for Prestera family ASIC devices")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 12:16:32 +00:00
Jiasheng Jiang
2169b79258 net: ethernet: lpc_eth: Handle error for clk_enable
As the potential failure of the clk_enable(),
it should be better to check it and return error
if fails.

Fixes: b7370112f5 ("lpc32xx: Added ethernet driver")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 12:15:20 +00:00
Minghao Chi (CGEL ZTE)
2a760554dc net:mcf8390: Use platform_get_irq() to get the interrupt
It is not recommened to use platform_get_resource(pdev, IORESOURCE_IRQ)
for requesting IRQ's resources any more, as they can be not ready yet in
case of DT-booting.

platform_get_irq() instead is a recommended way for getting IRQ even if
it was not retrieved earlier.

It also makes code simpler because we're getting "int" value right away
and no conversion from resource to int is required.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi (CGEL ZTE) <chi.minghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 12:14:30 +00:00
Jiasheng Jiang
6babfc6e6f net: ethernet: ti: cpts: Handle error for clk_enable
As the potential failure of the clk_enable(),
it should be better to check it and return error
if fails.

Fixes: 8a2c9a5ab4 ("net: ethernet: ti: cpts: rework initialization/deinitialization")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 12:12:46 +00:00
Ross Philipson
445c1470b6 x86/boot: Add setup_indirect support in early_memremap_is_setup_data()
The x86 boot documentation describes the setup_indirect structures and
how they are used. Only one of the two functions in ioremap.c that needed
to be modified to be aware of the introduction of setup_indirect
functionality was updated. Adds comparable support to the other function
where it was missing.

Fixes: b3c72fc9a7 ("x86/boot: Introduce setup_indirect")
Signed-off-by: Ross Philipson <ross.philipson@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1645668456-22036-3-git-send-email-ross.philipson@oracle.com
2022-03-09 12:49:46 +01:00
Ross Philipson
7228918b34 x86/boot: Fix memremap of setup_indirect structures
As documented, the setup_indirect structure is nested inside
the setup_data structures in the setup_data list. The code currently
accesses the fields inside the setup_indirect structure but only
the sizeof(struct setup_data) is being memremapped. No crash
occurred but this is just due to how the area is remapped under the
covers.

Properly memremap both the setup_data and setup_indirect structures
in these cases before accessing them.

Fixes: b3c72fc9a7 ("x86/boot: Introduce setup_indirect")
Signed-off-by: Ross Philipson <ross.philipson@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1645668456-22036-2-git-send-email-ross.philipson@oracle.com
2022-03-09 12:49:44 +01:00
David S. Miller
030141b0fc Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-03-08

This series contains updates to iavf, i40e, and ice drivers.

Michal ensures netdev features are properly updated to reflect VLAN
changes received from PF and adds an additional flag for MSI-X
reinitialization as further differentiation of reinitialization
operations is needed for iavf.

Jake stops disabling of VFs due to failed virtchannel responses for
i40e and ice driver.

Dave moves MTU event notification to the service task to prevent issues
with RTNL lock for ice.

Christophe Jaillet corrects an allocation to GFP_ATOMIC instead of
GFP_KERNEL for ice.

Jedrzej fixes the value for link speed comparison which was preventing
the requested value from being set for ice.
---
Note: This will conflict when merging with net-next. Resolution:

diff --cc drivers/net/ethernet/intel/ice/ice.h
index dc42ff92dbad,3121f9b04f59..000000000000
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@@ -484,10 -481,9 +484,11 @@@ enum ice_pf_flags
        ICE_FLAG_LEGACY_RX,
        ICE_FLAG_VF_TRUE_PROMISC_ENA,
        ICE_FLAG_MDD_AUTO_RESET_VF,
 +      ICE_FLAG_VF_VLAN_PRUNING,
        ICE_FLAG_LINK_LENIENT_MODE_ENA,
        ICE_FLAG_PLUG_AUX_DEV,
+       ICE_FLAG_MTU_CHANGED,
 +      ICE_FLAG_GNSS,                  /* GNSS successfully initialized */
        ICE_PF_FLAGS_NBITS              /* must be last */
  };
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 10:42:14 +00:00
Tung Nguyen
c79fcc27be tipc: fix incorrect order of state message data sanity check
When receiving a state message, function tipc_link_validate_msg()
is called to validate its header portion. Then, its data portion
is validated before it can be accessed correctly. However, current
data sanity  check is done after the message header is accessed to
update some link variables.

This commit fixes this issue by moving the data sanity check to
the beginning of state message handling and right after the header
sanity check.

Fixes: 9aa422ad32 ("tipc: improve size validations for received domain records")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/20220308021200.9245-1-tung.q.nguyen@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08 22:18:42 -08:00
Miaoqian Lin
b19ab4b38b ethernet: Fix error handling in xemaclite_of_probe
This node pointer is returned by of_parse_phandle() with refcount
incremented in this function. Calling of_node_put() to avoid the
refcount leak. As the remove function do.

Fixes: 5cdaaa1286 ("net: emaclite: adding MDIO and phy lib support")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220308024751.2320-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08 22:15:01 -08:00
Shin'ichiro Kawasaki
0a5aa8d161 block: fix blk_mq_attempt_bio_merge and rq_qos_throttle protection
Commit 9d497e2941 ("block: don't protect submit_bio_checks by
q_usage_counter") moved blk_mq_attempt_bio_merge and rq_qos_throttle
calls out of q_usage_counter protection. However, these functions require
q_usage_counter protection. The blk_mq_attempt_bio_merge call without
the protection resulted in blktests block/005 failure with KASAN null-
ptr-deref or use-after-free at bio merge. The rq_qos_throttle call
without the protection caused kernel hang at qos throttle.

To fix the failures, move the blk_mq_attempt_bio_merge and
rq_qos_throttle calls back to q_usage_counter protection.

Fixes: 9d497e2941 ("block: don't protect submit_bio_checks by q_usage_counter")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20220308080915.3473689-1-shinichiro.kawasaki@wdc.com
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08 17:48:39 -07:00
Jedrzej Jagielski
ad35ffa252 ice: Fix curr_link_speed advertised speed
Change curr_link_speed advertised speed, due to
link_info.link_speed is not equal phy.curr_user_speed_req.
Without this patch it is impossible to set advertised
speed to same as link_speed.

Testing Hints: Try to set advertised speed
to 25G only with 25G default link (use ethtool -s 0x80000000)

Fixes: 48cb27f2fd ("ice: Implement handlers for ethtool PHY/link operations")
Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-08 13:31:09 -08:00
Christophe JAILLET
3d97f1afd8 ice: Don't use GFP_KERNEL in atomic context
ice_misc_intr() is an irq handler. It should not sleep.

Use GFP_ATOMIC instead of GFP_KERNEL when allocating some memory.

Fixes: 348048e724 ("ice: Implement iidc operations")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-08 13:31:09 -08:00
Dave Ertman
97b0129146 ice: Fix error with handling of bonding MTU
When a bonded interface is destroyed, .ndo_change_mtu can be called
during the tear-down process while the RTNL lock is held.  This is a
problem since the auxiliary driver linked to the LAN driver needs to be
notified of the MTU change, and this requires grabbing a device_lock on
the auxiliary_device's dev.  Currently this is being attempted in the
same execution context as the call to .ndo_change_mtu which is causing a
dead-lock.

Move the notification of the changed MTU to a separate execution context
(watchdog service task) and eliminate the "before" notification.

Fixes: 348048e724 ("ice: Implement iidc operations")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Jonathan Toppins <jtoppins@redhat.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-08 13:31:08 -08:00
Jacob Keller
79498d5af8 ice: stop disabling VFs due to PF error responses
The ice_vc_send_msg_to_vf function has logic to detect "failure"
responses being sent to a VF. If a VF is sent more than
ICE_DFLT_NUM_INVAL_MSGS_ALLOWED then the VF is marked as disabled.
Almost identical logic also existed in the i40e driver.

This logic was added to the ice driver in commit 1071a8358a ("ice:
Implement virtchnl commands for AVF support") which itself copied from
the i40e implementation in commit 5c3c48ac6b ("i40e: implement virtual
device interface").

Neither commit provides a proper explanation or justification of the
check. In fact, later commits to i40e changed the logic to allow
bypassing the check in some specific instances.

The "logic" for this seems to be that error responses somehow indicate a
malicious VF. This is not really true. The PF might be sending an error
for any number of reasons such as lack of resources, etc.

Additionally, this causes the PF to log an info message for every failed
VF response which may confuse users, and can spam the kernel log.

This behavior is not documented as part of any requirement for our
products and other operating system drivers such as the FreeBSD
implementation of our drivers do not include this type of check.

In fact, the change from dev_err to dev_info in i40e commit 18b7af57d9
("i40e: Lower some message levels") explains that these messages
typically don't actually indicate a real issue. It is quite likely that
a user who hits this in practice will be very confused as the VF will be
disabled without an obvious way to recover.

We already have robust malicious driver detection logic using actual
hardware detection mechanisms that detect and prevent invalid device
usage. Remove the logic since its not a documented requirement and the
behavior is not intuitive.

Fixes: 1071a8358a ("ice: Implement virtchnl commands for AVF support")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-08 13:31:08 -08:00
Jacob Keller
5710ab7916 i40e: stop disabling VFs due to PF error responses
The i40e_vc_send_msg_to_vf_ex (and its wrapper i40e_vc_send_msg_to_vf)
function has logic to detect "failure" responses sent to the VF. If a VF
is sent more than I40E_DEFAULT_NUM_INVALID_MSGS_ALLOWED, then the VF is
marked as disabled. In either case, a dev_info message is printed
stating that a VF opcode failed.

This logic originates from the early implementation of VF support in
commit 5c3c48ac6b ("i40e: implement virtual device interface").

That commit did not go far enough. The "logic" for this behavior seems
to be that error responses somehow indicate a malicious VF. This is not
really true. The PF might be sending an error for any number of reasons
such as lacking resources, an unsupported operation, etc. This does not
indicate a malicious VF. We already have a separate robust malicious VF
detection which relies on hardware logic to detect and prevent a variety
of behaviors.

There is no justification for this behavior in the original
implementation. In fact, a later commit 18b7af57d9 ("i40e: Lower some
message levels") reduced the opcode failure message from a dev_err to a
dev_info. In addition, recent commit 01cbf50877 ("i40e: Fix to not
show opcode msg on unsuccessful VF MAC change") changed the logic to
allow quieting it for expected failures.

That commit prevented this logic from kicking in for specific
circumstances. This change did not go far enough. The behavior is not
documented nor is it part of any requirement for our products. Other
operating systems such as the FreeBSD implementation of our driver do
not include this logic.

It is clear this check does not make sense, and causes problems which
led to ugly workarounds.

Fix this by just removing the entire logic and the need for the
i40e_vc_send_msg_to_vf_ex function.

Fixes: 01cbf50877 ("i40e: Fix to not show opcode msg on unsuccessful VF MAC change")
Fixes: 5c3c48ac6b ("i40e: implement virtual device interface")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-08 13:31:08 -08:00
Michal Maloszewski
57d03f5608 iavf: Fix adopting new combined setting
In some cases overloaded flag IAVF_FLAG_REINIT_ITR_NEEDED
which should indicate that interrupts need to be completely
reinitialized during reset leads to RTNL deadlocks using ethtool -C
while a reset is in progress.
To fix, it was added a new flag IAVF_FLAG_REINIT_MSIX_NEEDED
used to trigger MSI-X reinit.
New combined setting is fixed adopt after VF reset.
This has been implemented by call reinit interrupt scheme
during VF reset.
Without this fix new combined setting has never been adopted.

Fixes: 209f2f9c71 ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-08 13:30:57 -08:00
Michal Maloszewski
2cf29e5589 iavf: Fix handling of vlan strip virtual channel messages
Modify netdev->features for vlan stripping based on virtual
channel messages received from the PF. Change is needed
to synchronize vlan strip status between PF sysfs and iavf ethtool.

Fixes: 5951a2b981 ("iavf: Fix VLAN feature flags after VFR")
Signed-off-by: Norbert Ciosek <norbertx.ciosek@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-08 12:59:48 -08:00
Emmanuel Gil Peyrot
330f4c53d3 ARM: fix build error when BPF_SYSCALL is disabled
It was missing a semicolon.

Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 25875aa71d ("ARM: include unprivileged BPF status in Spectre V2 reporting").
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-08 12:53:05 -08:00
Linus Torvalds
4f86a6b46e Merge tag 'devicetree-fixes-for-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:

 - Fix pinctrl node name warnings in examples

 - Add missing 'mux-states' property in ti,tcan104x-can binding

* tag 'devicetree-fixes-for-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: phy: ti,tcan104x-can: Document mux-states property
  dt-bindings: mfd: Fix pinctrl node name warnings
2022-03-08 11:52:45 -08:00
Linus Torvalds
92f90cc9fe Merge tag 'fuse-fixes-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:

 - Fix an issue with splice on the fuse device

 - Fix a regression in the fileattr API conversion

 - Add a small userspace API improvement

* tag 'fuse-fixes-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix pipe buffer lifetime for direct_io
  fuse: move FUSE_SUPER_MAGIC definition to magic.h
  fuse: fix fileattr op failure
2022-03-08 09:41:18 -08:00
Linus Torvalds
cd22a8bfcf Merge tag 'arm64-spectre-bhb-for-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 spectre fixes from James Morse:
 "ARM64 Spectre-BHB mitigations:

   - Make EL1 vectors per-cpu

   - Add mitigation sequences to the EL1 and EL2 vectors on vulnerble
     CPUs

   - Implement ARCH_WORKAROUND_3 for KVM guests

   - Report Vulnerable when unprivileged eBPF is enabled"

* tag 'arm64-spectre-bhb-for-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting
  arm64: Use the clearbhb instruction in mitigations
  KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated
  arm64: Mitigate spectre style branch history side channels
  arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2
  arm64: Add percpu vectors for EL1
  arm64: entry: Add macro for reading symbol addresses from the trampoline
  arm64: entry: Add vectors that have the bhb mitigation sequences
  arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
  arm64: entry: Allow the trampoline text to occupy multiple pages
  arm64: entry: Make the kpti trampoline's kpti sequence optional
  arm64: entry: Move trampoline macros out of ifdef'd section
  arm64: entry: Don't assume tramp_vectors is the start of the vectors
  arm64: entry: Allow tramp_alias to access symbols after the 4K boundary
  arm64: entry: Move the trampoline data page before the text page
  arm64: entry: Free up another register on kpti's tramp_exit path
  arm64: entry: Make the trampoline cleanup optional
  KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A
  arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit
  arm64: entry.S: Add ventry overflow sanity checks
2022-03-08 09:27:25 -08:00
Linus Torvalds
fc55c23a73 Merge tag 'for-linus-bhb' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM spectre fixes from Russell King:
 "ARM Spectre BHB mitigations.

  These patches add Spectre BHB migitations for the following Arm CPUs
  to the 32-bit ARM kernels:
   - Cortex A15
   - Cortex A57
   - Cortex A72
   - Cortex A73
   - Cortex A75
   - Brahma B15
  for CVE-2022-23960"

* tag 'for-linus-bhb' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: include unprivileged BPF status in Spectre V2 reporting
  ARM: Spectre-BHB workaround
  ARM: use LOADADDR() to get load address of sections
  ARM: early traps initialisation
  ARM: report Spectre v2 status through sysfs
2022-03-08 09:08:06 -08:00
Aswath Govindraju
f6eafa4022 dt-bindings: phy: ti,tcan104x-can: Document mux-states property
On some boards, for routing CAN signals from controller to transceivers,
muxes might need to be set. This can be implemented using mux-states
property. Therefore, document the same in the respective bindings.

Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211216041012.16892-2-a-govindraju@ti.com
2022-03-08 10:41:32 -06:00
Rob Herring
7e807f4b08 dt-bindings: mfd: Fix pinctrl node name warnings
The recent addition pinctrl.yaml in commit c09acbc499 ("dt-bindings:
pinctrl: use pinctrl.yaml") resulted in some node name warnings:

Documentation/devicetree/bindings/mfd/cirrus,lochnagar.example.dt.yaml: \
 lochnagar-pinctrl: $nodename:0: 'lochnagar-pinctrl' does not match '^(pinctrl|pinmux)(@[0-9a-f]+)?$'
Documentation/devicetree/bindings/mfd/cirrus,madera.example.dt.yaml: \
 codec@1a: $nodename:0: 'codec@1a' does not match '^(pinctrl|pinmux)(@[0-9a-f]+)?$'
Documentation/devicetree/bindings/mfd/brcm,cru.example.dt.yaml: \
 pin-controller@1c0: $nodename:0: 'pin-controller@1c0' does not match '^(pinctrl|pinmux)(@[0-9a-f]+)?$'

Fix the node names to the preferred 'pinctrl'. For cirrus,madera,
nothing from pinctrl.yaml schema is used, so just drop the reference.

Fixes: c09acbc499 ("dt-bindings: pinctrl: use pinctrl.yaml")
Cc: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20220303232350.2591143-1-robh@kernel.org
2022-03-08 10:41:31 -06:00
Jisheng Zhang
d986afd5a7 MAINTAINERS: Update Jisheng's email address
I'm leaving synaptics. Update my email address to my korg mail
address and add entries to .mailmap as well to map my work
addresses to korg mail address.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Link: https://lore.kernel.org/r/ce7213bd-28ac-6580-466e-875e755fe0ae@synaptics.com'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-08 17:30:32 +01:00
Florian Westphal
ee0a4dc9f3 Revert "netfilter: conntrack: tag conntracks picked up in local out hook"
This was a prerequisite for the ill-fated
"netfilter: nat: force port remap to prevent shadowing well-known ports".

As this has been reverted, this change can be backed out too.

Signed-off-by: Florian Westphal <fw@strlen.de>
2022-03-08 17:28:38 +01:00
Arnd Bergmann
d25ca90833 Merge tag 'arm-soc/for-5.18/maintainers' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request updates the MAINTAINERS file for Broadcom SoCs, please
pull the following for 5.18:

- Kuldeep updates the Broadcom iProc entry to use the same up to date
Linux tree as the other Broadcom SoCs.

* tag 'arm-soc/for-5.18/maintainers' of https://github.com/Broadcom/stblinux:
  MAINTAINERS: Update git tree for Broadcom iProc SoCs

Link: https://lore.kernel.org/r/20220307194817.3754107-4-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-08 17:25:24 +01:00
Russell King (Oracle)
25875aa71d ARM: include unprivileged BPF status in Spectre V2 reporting
The mitigations for Spectre-BHB are only applied when an exception
is taken, but when unprivileged BPF is enabled, userspace can
load BPF programs that can be used to exploit the problem.

When unprivileged BPF is enabled, report the vulnerable status via
the spectre_v2 sysfs file.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-08 14:46:08 +00:00
Robert Foss
d3258737af Revert "arm64: dts: mt8183: jacuzzi: Fix bus properties in anx's DSI endpoint"
This reverts commit 32568ae375.

Signed-off-by: Robert Foss <robert.foss@linaro.org>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-08 15:20:17 +01:00
Robert Foss
979452fbc4 dt-bindings: drm/bridge: anx7625: Revert DPI support
Revert DPI support from binding.

DPI support relies on the bus-type enum which does not yet support
Mipi DPI, since no v4l2_fwnode_bus_type has been defined for this
bus type.

When DPI for anx7625 was initially added, it assumed that
V4L2_FWNODE_BUS_TYPE_PARALLEL was the correct bus type for
representing DPI, which it is not.

In order to prevent adding this mis-usage to the ABI, let's revert
the support.

Signed-off-by: Robert Foss <robert.foss@linaro.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-08 15:20:16 +01:00
Peter Zijlstra
5adf349439 x86/module: Fix the paravirt vs alternative order
Ever since commit

  4e6292114c ("x86/paravirt: Add new features for paravirt patching")

there is an ordering dependency between patching paravirt ops and
patching alternatives, the module loader still violates this.

Fixes: 4e6292114c ("x86/paravirt: Add new features for paravirt patching")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220303112825.068773913@infradead.org
2022-03-08 14:15:25 +01:00
Florian Westphal
a82c25c366 Revert "netfilter: nat: force port remap to prevent shadowing well-known ports"
This reverts commit 878aed8db3.

This change breaks existing setups where conntrack is used with
asymmetric paths.

In these cases, the NAT transformation occurs on the syn-ack instead of
the syn:

1. SYN    x:12345 -> y -> 443 // sent by initiator, receiverd by responder
2. SYNACK y:443 -> x:12345 // First packet seen by conntrack, as sent by responder
3. tuple_force_port_remap() gets called, sees:
  'tcp from 443 to port 12345 NAT' -> pick a new source port, inititor receives
4. SYNACK y:$RANDOM -> x:12345   // connection is never established

While its possible to avoid the breakage with NOTRACK rules, a kernel
update should not break working setups.

An alternative to the revert is to augment conntrack to tag
mid-stream connections plus more code in the nat core to skip NAT
for such connections, however, this leads to more interaction/integration
between conntrack and NAT.

Therefore, revert, users will need to add explicit nat rules to avoid
port shadowing.

Link: https://lore.kernel.org/netfilter-devel/20220302105908.GA5852@breakpoint.cc/#R
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2051413
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-03-08 13:52:11 +01:00
Joel Stanley
2f6edb6bcb ARM: dts: aspeed: Fix AST2600 quad spi group
Requesting quad mode for the FMC resulted in an error:

  &fmc {
         status = "okay";
 +       pinctrl-names = "default";
 +       pinctrl-0 = <&pinctrl_fwqspi_default>'

[    0.742963] aspeed-g6-pinctrl 1e6e2000.syscon:pinctrl: invalid function FWQSPID in map table


This is because the quad mode pins are a group of pins, not a function.

After applying this patch we can request the pins and the QSPI data
lines are muxed:

 # cat /sys/kernel/debug/pinctrl/1e6e2000.syscon\:pinctrl-aspeed-g6-pinctrl/pinmux-pins |grep 1e620000.spi
 pin 196 (AE12): device 1e620000.spi function FWSPID group FWQSPID
 pin 197 (AF12): device 1e620000.spi function FWSPID group FWQSPID
 pin 240 (Y1): device 1e620000.spi function FWSPID group FWQSPID
 pin 241 (Y2): device 1e620000.spi function FWSPID group FWQSPID
 pin 242 (Y3): device 1e620000.spi function FWSPID group FWQSPID
 pin 243 (Y4): device 1e620000.spi function FWSPID group FWQSPID

Fixes: f510f04c8c ("ARM: dts: aspeed: Add AST2600 pinmux nodes")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Link: https://lore.kernel.org/r/20220304011010.974863-1-joel@jms.id.au
Link: https://lore.kernel.org/r/20220304011010.974863-1-joel@jms.id.au'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-08 13:45:48 +01:00
Arnd Bergmann
60392db617 Merge tag 'tegra-for-5.17-arm-dt-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/fixes
ARM: tegra: Device tree fixes for v5.17

One more patch to fix up eDP panels on Nyan FHD models.

* tag 'tegra-for-5.17-arm-dt-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  ARM: tegra: Move Nyan FHD panels to AUX bus
  ARM: tegra: Move panels to AUX bus

Link: https://lore.kernel.org/r/20220308084339.2199400-1-thierry.reding@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-08 13:43:42 +01:00
Biju Das
1a4e53d2fc spi: Fix invalid sgs value
max_seg_size is unsigned int and it can have a value up to 2^32
(for eg:-RZ_DMAC driver sets dma_set_max_seg_size as U32_MAX)
When this value is used in min_t() as an integer type, it becomes
-1 and the value of sgs becomes 0.

Fix this issue by replacing the 'int' data type with 'unsigned int'
in min_t().

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220307184843.9994-1-biju.das.jz@bp.renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-03-08 12:27:33 +00:00
Russell King (Oracle)
e5417cbf7a net: dsa: mt7530: fix incorrect test in mt753x_phylink_validate()
Discussing one of the tests in mt753x_phylink_validate() with Landen
Chao confirms that the "||" should be "&&". Fix this.

Fixes: c288575f78 ("net: dsa: mt7530: Add the support of MT7531 switch")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1nRCF0-00CiXD-7q@rmk-PC.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-08 12:12:25 +01:00
Jernej Skrabec
9470c29faa drm/sun4i: mixer: Fix P010 and P210 format numbers
It turns out that DE3 manual has inverted YUV and YVU format numbers for
P010 and P210. Invert them.

This was tested by playing video decoded to P010 and additionally
confirmed by looking at BSP driver source.

Fixes: 169ca4b389 ("drm/sun4i: Add separate DE3 VI layer formats")
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228181436.1424550-1-jernej.skrabec@gmail.com
2022-03-08 11:54:50 +01:00
Catalin Marinas
6e2edd6371 arm64: Ensure execute-only permissions are not allowed without EPAN
Commit 18107f8a2d ("arm64: Support execute-only permissions with
Enhanced PAN") re-introduced execute-only permissions when EPAN is
available. When EPAN is not available, arch_filter_pgprot() is supposed
to change a PAGE_EXECONLY permission into PAGE_READONLY_EXEC. However,
if BTI or MTE are present, such check does not detect the execute-only
pgprot in the presence of PTE_GP (BTI) or MT_NORMAL_TAGGED (MTE),
allowing the user to request PROT_EXEC with PROT_BTI or PROT_MTE.

Remove the arch_filter_pgprot() function, change the default VM_EXEC
permissions to PAGE_READONLY_EXEC and update the protection_map[] array
at core_initcall() if EPAN is detected.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 18107f8a2d ("arm64: Support execute-only permissions with Enhanced PAN")
Cc: <stable@vger.kernel.org> # 5.13.x
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
2022-03-08 10:03:51 +00:00
Andy Shevchenko
a9a5b720dc gpio: sim: Declare gpio_sim_hog_config_item_ops static
Compiler is not happy:

  warning: symbol 'gpio_sim_hog_config_item_ops' was not declared. Should it be static?

Fixes: cb8c474e79 ("gpio: sim: new testing module")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-03-08 09:41:21 +01:00
Linus Torvalds
4a01e748a5 Merge tag 'x86_bugs_for_v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 spectre fixes from Borislav Petkov:

 - Mitigate Spectre v2-type Branch History Buffer attacks on machines
   which support eIBRS, i.e., the hardware-assisted speculation
   restriction after it has been shown that such machines are vulnerable
   even with the hardware mitigation.

 - Do not use the default LFENCE-based Spectre v2 mitigation on AMD as
   it is insufficient to mitigate such attacks. Instead, switch to
   retpolines on all AMD by default.

 - Update the docs and add some warnings for the obviously vulnerable
   cmdline configurations.

* tag 'x86_bugs_for_v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
  x86/speculation: Warn about Spectre v2 LFENCE mitigation
  x86/speculation: Update link to AMD speculation whitepaper
  x86/speculation: Use generic retpoline by default on AMD
  x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
  Documentation/hw-vuln: Update spectre doc
  x86/speculation: Add eIBRS + Retpoline options
  x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
2022-03-07 17:29:47 -08:00
Krzysztof Kozlowski
5125091d75 MAINTAINERS: update Krzysztof Kozlowski's email
Use Krzysztof Kozlowski's @kernel.org account in maintainer entries.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20220307172805.156760-1-krzysztof.kozlowski@canonical.com'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-07 23:46:03 +01:00
Arnd Bergmann
537c3757b4 Merge tag 'tegra-for-5.17-arm64-dt-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/fixes
arm64: tegra: Device tree fixes for v5.17

This contains a single, last-minute fix to disable the display SMMU by
default because under some circumstances leaving it enabled by default
can cause SMMU faults on boot.

* tag 'tegra-for-5.17-arm64-dt-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  arm64: tegra: Disable ISO SMMU for Tegra194

Link: https://lore.kernel.org/r/20220307182120.2169598-1-thierry.reding@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-07 23:23:58 +01:00
Linus Walleij
e941dc13fd Input: zinitix - do not report shadow fingers
I observed the following problem with the BT404 touch pad
running the Phosh UI:

When e.g. typing on the virtual keyboard pressing "g" would
produce "ggg".

After some analysis it turns out the firmware reports that three
fingers hit that coordinate at the same time, finger 0, 2 and
4 (of the five available 0,1,2,3,4).

DOWN
  Zinitix-TS 3-0020: finger 0 down (246, 395)
  Zinitix-TS 3-0020: finger 1 up (0, 0)
  Zinitix-TS 3-0020: finger 2 down (246, 395)
  Zinitix-TS 3-0020: finger 3 up (0, 0)
  Zinitix-TS 3-0020: finger 4 down (246, 395)
UP
  Zinitix-TS 3-0020: finger 0 up (246, 395)
  Zinitix-TS 3-0020: finger 2 up (246, 395)
  Zinitix-TS 3-0020: finger 4 up (246, 395)

This is one touch and release: i.e. this is all reported on
touch (down) and release.

There is a field in the struct touch_event called finger_cnt
which is actually a bitmask of the fingers active in the
event.

Rename this field finger_mask as this matches the use contents
better, then use for_each_set_bit() to iterate over just the
fingers that are actally active.

Factor out a finger reporting function zinitix_report_fingers()
to handle all fingers.

Also be more careful in reporting finger down/up: we were
reporting every event with input_mt_report_slot_state(..., true);
but this should only be reported on finger down or move,
not on finger up, so also add code to check p->sub_status
to see what is happening and report correctly.

After this my Zinitix BT404 touchscreen report fingers
flawlessly.

The vendor drive I have notably does not use the "finger_cnt"
and contains obviously incorrect code like this:

  if (touch_dev->touch_info.finger_cnt > MAX_SUPPORTED_FINGER_NUM)
      touch_dev->touch_info.finger_cnt = MAX_SUPPORTED_FINGER_NUM;

As MAX_SUPPORTED_FINGER_NUM is an ordinal and the field is
a bitmask this seems quite confused.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220228233017.2270599-1-linus.walleij@linaro.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-03-07 14:23:35 -08:00
Kuldeep Singh
1860d30466 MAINTAINERS: Update git tree for Broadcom iProc SoCs
Current git tree for Broadcom iProc SoCs is pretty outdated as it has
not updated for a long time. Fix the reference.

Signed-off-by: Kuldeep Singh <singh.kuldeep87k@gmail.com>
2022-03-07 11:46:32 -08:00
Linus Torvalds
ea4424be16 Merge tag 'mtd/fixes-for-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fix from Miquel Raynal:
 "As part of a previous changeset introducing support for the K3
  architecture, the OMAP_GPMC (a non visible symbol) got selected by the
  selection of MTD_NAND_OMAP2 instead of doing so from the architecture
  directly (like for the other users of these two drivers). Indeed, from
  a hardware perspective, the OMAP NAND controller needs the GPMC to
  work.

  This led to a robot error which got addressed in fix merge into -rc4.
  Unfortunately, the approach at this time still used "select" and lead
  to further build error reports (sparc64:allmodconfig).

  This time we switch to 'depends on' in order to prevent random
  misconfigurations. The different dependencies will however need a
  future cleanup"

* tag 'mtd/fixes-for-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: omap2: Actually prevent invalid configuration and build error
2022-03-07 11:43:22 -08:00
Linus Torvalds
06be302970 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
 "Some last minute fixes that took a while to get ready. Not
  regressions, but they look safe and seem to be worth to have"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  tools/virtio: handle fallout from folio work
  tools/virtio: fix virtio_test execution
  vhost: remove avail_event arg from vhost_update_avail_event()
  virtio: drop default for virtio-mem
  vdpa: fix use-after-free on vp_vdpa_remove
  virtio-blk: Remove BUG_ON() in virtio_queue_rq()
  virtio-blk: Don't use MAX_DISCARD_SEGMENTS if max_discard_seg is zero
  vhost: fix hung thread due to erroneous iotlb entries
  vduse: Fix returning wrong type in vduse_domain_alloc_iova()
  vdpa/mlx5: add validation for VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command
  vdpa/mlx5: should verify CTRL_VQ feature exists for MQ
  vdpa: factor out vdpa_set_features_unlocked for vdpa internal use
  virtio_console: break out of buf poll on remove
  virtio: document virtio_reset_device
  virtio: acknowledge all features before access
  virtio: unexport virtio_finalize_features
2022-03-07 11:32:17 -08:00
Halil Pasic
aa6f8dcbab swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph
(the swiotlb maintainer), he asked me to create an incremental fix
(after I have pointed this out the mix up, and asked him for guidance).
So here we go.

The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implantation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
  must take precedence over DMA_ATTR_SKIP_CPU_SYNC

Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order do avoid synchronising back stale
data from the swiotlb buffer.

Let me note, that if the size used with dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.

To get this bullet proof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address, and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: ddbd89deb7 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-07 11:26:02 -08:00
Thierry Reding
7401b49c50 ARM: tegra: Move Nyan FHD panels to AUX bus
Similarly to what was earlier done for other Nyan variants, move the eDP
panel on the FHD models to the AUX bus as well.

Suggested-by: Dmitry Osipenko <digetx@gmail.com>
Fixes: ef6fb9875c ("ARM: tegra: Add device-tree for 1080p version of Nyan Big")
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-03-07 19:02:39 +01:00
James Morse
58c9a5060c arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting
The mitigations for Spectre-BHB are only applied when an exception is
taken from user-space. The mitigation status is reported via the spectre_v2
sysfs vulnerabilities file.

When unprivileged eBPF is enabled the mitigation in the exception vectors
can be avoided by an eBPF program.

When unprivileged eBPF is enabled, print a warning and report vulnerable
via the sysfs vulnerabilities file.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-03-07 17:25:52 +00:00
Roger Quadros
42da5a4ba1 mtd: rawnand: omap2: Actually prevent invalid configuration and build error
The root of the problem is that we are selecting symbols that have
dependencies. This can cause random configurations that can fail.
The cleanest solution is to avoid using select.

This driver uses interfaces from the OMAP_GPMC driver so we have to
depend on it instead.

Fixes: 4cd335dae3 ("mtd: rawnand: omap2: Prevent invalid configuration and build error")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/linux-mtd/20220219193600.24892-1-rogerq@kernel.org
2022-03-07 17:46:54 +01:00
Miklos Szeredi
0c4bcfdecb fuse: fix pipe buffer lifetime for direct_io
In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls
fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then
imports the write buffer with fuse_get_user_pages(), which uses
iov_iter_get_pages() to grab references to userspace pages instead of
actually copying memory.

On the filesystem device side, these pages can then either be read to
userspace (via fuse_dev_read()), or splice()d over into a pipe using
fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops.

This is wrong because after fuse_dev_do_read() unlocks the FUSE request,
the userspace filesystem can mark the request as completed, causing write()
to return. At that point, the userspace filesystem should no longer have
access to the pipe buffer.

Fix by copying pages coming from the user address space to new pipe
buffers.

Reported-by: Jann Horn <jannh@google.com>
Fixes: c3021629a0 ("fuse: support splice() reading from fuse device")
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-03-07 16:30:44 +01:00
Jouni Högander
804f468853 drm/i915/psr: Set "SF Partial Frame Enable" also on full update
Currently we are observing occasional screen flickering when
PSR2 selective fetch is enabled. More specifically glitch seems
to happen on full frame update when cursor moves to coords
x = -1 or y = -1.

According to Bspec SF Single full frame should not be set if
SF Partial Frame Enable is not set. This happened to be true for
ADLP as PSR2_MAN_TRK_CTL_ENABLE is always set and for ADL_P it's
actually "SF Partial Frame Enable" (Bit 31).

Setting "SF Partial Frame Enable" bit also on full update seems to
fix screen flickering.

Also make code more clear by setting PSR2_MAN_TRK_CTL_ENABLE
only if not on ADL_P. Bit 31 has different meaning in ADL_P.

Bspec: 49274

v2: Fix Mihai Harpau email address
v3: Modify commit message and remove unnecessary comment

Tested-by: Lyude Paul <lyude@redhat.com>
Fixes: 7f6002e580 ("drm/i915/display: Enable PSR2 selective fetch by default")
Reported-by: Lyude Paul <lyude@redhat.com>
Cc: Mihai Harpau <mharpau@gmail.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://gitlab.freedesktop.org/drm/intel/-/issues/5077
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225070228.855138-1-jouni.hogander@intel.com
(cherry picked from commit 8d5516d18b)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-03-07 14:45:31 +00:00
Andy Shevchenko
660c619b9d gpiolib: acpi: Convert ACPI value of debounce to microseconds
It appears that GPIO ACPI library uses ACPI debounce values directly.
However, the GPIO library APIs expect the debounce timeout to be in
microseconds.

Convert ACPI value of debounce to microseconds.

While at it, document this detail where it is appropriate.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215664
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Fixes: 8dcb7a15a5 ("gpiolib: acpi: Take into account debounce settings")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-03-07 15:25:27 +01:00
Marcelo Roberto Jimenez
fc328a7d1f gpio: Revert regression in sysfs-gpio (gpiolib.c)
Some GPIO lines have stopped working after the patch
commit 2ab73c6d83 ("gpio: Support GPIO controllers without pin-ranges")

And this has supposedly been fixed in the following patches
commit 89ad556b7f ("gpio: Avoid using pin ranges with !PINCTRL")
commit 6dbbf84603 ("gpiolib: Don't free if pin ranges are not defined")

But an erratic behavior where some GPIO lines work while others do not work
has been introduced.

This patch reverts those changes so that the sysfs-gpio interface works
properly again.

Signed-off-by: Marcelo Roberto Jimenez <marcelo.jimenez@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-03-07 15:25:27 +01:00
Akhil R
5f84e73f9a gpio: tegra186: Add IRQ per bank for Tegra241
Add the number of interrupts per bank for Tegra241 (Grace) to
fix the probe failure.

Fixes: d1056b771d ("gpio: tegra186: Add support for Tegra241")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-03-07 15:25:27 +01:00
Fabio Estevam
c70c453abc smsc95xx: Ignore -ENODEV errors when device is unplugged
According to Documentation/driver-api/usb/URB.rst when a device
is unplugged usb_submit_urb() returns -ENODEV.

This error code propagates all the way up to usbnet_read_cmd() and
usbnet_write_cmd() calls inside the smsc95xx.c driver during
Ethernet cable unplug, unbind or reboot.

This causes the following errors to be shown on reboot, for example:

ci_hdrc ci_hdrc.1: remove, state 1
usb usb2: USB disconnect, device number 1
usb 2-1: USB disconnect, device number 2
usb 2-1.1: USB disconnect, device number 3
smsc95xx 2-1.1:1.0 eth1: unregister 'smsc95xx' usb-ci_hdrc.1-1.1, smsc95xx USB 2.0 Ethernet
smsc95xx 2-1.1:1.0 eth1: Failed to read reg index 0x00000114: -19
smsc95xx 2-1.1:1.0 eth1: Error reading MII_ACCESS
smsc95xx 2-1.1:1.0 eth1: __smsc95xx_mdio_read: MII is busy
smsc95xx 2-1.1:1.0 eth1: Failed to read reg index 0x00000114: -19
smsc95xx 2-1.1:1.0 eth1: Error reading MII_ACCESS
smsc95xx 2-1.1:1.0 eth1: __smsc95xx_mdio_read: MII is busy
smsc95xx 2-1.1:1.0 eth1: hardware isn't capable of remote wakeup
usb 2-1.4: USB disconnect, device number 4
ci_hdrc ci_hdrc.1: USB bus 2 deregistered
ci_hdrc ci_hdrc.0: remove, state 4
usb usb1: USB disconnect, device number 1
ci_hdrc ci_hdrc.0: USB bus 1 deregistered
imx2-wdt 30280000.watchdog: Device shutdown: Expect reboot!
reboot: Restarting system

Ignore the -ENODEV errors inside __smsc95xx_mdio_read() and
__smsc95xx_phy_wait_not_busy() and do not print error messages
when -ENODEV is returned.

Fixes: a049a30fc2 ("net: usb: Correct PHY handling of smsc95xx")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-07 12:32:31 +00:00
Tom Rix
d9dc0c84ad qed: return status of qed_iov_get_link
Clang static analysis reports this issue
qed_sriov.c:4727:19: warning: Assigned value is
  garbage or undefined
  ivi->max_tx_rate = tx_rate ? tx_rate : link.speed;
                   ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

link is only sometimes set by the call to qed_iov_get_link()
qed_iov_get_link fails without setting link or returning
status.  So change the decl to return status.

Fixes: 73390ac9d8 ("qed*: support ndo_get_vf_config")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-07 12:22:28 +00:00
Steffen Klassert
23c7f8d798 net: Fix esp GSO on inter address family tunnels.
The esp tunnel GSO handlers use skb_mac_gso_segment to
push the inner packet to the segmentation handlers.
However, skb_mac_gso_segment takes the Ethernet Protocol
ID from 'skb->protocol' which is wrong for inter address
family tunnels. We fix this by introducing a new
skb_eth_gso_segment function.

This function can be used if it is necessary to pass the
Ethernet Protocol ID directly to the segmentation handler.
First users of this function will be the esp4 and esp6
tunnel segmentation handlers.

Fixes: c35fe4106b ("xfrm: Add mode handlers for IPsec on layer 2")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-03-07 13:14:04 +01:00
Steffen Klassert
053c8fdf2c esp: Fix BEET mode inter address family tunneling on GSO
The xfrm{4,6}_beet_gso_segment() functions did not correctly set the
SKB_GSO_IPXIP4 and SKB_GSO_IPXIP6 gso types for the address family
tunneling case. Fix this by setting these gso types.

Fixes: 384a46ea7b ("esp4: add gso_segment for esp4 beet mode")
Fixes: 7f9e40eb18 ("esp6: add gso_segment for esp6 beet mode")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-03-07 13:14:03 +01:00
Steffen Klassert
ebe48d368e esp: Fix possible buffer overflow in ESP transformation
The maximum message size that can be send is bigger than
the  maximum site that skb_page_frag_refill can allocate.
So it is possible to write beyond the allocated buffer.

Fix this by doing a fallback to COW in that case.

v2:

Avoid get get_order() costs as suggested by Linus Torvalds.

Fixes: cac2661c53 ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30f6a ("esp6: Avoid skb_cow_data whenever possible")
Reported-by: valis <sec@valis.email>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-03-07 13:14:03 +01:00
Zheyu Ma
bb77bd31c2 ethernet: sun: Free the coherent when failing in probing
When the driver fails to register net device, it should free the DMA
region first, and then do other cleanup.

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-07 11:32:22 +00:00
Aleksander Jan Bajkowski
dd830aed23 net: lantiq_xrx200: fix use after free bug
The skb->len field is read after the packet is sent to the network
stack. In the meantime, skb can be freed. This patch fixes this bug.

Fixes: c3e6b2c35b ("net: lantiq_xrx200: add ingress SG DMA support")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-07 11:29:35 +00:00
Jia-Ju Bai
e0058f0fa8 net: qlogic: check the return value of dma_alloc_coherent() in qed_vf_hw_prepare()
The function dma_alloc_coherent() in qed_vf_hw_prepare() can fail, so
its return value should be checked.

Fixes: 1408cc1fa4 ("qed: Introduce VFs")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-07 11:28:38 +00:00
Jia-Ju Bai
d0aeb0d4a3 isdn: hfcpci: check the return value of dma_set_mask() in setup_hw()
The function dma_set_mask() in setup_hw() can fail, so its return value
should be checked.

Fixes: 1700fe1a10 ("Add mISDN HFC PCI driver")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-07 11:27:12 +00:00
Ulf Hansson
1760fdb6fe mmc: core: Restore (almost) the busy polling for MMC_SEND_OP_COND
Commit 76bfc7ccc2 ("mmc: core: adjust polling interval for CMD1"),
significantly decreased the polling period from ~10-12ms into just a couple
of us. The purpose was to decrease the total time spent in the busy polling
loop, but unfortunate it has lead to problems, that causes eMMC cards to
never gets out busy and thus fails to be initialized.

To fix the problem, but also to try to keep some of the new improved
behaviour, let's start by using a polling period of 1-2ms, which then
increases for each loop, according to common polling loop in
__mmc_poll_for_busy().

Reported-by: Jean Rene Dawin <jdawin@math.uni-bielefeld.de>
Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
Cc: Huijin Park <huijin.park@samsung.com>
Fixes: 76bfc7ccc2 ("mmc: core: adjust polling interval for CMD1")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Jean Rene Dawin <jdawin@math.uni-bielefeld.de>
Tested-by: H. Nikolaus Schaller <hns@goldelico.com>
Link: https://lore.kernel.org/r/20220304105656.149281-1-ulf.hansson@linaro.org
2022-03-07 11:47:39 +01:00
Juergen Gross
66e3531b33 xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
When calling gnttab_end_foreign_access_ref() the returned value must
be tested and the reaction to that value should be appropriate.

In case of failure in xennet_get_responses() the reaction should not be
to crash the system, but to disable the network device.

The calls in setup_netfront() can be replaced by calls of
gnttab_end_foreign_access(). While at it avoid double free of ring
pages and grant references via xennet_disconnect_backend() in this case.

This is CVE-2022-23042 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V2:
- avoid double free
V3:
- remove pointless initializer (Jan Beulich)
2022-03-07 09:48:55 +01:00
Juergen Gross
42baefac63 xen/gnttab: fix gnttab_end_foreign_access() without page specified
gnttab_end_foreign_access() is used to free a grant reference and
optionally to free the associated page. In case the grant is still in
use by the other side processing is being deferred. This leads to a
problem in case no page to be freed is specified by the caller: the
caller doesn't know that the page is still mapped by the other side
and thus should not be used for other purposes.

The correct way to handle this situation is to take an additional
reference to the granted page in case handling is being deferred and
to drop that reference when the grant reference could be freed
finally.

This requires that there are no users of gnttab_end_foreign_access()
left directly repurposing the granted page after the call, as this
might result in clobbered data or information leaks via the not yet
freed grant reference.

This is part of CVE-2022-23041 / XSA-396.

Reported-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V4:
- expand comment in header
V5:
- get page ref in case of kmalloc() failure, too
2022-03-07 09:48:55 +01:00
Juergen Gross
b0576cc9c6 xen/pvcalls: use alloc/free_pages_exact()
Instead of __get_free_pages() and free_pages() use alloc_pages_exact()
and free_pages_exact(). This is in preparation of a change of
gnttab_end_foreign_access() which will prohibit use of high-order
pages.

This is part of CVE-2022-23041 / XSA-396.

Reported-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V4:
- new patch
2022-03-07 09:48:55 +01:00
Juergen Gross
5cadd4bb1d xen/9p: use alloc/free_pages_exact()
Instead of __get_free_pages() and free_pages() use alloc_pages_exact()
and free_pages_exact(). This is in preparation of a change of
gnttab_end_foreign_access() which will prohibit use of high-order
pages.

By using the local variable "order" instead of ring->intf->ring_order
in the error path of xen_9pfs_front_alloc_dataring() another bug is
fixed, as the error path can be entered before ring->intf->ring_order
is being set.

By using alloc_pages_exact() the size in bytes is specified for the
allocation, which fixes another bug for the case of
order < (PAGE_SHIFT - XEN_PAGE_SHIFT).

This is part of CVE-2022-23041 / XSA-396.

Reported-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V4:
- new patch
2022-03-07 09:48:55 +01:00
Juergen Gross
cd7bcfab4e xen/usb: don't use gnttab_end_foreign_access() in xenhcd_gnttab_done()
The usage of gnttab_end_foreign_access() in xenhcd_gnttab_done() is
not safe against a malicious backend, as the backend could keep the
I/O page mapped and modify it even after the granted memory page is
being used for completely other purposes in the local system.

So replace that use case with gnttab_try_end_foreign_access() and
disable the PV host adapter in case the backend didn't stop using the
granted page.

In xenhcd_urb_request_done() immediately return in case of setting
the device state to "error" instead of looking into further backend
responses.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V2:
- use gnttab_try_end_foreign_access()
2022-03-07 09:48:55 +01:00
Juergen Gross
1dbd11ca75 xen: remove gnttab_query_foreign_access()
Remove gnttab_query_foreign_access(), as it is unused and unsafe to
use.

All previous use cases assumed a grant would not be in use after
gnttab_query_foreign_access() returned 0. This information is useless
in best case, as it only refers to a situation in the past, which could
have changed already.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2022-03-07 09:48:54 +01:00
Juergen Gross
d3b6372c58 xen/gntalloc: don't use gnttab_query_foreign_access()
Using gnttab_query_foreign_access() is unsafe, as it is racy by design.

The use case in the gntalloc driver is not needed at all. While at it
replace the call of gnttab_end_foreign_access_ref() with a call of
gnttab_end_foreign_access(), which is what is really wanted there. In
case the grant wasn't used due to an allocation failure, just free the
grant via gnttab_free_grant_reference().

This is CVE-2022-23039 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V3:
- fix __del_gref() (Jan Beulich)
2022-03-07 09:48:54 +01:00
Juergen Gross
33172ab50a xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
It isn't enough to check whether a grant is still being in use by
calling gnttab_query_foreign_access(), as a mapping could be realized
by the other side just after having called that function.

In case the call was done in preparation of revoking a grant it is
better to do so via gnttab_try_end_foreign_access() and check the
success of that operation instead.

This is CVE-2022-23038 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V2:
- use gnttab_try_end_foreign_access()
2022-03-07 09:48:54 +01:00
Juergen Gross
31185df7e2 xen/netfront: don't use gnttab_query_foreign_access() for mapped status
It isn't enough to check whether a grant is still being in use by
calling gnttab_query_foreign_access(), as a mapping could be realized
by the other side just after having called that function.

In case the call was done in preparation of revoking a grant it is
better to do so via gnttab_end_foreign_access_ref() and check the
success of that operation instead.

This is CVE-2022-23037 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V2:
- use gnttab_try_end_foreign_access()
V3:
- don't use gnttab_try_end_foreign_access()
2022-03-07 09:48:54 +01:00
Juergen Gross
abf1fd5919 xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
It isn't enough to check whether a grant is still being in use by
calling gnttab_query_foreign_access(), as a mapping could be realized
by the other side just after having called that function.

In case the call was done in preparation of revoking a grant it is
better to do so via gnttab_end_foreign_access_ref() and check the
success of that operation instead.

For the ring allocation use alloc_pages_exact() in order to avoid
high order pages in case of a multi-page ring.

If a grant wasn't unmapped by the backend without persistent grants
being used, set the device state to "error".

This is CVE-2022-23036 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
V2:
- use gnttab_try_end_foreign_access()
V4:
- use alloc_pages_exact() and free_pages_exact()
- set state to error if backend didn't unmap (Roger Pau Monné)
2022-03-07 09:48:54 +01:00
Juergen Gross
6b1775f26a xen/grant-table: add gnttab_try_end_foreign_access()
Add a new grant table function gnttab_try_end_foreign_access(), which
will remove and free a grant if it is not in use.

Its main use case is to either free a grant if it is no longer in use,
or to take some other action if it is still in use. This other action
can be an error exit, or (e.g. in the case of blkfront persistent grant
feature) some special handling.

This is CVE-2022-23036, CVE-2022-23038 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V2:
- new patch
V4:
- add comments to header (Jan Beulich)
2022-03-07 09:48:54 +01:00
Juergen Gross
3777ea7bac xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
Letting xenbus_grant_ring() tear down grants in the error case is
problematic, as the other side could already have used these grants.
Calling gnttab_end_foreign_access_ref() without checking success is
resulting in an unclear situation for any caller of xenbus_grant_ring()
as in the error case the memory pages of the ring page might be
partially mapped. Freeing them would risk unwanted foreign access to
them, while not freeing them would leak memory.

In order to remove the need to undo any gnttab_grant_foreign_access()
calls, use gnttab_alloc_grant_references() to make sure no further
error can occur in the loop granting access to the ring pages.

It should be noted that this way of handling removes leaking of
grant entries in the error case, too.

This is CVE-2022-23040 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2022-03-07 09:48:54 +01:00
Michael Ellerman
48015b632f powerpc: Fix STACKTRACE=n build
Our skiroot_defconfig doesn't enable FTRACE, and so doesn't get
STACKTRACE enabled either. That leads to a build failure since commit
1614b2b11f ("arch: Make ARCH_STACKWALK independent of STACKTRACE")
made stacktrace.c build even when STACKTRACE=n.

  arch/powerpc/kernel/stacktrace.c: In function ‘handle_backtrace_ipi’:
  arch/powerpc/kernel/stacktrace.c:171:2: error: implicit declaration of function ‘nmi_cpu_backtrace’
    171 |  nmi_cpu_backtrace(regs);
        |  ^~~~~~~~~~~~~~~~~
  arch/powerpc/kernel/stacktrace.c: In function ‘arch_trigger_cpumask_backtrace’:
  arch/powerpc/kernel/stacktrace.c:226:2: error: implicit declaration of function ‘nmi_trigger_cpumask_backtrace’
    226 |  nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace_ipi);
        |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This happens because our headers haven't defined
arch_trigger_cpumask_backtrace, which causes lib/nmi_backtrace.c not to
build nmi_cpu_backtrace().

The code in question doesn't actually depend on STACKTRACE=y, that was
just added because arch_trigger_cpumask_backtrace() lived in
stacktrace.c for convenience. So drop the dependency on
CONFIG_STACKTRACE, that causes lib/nmi_backtrace.c to build
nmi_cpu_backtrace() etc. and fixes the build.

Fixes: 1614b2b11f ("arch: Make ARCH_STACKWALK independent of STACKTRACE")
[mpe: Cherry pick of 5a72345e6a from next into fixes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220212111349.2806972-1-mpe@ellerman.id.au
2022-03-07 10:26:20 +11:00
Linus Torvalds
ffb217a13a Linux 5.17-rc7 2022-03-06 14:28:31 -08:00
Linus Torvalds
3ee65c0f07 Merge tag 'for-5.17-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "A few more fixes for various problems that have user visible effects
  or seem to be urgent:

   - fix corruption when combining DIO and non-blocking io_uring over
     multiple extents (seen on MariaDB)

   - fix relocation crash due to premature return from commit

   - fix quota deadlock between rescan and qgroup removal

   - fix item data bounds checks in tree-checker (found on a fuzzed
     image)

   - fix fsync of prealloc extents after EOF

   - add missing run of delayed items after unlink during log replay

   - don't start relocation until snapshot drop is finished

   - fix reversed condition for subpage writers locking

   - fix warning on page error"

* tag 'for-5.17-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fallback to blocking mode when doing async dio over multiple extents
  btrfs: add missing run of delayed items after unlink during log replay
  btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
  btrfs: fix relocation crash due to premature return from btrfs_commit_transaction()
  btrfs: do not start relocation until in progress drops are done
  btrfs: tree-checker: use u64 for item data end to avoid overflow
  btrfs: do not WARN_ON() if we have PageError set
  btrfs: fix lost prealloc extents beyond eof after full fsync
  btrfs: subpage: fix a wrong check on subpage->writers
2022-03-06 12:19:36 -08:00
Linus Torvalds
f81664f760 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "x86 guest:

   - Tweaks to the paravirtualization code, to avoid using them when
     they're pointless or harmful

  x86 host:

   - Fix for SRCU lockdep splat

   - Brown paper bag fix for the propagation of errno"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: pull kvm->srcu read-side to kvm_arch_vcpu_ioctl_run
  KVM: x86/mmu: Passing up the error state of mmu_alloc_shadow_roots()
  KVM: x86: Yield to IPI target vCPU only if it is busy
  x86/kvmclock: Fix Hyper-V Isolated VM's boot issue when vCPUs > 64
  x86/kvm: Don't waste memory if kvmclock is disabled
  x86/kvm: Don't use PV TLB/yield when mwait is advertised
2022-03-06 12:08:42 -08:00
Linus Torvalds
9bdeaca18b Merge tag 'powerpc-5.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
 "Fix build failure when CONFIG_PPC_64S_HASH_MMU is not set.

  Thanks to Murilo Opsfelder Araujo, and Erhard F"

* tag 'powerpc-5.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Fix build failure when CONFIG_PPC_64S_HASH_MMU is not set
2022-03-06 11:57:42 -08:00
Linus Torvalds
f40a33f5ea Merge tag 'trace-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Fix sorting on old "cpu" value in histograms

 - Fix return value of __setup() boot parameter handlers

* tag 'trace-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix return value of __setup handlers
  tracing/histogram: Fix sorting on old "cpu" value
2022-03-06 11:47:59 -08:00
Michael S. Tsirkin
3dd7d135e7 tools/virtio: handle fallout from folio work
just add a stub

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 06:06:50 -05:00
Stefano Garzarella
32f1b53fe8 tools/virtio: fix virtio_test execution
virtio_test hangs on __vring_new_virtqueue() because `vqs_list_lock`
is not initialized.

Let's initialize it in vdev_info_init().

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220118150631.167015-1-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-06 06:06:50 -05:00
Stefano Garzarella
4c8093637b vhost: remove avail_event arg from vhost_update_avail_event()
In vhost_update_avail_event() we never used the `avail_event` argument,
since its introduction in commit 2723feaa8e ("vhost: set log when
updating used flags or avail event").

Let's remove it to clean up the code.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220113141134.186773-1-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 06:06:50 -05:00
Michael S. Tsirkin
e7c552ec89 virtio: drop default for virtio-mem
There's no special reason why virtio-mem needs a default that's
different from what kconfig provides, any more than e.g. virtio blk.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
2022-03-06 06:06:50 -05:00
Zhang Min
eb057b44db vdpa: fix use-after-free on vp_vdpa_remove
When vp_vdpa driver is unbind, vp_vdpa is freed in vdpa_unregister_device
and then vp_vdpa->mdev.pci_dev is dereferenced in vp_modern_remove,
triggering use-after-free.

Call Trace of unbinding driver free vp_vdpa :
do_syscall_64
  vfs_write
    kernfs_fop_write_iter
      device_release_driver_internal
        pci_device_remove
          vp_vdpa_remove
            vdpa_unregister_device
              kobject_release
                device_release
                  kfree

Call Trace of dereference vp_vdpa->mdev.pci_dev:
vp_modern_remove
  pci_release_selected_regions
    pci_release_region
      pci_resource_len
        pci_resource_end
          (dev)->resource[(bar)].end

Signed-off-by: Zhang Min <zhang.min9@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Link: https://lore.kernel.org/r/20220301091059.46869-1-wang.yi59@zte.com.cn
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: 64b9f64f80 ("vdpa: introduce virtio pci driver")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-03-06 06:06:50 -05:00
Xie Yongji
e030759a1d virtio-blk: Remove BUG_ON() in virtio_queue_rq()
Currently we have a BUG_ON() to make sure the number of sg
list does not exceed queue_max_segments() in virtio_queue_rq().
However, the block layer uses queue_max_discard_segments()
instead of queue_max_segments() to limit the sg list for
discard requests. So the BUG_ON() might be triggered if
virtio-blk device reports a larger value for max discard
segment than queue_max_segments(). To fix it, let's simply
remove the BUG_ON() which has become unnecessary after commit
02746e26c39e("virtio-blk: avoid preallocating big SGL for data").
And the unused vblk->sg_elems can also be removed together.

Fixes: 1f23816b8e ("virtio_blk: add discard and write zeroes support")
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Link: https://lore.kernel.org/r/20220304100058.116-2-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 06:06:50 -05:00
Xie Yongji
dacc73ed0b virtio-blk: Don't use MAX_DISCARD_SEGMENTS if max_discard_seg is zero
Currently the value of max_discard_segment will be set to
MAX_DISCARD_SEGMENTS (256) with no basis in hardware if device
set 0 to max_discard_seg in configuration space. It's incorrect
since the device might not be able to handle such large descriptors.
To fix it, let's follow max_segments restrictions in this case.

Fixes: 1f23816b8e ("virtio_blk: add discard and write zeroes support")
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20220304100058.116-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 06:06:50 -05:00
Anirudh Rayabharam
e2ae38cf3d vhost: fix hung thread due to erroneous iotlb entries
In vhost_iotlb_add_range_ctx(), range size can overflow to 0 when
start is 0 and last is ULONG_MAX. One instance where it can happen
is when userspace sends an IOTLB message with iova=size=uaddr=0
(vhost_process_iotlb_msg). So, an entry with size = 0, start = 0,
last = ULONG_MAX ends up in the iotlb. Next time a packet is sent,
iotlb_access_ok() loops indefinitely due to that erroneous entry.

	Call Trace:
	 <TASK>
	 iotlb_access_ok+0x21b/0x3e0 drivers/vhost/vhost.c:1340
	 vq_meta_prefetch+0xbc/0x280 drivers/vhost/vhost.c:1366
	 vhost_transport_do_send_pkt+0xe0/0xfd0 drivers/vhost/vsock.c:104
	 vhost_worker+0x23d/0x3d0 drivers/vhost/vhost.c:372
	 kthread+0x2e9/0x3a0 kernel/kthread.c:377
	 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
	 </TASK>

Reported by syzbot at:
	https://syzkaller.appspot.com/bug?extid=0abd373e2e50d704db87

To fix this, do two things:

1. Return -EINVAL in vhost_chr_write_iter() when userspace asks to map
   a range with size 0.
2. Fix vhost_iotlb_add_range_ctx() to handle the range [0, ULONG_MAX]
   by splitting it into two entries.

Fixes: 0bbe30668d ("vhost: factor out IOTLB")
Reported-by: syzbot+0abd373e2e50d704db87@syzkaller.appspotmail.com
Tested-by: syzbot+0abd373e2e50d704db87@syzkaller.appspotmail.com
Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
Link: https://lore.kernel.org/r/20220305095525.5145-1-mail@anirudhrb.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 06:05:45 -05:00
Vladimir Oltean
afb3cc1a39 net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails
After the blamed commit, dsa_tree_setup_master() may exit without
calling rtnl_unlock(), fix that.

Fixes: c146f9bc19 ("net: dsa: hold rtnl_mutex when calling dsa_master_{setup,teardown}")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-06 10:55:54 +00:00
Kai Lueke
a3d9001b4e Revert "xfrm: state and policy should fail if XFRMA_IF_ID 0"
This reverts commit 68ac0f3810 because ID
0 was meant to be used for configuring the policy/state without
matching for a specific interface (e.g., Cilium is affected, see
https://github.com/cilium/cilium/pull/18789 and
https://github.com/cilium/cilium/pull/19019).

Signed-off-by: Kai Lueke <kailueke@linux.microsoft.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-03-06 08:38:28 +01:00
Linus Torvalds
dcde98da99 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:

 - a fixup for Goodix touchscreen driver allowing it to work on certain
   Cherry Trail devices

 - a fix for imbalanced enable/disable regulator in Elam touchpad driver
   that became apparent when used with Asus TF103C 2-in-1 dock

 - a couple new input keycodes used on newer keyboards

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  HID: add mapping for KEY_ALL_APPLICATIONS
  HID: add mapping for KEY_DICTATE
  Input: elan_i2c - fix regulator enable count imbalance after suspend/resume
  Input: elan_i2c - move regulator_[en|dis]able() out of elan_[en|dis]able_power()
  Input: goodix - workaround Cherry Trail devices with a bogus ACPI Interrupt() resource
  Input: goodix - use the new soc_intel_is_byt() helper
  Input: samsung-keypad - properly state IOMEM dependency
2022-03-05 15:49:45 -08:00
Linus Torvalds
0014404f9c Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "8 patches.

  Subsystems affected by this patch series: mm (hugetlb, pagemap, and
  userfaultfd), memfd, selftests, and kconfig"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  configs/debug: set CONFIG_DEBUG_INFO=y properly
  proc: fix documentation and description of pagemap
  kselftest/vm: fix tests build with old libc
  memfd: fix F_SEAL_WRITE after shmem huge page allocated
  mm: fix use-after-free when anon vma name is used after vma is freed
  mm: prevent vm_area_struct::anon_name refcount saturation
  mm: refactor vm_area_struct::anon_vma_name usage code
  selftests/vm: cleanup hugetlb file after mremap test
2022-03-05 12:03:14 -08:00
Linus Torvalds
f9026e19a4 Merge tag 's390-5.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:

 - Fix HAVE_DYNAMIC_FTRACE_WITH_ARGS implementation by providing correct
   switching between ftrace_caller/ftrace_regs_caller and supplying
   pt_regs only when ftrace_regs_caller is activated.

 - Fix exception table sorting.

 - Fix breakage of kdump tooling by preserving metadata it cannot
   function without.

* tag 's390-5.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/extable: fix exception table sorting
  s390/ftrace: fix arch_ftrace_get_regs implementation
  s390/ftrace: fix ftrace_caller/ftrace_regs_caller generation
  s390/setup: preserve memory at OLDMEM_BASE and OLDMEM_SIZE
2022-03-05 11:25:26 -08:00
Qian Cai
d1eff16d72 configs/debug: set CONFIG_DEBUG_INFO=y properly
CONFIG_DEBUG_INFO can't be set by user directly, so set
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y instead.

Otherwise, we end up with no debuginfo in vmlinux which is a big no-no
for kernel debugging.

Link: https://lkml.kernel.org/r/20220301202920.18488-1-quic_qiancai@quicinc.com
Signed-off-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-05 11:08:33 -08:00
Yun Zhou
dd21bfa425 proc: fix documentation and description of pagemap
Since bit 57 was exported for uffd-wp write-protected (commit
fb8e37f35a: "mm/pagemap: export uffd-wp protection information"),
fixing it can reduce some unnecessary confusion.

Link: https://lkml.kernel.org/r/20220301044538.3042713-1-yun.zhou@windriver.com
Fixes: fb8e37f35a ("mm/pagemap: export uffd-wp protection information")
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Tiberiu A Georgescu <tiberiu.georgescu@nutanix.com>
Cc: Florian Schmidt <florian.schmidt@nutanix.com>
Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Colin Cross <ccross@google.com>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-05 11:08:33 -08:00
Chengming Zhou
b773827e36 kselftest/vm: fix tests build with old libc
The error message when I build vm tests on debian10 (GLIBC 2.28):

    userfaultfd.c: In function `userfaultfd_pagemap_test':
    userfaultfd.c:1393:37: error: `MADV_PAGEOUT' undeclared (first use
    in this function); did you mean `MADV_RANDOM'?
      if (madvise(area_dst, test_pgsize, MADV_PAGEOUT))
                                         ^~~~~~~~~~~~
                                         MADV_RANDOM

This patch includes these newer definitions from UAPI linux/mman.h, is
useful to fix tests build on systems without these definitions in glibc
sys/mman.h.

Link: https://lkml.kernel.org/r/20220227055330.43087-2-zhouchengming@bytedance.com
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-05 11:08:32 -08:00
Hugh Dickins
f2b277c4d1 memfd: fix F_SEAL_WRITE after shmem huge page allocated
Wangyong reports: after enabling tmpfs filesystem to support transparent
hugepage with the following command:

  echo always > /sys/kernel/mm/transparent_hugepage/shmem_enabled

the docker program tries to add F_SEAL_WRITE through the following
command, but it fails unexpectedly with errno EBUSY:

  fcntl(5, F_ADD_SEALS, F_SEAL_WRITE) = -1.

That is because memfd_tag_pins() and memfd_wait_for_pins() were never
updated for shmem huge pages: checking page_mapcount() against
page_count() is hopeless on THP subpages - they need to check
total_mapcount() against page_count() on THP heads only.

Make memfd_tag_pins() (compared > 1) as strict as memfd_wait_for_pins()
(compared != 1): either can be justified, but given the non-atomic
total_mapcount() calculation, it is better now to be strict.  Bear in
mind that total_mapcount() itself scans all of the THP subpages, when
choosing to take an XA_CHECK_SCHED latency break.

Also fix the unlikely xa_is_value() case in memfd_wait_for_pins(): if a
page has been swapped out since memfd_tag_pins(), then its refcount must
have fallen, and so it can safely be untagged.

Link: https://lkml.kernel.org/r/a4f79248-df75-2c8c-3df-ba3317ccb5da@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reported-by: wangyong <wang.yong12@zte.com.cn>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: CGEL ZTE <cgel.zte@gmail.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-05 11:08:32 -08:00
Suren Baghdasaryan
942341dcc5 mm: fix use-after-free when anon vma name is used after vma is freed
When adjacent vmas are being merged it can result in the vma that was
originally passed to madvise_update_vma being destroyed.  In the current
implementation, the name parameter passed to madvise_update_vma points
directly to vma->anon_name and it is used after the call to vma_merge.
In the cases when vma_merge merges the original vma and destroys it,
this might result in UAF.  For that the original vma would have to hold
the anon_vma_name with the last reference.  The following vma would need
to contain a different anon_vma_name object with the same string.  Such
scenario is shown below:

madvise_vma_behavior(vma)
  madvise_update_vma(vma, ..., anon_name == vma->anon_name)
    vma_merge(vma)
      __vma_adjust(vma) <-- merges vma with adjacent one
        vm_area_free(vma) <-- frees the original vma
    replace_vma_anon_name(anon_name) <-- UAF of vma->anon_name

Fix this by raising the name refcount and stabilizing it.

Link: https://lkml.kernel.org/r/20220224231834.1481408-3-surenb@google.com
Link: https://lkml.kernel.org/r/20220223153613.835563-3-surenb@google.com
Fixes: 9a10064f56 ("mm: add a field to store names for private anonymous memory")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: syzbot+aa7b3d4b35f9dc46a366@syzkaller.appspotmail.com
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Chris Hyser <chris.hyser@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Colin Cross <ccross@google.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiaofeng Cao <caoxiaofeng@yulong.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-05 11:08:32 -08:00
Suren Baghdasaryan
96403e1128 mm: prevent vm_area_struct::anon_name refcount saturation
A deep process chain with many vmas could grow really high.  With
default sysctl_max_map_count (64k) and default pid_max (32k) the max
number of vmas in the system is 2147450880 and the refcounter has
headroom of 1073774592 before it reaches REFCOUNT_SATURATED
(3221225472).

Therefore it's unlikely that an anonymous name refcounter will overflow
with these defaults.  Currently the max for pid_max is PID_MAX_LIMIT
(4194304) and for sysctl_max_map_count it's INT_MAX (2147483647).  In
this configuration anon_vma_name refcount overflow becomes theoretically
possible (that still require heavy sharing of that anon_vma_name between
processes).

kref refcounting interface used in anon_vma_name structure will detect a
counter overflow when it reaches REFCOUNT_SATURATED value but will only
generate a warning and freeze the ref counter.  This would lead to the
refcounted object never being freed.  A determined attacker could leak
memory like that but it would be rather expensive and inefficient way to
do so.

To ensure anon_vma_name refcount does not overflow, stop anon_vma_name
sharing when the refcount reaches REFCOUNT_MAX (2147483647), which still
leaves INT_MAX/2 (1073741823) values before the counter reaches
REFCOUNT_SATURATED.  This should provide enough headroom for raising the
refcounts temporarily.

Link: https://lkml.kernel.org/r/20220223153613.835563-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Chris Hyser <chris.hyser@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Colin Cross <ccross@google.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiaofeng Cao <caoxiaofeng@yulong.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-05 11:08:32 -08:00
Suren Baghdasaryan
5c26f6ac94 mm: refactor vm_area_struct::anon_vma_name usage code
Avoid mixing strings and their anon_vma_name referenced pointers by
using struct anon_vma_name whenever possible.  This simplifies the code
and allows easier sharing of anon_vma_name structures when they
represent the same name.

[surenb@google.com: fix comment]

Link: https://lkml.kernel.org/r/20220223153613.835563-1-surenb@google.com
Link: https://lkml.kernel.org/r/20220224231834.1481408-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Colin Cross <ccross@google.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Chris Hyser <chris.hyser@oracle.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Xiaofeng Cao <caoxiaofeng@yulong.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-05 11:08:32 -08:00
Mike Kravetz
ff712a627f selftests/vm: cleanup hugetlb file after mremap test
The hugepage-mremap test will create a file in a hugetlb filesystem.  In
a default 'run_vmtests' run, the file will contain all the hugetlb
pages.  After the test, the file remains and there are no free hugetlb
pages for subsequent tests.  This causes those hugetlb tests to fail.

Change hugepage-mremap to take the name of the hugetlb file as an
argument.  Unlink the file within the test, and just to be sure remove
the file in the run_vmtests script.

Link: https://lkml.kernel.org/r/20220201033459.156944-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-05 11:08:32 -08:00
Alexey Khoroshilov
c6a502c229 mISDN: Fix memory leak in dsp_pipeline_build()
dsp_pipeline_build() allocates dup pointer by kstrdup(cfg),
but then it updates dup variable by strsep(&dup, "|").
As a result when it calls kfree(dup), the dup variable contains NULL.

Found by Linux Driver Verification project (linuxtesting.org) with SVACE.

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Fixes: 960366cf8d ("Add mISDN DSP")
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-05 12:04:14 +00:00
Russell King (Oracle)
b9baf5c8c5 ARM: Spectre-BHB workaround
Workaround the Spectre BHB issues for Cortex-A15, Cortex-A57,
Cortex-A72, Cortex-A73 and Cortex-A75. We also include Brahma B15 as
well to be safe, which is affected by Spectre V2 in the same ways as
Cortex-A15.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-05 10:42:07 +00:00
Russell King (Oracle)
8d9d651ff2 ARM: use LOADADDR() to get load address of sections
Use the linker's LOADADDR() macro to get the load address of the
sections, and provide a macro to set the start and end symbols.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-05 10:41:58 +00:00
Russell King (Oracle)
04e91b7324 ARM: early traps initialisation
Provide a couple of helpers to copy the vectors and stubs, and also
to flush the copied vectors and stubs.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-05 10:41:42 +00:00
Russell King (Oracle)
9dd78194a3 ARM: report Spectre v2 status through sysfs
As per other architectures, add support for reporting the Spectre
vulnerability status via sysfs CPU.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-05 10:41:22 +00:00
Murilo Opsfelder Araujo
58dbe9b373 powerpc/64s: Fix build failure when CONFIG_PPC_64S_HASH_MMU is not set
The following build failure occurs when CONFIG_PPC_64S_HASH_MMU is not
set:

    arch/powerpc/kernel/setup_64.c: In function ‘setup_per_cpu_areas’:
    arch/powerpc/kernel/setup_64.c:811:21: error: ‘mmu_linear_psize’ undeclared (first use in this function); did you mean ‘mmu_virtual_psize’?
      811 |                 if (mmu_linear_psize == MMU_PAGE_4K)
          |                     ^~~~~~~~~~~~~~~~
          |                     mmu_virtual_psize
    arch/powerpc/kernel/setup_64.c:811:21: note: each undeclared identifier is reported only once for each function it appears in

Move the declaration of mmu_linear_psize outside of
CONFIG_PPC_64S_HASH_MMU ifdef.

After the above is fixed, it fails later with the following error:

    ld: arch/powerpc/kexec/file_load_64.o: in function `.arch_kexec_kernel_image_probe':
    file_load_64.c:(.text+0x1c1c): undefined reference to `.add_htab_mem_range'

Fix that, too, by conditioning add_htab_mem_range() symbol to
CONFIG_PPC_64S_HASH_MMU.

Fixes: 387e220a2e ("powerpc/64s: Move hash MMU support code under CONFIG_PPC_64S_HASH_MMU")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215567
Link: https://lore.kernel.org/r/20220301204743.45133-1-muriloo@linux.ibm.com
2022-03-05 20:42:21 +11:00
Josh Poimboeuf
0de05d056a x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
The commit

   44a3918c82 ("x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting")

added a warning for the "eIBRS + unprivileged eBPF" combination, which
has been shown to be vulnerable against Spectre v2 BHB-based attacks.

However, there's no warning about the "eIBRS + LFENCE retpoline +
unprivileged eBPF" combo. The LFENCE adds more protection by shortening
the speculation window after a mispredicted branch. That makes an attack
significantly more difficult, even with unprivileged eBPF. So at least
for now the logic doesn't warn about that combination.

But if you then add SMT into the mix, the SMT attack angle weakens the
effectiveness of the LFENCE considerably.

So extend the "eIBRS + unprivileged eBPF" warning to also include the
"eIBRS + LFENCE + unprivileged eBPF + SMT" case.

  [ bp: Massage commit message. ]

Suggested-by: Alyssa Milburn <alyssa.milburn@linux.intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-03-05 09:30:47 +01:00
Josh Poimboeuf
eafd987d4a x86/speculation: Warn about Spectre v2 LFENCE mitigation
With:

  f8a66d608a ("x86,bugs: Unconditionally allow spectre_v2=retpoline,amd")

it became possible to enable the LFENCE "retpoline" on Intel. However,
Intel doesn't recommend it, as it has some weaknesses compared to
retpoline.

Now AMD doesn't recommend it either.

It can still be left available as a cmdline option. It's faster than
retpoline but is weaker in certain scenarios -- particularly SMT, but
even non-SMT may be vulnerable in some cases.

So just unconditionally warn if the user requests it on the cmdline.

  [ bp: Massage commit message. ]

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-03-05 09:16:24 +01:00
Heiner Kallweit
a502a8f040 net: phy: meson-gxl: fix interrupt handling in forced mode
This PHY doesn't support a link-up interrupt source. If aneg is enabled
we use the "aneg complete" interrupt for this purpose, but if aneg is
disabled link-up isn't signaled currently.
According to a vendor driver there's an additional "energy detect"
interrupt source that can be used to signal link-up if aneg is disabled.
We can safely ignore this interrupt source if aneg is enabled.

This patch was tested on a TX3 Mini TV box with S905W (even though
boot message says it's a S905D).

This issue has been existing longer, but due to changes in phylib and
the driver the patch applies only from the commit marked as fixed.

Fixes: 84c8f773d2 ("net: phy: meson-gxl: remove the use of .ack_callback()")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/04cac530-ea1b-850e-6cfa-144a55c4d75d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-04 21:03:45 -08:00
Linus Torvalds
ac84e82f78 Merge tag 'block-5.17-2022-03-04' of git://git.kernel.dk/linux-block
Pull block fix from Jens Axboe:
 "Just a small UAF fix for blktrace"

* tag 'block-5.17-2022-03-04' of git://git.kernel.dk/linux-block:
  blktrace: fix use after free for struct blk_trace
2022-03-04 16:03:46 -08:00
Linus Torvalds
07ebd38a0d Merge tag 'riscv-for-linus-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - Fixes for a handful of KASAN-related crashes.

 - A fix to avoid a crash during boot for SPARSEMEM &&
   !SPARSEMEM_VMEMMAP configurations.

 - A fix to stop reporting some incorrect errors under DEBUG_VIRTUAL.

 - A fix for the K210's device tree to properly populate the interrupt
   map, so hart1 will get interrupts again.

* tag 'riscv-for-linus-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: dts: k210: fix broken IRQs on hart1
  riscv: Fix kasan pud population
  riscv: Move high_memory initialization to setup_bootmem
  riscv: Fix config KASAN && DEBUG_VIRTUAL
  riscv: Fix DEBUG_VIRTUAL false warnings
  riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
  riscv: Fix is_linear_mapping with recent move of KASAN region
2022-03-04 11:54:06 -08:00
Linus Torvalds
3f509f5971 Merge tag 'iommu-fixes-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:

 - Fix a double list_add() in Intel VT-d code

 - Add missing put_device() in Tegra SMMU driver

 - Two AMD IOMMU fixes:
     - Memory leak in IO page-table freeing code
     - Add missing recovery from event-log overflow

* tag 'iommu-fixes-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/tegra-smmu: Fix missing put_device() call in tegra_smmu_find
  iommu/vt-d: Fix double list_add when enabling VMD in scalable mode
  iommu/amd: Fix I/O page table memory leak
  iommu/amd: Recover from event log overflow
2022-03-04 11:30:57 -08:00
Linus Torvalds
a4ffdb6103 Merge tag 'thermal-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
 "Fix NULL pointer dereference in the thermal netlink interface (Nicolas
  Cavallari)"

* tag 'thermal-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Fix TZ_GET_TRIP NULL pointer dereference
2022-03-04 11:19:14 -08:00
Linus Torvalds
8d670948f4 Merge tag 'sound-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Hopefully the last PR for 5.17, including just a few small changes:
  an additional fix for ASoC ops boundary check and other minor
  device-specific fixes"

* tag 'sound-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: intel_hdmi: Fix reference to PCM buffer address
  ASoC: cs4265: Fix the duplicated control name
  ASoC: ops: Shift tested values in snd_soc_put_volsw() by +min
2022-03-04 11:15:00 -08:00
Linus Torvalds
c4fc118ae2 Merge tag 'drm-fixes-2022-03-04' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Things are quieting down as expected, just a small set of fixes, i915,
  exynos, amdgpu, vrr, bridge and hdlcd. Nothing scary at all.

  i915:
   - Fix GuC SLPC unset command
   - Fix misidentification of some Apple MacBook Pro laptops as Jasper Lake

  amdgpu:
   - Suspend regression fix

  exynos:
   - irq handling fixes
   - Fix two regressions to TE-gpio handling

  arm/hdlcd:
   - Select DRM_GEM_CMEA_HELPER for HDLCD

  bridge:
   - ti-sn65dsi86: Properly undo autosuspend

  vrr:
   - Fix potential NULL-pointer deref"

* tag 'drm-fixes-2022-03-04' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: fix suspend/resume hang regression
  drm/vrr: Set VRR capable prop only if it is attached to connector
  drm/arm: arm hdlcd select DRM_GEM_CMA_HELPER
  drm/bridge: ti-sn65dsi86: Properly undo autosuspend
  drm/i915: s/JSP2/ICP2/ PCH
  drm/i915/guc/slpc: Correct the param count for unset param
  drm/exynos: Search for TE-gpio in DSI panel's node
  drm/exynos: Don't fail if no TE-gpio is defined for DSI driver
  drm/exynos: gsc: Use platform_get_irq() to get the interrupt
  drm/exynos/fimc: Use platform_get_irq() to get the interrupt
  drm/exynos/exynos_drm_fimd: Use platform_get_irq_byname() to get the interrupt
  drm/exynos: mixer: Use platform_get_irq() to get the interrupt
  drm/exynos/exynos7_drm_decon: Use platform_get_irq_byname() to get the interrupt
2022-03-04 11:01:22 -08:00
Linus Torvalds
0b7344a658 Merge tag 'pinctrl-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
 "These two fixes should fix the issues seen on the OrangePi, first we
  needed the correct offset when calling pinctrl_gpio_direction(), and
  fixing that made a lockdep issue explode in our face. Both now fixed"

* tag 'pinctrl-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: sunxi: Use unique lockdep classes for IRQs
  pinctrl-sunxi: sunxi_pinctrl_gpio_direction_in/output: use correct offset
2022-03-04 10:56:00 -08:00
Randy Dunlap
1d02b444b8 tracing: Fix return value of __setup handlers
__setup() handlers should generally return 1 to indicate that the
boot options have been handled.

Using invalid option values causes the entire kernel boot option
string to be reported as Unknown and added to init's environment
strings, polluting it.

  Unknown kernel command line parameters "BOOT_IMAGE=/boot/bzImage-517rc6
    kprobe_event=p,syscall_any,$arg1 trace_options=quiet
    trace_clock=jiffies", will be passed to user space.

 Run /sbin/init as init process
   with arguments:
     /sbin/init
   with environment:
     HOME=/
     TERM=linux
     BOOT_IMAGE=/boot/bzImage-517rc6
     kprobe_event=p,syscall_any,$arg1
     trace_options=quiet
     trace_clock=jiffies

Return 1 from the __setup() handlers so that init's environment is not
polluted with kernel boot options.

Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Link: https://lkml.kernel.org/r/20220303031744.32356-1-rdunlap@infradead.org

Cc: stable@vger.kernel.org
Fixes: 7bcfaf54f5 ("tracing: Add trace_options kernel command line parameter")
Fixes: e1e232ca6b ("tracing: Add trace_clock=<clock> kernel parameter")
Fixes: 970988e19e ("tracing/kprobe: Add kprobe_event= boot parameter")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-04 13:46:13 -05:00
Daniel Borkmann
0708a0afe2 mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls
syzkaller was recently triggering an oversized kvmalloc() warning via
xdp_umem_create().

The triggered warning was added back in 7661809d49 ("mm: don't allow
oversized kvmalloc() calls"). The rationale for the warning for huge
kvmalloc sizes was as a reaction to a security bug where the size was
more than UINT_MAX but not everything was prepared to handle unsigned
long sizes.

Anyway, the AF_XDP related call trace from this syzkaller report was:

  kvmalloc include/linux/mm.h:806 [inline]
  kvmalloc_array include/linux/mm.h:824 [inline]
  kvcalloc include/linux/mm.h:829 [inline]
  xdp_umem_pin_pages net/xdp/xdp_umem.c:102 [inline]
  xdp_umem_reg net/xdp/xdp_umem.c:219 [inline]
  xdp_umem_create+0x6a5/0xf00 net/xdp/xdp_umem.c:252
  xsk_setsockopt+0x604/0x790 net/xdp/xsk.c:1068
  __sys_setsockopt+0x1fd/0x4e0 net/socket.c:2176
  __do_sys_setsockopt net/socket.c:2187 [inline]
  __se_sys_setsockopt net/socket.c:2184 [inline]
  __x64_sys_setsockopt+0xb5/0x150 net/socket.c:2184
  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Björn mentioned that requests for >2GB allocation can still be valid:

  The structure that is being allocated is the page-pinning accounting.
  AF_XDP has an internal limit of U32_MAX pages, which is *a lot*, but
  still fewer than what memcg allows (PAGE_COUNTER_MAX is a LONG_MAX/
  PAGE_SIZE on 64 bit systems). [...]

  I could just change from U32_MAX to INT_MAX, but as I stated earlier
  that has a hacky feeling to it. [...] From my perspective, the code
  isn't broken, with the memcg limits in consideration. [...]

Linus says:

  [...] Pretty much every time this has come up, the kernel warning has
  shown that yes, the code was broken and there really wasn't a reason
  for doing allocations that big.

  Of course, some people would be perfectly fine with the allocation
  failing, they just don't want the warning. I didn't want __GFP_NOWARN
  to shut it up originally because I wanted people to see all those
  cases, but these days I think we can just say "yeah, people can shut
  it up explicitly by saying 'go ahead and fail this allocation, don't
  warn about it'".

  So enough time has passed that by now I'd certainly be ok with [it].

Thus allow call-sites to silence such userspace triggered splats if the
allocation requests have __GFP_NOWARN. For xdp_umem_pin_pages()'s call
to kvcalloc() this is already the case, so nothing else needed there.

Fixes: 7661809d49 ("mm: don't allow oversized kvmalloc() calls")
Reported-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Cc: Björn Töpel <bjorn@kernel.org>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Link: https://lore.kernel.org/bpf/CAJ+HfNhyfsT5cS_U9EC213ducHs9k9zNxX9+abqC0kTrPbQ0gg@mail.gmail.com
Link: https://lore.kernel.org/bpf/20211201202905.b9892171e3f5b9a60f9da251@linux-foundation.org
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Ackd-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-04 10:00:37 -08:00
Xie Yongji
b9d102dafe vduse: Fix returning wrong type in vduse_domain_alloc_iova()
This fixes the following smatch warnings:

drivers/vdpa/vdpa_user/iova_domain.c:305 vduse_domain_alloc_iova() warn: should 'iova_pfn << shift' be a 64 bit type?

Fixes: 8c773d53fb ("vduse: Implement an MMU-based software IOTLB")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20220121083940.102-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 11:56:34 -05:00
Si-Wei Liu
ed0f849fc3 vdpa/mlx5: add validation for VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command
When control vq receives a VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command
request from the driver, presently there is no validation against the
number of queue pairs to configure, or even if multiqueue had been
negotiated or not is unverified. This may lead to kernel panic due to
uninitialized resource for the queues were there any bogus request
sent down by untrusted driver. Tie up the loose ends there.

Fixes: 52893733f2 ("vdpa/mlx5: Add multiqueue support")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Link: https://lore.kernel.org/r/1642206481-30721-4-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 11:56:34 -05:00
Si-Wei Liu
30c22f3816 vdpa/mlx5: should verify CTRL_VQ feature exists for MQ
Per VIRTIO v1.1 specification, section 5.1.3.1 Feature bit requirements:
"VIRTIO_NET_F_MQ Requires VIRTIO_NET_F_CTRL_VQ".

There's assumption in the mlx5_vdpa multiqueue code that MQ must come
together with CTRL_VQ. However, there's nowhere in the upper layer to
guarantee this assumption would hold. Were there an untrusted driver
sending down MQ without CTRL_VQ, it would compromise various spots for
e.g. is_index_valid() and is_ctrl_vq_idx(). Although this doesn't end
up with immediate panic or security loophole as of today's code, the
chance for this to be taken advantage of due to future code change is
not zero.

Harden the crispy assumption by failing the set_driver_features() call
when seeing (MQ && !CTRL_VQ). For that end, verify_min_features() is
renamed to verify_driver_features() to reflect the fact that it now does
more than just validate the minimum features. verify_driver_features()
is now used to accommodate various checks against the driver features
for set_driver_features().

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Link: https://lore.kernel.org/r/1642206481-30721-3-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 11:56:33 -05:00
Si-Wei Liu
e0077cc13b vdpa: factor out vdpa_set_features_unlocked for vdpa internal use
No functional change introduced. vdpa bus driver such as virtio_vdpa
or vhost_vdpa is not supposed to take care of the locking for core
by its own. The locked API vdpa_set_features should suffice the
bus driver's need.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/1642206481-30721-2-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 11:56:33 -05:00
Filipe Manana
ca93e44bfb btrfs: fallback to blocking mode when doing async dio over multiple extents
Some users recently reported that MariaDB was getting a read corruption
when using io_uring on top of btrfs. This started to happen in 5.16,
after commit 51bd9563b6 ("btrfs: fix deadlock due to page faults
during direct IO reads and writes"). That changed btrfs to use the new
iomap flag IOMAP_DIO_PARTIAL and to disable page faults before calling
iomap_dio_rw(). This was necessary to fix deadlocks when the iovector
corresponds to a memory mapped file region. That type of scenario is
exercised by test case generic/647 from fstests.

For this MariaDB scenario, we attempt to read 16K from file offset X
using IOCB_NOWAIT and io_uring. In that range we have 4 extents, each
with a size of 4K, and what happens is the following:

1) btrfs_direct_read() disables page faults and calls iomap_dio_rw();

2) iomap creates a struct iomap_dio object, its reference count is
   initialized to 1 and its ->size field is initialized to 0;

3) iomap calls btrfs_dio_iomap_begin() with file offset X, which finds
   the first 4K extent, and setups an iomap for this extent consisting
   of a single page;

4) At iomap_dio_bio_iter(), we are able to access the first page of the
   buffer (struct iov_iter) with bio_iov_iter_get_pages() without
   triggering a page fault;

5) iomap submits a bio for this 4K extent
   (iomap_dio_submit_bio() -> btrfs_submit_direct()) and increments
   the refcount on the struct iomap_dio object to 2; The ->size field
   of the struct iomap_dio object is incremented to 4K;

6) iomap calls btrfs_iomap_begin() again, this time with a file
   offset of X + 4K. There we setup an iomap for the next extent
   that also has a size of 4K;

7) Then at iomap_dio_bio_iter() we call bio_iov_iter_get_pages(),
   which tries to access the next page (2nd page) of the buffer.
   This triggers a page fault and returns -EFAULT;

8) At __iomap_dio_rw() we see the -EFAULT, but we reset the error
   to 0 because we passed the flag IOMAP_DIO_PARTIAL to iomap and
   the struct iomap_dio object has a ->size value of 4K (we submitted
   a bio for an extent already). The 'wait_for_completion' variable
   is not set to true, because our iocb has IOCB_NOWAIT set;

9) At the bottom of __iomap_dio_rw(), we decrement the reference count
   of the struct iomap_dio object from 2 to 1. Because we were not
   the only ones holding a reference on it and 'wait_for_completion' is
   set to false, -EIOCBQUEUED is returned to btrfs_direct_read(), which
   just returns it up the callchain, up to io_uring;

10) The bio submitted for the first extent (step 5) completes and its
    bio endio function, iomap_dio_bio_end_io(), decrements the last
    reference on the struct iomap_dio object, resulting in calling
    iomap_dio_complete_work() -> iomap_dio_complete().

11) At iomap_dio_complete() we adjust the iocb->ki_pos from X to X + 4K
    and return 4K (the amount of io done) to iomap_dio_complete_work();

12) iomap_dio_complete_work() calls the iocb completion callback,
    iocb->ki_complete() with a second argument value of 4K (total io
    done) and the iocb with the adjust ki_pos of X + 4K. This results
    in completing the read request for io_uring, leaving it with a
    result of 4K bytes read, and only the first page of the buffer
    filled in, while the remaining 3 pages, corresponding to the other
    3 extents, were not filled;

13) For the application, the result is unexpected because if we ask
    to read N bytes, it expects to get N bytes read as long as those
    N bytes don't cross the EOF (i_size).

MariaDB reports this as an error, as it's not expecting a short read,
since it knows it's asking for read operations fully within the i_size
boundary. This is typical in many applications, but it may also be
questionable if they should react to such short reads by issuing more
read calls to get the remaining data. Nevertheless, the short read
happened due to a change in btrfs regarding how it deals with page
faults while in the middle of a read operation, and there's no reason
why btrfs can't have the previous behaviour of returning the whole data
that was requested by the application.

The problem can also be triggered with the following simple program:

  /* Get O_DIRECT */
  #ifndef _GNU_SOURCE
  #define _GNU_SOURCE
  #endif

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <fcntl.h>
  #include <errno.h>
  #include <string.h>
  #include <liburing.h>

  int main(int argc, char *argv[])
  {
      char *foo_path;
      struct io_uring ring;
      struct io_uring_sqe *sqe;
      struct io_uring_cqe *cqe;
      struct iovec iovec;
      int fd;
      long pagesize;
      void *write_buf;
      void *read_buf;
      ssize_t ret;
      int i;

      if (argc != 2) {
          fprintf(stderr, "Use: %s <directory>\n", argv[0]);
          return 1;
      }

      foo_path = malloc(strlen(argv[1]) + 5);
      if (!foo_path) {
          fprintf(stderr, "Failed to allocate memory for file path\n");
          return 1;
      }
      strcpy(foo_path, argv[1]);
      strcat(foo_path, "/foo");

      /*
       * Create file foo with 2 extents, each with a size matching
       * the page size. Then allocate a buffer to read both extents
       * with io_uring, using O_DIRECT and IOCB_NOWAIT. Before doing
       * the read with io_uring, access the first page of the buffer
       * to fault it in, so that during the read we only trigger a
       * page fault when accessing the second page of the buffer.
       */
       fd = open(foo_path, O_CREAT | O_TRUNC | O_WRONLY |
                O_DIRECT, 0666);
       if (fd == -1) {
           fprintf(stderr,
                   "Failed to create file 'foo': %s (errno %d)",
                   strerror(errno), errno);
           return 1;
       }

       pagesize = sysconf(_SC_PAGE_SIZE);
       ret = posix_memalign(&write_buf, pagesize, 2 * pagesize);
       if (ret) {
           fprintf(stderr, "Failed to allocate write buffer\n");
           return 1;
       }

       memset(write_buf, 0xab, pagesize);
       memset(write_buf + pagesize, 0xcd, pagesize);

       /* Create 2 extents, each with a size matching page size. */
       for (i = 0; i < 2; i++) {
           ret = pwrite(fd, write_buf + i * pagesize, pagesize,
                        i * pagesize);
           if (ret != pagesize) {
               fprintf(stderr,
                     "Failed to write to file, ret = %ld errno %d (%s)\n",
                      ret, errno, strerror(errno));
               return 1;
           }
           ret = fsync(fd);
           if (ret != 0) {
               fprintf(stderr, "Failed to fsync file\n");
               return 1;
           }
       }

       close(fd);
       fd = open(foo_path, O_RDONLY | O_DIRECT);
       if (fd == -1) {
           fprintf(stderr,
                   "Failed to open file 'foo': %s (errno %d)",
                   strerror(errno), errno);
           return 1;
       }

       ret = posix_memalign(&read_buf, pagesize, 2 * pagesize);
       if (ret) {
           fprintf(stderr, "Failed to allocate read buffer\n");
           return 1;
       }

       /*
        * Fault in only the first page of the read buffer.
        * We want to trigger a page fault for the 2nd page of the
        * read buffer during the read operation with io_uring
        * (O_DIRECT and IOCB_NOWAIT).
        */
       memset(read_buf, 0, 1);

       ret = io_uring_queue_init(1, &ring, 0);
       if (ret != 0) {
           fprintf(stderr, "Failed to create io_uring queue\n");
           return 1;
       }

       sqe = io_uring_get_sqe(&ring);
       if (!sqe) {
           fprintf(stderr, "Failed to get io_uring sqe\n");
           return 1;
       }

       iovec.iov_base = read_buf;
       iovec.iov_len = 2 * pagesize;
       io_uring_prep_readv(sqe, fd, &iovec, 1, 0);

       ret = io_uring_submit_and_wait(&ring, 1);
       if (ret != 1) {
           fprintf(stderr,
                   "Failed at io_uring_submit_and_wait()\n");
           return 1;
       }

       ret = io_uring_wait_cqe(&ring, &cqe);
       if (ret < 0) {
           fprintf(stderr, "Failed at io_uring_wait_cqe()\n");
           return 1;
       }

       printf("io_uring read result for file foo:\n\n");
       printf("  cqe->res == %d (expected %d)\n", cqe->res, 2 * pagesize);
       printf("  memcmp(read_buf, write_buf) == %d (expected 0)\n",
              memcmp(read_buf, write_buf, 2 * pagesize));

       io_uring_cqe_seen(&ring, cqe);
       io_uring_queue_exit(&ring);

       return 0;
  }

When running it on an unpatched kernel:

  $ gcc io_uring_test.c -luring
  $ mkfs.btrfs -f /dev/sda
  $ mount /dev/sda /mnt/sda
  $ ./a.out /mnt/sda
  io_uring read result for file foo:

    cqe->res == 4096 (expected 8192)
    memcmp(read_buf, write_buf) == -205 (expected 0)

After this patch, the read always returns 8192 bytes, with the buffer
filled with the correct data. Although that reproducer always triggers
the bug in my test vms, it's possible that it will not be so reliable
on other environments, as that can happen if the bio for the first
extent completes and decrements the reference on the struct iomap_dio
object before we do the atomic_dec_and_test() on the reference at
__iomap_dio_rw().

Fix this in btrfs by having btrfs_dio_iomap_begin() return -EAGAIN
whenever we try to satisfy a non blocking IO request (IOMAP_NOWAIT flag
set) over a range that spans multiple extents (or a mix of extents and
holes). This avoids returning success to the caller when we only did
partial IO, which is not optimal for writes and for reads it's actually
incorrect, as the caller doesn't expect to get less bytes read than it has
requested (unless EOF is crossed), as previously mentioned. This is also
the type of behaviour that xfs follows (xfs_direct_write_iomap_begin()),
even though it doesn't use IOMAP_DIO_PARTIAL.

A test case for fstests will follow soon.

Link: https://lore.kernel.org/linux-btrfs/CABVffEM0eEWho+206m470rtM0d9J8ue85TtR-A_oVTuGLWFicA@mail.gmail.com/
Link: https://lore.kernel.org/linux-btrfs/CAHF2GV6U32gmqSjLe=XKgfcZAmLCiH26cJ2OnHGp5x=VAH4OHQ@mail.gmail.com/
CC: stable@vger.kernel.org # 5.16+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-04 15:09:21 +01:00
Michael S. Tsirkin
0e7174b9d5 virtio_console: break out of buf poll on remove
A common pattern for device reset is currently:
vdev->config->reset(vdev);
.. cleanup ..

reset prevents new interrupts from arriving and waits for interrupt
handlers to finish.

However if - as is common - the handler queues a work request which is
flushed during the cleanup stage, we have code adding buffers / trying
to get buffers while device is reset. Not good.

This was reproduced by running
	modprobe virtio_console
	modprobe -r virtio_console
in a loop.

Fix this up by calling virtio_break_device + flush before reset.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1786239
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04 08:33:22 -05:00
Michael S. Tsirkin
c46eccdaad virtio: document virtio_reset_device
Looks like most callers get driver/device removal wrong.
Document what's expected of callers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04 08:33:22 -05:00
Michael S. Tsirkin
4fa59ede95 virtio: acknowledge all features before access
The feature negotiation was designed in a way that
makes it possible for devices to know which config
fields will be accessed by drivers.

This is broken since commit 404123c2db ("virtio: allow drivers to
validate features") with fallout in at least block and net.  We have a
partial work-around in commit 2f9a174f91 ("virtio: write back
F_VERSION_1 before validate") which at least lets devices find out which
format should config space have, but this is a partial fix: guests
should not access config space without acknowledging features since
otherwise we'll never be able to change the config space format.

To fix, split finalize_features from virtio_finalize_features and
call finalize_features with all feature bits before validation,
and then - if validation changed any bits - once again after.

Since virtio_finalize_features no longer writes out features
rename it to virtio_features_ok - since that is what it does:
checks that features are ok with the device.

As a side effect, this also reduces the amount of hypervisor accesses -
we now only acknowledge features once unless we are clearing any
features when validating (which is uncommon).

IRC I think that this was more or less always the intent in the spec but
unfortunately the way the spec is worded does not say this explicitly, I
plan to address this at the spec level, too.

Acked-by: Jason Wang <jasowang@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 404123c2db ("virtio: allow drivers to validate features")
Fixes: 2f9a174f91 ("virtio: write back F_VERSION_1 before validate")
Cc: "Halil Pasic" <pasic@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04 08:33:21 -05:00
Michael S. Tsirkin
838d6d3461 virtio: unexport virtio_finalize_features
virtio_finalize_features is only used internally within virtio.
No reason to export it.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 08:33:21 -05:00
Tung Nguyen
be4977b847 tipc: fix kernel panic when enabling bearer
When enabling a bearer on a node, a kernel panic is observed:

[    4.498085] RIP: 0010:tipc_mon_prep+0x4e/0x130 [tipc]
...
[    4.520030] Call Trace:
[    4.520689]  <IRQ>
[    4.521236]  tipc_link_build_proto_msg+0x375/0x750 [tipc]
[    4.522654]  tipc_link_build_state_msg+0x48/0xc0 [tipc]
[    4.524034]  __tipc_node_link_up+0xd7/0x290 [tipc]
[    4.525292]  tipc_rcv+0x5da/0x730 [tipc]
[    4.526346]  ? __netif_receive_skb_core+0xb7/0xfc0
[    4.527601]  tipc_l2_rcv_msg+0x5e/0x90 [tipc]
[    4.528737]  __netif_receive_skb_list_core+0x20b/0x260
[    4.530068]  netif_receive_skb_list_internal+0x1bf/0x2e0
[    4.531450]  ? dev_gro_receive+0x4c2/0x680
[    4.532512]  napi_complete_done+0x6f/0x180
[    4.533570]  virtnet_poll+0x29c/0x42e [virtio_net]
...

The node in question is receiving activate messages in another
thread after changing bearer status to allow message sending/
receiving in current thread:

         thread 1           |              thread 2
         --------           |              --------
                            |
tipc_enable_bearer()        |
  test_and_set_bit_lock()   |
    tipc_bearer_xmit_skb()  |
                            | tipc_l2_rcv_msg()
                            |   tipc_rcv()
                            |     __tipc_node_link_up()
                            |       tipc_link_build_state_msg()
                            |         tipc_link_build_proto_msg()
                            |           tipc_mon_prep()
                            |           {
                            |             ...
                            |             // null-pointer dereference
                            |             u16 gen = mon->dom_gen;
                            |             ...
                            |           }
  // Not being executed yet |
  tipc_mon_create()         |
  {                         |
    ...                     |
    // allocate             |
    mon = kzalloc();        |
    ...                     |
  }                         |

Monitoring pointer in thread 2 is dereferenced before monitoring data
is allocated in thread 1. This causes kernel panic.

This commit fixes it by allocating the monitoring data before enabling
the bearer to receive messages.

Fixes: 35c55c9877 ("tipc: add neighbor monitoring framework")
Reported-by: Shuang Li <shuali@redhat.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-04 13:19:13 +00:00
Robert Hancock
0bf476fc36 net: macb: Fix lost RX packet wakeup race in NAPI receive
There is an oddity in the way the RSR register flags propagate to the
ISR register (and the actual interrupt output) on this hardware: it
appears that RSR register bits only result in ISR being asserted if the
interrupt was actually enabled at the time, so enabling interrupts with
RSR bits already set doesn't trigger an interrupt to be raised. There
was already a partial fix for this race in the macb_poll function where
it checked for RSR bits being set and re-triggered NAPI receive.
However, there was a still a race window between checking RSR and
actually enabling interrupts, where a lost wakeup could happen. It's
necessary to check again after enabling interrupts to see if RSR was set
just prior to the interrupt being enabled, and re-trigger receive in that
case.

This issue was noticed in a point-to-point UDP request-response protocol
which periodically saw timeouts or abnormally high response times due to
received packets not being processed in a timely fashion. In many
applications, more packets arriving, including TCP retransmissions, would
cause the original packet to be processed, thus masking the issue.

Fixes: 02f7a34f34 ("net: macb: Re-enable RX interrupt only when RX is done")
Cc: stable@vger.kernel.org
Co-developed-by: Scott McNutt <scott.mcnutt@siriusxm.com>
Signed-off-by: Scott McNutt <scott.mcnutt@siriusxm.com>
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-04 12:05:54 +00:00
Jakub Kicinski
9f3956d659 Merge tag 'for-net-2022-03-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - Fix regression with processing of MGMT commands
 - Fix unbalanced unlock in Set Device Flags

* tag 'for-net-2022-03-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: hci_sync: Fix not processing all entries on cmd_sync_work
  Bluetooth: hci_core: Fix unbalanced unlock in set_device_flags()
====================

Link: https://lore.kernel.org/r/20220303210743.314679-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 20:31:03 -08:00
Niklas Cassel
74583f1b92 riscv: dts: k210: fix broken IRQs on hart1
Commit 67d96729a9 ("riscv: Update Canaan Kendryte K210 device tree")
incorrectly removed two entries from the PLIC interrupt-controller node's
interrupts-extended property.

The PLIC driver cannot know the mapping between hart contexts and hart ids,
so this information has to be provided by device tree, as specified by the
PLIC device tree binding.

The PLIC driver uses the interrupts-extended property, and initializes the
hart context registers in the exact same order as provided by the
interrupts-extended property.

In other words, if we don't specify the S-mode interrupts, the PLIC driver
will simply initialize the hart0 S-mode hart context with the hart1 M-mode
configuration. It is therefore essential to specify the S-mode IRQs even
though the system itself will only ever be running in M-mode.

Re-add the S-mode interrupts, so that we get working IRQs on hart1 again.

Cc: <stable@vger.kernel.org>
Fixes: 67d96729a9 ("riscv: Update Canaan Kendryte K210 device tree")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 20:04:21 -08:00
Dave Airlie
8fdb196797 Merge tag 'drm-misc-fixes-2022-03-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
* drm/arm: Select DRM_GEM_CMEA_HELPER for HDLCD
 * drm/bridge: ti-sn65dsi86: Properly undo autosuspend
 * drm/vrr: Fix potential NULL-pointer deref

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YiCTGZ8IVCw0ilKK@linux-uq9g
2022-03-04 13:04:11 +10:00
Dave Airlie
c9585249c2 Merge tag 'amd-drm-fixes-5.17-2022-03-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.17-2022-03-02:

amdgpu:
- Suspend regression fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303045035.5650-1-alexander.deucher@amd.com
2022-03-04 13:02:13 +10:00
Dave Airlie
0d9f0ee17b Merge tag 'drm-intel-fixes-2022-03-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix GuC SLPC unset command. (Vinay Belgaumkar)
- Fix misidentification of some Apple MacBook Pro laptops as Jasper Lake. (Ville Syrjälä)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YiCXHiTyCE7TbopG@tursulin-mobl2
2022-03-04 12:55:48 +10:00
William Mahon
327b89f0ac HID: add mapping for KEY_ALL_APPLICATIONS
This patch adds a new key definition for KEY_ALL_APPLICATIONS
and aliases KEY_DASHBOARD to it.

It also maps the 0x0c/0x2a2 usage code to KEY_ALL_APPLICATIONS.

Signed-off-by: William Mahon <wmahon@chromium.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20220303035618.1.I3a7746ad05d270161a18334ae06e3b6db1a1d339@changeid
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-03-03 18:44:21 -08:00
William Mahon
bfa26ba343 HID: add mapping for KEY_DICTATE
Numerous keyboards are adding dictate keys which allows for text
messages to be dictated by a microphone.

This patch adds a new key definition KEY_DICTATE and maps 0x0c/0x0d8
usage code to this new keycode. Additionally hid-debug is adjusted to
recognize this new usage code as well.

Signed-off-by: William Mahon <wmahon@chromium.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20220303021501.1.I5dbf50eb1a7a6734ee727bda4a8573358c6d3ec0@changeid
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-03-03 18:44:19 -08:00
Alexandre Ghiti
e4fcfe6eca riscv: Fix kasan pud population
In sv48, the kasan inner regions are not aligned on PGDIR_SIZE and then
when we populate the kasan linear mapping region, we clear the kasan
vmalloc region which is in the same PGD.

Fix this by copying the content of the kasan early pud after allocating a
new PGD for the first time.

Fixes: e8a62cc26d ("riscv: Implement sv48 support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:34:29 -08:00
Alexandre Ghiti
625e24a550 riscv: Move high_memory initialization to setup_bootmem
high_memory used to be initialized in mem_init, way after setup_bootmem.
But a call to dma_contiguous_reserve in this function gives rise to the
below warning because high_memory is equal to 0 and is used at the very
beginning at cma_declare_contiguous_nid.

It went unnoticed since the move of the kasan region redefined
KERN_VIRT_SIZE so that it does not encompass -1 anymore.

Fix this by initializing high_memory in setup_bootmem.

------------[ cut here ]------------
virt_to_phys used for non-linear address: ffffffffffffffff (0xffffffffffffffff)
WARNING: CPU: 0 PID: 0 at arch/riscv/mm/physaddr.c:14 __virt_to_phys+0xac/0x1b8
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc1-00007-ga68b89289e26 #27
Hardware name: riscv-virtio,qemu (DT)
epc : __virt_to_phys+0xac/0x1b8
 ra : __virt_to_phys+0xac/0x1b8
epc : ffffffff80014922 ra : ffffffff80014922 sp : ffffffff84a03c30
 gp : ffffffff85866c80 tp : ffffffff84a3f180 t0 : ffffffff86bce657
 t1 : fffffffef09406e8 t2 : 0000000000000000 s0 : ffffffff84a03c70
 s1 : ffffffffffffffff a0 : 000000000000004f a1 : 00000000000f0000
 a2 : 0000000000000002 a3 : ffffffff8011f408 a4 : 0000000000000000
 a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffff84a03747
 s2 : ffffffd800000000 s3 : ffffffff86ef4000 s4 : ffffffff8467f828
 s5 : fffffff800000000 s6 : 8000000000006800 s7 : 0000000000000000
 s8 : 0000000480000000 s9 : 0000000080038ea0 s10: 0000000000000000
 s11: ffffffffffffffff t3 : ffffffff84a035c0 t4 : fffffffef09406e8
 t5 : fffffffef09406e9 t6 : ffffffff84a03758
status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003
[<ffffffff8322ef4c>] cma_declare_contiguous_nid+0xf2/0x64a
[<ffffffff83212a58>] dma_contiguous_reserve_area+0x46/0xb4
[<ffffffff83212c3a>] dma_contiguous_reserve+0x174/0x18e
[<ffffffff83208fc2>] paging_init+0x12c/0x35e
[<ffffffff83206bd2>] setup_arch+0x120/0x74e
[<ffffffff83201416>] start_kernel+0xce/0x68c
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<0000000000000000>] 0x0
softirqs last  enabled at (0): [<0000000000000000>] 0x0
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:34:12 -08:00
Alexandre Ghiti
c648c4bb7d riscv: Fix config KASAN && DEBUG_VIRTUAL
__virt_to_phys function is called very early in the boot process (ie
kasan_early_init) so it should not be instrumented by KASAN otherwise it
bugs.

Fix this by declaring phys_addr.c as non-kasan instrumentable.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721 (riscv: Add KASAN support)
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:32:41 -08:00
Alexandre Ghiti
5f763b3b59 riscv: Fix DEBUG_VIRTUAL false warnings
KERN_VIRT_SIZE used to encompass the kernel mapping before it was
redefined when moving the kasan mapping next to the kernel mapping to only
match the maximum amount of physical memory.

Then, kernel mapping addresses that go through __virt_to_phys are now
declared as wrong which is not true, one can use __virt_to_phys on such
addresses.

Fix this by redefining the condition that matches wrong addresses.

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:32:04 -08:00
Alexandre Ghiti
a3d3280378 riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
In order to get the pfn of a struct page* when sparsemem is enabled
without vmemmap, the mem_section structures need to be initialized which
happens in sparse_init.

But kasan_early_init calls pfn_to_page way before sparse_init is called,
which then tries to dereference a null mem_section pointer.

Fix this by removing the usage of this function in kasan_early_init.

Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 13:11:30 -08:00
Alexandre Ghiti
8b274f2238 riscv: Fix is_linear_mapping with recent move of KASAN region
The KASAN region was recently moved between the linear mapping and the
kernel mapping, is_linear_mapping used to check the validity of an
address by using the start of the kernel mapping, which is now wrong.

Fix this by using the maximum size of the physical memory.

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 13:11:02 -08:00
Ammar Faizi
38f80f4214 MAINTAINERS: Remove dead patchwork link
The patchwork link is dead. It says:

  404: File not found
  The page URL requested (/project/LKML/list/) does not exist.

Remove it.

Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-03 12:14:36 -08:00
David Howells
b08968f196 cachefiles: Fix incorrect length to fallocate()
When cachefiles_shorten_object() calls fallocate() to shape the cache
file to match the DIO size, it passes the total file size it wants to
achieve, not the amount of zeros that should be inserted.  Since this is
meant to preallocate that amount of storage for the file, it can cause
the cache to fill up the disk and hit ENOSPC.

Fix this by passing the length actually required to go from the current
EOF to the desired EOF.

Fixes: 7623ed6772 ("cachefiles: Implement cookie resize for truncate")
Reported-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/164630854858.3665356.17419701804248490708.stgit@warthog.procyon.org.uk # v1
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-03 11:35:21 -08:00
Linus Torvalds
b949c21fc2 Merge tag 'net-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from can, xfrm, wifi, bluetooth, and netfilter.

  Lots of various size fixes, the length of the tag speaks for itself.
  Most of the 5.17-relevant stuff comes from xfrm, wifi and bt trees
  which had been lagging as you pointed out previously. But there's also
  a larger than we'd like portion of fixes for bugs from previous
  releases.

  Three more fixes still under discussion, including and xfrm revert for
  uAPI error.

  Current release - regressions:

   - iwlwifi: don't advertise TWT support, prevent FW crash

   - xfrm: fix the if_id check in changelink

   - xen/netfront: destroy queues before real_num_tx_queues is zeroed

   - bluetooth: fix not checking MGMT cmd pending queue, make scanning
     work again

  Current release - new code bugs:

   - mptcp: make SIOCOUTQ accurate for fallback socket

   - bluetooth: access skb->len after null check

   - bluetooth: hci_sync: fix not using conn_timeout

   - smc: fix cleanup when register ULP fails

   - dsa: restore error path of dsa_tree_change_tag_proto

   - iwlwifi: fix build error for IWLMEI

   - iwlwifi: mvm: propagate error from request_ownership to the user

  Previous releases - regressions:

   - xfrm: fix pMTU regression when reported pMTU is too small

   - xfrm: fix TCP MSS calculation when pMTU is close to 1280

   - bluetooth: fix bt_skb_sendmmsg not allocating partial chunks

   - ipv6: ensure we call ipv6_mc_down() at most once, prevent leaks

   - ipv6: prevent leaks in igmp6 when input queues get full

   - fix up skbs delta_truesize in UDP GRO frag_list

   - eth: e1000e: fix possible HW unit hang after an s0ix exit

   - eth: e1000e: correct NVM checksum verification flow

   - ptp: ocp: fix large time adjustments

  Previous releases - always broken:

   - tcp: make tcp_read_sock() more robust in presence of urgent data

   - xfrm: distinguishing SAs and SPs by if_id in xfrm_migrate

   - xfrm: fix xfrm_migrate issues when address family changes

   - dcb: flush lingering app table entries for unregistered devices

   - smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error

   - mac80211: fix EAPoL rekey fail in 802.3 rx path

   - mac80211: fix forwarded mesh frames AC & queue selection

   - netfilter: nf_queue: fix socket access races and bugs

   - batman-adv: fix ToCToU iflink problems and check the result belongs
     to the expected net namespace

   - can: gs_usb, etas_es58x: fix opened_channel_cnt's accounting

   - can: rcar_canfd: register the CAN device when fully ready

   - eth: igb, igc: phy: drop premature return leaking HW semaphore

   - eth: ixgbe: xsk: change !netif_carrier_ok() handling in
     ixgbe_xmit_zc(), prevent live lock when link goes down

   - eth: stmmac: only enable DMA interrupts when ready

   - eth: sparx5: move vlan checks before any changes are made

   - eth: iavf: fix races around init, removal, resets and vlan ops

   - ibmvnic: more reset flow fixes

  Misc:

   - eth: fix return value of __setup handlers"

* tag 'net-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits)
  ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report()
  net: dsa: make dsa_tree_change_tag_proto actually unwind the tag proto change
  ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc()
  selftests: mlxsw: resource_scale: Fix return value
  selftests: mlxsw: tc_police_scale: Make test more robust
  net: dcb: disable softirqs in dcbnl_flush_dev()
  bnx2: Fix an error message
  sfc: extend the locking on mcdi->seqno
  net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server
  net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client
  net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()
  tcp: make tcp_read_sock() more robust
  bpf, sockmap: Do not ignore orig_len parameter
  net: ipa: add an interconnect dependency
  net: fix up skbs delta_truesize in UDP GRO frag_list
  iwlwifi: mvm: return value for request_ownership
  nl80211: Update bss channel on channel switch for P2P_CLIENT
  iwlwifi: fix build error for IWLMEI
  ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustments
  batman-adv: Don't expect inter-netns unique iflink indices
  ...
2022-03-03 11:10:56 -08:00
Linus Torvalds
e58bd49da6 Merge tag 'mips-fixes-5.17_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:

 - Fix memory detection for MT7621 devices

 - Fix setnocoherentio kernel option

 - Fix warning when CONFIG_SCHED_CORE is enabled

* tag 'mips-fixes-5.17_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: ralink: mt7621: use bitwise NOT instead of logical
  mips: setup: fix setnocoherentio() boolean setting
  MIPS: smp: fill in sibling and core maps earlier
  MIPS: ralink: mt7621: do memory detection on KSEG1
2022-03-03 10:38:28 -08:00
Linus Torvalds
4d5ae2340d Merge tag 'auxdisplay-for-linus-v5.17-rc7' of git://github.com/ojeda/linux
Pull auxdisplay fixes from Miguel Ojeda:
 "A few lcd2s fixes from Andy Shevchenko"

* tag 'auxdisplay-for-linus-v5.17-rc7' of git://github.com/ojeda/linux:
  auxdisplay: lcd2s: Use proper API to free the instance of charlcd object
  auxdisplay: lcd2s: Fix memory leak in ->remove()
  auxdisplay: lcd2s: Fix lcd2s_redefine_char() feature
2022-03-03 10:31:09 -08:00
Eric Dumazet
2d3916f318 ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report()
While investigating on why a synchronize_net() has been added recently
in ipv6_mc_down(), I found that igmp6_event_query() and igmp6_event_report()
might drop skbs in some cases.

Discussion about removing synchronize_net() from ipv6_mc_down()
will happen in a different thread.

Fixes: f185de28d9 ("mld: add new workqueues for process mld events")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220303173728.937869-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 09:47:06 -08:00
Vladimir Oltean
e1bec7fa1c net: dsa: make dsa_tree_change_tag_proto actually unwind the tag proto change
The blamed commit said one thing but did another. It explains that we
should restore the "return err" to the original "goto out_unwind_tagger",
but instead it replaced it with "goto out_unlock".

When DSA_NOTIFIER_TAG_PROTO fails after the first switch of a
multi-switch tree, the switches would end up not using the same tagging
protocol.

Fixes: 0b0e2ff103 ("net: dsa: restore error path of dsa_tree_change_tag_proto")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220303154249.1854436-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 08:39:12 -08:00
Maciej Fijalkowski
6c7273a266 ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc()
Commit c685c69fba ("ixgbe: don't do any AF_XDP zero-copy transmit if
netif is not OK") addressed the ring transient state when
MEM_TYPE_XSK_BUFF_POOL was being configured which in turn caused the
interface to through down/up. Maurice reported that when carrier is not
ok and xsk_pool is present on ring pair, ksoftirqd will consume 100% CPU
cycles due to the constant NAPI rescheduling as ixgbe_poll() states that
there is still some work to be done.

To fix this, do not set work_done to false for a !netif_carrier_ok().

Fixes: c685c69fba ("ixgbe: don't do any AF_XDP zero-copy transmit if netif is not OK")
Reported-by: Maurice Baijens <maurice.baijens@ellips.com>
Tested-by: Maurice Baijens <maurice.baijens@ellips.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 08:26:55 -08:00
Jakub Kicinski
312f2d500a Merge branch 'selftests-mlxsw-a-couple-of-fixes'
Ido Schimmel says:

====================
selftests: mlxsw: A couple of fixes

Patch #1 fixes a breakage due to a change in iproute2 output. The real
problem is not iproute2, but the fact that the check was not strict
enough. Fixed by using JSON output instead. Targeting at net so that the
test will pass as part of old and new kernels regardless of iproute2
version.

Patch #2 fixes an issue uncovered by the first one.
====================

Link: https://lore.kernel.org/r/20220302161447.217447-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 08:14:06 -08:00
Amit Cohen
196f9bc050 selftests: mlxsw: resource_scale: Fix return value
The test runs several test cases and is supposed to return an error in
case at least one of them failed.

Currently, the check of the return value of each test case is in the
wrong place, which can result in the wrong return value. For example:

 # TESTS='tc_police' ./resource_scale.sh
 TEST: 'tc_police' [default] 968                                     [FAIL]
         tc police offload count failed
 Error: mlxsw_spectrum: Failed to allocate policer index.
 We have an error talking to the kernel
 Command failed /tmp/tmp.i7Oc5HwmXY:969
 TEST: 'tc_police' [default] overflow 969                            [ OK ]
 ...
 TEST: 'tc_police' [ipv4_max] overflow 969                           [ OK ]

 $ echo $?
 0

Fix this by moving the check to be done after each test case.

Fixes: 059b18e21c ("selftests: mlxsw: Return correct error code in resource scale test")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 08:14:01 -08:00
Amit Cohen
dc97520753 selftests: mlxsw: tc_police_scale: Make test more robust
The test adds tc filters and checks how many of them were offloaded by
grepping for 'in_hw'.

iproute2 commit f4cd4f127047 ("tc: add skip_hw and skip_sw to control
action offload") added offload indication to tc actions, producing the
following output:

 $ tc filter show dev swp2 ingress
 ...
 filter protocol ipv6 pref 1000 flower chain 0 handle 0x7c0
   eth_type ipv6
   dst_ip 2001:db8:1::7bf
   skip_sw
   in_hw in_hw_count 1
         action order 1:  police 0x7c0 rate 10Mbit burst 100Kb mtu 2Kb action drop overhead 0b
         ref 1 bind 1
         not_in_hw
         used_hw_stats immediate

The current grep expression matches on both 'in_hw' and 'not_in_hw',
resulting in incorrect results.

Fix that by using JSON output instead.

Fixes: 5061e77326 ("selftests: mlxsw: Add scale test for tc-police")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 08:14:01 -08:00
Vladimir Oltean
10b6bb62ae net: dcb: disable softirqs in dcbnl_flush_dev()
Ido Schimmel points out that since commit 52cff74eef ("dcbnl : Disable
software interrupts before taking dcb_lock"), the DCB API can be called
by drivers from softirq context.

One such in-tree example is the chelsio cxgb4 driver:
dcb_rpl
-> cxgb4_dcb_handle_fw_update
   -> dcb_ieee_setapp

If the firmware for this driver happened to send an event which resulted
in a call to dcb_ieee_setapp() at the exact same time as another
DCB-enabled interface was unregistering on the same CPU, the softirq
would deadlock, because the interrupted process was already holding the
dcb_lock in dcbnl_flush_dev().

Fix this unlikely event by using spin_lock_bh() in dcbnl_flush_dev() as
in the rest of the dcbnl code.

Fixes: 91b0383fef ("net: dcb: flush lingering app table entries for unregistered devices")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220302193939.1368823-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 08:01:55 -08:00
Christophe JAILLET
8ccffe9ac3 bnx2: Fix an error message
Fix an error message and report the correct failing function.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:48:40 +00:00
Niels Dossche
f1fb205efb sfc: extend the locking on mcdi->seqno
seqno could be read as a stale value outside of the lock. The lock is
already acquired to protect the modification of seqno against a possible
race condition. Place the reading of this value also inside this locking
to protect it against a possible race condition.

Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:11:58 +00:00
Luiz Augusto von Dentz
008ee9eb8a Bluetooth: hci_sync: Fix not processing all entries on cmd_sync_work
hci_cmd_sync_queue can be called multiple times, each adding a
hci_cmd_sync_work_entry, before hci_cmd_sync_work is run so this makes
sure they are all dequeued properly otherwise it creates a backlog of
entries that are never run.

Link: https://lore.kernel.org/all/CAJCQCtSeUtHCgsHXLGrSTWKmyjaQDbDNpP4rb0i+RE+L2FTXSA@mail.gmail.com/T/
Fixes: 6a98e3836f ("Bluetooth: Add helper for serialized HCI command execution")
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-03-03 13:30:03 +01:00
Hans de Goede
815d512192 Bluetooth: hci_core: Fix unbalanced unlock in set_device_flags()
There is only one "goto done;" in set_device_flags() and this happens
*before* hci_dev_lock() is called, move the done label to after the
hci_dev_unlock() to fix the following unlock balance:

[   31.493567] =====================================
[   31.493571] WARNING: bad unlock balance detected!
[   31.493576] 5.17.0-rc2+ #13 Tainted: G         C  E
[   31.493581] -------------------------------------
[   31.493584] bluetoothd/685 is trying to release lock (&hdev->lock) at:
[   31.493594] [<ffffffffc07603f5>] set_device_flags+0x65/0x1f0 [bluetooth]
[   31.493684] but there are no more locks to release!

Note this bug has been around for a couple of years, but before
commit fe92ee6425 ("Bluetooth: hci_core: Rework hci_conn_params flags")
supported_flags was hardcoded to "((1U << HCI_CONN_FLAG_MAX) - 1)" so
the check for unsupported flags which does the "goto done;" never
triggered.

Fixes: fe92ee6425 ("Bluetooth: hci_core: Rework hci_conn_params flags")
Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-03-03 11:35:10 +01:00
David S. Miller
f8e9bd34ce Merge branch 'smc-fix'
D. Wythe says:

====================
fix unexpected SMC_CLC_DECL_ERR_REGRMB error

We can easily trigger the SMC_CLC_DECL_ERR_REGRMB exception within
following script:

server: smc_run nginx
client: smc_run  ./wrk -c 2000 -t 8 -d 20 http://smc-server

And we can clearly see that this error is also divided into two types:

1. 0x09990003
2. 0x05000000/0x09990003

Which has the same root causes, but the immediate causes vary.

The root cause of this issues is that remove connections from link group
is not synchronous with add/delete rtoken entry,  which means that even
the number of connections is less that SMC_RMBS_PER_LGR_MAX, it does not
mean that the connection can register rtoken successfully later. In
other words, the rtoken entry may released, This will cause an
unexpected SMC_CLC_DECL_ERR_REGRMB to be reported, and then this SMC
connections have to fallback to TCP.

This patch set handles two types of SMC_CLC_DECL_ERR_REGRMB exceptions
from different perspectives.

Patch 1: fix the 0x05000000/0x09990003 error.
Patch 2: fix the 0x09990003 error.

After those patches, there is no SMC_CLC_DECL_ERR_REGRMB exceptions in
my
test case any more.

v1 -> v2:
- add bugfix patch for SMC_CLC_DECL_ERR_REGRMB cause by server side
v2 -> v3:
- fix incorrect mail thread
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 10:34:18 +00:00
D. Wythe
4940a1fdf3 net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server
The problem of SMC_CLC_DECL_ERR_REGRMB on the server is very clear.
Based on the fact that whether a new SMC connection can be accepted or
not depends on not only the limit of conn nums, but also the available
entries of rtoken. Since the rtoken release is trigger by peer, while
the conn nums is decrease by local, tons of thing can happen in this
time difference.

This only thing that needs to be mentioned is that now all connection
creations are completely protected by smc_server_lgr_pending lock, it's
enough to check only the available entries in rtokens_used_mask.

Fixes: cd6851f303 ("smc: remote memory buffers (RMBs)")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 10:34:18 +00:00
D. Wythe
0537f0a215 net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client
The main reason for this unexpected SMC_CLC_DECL_ERR_REGRMB in client
dues to following execution sequence:

Server Conn A:           Server Conn B:			Client Conn B:

smc_lgr_unregister_conn
                        smc_lgr_register_conn
                        smc_clc_send_accept     ->
                                                        smc_rtoken_add
smcr_buf_unuse
		->		Client Conn A:
				smc_rtoken_delete

smc_lgr_unregister_conn() makes current link available to assigned to new
incoming connection, while smcr_buf_unuse() has not executed yet, which
means that smc_rtoken_add may fail because of insufficient rtoken_entry,
reversing their execution order will avoid this problem.

Fixes: 3e034725c0 ("net/smc: common functions for RMBs and send buffers")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 10:34:18 +00:00
Zheyu Ma
bd6f1fd5d3 net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()
During driver initialization, the pointer of card info, i.e. the
variable 'ci' is required. However, the definition of
'com20020pci_id_table' reveals that this field is empty for some
devices, which will cause null pointer dereference when initializing
these devices.

The following log reveals it:

[    3.973806] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
[    3.973819] RIP: 0010:com20020pci_probe+0x18d/0x13e0 [com20020_pci]
[    3.975181] Call Trace:
[    3.976208]  local_pci_probe+0x13f/0x210
[    3.977248]  pci_device_probe+0x34c/0x6d0
[    3.977255]  ? pci_uevent+0x470/0x470
[    3.978265]  really_probe+0x24c/0x8d0
[    3.978273]  __driver_probe_device+0x1b3/0x280
[    3.979288]  driver_probe_device+0x50/0x370

Fix this by checking whether the 'ci' is a null pointer first.

Fixes: 8c14f9c703 ("ARCNET: add com20020 PCI IDs with metadata")
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 10:29:13 +00:00
Eric Dumazet
e3d5ea2c01 tcp: make tcp_read_sock() more robust
If recv_actor() returns an incorrect value, tcp_read_sock()
might loop forever.

Instead, issue a one time warning and make sure to make progress.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20220302161723.3910001-2-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02 22:49:03 -08:00
Eric Dumazet
60ce37b039 bpf, sockmap: Do not ignore orig_len parameter
Currently, sk_psock_verdict_recv() returns skb->len

This is problematic because tcp_read_sock() might have
passed orig_len < skb->len, due to the presence of TCP urgent data.

This causes an infinite loop from tcp_read_sock()

Followup patch will make tcp_read_sock() more robust vs bad actors.

Fixes: ef5659280e ("bpf, sockmap: Allow skipping sk_skb parser program")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20220302161723.3910001-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02 22:49:03 -08:00
Alex Elder
1dba41c9d2 net: ipa: add an interconnect dependency
In order to function, the IPA driver very clearly requires the
interconnect framework to be enabled in the kernel configuration.
State that dependency in the Kconfig file.

This became a problem when CONFIG_COMPILE_TEST support was added.
Non-Qualcomm platforms won't necessarily enable CONFIG_INTERCONNECT.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 38a4066f59 ("net: ipa: support COMPILE_TEST")
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20220301113440.257916-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02 22:14:05 -08:00
lena wang
224102de2f net: fix up skbs delta_truesize in UDP GRO frag_list
The truesize for a UDP GRO packet is added by main skb and skbs in main
skb's frag_list:
skb_gro_receive_list
        p->truesize += skb->truesize;

The commit 53475c5dd8 ("net: fix use-after-free when UDP GRO with
shared fraglist") introduced a truesize increase for frag_list skbs.
When uncloning skb, it will call pskb_expand_head and trusesize for
frag_list skbs may increase. This can occur when allocators uses
__netdev_alloc_skb and not jump into __alloc_skb. This flow does not
use ksize(len) to calculate truesize while pskb_expand_head uses.
skb_segment_list
err = skb_unclone(nskb, GFP_ATOMIC);
pskb_expand_head
        if (!skb->sk || skb->destructor == sock_edemux)
                skb->truesize += size - osize;

If we uses increased truesize adding as delta_truesize, it will be
larger than before and even larger than previous total truesize value
if skbs in frag_list are abundant. The main skb truesize will become
smaller and even a minus value or a huge value for an unsigned int
parameter. Then the following memory check will drop this abnormal skb.

To avoid this error we should use the original truesize to segment the
main skb.

Fixes: 53475c5dd8 ("net: fix use-after-free when UDP GRO with shared fraglist")
Signed-off-by: lena wang <lena.wang@mediatek.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/1646133431-8948-1-git-send-email-lena.wang@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02 22:04:10 -08:00
Jakub Kicinski
ea97ab9889 Merge tag 'batadv-net-pullrequest-20220302' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:

====================
Here are some batman-adv bugfixes:

 - Remove redundant iflink requests, by Sven Eckelmann (2 patches)

 - Don't expect inter-netns unique iflink indices, by Sven Eckelmann

* tag 'batadv-net-pullrequest-20220302' of git://git.open-mesh.org/linux-merge:
  batman-adv: Don't expect inter-netns unique iflink indices
  batman-adv: Request iflink once in batadv_get_real_netdevice
  batman-adv: Request iflink once in batadv-on-batadv check
====================

Link: https://lore.kernel.org/r/20220302163049.101957-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02 21:53:35 -08:00
Jakub Kicinski
95749c1033 Merge tag 'wireless-for-net-2022-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
Three more fixes:
 - fix build issue in iwlwifi, now that I understood
   what's going on there
 - propagate error in iwlwifi/mvm to userspace so it
   can figure out what's happening
 - fix channel switch related updates in P2P-client
   in cfg80211

* tag 'wireless-for-net-2022-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  iwlwifi: mvm: return value for request_ownership
  nl80211: Update bss channel on channel switch for P2P_CLIENT
  iwlwifi: fix build error for IWLMEI
====================

Link: https://lore.kernel.org/r/20220302214444.100180-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02 21:49:57 -08:00
Linus Torvalds
5859a2b199 Merge branch 'ucount-rlimit-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ucounts fix from Eric Biederman:
 "Etienne Dechamps recently found a regression caused by enforcing
  RLIMIT_NPROC for root where the rlimit was not previously enforced.

  Michal Koutný had previously pointed out the inconsistency in
  enforcing the RLIMIT_NPROC that had been on the root owned process
  after the root user creates a user namespace.

  Which makes the fix for the regression simply removing the
  inconsistency"

* 'ucount-rlimit-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: Fix systemd LimitNPROC with private users regression
2022-03-02 16:20:04 -08:00
Linus Torvalds
7e3d76139b Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:

 - Fix kgdb breakpoint for Thumb2

 - Fix dependency for BITREVERSE kconfig

 - Fix nommu early_params and __setup returns

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9182/1: mmu: fix returns from early_param() and __setup() functions
  ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE
  ARM: Fix kgdb breakpoint for Thumb2
2022-03-02 16:11:56 -08:00
Qiang Yu
f1ef17011c drm/amdgpu: fix suspend/resume hang regression
Regression has been reported that suspend/resume may hang with
the previous vm ready check commit.

So bring back the evicted list check as a temp fix.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1922
Fixes: c1a66c3bc4 ("drm/amdgpu: check vm ready by amdgpu_vm->evicting flag")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Qiang Yu <qiang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-02 18:36:43 -05:00
Andy Shevchenko
9ed331f8a0 auxdisplay: lcd2s: Use proper API to free the instance of charlcd object
While it might work, the current approach is fragile in a few ways:
- whenever members in the structure are shuffled, the pointer will be wrong
- the resource freeing may include more than covered by kfree()

Fix this by using charlcd_free() call instead of kfree().

Fixes: 8c9108d014 ("auxdisplay: add a driver for lcd2s character display")
Cc: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2022-03-03 00:30:31 +01:00
Andy Shevchenko
898c0a1542 auxdisplay: lcd2s: Fix memory leak in ->remove()
Once allocated the struct lcd2s_data is never freed.
Fix the memory leak by switching to devm_kzalloc().

Fixes: 8c9108d014 ("auxdisplay: add a driver for lcd2s character display")
Cc: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2022-03-03 00:30:14 +01:00
Andy Shevchenko
4424c35ead auxdisplay: lcd2s: Fix lcd2s_redefine_char() feature
It seems that the lcd2s_redefine_char() has never been properly
tested. The buffer is filled by DEF_CUSTOM_CHAR command followed
by the character number (from 0 to 7), but immediately after that
these bytes are rewritten by the decoded hex stream.

Fix the index to fill the buffer after the command and number.

Fixes: 8c9108d014 ("auxdisplay: add a driver for lcd2s character display")
Cc: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
[fixed typo in commit message]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2022-03-03 00:29:26 +01:00
Emmanuel Grumbach
e6e91ec966 iwlwifi: mvm: return value for request_ownership
Propagate the value to the user space so it can understand
if the operation failed or not.

Fixes: bfcfdb59b6 ("iwlwifi: mvm: add vendor commands needed for iwlmei")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://lore.kernel.org/r/20220302072715.4885-1-emmanuel.grumbach@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-02 22:37:25 +01:00
Sreeramya Soratkal
e50b88c4f0 nl80211: Update bss channel on channel switch for P2P_CLIENT
The wdev channel information is updated post channel switch only for
the station mode and not for the other modes. Due to this, the P2P client
still points to the old value though it moved to the new channel
when the channel change is induced from the P2P GO.

Update the bss channel after CSA channel switch completion for P2P client
interface as well.

Signed-off-by: Sreeramya Soratkal <quic_ssramya@quicinc.com>
Link: https://lore.kernel.org/r/1646114600-31479-1-git-send-email-quic_ssramya@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-02 22:37:05 +01:00
Randy Dunlap
875ad06015 iwlwifi: fix build error for IWLMEI
When CONFIG_IWLWIFI=m and CONFIG_IWLMEI=y, the kernel build system
must be told to build the iwlwifi/ subdirectory for both IWLWIFI and
IWLMEI so that builds for both =y and =m are done.

This resolves an undefined reference build error:

ERROR: modpost: "iwl_mei_is_connected" [drivers/net/wireless/intel/iwlwifi/iwlwifi.ko] undefined!

Fixes: 977df8bd58 ("wlwifi: work around reverse dependency on MEI")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: linux-wireless@vger.kernel.org
Link: https://lore.kernel.org/r/20220227200051.7176-1-rdunlap@infradead.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-02 22:36:49 +01:00
Linus Torvalds
92ebf5f91b Merge tag 'erofs-for-5.17-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fix from Gao Xiang:
 "A one-line patch to fix the new ztailpacking feature on > 4GiB
  filesystems because z_idataoff can get trimmed improperly.

  ztailpacking is still a brand new EXPERIMENTAL feature, but it'd be
  better to fix the issue as soon as possible to avoid unnecessary
  backporting.

  Summary:

   - Fix ztailpacking z_idataoff getting trimmed on > 4GiB filesystems"

* tag 'erofs-for-5.17-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix ztailpacking on > 4GiB filesystems
2022-03-02 12:08:36 -08:00
Linus Torvalds
ae5f531d17 Merge tag 'ntb-5.17-bugfixes' of git://github.com/jonmason/ntb
Pull NTB fixes from Jon Mason:
 "Bug fixes for sparse warning, intel port config offset, and a new
  mailing list"

* tag 'ntb-5.17-bugfixes' of git://github.com/jonmason/ntb:
  MAINTAINERS: update mailing list address for NTB subsystem
  ntb: intel: fix port config status offset for SPR
  NTB/msi: Use struct_size() helper in devm_kzalloc()
2022-03-02 11:58:27 -08:00
Jonathan Lemon
90f8f4c0e3 ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustments
In ("ptp: ocp: Have FPGA fold in ns adjustment for adjtime."), the
ns adjustment was written to the FPGA register, so the clock could
accurately perform adjustments.

However, the adjtime() call passes in a s64, while the clock adjustment
registers use a s32.  When trying to perform adjustments with a large
value (37 sec), things fail.

Examine the incoming delta, and if larger than 1 sec, use the original
(coarse) adjustment method.  If smaller than 1 sec, then allow the
FPGA to fold in the changes over a 1 second window.

Fixes: 6d59d4fa17 ("ptp: ocp: Have FPGA fold in ns adjustment for adjtime.")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20220228203957.367371-1-jonathan.lemon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02 09:51:21 -08:00
Paolo Bonzini
8d25b7beca KVM: x86: pull kvm->srcu read-side to kvm_arch_vcpu_ioctl_run
kvm_arch_vcpu_ioctl_run is already doing srcu_read_lock/unlock in two
places, namely vcpu_run and post_kvm_run_save, and a third is actually
needed around the call to vcpu->arch.complete_userspace_io to avoid
the following splat:

  WARNING: suspicious RCU usage
  arch/x86/kvm/pmu.c:190 suspicious rcu_dereference_check() usage!
  other info that might help us debug this:
  rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by CPU 28/KVM/370841:
  #0: ff11004089f280b8 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x87/0x730 [kvm]
  Call Trace:
   <TASK>
   dump_stack_lvl+0x59/0x73
   reprogram_fixed_counter+0x15d/0x1a0 [kvm]
   kvm_pmu_trigger_event+0x1a3/0x260 [kvm]
   ? free_moved_vector+0x1b4/0x1e0
   complete_fast_pio_in+0x8a/0xd0 [kvm]

This splat is not at all unexpected, since complete_userspace_io callbacks
can execute similar code to vmexits.  For example, SVM with nrips=false
will call into the emulator from svm_skip_emulated_instruction().

While it's tempting to never acquire kvm->srcu for an uninitialized vCPU,
practically speaking there's no penalty to acquiring kvm->srcu "early"
as the KVM_MP_STATE_UNINITIALIZED path is a one-time thing per vCPU.  On
the other hand, seemingly innocuous helpers like kvm_apic_accept_events()
and sync_regs() can theoretically reach code that might access
SRCU-protected data structures, e.g. sync_regs() can trigger forced
existing of nested mode via kvm_vcpu_ioctl_x86_set_vcpu_events().

Reported-by: Like Xu <likexu@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-02 10:55:58 -05:00
Like Xu
c6c937d673 KVM: x86/mmu: Passing up the error state of mmu_alloc_shadow_roots()
Just like on the optional mmu_alloc_direct_roots() path, once shadow
path reaches "r = -EIO" somewhere, the caller needs to know the actual
state in order to enter error handling and avoid something worse.

Fixes: 4a38162ee9 ("KVM: MMU: load PDPTRs outside mmu_lock")
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220301124941.48412-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-02 10:55:58 -05:00
Filipe Manana
4751dc9962 btrfs: add missing run of delayed items after unlink during log replay
During log replay, whenever we need to check if a name (dentry) exists in
a directory we do searches on the subvolume tree for inode references or
or directory entries (BTRFS_DIR_INDEX_KEY keys, and BTRFS_DIR_ITEM_KEY
keys as well, before kernel 5.17). However when during log replay we
unlink a name, through btrfs_unlink_inode(), we may not delete inode
references and dir index keys from a subvolume tree and instead just add
the deletions to the delayed inode's delayed items, which will only be
run when we commit the transaction used for log replay. This means that
after an unlink operation during log replay, if we attempt to search for
the same name during log replay, we will not see that the name was already
deleted, since the deletion is recorded only on the delayed items.

We run delayed items after every unlink operation during log replay,
except at unlink_old_inode_refs() and at add_inode_ref(). This was due
to an overlook, as delayed items should be run after evert unlink, for
the reasons stated above.

So fix those two cases.

Fixes: 0d836392ca ("Btrfs: fix mount failure after fsync due to hard link recreation")
Fixes: 1f250e929a ("Btrfs: fix log replay failure after unlink and link combination")
CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-02 16:53:11 +01:00
Sidong Yang
d4aef1e122 btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
The commit e804861bd4 ("btrfs: fix deadlock between quota disable and
qgroup rescan worker") by Kawasaki resolves deadlock between quota
disable and qgroup rescan worker. But also there is a deadlock case like
it. It's about enabling or disabling quota and creating or removing
qgroup. It can be reproduced in simple script below.

for i in {1..100}
do
    btrfs quota enable /mnt &
    btrfs qgroup create 1/0 /mnt &
    btrfs qgroup destroy 1/0 /mnt &
    btrfs quota disable /mnt &
done

Here's why the deadlock happens:

1) The quota rescan task is running.

2) Task A calls btrfs_quota_disable(), locks the qgroup_ioctl_lock
   mutex, and then calls btrfs_qgroup_wait_for_completion(), to wait for
   the quota rescan task to complete.

3) Task B calls btrfs_remove_qgroup() and it blocks when trying to lock
   the qgroup_ioctl_lock mutex, because it's being held by task A. At that
   point task B is holding a transaction handle for the current transaction.

4) The quota rescan task calls btrfs_commit_transaction(). This results
   in it waiting for all other tasks to release their handles on the
   transaction, but task B is blocked on the qgroup_ioctl_lock mutex
   while holding a handle on the transaction, and that mutex is being held
   by task A, which is waiting for the quota rescan task to complete,
   resulting in a deadlock between these 3 tasks.

To resolve this issue, the thread disabling quota should unlock
qgroup_ioctl_lock before waiting rescan completion. Move
btrfs_qgroup_wait_for_completion() after unlock of qgroup_ioctl_lock.

Fixes: e804861bd4 ("btrfs: fix deadlock between quota disable and qgroup rescan worker")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-02 16:53:04 +01:00
Omar Sandoval
5fd76bf31c btrfs: fix relocation crash due to premature return from btrfs_commit_transaction()
We are seeing crashes similar to the following trace:

[38.969182] WARNING: CPU: 20 PID: 2105 at fs/btrfs/relocation.c:4070 btrfs_relocate_block_group+0x2dc/0x340 [btrfs]
[38.973556] CPU: 20 PID: 2105 Comm: btrfs Not tainted 5.17.0-rc4 #54
[38.974580] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[38.976539] RIP: 0010:btrfs_relocate_block_group+0x2dc/0x340 [btrfs]
[38.980336] RSP: 0000:ffffb0dd42e03c20 EFLAGS: 00010206
[38.981218] RAX: ffff96cfc4ede800 RBX: ffff96cfc3ce0000 RCX: 000000000002ca14
[38.982560] RDX: 0000000000000000 RSI: 4cfd109a0bcb5d7f RDI: ffff96cfc3ce0360
[38.983619] RBP: ffff96cfc309c000 R08: 0000000000000000 R09: 0000000000000000
[38.984678] R10: ffff96cec0000001 R11: ffffe84c80000000 R12: ffff96cfc4ede800
[38.985735] R13: 0000000000000000 R14: 0000000000000000 R15: ffff96cfc3ce0360
[38.987146] FS:  00007f11c15218c0(0000) GS:ffff96d6dfb00000(0000) knlGS:0000000000000000
[38.988662] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[38.989398] CR2: 00007ffc922c8e60 CR3: 00000001147a6001 CR4: 0000000000370ee0
[38.990279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[38.991219] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[38.992528] Call Trace:
[38.992854]  <TASK>
[38.993148]  btrfs_relocate_chunk+0x27/0xe0 [btrfs]
[38.993941]  btrfs_balance+0x78e/0xea0 [btrfs]
[38.994801]  ? vsnprintf+0x33c/0x520
[38.995368]  ? __kmalloc_track_caller+0x351/0x440
[38.996198]  btrfs_ioctl_balance+0x2b9/0x3a0 [btrfs]
[38.997084]  btrfs_ioctl+0x11b0/0x2da0 [btrfs]
[38.997867]  ? mod_objcg_state+0xee/0x340
[38.998552]  ? seq_release+0x24/0x30
[38.999184]  ? proc_nr_files+0x30/0x30
[38.999654]  ? call_rcu+0xc8/0x2f0
[39.000228]  ? __x64_sys_ioctl+0x84/0xc0
[39.000872]  ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
[39.001973]  __x64_sys_ioctl+0x84/0xc0
[39.002566]  do_syscall_64+0x3a/0x80
[39.003011]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[39.003735] RIP: 0033:0x7f11c166959b
[39.007324] RSP: 002b:00007fff2543e998 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[39.008521] RAX: ffffffffffffffda RBX: 00007f11c1521698 RCX: 00007f11c166959b
[39.009833] RDX: 00007fff2543ea40 RSI: 00000000c4009420 RDI: 0000000000000003
[39.011270] RBP: 0000000000000003 R08: 0000000000000013 R09: 00007f11c16f94e0
[39.012581] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff25440df3
[39.014046] R13: 0000000000000000 R14: 00007fff2543ea40 R15: 0000000000000001
[39.015040]  </TASK>
[39.015418] ---[ end trace 0000000000000000 ]---
[43.131559] ------------[ cut here ]------------
[43.132234] kernel BUG at fs/btrfs/extent-tree.c:2717!
[43.133031] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[43.133702] CPU: 1 PID: 1839 Comm: btrfs Tainted: G        W         5.17.0-rc4 #54
[43.134863] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[43.136426] RIP: 0010:unpin_extent_range+0x37a/0x4f0 [btrfs]
[43.139913] RSP: 0000:ffffb0dd4216bc70 EFLAGS: 00010246
[43.140629] RAX: 0000000000000000 RBX: ffff96cfc34490f8 RCX: 0000000000000001
[43.141604] RDX: 0000000080000001 RSI: 0000000051d00000 RDI: 00000000ffffffff
[43.142645] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff96cfd07dca50
[43.143669] R10: ffff96cfc46e8a00 R11: fffffffffffec000 R12: 0000000041d00000
[43.144657] R13: ffff96cfc3ce0000 R14: ffffb0dd4216bd08 R15: 0000000000000000
[43.145686] FS:  00007f7657dd68c0(0000) GS:ffff96d6df640000(0000) knlGS:0000000000000000
[43.146808] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[43.147584] CR2: 00007f7fe81bf5b0 CR3: 00000001093ee004 CR4: 0000000000370ee0
[43.148589] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[43.149581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[43.150559] Call Trace:
[43.150904]  <TASK>
[43.151253]  btrfs_finish_extent_commit+0x88/0x290 [btrfs]
[43.152127]  btrfs_commit_transaction+0x74f/0xaa0 [btrfs]
[43.152932]  ? btrfs_attach_transaction_barrier+0x1e/0x50 [btrfs]
[43.153786]  btrfs_ioctl+0x1edc/0x2da0 [btrfs]
[43.154475]  ? __check_object_size+0x150/0x170
[43.155170]  ? preempt_count_add+0x49/0xa0
[43.155753]  ? __x64_sys_ioctl+0x84/0xc0
[43.156437]  ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
[43.157456]  __x64_sys_ioctl+0x84/0xc0
[43.157980]  do_syscall_64+0x3a/0x80
[43.158543]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[43.159231] RIP: 0033:0x7f7657f1e59b
[43.161819] RSP: 002b:00007ffda5cd1658 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[43.162702] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f7657f1e59b
[43.163526] RDX: 0000000000000000 RSI: 0000000000009408 RDI: 0000000000000003
[43.164358] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
[43.165208] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[43.166029] R13: 00005621b91c3232 R14: 00005621b91ba580 R15: 00007ffda5cd1800
[43.166863]  </TASK>
[43.167125] Modules linked in: btrfs blake2b_generic xor pata_acpi ata_piix libata raid6_pq scsi_mod libcrc32c virtio_net virtio_rng net_failover rng_core failover scsi_common
[43.169552] ---[ end trace 0000000000000000 ]---
[43.171226] RIP: 0010:unpin_extent_range+0x37a/0x4f0 [btrfs]
[43.174767] RSP: 0000:ffffb0dd4216bc70 EFLAGS: 00010246
[43.175600] RAX: 0000000000000000 RBX: ffff96cfc34490f8 RCX: 0000000000000001
[43.176468] RDX: 0000000080000001 RSI: 0000000051d00000 RDI: 00000000ffffffff
[43.177357] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff96cfd07dca50
[43.178271] R10: ffff96cfc46e8a00 R11: fffffffffffec000 R12: 0000000041d00000
[43.179178] R13: ffff96cfc3ce0000 R14: ffffb0dd4216bd08 R15: 0000000000000000
[43.180071] FS:  00007f7657dd68c0(0000) GS:ffff96d6df800000(0000) knlGS:0000000000000000
[43.181073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[43.181808] CR2: 00007fe09905f010 CR3: 00000001093ee004 CR4: 0000000000370ee0
[43.182706] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[43.183591] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

We first hit the WARN_ON(rc->block_group->pinned > 0) in
btrfs_relocate_block_group() and then the BUG_ON(!cache) in
unpin_extent_range(). This tells us that we are exiting relocation and
removing the block group with bytes still pinned for that block group.
This is supposed to be impossible: the last thing relocate_block_group()
does is commit the transaction to get rid of pinned extents.

Commit d0c2f4fa55 ("btrfs: make concurrent fsyncs wait less when
waiting for a transaction commit") introduced an optimization so that
commits from fsync don't have to wait for the previous commit to unpin
extents. This was only intended to affect fsync, but it inadvertently
made it possible for any commit to skip waiting for the previous commit
to unpin. This is because if a call to btrfs_commit_transaction() finds
that another thread is already committing the transaction, it waits for
the other thread to complete the commit and then returns. If that other
thread was in fsync, then it completes the commit without completing the
previous commit. This makes the following sequence of events possible:

Thread 1____________________|Thread 2 (fsync)_____________________|Thread 3 (balance)___________________
btrfs_commit_transaction(N) |                                     |
  btrfs_run_delayed_refs    |                                     |
    pin extents             |                                     |
  ...                       |                                     |
  state = UNBLOCKED         |btrfs_sync_file                      |
                            |  btrfs_start_transaction(N + 1)     |relocate_block_group
                            |                                     |  btrfs_join_transaction(N + 1)
                            |  btrfs_commit_transaction(N + 1)    |
  ...                       |  trans->state = COMMIT_START        |
                            |                                     |  btrfs_commit_transaction(N + 1)
                            |                                     |    wait_for_commit(N + 1, COMPLETED)
                            |  wait_for_commit(N, SUPER_COMMITTED)|
  state = SUPER_COMMITTED   |  ...                                |
  btrfs_finish_extent_commit|                                     |
    unpin_extent_range()    |  trans->state = COMPLETED           |
                            |                                     |    return
                            |                                     |
    ...                     |                                     |Thread 1 isn't done, so pinned > 0
                            |                                     |and we WARN
                            |                                     |
                            |                                     |btrfs_remove_block_group
    unpin_extent_range()    |                                     |
      Thread 3 removed the  |                                     |
      block group, so we BUG|                                     |

There are other sequences involving SUPER_COMMITTED transactions that
can cause a similar outcome.

We could fix this by making relocation explicitly wait for unpinning,
but there may be other cases that need it. Josef mentioned ENOSPC
flushing and the free space cache inode as other potential victims.
Rather than playing whack-a-mole, this fix is conservative and makes all
commits not in fsync wait for all previous transactions, which is what
the optimization intended.

Fixes: d0c2f4fa55 ("btrfs: make concurrent fsyncs wait less when waiting for a transaction commit")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-02 16:52:46 +01:00
Josef Bacik
b4be6aefa7 btrfs: do not start relocation until in progress drops are done
We hit a bug with a recovering relocation on mount for one of our file
systems in production.  I reproduced this locally by injecting errors
into snapshot delete with balance running at the same time.  This
presented as an error while looking up an extent item

  WARNING: CPU: 5 PID: 1501 at fs/btrfs/extent-tree.c:866 lookup_inline_extent_backref+0x647/0x680
  CPU: 5 PID: 1501 Comm: btrfs-balance Not tainted 5.16.0-rc8+ #8
  RIP: 0010:lookup_inline_extent_backref+0x647/0x680
  RSP: 0018:ffffae0a023ab960 EFLAGS: 00010202
  RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 000000000000000c RDI: 0000000000000000
  RBP: ffff943fd2a39b60 R08: 0000000000000000 R09: 0000000000000001
  R10: 0001434088152de0 R11: 0000000000000000 R12: 0000000001d05000
  R13: ffff943fd2a39b60 R14: ffff943fdb96f2a0 R15: ffff9442fc923000
  FS:  0000000000000000(0000) GS:ffff944e9eb40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f1157b1fca8 CR3: 000000010f092000 CR4: 0000000000350ee0
  Call Trace:
   <TASK>
   insert_inline_extent_backref+0x46/0xd0
   __btrfs_inc_extent_ref.isra.0+0x5f/0x200
   ? btrfs_merge_delayed_refs+0x164/0x190
   __btrfs_run_delayed_refs+0x561/0xfa0
   ? btrfs_search_slot+0x7b4/0xb30
   ? btrfs_update_root+0x1a9/0x2c0
   btrfs_run_delayed_refs+0x73/0x1f0
   ? btrfs_update_root+0x1a9/0x2c0
   btrfs_commit_transaction+0x50/0xa50
   ? btrfs_update_reloc_root+0x122/0x220
   prepare_to_merge+0x29f/0x320
   relocate_block_group+0x2b8/0x550
   btrfs_relocate_block_group+0x1a6/0x350
   btrfs_relocate_chunk+0x27/0xe0
   btrfs_balance+0x777/0xe60
   balance_kthread+0x35/0x50
   ? btrfs_balance+0xe60/0xe60
   kthread+0x16b/0x190
   ? set_kthread_struct+0x40/0x40
   ret_from_fork+0x22/0x30
   </TASK>

Normally snapshot deletion and relocation are excluded from running at
the same time by the fs_info->cleaner_mutex.  However if we had a
pending balance waiting to get the ->cleaner_mutex, and a snapshot
deletion was running, and then the box crashed, we would come up in a
state where we have a half deleted snapshot.

Again, in the normal case the snapshot deletion needs to complete before
relocation can start, but in this case relocation could very well start
before the snapshot deletion completes, as we simply add the root to the
dead roots list and wait for the next time the cleaner runs to clean up
the snapshot.

Fix this by setting a bit on the fs_info if we have any DEAD_ROOT's that
had a pending drop_progress key.  If they do then we know we were in the
middle of the drop operation and set a flag on the fs_info.  Then
balance can wait until this flag is cleared to start up again.

If there are DEAD_ROOT's that don't have a drop_progress set then we're
safe to start balance right away as we'll be properly protected by the
cleaner_mutex.

CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-02 16:52:39 +01:00
Su Yue
a6ab66eb85 btrfs: tree-checker: use u64 for item data end to avoid overflow
User reported there is an array-index-out-of-bounds access while
mounting the crafted image:

  [350.411942 ] loop0: detected capacity change from 0 to 262144
  [350.427058 ] BTRFS: device fsid a62e00e8-e94e-4200-8217-12444de93c2e devid 1 transid 8 /dev/loop0 scanned by systemd-udevd (1044)
  [350.428564 ] BTRFS info (device loop0): disk space caching is enabled
  [350.428568 ] BTRFS info (device loop0): has skinny extents
  [350.429589 ]
  [350.429619 ] UBSAN: array-index-out-of-bounds in fs/btrfs/struct-funcs.c:161:1
  [350.429636 ] index 1048096 is out of range for type 'page *[16]'
  [350.429650 ] CPU: 0 PID: 9 Comm: kworker/u8:1 Not tainted 5.16.0-rc4
  [350.429652 ] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  [350.429653 ] Workqueue: btrfs-endio-meta btrfs_work_helper [btrfs]
  [350.429772 ] Call Trace:
  [350.429774 ]  <TASK>
  [350.429776 ]  dump_stack_lvl+0x47/0x5c
  [350.429780 ]  ubsan_epilogue+0x5/0x50
  [350.429786 ]  __ubsan_handle_out_of_bounds+0x66/0x70
  [350.429791 ]  btrfs_get_16+0xfd/0x120 [btrfs]
  [350.429832 ]  check_leaf+0x754/0x1a40 [btrfs]
  [350.429874 ]  ? filemap_read+0x34a/0x390
  [350.429878 ]  ? load_balance+0x175/0xfc0
  [350.429881 ]  validate_extent_buffer+0x244/0x310 [btrfs]
  [350.429911 ]  btrfs_validate_metadata_buffer+0xf8/0x100 [btrfs]
  [350.429935 ]  end_bio_extent_readpage+0x3af/0x850 [btrfs]
  [350.429969 ]  ? newidle_balance+0x259/0x480
  [350.429972 ]  end_workqueue_fn+0x29/0x40 [btrfs]
  [350.429995 ]  btrfs_work_helper+0x71/0x330 [btrfs]
  [350.430030 ]  ? __schedule+0x2fb/0xa40
  [350.430033 ]  process_one_work+0x1f6/0x400
  [350.430035 ]  ? process_one_work+0x400/0x400
  [350.430036 ]  worker_thread+0x2d/0x3d0
  [350.430037 ]  ? process_one_work+0x400/0x400
  [350.430038 ]  kthread+0x165/0x190
  [350.430041 ]  ? set_kthread_struct+0x40/0x40
  [350.430043 ]  ret_from_fork+0x1f/0x30
  [350.430047 ]  </TASK>
  [350.430047 ]
  [350.430077 ] BTRFS warning (device loop0): bad eb member start: ptr 0xffe20f4e start 20975616 member offset 4293005178 size 2

btrfs check reports:
  corrupt leaf: root=3 block=20975616 physical=20975616 slot=1, unexpected
  item end, have 4294971193 expect 3897

The first slot item offset is 4293005033 and the size is 1966160.
In check_leaf, we use btrfs_item_end() to check item boundary versus
extent_buffer data size. However, return type of btrfs_item_end() is u32.
(u32)(4293005033 + 1966160) == 3897, overflow happens and the result 3897
equals to leaf data size reasonably.

Fix it by use u64 variable to store item data end in check_leaf() to
avoid u32 overflow.

This commit does solve the invalid memory access showed by the stack
trace.  However, its metadata profile is DUP and another copy of the
leaf is fine.  So the image can be mounted successfully. But when umount
is called, the ASSERT btrfs_mark_buffer_dirty() will be triggered
because the only node in extent tree has 0 item and invalid owner. It's
solved by another commit
"btrfs: check extent buffer owner against the owner rootid".

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215299
Reported-by: Wenqing Liu <wenqingliu0120@gmail.com>
CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Su Yue <l@damenly.su>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-02 16:52:32 +01:00
Josef Bacik
a50e1fcbc9 btrfs: do not WARN_ON() if we have PageError set
Whenever we do any extent buffer operations we call
assert_eb_page_uptodate() to complain loudly if we're operating on an
non-uptodate page.  Our overnight tests caught this warning earlier this
week

  WARNING: CPU: 1 PID: 553508 at fs/btrfs/extent_io.c:6849 assert_eb_page_uptodate+0x3f/0x50
  CPU: 1 PID: 553508 Comm: kworker/u4:13 Tainted: G        W         5.17.0-rc3+ #564
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  Workqueue: btrfs-cache btrfs_work_helper
  RIP: 0010:assert_eb_page_uptodate+0x3f/0x50
  RSP: 0018:ffffa961440a7c68 EFLAGS: 00010246
  RAX: 0017ffffc0002112 RBX: ffffe6e74453f9c0 RCX: 0000000000001000
  RDX: ffffe6e74467c887 RSI: ffffe6e74453f9c0 RDI: ffff8d4c5efc2fc0
  RBP: 0000000000000d56 R08: ffff8d4d4a224000 R09: 0000000000000000
  R10: 00015817fa9d1ef0 R11: 000000000000000c R12: 00000000000007b1
  R13: ffff8d4c5efc2fc0 R14: 0000000001500000 R15: 0000000001cb1000
  FS:  0000000000000000(0000) GS:ffff8d4dbbd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007ff31d3448d8 CR3: 0000000118be8004 CR4: 0000000000370ee0
  Call Trace:

   extent_buffer_test_bit+0x3f/0x70
   free_space_test_bit+0xa6/0xc0
   load_free_space_tree+0x1f6/0x470
   caching_thread+0x454/0x630
   ? rcu_read_lock_sched_held+0x12/0x60
   ? rcu_read_lock_sched_held+0x12/0x60
   ? rcu_read_lock_sched_held+0x12/0x60
   ? lock_release+0x1f0/0x2d0
   btrfs_work_helper+0xf2/0x3e0
   ? lock_release+0x1f0/0x2d0
   ? finish_task_switch.isra.0+0xf9/0x3a0
   process_one_work+0x26d/0x580
   ? process_one_work+0x580/0x580
   worker_thread+0x55/0x3b0
   ? process_one_work+0x580/0x580
   kthread+0xf0/0x120
   ? kthread_complete_and_exit+0x20/0x20
   ret_from_fork+0x1f/0x30

This was partially fixed by c2e3930529 ("btrfs: clear extent buffer
uptodate when we fail to write it"), however all that fix did was keep
us from finding extent buffers after a failed writeout.  It didn't keep
us from continuing to use a buffer that we already had found.

In this case we're searching the commit root to cache the block group,
so we can start committing the transaction and switch the commit root
and then start writing.  After the switch we can look up an extent
buffer that hasn't been written yet and start processing that block
group.  Then we fail to write that block out and clear Uptodate on the
page, and then we start spewing these errors.

Normally we're protected by the tree lock to a certain degree here.  If
we read a block we have that block read locked, and we block the writer
from locking the block before we submit it for the write.  However this
isn't necessarily fool proof because the read could happen before we do
the submit_bio and after we locked and unlocked the extent buffer.

Also in this particular case we have path->skip_locking set, so that
won't save us here.  We'll simply get a block that was valid when we
read it, but became invalid while we were using it.

What we really want is to catch the case where we've "read" a block but
it's not marked Uptodate.  On read we ClearPageError(), so if we're
!Uptodate and !Error we know we didn't do the right thing for reading
the page.

Fix this by checking !Uptodate && !Error, this way we will not complain
if our buffer gets invalidated while we're using it, and we'll maintain
the spirit of the check which is to make sure we have a fully in-cache
block while we're messing with it.

CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-02 16:52:24 +01:00
Filipe Manana
d994788743 btrfs: fix lost prealloc extents beyond eof after full fsync
When doing a full fsync, if we have prealloc extents beyond (or at) eof,
and the leaves that contain them were not modified in the current
transaction, we end up not logging them. This results in losing those
extents when we replay the log after a power failure, since the inode is
truncated to the current value of the logged i_size.

Just like for the fast fsync path, we need to always log all prealloc
extents starting at or beyond i_size. The fast fsync case was fixed in
commit 471d557afe ("Btrfs: fix loss of prealloc extents past i_size
after fsync log replay") but it missed the full fsync path. The problem
exists since the very early days, when the log tree was added by
commit e02119d5a7 ("Btrfs: Add a write ahead tree log to optimize
synchronous operations").

Example reproducer:

  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt

  # Create our test file with many file extent items, so that they span
  # several leaves of metadata, even if the node/page size is 64K. Use
  # direct IO and not fsync/O_SYNC because it's both faster and it avoids
  # clearing the full sync flag from the inode - we want the fsync below
  # to trigger the slow full sync code path.
  $ xfs_io -f -d -c "pwrite -b 4K 0 16M" /mnt/foo

  # Now add two preallocated extents to our file without extending the
  # file's size. One right at i_size, and another further beyond, leaving
  # a gap between the two prealloc extents.
  $ xfs_io -c "falloc -k 16M 1M" /mnt/foo
  $ xfs_io -c "falloc -k 20M 1M" /mnt/foo

  # Make sure everything is durably persisted and the transaction is
  # committed. This makes all created extents to have a generation lower
  # than the generation of the transaction used by the next write and
  # fsync.
  sync

  # Now overwrite only the first extent, which will result in modifying
  # only the first leaf of metadata for our inode. Then fsync it. This
  # fsync will use the slow code path (inode full sync bit is set) because
  # it's the first fsync since the inode was created/loaded.
  $ xfs_io -c "pwrite 0 4K" -c "fsync" /mnt/foo

  # Extent list before power failure.
  $ xfs_io -c "fiemap -v" /mnt/foo
  /mnt/foo:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..7]:          2178048..2178055     8   0x0
     1: [8..16383]:      26632..43007     16376   0x0
     2: [16384..32767]:  2156544..2172927 16384   0x0
     3: [32768..34815]:  2172928..2174975  2048 0x800
     4: [34816..40959]:  hole              6144
     5: [40960..43007]:  2174976..2177023  2048 0x801

  <power fail>

  # Mount fs again, trigger log replay.
  $ mount /dev/sdc /mnt

  # Extent list after power failure and log replay.
  $ xfs_io -c "fiemap -v" /mnt/foo
  /mnt/foo:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..7]:          2178048..2178055     8   0x0
     1: [8..16383]:      26632..43007     16376   0x0
     2: [16384..32767]:  2156544..2172927 16384   0x1

  # The prealloc extents at file offsets 16M and 20M are missing.

So fix this by calling btrfs_log_prealloc_extents() when we are doing a
full fsync, so that we always log all prealloc extents beyond eof.

A test case for fstests will follow soon.

CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-02 16:51:55 +01:00
Qu Wenruo
c992fa1fd5 btrfs: subpage: fix a wrong check on subpage->writers
[BUG]
When looping btrfs/074 with 64K page size and 4K sectorsize, there is a
low chance (1/50~1/100) to crash with the following ASSERT() triggered
in btrfs_subpage_start_writer():

	ret = atomic_add_return(nbits, &subpage->writers);
	ASSERT(ret == nbits); <<< This one <<<

[CAUSE]
With more debugging output on the parameters of
btrfs_subpage_start_writer(), it shows a very concerning error:

  ret=29 nbits=13 start=393216 len=53248

For @nbits it's correct, but @ret which is the returned value from
atomic_add_return(), it's not only larger than nbits, but also larger
than max sectors per page value (for 64K page size and 4K sector size,
it's 16).

This indicates that some call sites are not properly decreasing the value.

And that's exactly the case, in btrfs_page_unlock_writer(), due to the
fact that we can have page locked either by lock_page() or
process_one_page(), we have to check if the subpage has any writer.

If no writers, it's locked by lock_page() and we only need to unlock it.

But unfortunately the check for the writers are completely opposite:

	if (atomic_read(&subpage->writers))
		/* No writers, locked by plain lock_page() */
		return unlock_page(page);

We directly unlock the page if it has writers, which is the completely
opposite what we want.

Thankfully the affected call site is only limited to
extent_write_locked_range(), so it's mostly affecting compressed write.

[FIX]
Just fix the wrong check condition to fix the bug.

Fixes: e55a0de185 ("btrfs: rework page locking in __extent_writepage()")
CC: stable@vger.kernel.org # 5.16
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-02 16:51:39 +01:00
Hans de Goede
342e7c6ea5 staging: rtl8723bs: Improve the comment explaining the locking rules
rtw_mlme.h has a comment which briefly describes the locking rules for
the rtl8723bs driver, improve this to also mention the locking order
of xmit_priv.lock vs the lock(s) embedded in the various queues.

Cc: Fabio Aiuto <fabioaiuto83@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220302101637.26542-2-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-02 16:38:24 +01:00
Hans de Goede
8f4347081b staging: rtl8723bs: Fix access-point mode deadlock
Commit 54659ca026 ("staging: rtl8723bs: remove possible deadlock when
disconnect (v2)") split the locking of pxmitpriv->lock vs sleep_q/lock
into 2 locks in attempt to fix a lockdep reported issue with the locking
order of the sta_hash_lock vs pxmitpriv->lock.

But in the end this turned out to not fully solve the sta_hash_lock issue
so commit a7ac783c33 ("staging: rtl8723bs: remove a second possible
deadlock") was added to fix this in another way.

The original fix was kept as it was still seen as a good thing to have,
but now it turns out that it creates a deadlock in access-point mode:

[Feb20 23:47] ======================================================
[  +0.074085] WARNING: possible circular locking dependency detected
[  +0.074077] 5.16.0-1-amd64 #1 Tainted: G         C  E
[  +0.064710] ------------------------------------------------------
[  +0.074075] ksoftirqd/3/29 is trying to acquire lock:
[  +0.060542] ffffb8b30062ab00 (&pxmitpriv->lock){+.-.}-{2:2}, at: rtw_xmit_classifier+0x8a/0x140 [r8723bs]
[  +0.114921]
              but task is already holding lock:
[  +0.069908] ffffb8b3007ab704 (&psta->sleep_q.lock){+.-.}-{2:2}, at: wakeup_sta_to_xmit+0x3b/0x300 [r8723bs]
[  +0.116976]
              which lock already depends on the new lock.

[  +0.098037]
              the existing dependency chain (in reverse order) is:
[  +0.089704]
              -> #1 (&psta->sleep_q.lock){+.-.}-{2:2}:
[  +0.077232]        _raw_spin_lock_bh+0x34/0x40
[  +0.053261]        xmitframe_enqueue_for_sleeping_sta+0xc1/0x2f0 [r8723bs]
[  +0.082572]        rtw_xmit+0x58b/0x940 [r8723bs]
[  +0.056528]        _rtw_xmit_entry+0xba/0x350 [r8723bs]
[  +0.062755]        dev_hard_start_xmit+0xf1/0x320
[  +0.056381]        sch_direct_xmit+0x9e/0x360
[  +0.052212]        __dev_queue_xmit+0xce4/0x1080
[  +0.055334]        ip6_finish_output2+0x18f/0x6e0
[  +0.056378]        ndisc_send_skb+0x2c8/0x870
[  +0.052209]        ndisc_send_ns+0xd3/0x210
[  +0.050130]        addrconf_dad_work+0x3df/0x5a0
[  +0.055338]        process_one_work+0x274/0x5a0
[  +0.054296]        worker_thread+0x52/0x3b0
[  +0.050124]        kthread+0x16c/0x1a0
[  +0.044925]        ret_from_fork+0x1f/0x30
[  +0.049092]
              -> #0 (&pxmitpriv->lock){+.-.}-{2:2}:
[  +0.074101]        __lock_acquire+0x10f5/0x1d80
[  +0.054298]        lock_acquire+0xd7/0x300
[  +0.049088]        _raw_spin_lock_bh+0x34/0x40
[  +0.053248]        rtw_xmit_classifier+0x8a/0x140 [r8723bs]
[  +0.066949]        rtw_xmitframe_enqueue+0xa/0x20 [r8723bs]
[  +0.066946]        rtl8723bs_hal_xmitframe_enqueue+0x14/0x50 [r8723bs]
[  +0.078386]        wakeup_sta_to_xmit+0xa6/0x300 [r8723bs]
[  +0.065903]        rtw_recv_entry+0xe36/0x1160 [r8723bs]
[  +0.063809]        rtl8723bs_recv_tasklet+0x349/0x6c0 [r8723bs]
[  +0.071093]        tasklet_action_common.constprop.0+0xe5/0x110
[  +0.070966]        __do_softirq+0x16f/0x50a
[  +0.050134]        __irq_exit_rcu+0xeb/0x140
[  +0.051172]        irq_exit_rcu+0xa/0x20
[  +0.047006]        common_interrupt+0xb8/0xd0
[  +0.052214]        asm_common_interrupt+0x1e/0x40
[  +0.056381]        finish_task_switch.isra.0+0x100/0x3a0
[  +0.063670]        __schedule+0x3ad/0xd20
[  +0.048047]        schedule+0x4e/0xc0
[  +0.043880]        smpboot_thread_fn+0xc4/0x220
[  +0.054298]        kthread+0x16c/0x1a0
[  +0.044922]        ret_from_fork+0x1f/0x30
[  +0.049088]
              other info that might help us debug this:

[  +0.095950]  Possible unsafe locking scenario:

[  +0.070952]        CPU0                    CPU1
[  +0.054282]        ----                    ----
[  +0.054285]   lock(&psta->sleep_q.lock);
[  +0.047004]                                lock(&pxmitpriv->lock);
[  +0.074082]                                lock(&psta->sleep_q.lock);
[  +0.077209]   lock(&pxmitpriv->lock);
[  +0.043873]
               *** DEADLOCK ***

[  +0.070950] 1 lock held by ksoftirqd/3/29:
[  +0.049082]  #0: ffffb8b3007ab704 (&psta->sleep_q.lock){+.-.}-{2:2}, at: wakeup_sta_to_xmit+0x3b/0x300 [r8723bs]

Analysis shows that in hindsight the splitting of the lock was not
a good idea, so revert this to fix the access-point mode deadlock.

Note this is a straight-forward revert done with git revert, the commented
out "/* spin_lock_bh(&psta_bmc->sleep_q.lock); */" lines were part of the
code before the reverted changes.

Fixes: 54659ca026 ("staging: rtl8723bs: remove possible deadlock when disconnect (v2)")
Cc: stable <stable@vger.kernel.org>
Cc: Fabio Aiuto <fabioaiuto83@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215542
Link: https://lore.kernel.org/r/20220302101637.26542-1-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-02 16:38:19 +01:00
Gao Xiang
22ba5e99b9 erofs: fix ztailpacking on > 4GiB filesystems
z_idataoff here is an absolute physical offset, so it should use
erofs_off_t (64 bits at least). Otherwise, it'll get trimmed and
cause the decompresion failure.

Link: https://lore.kernel.org/r/20220222033118.20540-1-hsiangkao@linux.alibaba.com
Fixes: ab92184ff8 ("erofs: add on-disk compressed tail-packing inline support")
Reviewed-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-03-02 21:58:45 +08:00
Zhen Ni
0aa6b294b3 ALSA: intel_hdmi: Fix reference to PCM buffer address
PCM buffers might be allocated dynamically when the buffer
preallocation failed or a larger buffer is requested, and it's not
guaranteed that substream->dma_buffer points to the actually used
buffer.  The driver needs to refer to substream->runtime->dma_addr
instead for the buffer address.

Signed-off-by: Zhen Ni <nizhen@uniontech.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220302074241.30469-1-nizhen@uniontech.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-03-02 09:25:37 +01:00
Sven Eckelmann
6c1f41afc1 batman-adv: Don't expect inter-netns unique iflink indices
The ifindex doesn't have to be unique for multiple network namespaces on
the same machine.

  $ ip netns add test1
  $ ip -net test1 link add dummy1 type dummy
  $ ip netns add test2
  $ ip -net test2 link add dummy2 type dummy

  $ ip -net test1 link show dev dummy1
  6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/ether 96:81:55:1e:dd:85 brd ff:ff:ff:ff:ff:ff
  $ ip -net test2 link show dev dummy2
  6: dummy2: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/ether 5a:3c:af:35:07:c3 brd ff:ff:ff:ff:ff:ff

But the batman-adv code to walk through the various layers of virtual
interfaces uses this assumption because dev_get_iflink handles it
internally and doesn't return the actual netns of the iflink. And
dev_get_iflink only documents the situation where ifindex == iflink for
physical devices.

But only checking for dev->netdev_ops->ndo_get_iflink is also not an option
because ipoib_get_iflink implements it even when it sometimes returns an
iflink != ifindex and sometimes iflink == ifindex. The caller must
therefore make sure itself to check both netns and iflink + ifindex for
equality. Only when they are equal, a "physical" interface was detected
which should stop the traversal. On the other hand, vxcan_get_iflink can
also return 0 in case there was currently no valid peer. In this case, it
is still necessary to stop.

Fixes: b7eddd0b39 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Fixes: 5ed4a460a1 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2022-03-02 09:24:55 +01:00
Sven Eckelmann
6116ba0942 batman-adv: Request iflink once in batadv_get_real_netdevice
There is no need to call dev_get_iflink multiple times for the same
net_device in batadv_get_real_netdevice. And since some of the
ndo_get_iflink callbacks are dynamic (for example via RCUs like in
vxcan_get_iflink), it could easily happen that the returned values are not
stable. The pre-checks before __dev_get_by_index are then of course bogus.

Fixes: 5ed4a460a1 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2022-03-02 09:01:28 +01:00
Sven Eckelmann
690bb6fb64 batman-adv: Request iflink once in batadv-on-batadv check
There is no need to call dev_get_iflink multiple times for the same
net_device in batadv_is_on_batman_iface. And since some of the
.ndo_get_iflink callbacks are dynamic (for example via RCUs like in
vxcan_get_iflink), it could easily happen that the returned values are not
stable. The pre-checks before __dev_get_by_index are then of course bogus.

Fixes: b7eddd0b39 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2022-03-02 09:01:25 +01:00
Hans de Goede
04b7762e37 Input: elan_i2c - fix regulator enable count imbalance after suspend/resume
Before these changes elan_suspend() would only disable the regulator
when device_may_wakeup() returns false; whereas elan_resume() would
unconditionally enable it, leading to an enable count imbalance when
device_may_wakeup() returns true.

This triggers the "WARN_ON(regulator->enable_count)" in regulator_put()
when the elan_i2c driver gets unbound, this happens e.g. with the
hot-plugable dock with Elan I2C touchpad for the Asus TF103C 2-in-1.

Fix this by making the regulator_enable() call also be conditional
on device_may_wakeup() returning false.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220131135436.29638-2-hdegoede@redhat.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-03-01 20:41:22 -08:00
Hans de Goede
81a36d8ce5 Input: elan_i2c - move regulator_[en|dis]able() out of elan_[en|dis]able_power()
elan_disable_power() is called conditionally on suspend, where as
elan_enable_power() is always called on resume. This leads to
an imbalance in the regulator's enable count.

Move the regulator_[en|dis]able() calls out of elan_[en|dis]able_power()
in preparation of fixing this.

No functional changes intended.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220131135436.29638-1-hdegoede@redhat.com
[dtor: consolidate elan_[en|dis]able() into elan_set_power()]
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-03-01 20:41:20 -08:00
Steven Rostedt (Google)
1d1898f656 tracing/histogram: Fix sorting on old "cpu" value
When trying to add a histogram against an event with the "cpu" field, it
was impossible due to "cpu" being a keyword to key off of the running CPU.
So to fix this, it was changed to "common_cpu" to match the other generic
fields (like "common_pid"). But since some scripts used "cpu" for keying
off of the CPU (for events that did not have "cpu" as a field, which is
most of them), a backward compatibility trick was added such that if "cpu"
was used as a key, and the event did not have "cpu" as a field name, then
it would fallback and switch over to "common_cpu".

This fix has a couple of subtle bugs. One was that when switching over to
"common_cpu", it did not change the field name, it just set a flag. But
the code still found a "cpu" field. The "cpu" field is used for filtering
and is returned when the event does not have a "cpu" field.

This was found by:

  # cd /sys/kernel/tracing
  # echo hist:key=cpu,pid:sort=cpu > events/sched/sched_wakeup/trigger
  # cat events/sched/sched_wakeup/hist

Which showed the histogram unsorted:

{ cpu:         19, pid:       1175 } hitcount:          1
{ cpu:          6, pid:        239 } hitcount:          2
{ cpu:         23, pid:       1186 } hitcount:         14
{ cpu:         12, pid:        249 } hitcount:          2
{ cpu:          3, pid:        994 } hitcount:          5

Instead of hard coding the "cpu" checks, take advantage of the fact that
trace_event_field_field() returns a special field for "cpu" and "CPU" if
the event does not have "cpu" as a field. This special field has the
"filter_type" of "FILTER_CPU". Check that to test if the returned field is
of the CPU type instead of doing the string compare.

Also, fix the sorting bug by testing for the hist_field flag of
HIST_FIELD_FL_CPU when setting up the sort routine. Otherwise it will use
the special CPU field to know what compare routine to use, and since that
special field does not have a size, it returns tracing_map_cmp_none.

Cc: stable@vger.kernel.org
Fixes: 1e3bac71c5 ("tracing/histogram: Rename "cpu" to "common_cpu"")
Reported-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-01 22:48:30 -05:00
Vladimir Oltean
0b0e2ff103 net: dsa: restore error path of dsa_tree_change_tag_proto
When the DSA_NOTIFIER_TAG_PROTO returns an error, the user space process
which initiated the protocol change exits the kernel processing while
still holding the rtnl_mutex. So any other process attempting to lock
the rtnl_mutex would deadlock after such event.

The error handling of DSA_NOTIFIER_TAG_PROTO was inadvertently changed
by the blamed commit, introducing this regression. We must still call
rtnl_unlock(), and we must still call DSA_NOTIFIER_TAG_PROTO for the old
protocol. The latter is due to the limiting design of notifier chains
for cross-chip operations, which don't have a built-in error recovery
mechanism - we should look into using notifier_call_chain_robust for that.

Fixes: dc452a471d ("net: dsa: introduce tagger-owned storage for private and shared data")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220228141715.146485-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-01 18:26:21 -08:00
Jakub Kicinski
2e77551c61 Merge tag 'for-net-2022-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - Fix regression with scanning not working in some systems.

* tag 'for-net-2022-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: Fix not checking MGMT cmd pending queue
====================

Link: https://lore.kernel.org/r/20220302004330.125536-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-01 17:16:46 -08:00
Brian Gix
275f3f6487 Bluetooth: Fix not checking MGMT cmd pending queue
A number of places in the MGMT handlers we examine the command queue for
other commands (in progress but not yet complete) that will interact
with the process being performed. However, not all commands go into the
queue if one of:

1. There is no negative side effect of consecutive or redundent commands
2. The command is entirely perform "inline".

This change examines each "pending command" check, and if it is not
needed, deletes the check. Of the remaining pending command checks, we
make sure that the command is in the pending queue by using the
mgmt_pending_add/mgmt_pending_remove pair rather than the
mgmt_pending_new/mgmt_pending_free pair.

Link: https://lore.kernel.org/linux-bluetooth/f648f2e11bb3c2974c32e605a85ac3a9fac944f1.camel@redhat.com/T/
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Brian Gix <brian.gix@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-03-01 16:10:58 -08:00
Jakub Kicinski
4761df52f1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Use kfree_rcu(ptr, rcu) variant, using kfree_rcu(ptr) was not
   intentional. From Eric Dumazet.

2) Use-after-free in netfilter hook core, from Eric Dumazet.

3) Missing rcu read lock side for netfilter egress hook,
   from Florian Westphal.

4) nf_queue assume state->sk is full socket while it might not be.
   Invoke sock_gen_put(), from Florian Westphal.

5) Add selftest to exercise the reported KASAN splat in 4)

6) Fix possible use-after-free in nf_queue in case sk_refcnt is 0.
   Also from Florian.

7) Use input interface index only for hardware offload, not for
   the software plane. This breaks tc ct action. Patch from Paul Blakey.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  net/sched: act_ct: Fix flow table lookup failure with no originating ifindex
  netfilter: nf_queue: handle socket prefetch
  netfilter: nf_queue: fix possible use-after-free
  selftests: netfilter: add nfqueue TCP_NEW_SYN_RECV socket race test
  netfilter: nf_queue: don't assume sk is full socket
  netfilter: egress: silence egress hook lockdep splats
  netfilter: fix use-after-free in __nf_register_net_hook()
  netfilter: nf_tables: prefer kfree_rcu(ptr, rcu) variant
====================

Link: https://lore.kernel.org/r/20220301215337.378405-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-01 15:13:47 -08:00
Dan Carpenter
fc7f750dc9 staging: gdm724x: fix use after free in gdm_lte_rx()
The netif_rx_ni() function frees the skb so we can't dereference it to
save the skb->len.

Fixes: 61e1210476 ("staging: gdm7240: adding LTE USB driver")
Cc: stable <stable@vger.kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220228074331.GA13685@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-01 22:42:32 +01:00
Pali Rohár
a0e897d1b3 arm64: dts: armada-3720-turris-mox: Add missing ethernet0 alias
U-Boot uses ethernet* aliases for setting MAC addresses. Therefore define
also alias for ethernet0.

Fixes: 7109d817db ("arm64: dts: marvell: add DTS for Turris Mox")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-01 22:13:07 +01:00
Paul Blakey
db6140e5e3 net/sched: act_ct: Fix flow table lookup failure with no originating ifindex
After cited commit optimizted hw insertion, flow table entries are
populated with ifindex information which was intended to only be used
for HW offload. This tuple ifindex is hashed in the flow table key, so
it must be filled for lookup to be successful. But tuple ifindex is only
relevant for the netfilter flowtables (nft), so it's not filled in
act_ct flow table lookup, resulting in lookup failure, and no SW
offload and no offload teardown for TCP connection FIN/RST packets.

To fix this, add new tc ifindex field to tuple, which will
only be used for offloading, not for lookup, as it will not be
part of the tuple hash.

Fixes: 9795ded7f9 ("net/sched: act_ct: Fill offloading tuple iifidx")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-03-01 22:08:31 +01:00
Linus Torvalds
fb184c4af9 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "The bigger part of the change is a revert for x86 hosts. Here the
  second patch was supposed to fix the first, but in reality it was just
  as broken, so both have to go.

  x86 host:

   - Revert incorrect assumption that cr3 changes come with preempt
     notifier callbacks (they don't when static branches are changed,
     for example)

  ARM host:

   - Correctly synchronise PMR and co on PSCI CPU_SUSPEND

   - Skip tests that depend on GICv3 when the HW isn't available"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: aarch64: Skip tests if we can't create a vgic-v3
  Revert "KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()"
  Revert "KVM: VMX: Save HOST_CR3 in vmx_set_host_fs_gs()"
  KVM: arm64: Don't miss pending interrupts for suspended vCPU
2022-03-01 12:01:18 -08:00
Heiko Carstens
c194dad210 s390/extable: fix exception table sorting
s390 has a swap_ex_entry_fixup function, however it is not being used
since common code expects a swap_ex_entry_fixup define. If it is not
defined the default implementation will be used. So fix this by adding
a proper define.
However also the implementation of the function must be fixed, since a
NULL value for handler has a special meaning and must not be adjusted.

Luckily all of this doesn't fix a real bug currently: the main extable
is correctly sorted during build time, and for runtime sorting there
is currently no case where the handler field is not NULL.

Fixes: 05a68e892e ("s390/kernel: expand exception table logic to allow new handling options")
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-01 20:41:28 +01:00
Heiko Carstens
1389f17937 s390/ftrace: fix arch_ftrace_get_regs implementation
arch_ftrace_get_regs is supposed to return a struct pt_regs pointer
only if the pt_regs structure contains all register contents, which
means it must have been populated when created via ftrace_regs_caller.

If it was populated via ftrace_caller the contents are not complete
(the psw mask part is missing), and therefore a NULL pointer needs be
returned.

The current code incorrectly always returns a struct pt_regs pointer.

Fix this by adding another pt_regs flag which indicates if the
contents are complete, and fix arch_ftrace_get_regs accordingly.

Fixes: 894979689d ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller implementations")
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-01 20:41:28 +01:00
Heiko Carstens
9fa881f7e3 s390/ftrace: fix ftrace_caller/ftrace_regs_caller generation
ftrace_caller was used for both ftrace_caller and ftrace_regs_caller,
which means that the target address of the hotpatch trampoline was
never updated.

With commit 894979689d ("s390/ftrace: provide separate
ftrace_caller/ftrace_regs_caller implementations") a separate
ftrace_regs_caller entry point was implemeted, however it was
forgotten to implement the necessary changes for ftrace_modify_call
and ftrace_make_call, where the branch target has to be modified
accordingly.

Therefore add the missing code now.

Fixes: 894979689d ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller implementations")
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-01 20:41:28 +01:00
Alexander Egorenkov
6b4b54c7ca s390/setup: preserve memory at OLDMEM_BASE and OLDMEM_SIZE
We need to preserve the values at OLDMEM_BASE and OLDMEM_SIZE which are
used by zgetdump in case when kdump crashes. In that case zgetdump will
attempt to read OLDMEM_BASE and OLDMEM_SIZE in order to find out where
the memory range [0 - OLDMEM_SIZE] belonging to the production kernel is.

Fixes: f1a5469474 ("s390/setup: don't reserve memory that occupied decompressor's head")
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-01 20:41:28 +01:00
Manasi Navare
62929726ef drm/vrr: Set VRR capable prop only if it is attached to connector
VRR capable property is not attached by default to the connector
It is attached only if VRR is supported.
So if the driver tries to call drm core set prop function without
it being attached that causes NULL dereference.

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.navare@intel.com
2022-03-01 11:37:21 -08:00
Linus Torvalds
5751153606 Merge tag 'binfmt_elf-v5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull binfmt_elf fix from Kees Cook:
 "This addresses a regression[1] under ia64 where some ET_EXEC binaries
  were not loading"

Link: https://linux-regtracking.leemhuis.info/regzbot/regression/a3edd529-c42d-3b09-135c-7e98a15b150f@leemhuis.info/ [1]

- Fix ia64 ET_EXEC loading

* tag 'binfmt_elf-v5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  binfmt_elf: Avoid total_mapping_size for ET_EXEC
2022-03-01 11:31:37 -08:00
Kees Cook
439a846824 binfmt_elf: Avoid total_mapping_size for ET_EXEC
Partially revert commit 5f501d5556 ("binfmt_elf: reintroduce using
MAP_FIXED_NOREPLACE"), which applied the ET_DYN "total_mapping_size"
logic also to ET_EXEC.

At least ia64 has ET_EXEC PT_LOAD segments that are not virtual-address
contiguous (but _are_ file-offset contiguous). This would result in a
giant mapping attempting to cover the entire span, including the virtual
address range hole, and well beyond the size of the ELF file itself,
causing the kernel to refuse to load it. For example:

$ readelf -lW /usr/bin/gcc
...
Program Headers:
  Type Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   ...
...
  LOAD 0x000000 0x4000000000000000 0x4000000000000000 0x00b5a0 0x00b5a0 ...
  LOAD 0x00b5a0 0x600000000000b5a0 0x600000000000b5a0 0x0005ac 0x000710 ...
...
       ^^^^^^^^ ^^^^^^^^^^^^^^^^^^                    ^^^^^^^^ ^^^^^^^^

File offset range     : 0x000000-0x00bb4c
			0x00bb4c bytes

Virtual address range : 0x4000000000000000-0x600000000000bcb0
			0x200000000000bcb0 bytes

Remove the total_mapping_size logic for ET_EXEC, which reduces the
ET_EXEC MAP_FIXED_NOREPLACE coverage to only the first PT_LOAD (better
than nothing), and retains it for ET_DYN.

Ironically, this is the reverse of the problem that originally caused
problems with MAP_FIXED_NOREPLACE: overlapping PT_LOAD segments. Future
work could restore full coverage if load_elf_binary() were to perform
mappings in a separate phase from the loading (where it could resolve
both overlaps and holes).

Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
Reported-by: matoro <matoro_bugzilla_kernel@matoro.tk>
Fixes: 5f501d5556 ("binfmt_elf: reintroduce using MAP_FIXED_NOREPLACE")
Link: https://lore.kernel.org/r/a3edd529-c42d-3b09-135c-7e98a15b150f@leemhuis.info
Tested-by: matoro <matoro_mailinglist_kernel@matoro.tk>
Link: https://lore.kernel.org/lkml/ce8af9c13bcea9230c7689f3c1e0e2cd@matoro.tk
Tested-By: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/lkml/49182d0d-708b-4029-da5f-bc18603440a6@physik.fu-berlin.de
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-03-01 10:29:20 -08:00
Nicolas Cavallari
5838a14832 thermal: core: Fix TZ_GET_TRIP NULL pointer dereference
Do not call get_trip_hyst() from thermal_genl_cmd_tz_get_trip() if
the thermal zone does not define one.

Fixes: 1ce50e7d40 ("thermal: core: genetlink support for events/cmd/sampling")
Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 16:11:38 +01:00
Jia-Ju Bai
fe23b6bbea HID: nintendo: check the return value of alloc_workqueue()
The function alloc_workqueue() in nintendo_hid_probe() can fail, but
there is no check of its return value. To fix this bug, its return value
should be checked with new error handling code.

Fixes: c4eae84fef ("HID: nintendo: add rumble support")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Silvan Jegen <s.jegen@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-03-01 15:56:43 +01:00
David S. Miller
b8d06ce712 Merge tag 'wireless-for-net-2022-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
johannes Berg says:

====================

Some last-minute fixes:
 * rfkill
   - add missing rfill_soft_blocked() when disabled

 * cfg80211
   - handle a nla_memdup() failure correctly
   - fix CONFIG_CFG80211_EXTRA_REGDB_KEYDIR typo in
     Makefile

 * mac80211
   - fix EAPOL handling in 802.3 RX path
   - reject setting up aggregation sessions before
     connection is authorized to avoid timeouts or
     similar
   - handle some SAE authentication steps correctly
   - fix AC selection in mesh forwarding

 * iwlwifi
   - remove TWT support as it causes firmware crashes
     when the AP isn't behaving correctly
   - check debugfs pointer before dereferncing it
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01 14:45:55 +00:00
Dmitry Torokhov
cc71d37fd1 HID: vivaldi: fix sysfs attributes leak
The driver creates the top row map sysfs attribute in input_configured()
method; unfortunately we do not have a callback that is executed when HID
interface is unbound, thus we are leaking these sysfs attributes, for
example when device is disconnected.

To fix it let's switch to managed version of adding sysfs attributes which
will ensure that they are destroyed when the driver is unbound.

Fixes: 14c9c014ba ("HID: add vivaldi HID driver")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-03-01 15:31:17 +01:00
Johannes Berg
a12f76345e cfg80211: fix CONFIG_CFG80211_EXTRA_REGDB_KEYDIR typo
The kbuild change here accidentally removed not only the
unquoting, but also the last character of the variable
name. Fix that.

Fixes: 129ab0d2d9 ("kbuild: do not quote string values in include/config/auto.conf")
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220221155512.1d25895f7c5f.I50fa3d4189fcab90a2896fe8cae215035dae9508@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-01 14:10:14 +01:00
Daniel Palmer
ea49432d18 ARM: mstar: Select HAVE_ARM_ARCH_TIMER
The mstar SoCs have an arch timer but HAVE_ARM_ARCH_TIMER wasn't
selected. If MSC313E_TIMER isn't selected then the kernel gets
stuck at boot because there are no timers available.

Signed-off-by: Daniel Palmer <daniel@0x0f.com>
Link: https://lore.kernel.org/r/20220301104349.3040422-1-daniel@0x0f.com'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-01 13:26:32 +01:00
Lina Wang
4ff2980b6b xfrm: fix tunnel model fragmentation behavior
in tunnel mode, if outer interface(ipv4) is less, it is easily to let
inner IPV6 mtu be less than 1280. If so, a Packet Too Big ICMPV6 message
is received. When send again, packets are fragmentized with 1280, they
are still rejected with ICMPV6(Packet Too Big) by xfrmi_xmit2().

According to RFC4213 Section3.2.2:
if (IPv4 path MTU - 20) is less than 1280
	if packet is larger than 1280 bytes
		Send ICMPv6 "packet too big" with MTU=1280
                Drop packet
        else
		Encapsulate but do not set the Don't Fragment
                flag in the IPv4 header.  The resulting IPv4
                packet might be fragmented by the IPv4 layer
                on the encapsulator or by some router along
                the IPv4 path.
	endif
else
	if packet is larger than (IPv4 path MTU - 20)
        	Send ICMPv6 "packet too big" with
                MTU = (IPv4 path MTU - 20).
                Drop packet.
        else
                Encapsulate and set the Don't Fragment flag
                in the IPv4 header.
        endif
endif
Packets should be fragmentized with ipv4 outer interface, so change it.

After it is fragemtized with ipv4, there will be double fragmenation.
No.48 & No.51 are ipv6 fragment packets, No.48 is double fragmentized,
then tunneled with IPv4(No.49& No.50), which obey spec. And received peer
cannot decrypt it rightly.

48              2002::10        2002::11 1296(length) IPv6 fragment (off=0 more=y ident=0xa20da5bc nxt=50)
49   0x0000 (0) 2002::10        2002::11 1304         IPv6 fragment (off=0 more=y ident=0x7448042c nxt=44)
50   0x0000 (0) 2002::10        2002::11 200          ESP (SPI=0x00035000)
51              2002::10        2002::11 180          Echo (ping) request
52   0x56dc     2002::10        2002::11 248          IPv6 fragment (off=1232 more=n ident=0xa20da5bc nxt=50)

xfrm6_noneed_fragment has fixed above issues. Finally, it acted like below:
1   0x6206 192.168.1.138   192.168.1.1 1316 Fragmented IP protocol (proto=Encap Security Payload 50, off=0, ID=6206) [Reassembled in #2]
2   0x6206 2002::10        2002::11    88   IPv6 fragment (off=0 more=y ident=0x1f440778 nxt=50)
3   0x0000 2002::10        2002::11    248  ICMPv6    Echo (ping) request

Signed-off-by: Lina Wang <lina.wang@mediatek.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-03-01 12:08:40 +01:00
Florian Westphal
3b836da408 netfilter: nf_queue: handle socket prefetch
In case someone combines bpf socket assign and nf_queue, then we will
queue an skb who references a struct sock that did not have its
reference count incremented.

As we leave rcu protection, there is no guarantee that skb->sk is still
valid.

For refcount-less skb->sk case, try to increment the reference count
and then override the destructor.

In case of failure we have two choices: orphan the skb and 'delete'
preselect or let nf_queue() drop the packet.

Do the latter, it should not happen during normal operation.

Fixes: cf7fbe660f ("bpf: Add socket assign support")
Acked-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-03-01 11:51:15 +01:00
Florian Westphal
c387307024 netfilter: nf_queue: fix possible use-after-free
Eric Dumazet says:
  The sock_hold() side seems suspect, because there is no guarantee
  that sk_refcnt is not already 0.

On failure, we cannot queue the packet and need to indicate an
error.  The packet will be dropped by the caller.

v2: split skb prefetch hunk into separate change

Fixes: 271b72c7fa ("udp: RCU handling for Unicast packets.")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-03-01 11:50:35 +01:00
Florian Westphal
2e78855d31 selftests: netfilter: add nfqueue TCP_NEW_SYN_RECV socket race test
causes:
BUG: KASAN: slab-out-of-bounds in sk_free+0x25/0x80
Write of size 4 at addr ffff888106df0284 by task nf-queue/1459
 sk_free+0x25/0x80
 nf_queue_entry_release_refs+0x143/0x1a0
 nf_reinject+0x233/0x770

... without 'netfilter: nf_queue: don't assume sk is full socket'.

Signed-off-by: Florian Westphal <fw@strlen.de>
2022-03-01 11:48:58 +01:00
Florian Westphal
747670fd9a netfilter: nf_queue: don't assume sk is full socket
There is no guarantee that state->sk refers to a full socket.

If refcount transitions to 0, sock_put calls sk_free which then ends up
with garbage fields.

I'd like to thank Oleksandr Natalenko and Jiri Benc for considerable
debug work and pointing out state->sk oddities.

Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-03-01 11:48:18 +01:00
Johannes Berg
94d9864cc8 mac80211: treat some SAE auth steps as final
When we get anti-clogging token required (added by the commit
mentioned below), or the other status codes added by the later
commit 4e56cde15f ("mac80211: Handle special status codes in
SAE commit") we currently just pretend (towards the internal
state machine of authentication) that we didn't receive anything.

This has the undesirable consequence of retransmitting the prior
frame, which is not expected, because the timer is still armed.

If we just disarm the timer at that point, it would result in
the undesirable side effect of being in this state indefinitely
if userspace crashes, or so.

So to fix this, reset the timer and set a new auth_data->waiting
in order to have no more retransmissions, but to have the data
destroyed when the timer actually fires, which will only happen
if userspace didn't continue (i.e. crashed or abandoned it.)

Fixes: a4055e74a2 ("mac80211: Don't destroy auth data in case of anti-clogging")
Reported-by: Jouni Malinen <j@w1.fi>
Link: https://lore.kernel.org/r/20220224103932.75964e1d7932.Ia487f91556f29daae734bf61f8181404642e1eec@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-01 11:33:13 +01:00
Jiasheng Jiang
6ad27f522c nl80211: Handle nla_memdup failures in handle_nan_filter
As there's potential for failure of the nla_memdup(),
check the return value.

Fixes: a442b761b2 ("cfg80211: add add_nan_func / del_nan_func")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Link: https://lore.kernel.org/r/20220301100020.3801187-1-jiasheng@iscas.ac.cn
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-01 11:15:08 +01:00
Randy Dunlap
5a6248c0a2 iwlwifi: mvm: check debugfs_dir ptr before use
When "debugfs=off" is used on the kernel command line, iwiwifi's
mvm module uses an invalid/unchecked debugfs_dir pointer and causes
a BUG:

 BUG: kernel NULL pointer dereference, address: 000000000000004f
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP
 CPU: 1 PID: 503 Comm: modprobe Tainted: G        W         5.17.0-rc5 #7
 Hardware name: Dell Inc. Inspiron 15 5510/076F7Y, BIOS 2.4.1 11/05/2021
 RIP: 0010:iwl_mvm_dbgfs_register+0x692/0x700 [iwlmvm]
 Code: 69 a0 be 80 01 00 00 48 c7 c7 50 73 6a a0 e8 95 cf ee e0 48 8b 83 b0 1e 00 00 48 c7 c2 54 73 6a a0 be 64 00 00 00 48 8d 7d 8c <48> 8b 48 50 e8 15 22 07 e1 48 8b 43 28 48 8d 55 8c 48 c7 c7 5f 73
 RSP: 0018:ffffc90000a0ba68 EFLAGS: 00010246
 RAX: ffffffffffffffff RBX: ffff88817d6e3328 RCX: ffff88817d6e3328
 RDX: ffffffffa06a7354 RSI: 0000000000000064 RDI: ffffc90000a0ba6c
 RBP: ffffc90000a0bae0 R08: ffffffff824e4880 R09: ffffffffa069d620
 R10: ffffc90000a0ba00 R11: ffffffffffffffff R12: 0000000000000000
 R13: ffffc90000a0bb28 R14: ffff88817d6e3328 R15: ffff88817d6e3320
 FS:  00007f64dd92d740(0000) GS:ffff88847f640000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000000000004f CR3: 000000016fc79001 CR4: 0000000000770ee0
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? iwl_mvm_mac_setup_register+0xbdc/0xda0 [iwlmvm]
  iwl_mvm_start_post_nvm+0x71/0x100 [iwlmvm]
  iwl_op_mode_mvm_start+0xab8/0xb30 [iwlmvm]
  _iwl_op_mode_start+0x6f/0xd0 [iwlwifi]
  iwl_opmode_register+0x6a/0xe0 [iwlwifi]
  ? 0xffffffffa0231000
  iwl_mvm_init+0x35/0x1000 [iwlmvm]
  ? 0xffffffffa0231000
  do_one_initcall+0x5a/0x1b0
  ? kmem_cache_alloc+0x1e5/0x2f0
  ? do_init_module+0x1e/0x220
  do_init_module+0x48/0x220
  load_module+0x2602/0x2bc0
  ? __kernel_read+0x145/0x2e0
  ? kernel_read_file+0x229/0x290
  __do_sys_finit_module+0xc5/0x130
  ? __do_sys_finit_module+0xc5/0x130
  __x64_sys_finit_module+0x13/0x20
  do_syscall_64+0x38/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f64dda564dd
 Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1b 29 0f 00 f7 d8 64 89 01 48
 RSP: 002b:00007ffdba393f88 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f64dda564dd
 RDX: 0000000000000000 RSI: 00005575399e2ab2 RDI: 0000000000000001
 RBP: 000055753a91c5e0 R08: 0000000000000000 R09: 0000000000000002
 R10: 0000000000000001 R11: 0000000000000246 R12: 00005575399e2ab2
 R13: 000055753a91ceb0 R14: 0000000000000000 R15: 000055753a923018
  </TASK>
 Modules linked in: btintel(+) btmtk bluetooth vfat snd_hda_codec_hdmi fat snd_hda_codec_realtek snd_hda_codec_generic iwlmvm(+) snd_sof_pci_intel_tgl mac80211 snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence soundwire_bus snd_sof_intel_hda snd_sof_pci snd_sof snd_sof_xtensa_dsp snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core btrfs snd_compress snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec raid6_pq iwlwifi snd_hda_core snd_pcm snd_timer snd soundcore cfg80211 intel_ish_ipc(+) thunderbolt rfkill intel_ishtp ucsi_acpi wmi i2c_hid_acpi i2c_hid evdev
 CR2: 000000000000004f
 ---[ end trace 0000000000000000 ]---

Check the debugfs_dir pointer for an error before using it.

Fixes: 8c082a99ed ("iwlwifi: mvm: simplify iwl_mvm_dbgfs_register")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: linux-wireless@vger.kernel.org
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220223030630.23241-1-rdunlap@infradead.org
[change to make both conditional]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-01 11:10:00 +01:00
Golan Ben Ami
1db5fcbba2 iwlwifi: don't advertise TWT support
Some APs misbehave when TWT is used and cause our firmware to crash.
We don't know a reasonable way to detect and work around this problem
in the FW yet.  To prevent these crashes, disable TWT in the driver by
stopping to advertise TWT support.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215523
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
[reworded the commit message]
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20220301072926.153969-1-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-01 11:00:43 +01:00
Ben Dooks
50bb467c9e rfkill: define rfill_soft_blocked() if !RFKILL
If CONFIG_RFKILL is not set, the Intel WiFi driver will not build
the iw_mvm driver part due to the missing rfill_soft_blocked()
call. Adding a inline declaration of rfill_soft_blocked() if
CONFIG_RFKILL=n fixes the following error:

drivers/net/wireless/intel/iwlwifi/mvm/mvm.h: In function 'iwl_mvm_mei_set_sw_rfkill_state':
drivers/net/wireless/intel/iwlwifi/mvm/mvm.h:2215:38: error: implicit declaration of function 'rfkill_soft_blocked'; did you mean 'rfkill_blocked'? [-Werror=implicit-function-declaration]
 2215 |                 mvm->hw_registered ? rfkill_soft_blocked(mvm->hw->wiphy->rfkill) : false;
      |                                      ^~~~~~~~~~~~~~~~~~~
      |                                      rfkill_blocked

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reported-by: Neill Whillans <neill.whillans@codethink.co.uk>
Fixes: 5bc9a9dd75 ("rfkill: allow to get the software rfkill state")
Link: https://lore.kernel.org/r/20220218093858.1245677-1-ben.dooks@codethink.co.uk
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-01 10:59:13 +01:00
Arnd Bergmann
35e33a24f8 Merge tag 'v5.17-fixes-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux into arm/fixes
- Set display pipeline to DSI on mt8183 kukui jacuzzi
- Fix display for mt8192 based boards by fixing the routing table

* tag 'v5.17-fixes-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux:
  soc: mediatek: mt8192-mmsys: Fix dither to dsi0 path's input sel
  arm64: dts: mt8183: jacuzzi: Fix bus properties in anx's DSI endpoint

Link: https://lore.kernel.org/r/8eb8510d-c597-4fee-e4b3-924b6d4bb3be@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-01 10:46:05 +01:00
Arnd Bergmann
cf90e2f1de Merge tag 'qcom-dts-fixes-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm DeviceTree fixes for v5.17

The SDX65 platform and MTP device was added twice to the DT binding,
this drops one of the occurances.

* tag 'qcom-dts-fixes-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  Revert "dt-bindings: arm: qcom: Document SDX65 platform and boards"

Link: https://lore.kernel.org/r/20220301033838.1801689-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-01 10:45:56 +01:00
Arnd Bergmann
e1d7eed180 Merge tag 'qcom-arm64-fixes-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm ARM64 DeviceTree fixes for 5.17

This starts off by fixing an issue introduced in a bug fix in the
global clock controller, where the symbol clocks for UFS would
end up picking the wrong parent clock which breaks UFS.

It then makes sure that the reference clock for the USB blocks are
enabled, even with booting without clk_ignore_unused.

It corrects the apps SMMU interrupts defintion by adding a missing
interrupt in the list.

Lastly it disables the Qualcomm crypto hardware (for now) on the Lenovo
Yoga C630, to prevent the cryptomanager tests during boot from crashing
the device.

* tag 'qcom-arm64-fixes-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  arm64: dts: qcom: c630: disable crypto due to serror
  arm64: dts: qcom: sm8450: fix apps_smmu interrupts
  arm64: dts: qcom: sm8450: enable GCC_USB3_0_CLKREF_EN for usb
  arm64: dts: qcom: sm8350: Correct UFS symbol clocks

Link: https://lore.kernel.org/r/20220301033526.1801295-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-01 10:45:47 +01:00
Arnd Bergmann
9411ac255e Merge tag 'arm-soc/for-5.17/devicetree-fixes' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
5.17, please pull the following:

- Maxime fixes the HVS (display) register range for the BCM2711
  (Raspberry Pi 4) SoC

* tag 'arm-soc/for-5.17/devicetree-fixes' of https://github.com/Broadcom/stblinux:
  ARM: boot: dts: bcm2711: Fix HVS register range

Link: https://lore.kernel.org/r/20220228165537.1950863-1-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-01 10:45:38 +01:00
Ilya Lipnitskiy
5d8965704f MIPS: ralink: mt7621: use bitwise NOT instead of logical
It was the intention to reverse the bits, not make them all zero by
using logical NOT operator.

Fixes: cc19db8b31 ("MIPS: ralink: mt7621: do memory detection on KSEG1")
Suggested-by: Chuanhong Guo <gch981213@gmail.com>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Reviewed-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-03-01 10:08:45 +01:00
David S. Miller
7cf5aa32e3 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-02-28

This series contains updates to igc and e1000e drivers.

Corinna Vinschen ensures release of hardware sempahore on failed
register read in igc_read_phy_reg_gpy().

Sasha does the same for the write variant, igc_write_phy_reg_gpy(). On
e1000e, he resolves an issue with hardware unit hang on s0ix exit
by disabling some bits and LAN connected device reset during power
management flows. Lastly, he allows for TGP platforms to correct its
NVM checksum.

v2: Fix Fixes tag on patch 3
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01 08:33:55 +00:00
Randy Dunlap
9feaf8b387 efi: fix return value of __setup handlers
When "dump_apple_properties" is used on the kernel boot command line,
it causes an Unknown parameter message and the string is added to init's
argument strings:

  Unknown kernel command line parameters "dump_apple_properties
    BOOT_IMAGE=/boot/bzImage-517rc6 efivar_ssdt=newcpu_ssdt", will be
    passed to user space.

 Run /sbin/init as init process
   with arguments:
     /sbin/init
     dump_apple_properties
   with environment:
     HOME=/
     TERM=linux
     BOOT_IMAGE=/boot/bzImage-517rc6
     efivar_ssdt=newcpu_ssdt

Similarly when "efivar_ssdt=somestring" is used, it is added to the
Unknown parameter message and to init's environment strings, polluting
them (see examples above).

Change the return value of the __setup functions to 1 to indicate
that the __setup options have been handled.

Fixes: 58c5475aba ("x86/efi: Retrieve and assign Apple device properties")
Fixes: 475fb4e8b2 ("efi / ACPI: load SSTDs from EFI variables")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-efi@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Octavian Purdila <octavian.purdila@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Link: https://lore.kernel.org/r/20220301041851.12459-1-rdunlap@infradead.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-03-01 09:02:21 +01:00
AngeloGioacchino Del Regno
c432cd598a soc: mediatek: mt8192-mmsys: Fix dither to dsi0 path's input sel
In commit d687e056a1 ("soc: mediatek: mmsys: Add mt8192 mmsys routing table"),
the mmsys routing table for mt8192 was introduced but the input selector
for DITHER->DSI0 has no value assigned to it.

This means that we are clearing bit 0 instead of setting it, blocking
communication between these two blocks; due to that, any display that
is connected to DSI0 will not work, as no data will go through.
The effect of that issue is that, during bootup, the DRM will block for
some time, while atomically waiting for a vblank that never happens;
later, the situation doesn't get better, leaving the display in a
non-functional state.

To fix this issue, fix the route entry in the table by assigning the
dither input selector to MT8192_DISP_DSI0_SEL_IN.

Fixes: d687e056a1 ("soc: mediatek: mmsys: Add mt8192 mmsys routing table")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Tested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20220128142056.359900-1-angelogioacchino.delregno@collabora.com
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2022-03-01 08:36:59 +01:00
Hans de Goede
d982992669 Input: goodix - workaround Cherry Trail devices with a bogus ACPI Interrupt() resource
ACPI/x86 devices with a Cherry Trail SoC should have a GpioInt + a regular
GPIO ACPI resource in their ACPI tables.

Some CHT devices have a bug, where the also is bogus interrupt resource
(likely copied from a previous Bay Trail based generation of the device).

The i2c-core-acpi code will assign the bogus, non-working, interrupt
resource to client->irq. Add a workaround to fix this up.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2043960
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220228111613.363336-1-hdegoede@redhat.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-02-28 23:14:53 -08:00
Hans de Goede
d176708ffc Input: goodix - use the new soc_intel_is_byt() helper
Use the new soc_intel_is_byt() helper from linux/platform_data/x86/soc.h.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220131143539.109142-5-hdegoede@redhat.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-02-28 23:14:51 -08:00
Dmitry Torokhov
1136fa0c07 Merge tag 'v5.17-rc4' into for-linus
Merge with mainline to get the Intel ASoC generic helpers header and
other changes.
2022-02-28 23:12:55 -08:00
Samuel Holland
bac129dbc6 pinctrl: sunxi: Use unique lockdep classes for IRQs
This driver, like several others, uses a chained IRQ for each GPIO bank,
and forwards .irq_set_wake to the GPIO bank's upstream IRQ. As a result,
a call to irq_set_irq_wake() needs to lock both the upstream and
downstream irq_desc's. Lockdep considers this to be a possible deadlock
when the irq_desc's share lockdep classes, which they do by default:

 ============================================
 WARNING: possible recursive locking detected
 5.17.0-rc3-00394-gc849047c2473 #1 Not tainted
 --------------------------------------------
 init/307 is trying to acquire lock:
 c2dfe27c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0x58/0xa0

 but task is already holding lock:
 c3c0ac7c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0x58/0xa0

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&irq_desc_lock_class);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 4 locks held by init/307:
  #0: c1f29f18 (system_transition_mutex){+.+.}-{3:3}, at: __do_sys_reboot+0x90/0x23c
  #1: c20f7760 (&dev->mutex){....}-{3:3}, at: device_shutdown+0xf4/0x224
  #2: c2e804d8 (&dev->mutex){....}-{3:3}, at: device_shutdown+0x104/0x224
  #3: c3c0ac7c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0x58/0xa0

 stack backtrace:
 CPU: 0 PID: 307 Comm: init Not tainted 5.17.0-rc3-00394-gc849047c2473 #1
 Hardware name: Allwinner sun8i Family
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x68/0x90
  dump_stack_lvl from __lock_acquire+0x1680/0x31a0
  __lock_acquire from lock_acquire+0x148/0x3dc
  lock_acquire from _raw_spin_lock_irqsave+0x50/0x6c
  _raw_spin_lock_irqsave from __irq_get_desc_lock+0x58/0xa0
  __irq_get_desc_lock from irq_set_irq_wake+0x2c/0x19c
  irq_set_irq_wake from irq_set_irq_wake+0x13c/0x19c
    [tail call from sunxi_pinctrl_irq_set_wake]
  irq_set_irq_wake from gpio_keys_suspend+0x80/0x1a4
  gpio_keys_suspend from gpio_keys_shutdown+0x10/0x2c
  gpio_keys_shutdown from device_shutdown+0x180/0x224
  device_shutdown from __do_sys_reboot+0x134/0x23c
  __do_sys_reboot from ret_fast_syscall+0x0/0x1c

However, this can never deadlock because the upstream and downstream
IRQs are never the same (nor do they even involve the same irqchip).

Silence this erroneous lockdep splat by applying what appears to be the
usual fix of moving the GPIO IRQs to separate lockdep classes.

Fixes: a59c99d9ea ("pinctrl: sunxi: Forward calls to irq_set_irq_wake")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220216040037.22730-1-samuel@sholland.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-02-28 23:53:19 +01:00
Hans Verkuil
7795686d57 pinctrl-sunxi: sunxi_pinctrl_gpio_direction_in/output: use correct offset
The commit that sets the direction directly without calling
pinctrl_gpio_direction(), forgot to add chip->base to the offset when
calling sunxi_pmx_gpio_set_direction().

This caused failures for various Allwinner boards which have two
GPIO blocks.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: 5kft <5kft@5kft.org>
Suggested-by: 5kft <5kft@5kft.org>
Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Fixes: 8df89a7cbc (pinctrl-sunxi: don't call pinctrl_gpio_direction())
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Tested-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/0f536cd8-01db-5d16-2cec-ec6d19409a49@xs4all.nl
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[Picked from linux-next to pinctrl fixes]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-02-28 23:52:20 +01:00
Sasha Neftin
ffd24fa2fc e1000e: Correct NVM checksum verification flow
Update MAC type check e1000_pch_tgp because for e1000_pch_cnp,
NVM checksum update is still possible.
Emit a more detailed warning message.

Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1191663
Fixes: 4051f68318 ("e1000e: Do not take care about recovery NVM checksum")
Reported-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-28 13:43:00 -08:00
Sasha Neftin
1866aa0d0d e1000e: Fix possible HW unit hang after an s0ix exit
Disable the OEM bit/Gig Disable/restart AN impact and disable the PHY
LAN connected device (LCD) reset during power management flows. This
fixes possible HW unit hangs on the s0ix exit on some corporate ADL
platforms.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214821
Fixes: 3e55d23171 ("e1000e: Add handshake with the CSME to support S0ix")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Suggested-by: Nir Efrati <nir.efrati@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-28 13:42:28 -08:00
Florian Westphal
17a8f31bba netfilter: egress: silence egress hook lockdep splats
Netfilter assumes its called with rcu_read_lock held, but in egress
hook case it may be called with BH readlock.

This triggers lockdep splat.

In order to avoid to change all rcu_dereference() to
rcu_dereference_check(..., rcu_read_lock_bh_held()), wrap nf_hook_slow
with read lock/unlock pair.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-28 22:34:04 +01:00
Eric Dumazet
56763f12b0 netfilter: fix use-after-free in __nf_register_net_hook()
We must not dereference @new_hooks after nf_hook_mutex has been released,
because other threads might have freed our allocated hooks already.

BUG: KASAN: use-after-free in nf_hook_entries_get_hook_ops include/linux/netfilter.h:130 [inline]
BUG: KASAN: use-after-free in hooks_validate net/netfilter/core.c:171 [inline]
BUG: KASAN: use-after-free in __nf_register_net_hook+0x77a/0x820 net/netfilter/core.c:438
Read of size 2 at addr ffff88801c1a8000 by task syz-executor237/4430

CPU: 1 PID: 4430 Comm: syz-executor237 Not tainted 5.17.0-rc5-syzkaller-00306-g2293be58d6a1 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0x8d/0x336 mm/kasan/report.c:255
 __kasan_report mm/kasan/report.c:442 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
 nf_hook_entries_get_hook_ops include/linux/netfilter.h:130 [inline]
 hooks_validate net/netfilter/core.c:171 [inline]
 __nf_register_net_hook+0x77a/0x820 net/netfilter/core.c:438
 nf_register_net_hook+0x114/0x170 net/netfilter/core.c:571
 nf_register_net_hooks+0x59/0xc0 net/netfilter/core.c:587
 nf_synproxy_ipv6_init+0x85/0xe0 net/netfilter/nf_synproxy_core.c:1218
 synproxy_tg6_check+0x30d/0x560 net/ipv6/netfilter/ip6t_SYNPROXY.c:81
 xt_check_target+0x26c/0x9e0 net/netfilter/x_tables.c:1038
 check_target net/ipv6/netfilter/ip6_tables.c:530 [inline]
 find_check_entry.constprop.0+0x7f1/0x9e0 net/ipv6/netfilter/ip6_tables.c:573
 translate_table+0xc8b/0x1750 net/ipv6/netfilter/ip6_tables.c:735
 do_replace net/ipv6/netfilter/ip6_tables.c:1153 [inline]
 do_ip6t_set_ctl+0x56e/0xb90 net/ipv6/netfilter/ip6_tables.c:1639
 nf_setsockopt+0x83/0xe0 net/netfilter/nf_sockopt.c:101
 ipv6_setsockopt+0x122/0x180 net/ipv6/ipv6_sockglue.c:1024
 rawv6_setsockopt+0xd3/0x6a0 net/ipv6/raw.c:1084
 __sys_setsockopt+0x2db/0x610 net/socket.c:2180
 __do_sys_setsockopt net/socket.c:2191 [inline]
 __se_sys_setsockopt net/socket.c:2188 [inline]
 __x64_sys_setsockopt+0xba/0x150 net/socket.c:2188
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f65a1ace7d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 71 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f65a1a7f308 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f65a1ace7d9
RDX: 0000000000000040 RSI: 0000000000000029 RDI: 0000000000000003
RBP: 00007f65a1b574c8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000020000000 R11: 0000000000000246 R12: 00007f65a1b55130
R13: 00007f65a1b574c0 R14: 00007f65a1b24090 R15: 0000000000022000
 </TASK>

The buggy address belongs to the page:
page:ffffea0000706a00 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1c1a8
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000000 ffffea0001c1b108 ffffea000046dd08 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as freed
page last allocated via order 2, migratetype Unmovable, gfp_mask 0x52dc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO), pid 4430, ts 1061781545818, free_ts 1061791488993
 prep_new_page mm/page_alloc.c:2434 [inline]
 get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4165
 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5389
 __alloc_pages_node include/linux/gfp.h:572 [inline]
 alloc_pages_node include/linux/gfp.h:595 [inline]
 kmalloc_large_node+0x62/0x130 mm/slub.c:4438
 __kmalloc_node+0x35a/0x4a0 mm/slub.c:4454
 kmalloc_node include/linux/slab.h:604 [inline]
 kvmalloc_node+0x97/0x100 mm/util.c:580
 kvmalloc include/linux/slab.h:731 [inline]
 kvzalloc include/linux/slab.h:739 [inline]
 allocate_hook_entries_size net/netfilter/core.c:61 [inline]
 nf_hook_entries_grow+0x140/0x780 net/netfilter/core.c:128
 __nf_register_net_hook+0x144/0x820 net/netfilter/core.c:429
 nf_register_net_hook+0x114/0x170 net/netfilter/core.c:571
 nf_register_net_hooks+0x59/0xc0 net/netfilter/core.c:587
 nf_synproxy_ipv6_init+0x85/0xe0 net/netfilter/nf_synproxy_core.c:1218
 synproxy_tg6_check+0x30d/0x560 net/ipv6/netfilter/ip6t_SYNPROXY.c:81
 xt_check_target+0x26c/0x9e0 net/netfilter/x_tables.c:1038
 check_target net/ipv6/netfilter/ip6_tables.c:530 [inline]
 find_check_entry.constprop.0+0x7f1/0x9e0 net/ipv6/netfilter/ip6_tables.c:573
 translate_table+0xc8b/0x1750 net/ipv6/netfilter/ip6_tables.c:735
 do_replace net/ipv6/netfilter/ip6_tables.c:1153 [inline]
 do_ip6t_set_ctl+0x56e/0xb90 net/ipv6/netfilter/ip6_tables.c:1639
 nf_setsockopt+0x83/0xe0 net/netfilter/nf_sockopt.c:101
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1352 [inline]
 free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1404
 free_unref_page_prepare mm/page_alloc.c:3325 [inline]
 free_unref_page+0x19/0x690 mm/page_alloc.c:3404
 kvfree+0x42/0x50 mm/util.c:613
 rcu_do_batch kernel/rcu/tree.c:2527 [inline]
 rcu_core+0x7b1/0x1820 kernel/rcu/tree.c:2778
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558

Memory state around the buggy address:
 ffff88801c1a7f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88801c1a7f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff88801c1a8000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                   ^
 ffff88801c1a8080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88801c1a8100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Fixes: 2420b79f8c ("netfilter: debug: check for sorted array")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-28 22:34:04 +01:00
Linus Torvalds
719fce7539 Merge tag 'soc-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "The code changes address mostly minor problems:

   - Several NXP/FSL SoC driver fixes, addressing issues with error
     handling and compilation

   - Fix a clock disabling imbalance in gpcv2 driver.

   - Arm Juno DMA coherency issue

   - Trivial firmware driver fixes for op-tee and scmi firmware

  The remaining changes address issues in the devicetree files:

   - A timer regression for the OMAP devkit8000, which has to use the
     alternative timer.

   - A hang in the i.MX8MM power domain configuration

   - Multiple fixes for the Rockchip RK3399 addressing issues with sound
     and eMMC

   - Cosmetic fixes for i.MX8ULP, RK3xxx, and Tegra124"

* tag 'soc-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (32 commits)
  ARM: tegra: Move panels to AUX bus
  soc: imx: gpcv2: Fix clock disabling imbalance in error path
  soc: fsl: qe: Check of ioremap return value
  soc: fsl: qe: fix typo in a comment
  soc: fsl: guts: Add a missing memory allocation failure check
  soc: fsl: guts: Revert commit 3c0d64e867
  soc: fsl: Correct MAINTAINERS database (SOC)
  soc: fsl: Correct MAINTAINERS database (QUICC ENGINE LIBRARY)
  soc: fsl: Replace kernel.h with the necessary inclusions
  dt-bindings: fsl,layerscape-dcfg: add missing compatible for lx2160a
  dt-bindings: qoriq-clock: add missing compatible for lx2160a
  ARM: dts: Use 32KiHz oscillator on devkit8000
  ARM: dts: switch timer config to common devkit8000 devicetree
  tee: optee: fix error return code in probe function
  arm64: dts: imx8ulp: Set #thermal-sensor-cells to 1 as required
  arm64: dts: imx8mm: Fix VPU Hanging
  ARM: dts: rockchip: fix a typo on rk3288 crypto-controller
  ARM: dts: rockchip: reorder rk322x hmdi clocks
  firmware: arm_scmi: Remove space in MODULE_ALIAS name
  arm64: dts: agilex: use the compatible "intel,socfpga-agilex-hsotg"
  ...
2022-02-28 12:51:14 -08:00
Linus Torvalds
201b5c016f Merge tag 'efi-urgent-for-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:

 - don't treat valid hartid U32_MAX as a failure return code (RISC-V)

 - avoid blocking query_variable_info() call when blocking is not
   allowed

* tag 'efi-urgent-for-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efivars: Respect "block" flag in efivar_entry_set_safe()
  riscv/efi_stub: Fix get_boot_hartid_from_fdt() return value
2022-02-28 12:44:33 -08:00
Carsten Haitzler
cb1852783f drm/arm: arm hdlcd select DRM_GEM_CMA_HELPER
Without DRM_GEM_CMA_HELPER HDLCD won't build. This needs to be there too.

Fixes: 09717af7d1 ("drm: Remove CONFIG_DRM_KMS_CMA_HELPER option")
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220124162437.2470344-1-carsten.haitzler@foss.arm.com
2022-02-28 13:31:20 -06:00
Douglas Anderson
26d3474348 drm/bridge: ti-sn65dsi86: Properly undo autosuspend
The PM Runtime docs say:
  Drivers in ->remove() callback should undo the runtime PM changes done
  in ->probe(). Usually this means calling pm_runtime_disable(),
  pm_runtime_dont_use_autosuspend() etc.

We weren't doing that for autosuspend. Let's do it.

Fixes: 9bede63127 ("drm/bridge: ti-sn65dsi86: Use pm_runtime autosuspend")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220222141838.1.If784ba19e875e8ded4ec4931601ce6d255845245@changeid
2022-02-28 09:52:46 -08:00
Kim Phillips
e9b6013a7c x86/speculation: Update link to AMD speculation whitepaper
Update the link to the "Software Techniques for Managing Speculation
on AMD Processors" whitepaper.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-02-28 18:37:12 +01:00
Kim Phillips
244d00b5dd x86/speculation: Use generic retpoline by default on AMD
AMD retpoline may be susceptible to speculation. The speculation
execution window for an incorrect indirect branch prediction using
LFENCE/JMP sequence may potentially be large enough to allow
exploitation using Spectre V2.

By default, don't use retpoline,lfence on AMD.  Instead, use the
generic retpoline.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-02-28 18:37:08 +01:00
Sasha Neftin
c4208653a3 igc: igc_write_phy_reg_gpy: drop premature return
Similar to "igc_read_phy_reg_gpy: drop premature return" patch.
igc_write_phy_reg_gpy checks the return value from igc_write_phy_reg_mdic
and if it's not 0, returns immediately. By doing this, it leaves the HW
semaphore in the acquired state.

Drop this premature return statement, the function returns after
releasing the semaphore immediately anyway.

Fixes: 5586838fe9 ("igc: Add code for PHY support")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Reported-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-28 08:48:45 -08:00
Corinna Vinschen
fda2635466 igc: igc_read_phy_reg_gpy: drop premature return
igc_read_phy_reg_gpy checks the return value from igc_read_phy_reg_mdic
and if it's not 0, returns immediately. By doing this, it leaves the HW
semaphore in the acquired state.

Drop this premature return statement, the function returns after
releasing the semaphore immediately anyway.

Fixes: 5586838fe9 ("igc: Add code for PHY support")
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-28 08:48:45 -08:00
Randy Dunlap
7b83299e5b ARM: 9182/1: mmu: fix returns from early_param() and __setup() functions
early_param() handlers should return 0 on success.
__setup() handlers should return 1 on success, i.e., the parameter
has been handled. A return of 0 would cause the "option=value" string
to be added to init's environment strings, polluting it.

../arch/arm/mm/mmu.c: In function 'test_early_cachepolicy':
../arch/arm/mm/mmu.c:215:1: error: no return statement in function returning non-void [-Werror=return-type]
../arch/arm/mm/mmu.c: In function 'test_noalign_setup':
../arch/arm/mm/mmu.c:221:1: error: no return statement in function returning non-void [-Werror=return-type]

Fixes: b849a60e09 ("ARM: make cr_alignment read-only #ifndef CONFIG_CPU_CP15")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: patches@armlinux.org.uk
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-02-28 13:57:32 +00:00
Yu Kuai
3093929326 blktrace: fix use after free for struct blk_trace
When tracing the whole disk, 'dropped' and 'msg' will be created
under 'q->debugfs_dir' and 'bt->dir' is NULL, thus blk_trace_free()
won't remove those files. What's worse, the following UAF can be
triggered because of accessing stale 'dropped' and 'msg':

==================================================================
BUG: KASAN: use-after-free in blk_dropped_read+0x89/0x100
Read of size 4 at addr ffff88816912f3d8 by task blktrace/1188

CPU: 27 PID: 1188 Comm: blktrace Not tainted 5.17.0-rc4-next-20220217+ #469
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
Call Trace:
 <TASK>
 dump_stack_lvl+0x34/0x44
 print_address_description.constprop.0.cold+0xab/0x381
 ? blk_dropped_read+0x89/0x100
 ? blk_dropped_read+0x89/0x100
 kasan_report.cold+0x83/0xdf
 ? blk_dropped_read+0x89/0x100
 kasan_check_range+0x140/0x1b0
 blk_dropped_read+0x89/0x100
 ? blk_create_buf_file_callback+0x20/0x20
 ? kmem_cache_free+0xa1/0x500
 ? do_sys_openat2+0x258/0x460
 full_proxy_read+0x8f/0xc0
 vfs_read+0xc6/0x260
 ksys_read+0xb9/0x150
 ? vfs_write+0x3d0/0x3d0
 ? fpregs_assert_state_consistent+0x55/0x60
 ? exit_to_user_mode_prepare+0x39/0x1e0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fbc080d92fd
Code: ce 20 00 00 75 10 b8 00 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 1
RSP: 002b:00007fbb95ff9cb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007fbb95ff9dc0 RCX: 00007fbc080d92fd
RDX: 0000000000000100 RSI: 00007fbb95ff9cc0 RDI: 0000000000000045
RBP: 0000000000000045 R08: 0000000000406299 R09: 00000000fffffffd
R10: 000000000153afa0 R11: 0000000000000293 R12: 00007fbb780008c0
R13: 00007fbb78000938 R14: 0000000000608b30 R15: 00007fbb780029c8
 </TASK>

Allocated by task 1050:
 kasan_save_stack+0x1e/0x40
 __kasan_kmalloc+0x81/0xa0
 do_blk_trace_setup+0xcb/0x410
 __blk_trace_setup+0xac/0x130
 blk_trace_ioctl+0xe9/0x1c0
 blkdev_ioctl+0xf1/0x390
 __x64_sys_ioctl+0xa5/0xe0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 1050:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_set_free_info+0x20/0x30
 __kasan_slab_free+0x103/0x180
 kfree+0x9a/0x4c0
 __blk_trace_remove+0x53/0x70
 blk_trace_ioctl+0x199/0x1c0
 blkdev_common_ioctl+0x5e9/0xb30
 blkdev_ioctl+0x1a5/0x390
 __x64_sys_ioctl+0xa5/0xe0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff88816912f380
 which belongs to the cache kmalloc-96 of size 96
The buggy address is located 88 bytes inside of
 96-byte region [ffff88816912f380, ffff88816912f3e0)
The buggy address belongs to the page:
page:000000009a1b4e7c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0f
flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
raw: 0017ffffc0000200 ffffea00044f1100 dead000000000002 ffff88810004c780
raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88816912f280: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 ffff88816912f300: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>ffff88816912f380: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                                                    ^
 ffff88816912f400: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 ffff88816912f480: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================

Fixes: c0ea57608b ("blktrace: remove debugfs file dentries from struct blk_trace")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220228034354.4047385-1-yukuai3@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-28 06:36:33 -07:00
Miaoqian Lin
9826e393e4 iommu/tegra-smmu: Fix missing put_device() call in tegra_smmu_find
The reference taken by 'of_find_device_by_node()' must be released when
not needed anymore.
Add the corresponding 'put_device()' in the error handling path.

Fixes: 765a9d1d02 ("iommu/tegra-smmu: Fix mc errors on tegra124-nyan")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20220107080915.12686-1-linmq006@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28 14:01:57 +01:00
Adrian Huang
b00833768e iommu/vt-d: Fix double list_add when enabling VMD in scalable mode
When enabling VMD and IOMMU scalable mode, the following kernel panic
call trace/kernel log is shown in Eagle Stream platform (Sapphire Rapids
CPU) during booting:

pci 0000:59:00.5: Adding to iommu group 42
...
vmd 0000:59:00.5: PCI host bridge to bus 10000:80
pci 10000:80:01.0: [8086:352a] type 01 class 0x060400
pci 10000:80:01.0: reg 0x10: [mem 0x00000000-0x0001ffff 64bit]
pci 10000:80:01.0: enabling Extended Tags
pci 10000:80:01.0: PME# supported from D0 D3hot D3cold
pci 10000:80:01.0: DMAR: Setup RID2PASID failed
pci 10000:80:01.0: Failed to add to iommu group 42: -16
pci 10000:80:03.0: [8086:352b] type 01 class 0x060400
pci 10000:80:03.0: reg 0x10: [mem 0x00000000-0x0001ffff 64bit]
pci 10000:80:03.0: enabling Extended Tags
pci 10000:80:03.0: PME# supported from D0 D3hot D3cold
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:29!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.17.0-rc3+ #7
Hardware name: Lenovo ThinkSystem SR650V3/SB27A86647, BIOS ESE101Y-1.00 01/13/2022
Workqueue: events work_for_cpu_fn
RIP: 0010:__list_add_valid.cold+0x26/0x3f
Code: 9a 4a ab ff 4c 89 c1 48 c7 c7 40 0c d9 9e e8 b9 b1 fe ff 0f
      0b 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 f0 0c d9 9e e8 a2 b1
      fe ff <0f> 0b 48 89 d1 4c 89 c6 4c 89 ca 48 c7 c7 98 0c d9
      9e e8 8b b1 fe
RSP: 0000:ff5ad434865b3a40 EFLAGS: 00010246
RAX: 0000000000000058 RBX: ff4d61160b74b880 RCX: ff4d61255e1fffa8
RDX: 0000000000000000 RSI: 00000000fffeffff RDI: ffffffff9fd34f20
RBP: ff4d611d8e245c00 R08: 0000000000000000 R09: ff5ad434865b3888
R10: ff5ad434865b3880 R11: ff4d61257fdc6fe8 R12: ff4d61160b74b8a0
R13: ff4d61160b74b8a0 R14: ff4d611d8e245c10 R15: ff4d611d8001ba70
FS:  0000000000000000(0000) GS:ff4d611d5ea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ff4d611fa1401000 CR3: 0000000aa0210001 CR4: 0000000000771ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 intel_pasid_alloc_table+0x9c/0x1d0
 dmar_insert_one_dev_info+0x423/0x540
 ? device_to_iommu+0x12d/0x2f0
 intel_iommu_attach_device+0x116/0x290
 __iommu_attach_device+0x1a/0x90
 iommu_group_add_device+0x190/0x2c0
 __iommu_probe_device+0x13e/0x250
 iommu_probe_device+0x24/0x150
 iommu_bus_notifier+0x69/0x90
 blocking_notifier_call_chain+0x5a/0x80
 device_add+0x3db/0x7b0
 ? arch_memremap_can_ram_remap+0x19/0x50
 ? memremap+0x75/0x140
 pci_device_add+0x193/0x1d0
 pci_scan_single_device+0xb9/0xf0
 pci_scan_slot+0x4c/0x110
 pci_scan_child_bus_extend+0x3a/0x290
 vmd_enable_domain.constprop.0+0x63e/0x820
 vmd_probe+0x163/0x190
 local_pci_probe+0x42/0x80
 work_for_cpu_fn+0x13/0x20
 process_one_work+0x1e2/0x3b0
 worker_thread+0x1c4/0x3a0
 ? rescuer_thread+0x370/0x370
 kthread+0xc7/0xf0
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
...
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x1ca00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

The following 'lspci' output shows devices '10000:80:*' are subdevices of
the VMD device 0000:59:00.5:

  $ lspci
  ...
  0000:59:00.5 RAID bus controller: Intel Corporation Volume Management Device NVMe RAID Controller (rev 20)
  ...
  10000:80:01.0 PCI bridge: Intel Corporation Device 352a (rev 03)
  10000:80:03.0 PCI bridge: Intel Corporation Device 352b (rev 03)
  10000:80:05.0 PCI bridge: Intel Corporation Device 352c (rev 03)
  10000:80:07.0 PCI bridge: Intel Corporation Device 352d (rev 03)
  10000:81:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]
  10000:82:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]

The symptom 'list_add double add' is caused by the following failure
message:

  pci 10000:80:01.0: DMAR: Setup RID2PASID failed
  pci 10000:80:01.0: Failed to add to iommu group 42: -16
  pci 10000:80:03.0: [8086:352b] type 01 class 0x060400

Device 10000:80:01.0 is the subdevice of the VMD device 0000:59:00.5,
so invoking intel_pasid_alloc_table() gets the pasid_table of the VMD
device 0000:59:00.5. Here is call path:

  intel_pasid_alloc_table
    pci_for_each_dma_alias
     get_alias_pasid_table
       search_pasid_table

pci_real_dma_dev() in pci_for_each_dma_alias() gets the real dma device
which is the VMD device 0000:59:00.5. However, pte of the VMD device
0000:59:00.5 has been configured during this message "pci 0000:59:00.5:
Adding to iommu group 42". So, the status -EBUSY is returned when
configuring pasid entry for device 10000:80:01.0.

It then invokes dmar_remove_one_dev_info() to release
'struct device_domain_info *' from iommu_devinfo_cache. But, the pasid
table is not released because of the following statement in
__dmar_remove_one_dev_info():

	if (info->dev && !dev_is_real_dma_subdevice(info->dev)) {
		...
		intel_pasid_free_table(info->dev);
        }

The subsequent dmar_insert_one_dev_info() operation of device
10000:80:03.0 allocates 'struct device_domain_info *' from
iommu_devinfo_cache. The allocated address is the same address that
is released previously for device 10000:80:01.0. Finally, invoking
device_attach_pasid_table() causes the issue.

`git bisect` points to the offending commit 474dd1c650 ("iommu/vt-d:
Fix clearing real DMA device's scalable-mode context entries"), which
releases the pasid table if the device is not the subdevice by
checking the returned status of dev_is_real_dma_subdevice().
Reverting the offending commit can work around the issue.

The solution is to prevent from allocating pasid table if those
devices are subdevices of the VMD device.

Fixes: 474dd1c650 ("iommu/vt-d: Fix clearing real DMA device's scalable-mode context entries")
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Link: https://lore.kernel.org/r/20220216091307.703-1-adrianhuang0701@gmail.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220221053348.262724-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28 13:48:44 +01:00
Rong Chen
f0d2f15362 mmc: meson: Fix usage of meson_mmc_post_req()
Currently meson_mmc_post_req() is called in meson_mmc_request() right
after meson_mmc_start_cmd(). This could lead to DMA unmapping before the request
is actually finished.

To fix, don't call meson_mmc_post_req() until meson_mmc_request_done().

Signed-off-by: Rong Chen <rong.chen@amlogic.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Fixes: 79ed05e329 ("mmc: meson-gx: add support for descriptor chain mode")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220216124239.4007667-1-rong.chen@amlogic.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-02-28 13:18:12 +01:00
Ville Syrjälä
08783aa769 drm/i915: s/JSP2/ICP2/ PCH
This JSP2 PCH actually seems to be some special Apple
specific ICP variant rather than a JSP. Make it so. Or at
least all the references to it seem to be some Apple ICL
machines. Didn't manage to find these PCI IDs in any
public chipset docs unfortunately.

The only thing we're losing here with this JSP->ICP change
is Wa_14011294188, but based on the HSD that isn't actually
needed on any ICP based design (including JSP), only TGP
based stuff (including MCC) really need it. The documented
w/a just never made that distinction because Windows didn't
want to differentiate between JSP and MCC (not sure how
they handle hpd/ddc/etc. then though...).

Cc: stable@vger.kernel.org
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4226
Fixes: 943682e3bd ("drm/i915: Introduce Jasper Lake PCH")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220224132142.12927-1-ville.syrjala@linux.intel.com
Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Tested-by: Tomas Bzatek <bugs@bzatek.net>
(cherry picked from commit 53581504a8)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-02-28 11:59:01 +00:00
Vinay Belgaumkar
1b279f6ad4 drm/i915/guc/slpc: Correct the param count for unset param
SLPC unset param H2G only needs one parameter - the id of the
param.

Fixes: 025cb07beb ("drm/i915/guc/slpc: Cache platform frequency limits")

Suggested-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220216181504.7155-1-vinay.belgaumkar@intel.com
(cherry picked from commit 9648f1c373)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-02-28 11:59:01 +00:00
Alex Elder
caef14b753 net: ipa: fix a build dependency
An IPA build problem arose in the linux-next tree the other day.
The problem is that a recent commit adds a new dependency on some
code, and the Kconfig file for IPA doesn't reflect that dependency.
As a result, some configurations can fail to build (particularly
when COMPILE_TEST is enabled).

The recent patch adds calls to qmp_get(), qmp_put(), and qmp_send(),
and those are built based on the QCOM_AOSS_QMP config option.  If
that symbol is not defined, stubs are defined, so we just need to
ensure QCOM_AOSS_QMP is compatible with QCOM_IPA, or it's not
defined.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: 34a081761e ("net: ipa: request IPA register values be retained")
Signed-off-by: Alex Elder <elder@linaro.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-28 11:44:27 +00:00
Jia-Ju Bai
d4e26aaea7 atm: firestream: check the return value of ioremap() in fs_init()
The function ioremap() in fs_init() can fail, so its return value should
be checked.

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-28 11:36:01 +00:00
Casper Andersson
90d4025285 net: sparx5: Add #include to remove warning
main.h uses NUM_TARGETS from main_regs.h, but
the missing include never causes any errors
because everywhere main.h is (currently)
included, main_regs.h is included before.
But since it is dependent on main_regs.h
it should always be included.

Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Reviewed-by: Joacim Zetterling <joacim.zetterling@westermo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-28 11:34:26 +00:00
Tony Lu
4d08b7b57e net/smc: Fix cleanup when register ULP fails
This patch calls smc_ib_unregister_client() when tcp_register_ulp()
fails, and make sure to clean it up.

Fixes: d7cd421da9 ("net/smc: Introduce TCP ULP support")
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-28 11:31:49 +00:00
Nícolas F. R. A. Prado
32568ae375 arm64: dts: mt8183: jacuzzi: Fix bus properties in anx's DSI endpoint
mt8183-kukui-jacuzzi has an anx7625 bridge connected to the output of
its DSI host. However, after commit fd0310b6fe ("drm/bridge: anx7625:
add MIPI DPI input feature"), a bus-type property started being required
in the endpoint node by the driver to indicate whether it is DSI or DPI.

Add the missing bus-type property and set it to 5
(V4L2_FWNODE_BUS_TYPE_PARALLEL) so that the driver has its input
configured to DSI and the display pipeline can probe correctly.

While at it, also set the data-lanes property that was also introduced
in that same commit, so that we don't rely on the default value.

Fixes: fd0310b6fe ("drm/bridge: anx7625: add MIPI DPI input feature")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20220214200507.2500693-1-nfraprado@collabora.com
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2022-02-28 12:23:08 +01:00
j.nixdorf@avm.de
9995b408f1 net: ipv6: ensure we call ipv6_mc_down() at most once
There are two reasons for addrconf_notify() to be called with NETDEV_DOWN:
either the network device is actually going down, or IPv6 was disabled
on the interface.

If either of them stays down while the other is toggled, we repeatedly
call the code for NETDEV_DOWN, including ipv6_mc_down(), while never
calling the corresponding ipv6_mc_up() in between. This will cause a
new entry in idev->mc_tomb to be allocated for each multicast group
the interface is subscribed to, which in turn leaks one struct ifmcaddr6
per nontrivial multicast group the interface is subscribed to.

The following reproducer will leak at least $n objects:

ip addr add ff2e::4242/32 dev eth0 autojoin
sysctl -w net.ipv6.conf.eth0.disable_ipv6=1
for i in $(seq 1 $n); do
	ip link set up eth0; ip link set down eth0
done

Joining groups with IPV6_ADD_MEMBERSHIP (unprivileged) or setting the
sysctl net.ipv6.conf.eth0.forwarding to 1 (=> subscribing to ff02::2)
can also be used to create a nontrivial idev->mc_list, which will the
leak objects with the right up-down-sequence.

Based on both sources for NETDEV_DOWN events the interface IPv6 state
should be considered:

 - not ready if the network interface is not ready OR IPv6 is disabled
   for it
 - ready if the network interface is ready AND IPv6 is enabled for it

The functions ipv6_mc_up() and ipv6_down() should only be run when this
state changes.

Implement this by remembering when the IPv6 state is ready, and only
run ipv6_mc_down() if it actually changed from ready to not ready.

The other direction (not ready -> ready) already works correctly, as:

 - the interface notification triggered codepath for NETDEV_UP /
   NETDEV_CHANGE returns early if ipv6 is disabled, and
 - the disable_ipv6=0 triggered codepath skips fully initializing the
   interface as long as addrconf_link_ready(dev) returns false
 - calling ipv6_mc_up() repeatedly does not leak anything

Fixes: 3ce62a84d5 ("ipv6: exit early in addrconf_notify() if IPv6 is disabled")
Signed-off-by: Johannes Nixdorf <j.nixdorf@avm.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-28 11:04:45 +00:00
Jann Horn
258dd90202 efivars: Respect "block" flag in efivar_entry_set_safe()
When the "block" flag is false, the old code would sometimes still call
check_var_size(), which wrongly tells ->query_variable_store() that it can
block.

As far as I can tell, this can't really materialize as a bug at the moment,
because ->query_variable_store only does something on X86 with generic EFI,
and in that configuration we always take the efivar_entry_set_nonblocking()
path.

Fixes: ca0e30dcaa ("efi: Add nonblocking option to efi_query_variable_store()")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220218180559.1432559-1-jannh@google.com
2022-02-28 10:07:50 +01:00
Sunil V L
dcf0c83885 riscv/efi_stub: Fix get_boot_hartid_from_fdt() return value
The get_boot_hartid_from_fdt() function currently returns U32_MAX
for failure case which is not correct because U32_MAX is a valid
hartid value. This patch fixes the issue by returning error code.

Cc: <stable@vger.kernel.org>
Fixes: d7071743db ("RISC-V: Add EFI stub support.")
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-02-28 10:07:49 +01:00
David Gow
ba115adf61 Input: samsung-keypad - properly state IOMEM dependency
Make the samsung-keypad driver explicitly depend on CONFIG_HAS_IOMEM, as it
calls devm_ioremap(). This prevents compile errors in some configs (e.g,
allyesconfig/randconfig under UML):

/usr/bin/ld: drivers/input/keyboard/samsung-keypad.o: in function `samsung_keypad_probe':
samsung-keypad.c:(.text+0xc60): undefined reference to `devm_ioremap'

Signed-off-by: David Gow <davidgow@google.com>
Acked-by: anton ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20220225041727.1902850-1-davidgow@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-02-27 21:03:55 -08:00
Dave Airlie
e7c470a4b5 Merge tag 'exynos-drm-fixes-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
Fixups
- Make display controller drivers for Exynos series to use
  platform_get_irq and platform_get_irq_byname functions
  to get the interrupt, which prevents irq chaning from messed up
  when using hierarchical interrupt domains which use "interrupts"
  property in the node.
- Fix two regressions to TE-gpio handling.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225014042.17637-1-inki.dae@samsung.com
2022-02-28 14:05:44 +10:00
Linus Torvalds
7e57714cd0 Linux 5.17-rc6 2022-02-27 14:36:33 -08:00
Linus Torvalds
52a0255467 Merge tag 'irq-urgent-2022-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Thomas Gleixner:
 "A single fix for a regression caused by the recent PCI/MSI rework
  which resulted in a recursive locking problem in the VMD driver.

  The cure is to cache the relevant information upfront instead of
  retrieving it at runtime"

* tag 'irq-urgent-2022-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI: vmd: Prevent recursive locking on interrupt allocation
2022-02-27 13:07:40 -08:00
Linus Torvalds
98f3e84f8d Merge tag 'dma-mapping-5.17-1' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:

 - fix a swiotlb info leak (Halil Pasic)

* tag 'dma-mapping-5.17-1' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: fix info leak with DMA_FROM_DEVICE
2022-02-27 12:42:37 -08:00
Linus Torvalds
6676ba2a6d Merge tag 'pinctrl-v5-17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:

 - Fix some drive strength and pull-up code in the K210 driver.

 - Add the Alder Lake-M ACPI ID so it starts to work properly.

 - Use a static name for the StarFive GPIO irq_chip, forestalling an
   upcoming fixes series from Marc Zyngier.

 - Fix an ages old bug in the Tegra 186 driver where we were indexing at
   random into struct and being lucky getting the right member.

* tag 'pinctrl-v5-17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  gpio: tegra186: Fix chip_data type confusion
  pinctrl: starfive: Use a static name for the GPIO irq_chip
  pinctrl: tigerlake: Revert "Add Alder Lake-M ACPI ID"
  pinctrl: k210: Fix bias-pull-up
  pinctrl: fix loop in k210_pinconf_get_drive()
2022-02-27 12:30:54 -08:00
Linus Torvalds
2293be58d6 Merge tag 'trace-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:

 - rtla (Real-Time Linux Analysis tool):
    - fix typo in man page
    - Update API -e to -E before it is released
    - Error message fix and memory leak fix

 - Partially uninline trace event soft disable to shrink text

 - Fix function graph start up test

 - Have triggers affect the trace instance they are in and not top level

 - Have osnoise sleep in the units it says it uses

 - Remove unused ftrace stub function

 - Remove event probe redundant info from event in the buffer

 - Fix group ownership setting in tracefs

 - Ensure trace buffer is minimum size to prevent crashes

* tag 'trace-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  rtla/osnoise: Fix error message when failing to enable trace instance
  rtla/osnoise: Free params at the exit
  rtla/hist: Make -E the short version of --entries
  tracing: Fix selftest config check for function graph start up test
  tracefs: Set the group ownership in apply_options() not parse_options()
  tracing/osnoise: Make osnoise_main to sleep for microseconds
  ftrace: Remove unused ftrace_startup_enable() stub
  tracing: Ensure trace buffer is at least 4096 bytes large
  tracing: Uninline trace_trigger_soft_disabled() partly
  eprobes: Remove redundant event type information
  tracing: Have traceon and traceoff trigger honor the instance
  tracing: Dump stacktrace trigger to the corresponding instance
  rtla: Fix systme -> system typo on man page
2022-02-26 12:10:17 -08:00
Linus Torvalds
e41898d2ba Merge tag 'fixes-2022-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fix from Mike Rapoport:
 "Use kfree() to release kmalloced memblock regions

  memblock.{reserved,memory}.regions may be allocated using kmalloc()
  in memblock_double_array(). Use kfree() to release these kmalloced
  regions"

* tag 'fixes-2022-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock: use kfree() to release kmalloced memblock regions
2022-02-26 12:00:44 -08:00
Linus Torvalds
086ee11b03 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "12 patches.

  Subsystems affected by this patch series: MAINTAINERS, mailmap, memfd,
  and mm (hugetlb, kasan, hugetlbfs, pagemap, selftests, memcg, and
  slab)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  selftests/memfd: clean up mapping in mfd_fail_write
  mailmap: update Roman Gushchin's email
  MAINTAINERS, SLAB: add Roman as reviewer, git tree
  MAINTAINERS: add Shakeel as a memcg co-maintainer
  MAINTAINERS: remove Vladimir from memcg maintainers
  MAINTAINERS: add Roman as a memcg co-maintainer
  selftest/vm: fix map_fixed_noreplace test failure
  mm: fix use-after-free bug when mm->mmap is reused after being freed
  hugetlbfs: fix a truncation issue in hugepages parameter
  kasan: test: prevent cache merging in kmem_cache_double_destroy
  mm/hugetlb: fix kernel crash with hugetlb mremap
  MAINTAINERS: add sysctl-next git tree
2022-02-26 11:52:14 -08:00
Linus Torvalds
2c8c230eda Merge tag 'riscv-for-linus-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for the K210 sdcard defconfig, to avoid using a
   fixed delay for the root FS

 - A fix to make sure there's a proper call frame for
   trace_hardirqs_{on,off}().

* tag 'riscv-for-linus-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: fix oops caused by irqsoff latency tracer
  riscv: fix nommu_k210_sdcard_defconfig
2022-02-26 10:26:24 -08:00
Linus Torvalds
3bd9dd8138 Merge tag 'xfs-5.17-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
 "Nothing exciting, just more fixes for not returning sync_filesystem
  error values (and eliding it when it's not necessary).

  Summary:

   - Only call sync_filesystem when we're remounting the filesystem
     readonly readonly, and actually check its return value"

* tag 'xfs-5.17-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: only bother with sync_filesystem during readonly remount
2022-02-26 09:53:19 -08:00
Mike Kravetz
fda153c89a selftests/memfd: clean up mapping in mfd_fail_write
Running the memfd script ./run_hugetlbfs_test.sh will often end in error
as follows:

    memfd-hugetlb: CREATE
    memfd-hugetlb: BASIC
    memfd-hugetlb: SEAL-WRITE
    memfd-hugetlb: SEAL-FUTURE-WRITE
    memfd-hugetlb: SEAL-SHRINK
    fallocate(ALLOC) failed: No space left on device
    ./run_hugetlbfs_test.sh: line 60: 166855 Aborted                 (core dumped) ./memfd_test hugetlbfs
    opening: ./mnt/memfd
    fuse: DONE

If no hugetlb pages have been preallocated, run_hugetlbfs_test.sh will
allocate 'just enough' pages to run the test.  In the SEAL-FUTURE-WRITE
test the mfd_fail_write routine maps the file, but does not unmap.  As a
result, two hugetlb pages remain reserved for the mapping.  When the
fallocate call in the SEAL-SHRINK test attempts allocate all hugetlb
pages, it is short by the two reserved pages.

Fix by making sure to unmap in mfd_fail_write.

Link: https://lkml.kernel.org/r/20220219004340.56478-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Roman Gushchin
9502bdbf34 mailmap: update Roman Gushchin's email
I'm moving to a @linux.dev account. Map my old addresses.

Link: https://lkml.kernel.org/r/20220221200006.416377-1-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Vlastimil Babka
7b0112f343 MAINTAINERS, SLAB: add Roman as reviewer, git tree
The slab code has an overlap with kmem accounting, where Roman has done
a lot of work recently and it would be useful to make sure he's CC'd on
patches that potentially affect it.  Thus add him as a reviewer for the
SLAB subsystem.

Also while at it, add the link to slab git tree.

Link: https://lkml.kernel.org/r/20220222103104.13241-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Shakeel Butt
bb9d545499 MAINTAINERS: add Shakeel as a memcg co-maintainer
I have been contributing and reviewing to the memcg codebase for last
couple of years.  So, making it official.

Link: https://lkml.kernel.org/r/20220224060148.4092228-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Vladimir Davydov
0a972e72e2 MAINTAINERS: remove Vladimir from memcg maintainers
Link: https://lkml.kernel.org/r/4ad1f8da49d7b71c84a0c15bd5347f5ce704e730.1645608825.git.vdavydov.dev@gmail.com
Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Roman Gushchin
7d547dcf97 MAINTAINERS: add Roman as a memcg co-maintainer
Add myself as a memcg co-maintainer.  My primary focus over last few
years was the kernel memory accounting stack, but I do work on some
other parts of the memory controller as well.

Link: https://lkml.kernel.org/r/20220221233951.659048-1-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Aneesh Kumar K.V
f39c58008d selftest/vm: fix map_fixed_noreplace test failure
On the latest RHEL the test fails due to executable mapped at 256MB
address

     # ./map_fixed_noreplace
    mmap() @ 0x10000000-0x10050000 p=0xffffffffffffffff result=File exists
    10000000-10010000 r-xp 00000000 fd:04 34905657                           /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace
    10010000-10020000 r--p 00000000 fd:04 34905657                           /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace
    10020000-10030000 rw-p 00010000 fd:04 34905657                           /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace
    10029b90000-10029bc0000 rw-p 00000000 00:00 0                            [heap]
    7fffbb510000-7fffbb750000 r-xp 00000000 fd:04 24534                      /usr/lib64/libc.so.6
    7fffbb750000-7fffbb760000 r--p 00230000 fd:04 24534                      /usr/lib64/libc.so.6
    7fffbb760000-7fffbb770000 rw-p 00240000 fd:04 24534                      /usr/lib64/libc.so.6
    7fffbb780000-7fffbb7a0000 r--p 00000000 00:00 0                          [vvar]
    7fffbb7a0000-7fffbb7b0000 r-xp 00000000 00:00 0                          [vdso]
    7fffbb7b0000-7fffbb800000 r-xp 00000000 fd:04 24514                      /usr/lib64/ld64.so.2
    7fffbb800000-7fffbb810000 r--p 00040000 fd:04 24514                      /usr/lib64/ld64.so.2
    7fffbb810000-7fffbb820000 rw-p 00050000 fd:04 24514                      /usr/lib64/ld64.so.2
    7fffd93f0000-7fffd9420000 rw-p 00000000 00:00 0                          [stack]
    Error: couldn't map the space we need for the test

Fix this by finding a free address using mmap instead of hardcoding
BASE_ADDRESS.

Link: https://lkml.kernel.org/r/20220217083417.373823-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Jann Horn <jannh@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Suren Baghdasaryan
f798a1d4f9 mm: fix use-after-free bug when mm->mmap is reused after being freed
oom reaping (__oom_reap_task_mm) relies on a 2 way synchronization with
exit_mmap.  First it relies on the mmap_lock to exclude from unlock
path[1], page tables tear down (free_pgtables) and vma destruction.
This alone is not sufficient because mm->mmap is never reset.

For historical reasons[2] the lock is taken there is also MMF_OOM_SKIP
set for oom victims before.

The oom reaper only ever looks at oom victims so the whole scheme works
properly but process_mrelease can opearate on any task (with fatal
signals pending) which doesn't really imply oom victims.  That means
that the MMF_OOM_SKIP part of the synchronization doesn't work and it
can see a task after the whole address space has been demolished and
traverse an already released mm->mmap list.  This leads to use after
free as properly caught up by KASAN report.

Fix the issue by reseting mm->mmap so that MMF_OOM_SKIP synchronization
is not needed anymore.  The MMF_OOM_SKIP is not removed from exit_mmap
yet but it acts mostly as an optimization now.

[1] 27ae357fa8 ("mm, oom: fix concurrent munlock and oom reaper unmap, v3")
[2] 2129258024 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")

[mhocko@suse.com: changelog rewrite]

Link: https://lore.kernel.org/all/00000000000072ef2c05d7f81950@google.com/
Link: https://lkml.kernel.org/r/20220215201922.1908156-1-surenb@google.com
Fixes: 64591e8605 ("mm: protect free_pgtables with mmap_lock write lock in exit_mmap")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: syzbot+2ccf63a4bd07cf39cab0@syzkaller.appspotmail.com
Suggested-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Tim Murray <timmurray@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Liu Yuntao
e79ce98323 hugetlbfs: fix a truncation issue in hugepages parameter
When we specify a large number for node in hugepages parameter, it may
be parsed to another number due to truncation in this statement:

	node = tmp;

For example, add following parameter in command line:

	hugepagesz=1G hugepages=4294967297:5

and kernel will allocate 5 hugepages for node 1 instead of ignoring it.

I move the validation check earlier to fix this issue, and slightly
simplifies the condition here.

Link: https://lkml.kernel.org/r/20220209134018.8242-1-liuyuntao10@huawei.com
Fixes: b5389086ad ("hugetlbfs: extend the definition of hugepages parameter to support node allocation")
Signed-off-by: Liu Yuntao <liuyuntao10@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Andrey Konovalov
70effdc375 kasan: test: prevent cache merging in kmem_cache_double_destroy
With HW_TAGS KASAN and kasan.stacktrace=off, the cache created in the
kmem_cache_double_destroy() test might get merged with an existing one.
Thus, the first kmem_cache_destroy() call won't actually destroy it but
will only decrease the refcount.  This causes the test to fail.

Provide an empty constructor for the created cache to prevent the cache
from getting merged.

Link: https://lkml.kernel.org/r/b597bd434c49591d8af00ee3993a42c609dc9a59.1644346040.git.andreyknvl@google.com
Fixes: f98f966cd7 ("kasan: test: add test case for double-kmem_cache_destroy()")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Aneesh Kumar K.V
db110a99d3 mm/hugetlb: fix kernel crash with hugetlb mremap
This fixes the below crash:

  kernel BUG at include/linux/mm.h:2373!
  cpu 0x5d: Vector: 700 (Program Check) at [c00000003c6e76e0]
      pc: c000000000581a54: pmd_to_page+0x54/0x80
      lr: c00000000058d184: move_hugetlb_page_tables+0x4e4/0x5b0
      sp: c00000003c6e7980
     msr: 9000000000029033
    current = 0xc00000003bd8d980
    paca    = 0xc000200fff610100   irqmask: 0x03   irq_happened: 0x01
      pid   = 9349, comm = hugepage-mremap
  kernel BUG at include/linux/mm.h:2373!
    move_hugetlb_page_tables+0x4e4/0x5b0 (link register)
    move_hugetlb_page_tables+0x22c/0x5b0 (unreliable)
    move_page_tables+0xdbc/0x1010
    move_vma+0x254/0x5f0
    sys_mremap+0x7c0/0x900
    system_call_exception+0x160/0x2c0

the kernel can't use huge_pte_offset before it set the pte entry because
a page table lookup check for huge PTE bit in the page table to
differentiate between a huge pte entry and a pointer to pte page.  A
huge_pte_alloc won't mark the page table entry huge and hence kernel
should not use huge_pte_offset after a huge_pte_alloc.

Link: https://lkml.kernel.org/r/20220211063221.99293-1-aneesh.kumar@linux.ibm.com
Fixes: 550a7d60bd ("mm, hugepages: add mremap() support for hugepage backed vma")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Luis Chamberlain
bbcf7b0e2e MAINTAINERS: add sysctl-next git tree
Add a git tree for sysctls as there's been quite a bit of work lately to
remove all the syctls out of kernel/sysctl.c and move to their respective
places, so coordination has been needed to avoid conflicts.  This tree
will also help soak these changes on linux-next prior to getting to Linus.

Link: https://lkml.kernel.org/r/20220218182736.3694508-1-mcgrof@kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
David S. Miller
519ca6fa96 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-02-25

This series contains updates to iavf driver only.

Slawomir fixes stability issues that can be seen when stressing the
driver using a large number of VFs with a multitude of operations.
Among the fixes are reworking mutexes to provide more effective locking,
ensuring initialization is complete before teardown, preventing
operations which could race while removing the driver, stopping certain
tasks from being queued when the device is down, and adding a missing
mutex unlock.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-26 12:50:20 +00:00
Daniel Bristot de Oliveira
90f59ee41a rtla/osnoise: Fix error message when failing to enable trace instance
When a trace instance creation fails, tools are printing:

	Could not enable -> osnoiser <- tracer for tracing

Print the actual (and correct) name of the tracer it fails to enable.

Link: https://lkml.kernel.org/r/53ef0582605af91eca14b19dba9fc9febb95d4f9.1645206561.git.bristot@kernel.org

Fixes: b1696371d8 ("rtla: Helper functions for rtla")
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 21:05:30 -05:00
Daniel Bristot de Oliveira
316f710172 rtla/osnoise: Free params at the exit
The variable that stores the parsed command line arguments are not
being free()d at the rtla osnoise top exit path.

Free params variable before exiting.

Link: https://lkml.kernel.org/r/0be31d8259c7c53b98a39769d60cfeecd8421785.1645206561.git.bristot@kernel.org

Fixes: 1eceb2fc2c ("rtla/osnoise: Add osnoise top mode")
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 21:05:30 -05:00
Daniel Bristot de Oliveira
dd48f316a1 rtla/hist: Make -E the short version of --entries
Currently, --entries uses -e as the short version in the hist mode of
timerlat and osnoise tools. But as -e is already used to enable events
on trace sessions by other tools, thus let's keep it available for the
same usage for all rtla tools.

Make -E the short version of --entries for hist mode on all tools.

Note: rtla was merged in this merge window, so rtla was not released yet.

Link: https://lkml.kernel.org/r/5dbf0cbe7364d3a05e708926b41a097c59a02b1e.1645206561.git.bristot@kernel.org

Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 21:05:30 -05:00
Christophe Leroy
c5229a0bd4 tracing: Fix selftest config check for function graph start up test
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is required to test
direct tramp.

Link: https://lkml.kernel.org/r/bdc7e594e13b0891c1d61bc8d56c94b1890eaed7.1640017960.git.christophe.leroy@csgroup.eu

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 21:05:29 -05:00
Steven Rostedt (Google)
851e99ebee tracefs: Set the group ownership in apply_options() not parse_options()
Al Viro brought it to my attention that the dentries may not be filled
when the parse_options() is called, causing the call to set_gid() to
possibly crash. It should only be called if parse_options() succeeds
totally anyway.

He suggested the logical place to do the update is in apply_options().

Link: https://lore.kernel.org/all/20220225165219.737025658@goodmis.org/
Link: https://lkml.kernel.org/r/20220225153426.1c4cab6b@gandalf.local.home

Cc: stable@vger.kernel.org
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 48b27b6b51 ("tracefs: Set all files to the same group ownership as the mount option")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 21:05:04 -05:00
Jakub Kicinski
328e765c03 Merge tag 'linux-can-fixes-for-5.17-20220225' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
pull-request: can 2022-02-25

The first 2 patches are by Vincent Mailhol and fix the error handling
of the ndo_open callbacks of the etas_es58x and the gs_usb CAN USB
drivers.

The last patch is by Lad Prabhakar and fixes a small race condition in
the rcar_canfd's rcar_canfd_channel_probe() function.

* tag 'linux-can-fixes-for-5.17-20220225' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: rcar_canfd: rcar_canfd_channel_probe(): register the CAN device when fully ready
  can: gs_usb: change active_channels's type from atomic_t to u8
  can: etas_es58x: change opened_channel_cnt's type from atomic_t to u8
====================

Link: https://lore.kernel.org/r/20220225165622.3231809-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-25 14:53:59 -08:00
Linus Torvalds
9137eda537 Merge tag 'configfs-5.17-2022-02-25' of git://git.infradead.org/users/hch/configfs
Pull configfs fix from Christoph Hellwig:

 - fix a race in configfs_{,un}register_subsystem (ChenXiaoSong)

* tag 'configfs-5.17-2022-02-25' of git://git.infradead.org/users/hch/configfs:
  configfs: fix a race in configfs_{,un}register_subsystem()
2022-02-25 14:12:36 -08:00
Linus Torvalds
c0419188b5 Merge tag 'for-5.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "This is a hopefully last batch of fixes for defrag that got broken in
  5.16, all stable material.

  The remaining reported problem is excessive IO with autodefrag due to
  various conditions in the defrag code not met or missing"

* tag 'for-5.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: reduce extent threshold for autodefrag
  btrfs: autodefrag: only scan one inode once
  btrfs: defrag: don't use merged extent map for their generation check
  btrfs: defrag: bring back the old file extent search behavior
  btrfs: defrag: remove an ambiguous condition for rejection
  btrfs: defrag: don't defrag extents which are already at max capacity
  btrfs: defrag: don't try to merge regular extents with preallocated extents
  btrfs: defrag: allow defrag_one_cluster() to skip large extent which is not a target
  btrfs: prevent copying too big compressed lzo segment
2022-02-25 14:08:03 -08:00
Linus Torvalds
ca7457236d Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:

 - Older "does not even boot" regression in qib from July

 - Bug fixes for error unwind in rtrs

 - Avoid a deadlock syzkaller found in srp

 - Fix another UAF syzkaller found in cma

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/cma: Do not change route.addr.src_addr outside state checks
  RDMA/ib_srp: Fix a deadlock
  RDMA/rtrs-clt: Move free_permit from free_clt to rtrs_clt_close
  RDMA/rtrs-clt: Fix possible double free in error case
  IB/qib: Fix duplicate sysfs directory name
2022-02-25 13:34:30 -08:00
Linus Torvalds
115ccd2278 Merge tag 'gpio-fixes-for-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix an bug generating spurious interrupts in gpio-rockchip

 - fix a race condition in gpiod_to_irq() called by GPIO consumers

* tag 'gpio-fixes-for-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: Return EPROBE_DEFER if gc->to_irq is NULL
  gpio: rockchip: Reset int_bothedge when changing trigger
2022-02-25 12:56:11 -08:00
Jason Gunthorpe
22e9f71072 RDMA/cma: Do not change route.addr.src_addr outside state checks
If the state is not idle then resolve_prepare_src() should immediately
fail and no change to global state should happen. However, it
unconditionally overwrites the src_addr trying to build a temporary any
address.

For instance if the state is already RDMA_CM_LISTEN then this will corrupt
the src_addr and would cause the test in cma_cancel_operation():

           if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)

Which would manifest as this trace from syzkaller:

  BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26
  Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204

  CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:79 [inline]
   dump_stack+0x141/0x1d7 lib/dump_stack.c:120
   print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
   __kasan_report mm/kasan/report.c:399 [inline]
   kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
   __list_add_valid+0x93/0xa0 lib/list_debug.c:26
   __list_add include/linux/list.h:67 [inline]
   list_add_tail include/linux/list.h:100 [inline]
   cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline]
   rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751
   ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102
   ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732
   vfs_write+0x28e/0xa30 fs/read_write.c:603
   ksys_write+0x1ee/0x250 fs/read_write.c:658
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xae

This is indicating that an rdma_id_private was destroyed without doing
cma_cancel_listens().

Instead of trying to re-use the src_addr memory to indirectly create an
any address derived from the dst build one explicitly on the stack and
bind to that as any other normal flow would do. rdma_bind_addr() will copy
it over the src_addr once it knows the state is valid.

This is similar to commit bc0bdc5afa ("RDMA/cma: Do not change
route.addr.src_addr.ss_family")

Link: https://lore.kernel.org/r/0-v2-e975c8fd9ef2+11e-syz_cma_srcaddr_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: 732d41c545 ("RDMA/cma: Make the locking for automatic state transition more clear")
Reported-by: syzbot+c94a3675a626f6333d74@syzkaller.appspotmail.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-25 16:46:51 -04:00
Linus Torvalds
4b23c6ecef Merge tag 'spi-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A few small driver specific fixes"

* tag 'spi-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: rockchip: terminate dma transmission when slave abort
  spi: rockchip: Fix error in getting num-cs property
  spi: spi-zynq-qspi: Fix a NULL pointer dereference in zynq_qspi_exec_mem_op()
2022-02-25 12:37:41 -08:00
Linus Torvalds
64b5132b89 Merge tag 'regulator-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
 "A series of fixes for the da9121 driver"

* tag 'regulator-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: da9121: Remove surplus DA9141 parameters
  regulator: da9121: Fix DA914x voltage value
  regulator: da9121: Fix DA914x current values
2022-02-25 12:33:51 -08:00
Linus Torvalds
0e9894e6aa Merge tag 'regmap-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
 "A fix for interrupt controllers which require the explicit
  acknowledgement of interrupts using a different register to the one
  where interrupts are reported.

  Urgent for the few devices this affects"

* tag 'regmap-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap-irq: Update interrupt clear register for proper reset
2022-02-25 12:30:01 -08:00
Linus Torvalds
e48cb5c2c6 Merge tag 'thermal-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
 "Fix a memory leak in the int340x thermal driver's ACPI notify handler
  (Chuansheng Liu)"

* tag 'thermal-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: int340x: fix memory leak in int3400_notify()
2022-02-25 12:25:44 -08:00
Linus Torvalds
2800b6d0fc Merge tag 'pm-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "Fix the throttle IRQ handling during cpufreq initialization on
  Qualcomm platforms (Bjorn Andersson)"

* tag 'pm-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: qcom-hw: Delay enabling throttle_irq
  cpufreq: Reintroduce ready() callback
2022-02-25 12:17:20 -08:00
Linus Torvalds
c47658311d Merge tag 'char-misc-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are a few small driver fixes for 5.17-rc6 for reported issues.

  The majority of these are IIO fixes for small things, and the other
  two are a mvmem and mtd core conflict fix.

  All of these have been in linux-next with no reported issues"

* tag 'char-misc-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  mtd: core: Fix a conflict between MTD and NVMEM on wp-gpios property
  nvmem: core: Fix a conflict between MTD and NVMEM on wp-gpios property
  iio: imu: st_lsm6dsx: wait for settling time in st_lsm6dsx_read_oneshot
  iio: Fix error handling for PM
  iio: addac: ad74413r: correct comparator gpio getters mask usage
  iio: addac: ad74413r: use ngpio size when iterating over mask
  iio: addac: ad74413r: Do not reference negative array offsets
  iio: adc: men_z188_adc: Fix a resource leak in an error handling path
  iio: frequency: admv1013: remove the always true condition
  iio: accel: fxls8962af: add padding to regmap for SPI
  iio:imu:adis16480: fix buffering for devices with no burst mode
  iio: adc: ad7124: fix mask used for setting AIN_BUFP & AIN_BUFM bits
  iio: adc: tsc2046: fix memory corruption by preventing array overflow
2022-02-25 12:12:06 -08:00
Linus Torvalds
d68ccfdbe5 Merge tag 'driver-core-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
 "Here is a single driver core fix for 5.17-rc6. It resolves a reported
  problem when the DMA map of a device is not properly released.

  It has been in linux-next with no reported problems"

* tag 'driver-core-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: Free DMA range map when device is released
2022-02-25 12:05:40 -08:00
Linus Torvalds
eae9350eb4 Merge tag 'staging-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fix from Greg KH:
 "Here is a single staging driver fix for 5.17-rc6.

  It resolves a reported problem in the fbtft fb_st7789v.c driver that
  could cause the display to be flipped in cold weather.

  It has been in linux-next with no reported problems"

* tag 'staging-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: fbtft: fb_st7789v: reset display before initialization
2022-02-25 11:56:16 -08:00
Linus Torvalds
d8fc3bb606 Merge tag 'tty-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
 "Here are some small n_gsm and sc16is7xx serial driver fixes for
  5.17-rc6.

  The n_gsm fixes are from Siemens as it seems they are using the line
  discipline and fixing up a number of issues they found in their
  testing. The sc16is7xx serial driver fix is for a reported problem
  with that chip.

  All of these have been in linux-next with no reported problems"

* tag 'tty-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  sc16is7xx: Fix for incorrect data being transmitted
  tty: n_gsm: fix deadlock in gsmtty_open()
  tty: n_gsm: fix wrong modem processing in convergence layer type 2
  tty: n_gsm: fix wrong tty control line for flow control
  tty: n_gsm: fix NULL pointer access due to DLCI release
  tty: n_gsm: fix proper link termination after failed open
  tty: n_gsm: fix encoding of command/response bit
  tty: n_gsm: fix encoding of control signal octet bit DV
2022-02-25 11:45:29 -08:00
Slawomir Laba
14756b2ae2 iavf: Fix __IAVF_RESETTING state usage
The setup of __IAVF_RESETTING state in watchdog task had no
effect and could lead to slow resets in the driver as
the task for __IAVF_RESETTING state only requeues watchdog.
Till now the __IAVF_RESETTING was interpreted by reset task
as running state which could lead to errors with allocating
and resources disposal.

Make watchdog_task queue the reset task when it's necessary.
Do not update the state to __IAVF_RESETTING so the reset task
knows exactly what is the current state of the adapter.

Fixes: 898ef1cb1c ("iavf: Combine init and watchdog state machines")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25 11:37:08 -08:00
Slawomir Laba
d2c0f45fcc iavf: Fix missing check for running netdev
The driver was queueing reset_task regardless of the netdev
state.

Do not queue the reset task in iavf_change_mtu if netdev
is not running.

Fixes: fdd4044ffd ("iavf: Remove timer for work triggering, use delaying work instead")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25 11:37:08 -08:00
Slawomir Laba
e85ff9c631 iavf: Fix deadlock in iavf_reset_task
There exists a missing mutex_unlock call on crit_lock in
iavf_reset_task call path.

Unlock the crit_lock before returning from reset task.

Fixes: 5ac49f3c27 ("iavf: use mutexes for locking of critical sections")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25 11:37:07 -08:00
Slawomir Laba
a472eb5cba iavf: Fix race in init state
When iavf_init_version_check sends VIRTCHNL_OP_GET_VF_RESOURCES
message, the driver will wait for the response after requeueing
the watchdog task in iavf_init_get_resources call stack. The
logic is implemented this way that iavf_init_get_resources has
to be called in order to allocate adapter->vf_res. It is polling
for the AQ response in iavf_get_vf_config function. Expect a
call trace from kernel when adminq_task worker handles this
message first. adapter->vf_res will be NULL in
iavf_virtchnl_completion.

Make the watchdog task not queue the adminq_task if the init
process is not finished yet.

Fixes: 898ef1cb1c ("iavf: Combine init and watchdog state machines")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25 11:37:07 -08:00
Slawomir Laba
0579fafd37 iavf: Fix locking for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS
iavf_virtchnl_completion is called under crit_lock but when
the code for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS is called,
this lock is released in order to obtain rtnl_lock to avoid
ABBA deadlock with unregister_netdev.

Along with the new way iavf_remove behaves, there exist
many risks related to the lock release and attmepts to regrab
it. The driver faces crashes related to races between
unregister_netdev and netdev_update_features. Yet another
risk is that the driver could already obtain the crit_lock
in order to destroy it and iavf_virtchnl_completion could
crash or block forever.

Make iavf_virtchnl_completion never relock crit_lock in it's
call paths.

Extract rtnl_lock locking logic to the driver for
unregister_netdev in order to set the netdev_registered flag
inside the lock.

Introduce a new flag that will inform adminq_task to perform
the code from VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS right after
it finishes processing messages. Guard this code with remove
flags so it's never called when the driver is in remove state.

Fixes: 5951a2b981 ("iavf: Fix VLAN feature flags after VFR")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25 11:37:07 -08:00
Slawomir Laba
3ccd54ef44 iavf: Fix init state closure on remove
When init states of the adapter work, the errors like lack
of communication with the PF might hop in. If such events
occur the driver restores previous states in order to retry
initialization in a proper way. When remove task kicks in,
this situation could lead to races with unregistering the
netdevice as well as resources cleanup. With the commit
introducing the waiting in remove for init to complete,
this problem turns into an endless waiting if init never
recovers from errors.

Introduce __IAVF_IN_REMOVE_TASK bit to indicate that the
remove thread has started.

Make __IAVF_COMM_FAILED adapter state respect the
__IAVF_IN_REMOVE_TASK bit and set the __IAVF_INIT_FAILED
state and return without any action instead of trying to
recover.

Make __IAVF_INIT_FAILED adapter state respect the
__IAVF_IN_REMOVE_TASK bit and return without any further
actions.

Make the loop in the remove handler break when adapter has
__IAVF_INIT_FAILED state set.

Fixes: 898ef1cb1c ("iavf: Combine init and watchdog state machines")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25 11:37:07 -08:00
Slawomir Laba
974578017f iavf: Add waiting so the port is initialized in remove
There exist races when port is being configured and remove is
triggered.

unregister_netdev is not and can't be called under crit_lock
mutex since it is calling ndo_stop -> iavf_close which requires
this lock. Depending on init state the netdev could be still
unregistered so unregister_netdev never cleans up, when shortly
after that the device could become registered.

Make iavf_remove wait until port finishes initialization.
All critical state changes are atomic (under crit_lock).
Crashes that come from iavf_reset_interrupt_capability and
iavf_free_traffic_irqs should now be solved in a graceful
manner.

Fixes: 605ca7c5c6 ("iavf: Fix kernel BUG in free_msi_irqs")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25 11:37:07 -08:00
Slawomir Laba
fc2e6b3b13 iavf: Rework mutexes for better synchronisation
The driver used to crash in multiple spots when put to stress testing
of the init, reset and remove paths.

The user would experience call traces or hangs when creating,
resetting, removing VFs. Depending on the machines, the call traces
are happening in random spots, like reset restoring resources racing
with driver remove.

Make adapter->crit_lock mutex a mandatory lock for guarding the
operations performed on all workqueues and functions dealing with
resource allocation and disposal.

Make __IAVF_REMOVE a final state of the driver respected by
workqueues that shall not requeue, when they fail to obtain the
crit_lock.

Make the IRQ handler not to queue the new work for adminq_task
when the __IAVF_REMOVE state is set.

Fixes: 5ac49f3c27 ("iavf: use mutexes for locking of critical sections")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25 11:37:07 -08:00
Linus Torvalds
548b1af45d Merge tag 'usb-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are a number of small USB driver fixes for 5.17-rc6 to resolve
  reported problems and add new device ids. They include:

   - dwc3:
      - device mapping fix
      - new device ids
      - driver fixes

   - xhci driver fixes

   - gadget driver fixes

   - usb-serial driver device id updates

  All of these have been in linux-next with no reported problems"

* tag 'usb-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: gadget: rndis: add spinlock for rndis response list
  usb: dwc3: gadget: Let the interrupt handler disable bottom halves.
  USB: gadget: validate endpoint index for xilinx udc
  USB: serial: option: add Telit LE910R1 compositions
  USB: serial: option: add support for DW5829e
  Revert "USB: serial: ch341: add new Product ID for CH341A"
  usb: dwc2: drd: fix soft connect when gadget is unconfigured
  usb: dwc3: pci: Fix Bay Trail phy GPIO mappings
  tps6598x: clear int mask on probe failure
  xhci: Prevent futile URB re-submissions due to incorrect return value.
  xhci: re-initialize the HC during resume if HCE was set
  usb: dwc3: pci: Add "snps,dis_u2_susphy_quirk" for Intel Bay Trail
  usb: dwc3: pci: add support for the Intel Raptor Lake-S
2022-02-25 11:36:31 -08:00
Linus Torvalds
7808159497 Merge tag 'ata-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
 "Two fixes for the pata_hpt37x driver, both from Sergey:

   - Fix a PCI register access using an incorrect size (8bits instead of
     16bits)

   - Make sure to always disable the primary channel as it is unused"

* tag 'ata-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: pata_hpt37x: disable primary channel on HPT371
  ata: pata_hpt37x: fix PCI clock detection
2022-02-25 11:22:19 -08:00
Daniel Bristot de Oliveira
dd990352f0 tracing/osnoise: Make osnoise_main to sleep for microseconds
osnoise's runtime and period are in the microseconds scale, but it is
currently sleeping in the millisecond's scale. This behavior roots in the
usage of hwlat as the skeleton for osnoise.

Make osnoise to sleep in the microseconds scale. Also, move the sleep to
a specialized function.

Link: https://lkml.kernel.org/r/302aa6c7bdf2d131719b22901905e9da122a11b2.1645197336.git.bristot@kernel.org

Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 12:07:01 -05:00
Nathan Chancellor
ab2f993c01 ftrace: Remove unused ftrace_startup_enable() stub
When building with clang + CONFIG_DYNAMIC_FTRACE=n + W=1, there is a
warning:

  kernel/trace/ftrace.c:7194:20: error: unused function 'ftrace_startup_enable' [-Werror,-Wunused-function]
  static inline void ftrace_startup_enable(int command) { }
                     ^
  1 error generated.

Clang warns on instances of static inline functions in .c files with W=1
after commit 6863f5643d ("kbuild: allow Clang to find unused static
inline functions for W=1 build").

The ftrace_startup_enable() stub has been unused since
commit e1effa0144 ("ftrace: Annotate the ops operation on update"),
where its use outside of the CONFIG_DYNAMIC_TRACE section was replaced
by ftrace_startup_all().  Remove it to resolve the warning.

Link: https://lkml.kernel.org/r/20220214192847.488166-1-nathan@kernel.org

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 12:07:01 -05:00
Sven Schnelle
7acf3a127b tracing: Ensure trace buffer is at least 4096 bytes large
Booting the kernel with 'trace_buf_size=1' give a warning at
boot during the ftrace selftests:

[    0.892809] Running postponed tracer tests:
[    0.892893] Testing tracer function:
[    0.901899] Callback from call_rcu_tasks_trace() invoked.
[    0.983829] Callback from call_rcu_tasks_rude() invoked.
[    1.072003] .. bad ring buffer .. corrupted trace buffer ..
[    1.091944] Callback from call_rcu_tasks() invoked.
[    1.097695] PASSED
[    1.097701] Testing dynamic ftrace: .. filter failed count=0 ..FAILED!
[    1.353474] ------------[ cut here ]------------
[    1.353478] WARNING: CPU: 0 PID: 1 at kernel/trace/trace.c:1951 run_tracer_selftest+0x13c/0x1b0

Therefore enforce a minimum of 4096 bytes to make the selftest pass.

Link: https://lkml.kernel.org/r/20220214134456.1751749-1-svens@linux.ibm.com

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 12:07:01 -05:00
Christophe Leroy
bc82c38a69 tracing: Uninline trace_trigger_soft_disabled() partly
On a powerpc32 build with CONFIG_CC_OPTIMISE_FOR_SIZE, the inline
keyword is not honored and trace_trigger_soft_disabled() appears
approx 50 times in vmlinux.

Adding -Winline to the build, the following message appears:

	./include/linux/trace_events.h:712:1: error: inlining failed in call to 'trace_trigger_soft_disabled': call is unlikely and code size would grow [-Werror=inline]

That function is rather big for an inlined function:

	c003df60 <trace_trigger_soft_disabled>:
	c003df60:	94 21 ff f0 	stwu    r1,-16(r1)
	c003df64:	7c 08 02 a6 	mflr    r0
	c003df68:	90 01 00 14 	stw     r0,20(r1)
	c003df6c:	bf c1 00 08 	stmw    r30,8(r1)
	c003df70:	83 e3 00 24 	lwz     r31,36(r3)
	c003df74:	73 e9 01 00 	andi.   r9,r31,256
	c003df78:	41 82 00 10 	beq     c003df88 <trace_trigger_soft_disabled+0x28>
	c003df7c:	38 60 00 00 	li      r3,0
	c003df80:	39 61 00 10 	addi    r11,r1,16
	c003df84:	4b fd 60 ac 	b       c0014030 <_rest32gpr_30_x>
	c003df88:	73 e9 00 80 	andi.   r9,r31,128
	c003df8c:	7c 7e 1b 78 	mr      r30,r3
	c003df90:	41 a2 00 14 	beq     c003dfa4 <trace_trigger_soft_disabled+0x44>
	c003df94:	38 c0 00 00 	li      r6,0
	c003df98:	38 a0 00 00 	li      r5,0
	c003df9c:	38 80 00 00 	li      r4,0
	c003dfa0:	48 05 c5 f1 	bl      c009a590 <event_triggers_call>
	c003dfa4:	73 e9 00 40 	andi.   r9,r31,64
	c003dfa8:	40 82 00 28 	bne     c003dfd0 <trace_trigger_soft_disabled+0x70>
	c003dfac:	73 ff 02 00 	andi.   r31,r31,512
	c003dfb0:	41 82 ff cc 	beq     c003df7c <trace_trigger_soft_disabled+0x1c>
	c003dfb4:	80 01 00 14 	lwz     r0,20(r1)
	c003dfb8:	83 e1 00 0c 	lwz     r31,12(r1)
	c003dfbc:	7f c3 f3 78 	mr      r3,r30
	c003dfc0:	83 c1 00 08 	lwz     r30,8(r1)
	c003dfc4:	7c 08 03 a6 	mtlr    r0
	c003dfc8:	38 21 00 10 	addi    r1,r1,16
	c003dfcc:	48 05 6f 6c 	b       c0094f38 <trace_event_ignore_this_pid>
	c003dfd0:	38 60 00 01 	li      r3,1
	c003dfd4:	4b ff ff ac 	b       c003df80 <trace_trigger_soft_disabled+0x20>

However it is located in a hot path so inlining it is important.
But forcing inlining of the entire function by using __always_inline
leads to increasing the text size by approx 20 kbytes.

Instead, split the fonction in two parts, one part with the likely
fast path, flagged __always_inline, and a second part out of line.

With this change, on a powerpc32 with CONFIG_CC_OPTIMISE_FOR_SIZE
vmlinux text increases by only 1,4 kbytes, which is partly
compensated by a decrease of vmlinux data by 7 kbytes.

On ppc64_defconfig which has CONFIG_CC_OPTIMISE_FOR_SPEED, this
change reduces vmlinux text by more than 30 kbytes.

Link: https://lkml.kernel.org/r/69ce0986a52d026d381d612801d978aa4f977460.1644563295.git.christophe.leroy@csgroup.eu

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 12:07:01 -05:00
Steven Rostedt (Google)
b61edd5774 eprobes: Remove redundant event type information
Currently, the event probes save the type of the event they are attached
to when recording the event. For example:

  # echo 'e:switch sched/sched_switch prev_state=$prev_state prev_prio=$prev_prio next_pid=$next_pid next_prio=$next_prio' > dynamic_events
  # cat events/eprobes/switch/format

 name: switch
 ID: 1717
 format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:unsigned int __probe_type;        offset:8;       size:4; signed:0;
        field:u64 prev_state;   offset:12;      size:8; signed:0;
        field:u64 prev_prio;    offset:20;      size:8; signed:0;
        field:u64 next_pid;     offset:28;      size:8; signed:0;
        field:u64 next_prio;    offset:36;      size:8; signed:0;

 print fmt: "(%u) prev_state=0x%Lx prev_prio=0x%Lx next_pid=0x%Lx next_prio=0x%Lx", REC->__probe_type, REC->prev_state, REC->prev_prio, REC->next_pid, REC->next_prio

The __probe_type adds 4 bytes to every event.

One of the reasons for creating eprobes is to limit what is traced in an
event to be able to limit what is written into the ring buffer. Having
this redundant 4 bytes to every event takes away from this.

The event that is recorded can be retrieved from the event probe itself,
that is available when the trace is happening. For user space tools, it
could simply read the dynamic_event file to find the event they are for.
So there is really no reason to write this information into the ring
buffer for every event.

Link: https://lkml.kernel.org/r/20220218190057.2f5a19a8@gandalf.local.home

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 12:07:01 -05:00
Steven Rostedt (Google)
302e9edd54 tracing: Have traceon and traceoff trigger honor the instance
If a trigger is set on an event to disable or enable tracing within an
instance, then tracing should be disabled or enabled in the instance and
not at the top level, which is confusing to users.

Link: https://lkml.kernel.org/r/20220223223837.14f94ec3@rorschach.local.home

Cc: stable@vger.kernel.org
Fixes: ae63b31e4d ("tracing: Separate out trace events from global variables")
Tested-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25 12:06:45 -05:00
Randy Dunlap
e01b042e58 net: stmmac: fix return value of __setup handler
__setup() handlers should return 1 on success, i.e., the parameter
has been handled. A return of 0 causes the "option=value" string to be
added to init's environment strings, polluting it.

Fixes: 47dd7a540b ("net: add support for STMicroelectronics Ethernet controllers.")
Fixes: f3240e2811 ("stmmac: remove warning when compile as built-in (V2)")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Jose Abreu <joabreu@synopsys.com>
Link: https://lore.kernel.org/r/20220224033536.25056-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-25 08:53:17 -08:00
Randy Dunlap
50e06ddcee net: sxgbe: fix return value of __setup handler
__setup() handlers should return 1 on success, i.e., the parameter
has been handled. A return of 0 causes the "option=value" string to be
added to init's environment strings, polluting it.

Fixes: acc18c147b ("net: sxgbe: add EEE(Energy Efficient Ethernet) for Samsung sxgbe")
Fixes: 1edb9ca69e ("net: sxgbe: add basic framework for Samsung 10Gb ethernet driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Cc: Siva Reddy <siva.kallam@samsung.com>
Cc: Girish K S <ks.giri@samsung.com>
Cc: Byungho An <bh74.an@samsung.com>
Link: https://lore.kernel.org/r/20220224033528.24640-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-25 08:53:13 -08:00
Lad Prabhakar
c5048a7b2c can: rcar_canfd: rcar_canfd_channel_probe(): register the CAN device when fully ready
Register the CAN device only when all the necessary initialization is
completed. This patch makes sure all the data structures and locks are
initialized before registering the CAN device.

Link: https://lore.kernel.org/all/20220221225935.12300-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Reported-by: Pavel Machek <pavel@denx.de>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Pavel Machek <pavel@denx.de>
Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-25 17:46:54 +01:00
Eric W. Biederman
0ac983f512 ucounts: Fix systemd LimitNPROC with private users regression
Long story short recursively enforcing RLIMIT_NPROC when it is not
enforced on the process that creates a new user namespace, causes
currently working code to fail.  There is no reason to enforce
RLIMIT_NPROC recursively when we don't enforce it normally so update
the code to detect this case.

I would like to simply use capable(CAP_SYS_RESOURCE) to detect when
RLIMIT_NPROC is not enforced upon the caller.  Unfortunately because
RLIMIT_NPROC is charged and checked for enforcement based upon the
real uid, using capable() which is euid based is inconsistent with reality.
Come as close as possible to testing for capable(CAP_SYS_RESOURCE) by
testing for when the real uid would match the conditions when
CAP_SYS_RESOURCE would be present if the real uid was the effective
uid.

Reported-by: Etienne Dechamps <etienne@edechamps.fr>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215596
Link: https://lkml.kernel.org/r/e9589141-cfeb-90cd-2d0e-83a62787239a@edechamps.fr
Link: https://lkml.kernel.org/r/87sfs8jmpz.fsf_-_@email.froward.int.ebiederm.org
Cc: stable@vger.kernel.org
Fixes: 21d1c5e386 ("Reimplement RLIMIT_NPROC on top of ucounts")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-02-25 10:40:14 -06:00
Arnd Bergmann
c253bf70c6 Merge tag 'soc-fsl-fix-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux into arm/fixes
NXP/FSL SoC driver fixes for v5.17

- Add missing SoC compatible in existing binding
- Replace kernel.h with the necessary inclusions
- MAINTAINERS file fixes
- Fix memory allocation failure check in guts driver
- Various cleanups and minor fixes

* tag 'soc-fsl-fix-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux:
  soc: fsl: qe: Check of ioremap return value
  soc: fsl: qe: fix typo in a comment
  soc: fsl: guts: Add a missing memory allocation failure check
  soc: fsl: guts: Revert commit 3c0d64e867
  soc: fsl: Correct MAINTAINERS database (SOC)
  soc: fsl: Correct MAINTAINERS database (QUICC ENGINE LIBRARY)
  soc: fsl: Replace kernel.h with the necessary inclusions
  dt-bindings: fsl,layerscape-dcfg: add missing compatible for lx2160a
  dt-bindings: qoriq-clock: add missing compatible for lx2160a

Link: https://lore.kernel.org/r/20220219012208.21835-1-leoyang.li@nxp.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 16:41:17 +01:00
Li RongQing
9ee83635d8 KVM: x86: Yield to IPI target vCPU only if it is busy
When sending a call-function IPI-many to vCPUs, yield to the
IPI target vCPU which is marked as preempted.

but when emulating HLT, an idling vCPU will be voluntarily
scheduled out and mark as preempted from the guest kernel
perspective. yielding to idle vCPU is pointless and increase
unnecessary vmexit, maybe miss the true preempted vCPU

so yield to IPI target vCPU only if vCPU is busy and preempted

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Message-Id: <1644380201-29423-1-git-send-email-lirongqing@baidu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 10:09:35 -05:00
Dexuan Cui
92e68cc558 x86/kvmclock: Fix Hyper-V Isolated VM's boot issue when vCPUs > 64
When Linux runs as an Isolated VM on Hyper-V, it supports AMD SEV-SNP
but it's partially enlightened, i.e. cc_platform_has(
CC_ATTR_GUEST_MEM_ENCRYPT) is true but sev_active() is false.

Commit 4d96f91091 per se is good, but with it now
kvm_setup_vsyscall_timeinfo() -> kvmclock_init_mem() calls
set_memory_decrypted(), and later gets stuck when trying to zere out
the pages pointed by 'hvclock_mem', if Linux runs as an Isolated VM on
Hyper-V. The cause is that here now the Linux VM should no longer access
the original guest physical addrss (GPA); instead the VM should do
memremap() and access the original GPA + ms_hyperv.shared_gpa_boundary:
see the example code in drivers/hv/connection.c: vmbus_connect() or
drivers/hv/ring_buffer.c: hv_ringbuffer_init(). If the VM tries to
access the original GPA, it keepts getting injected a fault by Hyper-V
and gets stuck there.

Here the issue happens only when the VM has >=65 vCPUs, because the
global static array hv_clock_boot[] can hold 64 "struct
pvclock_vsyscall_time_info" (the sizeof of the struct is 64 bytes), so
kvmclock_init_mem() only allocates memory in the case of vCPUs > 64.

Since the 'hvclock_mem' pages are only useful when the kvm clock is
supported by the underlying hypervisor, fix the issue by returning
early when Linux VM runs on Hyper-V, which doesn't support kvm clock.

Fixes: 4d96f91091 ("x86/sev: Replace occurrences of sev_active() with cc_platform_has()")
Tested-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Message-Id: <20220225084600.17817-1-decui@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 10:09:34 -05:00
Wanpeng Li
3c51d0a6c7 x86/kvm: Don't waste memory if kvmclock is disabled
Even if "no-kvmclock" is passed in cmdline parameter, the guest kernel
still allocates hvclock_mem which is scaled by the number of vCPUs,
let's check kvmclock enable in advance to avoid this memory waste.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1645520523-30814-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 10:09:34 -05:00
Wanpeng Li
40cd58dbf1 x86/kvm: Don't use PV TLB/yield when mwait is advertised
MWAIT is advertised in host is not overcommitted scenario, however, PV
TLB/sched yield should be enabled in host overcommitted scenario. Let's
add the MWAIT checking when enabling PV TLB/sched yield.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1645777780-2581-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 10:09:34 -05:00
Paolo Bonzini
ece32a75f0 Merge tag 'kvmarm-fixes-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.17, take #4

- Correctly synchronise PMR and co on PSCI CPU_SUSPEND

- Skip tests that depend on GICv3 when the HW isn't available
2022-02-25 09:49:30 -05:00
Mark Brown
456f89e092 KVM: selftests: aarch64: Skip tests if we can't create a vgic-v3
The arch_timer and vgic_irq kselftests assume that they can create a
vgic-v3, using the library function vgic_v3_setup() which aborts with a
test failure if it is not possible to do so. Since vgic-v3 can only be
instantiated on systems where the host has GICv3 this leads to false
positives on older systems where that is not the case.

Fix this by changing vgic_v3_setup() to return an error if the vgic can't
be instantiated and have the callers skip if this happens. We could also
exit flagging a skip in vgic_v3_setup() but this would prevent future test
cases conditionally deciding which GIC to use or generally doing more
complex output.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Tested-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220223131624.1830351-1-broonie@kernel.org
2022-02-25 13:02:28 +00:00
Casper Andersson
b3a34dc362 net: sparx5: Fix add vlan when invalid operation
Check if operation is valid before changing any
settings in hardware. Otherwise it results in
changes being made despite it not being a valid
operation.

Fixes: 78eab33bb6 ("net: sparx5: add vlan support")

Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 12:54:48 +00:00
Jia-Ju Bai
767b9825ed net: chelsio: cxgb3: check the return value of pci_find_capability()
The function pci_find_capability() in t3_prep_adapter() can fail, so its
return value should be checked.

Fixes: 4d22de3e6c ("Add support for the latest 1G/10G Chelsio adapter, T3")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 12:52:05 +00:00
David S. Miller
5a83dd14c6 Merge branch 'ibmvnic-fixes'
Sukadev Bhattiprolu says:

====================
ibmvnic: Fix a race in ibmvnic_probe()

If we get a transport (reset) event right after a successful CRQ_INIT
during ibmvnic_probe() but before we set the adapter state to VNIC_PROBED,
we will throw away the reset assuming that the adapter is still in the
probing state. But since the adapter has completed the CRQ_INIT any
subsequent CRQs the we send will be ignored by the vnicserver until
we release/init the CRQ again. This can leave the adapter unconfigured.

While here fix a couple of other bugs that were observed (Patches 1,2,4).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:57:47 +00:00
Sukadev Bhattiprolu
fd98693cb0 ibmvnic: Allow queueing resets during probe
We currently don't allow queuing resets when adapter is in VNIC_PROBING
state - instead we throw away the reset and return EBUSY. The reasoning
is probably that during ibmvnic_probe() the ibmvnic_adapter itself is
being initialized so performing a reset during this time can lead us to
accessing fields in the ibmvnic_adapter that are not fully initialized.
A review of the code shows that all the adapter state neede to process a
reset is initialized before registering the CRQ so that should no longer
be a concern.

Further the expectation is that if we do get a reset (transport event)
during probe, the do..while() loop in ibmvnic_probe() will handle this
by reinitializing the CRQ.

While that is true to some extent, it is possible that the reset might
occur _after_ the CRQ is registered and CRQ_INIT message was exchanged
but _before_ the adapter state is set to VNIC_PROBED. As mentioned above,
such a reset will be thrown away. While the client assumes that the
adapter is functional, the vnic server will wait for the client to reinit
the adapter. This disconnect between the two leaves the adapter down
needing manual intervention.

Because ibmvnic_probe() has other work to do after initializing the CRQ
(such as registering the netdev at a minimum) and because the reset event
can occur at any instant after the CRQ is initialized, there will always
be a window between initializing the CRQ and considering the adapter
ready for resets (ie state == PROBED).

So rather than discarding resets during this window, allow queueing them
- but only process them after the adapter is fully initialized.

To do this, introduce a new completion state ->probe_done and have the
reset worker thread wait on this before processing resets.

This change brings up two new situations in or just after ibmvnic_probe().
First after one or more resets were queued, we encounter an error and
decide to retry the initialization.  At that point the queued resets are
no longer relevant since we could be talking to a new vnic server. So we
must purge/flush the queued resets before restarting the initialization.
As a side note, since we are still in the probing stage and we have not
registered the netdev, it will not be CHANGE_PARAM reset.

Second this change opens up a potential race between the worker thread
in __ibmvnic_reset(), the tasklet and the ibmvnic_open() due to the
following sequence of events:

	1. Register CRQ
	2. Get transport event before CRQ_INIT completes.
	3. Tasklet schedules reset:
		a) add rwi to list
		b) schedule_work() to start worker thread which runs
		   and waits for ->probe_done.
	4. ibmvnic_probe() decides to retry, purges rwi_list
	5. Re-register crq and this time rest of probe succeeds - register
	   netdev and complete(->probe_done).
	6. Worker thread resumes in __ibmvnic_reset() from 3b.
	7. Worker thread sets ->resetting bit
	8. ibmvnic_open() comes in, notices ->resetting bit, sets state
	   to IBMVNIC_OPEN and returns early expecting worker thread to
	   finish the open.
	9. Worker thread finds rwi_list empty and returns without
	   opening the interface.

If this happens, the ->ndo_open() call is effectively lost and the
interface remains down. To address this, ensure that ->rwi_list is
not empty before setting the ->resetting  bit. See also comments in
__ibmvnic_reset().

Fixes: 6a2fb0e99f ("ibmvnic: driver initialization for kdump/kexec")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:57:47 +00:00
Sukadev Bhattiprolu
f628ad531b ibmvnic: clear fop when retrying probe
Clear ->failover_pending flag that may have been set in the previous
pass of registering CRQ. If we don't clear, a subsequent ibmvnic_open()
call would be misled into thinking a failover is pending and assuming
that the reset worker thread would open the adapter. If this pass of
registering the CRQ succeeds (i.e there is no transport event), there
wouldn't be a reset worker thread.

This would leave the adapter unconfigured and require manual intervention
to bring it up during boot.

Fixes: 5a18e1e0c1 ("ibmvnic: Fix failover case for non-redundant configuration")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:57:47 +00:00
Sukadev Bhattiprolu
ae16bf1537 ibmvnic: init init_done_rc earlier
We currently initialize the ->init_done completion/return code fields
before issuing a CRQ_INIT command. But if we get a transport event soon
after registering the CRQ the taskslet may already have recorded the
completion and error code. If we initialize here, we might overwrite/
lose that and end up issuing the CRQ_INIT only to timeout later.

If that timeout happens during probe, we will leave the adapter in the
DOWN state rather than retrying to register/init the CRQ.

Initialize the completion before registering the CRQ so we don't lose
the notification.

Fixes: 032c5e8284 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:57:47 +00:00
Sukadev Bhattiprolu
570425f8c7 ibmvnic: register netdev after init of adapter
Finish initializing the adapter before registering netdev so state
is consistent.

Fixes: c26eba03e4 ("ibmvnic: Update reset infrastructure to support tunable parameters")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:57:47 +00:00
Sukadev Bhattiprolu
36491f2df9 ibmvnic: complete init_done on transport events
If we get a transport event, set the error and mark the init as
complete so the attempt to send crq-init or login fail sooner
rather than wait for the timeout.

Fixes: bbd669a868 ("ibmvnic: Fix completion structure initialization")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:57:47 +00:00
Sukadev Bhattiprolu
83da53f7e4 ibmvnic: define flush_reset_queue helper
Define and use a helper to flush the reset queue.

Fixes: 2770a7984d ("ibmvnic: Introduce hard reset recovery")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:57:47 +00:00
Sukadev Bhattiprolu
765559b10c ibmvnic: initialize rc before completing wait
We should initialize ->init_done_rc before calling complete(). Otherwise
the waiting thread may see ->init_done_rc as 0 before we have updated it
and may assume that the CRQ was successful.

Fixes: 6b278c0cb3 ("ibmvnic delay complete()")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:57:47 +00:00
Sukadev Bhattiprolu
8d0657f39f ibmvnic: free reset-work-item when flushing
Fix a tiny memory leak when flushing the reset work queue.

Fixes: 2770a7984d ("ibmvnic: Introduce hard reset recovery")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:57:46 +00:00
David S. Miller
31372fe966 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
1) Fix PMTU for IPv6 if the reported MTU minus the ESP overhead is
   smaller than 1280. From Jiri Bohac.

2) Fix xfrm interface ID and inter address family tunneling when
   migrating xfrm states. From Yan Yan.

3) Add missing xfrm intrerface ID initialization on xfrmi_changelink.
   From Antony Antony.

4) Enforce validity of xfrm offload input flags so that userspace can't
   send undefined flags to the offload driver.
   From Leon Romanovsky.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:44:15 +00:00
Vladimir Oltean
91b0383fef net: dcb: flush lingering app table entries for unregistered devices
If I'm not mistaken (and I don't think I am), the way in which the
dcbnl_ops work is that drivers call dcb_ieee_setapp() and this populates
the application table with dynamically allocated struct dcb_app_type
entries that are kept in the module-global dcb_app_list.

However, nobody keeps exact track of these entries, and although
dcb_ieee_delapp() is supposed to remove them, nobody does so when the
interface goes away (example: driver unbinds from device). So the
dcb_app_list will contain lingering entries with an ifindex that no
longer matches any device in dcb_app_lookup().

Reclaim the lost memory by listening for the NETDEV_UNREGISTER event and
flushing the app table entries of interfaces that are now gone.

In fact something like this used to be done as part of the initial
commit (blamed below), but it was done in dcbnl_exit() -> dcb_flushapp(),
essentially at module_exit time. That became dead code after commit
7a6b6f515f ("DCB: fix kconfig option") which essentially merged
"tristate config DCB" and "bool config DCBNL" into a single "bool config
DCB", so net/dcb/dcbnl.c could not be built as a module anymore.

Commit 36b9ad8084 ("net/dcb: make dcbnl.c explicitly non-modular")
recognized this and deleted dcbnl_exit() and dcb_flushapp() altogether,
leaving us with the version we have today.

Since flushing application table entries can and should be done as soon
as the netdevice disappears, fundamentally the commit that is to blame
is the one that introduced the design of this API.

Fixes: 9ab933ab2c ("dcbnl: add appliction tlv handlers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:42:34 +00:00
D. Wythe
9f1c50cf39 net/smc: fix connection leak
There's a potential leak issue under following execution sequence :

smc_release  				smc_connect_work
if (sk->sk_state == SMC_INIT)
					send_clc_confirim
	tcp_abort();
					...
					sk.sk_state = SMC_ACTIVE
smc_close_active
switch(sk->sk_state) {
...
case SMC_ACTIVE:
	smc_close_final()
	// then wait peer closed

Unfortunately, tcp_abort() may discard CLC CONFIRM messages that are
still in the tcp send buffer, in which case our connection token cannot
be delivered to the server side, which means that we cannot get a
passive close message at all. Therefore, it is impossible for the to be
disconnected at all.

This patch tries a very simple way to avoid this issue, once the state
has changed to SMC_ACTIVE after tcp_abort(), we can actively abort the
smc connection, considering that the state is SMC_INIT before
tcp_abort(), abandoning the complete disconnection process should not
cause too much problem.

In fact, this problem may exist as long as the CLC CONFIRM message is
not received by the server. Whether a timer should be added after
smc_close_final() needs to be discussed in the future. But even so, this
patch provides a faster release for connection in above case, it should
also be valuable.

Fixes: 39f41f367b ("net/smc: common release code for non-accepted sockets")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:40:21 +00:00
Vincent Whitchurch
087a7b944c net: stmmac: only enable DMA interrupts when ready
In this driver's ->ndo_open() callback, it enables DMA interrupts,
starts the DMA channels, then requests interrupts with request_irq(),
and then finally enables napi.

If RX DMA interrupts are received before napi is enabled, no processing
is done because napi_schedule_prep() will return false.  If the network
has a lot of broadcast/multicast traffic, then the RX ring could fill up
completely before napi is enabled.  When this happens, no further RX
interrupts will be delivered, and the driver will fail to receive any
packets.

Fix this by only enabling DMA interrupts after all other initialization
is complete.

Fixes: 523f11b5d4 ("net: stmmac: move hardware setup for stmmac_open to new function")
Reported-by: Lars Persson <larper@axis.com>
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:34:22 +00:00
Marek Marczykowski-Górecki
dcf4ff7a48 xen/netfront: destroy queues before real_num_tx_queues is zeroed
xennet_destroy_queues() relies on info->netdev->real_num_tx_queues to
delete queues. Since d7dac08341
("net-sysfs: update the queue counts in the unregistration path"),
unregister_netdev() indirectly sets real_num_tx_queues to 0. Those two
facts together means, that xennet_destroy_queues() called from
xennet_remove() cannot do its job, because it's called after
unregister_netdev(). This results in kfree-ing queues that are still
linked in napi, which ultimately crashes:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0 P4D 0
    Oops: 0000 [#1] PREEMPT SMP PTI
    CPU: 1 PID: 52 Comm: xenwatch Tainted: G        W         5.16.10-1.32.fc32.qubes.x86_64+ #226
    RIP: 0010:free_netdev+0xa3/0x1a0
    Code: ff 48 89 df e8 2e e9 00 00 48 8b 43 50 48 8b 08 48 8d b8 a0 fe ff ff 48 8d a9 a0 fe ff ff 49 39 c4 75 26 eb 47 e8 ed c1 66 ff <48> 8b 85 60 01 00 00 48 8d 95 60 01 00 00 48 89 ef 48 2d 60 01 00
    RSP: 0000:ffffc90000bcfd00 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: ffff88800edad000 RCX: 0000000000000000
    RDX: 0000000000000001 RSI: ffffc90000bcfc30 RDI: 00000000ffffffff
    RBP: fffffffffffffea0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800edad050
    R13: ffff8880065f8f88 R14: 0000000000000000 R15: ffff8880066c6680
    FS:  0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 00000000e998c006 CR4: 00000000003706e0
    Call Trace:
     <TASK>
     xennet_remove+0x13d/0x300 [xen_netfront]
     xenbus_dev_remove+0x6d/0xf0
     __device_release_driver+0x17a/0x240
     device_release_driver+0x24/0x30
     bus_remove_device+0xd8/0x140
     device_del+0x18b/0x410
     ? _raw_spin_unlock+0x16/0x30
     ? klist_iter_exit+0x14/0x20
     ? xenbus_dev_request_and_reply+0x80/0x80
     device_unregister+0x13/0x60
     xenbus_dev_changed+0x18e/0x1f0
     xenwatch_thread+0xc0/0x1a0
     ? do_wait_intr_irq+0xa0/0xa0
     kthread+0x16b/0x190
     ? set_kthread_struct+0x40/0x40
     ret_from_fork+0x22/0x30
     </TASK>

Fix this by calling xennet_destroy_queues() from xennet_uninit(),
when real_num_tx_queues is still available. This ensures that queues are
destroyed when real_num_tx_queues is set to 0, regardless of how
unregister_netdev() was called.

Originally reported at
https://github.com/QubesOS/qubes-issues/issues/7257

Fixes: d7dac08341 ("net-sysfs: update the queue counts in the unregistration path")
Cc: stable@vger.kernel.org
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:27:01 +00:00
Arnd Bergmann
f03f10a982 Merge tag 'omap-for-v5.17/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes
Fixes for omaps

Fixes for devkit8000 timer regression. Similar to the earlier beagleboard
fixes, we must not configure the clocksource drivers to use an alternative
timer configuration. It causes unnecessary issues with power management.
Only some old designs based on early beagleboard revisions with a miswired
timer need to use the alternative timer.

* tag 'omap-for-v5.17/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: Use 32KiHz oscillator on devkit8000
  ARM: dts: switch timer config to common devkit8000 devicetree

Link: https://lore.kernel.org/r/pull-1645606483-876944@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 11:14:23 +01:00
Sean Christopherson
1a71581012 Revert "KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()"
Revert back to refreshing vmcs.HOST_CR3 immediately prior to VM-Enter.
The PCID (ASID) part of CR3 can be bumped without KVM being scheduled
out, as the kernel will switch CR3 during __text_poke(), e.g. in response
to a static key toggling.  If switch_mm_irqs_off() chooses a new ASID for
the mm associate with KVM, KVM will do VM-Enter => VM-Exit with a stale
vmcs.HOST_CR3.

Add a comment to explain why KVM must wait until VM-Enter is imminent to
refresh vmcs.HOST_CR3.

The following splat was captured by stashing vmcs.HOST_CR3 in kvm_vcpu
and adding a WARN in load_new_mm_cr3() to fire if a new ASID is being
loaded for the KVM-associated mm while KVM has a "running" vCPU:

  static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, bool need_flush)
  {
	struct kvm_vcpu *vcpu = kvm_get_running_vcpu();

	...

	WARN(vcpu && (vcpu->cr3 & GENMASK(11, 0)) != (new_mm_cr3 & GENMASK(11, 0)) &&
	     (vcpu->cr3 & PHYSICAL_PAGE_MASK) == (new_mm_cr3 & PHYSICAL_PAGE_MASK),
	     "KVM is hosed, loading CR3 = %lx, vmcs.HOST_CR3 = %lx", new_mm_cr3, vcpu->cr3);
  }

  ------------[ cut here ]------------
  KVM is hosed, loading CR3 = 8000000105393004, vmcs.HOST_CR3 = 105393003
  WARNING: CPU: 4 PID: 20717 at arch/x86/mm/tlb.c:291 load_new_mm_cr3+0x82/0xe0
  Modules linked in: vhost_net vhost vhost_iotlb tap kvm_intel
  CPU: 4 PID: 20717 Comm: stable Tainted: G        W         5.17.0-rc3+ #747
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:load_new_mm_cr3+0x82/0xe0
  RSP: 0018:ffffc9000489fa98 EFLAGS: 00010082
  RAX: 0000000000000000 RBX: 8000000105393004 RCX: 0000000000000027
  RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff888277d1b788
  RBP: 0000000000000004 R08: ffff888277d1b780 R09: ffffc9000489f8b8
  R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
  R13: ffff88810678a800 R14: 0000000000000004 R15: 0000000000000c33
  FS:  00007fa9f0e72700(0000) GS:ffff888277d00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 00000001001b5003 CR4: 0000000000172ea0
  Call Trace:
   <TASK>
   switch_mm_irqs_off+0x1cb/0x460
   __text_poke+0x308/0x3e0
   text_poke_bp_batch+0x168/0x220
   text_poke_finish+0x1b/0x30
   arch_jump_label_transform_apply+0x18/0x30
   static_key_slow_inc_cpuslocked+0x7c/0x90
   static_key_slow_inc+0x16/0x20
   kvm_lapic_set_base+0x116/0x190
   kvm_set_apic_base+0xa5/0xe0
   kvm_set_msr_common+0x2f4/0xf60
   vmx_set_msr+0x355/0xe70 [kvm_intel]
   kvm_set_msr_ignored_check+0x91/0x230
   kvm_emulate_wrmsr+0x36/0x120
   vmx_handle_exit+0x609/0x6c0 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0x146f/0x1b80
   kvm_vcpu_ioctl+0x279/0x690
   __x64_sys_ioctl+0x83/0xb0
   do_syscall_64+0x3b/0xc0
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   </TASK>
  ---[ end trace 0000000000000000 ]---

This reverts commit 15ad9762d6.

Fixes: 15ad9762d6 ("KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()")
Reported-by: Wanpeng Li <kernellwp@gmail.com>
Cc: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Lai Jiangshan <jiangshanlai@gmail.com>
Message-Id: <20220224191917.3508476-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 04:02:18 -05:00
Sean Christopherson
bca06b85fc Revert "KVM: VMX: Save HOST_CR3 in vmx_set_host_fs_gs()"
Undo a nested VMX fix as a step toward reverting the commit it fixed,
15ad9762d6 ("KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()"),
as the underlying premise that "host CR3 in the vcpu thread can only be
changed when scheduling" is wrong.

This reverts commit a9f2705ec8.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220224191917.3508476-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25 04:02:18 -05:00
Vincent Mailhol
035b0fcf02 can: gs_usb: change active_channels's type from atomic_t to u8
The driver uses an atomic_t variable: gs_usb:active_channels to keep
track of the number of opened channels in order to only allocate
memory for the URBs when this count changes from zero to one.

However, the driver does not decrement the counter when an error
occurs in gs_can_open(). This issue is fixed by changing the type from
atomic_t to u8 and by simplifying the logic accordingly.

It is safe to use an u8 here because the network stack big kernel lock
(a.k.a. rtnl_mutex) is being hold. For details, please refer to [1].

[1] https://lore.kernel.org/linux-can/CAMZ6Rq+sHpiw34ijPsmp7vbUpDtJwvVtdV7CvRZJsLixjAFfrg@mail.gmail.com/T/#t

Fixes: d08e973a77 ("can: gs_usb: Added support for the GS_USB CAN devices")
Link: https://lore.kernel.org/all/20220214234814.1321599-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-25 09:14:34 +01:00
Vincent Mailhol
f4896248e9 can: etas_es58x: change opened_channel_cnt's type from atomic_t to u8
The driver uses an atomic_t variable: struct
es58x_device::opened_channel_cnt to keep track of the number of opened
channels in order to only allocate memory for the URBs when this count
changes from zero to one.

While the intent was to prevent race conditions, the choice of an
atomic_t turns out to be a bad idea for several reasons:

- implementation is incorrect and fails to decrement
  opened_channel_cnt when the URB allocation fails as reported in
  [1].

- even if opened_channel_cnt were to be correctly decremented,
  atomic_t is insufficient to cover edge cases: there can be a race
  condition in which 1/ a first process fails to allocate URBs
  memory 2/ a second process enters es58x_open() before the first
  process does its cleanup and decrements opened_channed_cnt. In
  which case, the second process would successfully return despite
  the URBs memory not being allocated.

- actually, any kind of locking mechanism was useless here because
  it is redundant with the network stack big kernel lock
  (a.k.a. rtnl_lock) which is being hold by all the callers of
  net_device_ops:ndo_open() and net_device_ops:ndo_close(). c.f. the
  ASSERST_RTNL() calls in __dev_open() [2] and __dev_close_many()
  [3].

The atmomic_t is thus replaced by a simple u8 type and the logic to
increment and decrement es58x_device:opened_channel_cnt is simplified
accordingly fixing the bug reported in [1]. We do not check again for
ASSERST_RTNL() as this is already done by the callers.

[1] https://lore.kernel.org/linux-can/20220201140351.GA2548@kili/T/#u
[2] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1463
[3] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1541

Fixes: 8537257874 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces")
Link: https://lore.kernel.org/all/20220212112713.577957-1-mailhol.vincent@wanadoo.fr
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-25 09:14:34 +01:00
Jakub Kicinski
a6df953f01 Merge branch 'mptcp-fixes-for-5-17'
Mat Martineau says:

====================
mptcp: Fixes for 5.17

Patch 1 fixes an issue with the SIOCOUTQ ioctl in MPTCP sockets that
have performed a fallback to TCP.

Patch 2 is a selftest fix to correctly remove temp files.

Patch 3 fixes a shift-out-of-bounds issue found by syzkaller.
====================

Link: https://lore.kernel.org/r/20220225005259.318898-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:54:57 -08:00
Mat Martineau
877d11f033 mptcp: Correctly set DATA_FIN timeout when number of retransmits is large
Syzkaller with UBSAN uncovered a scenario where a large number of
DATA_FIN retransmits caused a shift-out-of-bounds in the DATA_FIN
timeout calculation:

================================================================================
UBSAN: shift-out-of-bounds in net/mptcp/protocol.c:470:29
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 1 PID: 13059 Comm: kworker/1:0 Not tainted 5.17.0-rc2-00630-g5fbf21c90c60 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: events mptcp_worker
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
 __ubsan_handle_shift_out_of_bounds.cold+0xb2/0x20e lib/ubsan.c:330
 mptcp_set_datafin_timeout net/mptcp/protocol.c:470 [inline]
 __mptcp_retrans.cold+0x72/0x77 net/mptcp/protocol.c:2445
 mptcp_worker+0x58a/0xa70 net/mptcp/protocol.c:2528
 process_one_work+0x9df/0x16d0 kernel/workqueue.c:2307
 worker_thread+0x95/0xe10 kernel/workqueue.c:2454
 kthread+0x2f4/0x3b0 kernel/kthread.c:377
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>
================================================================================

This change limits the maximum timeout by limiting the size of the
shift, which keeps all intermediate values in-bounds.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/259
Fixes: 6477dd39e6 ("mptcp: Retransmit DATA_FIN")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:54:54 -08:00
Paolo Abeni
63bb8239d8 selftests: mptcp: do complete cleanup at exit
After commit 05be5e273c ("selftests: mptcp: add disconnect tests")
the mptcp selftests leave behind a couple of tmp files after
each run. run_tests_disconnect() misnames a few variables used to
track them. Address the issue setting the appropriate global variables

Fixes: 05be5e273c ("selftests: mptcp: add disconnect tests")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:54:54 -08:00
Paolo Abeni
07c2c7a3b6 mptcp: accurate SIOCOUTQ for fallback socket
The MPTCP SIOCOUTQ implementation is not very accurate in
case of fallback: it only measures the data in the MPTCP-level
write queue, but it does not take in account the subflow
write queue utilization. In case of fallback the first can be
empty, while the latter is not.

The above produces sporadic self-tests issues and can foul
legit user-space application.

Fix the issue additionally querying the subflow in case of fallback.

Fixes: 644807e3e4 ("mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/260
Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:54:53 -08:00
Changbin Du
22e2100b1b riscv: fix oops caused by irqsoff latency tracer
The trace_hardirqs_{on,off}() require the caller to setup frame pointer
properly. This because these two functions use macro 'CALLER_ADDR1' (aka.
__builtin_return_address(1)) to acquire caller info. If the $fp is used
for other purpose, the code generated this macro (as below) could trigger
memory access fault.

   0xffffffff8011510e <+80>:    ld      a1,-16(s0)
   0xffffffff80115112 <+84>:    ld      s2,-8(a1)  # <-- paging fault here

The oops message during booting if compiled with 'irqoff' tracer enabled:
[    0.039615][    T0] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000f8
[    0.041925][    T0] Oops [#1]
[    0.042063][    T0] Modules linked in:
[    0.042864][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc1-00233-g9a20c48d1ed2 #29
[    0.043568][    T0] Hardware name: riscv-virtio,qemu (DT)
[    0.044343][    T0] epc : trace_hardirqs_on+0x56/0xe2
[    0.044601][    T0]  ra : restore_all+0x12/0x6e
[    0.044721][    T0] epc : ffffffff80126a5c ra : ffffffff80003b94 sp : ffffffff81403db0
[    0.044801][    T0]  gp : ffffffff8163acd8 tp : ffffffff81414880 t0 : 0000000000000020
[    0.044882][    T0]  t1 : 0098968000000000 t2 : 0000000000000000 s0 : ffffffff81403de0
[    0.044967][    T0]  s1 : 0000000000000000 a0 : 0000000000000001 a1 : 0000000000000100
[    0.045046][    T0]  a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
[    0.045124][    T0]  a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000054494d45
[    0.045210][    T0]  s2 : ffffffff80003b94 s3 : ffffffff81a8f1b0 s4 : ffffffff80e27b50
[    0.045289][    T0]  s5 : ffffffff81414880 s6 : ffffffff8160fa00 s7 : 00000000800120e8
[    0.045389][    T0]  s8 : 0000000080013100 s9 : 000000000000007f s10: 0000000000000000
[    0.045474][    T0]  s11: 0000000000000000 t3 : 7fffffffffffffff t4 : 0000000000000000
[    0.045548][    T0]  t5 : 0000000000000000 t6 : ffffffff814aa368
[    0.045620][    T0] status: 0000000200000100 badaddr: 00000000000000f8 cause: 000000000000000d
[    0.046402][    T0] [<ffffffff80003b94>] restore_all+0x12/0x6e

This because the $fp(aka. $s0) register is not used as frame pointer in the
assembly entry code.

	resume_kernel:
		REG_L s0, TASK_TI_PREEMPT_COUNT(tp)
		bnez s0, restore_all
		REG_L s0, TASK_TI_FLAGS(tp)
                andi s0, s0, _TIF_NEED_RESCHED
                beqz s0, restore_all
                call preempt_schedule_irq
                j restore_all

To fix above issue, here we add one extra level wrapper for function
trace_hardirqs_{on,off}() so they can be safely called by low level entry
code.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Fixes: 3c46979829 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-24 20:30:30 -08:00
Damien Le Moal
762e52f79c riscv: fix nommu_k210_sdcard_defconfig
Instead of an arbitrary delay, use the "rootwait" kernel option to wait
for the mmc root device to be ready.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Fixes: 7e09fd3994 ("riscv: Add Canaan Kendryte K210 SD card defconfig")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-24 19:22:55 -08:00
Daniel Bristot de Oliveira
ce33c845b0 tracing: Dump stacktrace trigger to the corresponding instance
The stacktrace event trigger is not dumping the stacktrace to the instance
where it was enabled, but to the global "instance."

Use the private_data, pointing to the trigger file, to figure out the
corresponding trace instance, and use it in the trigger action, like
snapshot_trigger does.

Link: https://lkml.kernel.org/r/afbb0b4f18ba92c276865bc97204d438473f4ebc.1645396236.git.bristot@kernel.org

Cc: stable@vger.kernel.org
Fixes: ae63b31e4d ("tracing: Separate out trace events from global variables")
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-24 22:07:03 -05:00
Jakub Kicinski
8a7271000b Merge tag 'for-net-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - Fix regression with RFCOMM
 - Fix regression with LE devices using Privacy (RPA)
 - Fix regression with LE devices not waiting proper timeout to
   establish connections
 - Fix race in smp

* tag 'for-net-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: hci_sync: Fix not using conn_timeout
  Bluetooth: hci_sync: Fix hci_update_accept_list_sync
  Bluetooth: assign len after null check
  Bluetooth: Fix bt_skb_sendmmsg not allocating partial chunks
  Bluetooth: fix data races in smp_unregister(), smp_del_chan()
  Bluetooth: hci_core: Fix leaking sent_cmd skb
====================

Link: https://lore.kernel.org/r/20220224210838.197787-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 18:13:31 -08:00
Linus Torvalds
53ab78cd6d Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "A couple driver fixes in the clk subsystem

   - Fix a hang due to bad clk parent in the Ingenic jz4725b driver

   - Fix SD controllers on Qualcomm MSM8994 SoCs by removing clks that
     shouldn't be touched"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: jz4725b: fix mmc0 clock gating
  clk: qcom: gcc-msm8994: Remove NoC clocks
2022-02-24 17:35:22 -08:00
Linus Torvalds
5ee3d0015a Merge tag 'drm-fixes-2022-02-25' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Regular drm fixes pull, i915, amdgpu and tegra mostly, all pretty
  small.

  core:
   - edid: Always set RGB444

  tegra:
   - tegra186 suspend/resume fixes
   - syncpoint wait fix
   - build warning fix
   - eDP on older devices fix

  amdgpu:
   - Display FP fix
   - PCO powergating fix
   - RDNA2 OEM SKU stability fixes
   - Display PSR fix
   - PCI ASPM fix
   - Display link encoder fix for TEST_COMMIT
   - Raven2 suspend/resume fix
   - Fix a regression in virtual display support
   - GPUVM eviction fix

  i915:
   - Fix QGV handling on ADL-P+
   - Fix bw atomic check when switching between SAGV vs. no SAGV
   - Disconnect PHYs left connected by BIOS on disabled ports
   - Fix SAVG to no SAGV transitions on TGL+
   - Print PHY name properly on calibration error (DG2)

  imx:
   - dcss: Select GEM CMA helpers

  radeon:
   - Fix some variables's type

  vc4:
   - Fix codec cleanup
   - Fix PM reference counting"

* tag 'drm-fixes-2022-02-25' of git://anongit.freedesktop.org/drm/drm: (24 commits)
  drm/amdgpu: check vm ready by amdgpu_vm->evicting flag
  drm/amdgpu: bypass tiling flag check in virtual display case (v2)
  Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()"
  drm/amdgpu: do not enable asic reset for raven2
  drm/amd/display: Fix stream->link_enc unassigned during stream removal
  drm/amd: Check if ASPM is enabled from PCIe subsystem
  drm/edid: Always set RGB444
  drm/tegra: dpaux: Populate AUX bus
  drm/radeon: fix variable type
  drm/amd/display: For vblank_disable_immediate, check PSR is really used
  drm/amd/pm: fix some OEM SKU specific stability issues
  drm/amdgpu: disable MMHUB PG for Picasso
  drm/amd/display: Protect update_bw_bounding_box FPU code.
  drm/i915/dg2: Print PHY name properly on calibration error
  drm/i915: Fix bw atomic check when switching between SAGV vs. no SAGV
  drm/i915: Correctly populate use_sagv_wm for all pipes
  drm/i915: Disconnect PHYs left connected by BIOS on disabled ports
  drm/i915: Widen the QGV point mask
  drm/imx/dcss: i.MX8MQ DCSS select DRM_GEM_CMA_HELPER
  drm/vc4: crtc: Fix runtime_pm reference counting
  ...
2022-02-24 17:29:26 -08:00
Horatiu Vultur
aa091a6a91 clk: lan966x: Fix linking error
If the config options HAS_IOMEM is not set then the driver fails to link
with the following error:
clk-lan966x.c:(.text+0x950): undefined reference to
`devm_platform_ioremap_resource'

Therefor add missing dependencies: HAS_IOMEM and OF.

Fixes: 54104ee023 ("clk: lan966x: Add lan966x SoC clock driver")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20220219141536.460812-1-horatiu.vultur@microchip.com
Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-02-24 16:53:24 -08:00
Marek Szyprowski
4188db2328 drm/exynos: Search for TE-gpio in DSI panel's node
TE-gpio, if defined, is placed in the panel's node, not the parent DSI
node. Change the devm_gpiod_get_optional() to gpiod_get_optional() and
pass proper device node to it. The code already has a proper cleanup
path, so it looks that the devm_* variant has been applied accidentally
during the conversion to gpiod API.

Fixes: ee6c8b5afa ("drm/exynos: Replace legacy gpio interface for gpiod interface")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixed a typo.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25 09:50:48 +09:00
Marek Szyprowski
0a6e8d0a6d drm/exynos: Don't fail if no TE-gpio is defined for DSI driver
TE-gpio is optional and if it is not found then gpiod_get_optional()
returns NULL. In such case the code will continue and try to convert NULL
gpiod to irq what in turn fails. The failure is then propagated and driver
is not registered.

Fix this by returning early from exynos_dsi_register_te_irq() if no
TE-gpio is found.

Fixes: ee6c8b5afa ("drm/exynos: Replace legacy gpio interface for gpiod interface")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25 09:50:47 +09:00
Lad Prabhakar
586d090245 drm/exynos: gsc: Use platform_get_irq() to get the interrupt
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypassed the hierarchical setup and messed up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq().

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25 09:50:47 +09:00
Lad Prabhakar
be0a3b7e2a drm/exynos/fimc: Use platform_get_irq() to get the interrupt
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypassed the hierarchical setup and messed up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq().

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25 09:50:47 +09:00
Lad Prabhakar
b342c1f335 drm/exynos/exynos_drm_fimd: Use platform_get_irq_byname() to get the interrupt
platform_get_resource_byname(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypassed the hierarchical setup and messed up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq_byname().

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25 09:50:47 +09:00
Lad Prabhakar
be52abd4d2 drm/exynos: mixer: Use platform_get_irq() to get the interrupt
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypassed the hierarchical setup and messed up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq().

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25 09:50:47 +09:00
Lad Prabhakar
0d22b03166 drm/exynos/exynos7_drm_decon: Use platform_get_irq_byname() to get the interrupt
platform_get_resource_byname(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypassed the hierarchical setup and messed up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq_byname().

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-02-25 09:50:47 +09:00
Taniya Das
6e6fec3f96 clk: qcom: dispcc: Update the transition delay for MDSS GDSC
On SC7180 we observe black screens because the gdsc is being
enabled/disabled very rapidly and the GDSC FSM state does not work as
expected. This is due to the fact that the GDSC reset value is being
updated from SW.

The recommended transition delay for mdss core gdsc updated for
SC7180/SC7280/SM8250.

Fixes: dd3d066221 ("clk: qcom: Add display clock controller driver for SC7180")
Fixes: 1a00c962f9 ("clk: qcom: Add display clock controller driver for SC7280")
Fixes: 80a18f4a85 ("clk: qcom: Add display clock controller driver for SM8150 and SM8250")
Signed-off-by: Taniya Das <tdas@codeaurora.org>
Link: https://lore.kernel.org/r/20220223185606.3941-2-tdas@codeaurora.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[sboyd@kernel.org: lowercase hex]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-02-24 16:22:11 -08:00
Taniya Das
4e7c4d3652 clk: qcom: gdsc: Add support to update GDSC transition delay
GDSCs have multiple transition delays which are used for the GDSC FSM
states. Older targets/designs required these values to be updated from
gdsc code to certain default values for the FSM state to work as
expected. But on the newer targets/designs the values updated from the
GDSC driver can hamper the FSM state to not work as expected.

On SC7180 we observe black screens because the gdsc is being
enabled/disabled very rapidly and the GDSC FSM state does not work as
expected. This is due to the fact that the GDSC reset value is being
updated from SW.

Thus add support to update the transition delay from the clock
controller gdscs as required.

Fixes: 45dd0e5531 ("clk: qcom: Add support for GDSCs)
Signed-off-by: Taniya Das <tdas@codeaurora.org>
Link: https://lore.kernel.org/r/20220223185606.3941-1-tdas@codeaurora.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-02-24 16:11:42 -08:00
Linus Torvalds
7ee022567b Merge tag 'perf-tools-fixes-for-v5.17-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix double free in in the error path when opening perf.data from
   multiple files in a directory instead of from a single file

 - Sync the msr-index.h copy with the kernel sources

 - Fix error when printing 'weight' field in 'perf script'

 - Skip failing sigtrap test for arm+aarch64 in 'perf test'

 - Fix failure to use a cpu list for uncore events in hybrid systems,
   e.g. Intel Alder Lake

* tag 'perf-tools-fixes-for-v5.17-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf script: Fix error when printing 'weight' field
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf data: Fix double free in perf_session__delete()
  perf evlist: Fix failed to use cpu list for uncore events
  perf test: Skip failing sigtrap test for arm+aarch64
2022-02-24 14:36:38 -08:00
Linus Torvalds
1f840c0ef4 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "x86 host:

   - Expose KVM_CAP_ENABLE_CAP since it is supported

   - Disable KVM_HC_CLOCK_PAIRING in TSC catchup mode

   - Ensure async page fault token is nonzero

   - Fix lockdep false negative

   - Fix FPU migration regression from the AMX changes

  x86 guest:

   - Don't use PV TLB/IPI/yield on uniprocessor guests

  PPC:

   - reserve capability id (topic branch for ppc/kvm)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: nSVM: disallow userspace setting of MSR_AMD64_TSC_RATIO to non default value when tsc scaling disabled
  KVM: x86/mmu: make apf token non-zero to fix bug
  KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3
  x86/kvm: Don't use pv tlb/ipi/sched_yield if on 1 vCPU
  x86/kvm: Fix compilation warning in non-x86_64 builds
  x86/kvm/fpu: Remove kvm_vcpu_arch.guest_supported_xcr0
  x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0
  kvm: x86: Disable KVM_HC_CLOCK_PAIRING if tsc is in always catchup mode
  KVM: Fix lockdep false negative during host resume
  KVM: x86: Add KVM_CAP_ENABLE_CAP to x86
2022-02-24 14:05:49 -08:00
Arnd Bergmann
3f96885eb7 Merge tag 'imx-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 5.17, round 2:

- Drop reset signal from i.MX8MM vpumix power domain to fix a system
  hang.
- Fix a dtbs_check warning caused by #thermal-sensor-cells in i.MX8ULP
  device tree.
- Fix a clock disabling imbalance in gpcv2 driver.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-24 22:50:26 +01:00
Arnd Bergmann
31c50bf184 Merge tag 'tegra-for-5.17-arm-dt-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/fixes
ARM: tegra: Device tree fixes for v5.17-rc6

This contains fixes for the eDP panel found on the Venice 2 and Nyan
boards.

* tag 'tegra-for-5.17-arm-dt-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  ARM: tegra: Move panels to AUX bus

Link: https://lore.kernel.org/r/20220223162209.293722-1-thierry.reding@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-24 22:48:00 +01:00
Arnd Bergmann
795a2ab1da Merge tag 'v5.17-rockchip-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes
Fix the display-port-sound on Gru devices, DDR voltage on the Quartz-A
board, fix emmc signal-integrity and usb OTG mode on rk3399-puma as well
as a number of dtschema fixes to make the reduce the number of errors.

* tag 'v5.17-rockchip-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  ARM: dts: rockchip: fix a typo on rk3288 crypto-controller
  ARM: dts: rockchip: reorder rk322x hmdi clocks
  arm64: dts: rockchip: reorder rk3399 hdmi clocks
  arm64: dts: rockchip: align pl330 node name with dtschema
  arm64: dts: rockchip: fix rk3399-puma eMMC HS400 signal integrity
  arm64: dts: rockchip: fix Quartz64-A ddr regulator voltage
  arm64: dts: rockchip: Switch RK3399-Gru DP to SPDIF output
  arm64: dts: rockchip: fix rk3399-puma-haikou USB OTG mode
  arm64: dts: rockchip: drop pclk_xpcs from gmac0 on rk3568
  arm64: dts: rockchip: fix dma-controller node names on rk356x

Link: https://lore.kernel.org/r/1973741.CViHJPHrxy@phil
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-24 22:46:59 +01:00
Linus Torvalds
d8152cfe2f Merge tag 'pci-v5.17-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fixes from Bjorn Helgaas:

 - Fix a merge error that broke PCI device enumeration on mvebu
   platforms, including Turris Omnia (Armada 385) (Pali Rohár)

 - Avoid using ATS on all AMD Navi10 and Navi14 GPUs because some
   VBIOSes don't account for "harvested" (disabled) parts of the chip
   when initializing caches (Alex Deucher)

* tag 'pci-v5.17-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Mark all AMD Navi10 and Navi14 GPU ATS as broken
  PCI: mvebu: Fix device enumeration regression
2022-02-24 13:19:57 -08:00
Linus Torvalds
f672ff9123 Merge tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf and netfilter.

  Current release - regressions:

   - bpf: fix crash due to out of bounds access into reg2btf_ids

   - mvpp2: always set port pcs ops, avoid null-deref

   - eth: marvell: fix driver load from initrd

   - eth: intel: revert "Fix reset bw limit when DCB enabled with 1 TC"

  Current release - new code bugs:

   - mptcp: fix race in overlapping signal events

  Previous releases - regressions:

   - xen-netback: revert hotplug-status changes causing devices to not
     be configured

   - dsa:
      - avoid call to __dev_set_promiscuity() while rtnl_mutex isn't
        held
      - fix panic when removing unoffloaded port from bridge

   - dsa: microchip: fix bridging with more than two member ports

  Previous releases - always broken:

   - bpf:
      - fix crash due to incorrect copy_map_value when both spin lock
        and timer are present in a single value
      - fix a bpf_timer initialization issue with clang
      - do not try bpf_msg_push_data with len 0
      - add schedule points in batch ops

   - nf_tables:
      - unregister flowtable hooks on netns exit
      - correct flow offload action array size
      - fix a couple of memory leaks

   - vsock: don't check owner in vhost_vsock_stop() while releasing

   - gso: do not skip outer ip header in case of ipip and net_failover

   - smc: use a mutex for locking "struct smc_pnettable"

   - openvswitch: fix setting ipv6 fields causing hw csum failure

   - mptcp: fix race in incoming ADD_ADDR option processing

   - sysfs: add check for netdevice being present to speed_show

   - sched: act_ct: fix flow table lookup after ct clear or switching
     zones

   - eth: intel: fixes for SR-IOV forwarding offloads

   - eth: broadcom: fixes for selftests and error recovery

   - eth: mellanox: flow steering and SR-IOV forwarding fixes

  Misc:

   - make __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor
     friends not report freed skbs as drops

   - force inlining of checksum functions in net/checksum.h"

* tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
  net: mv643xx_eth: process retval from of_get_mac_address
  ping: remove pr_err from ping_lookup
  Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC"
  openvswitch: Fix setting ipv6 fields causing hw csum failure
  ipv6: prevent a possible race condition with lifetimes
  net/smc: Use a mutex for locking "struct smc_pnettable"
  bnx2x: fix driver load from initrd
  Revert "xen-netback: Check for hotplug-status existence before watching"
  Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"
  net/mlx5e: Fix VF min/max rate parameters interchange mistake
  net/mlx5e: Add missing increment of count
  net/mlx5e: MPLSoUDP decap, fix check for unsupported matches
  net/mlx5e: Fix MPLSoUDP encap to use MPLS action information
  net/mlx5e: Add feature check for set fec counters
  net/mlx5e: TC, Skip redundant ct clear actions
  net/mlx5e: TC, Reject rules with forward and drop actions
  net/mlx5e: TC, Reject rules with drop and modify hdr action
  net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets
  net/mlx5e: Fix wrong return value on ioctl EEPROM query failure
  net/mlx5: Fix possible deadlock on rule deletion
  ...
2022-02-24 12:45:32 -08:00
Luiz Augusto von Dentz
a56a1138cb Bluetooth: hci_sync: Fix not using conn_timeout
When using hci_le_create_conn_sync it shall wait for the conn_timeout
since the connection complete may take longer than just 2 seconds.

Also fix the masking of HCI_EV_LE_ENHANCED_CONN_COMPLETE and
HCI_EV_LE_CONN_COMPLETE so they are never both set so we can predict
which one the controller will use in case of HCI_OP_LE_CREATE_CONN.

Fixes: 6cd29ec6ae ("Bluetooth: hci_sync: Wait for proper events when connecting LE")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-02-24 21:34:28 +01:00
Luiz Augusto von Dentz
80740ebb7e Bluetooth: hci_sync: Fix hci_update_accept_list_sync
hci_update_accept_list_sync is returning the filter based on the error
but that gets overwritten by hci_le_set_addr_resolution_enable_sync
return instead of using the actual result of the likes of
hci_le_add_accept_list_sync which was intended.

Fixes: ad383c2c65 ("Bluetooth: hci_sync: Enable advertising when LL privacy is enabled")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-02-24 21:07:01 +01:00
Wang Qing
2e8ecb4bbc Bluetooth: assign len after null check
len should be assigned after a null check

Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-02-24 21:05:21 +01:00
Luiz Augusto von Dentz
29fb608396 Bluetooth: Fix bt_skb_sendmmsg not allocating partial chunks
Since bt_skb_sendmmsg can be used with the likes of SOCK_STREAM it
shall return the partial chunks it could allocate instead of freeing
everything as otherwise it can cause problems like bellow.

Fixes: 81be03e026 ("Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/r/d7206e12-1b99-c3be-84f4-df22af427ef5@molgen.mpg.de
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215594
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> (Nokia N9 (MeeGo/Harmattan)
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-02-24 21:05:21 +01:00
Lin Ma
fa78d2d1d6 Bluetooth: fix data races in smp_unregister(), smp_del_chan()
Previous commit e04480920d ("Bluetooth: defer cleanup of resources
in hci_unregister_dev()") defers all destructive actions to
hci_release_dev() to prevent cocurrent problems like NPD, UAF.

However, there are still some exceptions that are ignored.

The smp_unregister() in hci_dev_close_sync() (previously in
hci_dev_do_close) will release resources like the sensitive channel
and the smp_dev objects. Consider the situations the device is detaching
or power down while the kernel is still operating on it, the following
data race could take place.

thread-A  hci_dev_close_sync  | thread-B  read_local_oob_ext_data
                              |
hci_dev_unlock()              |
...                           | hci_dev_lock()
if (hdev->smp_data)           |
  chan = hdev->smp_data       |
                              | chan = hdev->smp_data (3)
                              |
  hdev->smp_data = NULL (1)   | if (!chan || !chan->data) (4)
  ...                         |
  smp = chan->data            | smp = chan->data
  if (smp)                    |
    chan->data = NULL (2)     |
    ...                       |
    kfree_sensitive(smp)      |
                              | // dereference smp trigger UFA

That is, the objects hdev->smp_data and chan->data both suffer from the
data races. In a preempt-enable kernel, the above schedule (when (3) is
before (1) and (4) is before (2)) leads to UAF bugs. It can be
reproduced in the latest kernel and below is part of the report:

[   49.097146] ================================================================
[   49.097611] BUG: KASAN: use-after-free in smp_generate_oob+0x2dd/0x570
[   49.097611] Read of size 8 at addr ffff888006528360 by task generate_oob/155
[   49.097611]
[   49.097611] Call Trace:
[   49.097611]  <TASK>
[   49.097611]  dump_stack_lvl+0x34/0x44
[   49.097611]  print_address_description.constprop.0+0x1f/0x150
[   49.097611]  ? smp_generate_oob+0x2dd/0x570
[   49.097611]  ? smp_generate_oob+0x2dd/0x570
[   49.097611]  kasan_report.cold+0x7f/0x11b
[   49.097611]  ? smp_generate_oob+0x2dd/0x570
[   49.097611]  smp_generate_oob+0x2dd/0x570
[   49.097611]  read_local_oob_ext_data+0x689/0xc30
[   49.097611]  ? hci_event_packet+0xc80/0xc80
[   49.097611]  ? sysvec_apic_timer_interrupt+0x9b/0xc0
[   49.097611]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[   49.097611]  ? mgmt_init_hdev+0x1c/0x240
[   49.097611]  ? mgmt_init_hdev+0x28/0x240
[   49.097611]  hci_sock_sendmsg+0x1880/0x1e70
[   49.097611]  ? create_monitor_event+0x890/0x890
[   49.097611]  ? create_monitor_event+0x890/0x890
[   49.097611]  sock_sendmsg+0xdf/0x110
[   49.097611]  __sys_sendto+0x19e/0x270
[   49.097611]  ? __ia32_sys_getpeername+0xa0/0xa0
[   49.097611]  ? kernel_fpu_begin_mask+0x1c0/0x1c0
[   49.097611]  __x64_sys_sendto+0xd8/0x1b0
[   49.097611]  ? syscall_exit_to_user_mode+0x1d/0x40
[   49.097611]  do_syscall_64+0x3b/0x90
[   49.097611]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   49.097611] RIP: 0033:0x7f5a59f51f64
...
[   49.097611] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5a59f51f64
[   49.097611] RDX: 0000000000000007 RSI: 00007f5a59d6ac70 RDI: 0000000000000006
[   49.097611] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[   49.097611] R10: 0000000000000040 R11: 0000000000000246 R12: 00007ffec26916ee
[   49.097611] R13: 00007ffec26916ef R14: 00007f5a59d6afc0 R15: 00007f5a59d6b700

To solve these data races, this patch places the smp_unregister()
function in the protected area by the hci_dev_lock(). That is, the
smp_unregister() function can not be concurrently executed when
operating functions (most of them are mgmt operations in mgmt.c) hold
the device lock.

This patch is tested with kernel LOCK DEBUGGING enabled. The price from
the extended holding time of the device lock is supposed to be low as the
smp_unregister() function is fairly short and efficient.

Signed-off-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-02-24 21:05:21 +01:00
Luiz Augusto von Dentz
dd3b1dc3dd Bluetooth: hci_core: Fix leaking sent_cmd skb
sent_cmd memory is not freed before freeing hci_dev causing it to leak
it contents.

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2022-02-24 21:05:21 +01:00
Dave Airlie
ecf8a99f48 Merge tag 'drm-intel-fixes-2022-02-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix QGV handling on ADL-P+ (Ville Syrjälä)
- Fix bw atomic check when switching between SAGV vs. no SAGV (Ville Syrjälä)
- Disconnect PHYs left connected by BIOS on disabled ports (Imre Deak)
- Fix SAVG to no SAGV transitions on TGL+ (Ville Syrjälä)
- Print PHY name properly on calibration error (DG2) (Matt Roper)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YhdyHwRWkOTWwlqi@tursulin-mobl2
2022-02-25 05:51:04 +10:00
Linus Torvalds
73878e5eb1 Merge tag 'block-5.17-2022-02-24' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - NVMe pull request:
    - send H2CData PDUs based on MAXH2CDATA (Varun Prakash)
    - fix passthrough to namespaces with unsupported features (Christoph
      Hellwig)

 - Clear iocb->private at poll completion (Stefano)

* tag 'block-5.17-2022-02-24' of git://git.kernel.dk/linux-block:
  nvme-tcp: send H2CData PDUs based on MAXH2CDATA
  nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info
  nvme: don't return an error from nvme_configure_metadata
  block: clear iocb->private in blkdev_bio_end_io_async()
2022-02-24 11:15:10 -08:00
Chuansheng Liu
3abea10e6a thermal: int340x: fix memory leak in int3400_notify()
It is easy to hit the below memory leaks in my TigerLake platform:

unreferenced object 0xffff927c8b91dbc0 (size 32):
  comm "kworker/0:2", pid 112, jiffies 4294893323 (age 83.604s)
  hex dump (first 32 bytes):
    4e 41 4d 45 3d 49 4e 54 33 34 30 30 20 54 68 65  NAME=INT3400 The
    72 6d 61 6c 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  rmal.kkkkkkkkkk.
  backtrace:
    [<ffffffff9c502c3e>] __kmalloc_track_caller+0x2fe/0x4a0
    [<ffffffff9c7b7c15>] kvasprintf+0x65/0xd0
    [<ffffffff9c7b7d6e>] kasprintf+0x4e/0x70
    [<ffffffffc04cb662>] int3400_notify+0x82/0x120 [int3400_thermal]
    [<ffffffff9c8b7358>] acpi_ev_notify_dispatch+0x54/0x71
    [<ffffffff9c88f1a7>] acpi_os_execute_deferred+0x17/0x30
    [<ffffffff9c2c2c0a>] process_one_work+0x21a/0x3f0
    [<ffffffff9c2c2e2a>] worker_thread+0x4a/0x3b0
    [<ffffffff9c2cb4dd>] kthread+0xfd/0x130
    [<ffffffff9c201c1f>] ret_from_fork+0x1f/0x30

Fix it by calling kfree() accordingly.

Fixes: 38e44da591 ("thermal: int3400_thermal: process "thermal table changed" event")
Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-24 20:14:19 +01:00
Linus Torvalds
3a5f59b17f Merge tag 'io_uring-5.17-2022-02-23' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:

 - Add a conditional schedule point in io_add_buffers() (Eric)

 - Fix for a quiesce speedup merged in this release (Dylan)

 - Don't convert to jiffies for event timeout waiting, it's way too
   coarse when we accept a timespec as input (me)

* tag 'io_uring-5.17-2022-02-23' of git://git.kernel.dk/linux-block:
  io_uring: disallow modification of rsrc_data during quiesce
  io_uring: don't convert to jiffies for waiting on timeouts
  io_uring: add a schedule point in io_add_buffers()
2022-02-24 11:08:15 -08:00
Rafael J. Wysocki
c5eb92f57d Merge branch 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull ARM cpufreq fixes for 5.18-rc6 from Viresh Kumar:

"This fixes issues related to throttle IRQ for Qcom SoCs."

* 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: qcom-hw: Delay enabling throttle_irq
  cpufreq: Reintroduce ready() callback
2022-02-24 19:54:59 +01:00
Linus Torvalds
6c528f34ca Merge tag 'platform-drivers-x86-v5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull more x86 platform driver fixes from Hans de Goede:
 "Two more fixes:

   - Fix suspend/resume regression on AMD Cezanne APUs in >= 5.16

   - Fix Microsoft Surface 3 battery readings"

* tag 'platform-drivers-x86-v5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  surface: surface3_power: Fix battery readings on batteries without a serial number
  platform/x86: amd-pmc: Set QOS during suspend on CZN w/ timer wakeup
2022-02-24 10:42:20 -08:00
Mauri Sandberg
42404d8f1c net: mv643xx_eth: process retval from of_get_mac_address
Obtaining a MAC address may be deferred in cases when the MAC is stored
in an NVMEM block, for example, and it may not be ready upon the first
retrieval attempt and return EPROBE_DEFER.

It is also possible that a port that does not rely on NVMEM has been
already created when getting the defer request. Thus, also the resources
allocated previously must be freed when doing a roll-back.

Fixes: 76723bca28 ("net: mv643xx_eth: add DT parsing support")
Signed-off-by: Mauri Sandberg <maukka@ext.kapsi.fi>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220223142337.41757-1-maukka@ext.kapsi.fi
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 10:05:08 -08:00
Maxim Levitsky
e910a53fb4 KVM: x86: nSVM: disallow userspace setting of MSR_AMD64_TSC_RATIO to non default value when tsc scaling disabled
If nested tsc scaling is disabled, MSR_AMD64_TSC_RATIO should
never have non default value.

Due to way nested tsc scaling support was implmented in qemu,
it would set this msr to 0 when nested tsc scaling was disabled.
Ignore that value for now, as it causes no harm.

Fixes: 5228eb96a4 ("KVM: x86: nSVM: implement nested TSC scaling")
Cc: stable@vger.kernel.org

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220223115649.319134-1-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-24 13:04:47 -05:00
Liang Zhang
6f3c1fc53d KVM: x86/mmu: make apf token non-zero to fix bug
In current async pagefault logic, when a page is ready, KVM relies on
kvm_arch_can_dequeue_async_page_present() to determine whether to deliver
a READY event to the Guest. This function test token value of struct
kvm_vcpu_pv_apf_data, which must be reset to zero by Guest kernel when a
READY event is finished by Guest. If value is zero meaning that a READY
event is done, so the KVM can deliver another.
But the kvm_arch_setup_async_pf() may produce a valid token with zero
value, which is confused with previous mention and may lead the loss of
this READY event.

This bug may cause task blocked forever in Guest:
 INFO: task stress:7532 blocked for more than 1254 seconds.
       Not tainted 5.10.0 #16
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:stress          state:D stack:    0 pid: 7532 ppid:  1409
 flags:0x00000080
 Call Trace:
  __schedule+0x1e7/0x650
  schedule+0x46/0xb0
  kvm_async_pf_task_wait_schedule+0xad/0xe0
  ? exit_to_user_mode_prepare+0x60/0x70
  __kvm_handle_async_pf+0x4f/0xb0
  ? asm_exc_page_fault+0x8/0x30
  exc_page_fault+0x6f/0x110
  ? asm_exc_page_fault+0x8/0x30
  asm_exc_page_fault+0x1e/0x30
 RIP: 0033:0x402d00
 RSP: 002b:00007ffd31912500 EFLAGS: 00010206
 RAX: 0000000000071000 RBX: ffffffffffffffff RCX: 00000000021a32b0
 RDX: 000000000007d011 RSI: 000000000007d000 RDI: 00000000021262b0
 RBP: 00000000021262b0 R08: 0000000000000003 R09: 0000000000000086
 R10: 00000000000000eb R11: 00007fefbdf2baa0 R12: 0000000000000000
 R13: 0000000000000002 R14: 000000000007d000 R15: 0000000000001000

Signed-off-by: Liang Zhang <zhangliang5@huawei.com>
Message-Id: <20220222031239.1076682-1-zhangliang5@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-24 13:04:46 -05:00
Xin Long
cd33bdcbea ping: remove pr_err from ping_lookup
As Jakub noticed, prints should be avoided on the datapath.
Also, as packets would never come to the else branch in
ping_lookup(), remove pr_err() from ping_lookup().

Fixes: 35a79e64de ("ping: fix the dif and sdif check in ping_lookup")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/1ef3f2fcd31bd681a193b1fcf235eee1603819bd.1645674068.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 09:18:29 -08:00
Mateusz Palczewski
fe20371578 Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC"
Revert of a patch that instead of fixing a AQ error when trying
to reset BW limit introduced several regressions related to
creation and managing TC. Currently there are errors when creating
a TC on both PF and VF.

Error log:
[17428.783095] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
[17428.783107] i40e 0000:3b:00.1: Failed configuring TC map 0 for VSI 391
[17428.783254] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
[17428.783259] i40e 0000:3b:00.1: Unable to  configure TC map 0 for VSI 391

This reverts commit 3d2504663c.

Fixes: 3d2504663c (i40e: Fix reset bw limit when DCB enabled with 1 TC)
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220223175347.1690692-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 09:16:57 -08:00
Paul Blakey
d9b5ae5c1b openvswitch: Fix setting ipv6 fields causing hw csum failure
Ipv6 ttl, label and tos fields are modified without first
pulling/pushing the ipv6 header, which would have updated
the hw csum (if available). This might cause csum validation
when sending the packet to the stack, as can be seen in
the trace below.

Fix this by updating skb->csum if available.

Trace resulted by ipv6 ttl dec and then sending packet
to conntrack [actions: set(ipv6(hlimit=63)),ct(zone=99)]:
[295241.900063] s_pf0vf2: hw csum failure
[295241.923191] Call Trace:
[295241.925728]  <IRQ>
[295241.927836]  dump_stack+0x5c/0x80
[295241.931240]  __skb_checksum_complete+0xac/0xc0
[295241.935778]  nf_conntrack_tcp_packet+0x398/0xba0 [nf_conntrack]
[295241.953030]  nf_conntrack_in+0x498/0x5e0 [nf_conntrack]
[295241.958344]  __ovs_ct_lookup+0xac/0x860 [openvswitch]
[295241.968532]  ovs_ct_execute+0x4a7/0x7c0 [openvswitch]
[295241.979167]  do_execute_actions+0x54a/0xaa0 [openvswitch]
[295242.001482]  ovs_execute_actions+0x48/0x100 [openvswitch]
[295242.006966]  ovs_dp_process_packet+0x96/0x1d0 [openvswitch]
[295242.012626]  ovs_vport_receive+0x6c/0xc0 [openvswitch]
[295242.028763]  netdev_frame_hook+0xc0/0x180 [openvswitch]
[295242.034074]  __netif_receive_skb_core+0x2ca/0xcb0
[295242.047498]  netif_receive_skb_internal+0x3e/0xc0
[295242.052291]  napi_gro_receive+0xba/0xe0
[295242.056231]  mlx5e_handle_rx_cqe_mpwrq_rep+0x12b/0x250 [mlx5_core]
[295242.062513]  mlx5e_poll_rx_cq+0xa0f/0xa30 [mlx5_core]
[295242.067669]  mlx5e_napi_poll+0xe1/0x6b0 [mlx5_core]
[295242.077958]  net_rx_action+0x149/0x3b0
[295242.086762]  __do_softirq+0xd7/0x2d6
[295242.090427]  irq_exit+0xf7/0x100
[295242.093748]  do_IRQ+0x7f/0xd0
[295242.096806]  common_interrupt+0xf/0xf
[295242.100559]  </IRQ>
[295242.102750] RIP: 0033:0x7f9022e88cbd
[295242.125246] RSP: 002b:00007f9022282b20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
[295242.132900] RAX: 0000000000000005 RBX: 0000000000000010 RCX: 0000000000000000
[295242.140120] RDX: 00007f9022282ba8 RSI: 00007f9022282a30 RDI: 00007f9014005c30
[295242.147337] RBP: 00007f9014014d60 R08: 0000000000000020 R09: 00007f90254a8340
[295242.154557] R10: 00007f9022282a28 R11: 0000000000000246 R12: 0000000000000000
[295242.161775] R13: 00007f902308c000 R14: 000000000000002b R15: 00007f9022b71f40

Fixes: 3fdbd1ce11 ("openvswitch: add ipv6 'set' action")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Link: https://lore.kernel.org/r/20220223163416.24096-1-paulb@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 09:16:21 -08:00
Niels Dossche
6c0d8833a6 ipv6: prevent a possible race condition with lifetimes
valid_lft, prefered_lft and tstamp are always accessed under the lock
"lock" in other places. Reading these without taking the lock may result
in inconsistencies regarding the calculation of the valid and preferred
variables since decisions are taken on these fields for those variables.

Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Niels Dossche <niels.dossche@ugent.be>
Link: https://lore.kernel.org/r/20220223131954.6570-1-niels.dossche@ugent.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 09:10:23 -08:00
Fabio M. De Francesco
7ff57e98fb net/smc: Use a mutex for locking "struct smc_pnettable"
smc_pnetid_by_table_ib() uses read_lock() and then it calls smc_pnet_apply_ib()
which, in turn, calls mutex_lock(&smc_ib_devices.mutex).

read_lock() disables preemption. Therefore, the code acquires a mutex while in
atomic context and it leads to a SAC bug.

Fix this bug by replacing the rwlock with a mutex.

Reported-and-tested-by: syzbot+4f322a6d84e991c38775@syzkaller.appspotmail.com
Fixes: 64e28b52c7 ("net/smc: add pnet table namespace support")
Confirmed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20220223100252.22562-1-fmdefrancesco@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 09:09:33 -08:00
Manish Chopra
e13ad14436 bnx2x: fix driver load from initrd
Commit b7a49f7305 ("bnx2x: Utilize firmware 7.13.21.0") added
new firmware support in the driver with maintaining older firmware
compatibility. However, older firmware was not added in MODULE_FIRMWARE()
which caused missing firmware files in initrd image leading to driver load
failure from initrd. This patch adds MODULE_FIRMWARE() for older firmware
version to have firmware files included in initrd.

Fixes: b7a49f7305 ("bnx2x: Utilize firmware 7.13.21.0")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215627
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Link: https://lore.kernel.org/r/20220223085720.12021-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 09:06:18 -08:00
Marek Marczykowski-Górecki
e8240addd0 Revert "xen-netback: Check for hotplug-status existence before watching"
This reverts commit 2afeec08ab.

The reasoning in the commit was wrong - the code expected to setup the
watch even if 'hotplug-status' didn't exist. In fact, it relied on the
watch being fired the first time - to check if maybe 'hotplug-status' is
already set to 'connected'. Not registering a watch for non-existing
path (which is the case if hotplug script hasn't been executed yet),
made the backend not waiting for the hotplug script to execute. This in
turns, made the netfront think the interface is fully operational, while
in fact it was not (the vif interface on xen-netback side might not be
configured yet).

This was a workaround for 'hotplug-status' erroneously being removed.
But since that is reverted now, the workaround is not necessary either.

More discussion at
https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Michael Brown <mbrown@fensystems.co.uk>
Link: https://lore.kernel.org/r/20220222001817.2264967-2-marmarek@invisiblethingslab.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 08:58:37 -08:00
Marek Marczykowski-Górecki
0f4558ae91 Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"
This reverts commit 1f2565780e.

The 'hotplug-status' node should not be removed as long as the vif
device remains configured. Otherwise the xen-netback would wait for
re-running the network script even if it was already called (in case of
the frontent re-connecting). But also, it _should_ be removed when the
vif device is destroyed (for example when unbinding the driver) -
otherwise hotplug script would not configure the device whenever it
re-appear.

Moving removal of the 'hotplug-status' node was a workaround for nothing
calling network script after xen-netback module is reloaded. But when
vif interface is re-created (on xen-netback unbind/bind for example),
the script should be called, regardless of who does that - currently
this case is not handled by the toolstack, and requires manual
script call. Keeping hotplug-status=connected to skip the call is wrong
and leads to not configured interface.

More discussion at
https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Link: https://lore.kernel.org/r/20220222001817.2264967-1-marmarek@invisiblethingslab.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 08:58:37 -08:00
Qu Wenruo
558732df21 btrfs: reduce extent threshold for autodefrag
There is a big gap between inode_should_defrag() and autodefrag extent
size threshold.  For inode_should_defrag() it has a flexible
@small_write value. For compressed extent is 16K, and for non-compressed
extent it's 64K.

However for autodefrag extent size threshold, it's always fixed to the
default value (256K).

This means, the following write sequence will trigger autodefrag to
defrag ranges which didn't trigger autodefrag:

  pwrite 0 8k
  sync
  pwrite 8k 128K
  sync

The latter 128K write will also be considered as a defrag target (if
other conditions are met). While only that 8K write is really
triggering autodefrag.

Such behavior can cause extra IO for autodefrag.

Close the gap, by copying the @small_write value into inode_defrag, so
that later autodefrag can use the same @small_write value which
triggered autodefrag.

With the existing transid value, this allows autodefrag really to scan
the ranges which triggered autodefrag.

Although this behavior change is mostly reducing the extent_thresh value
for autodefrag, I believe in the future we should allow users to specify
the autodefrag extent threshold through mount options, but that's an
other problem to consider in the future.

CC: stable@vger.kernel.org # 5.16+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-24 16:11:28 +01:00
James Morse
228a26b912 arm64: Use the clearbhb instruction in mitigations
Future CPUs may implement a clearbhb instruction that is sufficient
to mitigate SpectreBHB. CPUs that implement this instruction, but
not CSV2.3 must be affected by Spectre-BHB.

Add support to use this instruction as the BHB mitigation on CPUs
that support it. The instruction is in the hint space, so it will
be treated by a NOP as older CPUs.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-24 14:02:44 +00:00
Jens Axboe
b2750f1400 Merge tag 'nvme-5.17-2022-02-24' of git://git.infradead.org/nvme into block-5.17
Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.17

 - send H2CData PDUs based on MAXH2CDATA (Varun Prakash)
 - fix passthrough to namespaces with unsupported features (me)"

* tag 'nvme-5.17-2022-02-24' of git://git.infradead.org/nvme:
  nvme-tcp: send H2CData PDUs based on MAXH2CDATA
  nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info
  nvme: don't return an error from nvme_configure_metadata
2022-02-24 07:02:15 -07:00
James Morse
a5905d6af4 KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated
KVM allows the guest to discover whether the ARCH_WORKAROUND SMCCC are
implemented, and to preserve that state during migration through its
firmware register interface.

Add the necessary boiler plate for SMCCC_ARCH_WORKAROUND_3.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-24 13:58:52 +00:00
James Morse
558c303c97 arm64: Mitigate spectre style branch history side channels
Speculation attacks against some high-performance processors can
make use of branch history to influence future speculation.
When taking an exception from user-space, a sequence of branches
or a firmware call overwrites or invalidates the branch history.

The sequence of branches is added to the vectors, and should appear
before the first indirect branch. For systems using KPTI the sequence
is added to the kpti trampoline where it has a free register as the exit
from the trampoline is via a 'ret'. For systems not using KPTI, the same
register tricks are used to free up a register in the vectors.

For the firmware call, arch-workaround-3 clobbers 4 registers, so
there is no choice but to save them to the EL1 stack. This only happens
for entry from EL0, so if we take an exception due to the stack access,
it will not become re-entrant.

For KVM, the existing branch-predictor-hardening vectors are used.
When a spectre version of these vectors is in use, the firmware call
is sufficient to mitigate against Spectre-BHB. For the non-spectre
versions, the sequence of branches is added to the indirect vector.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-24 13:58:52 +00:00
Greg Kroah-Hartman
19eae24b76 Merge tag 'usb-serial-5.17-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:

USB-serial fixes for 5.17-rc6

Here's a revert of a commit which erroneously added a device id used for
the EPP/MEM mode of ch341 devices.

Included are also some new modem device ids.

All have been in linux-next with no reported issues.

* tag 'usb-serial-5.17-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: option: add Telit LE910R1 compositions
  USB: serial: option: add support for DW5829e
  Revert "USB: serial: ch341: add new Product ID for CH341A"
2022-02-24 14:51:45 +01:00
Hans de Goede
21d90aaee8 surface: surface3_power: Fix battery readings on batteries without a serial number
The battery on the 2nd hand Surface 3 which I recently bought appears to
not have a serial number programmed in. This results in any I2C reads from
the registers containing the serial number failing with an I2C NACK.

This was causing mshw0011_bix() to fail causing the battery readings to
not work at all.

Ignore EREMOTEIO (I2C NACK) errors when retrieving the serial number and
continue with an empty serial number to fix this.

Fixes: b1f81b496b ("platform/x86: surface3_power: MSHW0011 rev-eng implementation")
BugLink: https://github.com/linux-surface/linux-surface/issues/608
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220224101848.7219-1-hdegoede@redhat.com
2022-02-24 13:48:39 +01:00
Mario Limonciello
68af28426b platform/x86: amd-pmc: Set QOS during suspend on CZN w/ timer wakeup
commit 59348401eb ("platform/x86: amd-pmc: Add special handling for
timer based S0i3 wakeup") adds support for using another platform timer
in lieu of the RTC which doesn't work properly on some systems. This path
was validated and worked well before submission. During the 5.16-rc1 merge
window other patches were merged that caused this to stop working properly.

When this feature was used with 5.16-rc1 or later some OEM laptops with the
matching firmware requirements from that commit would shutdown instead of
program a timer based wakeup.

This was bisected to commit 8d89835b04 ("PM: suspend: Do not pause
cpuidle in the suspend-to-idle path").  This wasn't supposed to cause any
negative impacts and also tested well on both Intel and ARM platforms.
However this changed the semantics of when CPUs are allowed to be in the
deepest state. For the AMD systems in question it appears this causes a
firmware crash for timer based wakeup.

It's hypothesized to be caused by the `amd-pmc` driver sending `OS_HINT`
and all the CPUs going into a deep state while the timer is still being
programmed. It's likely a firmware bug, but to avoid it don't allow setting
CPUs into the deepest state while using CZN timer wakeup path.

If later it's discovered that this also occurs from "regular" suspends
without a timer as well or on other silicon, this may be later expanded to
run in the suspend path for more scenarios.

Cc: stable@vger.kernel.org # 5.16+
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/linux-acpi/BL1PR12MB51570F5BD05980A0DCA1F3F4E23A9@BL1PR12MB5157.namprd12.prod.outlook.com/T/#mee35f39c41a04b624700ab2621c795367f19c90e
Fixes: 8d89835b04 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path")
Fixes: 23f62d7ab2 ("PM: sleep: Pause cpuidle later and resume it earlier during system transitions")
Fixes: 59348401eb ("platform/x86: amd-pmc: Add special handling for timer based S0i3 wakeup"
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20220223175237.6209-1-mario.limonciello@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-02-24 13:48:34 +01:00
Daehwan Jung
aaaba1c86d usb: gadget: rndis: add spinlock for rndis response list
There's no lock for rndis response list. It could cause list corruption
if there're two different list_add at the same time like below.
It's better to add in rndis_add_response / rndis_free_response
/ rndis_get_next_response to prevent any race condition on response list.

[  361.894299] [1:   irq/191-dwc3:16979] list_add corruption.
next->prev should be prev (ffffff80651764d0),
but was ffffff883dc36f80. (next=ffffff80651764d0).

[  361.904380] [1:   irq/191-dwc3:16979] Call trace:
[  361.904391] [1:   irq/191-dwc3:16979]  __list_add_valid+0x74/0x90
[  361.904401] [1:   irq/191-dwc3:16979]  rndis_msg_parser+0x168/0x8c0
[  361.904409] [1:   irq/191-dwc3:16979]  rndis_command_complete+0x24/0x84
[  361.904417] [1:   irq/191-dwc3:16979]  usb_gadget_giveback_request+0x20/0xe4
[  361.904426] [1:   irq/191-dwc3:16979]  dwc3_gadget_giveback+0x44/0x60
[  361.904434] [1:   irq/191-dwc3:16979]  dwc3_ep0_complete_data+0x1e8/0x3a0
[  361.904442] [1:   irq/191-dwc3:16979]  dwc3_ep0_interrupt+0x29c/0x3dc
[  361.904450] [1:   irq/191-dwc3:16979]  dwc3_process_event_entry+0x78/0x6cc
[  361.904457] [1:   irq/191-dwc3:16979]  dwc3_process_event_buf+0xa0/0x1ec
[  361.904465] [1:   irq/191-dwc3:16979]  dwc3_thread_interrupt+0x34/0x5c

Fixes: f6281af9d6 ("usb: gadget: rndis: use list_for_each_entry_safe")
Cc: stable <stable@kernel.org>
Signed-off-by: Daehwan Jung <dh10.jung@samsung.com>
Link: https://lore.kernel.org/r/1645507768-77687-1-git-send-email-dh10.jung@samsung.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-24 11:16:57 +01:00
Sebastian Andrzej Siewior
84918a89d6 usb: dwc3: gadget: Let the interrupt handler disable bottom halves.
The interrupt service routine registered for the gadget is a primary
handler which mask the interrupt source and a threaded handler which
handles the source of the interrupt. Since the threaded handler is
voluntary threaded, the IRQ-core does not disable bottom halves before
invoke the handler like it does for the forced-threaded handler.

Due to changes in networking it became visible that a network gadget's
completions handler may schedule a softirq which remains unprocessed.
The gadget's completion handler is usually invoked either in hard-IRQ or
soft-IRQ context. In this context it is enough to just raise the softirq
because the softirq itself will be handled once that context is left.
In the case of the voluntary threaded handler, there is nothing that
will process pending softirqs. Which means it remain queued until
another random interrupt (on this CPU) fires and handles it on its exit
path or another thread locks and unlocks a lock with the bh suffix.
Worst case is that the CPU goes idle and the NOHZ complains about
unhandled softirqs.

Disable bottom halves before acquiring the lock (and disabling
interrupts) and enable them after dropping the lock. This ensures that
any pending softirqs will handled right away.

Link: https://lkml.kernel.org/r/c2a64979-73d1-2c22-e048-c275c9f81558@samsung.com
Fixes: e5f68b4a3e ("Revert "usb: dwc3: gadget: remove unnecessary _irqsave()"")
Cc: stable <stable@kernel.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/Yg/YPejVQH3KkRVd@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-24 11:16:38 +01:00
Szymon Heidrich
7f14c7227f USB: gadget: validate endpoint index for xilinx udc
Assure that host may not manipulate the index to point
past endpoint array.

Signed-off-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-24 11:00:07 +01:00
Jakub Kicinski
5facf49702 Merge tag 'mlx5-fixes-2022-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5 fixes 2022-02-22

This series provides bug fixes to mlx5 driver.
Please pull and let me know if there is any problem.

* tag 'mlx5-fixes-2022-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Fix VF min/max rate parameters interchange mistake
  net/mlx5e: Add missing increment of count
  net/mlx5e: MPLSoUDP decap, fix check for unsupported matches
  net/mlx5e: Fix MPLSoUDP encap to use MPLS action information
  net/mlx5e: Add feature check for set fec counters
  net/mlx5e: TC, Skip redundant ct clear actions
  net/mlx5e: TC, Reject rules with forward and drop actions
  net/mlx5e: TC, Reject rules with drop and modify hdr action
  net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets
  net/mlx5e: Fix wrong return value on ioctl EEPROM query failure
  net/mlx5: Fix possible deadlock on rule deletion
  net/mlx5: Fix tc max supported prio for nic mode
  net/mlx5: Fix wrong limitation of metadata match on ecpf
  net/mlx5: Update log_max_qp value to be 17 at most
  net/mlx5: DR, Fix the threshold that defines when pool sync is initiated
  net/mlx5: DR, Don't allow match on IP w/o matching on full ethertype/ip_version
  net/mlx5: DR, Fix slab-out-of-bounds in mlx5_cmd_dr_create_fte
  net/mlx5: DR, Cache STE shadow memory
  net/mlx5: Update the list of the PCI supported devices
====================

Link: https://lore.kernel.org/r/20220224001123.365265-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-23 20:30:01 -08:00
Dave Airlie
7c17b3d37f Merge tag 'amd-drm-fixes-5.17-2022-02-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.17-2022-02-23:

amdgpu:
- Display FP fix
- PCO powergating fix
- RDNA2 OEM SKU stability fixes
- Display PSR fix
- PCI ASPM fix
- Display link encoder fix for TEST_COMMIT
- Raven2 suspend/resume fix
- Fix a regression in virtual display support
- GPUVM eviction fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220223214623.28823-1-alexander.deucher@amd.com
2022-02-24 14:27:36 +10:00
Dave Airlie
0c3127933c Merge tag 'drm/tegra/for-5.17-rc6' of https://gitlab.freedesktop.org/drm/tegra into drm-fixes
drm/tegra: Fixes for v5.17-rc6

Contains a couple of fixes for Tegra186 suspend/resume, syncpoint
waiting, a build warning and eDP on older Tegra devices.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220223161903.293392-1-thierry.reding@gmail.com
2022-02-24 14:22:12 +10:00
Dave Airlie
753a64c779 Merge tag 'drm-misc-fixes-2022-02-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
* edid: Always set RGB444
 * imx/dcss: Select GEM CMA helpers
 * radeon: Fix some variables's type
 * vc4: Fix codec cleanup; Fix PM reference counting

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YhaKj4zWJ42YWRts@linux-uq9g.fritz.box
2022-02-24 13:53:13 +10:00
Arnaldo Carvalho de Melo
7414db4119 rtla: Fix systme -> system typo on man page
Link: https://lkml.kernel.org/r/YhZsZxqk+IaFxorj@kernel.org

Fixes: 496082df01 ("rtla: Add rtla osnoise man page")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-23 21:52:18 -05:00
Linus Torvalds
91318b29a8 Merge tag 'devicetree-fixes-for-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:

 - Update some maintainers email addresses

 - Fix handling of elfcorehdr reservation for crash dump kernel

 - Fix unittest expected warnings text

* tag 'devicetree-fixes-for-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: update Roger Quadros email
  MAINTAINERS: sifive: drop Yash Shah
  of/fdt: move elfcorehdr reservation early for crash dump kernel
  of: unittest: update text of expected warnings
2022-02-23 17:25:22 -08:00
Linus Torvalds
54134be658 Merge tag 'selinux-pr-20220223' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
 "A second small SELinux fix which addresses an incorrect
  mutex_is_locked() check"

* tag 'selinux-pr-20220223' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix misuse of mutex_is_locked()
2022-02-23 17:19:55 -08:00
Gal Pressman
ca49df96f9 net/mlx5e: Fix VF min/max rate parameters interchange mistake
The VF min and max rate were passed incorrectly and resulted in wrongly
interchanging them. Fix the order of parameters in
mlx5_esw_qos_set_vport_rate().

Fixes: d7df09f5e7 ("net/mlx5: E-switch, Enable vport QoS on demand")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:19 -08:00
Lama Kayal
5ee02b7a80 net/mlx5e: Add missing increment of count
Add mistakenly missing increment of count variable when looping over
output buffer in mlx5e_self_test().

This resolves the issue of garbage values output when querying with self
test via ethtool.

before:
$ ethtool -t eth2
The test result is PASS
The test extra info:
Link Test        0
Speed Test       1768697188
Health Test      758528120
Loopback Test    3288687

after:
$ ethtool -t eth2
The test result is PASS
The test extra info:
Link Test        0
Speed Test       0
Health Test      0
Loopback Test    0

Fixes: 7990b1b5e8 ("net/mlx5e: loopback test is not supported in switchdev mode")
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:19 -08:00
Maor Dickman
fdc18e4e4b net/mlx5e: MPLSoUDP decap, fix check for unsupported matches
Currently offload of rule on bareudp device require tunnel key
in order to match on mpls fields and without it the mpls fields
are ignored, this is incorrect due to the fact udp tunnel doesn't
have key to match on.

Fix by returning error in case flow is matching on tunnel key.

Fixes: 72046a91d1 ("net/mlx5e: Allow to match on mpls parameters")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:19 -08:00
Maor Dickman
c63741b426 net/mlx5e: Fix MPLSoUDP encap to use MPLS action information
Currently the MPLSoUDP encap builds the MPLS header using encap action
information (tunnel id, ttl and tos) instead of the MPLS action
information (label, ttl, tc and bos) which is wrong.

Fix by storing the MPLS action information during the flow action
parse and later using it to create the encap MPLS header.

Fixes: f828ca6a2f ("net/mlx5e: Add support for hw encapsulation of MPLS over UDP")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:18 -08:00
Lama Kayal
7fac052903 net/mlx5e: Add feature check for set fec counters
Fec counters support is checked via the PCAM feature_cap_mask,
bit 0: PPCNT_counter_group_Phy_statistical_counter_group.
Add feature check to avoid faulty behavior.

Fixes: 0a1498ebfa ("net/mlx5e: Expose FEC counters via ethtool")
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:18 -08:00
Roi Dayan
fb7e76ea3f net/mlx5e: TC, Skip redundant ct clear actions
Offload of ct clear action is just resetting the reg_c register.
It's done by allocating modify hdr resources which is limited.
Doing it multiple times is redundant and wasting modify hdr resources
and if resources depleted the driver will fail offloading the rule.
Ignore redundant ct clear actions after the first one.

Fixes: 806401c20a ("net/mlx5e: CT, Fix multiple allocations and memleak of mod acts")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:18 -08:00
Roi Dayan
3d65492a86 net/mlx5e: TC, Reject rules with forward and drop actions
Such rules are redundant but allowed and passed to the driver.
The driver does not support offloading such rules so return an error.

Fixes: 03a9d11e6e ("net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:17 -08:00
Roi Dayan
23216d387c net/mlx5e: TC, Reject rules with drop and modify hdr action
This kind of action is not supported by firmware and generates a
syndrome.

kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3)

Fixes: d7e75a325c ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:17 -08:00
Tariq Toukan
7eaf1f37b8 net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets
For RX TLS device-offloaded packets, the HW spec guarantees checksum
validation for the offloaded packets, but does not define whether the
CQE.checksum field matches the original packet (ciphertext) or
the decrypted one (plaintext). This latitude allows architetctural
improvements between generations of chips, resulting in different decisions
regarding the value type of CQE.checksum.

Hence, for these packets, the device driver should not make use of this CQE
field. Here we block CHECKSUM_COMPLETE usage for RX TLS device-offloaded
packets, and use CHECKSUM_UNNECESSARY instead.

Value of the packet's tcp_hdr.csum is not modified by the HW, and it always
matches the original ciphertext.

Fixes: 1182f36593 ("net/mlx5e: kTLS, Add kTLS RX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:16 -08:00
Gal Pressman
0b89429722 net/mlx5e: Fix wrong return value on ioctl EEPROM query failure
The ioctl EEPROM query wrongly returns success on read failures, fix
that by returning the appropriate error code.

Fixes: bb64143eee ("net/mlx5e: Add ethtool support for dump module EEPROM")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:15 -08:00
Maor Gottlieb
b645e57deb net/mlx5: Fix possible deadlock on rule deletion
Add missing call to up_write_ref_node() which releases the semaphore
in case the FTE doesn't have destinations, such in drop rule case.

Fixes: 465e7baab6 ("net/mlx5: Fix deletion of duplicate rules")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:14 -08:00
Chris Mi
be7f4b0ab1 net/mlx5: Fix tc max supported prio for nic mode
Only prio 1 is supported if firmware doesn't support ignore flow
level for nic mode. The offending commit removed the check wrongly.
Add it back.

Fixes: 9a99c8f125 ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:13 -08:00
Ariel Levkovich
07666c75ad net/mlx5: Fix wrong limitation of metadata match on ecpf
Match metadata support check returns false for ecpf device.
However, this support does exist for ecpf and therefore this
limitation should be removed to allow feature such as stacked
devices and internal port offloaded to be supported.

Fixes: 92ab1eb392 ("net/mlx5: E-Switch, Enable vport metadata matching if firmware supports it")
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:13 -08:00
Maher Sanalla
7f839965b2 net/mlx5: Update log_max_qp value to be 17 at most
Currently, log_max_qp value is dependent on what FW reports as its max capability.
In reality, due to a bug, some FWs report a value greater than 17, even though they
don't support log_max_qp > 17.

This FW issue led the driver to exhaust memory on startup.
Thus, log_max_qp value is set to be no more than 17 regardless
of what FW reports, as it was before the cited commit.

Fixes: f79a609ea6 ("net/mlx5: Update log_max_qp value to FW max capability")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:12 -08:00
Yevgeny Kliteynik
ecd9c5cd46 net/mlx5: DR, Fix the threshold that defines when pool sync is initiated
When deciding whether to start syncing and actually free all the "hot"
ICM chunks, we need to consider the type of the ICM chunks that we're
dealing with. For instance, the amount of available ICM for MODIFY_ACTION
is significantly lower than the usual STE ICM, so the threshold should
account for that - otherwise we can deplete MODIFY_ACTION memory just by
creating and deleting the same modify header action in a continuous loop.

This patch replaces the hard-coded threshold with a dynamic value.

Fixes: 1c58651412 ("net/mlx5: DR, ICM memory pools sync optimization")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:11 -08:00
Yevgeny Kliteynik
ffb0753b95 net/mlx5: DR, Don't allow match on IP w/o matching on full ethertype/ip_version
Currently SMFS allows adding rule with matching on src/dst IP w/o matching
on full ethertype or ip_version, which is not supported by HW.
This patch fixes this issue and adds the check as it is done in DMFS.

Fixes: 26d688e33f ("net/mlx5: DR, Add Steering entry (STE) utilities")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:10 -08:00
Yevgeny Kliteynik
0aec12d97b net/mlx5: DR, Fix slab-out-of-bounds in mlx5_cmd_dr_create_fte
When adding a rule with 32 destinations, we hit the following out-of-band
access issue:

  BUG: KASAN: slab-out-of-bounds in mlx5_cmd_dr_create_fte+0x18ee/0x1e70

This patch fixes the issue by both increasing the allocated buffers to
accommodate for the needed actions and by checking the number of actions
to prevent this issue when a rule with too many actions is provided.

Fixes: 1ffd498901 ("net/mlx5: DR, Increase supported num of actions to 32")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:10 -08:00
Yevgeny Kliteynik
e5b2bc30c2 net/mlx5: DR, Cache STE shadow memory
During rule insertion on each ICM memory chunk we also allocate shadow memory
used for management. This includes the hw_ste, dr_ste and miss list per entry.
Since the scale of these allocations is large we noticed a performance hiccup
that happens once malloc and free are stressed.
In extreme usecases when ~1M chunks are freed at once, it might take up to 40
seconds to complete this, up to the point the kernel sees this as self-detected
stall on CPU:

 rcu: INFO: rcu_sched self-detected stall on CPU

To resolve this we will increase the reuse of shadow memory.
Doing this we see that a time in the aforementioned usecase dropped from ~40
seconds to ~8-10 seconds.

Fixes: 29cf8febd1 ("net/mlx5: DR, ICM pool memory allocator")
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:09 -08:00
Meir Lichtinger
f908a35b22 net/mlx5: Update the list of the PCI supported devices
Add the upcoming BlueField-4 and ConnectX-8 device IDs.

Fixes: 2e9d3e83ab ("net/mlx5: Update the list of the PCI supported devices")
Signed-off-by: Meir Lichtinger <meirl@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 16:08:08 -08:00
Qiang Yu
c1a66c3bc4 drm/amdgpu: check vm ready by amdgpu_vm->evicting flag
Workstation application ANSA/META v21.1.4 get this error dmesg when
running CI test suite provided by ANSA/META:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by:
1. create a 256MB buffer in invisible VRAM
2. CPU map the buffer and access it causes vm_fault and try to move
   it to visible VRAM
3. force visible VRAM space and traverse all VRAM bos to check if
   evicting this bo is valuable
4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
   will set amdgpu_vm->evicting, but latter due to not in visible
   VRAM, won't really evict it so not add it to amdgpu_vm->evicted
5. before next CS to clear the amdgpu_vm->evicting, user VM ops
   ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
   but fail in amdgpu_vm_bo_update_mapping() (check
   amdgpu_vm->evicting) and get this error log

This error won't affect functionality as next CS will finish the
waiting VM ops. But we'd better clear the error log by checking
the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling
amdgpu_vm_bo_update_mapping() later.

Another reason is amdgpu_vm->evicted list holds all BOs (both
user buffer and page table), but only page table BOs' eviction
prevent VM ops. amdgpu_vm->evicting flag is set only for page
table BOs, so we should use evicting flag instead of evicted list
in amdgpu_vm_ready().

The side effect of this change is: previously blocked VM op (user
buffer in "evicted" list but no page table in it) gets done
immediately.

v2: update commit comments.

Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Qiang Yu <qiang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-02-23 16:31:06 -05:00
Guchun Chen
e2b993302f drm/amdgpu: bypass tiling flag check in virtual display case (v2)
vkms leverages common amdgpu framebuffer creation, and
also as it does not support FB modifier, there is no need
to check tiling flags when initing framebuffer when virtual
display is enabled.

This can fix below calltrace:

amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier
WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu]

v2: check adev->enable_virtual_display instead as vkms can be
	enabled in bare metal as well.

Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23 16:31:06 -05:00
Guchun Chen
97c61e0b7c Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()"
This reverts commit 4046afcebf.

No need to support modifier in virtual kms, otherwise, in SRIOV
mode, when lanuching X server, set crtc will fail due to mismatch
between primary plane modifier and framebuffer modifier.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23 16:31:06 -05:00
Chen Gong
1e2be869c8 drm/amdgpu: do not enable asic reset for raven2
The GPU reset function of raven2 is not maintained or tested, so it should be
very unstable.

Now the amdgpu_asic_reset function is added to amdgpu_pmops_suspend, which
causes the S3 test of raven2 to fail, so the asic_reset of raven2 is ignored
here.

Fixes: daf8de0874 ("drm/amdgpu: always reset the asic in suspend (v2)")
Signed-off-by: Chen Gong <curry.gong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-02-23 16:31:06 -05:00
Nicholas Kazlauskas
3743e7f6fc drm/amd/display: Fix stream->link_enc unassigned during stream removal
[Why]
Found when running igt@kms_atomic.

Userspace attempts to do a TEST_COMMIT when 0 streams which calls
dc_remove_stream_from_ctx. This in turn calls link_enc_unassign
which ends up modifying stream->link = NULL directly, causing the
global link_enc to be removed preventing further link activity
and future link validation from passing.

[How]
We take care of link_enc unassignment at the start of
link_enc_cfg_link_encs_assign so this call is no longer necessary.

Fixes global state from being modified while unlocked.

Reviewed-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-02-23 16:31:06 -05:00
Mario Limonciello
7294863a6f drm/amd: Check if ASPM is enabled from PCIe subsystem
commit 0064b0ce85 ("drm/amd/pm: enable ASPM by default") enabled ASPM
by default but a variety of hardware configurations it turns out that this
caused a regression.

* PPC64LE hardware does not support ASPM at a hardware level.
  CONFIG_PCIEASPM is often disabled on these architectures.
* Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem
  disables it

Check with the PCIe subsystem to see that ASPM has been enabled
or not.

Fixes: 0064b0ce85 ("drm/amd/pm: enable ASPM by default")
Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907
Tested-by: koba.ko@canonical.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-02-23 16:31:06 -05:00
Shreeya Patel
ae42f92888 gpio: Return EPROBE_DEFER if gc->to_irq is NULL
We are racing the registering of .to_irq when probing the
i2c driver. This results in random failure of touchscreen
devices.

Following explains the race condition better.

[gpio driver] gpio driver registers gpio chip
[gpio consumer] gpio is acquired
[gpio consumer] gpiod_to_irq() fails with -ENXIO
[gpio driver] gpio driver registers irqchip
gpiod_to_irq works at this point, but -ENXIO is fatal

We could see the following errors in dmesg logs when gc->to_irq is NULL

[2.101857] i2c_hid i2c-FTS3528:00: HID over i2c has not been provided an Int IRQ
[2.101953] i2c_hid: probe of i2c-FTS3528:00 failed with error -22

To avoid this situation, defer probing until to_irq is registered.
Returning -EPROBE_DEFER would be the first step towards avoiding
the failure of devices due to the race in registration of .to_irq.
Final solution to this issue would be to avoid using gc irq members
until they are fully initialized.

This issue has been reported many times in past and people have been
using workarounds like changing the pinctrl_amd to built-in instead
of loading it as a module or by adding a softdep for pinctrl_amd into
the config file.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209413
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-02-23 22:30:56 +01:00
Linus Torvalds
23d0432844 Merge tag 'for-5.17/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc unaligned handler fixes from Helge Deller:
 "Two patches which fix a few bugs in the unalignment handlers.

  The fldd and fstd instructions weren't handled at all on 32-bit
  kernels, the stw instruction didn't check for fault errors and the
  fldw_l and ldw_m were handled wrongly as integer vs floating point
  instructions.

  Both patches are tagged for stable series"

* tag 'for-5.17/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc/unaligned: Fix ldw() and stw() unalignment handlers
  parisc/unaligned: Fix fldd and fstd unaligned handlers on 32-bit kernel
2022-02-23 12:06:23 -08:00
Linus Torvalds
6f5738db96 Merge tag 'hwmon-for-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
 "Fix two old bugs and one new bug in the hwmon subsystem:

   - In pmbus core, clear pmbus fault/warning status bits after read to
     follow PMBus standard

   - In hwmon core, handle failure to register sensor with thermal zone
     correctly

   - In ntc_thermal driver, use valid thermistor names for Samsung
     thermistors"

* tag 'hwmon-for-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus) Clear pmbus fault/warning bits after read
  hwmon: Handle failure to register sensor with thermal zone correctly
  hwmon: (ntc_thermistor) Underscore Samsung thermistor
2022-02-23 11:51:35 -08:00
Linus Torvalds
4eb0a7c8e1 Merge tag 'slab-for-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:

 - Build fix (workaround) for clang.

 - Fix a /proc/kcore based slabinfo script broken by struct slab changes
   in 5.17-rc1.

* tag 'slab-for-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  tools/cgroup/slabinfo: update to work with struct slab
  slab: remove __alloc_size attribute from __kmalloc_track_caller
2022-02-23 11:33:12 -08:00
Bart Van Assche
081bdc9fe0 RDMA/ib_srp: Fix a deadlock
Remove the flush_workqueue(system_long_wq) call since flushing
system_long_wq is deadlock-prone and since that call is redundant with a
preceding cancel_work_sync()

Link: https://lore.kernel.org/r/20220215210511.28303-3-bvanassche@acm.org
Fixes: ef6c49d87c ("IB/srp: Eliminate state SRP_TARGET_DEAD")
Reported-by: syzbot+831661966588c802aae9@syzkaller.appspotmail.com
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 15:03:58 -04:00
Maxime Ripard
515415d316 ARM: boot: dts: bcm2711: Fix HVS register range
While the HVS has the same context memory size in the BCM2711 than in
the previous SoCs, the range allocated to the registers doubled and it
now takes 16k + 16k, compared to 8k + 16k before.

The KMS driver will use the whole context RAM though, eventually
resulting in a pointer dereference error when we access the higher half
of the context memory since it hasn't been mapped.

Fixes: 4564363351 ("ARM: dts: bcm2711: Enable the display pipeline")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2022-02-23 11:03:29 -08:00
Alex Deucher
3f1271b54e PCI: Mark all AMD Navi10 and Navi14 GPU ATS as broken
There are enough VBIOS escapes without the proper workaround that some
users still hit this.  Microsoft never productized ATS on Windows so OEM
platforms that were Windows-only didn't always validate ATS.

The advantages of ATS are not worth it compared to the potential
instabilities on harvested boards.  Disable ATS on all Navi10 and Navi14
boards.

Symptoms include:

  amdgpu 0000:07:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xffffc02000 flags=0x0000]
  AMD-Vi: Event logged [IO_PAGE_FAULT device=07:00.0 domain=0x0007 address=0xffffc02000 flags=0x0000]
  [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=6047, emitted seq=6049
  amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
  amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
  amdgpu 0000:07:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
  [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110
  amdgpu 0000:07:00.0: amdgpu: GPU reset(1) failed

Related commits:

  e8946a53e2 ("PCI: Mark AMD Navi14 GPU ATS as broken")
  a2da5d8cc0 ("PCI: Mark AMD Raven iGPU ATS as broken in some platforms")
  45beb31d3a ("PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken")
  5e89cd303e ("PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken")
  d28ca864c4 ("PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken")
  9b44b0b09d ("PCI: Mark AMD Stoney GPU ATS as broken")

[bhelgaas: add symptoms and related commits]
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1760
Link: https://lore.kernel.org/r/20220222160801.841643-1-alexander.deucher@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
2022-02-23 12:33:32 -06:00
Helge Deller
a972798368 parisc/unaligned: Fix ldw() and stw() unalignment handlers
Fix 3 bugs:

a) emulate_stw() doesn't return the error code value, so faulting
instructions are not reported and aborted.

b) Tell emulate_ldw() to handle fldw_l as floating point instruction

c) Tell emulate_ldw() to handle ldw_m as integer instruction

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
2022-02-23 18:01:06 +01:00
Helge Deller
dd2288f4a0 parisc/unaligned: Fix fldd and fstd unaligned handlers on 32-bit kernel
Usually the kernel provides fixup routines to emulate the fldd and fstd
floating-point instructions if they load or store 8-byte from/to a not
natuarally aligned memory location.

On a 32-bit kernel I noticed that those unaligned handlers didn't worked and
instead the application got a SEGV.
While checking the code I found two problems:

First, the OPCODE_FLDD_L and OPCODE_FSTD_L cases were ifdef'ed out by the
CONFIG_PA20 option, and as such those weren't built on a pure 32-bit kernel.
This is now fixed by moving the CONFIG_PA20 #ifdef to prevent the compilation
of OPCODE_LDD_L and OPCODE_FSTD_L only, and handling the fldd and fstd
instructions.

The second problem are two bugs in the 32-bit inline assembly code, where the
wrong registers where used. The calculation of the natural alignment used %2
(vall) instead of %3 (ior), and the first word was stored back to address %1
(valh) instead of %3 (ior).

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
2022-02-23 18:01:06 +01:00
Qu Wenruo
26fbac2517 btrfs: autodefrag: only scan one inode once
Although we have btrfs_requeue_inode_defrag(), for autodefrag we are
still just exhausting all inode_defrag items in the tree.

This means, it doesn't make much difference to requeue an inode_defrag,
other than scan the inode from the beginning till its end.

Change the behaviour to always scan from offset 0 of an inode, and till
the end.

By this we get the following benefit:

- Straight-forward code

- No more re-queue related check

- Fewer members in inode_defrag

We still keep the same btrfs_get_fs_root() and btrfs_iget() check for
each loop, and added extra should_auto_defrag() check per-loop.

Note: the patch needs to be backported and is intentionally written
to minimize the diff size, code will be cleaned up later.

CC: stable@vger.kernel.org # 5.16
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-23 17:55:01 +01:00
Qu Wenruo
199257a78b btrfs: defrag: don't use merged extent map for their generation check
For extent maps, if they are not compressed extents and are adjacent by
logical addresses and file offsets, they can be merged into one larger
extent map.

Such merged extent map will have the higher generation of all the
original ones.

But this brings a problem for autodefrag, as it relies on accurate
extent_map::generation to determine if one extent should be defragged.

For merged extent maps, their higher generation can mark some older
extents to be defragged while the original extent map doesn't meet the
minimal generation threshold.

Thus this will cause extra IO.

So solve the problem, here we introduce a new flag, EXTENT_FLAG_MERGED,
to indicate if the extent map is merged from one or more ems.

And for autodefrag, if we find a merged extent map, and its generation
meets the generation requirement, we just don't use this one, and go
back to defrag_get_extent() to read extent maps from subvolume trees.

This could cause more read IO, but should result less defrag data write,
so in the long run it should be a win for autodefrag.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-23 17:43:13 +01:00
Qu Wenruo
d5633b0dee btrfs: defrag: bring back the old file extent search behavior
For defrag, we don't really want to use btrfs_get_extent() to iterate
all extent maps of an inode.

The reasons are:

- btrfs_get_extent() can merge extent maps
  And the result em has the higher generation of the two, causing defrag
  to mark unnecessary part of such merged large extent map.

  This in fact can result extra IO for autodefrag in v5.16+ kernels.

  However this patch is not going to completely solve the problem, as
  one can still using read() to trigger extent map reading, and got
  them merged.

  The completely solution for the extent map merging generation problem
  will come as an standalone fix.

- btrfs_get_extent() caches the extent map result
  Normally it's fine, but for defrag the target range may not get
  another read/write for a long long time.
  Such cache would only increase the memory usage.

- btrfs_get_extent() doesn't skip older extent map
  Unlike the old find_new_extent() which uses btrfs_search_forward() to
  skip the older subtree, thus it will pick up unnecessary extent maps.

This patch will fix the regression by introducing defrag_get_extent() to
replace the btrfs_get_extent() call.

This helper will:

- Not cache the file extent we found
  It will search the file extent and manually convert it to em.

- Use btrfs_search_forward() to skip entire ranges which is modified in
  the past

This should reduce the IO for autodefrag.

Reported-by: Filipe Manana <fdmanana@suse.com>
Fixes: 7b508037d4 ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-23 17:43:07 +01:00
Qu Wenruo
550f133f69 btrfs: defrag: remove an ambiguous condition for rejection
From the very beginning of btrfs defrag, there is a check to reject
extents which meet both conditions:

- Physically adjacent

  We may want to defrag physically adjacent extents to reduce the number
  of extents or the size of subvolume tree.

- Larger than 128K

  This may be there for compressed extents, but unfortunately 128K is
  exactly the max capacity for compressed extents.
  And the check is > 128K, thus it never rejects compressed extents.

  Furthermore, the compressed extent capacity bug is fixed by previous
  patch, there is no reason for that check anymore.

The original check has a very small ranges to reject (the target extent
size is > 128K, and default extent threshold is 256K), and for
compressed extent it doesn't work at all.

So it's better just to remove the rejection, and allow us to defrag
physically adjacent extents.

CC: stable@vger.kernel.org # 5.16
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-23 17:42:55 +01:00
Qu Wenruo
979b25c300 btrfs: defrag: don't defrag extents which are already at max capacity
[BUG]
For compressed extents, defrag ioctl will always try to defrag any
compressed extents, wasting not only IO but also CPU time to
compress/decompress:

   mkfs.btrfs -f $DEV
   mount -o compress $DEV $MNT
   xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
   sync
   xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
   sync
   echo "=== before ==="
   xfs_io -c "fiemap -v" $MNT/foobar
   btrfs filesystem defrag $MNT/foobar
   sync
   echo "=== after ==="
   xfs_io -c "fiemap -v" $MNT/foobar

Then it shows the 2 128K extents just get COW for no extra benefit, with
extra IO/CPU spent:

    === before ===
    /mnt/btrfs/file1:
     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
       0: [0..255]:        26624..26879       256   0x8
       1: [256..511]:      26632..26887       256   0x9
    === after ===
    /mnt/btrfs/file1:
     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
       0: [0..255]:        26640..26895       256   0x8
       1: [256..511]:      26648..26903       256   0x9

This affects not only v5.16 (after the defrag rework), but also v5.15
(before the defrag rework).

[CAUSE]
From the very beginning, btrfs defrag never checks if one extent is
already at its max capacity (128K for compressed extents, 128M
otherwise).

And the default extent size threshold is 256K, which is already beyond
the compressed extent max size.

This means, by default btrfs defrag ioctl will mark all compressed
extent which is not adjacent to a hole/preallocated range for defrag.

[FIX]
Introduce a helper to grab the maximum extent size, and then in
defrag_collect_targets() and defrag_check_next_extent(), reject extents
which are already at their max capacity.

Reported-by: Filipe Manana <fdmanana@suse.com>
CC: stable@vger.kernel.org # 5.16
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-23 17:42:53 +01:00
Qu Wenruo
7093f15291 btrfs: defrag: don't try to merge regular extents with preallocated extents
[BUG]
With older kernels (before v5.16), btrfs will defrag preallocated extents.
While with newer kernels (v5.16 and newer) btrfs will not defrag
preallocated extents, but it will defrag the extent just before the
preallocated extent, even it's just a single sector.

This can be exposed by the following small script:

	mkfs.btrfs -f $dev > /dev/null

	mount $dev $mnt
	xfs_io -f -c "pwrite 0 4k" -c sync -c "falloc 4k 16K" $mnt/file
	xfs_io -c "fiemap -v" $mnt/file
	btrfs fi defrag $mnt/file
	sync
	xfs_io -c "fiemap -v" $mnt/file

The output looks like this on older kernels:

/mnt/btrfs/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          26624..26631         8   0x0
   1: [8..39]:         26632..26663        32 0x801
/mnt/btrfs/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..39]:         26664..26703        40   0x1

Which defrags the single sector along with the preallocated extent, and
replace them with an regular extent into a new location (caused by data
COW).
This wastes most of the data IO just for the preallocated range.

On the other hand, v5.16 is slightly better:

/mnt/btrfs/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          26624..26631         8   0x0
   1: [8..39]:         26632..26663        32 0x801
/mnt/btrfs/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          26664..26671         8   0x0
   1: [8..39]:         26632..26663        32 0x801

The preallocated range is not defragged, but the sector before it still
gets defragged, which has no need for it.

[CAUSE]
One of the function reused by the old and new behavior is
defrag_check_next_extent(), it will determine if we should defrag
current extent by checking the next one.

It only checks if the next extent is a hole or inlined, but it doesn't
check if it's preallocated.

On the other hand, out of the function, both old and new kernel will
reject preallocated extents.

Such inconsistent behavior causes above behavior.

[FIX]
- Also check if next extent is preallocated
  If so, don't defrag current extent.

- Add comments for each branch why we reject the extent

This will reduce the IO caused by defrag ioctl and autodefrag.

CC: stable@vger.kernel.org # 5.16
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-23 17:42:52 +01:00
Takashi Iwai
ce345f1e48 Merge tag 'asoc-fix-v5.17-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.17

A few more fixes for v5.17, one followup to the bounds checking fixes
handling controls which support negative values internally and a driver
specific one.
2022-02-23 15:06:48 +01:00
Varun Prakash
c2700d2886 nvme-tcp: send H2CData PDUs based on MAXH2CDATA
As per NVMe/TCP specification (revision 1.0a, section 3.6.2.3)
Maximum Host to Controller Data length (MAXH2CDATA): Specifies the
maximum number of PDU-Data bytes per H2CData PDU in bytes. This value
is a multiple of dwords and should be no less than 4,096.

Current code sets H2CData PDU data_length to r2t_length,
it does not check MAXH2CDATA value. Fix this by setting H2CData PDU
data_length to min(req->h2cdata_left, queue->maxh2cdata).

Also validate MAXH2CDATA value returned by target in ICResp PDU,
if it is not a multiple of dword or if it is less than 4096 return
-EINVAL from nvme_tcp_init_connection().

Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-02-23 14:43:11 +01:00
Christoph Hellwig
602e57c979 nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info
Commit e7d65803e2 ("nvme-multipath: revalidate paths during rescan")
introduced the NVME_NS_READY flag, which nvme_path_is_disabled() uses
to check if a path can be used or not.  We also need to set this flag
for devices that fail the ZNS feature validation and which are available
through passthrough devices only to that they can be used in multipathing
setups.

Fixes: e7d65803e2 ("nvme-multipath: revalidate paths during rescan")
Reported-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Tested-by: Kanchan Joshi <joshi.k@samsung.com>
2022-02-23 14:42:58 +01:00
Christoph Hellwig
363f636860 nvme: don't return an error from nvme_configure_metadata
When a fabrics controller claims to support an invalidate metadata
configuration we already warn and disable metadata support.  No need to
also return an error during revalidation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Tested-by: Kanchan Joshi <joshi.k@samsung.com>
2022-02-23 14:42:51 +01:00
Maxime Ripard
ecbd4912a6 drm/edid: Always set RGB444
In order to fill the drm_display_info structure each time an EDID is
read, the code currently will call drm_add_display_info with the parsed
EDID.

drm_add_display_info will then call drm_reset_display_info to reset all
the fields to 0, and then set them to the proper value depending on the
EDID.

In the color_formats case, we will thus report that we don't support any
color format, and then fill it back with RGB444 plus the additional
formats described in the EDID Feature Support byte.

However, since that byte only contains format-related bits since the 1.4
specification, this doesn't happen if the EDID is following an earlier
specification. In turn, it means that for one of these EDID, we end up
with color_formats set to 0.

The EDID 1.3 specification never really specifies what it means by RGB
exactly, but since both HDMI and DVI will use RGB444, it's fairly safe
to assume it's supposed to be RGB444.

Let's move the addition of RGB444 to color_formats earlier in
drm_add_display_info() so that it's always set for a digital display.

Fixes: da05a5a71a ("drm: parse color format support for digital displays")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reported-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220203115416.1137308-1-maxime@cerno.tech
2022-02-23 14:12:01 +01:00
David S. Miller
0228d37bd1 Merge branch 'ftgmac100-fixes'
Heyi Guo says:

====================
drivers/net/ftgmac100: fix occasional DHCP failure

This patch set is to fix the issues discussed in the mail thread:
https://lore.kernel.org/netdev/51f5b7a7-330f-6b3c-253d-10e45cdb6805@linux.alibaba.com/
and follows the advice from Andrew Lunn.

The first 2 patches refactors the code to enable adjust_link calling reset
function directly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:50:19 +00:00
Heyi Guo
1baf2e50e4 drivers/net/ftgmac100: fix DHCP potential failure with systemd
DHCP failures were observed with systemd 247.6. The issue could be
reproduced by rebooting Aspeed 2600 and then running ifconfig ethX
down/up.

It is caused by below procedures in the driver:

1. ftgmac100_open() enables net interface and call phy_start()
2. When PHY is link up, it calls netif_carrier_on() and then
adjust_link callback
3. ftgmac100_adjust_link() will schedule the reset task
4. ftgmac100_reset_task() will then reset the MAC in another schedule

After step 2, systemd will be notified to send DHCP discover packet,
while the packet might be corrupted by MAC reset operation in step 4.

Call ftgmac100_reset() directly instead of scheduling task to fix the
issue.

Signed-off-by: Heyi Guo <guoheyi@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:50:19 +00:00
Heyi Guo
3c773dba81 drivers/net/ftgmac100: adjust code place for function call dependency
This is to prepare for ftgmac100_adjust_link() to call
ftgmac100_reset() directly. Only code places are changed.

Signed-off-by: Heyi Guo <guoheyi@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:50:19 +00:00
Heyi Guo
4f1e72850d drivers/net/ftgmac100: refactor ftgmac100_reset_task to enable direct function call
This is to prepare for ftgmac100_adjust_link() to call reset function
directly, instead of task schedule.

Signed-off-by: Heyi Guo <guoheyi@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:50:19 +00:00
Wan Jiabing
ecf4a24cf9 net: sched: avoid newline at end of message in NL_SET_ERR_MSG_MOD
Fix following coccicheck warning:
./net/sched/act_api.c:277:7-49: WARNING avoid newline at end of message
in NL_SET_ERR_MSG_MOD

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:45:44 +00:00
Alvin Šipraga
404ba13a65 MAINTAINERS: add myself as co-maintainer for Realtek DSA switch drivers
Adding myself (Alvin Šipraga) as another maintainer for the Realtek DSA
switch drivers. I intend to help Linus out with reviewing and testing
changes to these drivers, particularly the rtl8365mb driver which I
authored and have hardware access to.

Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:36:21 +00:00
Dan Carpenter
a1f8fec4da tipc: Fix end of loop tests for list_for_each_entry()
These tests are supposed to check if the loop exited via a break or not.
However the tests are wrong because if we did not exit via a break then
"p" is not a valid pointer.  In that case, it's the equivalent of
"if (*(u32 *)sr == *last_key) {".  That's going to work most of the time,
but there is a potential for those to be equal.

Fixes: 1593123a6a ("tipc: add name table dump to new netlink api")
Fixes: 1a1a143daf ("tipc: add publication dump to new netlink api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:35:40 +00:00
Dan Carpenter
de7b2efacf udp_tunnel: Fix end of loop test in udp_tunnel_nic_unregister()
This test is checking if we exited the list via break or not.  However
if it did not exit via a break then "node" does not point to a valid
udp_tunnel_nic_shared_node struct.  It will work because of the way
the structs are laid out it's the equivalent of
"if (info->shared->udp_tunnel_nic_info != dev)" which will always be
true, but it's not the right way to test.

Fixes: 74cc6d182d ("udp_tunnel: add the ability to share port tables")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:35:00 +00:00
Stefano Garzarella
a58da53ffd vhost/vsock: don't check owner in vhost_vsock_stop() while releasing
vhost_vsock_stop() calls vhost_dev_check_owner() to check the device
ownership. It expects current->mm to be valid.

vhost_vsock_stop() is also called by vhost_vsock_dev_release() when
the user has not done close(), so when we are in do_exit(). In this
case current->mm is invalid and we're releasing the device, so we
should clean it anyway.

Let's check the owner only when vhost_vsock_stop() is called
by an ioctl.

When invoked from release we can not fail so we don't check return
code of vhost_vsock_stop(). We need to stop vsock even if it's not
the owner.

Fixes: 433fc58e6b ("VSOCK: Introduce vhost_vsock.ko")
Cc: stable@vger.kernel.org
Reported-by: syzbot+1e3ea63db39f2b4440e0@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+3140b17cb44a7b174008@syzkaller.appspotmail.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:32:33 +00:00
Thierry Reding
8d3b01e0d4 ARM: tegra: Move panels to AUX bus
Move the eDP panel on Venice 2 and Nyan boards into the corresponding
AUX bus device tree node. This allows us to avoid a nasty circular
dependency that would otherwise be created between the DPAUX and panel
nodes via the DDC/I2C phandle.

Fixes: eb481f9ac9 ("ARM: tegra: add Acer Chromebook 13 device tree")
Fixes: 59fe02cb07 ("ARM: tegra: Add DTS for the nyan-blaze board")
Fixes: 40e231c770 ("ARM: tegra: Enable eDP for Venice2")
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-02-23 13:26:00 +01:00
Thierry Reding
8913e1aea4 drm/tegra: dpaux: Populate AUX bus
The DPAUX hardware block exposes an DP AUX interface that provides
access to an AUX bus and the devices on that bus. Use the DP AUX bus
infrastructure that was recently introduced to probe devices on this
bus from DT.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-02-23 13:00:37 +01:00
Christian König
f762ce7889 drm/radeon: fix variable type
When we switch to dma_resv_wait_timeout() the returned type changes as
well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 89aae41d74 ("drm/radeon: use dma_resv_wait_timeout() instead of manually waiting")
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215600
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221110503.2803-1-christian.koenig@amd.com
2022-02-23 10:29:46 +01:00
Eric Dumazet
ae089831ff netfilter: nf_tables: prefer kfree_rcu(ptr, rcu) variant
While kfree_rcu(ptr) _is_ supported, it has some limitations.

Given that 99.99% of kfree_rcu() users [1] use the legacy
two parameters variant, and @catchall objects do have an rcu head,
simply use it.

Choice of kfree_rcu(ptr) variant was probably not intentional.

[1] including calls from net/netfilter/nf_tables_api.c

Fixes: aaa31047a6 ("netfilter: nftables: add catch-all set element support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-23 09:22:46 +01:00
Sukadev Bhattiprolu
277f2bb143 ibmvnic: schedule failover only if vioctl fails
If client is unable to initiate a failover reset via H_VIOCTL hcall, then
it should schedule a failover reset as a last resort. Otherwise, there is
no need to do a last resort.

Fixes: 334c424147 ("ibmvnic: improve failover sysfs entry")
Reported-by: Cris Forno <cforno12@outlook.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Link: https://lore.kernel.org/r/20220221210545.115283-1-drt@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-22 17:06:27 -08:00
Alvin Šipraga
342b641919 net: dsa: fix panic when removing unoffloaded port from bridge
If a bridged port is not offloaded to the hardware - either because the
underlying driver does not implement the port_bridge_{join,leave} ops,
or because the operation failed - then its dp->bridge pointer will be
NULL when dsa_port_bridge_leave() is called. Avoid dereferncing NULL.

This fixes the following splat when removing a port from a bridge:

 Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000000
 Internal error: Oops: 96000004 [#1] PREEMPT_RT SMP
 CPU: 3 PID: 1119 Comm: brctl Tainted: G           O      5.17.0-rc4-rt4 #1
 Call trace:
  dsa_port_bridge_leave+0x8c/0x1e4
  dsa_slave_changeupper+0x40/0x170
  dsa_slave_netdevice_event+0x494/0x4d4
  notifier_call_chain+0x80/0xe0
  raw_notifier_call_chain+0x1c/0x24
  call_netdevice_notifiers_info+0x5c/0xac
  __netdev_upper_dev_unlink+0xa4/0x200
  netdev_upper_dev_unlink+0x38/0x60
  del_nbp+0x1b0/0x300
  br_del_if+0x38/0x114
  add_del_if+0x60/0xa0
  br_ioctl_stub+0x128/0x2dc
  br_ioctl_call+0x68/0xb0
  dev_ifsioc+0x390/0x554
  dev_ioctl+0x128/0x400
  sock_do_ioctl+0xb4/0xf4
  sock_ioctl+0x12c/0x4e0
  __arm64_sys_ioctl+0xa8/0xf0
  invoke_syscall+0x4c/0x110
  el0_svc_common.constprop.0+0x48/0xf0
  do_el0_svc+0x28/0x84
  el0_svc+0x1c/0x50
  el0t_64_sync_handler+0xa8/0xb0
  el0t_64_sync+0x17c/0x180
 Code: f9402f00 f0002261 f9401302 913cc021 (a9401404)
 ---[ end trace 0000000000000000 ]---

Fixes: d3eed0e57d ("net: dsa: keep the bridge_dev and bridge_num as part of the same structure")
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220221203539.310690-1-alvin@pqrs.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-22 17:03:02 -08:00
Sergey Shtylyov
8d093e02e8 ata: pata_hpt37x: disable primary channel on HPT371
The HPT371 chip physically has only one channel, the secondary one,
however the primary channel registers do exist! Thus we have to
manually disable the non-existing channel if the BIOS hasn't done this
already. Similarly to the pata_hpt3x2n driver, always disable the
primary channel.

Fixes: 669a5db411 ("[libata] Add a bunch of PATA drivers.")
Cc: stable@vger.kernel.org
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-23 09:39:37 +09:00
Eric Dumazet
ef527f968a net: __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor friends
Whenever one of these functions pull all data from an skb in a frag_list,
use consume_skb() instead of kfree_skb() to avoid polluting drop
monitoring.

Fixes: 6fa01ccd88 ("skbuff: Add pskb_extract() helper function")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220220154052.1308469-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-22 16:32:35 -08:00
German Gomez
13e741b834 perf script: Fix error when printing 'weight' field
In SPE traces the 'weight' field can't be printed in 'perf script'
because the 'dummy:u' event doesn't have the WEIGHT attribute set.

Use evsel__do_check_stype(..) to check this field, as it's done with
other fields such as "phys_addr".

Before:

  $ perf record -e arm_spe_0// -- sleep 1
  $ perf script -F event,ip,weight
  Samples for 'dummy:u' event do not have WEIGHT attribute set. Cannot print 'weight' field.

After:

  $ perf script -F event,ip,weight
     l1d-access:               12 ffffaf629d4cb320
     tlb-access:               12 ffffaf629d4cb320
         memory:               12 ffffaf629d4cb320

Fixes: b0fde9c6e2 ("perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT")
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220221171707.62960-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22 21:17:55 -03:00
Linus Torvalds
5c1ee56966 Merge branch 'for-5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:

 - Fix for a subtle bug in the recent release_agent permission check
   update

 - Fix for a long-standing race condition between cpuset and cpu hotplug

 - Comment updates

* 'for-5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: Fix kernel-doc
  cgroup-v1: Correct privileges check in release_agent writes
  cgroup: clarify cgroup_css_set_fork()
  cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug
2022-02-22 16:14:35 -08:00
Ondrej Mosnacek
ce2fc710c9 selinux: fix misuse of mutex_is_locked()
mutex_is_locked() tests whether the mutex is locked *by any task*, while
here we want to test if it is held *by the current task*. To avoid
false/missed WARNINGs, use lockdep_assert_is_held() and
lockdep_assert_is_not_held() instead, which do the right thing (though
they are a no-op if CONFIG_LOCKDEP=n).

Cc: stable@vger.kernel.org
Fixes: 2554a48f44 ("selinux: measure state and policy capabilities")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-02-22 18:02:58 -05:00
Krzysztof Kozlowski
0c0822bcb7 dt-bindings: update Roger Quadros email
Emails to Roger Quadros TI account bounce with:
  550 Invalid recipient <rogerq@ti.com> (#5.1.1)

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Acked-by: Roger Quadros <rogerq@kernel.org>
Acked-By: Vinod Koul <vkoul@kernel.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220221100701.48593-1-krzysztof.kozlowski@canonical.com
2022-02-22 16:06:31 -06:00
Krzysztof Kozlowski
34f3eda8c8 MAINTAINERS: sifive: drop Yash Shah
Emails to Yash Shah bounce with "The email account that you tried to
reach does not exist.", so drop him from all maintainer entries.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220214082349.162973-1-krzysztof.kozlowski@canonical.com
2022-02-22 15:15:39 -06:00
Arnaldo Carvalho de Melo
5b061a322b tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  3915035282 ("KVM: x86: SVM: move avic definitions from AMD's spec to svm.h")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts to pick some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2022-02-22 17:35:36.996271430 -0300
  +++ after	2022-02-22 17:35:46.258503347 -0300
  @@ -287,6 +287,7 @@
   	[0xc0010114 - x86_AMD_V_KVM_MSRs_offset] = "VM_CR",
   	[0xc0010115 - x86_AMD_V_KVM_MSRs_offset] = "VM_IGNNE",
   	[0xc0010117 - x86_AMD_V_KVM_MSRs_offset] = "VM_HSAVE_PA",
  +	[0xc001011b - x86_AMD_V_KVM_MSRs_offset] = "AMD64_SVM_AVIC_DOORBELL",
   	[0xc001011e - x86_AMD_V_KVM_MSRs_offset] = "AMD64_VM_PAGE_FLUSH",
   	[0xc001011f - x86_AMD_V_KVM_MSRs_offset] = "AMD64_VIRT_SPEC_CTRL",
   	[0xc0010130 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_SEV_ES_GHCB",
  $

And this gets rebuilt:

  CC      /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
  LD      /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
  LD      /tmp/build/perf/trace/beauty/perf-in.o
  CC      /tmp/build/perf/util/amd-sample-raw.o
  LD      /tmp/build/perf/util/perf-in.o
  LD      /tmp/build/perf/perf-in.o
  LINK    /tmp/build/perf/perf

Now one can trace systemwide asking to see backtraces to where those
MSRs are being read/written with:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=AMD64_SVM_AVIC_DOORBELL && msr<=AMD64_SEV_ES_GHCB"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=AMD64_SVM_AVIC_DOORBELL && msr<=AMD64_SEV_ES_GHCB"
  Using CPUID AuthenticAMD-25-21-0
  0xc001011b
  0xc0010130
  New filter for msr:read_msr: (msr>=0xc001011b && msr<=0xc0010130) && (common_pid != 1019953 && common_pid != 3629)
  0xc001011b
  0xc0010130
  New filter for msr:write_msr: (msr>=0xc001011b && msr<=0xc0010130) && (common_pid != 1019953 && common_pid != 3629)
  mmap size 528384B
  ^C#

  Example with a frequent msr:

    # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
    Using CPUID AuthenticAMD-25-21-0
    0x48
    New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
    0x48
    New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
    mmap size 528384B
    Looking at the vmlinux_path (8 entries long)
    symsrc__init: build id mismatch for vmlinux.
    Using /proc/kcore for kernel data
    Using /proc/kallsyms for symbols
       0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         __switch_to_xtra ([kernel.kallsyms])
                                         __switch_to ([kernel.kallsyms])
                                         __schedule ([kernel.kallsyms])
                                         schedule ([kernel.kallsyms])
                                         futex_wait_queue_me ([kernel.kallsyms])
                                         futex_wait ([kernel.kallsyms])
                                         do_futex ([kernel.kallsyms])
                                         __x64_sys_futex ([kernel.kallsyms])
                                         do_syscall_64 ([kernel.kallsyms])
                                         entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                         __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
       0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         __switch_to_xtra ([kernel.kallsyms])
                                         __switch_to ([kernel.kallsyms])
                                         __schedule ([kernel.kallsyms])
                                         schedule_idle ([kernel.kallsyms])
                                         do_idle ([kernel.kallsyms])
                                         cpu_startup_entry ([kernel.kallsyms])
                                         secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/YhVKxaft+z8rpOfy@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22 17:43:05 -03:00
Alexey Bayduraev
69560e366f perf data: Fix double free in perf_session__delete()
When perf_data__create_dir() fails, it calls close_dir(), but
perf_session__delete() also calls close_dir() and since dir.version and
dir.nr were initialized by perf_data__create_dir(), a double free occurs.

This patch moves the initialization of dir.version and dir.nr after
successful initialization of dir.files, that prevents double freeing.
This behavior is already implemented in perf_data__open_dir().

Fixes: 1455206311 ("perf data: Add perf_data__(create_dir|close_dir) functions")
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220218152341.5197-2-alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22 17:34:16 -03:00
Jiapeng Chong
c70cd039f1 cpuset: Fix kernel-doc
Fix the following W=1 kernel warnings:

kernel/cgroup/cpuset.c:3718: warning: expecting prototype for
cpuset_memory_pressure_bump(). Prototype was for
__cpuset_memory_pressure_bump() instead.

kernel/cgroup/cpuset.c:3568: warning: expecting prototype for
cpuset_node_allowed(). Prototype was for __cpuset_node_allowed()
instead.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-02-22 09:49:49 -10:00
Linus Torvalds
917bbdb107 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ITER_PIPE fix from Al Viro:
 "Fix for old sloppiness in pipe_buffer reuse"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  lib/iov_iter: initialize "flags" in new pipe_buffer
2022-02-22 10:31:53 -08:00
Michal Koutný
467a726b75 cgroup-v1: Correct privileges check in release_agent writes
The idea is to check: a) the owning user_ns of cgroup_ns, b)
capabilities in init_user_ns.

The commit 24f6008564 ("cgroup-v1: Require capabilities to set
release_agent") got this wrong in the write handler of release_agent
since it checked user_ns of the opener (may be different from the owning
user_ns of cgroup_ns).
Secondly, to avoid possibly confused deputy, the capability of the
opener must be checked.

Fixes: 24f6008564 ("cgroup-v1: Require capabilities to set release_agent")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/stable/20220216121142.GB30035@blackbody.suse.cz/
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Masami Ichikawa(CIP) <masami.ichikawa@cybertrust.co.jp>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-02-22 08:12:22 -10:00
Christian Brauner
6d3971dab2 cgroup: clarify cgroup_css_set_fork()
With recent fixes for the permission checking when moving a task into a cgroup
using a file descriptor to a cgroup's cgroup.procs file and calling write() it
seems a good idea to clarify CLONE_INTO_CGROUP permission checking with a
comment.

Cc: Tejun Heo <tj@kernel.org>
Cc: <cgroups@vger.kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-02-22 07:35:19 -10:00
ChenXiaoSong
84ec758fb2 configfs: fix a race in configfs_{,un}register_subsystem()
When configfs_register_subsystem() or configfs_unregister_subsystem()
is executing link_group() or unlink_group(),
it is possible that two processes add or delete list concurrently.
Some unfortunate interleavings of them can cause kernel panic.

One of cases is:
A --> B --> C --> D
A <-- B <-- C <-- D

     delete list_head *B        |      delete list_head *C
--------------------------------|-----------------------------------
configfs_unregister_subsystem   |   configfs_unregister_subsystem
  unlink_group                  |     unlink_group
    unlink_obj                  |       unlink_obj
      list_del_init             |         list_del_init
        __list_del_entry        |           __list_del_entry
          __list_del            |             __list_del
            // next == C        |
            next->prev = prev   |
                                |               next->prev = prev
            prev->next = next   |
                                |                 // prev == B
                                |                 prev->next = next

Fix this by adding mutex when calling link_group() or unlink_group(),
but parent configfs_subsystem is NULL when config_item is root.
So I create a mutex configfs_subsystem_mutex.

Fixes: 7063fbf226 ("[PATCH] configfs: User-driven configuration filesystem")
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-02-22 18:30:28 +01:00
Dylan Yudaken
80912cef18 io_uring: disallow modification of rsrc_data during quiesce
io_rsrc_ref_quiesce will unlock the uring while it waits for references to
the io_rsrc_data to be killed.
There are other places to the data that might add references to data via
calls to io_rsrc_node_switch.
There is a race condition where this reference can be added after the
completion has been signalled. At this point the io_rsrc_ref_quiesce call
will wake up and relock the uring, assuming the data is unused and can be
freed - although it is actually being used.

To fix this check in io_rsrc_ref_quiesce if a resource has been revived.

Reported-by: syzbot+ca8bf833622a1662745b@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220222161751.995746-1-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-22 09:57:32 -07:00
Vikash Chandola
35f165f089 hwmon: (pmbus) Clear pmbus fault/warning bits after read
Almost all fault/warning bits in pmbus status registers remain set even
after fault/warning condition are removed. As per pmbus specification
these faults must be cleared by user.
Modify hwmon behavior to clear fault/warning bit after fetching data if
fault/warning bit was set. This allows to get fresh data in next read.

Signed-off-by: Vikash Chandola <vikash.chandola@linux.intel.com>
Link: https://lore.kernel.org/r/20220222131253.2426834-1-vikash.chandola@linux.intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-02-22 08:15:39 -08:00
Guenter Roeck
1b5f517cca hwmon: Handle failure to register sensor with thermal zone correctly
If an attempt is made to a sensor with a thermal zone and it fails,
the call to devm_thermal_zone_of_sensor_register() may return -ENODEV.
This may result in crashes similar to the following.

Unable to handle kernel NULL pointer dereference at virtual address 00000000000003cd
...
Internal error: Oops: 96000021 [#1] PREEMPT SMP
...
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mutex_lock+0x18/0x60
lr : thermal_zone_device_update+0x40/0x2e0
sp : ffff800014c4fc60
x29: ffff800014c4fc60 x28: ffff365ee3f6e000 x27: ffffdde218426790
x26: ffff365ee3f6e000 x25: 0000000000000000 x24: ffff365ee3f6e000
x23: ffffdde218426870 x22: ffff365ee3f6e000 x21: 00000000000003cd
x20: ffff365ee8bf3308 x19: ffffffffffffffed x18: 0000000000000000
x17: ffffdde21842689c x16: ffffdde1cb7a0b7c x15: 0000000000000040
x14: ffffdde21a4889a0 x13: 0000000000000228 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
x8 : 0000000001120000 x7 : 0000000000000001 x6 : 0000000000000000
x5 : 0068000878e20f07 x4 : 0000000000000000 x3 : 00000000000003cd
x2 : ffff365ee3f6e000 x1 : 0000000000000000 x0 : 00000000000003cd
Call trace:
 mutex_lock+0x18/0x60
 hwmon_notify_event+0xfc/0x110
 0xffffdde1cb7a0a90
 0xffffdde1cb7a0b7c
 irq_thread_fn+0x2c/0xa0
 irq_thread+0x134/0x240
 kthread+0x178/0x190
 ret_from_fork+0x10/0x20
Code: d503201f d503201f d2800001 aa0103e4 (c8e47c02)

Jon Hunter reports that the exact call sequence is:

hwmon_notify_event()
  --> hwmon_thermal_notify()
    --> thermal_zone_device_update()
      --> update_temperature()
        --> mutex_lock()

The hwmon core needs to handle all errors returned from calls
to devm_thermal_zone_of_sensor_register(). If the call fails
with -ENODEV, report that the sensor was not attached to a
thermal zone  but continue to register the hwmon device.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Cc: Dmitry Osipenko <digetx@gmail.com>
Fixes: 1597b374af ("hwmon: Add notification support")
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-02-22 08:04:01 -08:00
Paolo Bonzini
1e2277ed70 Merge branch 'kvm-ppc-cap-210' into kvm-master
By request of Nick Piggin:

> Patch 3 requires a KVM_CAP_PPC number allocated. QEMU maintainers are
> happy with it (link in changelog) just waiting on KVM upstreaming. Do
> you have objections to the series going to ppc/kvm tree first, or
> another option is you could take patch 3 alone first (it's relatively
> independent of the other 2) and ppc/kvm gets it from you?
2022-02-22 09:07:16 -05:00
Nicholas Piggin
93b71801a8 KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3
Add KVM_CAP_PPC_AIL_MODE_3 to advertise the capability to set the AIL
resource mode to 3 with the H_SET_MODE hypercall. This capability
differs between processor types and KVM types (PR, HV, Nested HV), and
affects guest-visible behaviour.

QEMU will implement a cap-ail-mode-3 to control this behaviour[1], and
use the KVM CAP if available to determine KVM support[2].

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-22 09:06:54 -05:00
Stefano Garzarella
bb49c6fa8b block: clear iocb->private in blkdev_bio_end_io_async()
iocb_bio_iopoll() expects iocb->private to be cleared before
releasing the bio.

We already do this in blkdev_bio_end_io(), but we forgot in the
recently added blkdev_bio_end_io_async().

Fixes: 54a88eb838 ("block: add single bio async direct IO helper")
Cc: asml.silence@gmail.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220211090136.44471-1-sgarzare@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-22 06:59:49 -07:00
Adam Ward
9c7cf33c53 regulator: da9121: Remove surplus DA9141 parameters
Remove ramp_delay/enable_time values - subject to OTP, incorrect

Signed-off-by: Adam Ward <Adam.Ward.opensource@diasemi.com>
Link: https://lore.kernel.org/r/a175201b4a7ea323c6a70d77f7f6d2124bfc0bed.1645489455.git.Adam.Ward.opensource@diasemi.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-22 11:56:29 +00:00
Adam Ward
c8c57fbc1c regulator: da9121: Fix DA914x voltage value
Update DA9141/2 max voltage to match spec change

Signed-off-by: Adam Ward <Adam.Ward.opensource@diasemi.com>
Link: https://lore.kernel.org/r/9d1ec5b6db70d27f56d05b8a0139fc0840f03e20.1645489455.git.Adam.Ward.opensource@diasemi.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-22 11:56:28 +00:00
Adam Ward
f0fdfc04fd regulator: da9121: Fix DA914x current values
Update DA9141/2 ranges to correct errors

Signed-off-by: Adam Ward <Adam.Ward.opensource@diasemi.com>
Link: https://lore.kernel.org/r/cd5732c5061ce49dcfbcebb306d12ba1664b4ea6.1645489455.git.Adam.Ward.opensource@diasemi.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-22 11:56:26 +00:00
David S. Miller
5663b85462 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

This is fixing up the use without proper initialization in patch 5/5

-o-

Hi,

The following patchset contains Netfilter fixes for net:

1) Missing #ifdef CONFIG_IP6_NF_IPTABLES in recent xt_socket fix.

2) Fix incorrect flow action array size in nf_tables.

3) Unregister flowtable hooks from netns exit path.

4) Fix missing limit object release, from Florian Westphal.

5) Memleak in nf_tables object update path, also from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22 11:00:51 +00:00
Randy Dunlap
1e6ae0e46e mips: setup: fix setnocoherentio() boolean setting
Correct a typo/pasto: setnocoherentio() should set
dma_default_coherent to false, not true.

Fixes: 14ac09a65e ("MIPS: refactor the runtime coherent vs noncoherent DMA indicators")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-02-22 09:35:02 +01:00
Mårten Lindahl
d8f7a5484f driver core: Free DMA range map when device is released
When unbinding/binding a driver with DMA mapped memory, the DMA map is
not freed before the driver is reloaded. This leads to a memory leak
when the DMA map is overwritten when reprobing the driver.

This can be reproduced with a platform driver having a dma-range:

dummy {
	...
	#address-cells = <0x2>;
	#size-cells = <0x2>;
	ranges;
	dma-ranges = <...>;
	...
};

and then unbinding/binding it:

~# echo soc:dummy >/sys/bus/platform/drivers/<driver>/unbind

DMA map object 0xffffff800b0ae540 still being held by &pdev->dev

~# echo soc:dummy >/sys/bus/platform/drivers/<driver>/bind
~# echo scan > /sys/kernel/debug/kmemleak
~# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffffff800b0ae540 (size 64):
  comm "sh", pid 833, jiffies 4295174550 (age 2535.352s)
  hex dump (first 32 bytes):
    00 00 00 80 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 80 00 00 00 00 00 00 00 80 00 00 00 00  ................
  backtrace:
    [<ffffffefd1694708>] create_object.isra.0+0x108/0x344
    [<ffffffefd1d1a850>] kmemleak_alloc+0x8c/0xd0
    [<ffffffefd167e2d0>] __kmalloc+0x440/0x6f0
    [<ffffffefd1a960a4>] of_dma_get_range+0x124/0x220
    [<ffffffefd1a8ce90>] of_dma_configure_id+0x40/0x2d0
    [<ffffffefd198b68c>] platform_dma_configure+0x5c/0xa4
    [<ffffffefd198846c>] really_probe+0x8c/0x514
    [<ffffffefd1988990>] __driver_probe_device+0x9c/0x19c
    [<ffffffefd1988cd8>] device_driver_attach+0x54/0xbc
    [<ffffffefd1986634>] bind_store+0xc4/0x120
    [<ffffffefd19856e0>] drv_attr_store+0x30/0x44
    [<ffffffefd173c9b0>] sysfs_kf_write+0x50/0x60
    [<ffffffefd173c1c4>] kernfs_fop_write_iter+0x124/0x1b4
    [<ffffffefd16a013c>] new_sync_write+0xdc/0x160
    [<ffffffefd16a256c>] vfs_write+0x23c/0x2a0
    [<ffffffefd16a2758>] ksys_write+0x64/0xec

To prevent this we should free the dma_range_map when the device is
released.

Fixes: e0d072782c ("dma-mapping: introduce DMA range map, supplanting dma_pfn_offset")
Cc: stable <stable@vger.kernel.org>
Suggested-by: Rob Herring <robh@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Link: https://lore.kernel.org/r/20220216094128.4025861-1-marten.lindahl@axis.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-22 08:40:02 +01:00
Florian Westphal
dad3bdeef4 netfilter: nf_tables: fix memory leak during stateful obj update
stateful objects can be updated from the control plane.
The transaction logic allocates a temporary object for this purpose.

The ->init function was called for this object, so plain kfree() leaks
resources. We must call ->destroy function of the object.

nft_obj_destroy does this, but it also decrements the module refcount,
but the update path doesn't increment it.

To avoid special-casing the update object release, do module_get for
the update case too and release it via nft_obj_destroy().

Fixes: d62d0ba97b ("netfilter: nf_tables: Introduce stateful object update operation")
Cc: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-22 08:28:04 +01:00
Sergey Shtylyov
5f6b0f2d03 ata: pata_hpt37x: fix PCI clock detection
The f_CNT register (at the PCI config. address 0x78) is 16-bit, not
8-bit! The bug was there from the very start... :-(

Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Fixes: 669a5db411 ("[libata] Add a bunch of PATA drivers.")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-22 09:39:57 +09:00
Michel Dänzer
4d22336f90 drm/amd/display: For vblank_disable_immediate, check PSR is really used
Even if PSR is allowed for a present GPU, there might be no eDP link
which supports PSR.

Fixes: 7089784873 ("drm/amdgpu/display: Only set vblank_disable_immediate when PSR is not enabled")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-21 17:55:17 -05:00
Evan Quan
e3f3824874 drm/amd/pm: fix some OEM SKU specific stability issues
Add a quirk in sienna_cichlid_ppt.c to fix some OEM SKU
specific stability issues.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-02-21 17:55:17 -05:00
Evan Quan
f626dd0ff0 drm/amdgpu: disable MMHUB PG for Picasso
MMHUB PG needs to be disabled for Picasso for stability reasons.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-02-21 17:55:17 -05:00
Bas Nieuwenhuizen
1432108d00 drm/amd/display: Protect update_bw_bounding_box FPU code.
For DCN3/3.01/3.02 at least these use the fpu.

v2: squash in build fix for when DCN is not enabled (Leo)

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-02-21 17:55:17 -05:00
Phil Elwell
eebb0f4e89 sc16is7xx: Fix for incorrect data being transmitted
UART drivers are meant to use the port spinlock within certain
methods, to protect against reentrancy. The sc16is7xx driver does
very little locking, presumably because when added it triggers
"scheduling while atomic" errors. This is due to the use of mutexes
within the regmap abstraction layer, and the mutex implementation's
habit of sleeping the current thread while waiting for access.
Unfortunately this lack of interlocking can lead to corruption of
outbound data, which occurs when the buffer used for I2C transmission
is used simultaneously by two threads - a work queue thread running
sc16is7xx_tx_proc, and an IRQ thread in sc16is7xx_port_irq, both
of which can call sc16is7xx_handle_tx.

An earlier patch added efr_lock, a mutex that controls access to the
EFR register. This mutex is already claimed in the IRQ handler, and
all that is required is to claim the same mutex in sc16is7xx_tx_proc.

See: https://github.com/raspberrypi/linux/issues/4885

Fixes: 6393ff1c44 ("sc16is7xx: Use threaded IRQ")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Link: https://lore.kernel.org/r/20220216160802.1026013-1-phil@raspberrypi.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 19:51:39 +01:00
daniel.starke@siemens.com
a2ab75b8e7 tty: n_gsm: fix deadlock in gsmtty_open()
In the current implementation the user may open a virtual tty which then
could fail to establish the underlying DLCI. The function gsmtty_open()
gets stuck in tty_port_block_til_ready() while waiting for a carrier rise.
This happens if the remote side fails to acknowledge the link establishment
request in time or completely. At some point gsm_dlci_close() is called
to abort the link establishment attempt. The function tries to inform the
associated virtual tty by performing a hangup. But the blocking loop within
tty_port_block_til_ready() is not informed about this event.
The patch proposed here fixes this by resetting the initialization state of
the virtual tty to ensure the loop exits and triggering it to make
tty_port_block_til_ready() return.

Fixes: e1eaea46bb ("tty: n_gsm line discipline")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
Link: https://lore.kernel.org/r/20220218073123.2121-7-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 19:51:04 +01:00
daniel.starke@siemens.com
687f9ad43c tty: n_gsm: fix wrong modem processing in convergence layer type 2
The function gsm_process_modem() exists to handle modem status bits of
incoming frames. This includes incoming MSC (modem status command) frames
and convergence layer type 2 data frames. The function, however, was only
designed to handle MSC frames as it expects the command length. Within
gsm_dlci_data() it is wrongly assumed that this is the same as the data
frame length. This is only true if the data frame contains only 1 byte of
payload.

This patch names the length parameter of gsm_process_modem() in a generic
manner to reflect its association. It also corrects all calls to the
function to handle the variable number of modem status octets correctly in
both cases.

Fixes: 7263287af9 ("tty: n_gsm: Fixed logic to decode break signal from modem status")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
Link: https://lore.kernel.org/r/20220218073123.2121-6-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 19:51:04 +01:00
daniel.starke@siemens.com
c19d93542a tty: n_gsm: fix wrong tty control line for flow control
tty flow control is handled via gsmtty_throttle() and gsmtty_unthrottle().
Both functions propagate the outgoing hardware flow control state to the
remote side via MSC (modem status command) frames. The local state is taken
from the RTS (ready to send) flag of the tty. However, RTS gets mapped to
DTR (data terminal ready), which is wrong.
This patch corrects this by mapping RTS to RTS.

Fixes: e1eaea46bb ("tty: n_gsm line discipline")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
Link: https://lore.kernel.org/r/20220218073123.2121-5-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 19:51:04 +01:00
daniel.starke@siemens.com
96b169f05c tty: n_gsm: fix NULL pointer access due to DLCI release
The here fixed commit made the tty hangup asynchronous to avoid a circular
locking warning. I could not reproduce this warning. Furthermore, due to
the asynchronous hangup the function call now gets queued up while the
underlying tty is being freed. Depending on the timing this results in a
NULL pointer access in the global work queue scheduler. To be precise in
process_one_work(). Therefore, the previous commit made the issue worse
which it tried to fix.

This patch fixes this by falling back to the old behavior which uses a
blocking tty hangup call before freeing up the associated tty.

Fixes: 7030082a74 ("tty: n_gsm: avoid recursive locking with async port hangup")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
Link: https://lore.kernel.org/r/20220218073123.2121-4-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 19:51:04 +01:00
daniel.starke@siemens.com
e3b7468f08 tty: n_gsm: fix proper link termination after failed open
Trying to open a DLCI by sending a SABM frame may fail with a timeout.
The link is closed on the initiator side without informing the responder
about this event. The responder assumes the link is open after sending a
UA frame to answer the SABM frame. The link gets stuck in a half open
state.

This patch fixes this by initiating the proper link termination procedure
after link setup timeout instead of silently closing it down.

Fixes: e1eaea46bb ("tty: n_gsm line discipline")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
Link: https://lore.kernel.org/r/20220218073123.2121-3-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 19:51:04 +01:00
daniel.starke@siemens.com
57435c4240 tty: n_gsm: fix encoding of command/response bit
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.2.1.2 describes the encoding of the
C/R (command/response) bit. Table 1 shows that the actual encoding of the
C/R bit is inverted if the associated frame is sent by the responder.

The referenced commit fixed here further broke the internal meaning of this
bit in the outgoing path by always setting the C/R bit regardless of the
frame type.

This patch fixes both by setting the C/R bit always consistently for
command (1) and response (0) frames and inverting it later for the
responder where necessary. The meaning of this bit in the debug output
is being preserved and shows the bit as if it was encoded by the initiator.
This reflects only the frame type rather than the encoded combination of
communication side and frame type.

Fixes: cc0f42122a ("tty: n_gsm: Modify CR,PF bit when config requester")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
Link: https://lore.kernel.org/r/20220218073123.2121-2-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 19:51:04 +01:00
daniel.starke@siemens.com
737b0ef3be tty: n_gsm: fix encoding of control signal octet bit DV
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.4.6.3.7 describes the encoding of the
control signal octet used by the MSC (modem status command). The same
encoding is also used in convergence layer type 2 as described in chapter
5.5.2. Table 7 and 24 both require the DV (data valid) bit to be set 1 for
outgoing control signal octets sent by the DTE (data terminal equipment),
i.e. for the initiator side.
Currently, the DV bit is only set if CD (carrier detect) is on, regardless
of the side.

This patch fixes this behavior by setting the DV bit on the initiator side
unconditionally.

Fixes: e1eaea46bb ("tty: n_gsm line discipline")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
Link: https://lore.kernel.org/r/20220218073123.2121-1-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 19:51:04 +01:00
Linus Torvalds
038101e6b2 Merge tag 'platform-drivers-x86-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
 "Two small fixes and one hardware-id addition"

* tag 'platform-drivers-x86-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: int3472: Add terminator to gpiod_lookup_table
  platform/x86: asus-wmi: Fix regression when probing for fan curve control
  platform/x86: thinkpad_acpi: Add dual-fan quirk for T15g (2nd gen)
2022-02-21 09:10:53 -08:00
Christophe Kerello
6c76218909 mtd: core: Fix a conflict between MTD and NVMEM on wp-gpios property
Wp-gpios property can be used on NVMEM nodes and the same property can
be also used on MTD NAND nodes. In case of the wp-gpios property is
defined at NAND level node, the GPIO management is done at NAND driver
level. Write protect is disabled when the driver is probed or resumed
and is enabled when the driver is released or suspended.

When no partitions are defined in the NAND DT node, then the NAND DT node
will be passed to NVMEM framework. If wp-gpios property is defined in
this node, the GPIO resource is taken twice and the NAND controller
driver fails to probe.

A new Boolean flag named ignore_wp has been added in nvmem_config.
In case ignore_wp is set, it means that the GPIO is handled by the
provider. Lets set this flag in MTD layer to avoid the conflict on
wp_gpios property.

Fixes: 2a127da461 ("nvmem: add support for the write-protect pin")
Cc: stable@vger.kernel.org
Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20220220151432.16605-3-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 17:59:25 +01:00
Christophe Kerello
f6c052afe6 nvmem: core: Fix a conflict between MTD and NVMEM on wp-gpios property
Wp-gpios property can be used on NVMEM nodes and the same property can
be also used on MTD NAND nodes. In case of the wp-gpios property is
defined at NAND level node, the GPIO management is done at NAND driver
level. Write protect is disabled when the driver is probed or resumed
and is enabled when the driver is released or suspended.

When no partitions are defined in the NAND DT node, then the NAND DT node
will be passed to NVMEM framework. If wp-gpios property is defined in
this node, the GPIO resource is taken twice and the NAND controller
driver fails to probe.

It would be possible to set config->wp_gpio at MTD level before calling
nvmem_register function but NVMEM framework will toggle this GPIO on
each write when this GPIO should only be controlled at NAND level driver
to ensure that the Write Protect has not been enabled.

A way to fix this conflict is to add a new boolean flag in nvmem_config
named ignore_wp. In case ignore_wp is set, the GPIO resource will
be managed by the provider.

Fixes: 2a127da461 ("nvmem: add support for the write-protect pin")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20220220151432.16605-2-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-21 17:59:25 +01:00
Greg Kroah-Hartman
efe8a1e7ca Merge tag 'iio-fixes-for-5.17a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus
Jonathan writes:

1st set of IIO fixes for the 5.17 cycle.

Several drivers:
 - Fix a failure to disable runtime in probe error paths. All cases
   were introduced in the same rework patch.

adi,ad7124
 - Fix incorrect register masking.
adi,ad74413r
 - Avoid referencing negative array offsets.
 - Use ngpio size when iterating over mask not numebr of channels.
 - Fix issue with wrong mask uage getting GPIOs.
adi,admv1014
 - Drop check on unsigned less than 0.
adi,ads16480
 - Correctly handle devices that don't have burst mode support.
fsl,fxls8962af
 - Add missing padding needed between address and data for SPI transfers.
men_z188
 - Fix iomap leak in error path.
st,lsm6dsx
 - Wait for setting time in oneshot reads to get a stable result.
ti,tsc2046
 - Prevent an array overflow.

* tag 'iio-fixes-for-5.17a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
  iio: imu: st_lsm6dsx: wait for settling time in st_lsm6dsx_read_oneshot
  iio: Fix error handling for PM
  iio: addac: ad74413r: correct comparator gpio getters mask usage
  iio: addac: ad74413r: use ngpio size when iterating over mask
  iio: addac: ad74413r: Do not reference negative array offsets
  iio: adc: men_z188_adc: Fix a resource leak in an error handling path
  iio: frequency: admv1013: remove the always true condition
  iio: accel: fxls8962af: add padding to regmap for SPI
  iio:imu:adis16480: fix buffering for devices with no burst mode
  iio: adc: ad7124: fix mask used for setting AIN_BUFP & AIN_BUFM bits
  iio: adc: tsc2046: fix memory corruption by preventing array overflow
2022-02-21 17:58:09 +01:00
Max Kellermann
9d2231c5d7 lib/iov_iter: initialize "flags" in new pipe_buffer
The functions copy_page_to_iter_pipe() and push_pipe() can both
allocate a new pipe_buffer, but the "flags" member initializer is
missing.

Fixes: 241699cd72 ("new iov_iter flavour: pipe-backed")
To: Alexander Viro <viro@zeniv.linux.org.uk>
To: linux-fsdevel@vger.kernel.org
To: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-02-21 10:16:39 -05:00
Julian Braha
11c57c3ba9 ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE
Resending this to properly add it to the patch tracker - thanks for letting
me know, Arnd :)

When ARM is enabled, and BITREVERSE is disabled,
Kbuild gives the following warning:

WARNING: unmet direct dependencies detected for HAVE_ARCH_BITREVERSE
  Depends on [n]: BITREVERSE [=n]
  Selected by [y]:
  - ARM [=y] && (CPU_32v7M [=n] || CPU_32v7 [=y]) && !CPU_32v6 [=n]

This is because ARM selects HAVE_ARCH_BITREVERSE
without selecting BITREVERSE, despite
HAVE_ARCH_BITREVERSE depending on BITREVERSE.

This unmet dependency bug was found by Kismet,
a static analysis tool for Kconfig. Please advise if this
is not the appropriate solution.

Signed-off-by: Julian Braha <julianbraha@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-02-21 14:58:34 +00:00
Russell King (Oracle)
d920eaa4c4 ARM: Fix kgdb breakpoint for Thumb2
The kgdb code needs to register an undef hook for the Thumb UDF
instruction that will fault in order to be functional on Thumb2
platforms.

Reported-by: Johannes Stezenbach <js@sig21.net>
Tested-by: Johannes Stezenbach <js@sig21.net>
Fixes: 5cbad0ebf4 ("kgdb: support for ARCH=arm")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-02-21 14:56:53 +00:00
Florian Westphal
1a58f84ea5 netfilter: nft_limit: fix stateful object memory leak
We need to provide a destroy callback to release the extra fields.

Fixes: 3b9e2ea6c1 ("netfilter: nft_limit: move stateful fields out of expression data")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-21 15:52:14 +01:00
Pablo Neira Ayuso
6069da443b netfilter: nf_tables: unregister flowtable hooks on netns exit
Unregister flowtable hooks before they are releases via
nf_tables_flowtable_destroy() otherwise hook core reports UAF.

BUG: KASAN: use-after-free in nf_hook_entries_grow+0x5a7/0x700 net/netfilter/core.c:142 net/netfilter/core.c:142
Read of size 4 at addr ffff8880736f7438 by task syz-executor579/3666

CPU: 0 PID: 3666 Comm: syz-executor579 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 __dump_stack lib/dump_stack.c:88 [inline] lib/dump_stack.c:106
 dump_stack_lvl+0x1dc/0x2d8 lib/dump_stack.c:106 lib/dump_stack.c:106
 print_address_description+0x65/0x380 mm/kasan/report.c:247 mm/kasan/report.c:247
 __kasan_report mm/kasan/report.c:433 [inline]
 __kasan_report mm/kasan/report.c:433 [inline] mm/kasan/report.c:450
 kasan_report+0x19a/0x1f0 mm/kasan/report.c:450 mm/kasan/report.c:450
 nf_hook_entries_grow+0x5a7/0x700 net/netfilter/core.c:142 net/netfilter/core.c:142
 __nf_register_net_hook+0x27e/0x8d0 net/netfilter/core.c:429 net/netfilter/core.c:429
 nf_register_net_hook+0xaa/0x180 net/netfilter/core.c:571 net/netfilter/core.c:571
 nft_register_flowtable_net_hooks+0x3c5/0x730 net/netfilter/nf_tables_api.c:7232 net/netfilter/nf_tables_api.c:7232
 nf_tables_newflowtable+0x2022/0x2cf0 net/netfilter/nf_tables_api.c:7430 net/netfilter/nf_tables_api.c:7430
 nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline]
 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline]
 nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline] net/netfilter/nfnetlink.c:652
 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline] net/netfilter/nfnetlink.c:652
 nfnetlink_rcv+0x10e6/0x2550 net/netfilter/nfnetlink.c:652 net/netfilter/nfnetlink.c:652

__nft_release_hook() calls nft_unregister_flowtable_net_hooks() which
only unregisters the hooks, then after RCU grace period, it is
guaranteed that no packets add new entries to the flowtable (no flow
offload rules and flowtable hooks are reachable from packet path), so it
is safe to call nf_flow_table_free() which cleans up the remaining
entries from the flowtable (both software and hardware) and it unbinds
the flow_block.

Fixes: ff4bf2f42a ("netfilter: nf_tables: add nft_unregister_flowtable_hook()")
Reported-by: syzbot+e918523f77e62790d6d9@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-21 15:51:55 +01:00
Jeff Layton
c086df4902 fuse: move FUSE_SUPER_MAGIC definition to magic.h
...to help userland apps that need to identify FUSE mounts.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-02-21 14:57:26 +01:00
Daniel Scally
ae09639e3b platform/x86: int3472: Add terminator to gpiod_lookup_table
Without the terminator, if a con_id is passed to gpio_find() that
does not exist in the lookup table the function will not stop looping
correctly, and eventually cause an oops.

Fixes: 19d8d6e36b ("platform/x86: int3472: Pass tps68470_regulator_platform_data to the tps68470-regulator MFD-cell")
Signed-off-by: Daniel Scally <djrscally@gmail.com>
Link: https://lore.kernel.org/r/20220216225304.53911-5-djrscally@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-02-21 14:47:59 +01:00
Baruch Siach
b6ad6261d2 net: mdio-ipq4019: add delay after clock enable
Experimentation shows that PHY detect might fail when the code attempts
MDIO bus read immediately after clock enable. Add delay to stabilize the
clock before bus access.

PHY detect failure started to show after commit 7590fc6f80 ("net:
mdio: Demote probed message to debug print") that removed coincidental
delay between clock enable and bus access.

10ms is meant to match the time it take to send the probed message over
UART at 115200 bps. This might be a far overshoot.

Fixes: 23a890d493 ("net: mdio: Add the reset function for IPQ MDIO driver")
Signed-off-by: Baruch Siach <baruch.siach@siklu.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21 13:04:53 +00:00
Jens Axboe
228339662b io_uring: don't convert to jiffies for waiting on timeouts
If an application calls io_uring_enter(2) with a timespec passed in,
convert that timespec to ktime_t rather than jiffies. The latter does
not provide the granularity the application may expect, and may in
fact provided different granularity on different systems, depending
on what the HZ value is configured at.

Turn the timespec into an absolute ktime_t, and use that with
schedule_hrtimeout() instead.

Link: https://github.com/axboe/liburing/issues/531
Cc: stable@vger.kernel.org
Reported-by: Bob Chen <chenbo.chen@alibaba-inc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-21 05:55:42 -07:00
Tao Liu
cc20cced05 gso: do not skip outer ip header in case of ipip and net_failover
We encounter a tcp drop issue in our cloud environment. Packet GROed in
host forwards to a VM virtio_net nic with net_failover enabled. VM acts
as a IPVS LB with ipip encapsulation. The full path like:
host gro -> vm virtio_net rx -> net_failover rx -> ipvs fullnat
 -> ipip encap -> net_failover tx -> virtio_net tx

When net_failover transmits a ipip pkt (gso_type = 0x0103, which means
SKB_GSO_TCPV4, SKB_GSO_DODGY and SKB_GSO_IPXIP4), there is no gso
did because it supports TSO and GSO_IPXIP4. But network_header points to
inner ip header.

Call Trace:
 tcp4_gso_segment        ------> return NULL
 inet_gso_segment        ------> inner iph, network_header points to
 ipip_gso_segment
 inet_gso_segment        ------> outer iph
 skb_mac_gso_segment

Afterwards virtio_net transmits the pkt, only inner ip header is modified.
And the outer one just keeps unchanged. The pkt will be dropped in remote
host.

Call Trace:
 inet_gso_segment        ------> inner iph, outer iph is skipped
 skb_mac_gso_segment
 __skb_gso_segment
 validate_xmit_skb
 validate_xmit_skb_list
 sch_direct_xmit
 __qdisc_run
 __dev_queue_xmit        ------> virtio_net
 dev_hard_start_xmit
 __dev_queue_xmit        ------> net_failover
 ip_finish_output2
 ip_output
 iptunnel_xmit
 ip_tunnel_xmit
 ipip_tunnel_xmit        ------> ipip
 dev_hard_start_xmit
 __dev_queue_xmit
 ip_finish_output2
 ip_output
 ip_forward
 ip_rcv
 __netif_receive_skb_one_core
 netif_receive_skb_internal
 napi_gro_receive
 receive_buf
 virtnet_poll
 net_rx_action

The root cause of this issue is specific with the rare combination of
SKB_GSO_DODGY and a tunnel device that adds an SKB_GSO_ tunnel option.
SKB_GSO_DODGY is set from external virtio_net. We need to reset network
header when callbacks.gso_segment() returns NULL.

This patch also includes ipv6_gso_segment(), considering SIT, etc.

Fixes: cb32f511a7 ("ipip: add GSO/TSO support")
Signed-off-by: Tao Liu <thomas.liu@ucloud.cn>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21 11:41:30 +00:00
Roman Gushchin
221944736f tools/cgroup/slabinfo: update to work with struct slab
After the introduction of the dedicated struct slab to describe slab
pages by commit d122019bf0 ("mm: Split slab into its own type") and
the following removal of the corresponding struct page's fields by
commit 07f910f9b7 ("mm: Remove slab from struct page") the
memcg_slabinfo tool broke. An attempt to run it produces a trace like
this:
Traceback (most recent call last):
  File "/usr/bin/drgn", line 33, in <module>
    sys.exit(load_entry_point('drgn==0.0.16', 'console_scripts', 'drgn')())
  File "/usr/lib64/python3.9/site-packages/drgn/internal/cli.py", line 133, in main
    runpy.run_path(args.script[0], init_globals=init_globals, run_name="__main__")
  File "/usr/lib64/python3.9/runpy.py", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/usr/lib64/python3.9/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "memcg_slabinfo.py", line 226, in <module>
    main()
  File "memcg_slabinfo.py", line 199, in main
    cache = page.slab_cache
AttributeError: 'struct page' has no member 'slab_cache'

The problem can be fixed by explicitly casting struct page * to struct
slab * for slab pages. The tools works as expected with this fix, e.g.:

cred_jar             776    776    192   21    1 : tunables    0    0    0 : slabdata    547    547      0
kmalloc-cg-32          6      6     32  128    1 : tunables    0    0    0 : slabdata      9      9      0
files_cache            3      3    832   39    8 : tunables    0    0    0 : slabdata      8      8      0
kmalloc-cg-512         1      1    512   32    4 : tunables    0    0    0 : slabdata     10     10      0
task_struct           10     10   6720    4    8 : tunables    0    0    0 : slabdata     63     63      0
mm_struct              3      3   1664   19    8 : tunables    0    0    0 : slabdata      9      9      0
kmalloc-cg-16          1      1     16  256    1 : tunables    0    0    0 : slabdata      8      8      0
pde_opener             1      1     40  102    1 : tunables    0    0    0 : slabdata      8      8      0
anon_vma_chain       375    375     64   64    1 : tunables    0    0    0 : slabdata     81     81      0
radix_tree_node        3      3    584   28    4 : tunables    0    0    0 : slabdata    419    419      0
dentry                98     98    312   26    2 : tunables    0    0    0 : slabdata   1420   1420      0
btrfs_inode            3      3   2368   13    8 : tunables    0    0    0 : slabdata    730    730      0
signal_cache           3      3   1600   20    8 : tunables    0    0    0 : slabdata     17     17      0
sighand_cache          3      3   2240   14    8 : tunables    0    0    0 : slabdata     20     20      0
filp                  90     90    512   32    4 : tunables    0    0    0 : slabdata     95     95      0
anon_vma             214    214    200   20    1 : tunables    0    0    0 : slabdata    162    162      0
kmalloc-cg-1k          1      1   1024   32    8 : tunables    0    0    0 : slabdata     22     22      0
pid                   10     10    256   32    2 : tunables    0    0    0 : slabdata     14     14      0
kmalloc-cg-64          2      2     64   64    1 : tunables    0    0    0 : slabdata      8      8      0
kmalloc-cg-96          3      3     96   42    1 : tunables    0    0    0 : slabdata      8      8      0
sock_inode_cache       5      5   1408   23    8 : tunables    0    0    0 : slabdata     29     29      0
UNIX                   7      7   1920   17    8 : tunables    0    0    0 : slabdata     21     21      0
inode_cache           36     36   1152   28    8 : tunables    0    0    0 : slabdata    680    680      0
proc_inode_cache      26     26   1224   26    8 : tunables    0    0    0 : slabdata     64     64      0
kmalloc-cg-2k          2      2   2048   16    8 : tunables    0    0    0 : slabdata      9      9      0

v2: change naming and count_partial()/count_free()/for_each_slab()
    signatures to work with slabs, suggested by Matthew Wilcox

Fixes: 07f910f9b7 ("mm: Remove slab from struct page")
Reported-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Roman Gushchin <guro@fb.com>
Tested-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/linux-patches/Yg2cKKnIboNu7j+p@carbon.DHCP.thefacebook.com/
2022-02-21 11:34:49 +01:00
Greg Kroah-Hartman
93dd04ab0b slab: remove __alloc_size attribute from __kmalloc_track_caller
Commit c37495d625 ("slab: add __alloc_size attributes for better
bounds checking") added __alloc_size attributes to a bunch of kmalloc
function prototypes.  Unfortunately the change to __kmalloc_track_caller
seems to cause clang to generate broken code and the first time this is
called when booting, the box will crash.

While the compiler problems are being reworked and attempted to be
solved [1], let's just drop the attribute to solve the issue now.  Once
it is resolved it can be added back.

[1] https://github.com/ClangBuiltLinux/linux/issues/1599

Fixes: c37495d625 ("slab: add __alloc_size attributes for better bounds checking")
Cc: stable <stable@vger.kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: llvm@lists.linux.dev
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220218131358.3032912-1-gregkh@linuxfoundation.org
2022-02-21 11:32:44 +01:00
Matt Roper
28adef8612 drm/i915/dg2: Print PHY name properly on calibration error
We need to use phy_name() to convert the PHY value into a human-readable
character in the error message.

Fixes: a6a128116e ("drm/i915/dg2: Wait for SNPS PHY calibration during display init")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220215163545.2175730-1-matthew.d.roper@intel.com
(cherry picked from commit 84073e568e)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-02-21 09:43:11 +00:00
Ville Syrjälä
ec663bca91 drm/i915: Fix bw atomic check when switching between SAGV vs. no SAGV
If the only thing that is changing is SAGV vs. no SAGV but
the number of active planes and the total data rates end up
unchanged we currently bail out of intel_bw_atomic_check()
early and forget to actually compute the new WGV point
mask and thus won't actually enable/disable SAGV as requested.
This ends up poorly if we end up running with SAGV enabled
when we shouldn't. Usually ends up in underruns.

To fix this let's go through the QGV point mask computation
if either the data rates/number of planes, or the state
of SAGV is changing.

v2: Check more carefully if things are changing to avoid
    the extra calculations/debugs from introducing unwanted
    overhead

Cc: stable@vger.kernel.org
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> #v1
Fixes: 20f505f225 ("drm/i915: Restrict qgv points which don't have enough bandwidth.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218064039.12834-3-ville.syrjala@linux.intel.com
(cherry picked from commit 6b728595ff)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-02-21 09:37:19 +00:00
Ville Syrjälä
afc189df6b drm/i915: Correctly populate use_sagv_wm for all pipes
When changing between SAGV vs. no SAGV on tgl+ we have to
update the use_sagv_wm flag for all the crtcs or else
an active pipe not already in the state will end up using
the wrong watermarks. That is especially bad when we end up
with the tighter non-SAGV watermarks with SAGV enabled.
Usually ends up in underruns.

Cc: stable@vger.kernel.org
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Fixes: 7241c57d31 ("drm/i915: Add TGL+ SAGV support")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218064039.12834-2-ville.syrjala@linux.intel.com
(cherry picked from commit 8dd8ffb824)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-02-21 09:37:19 +00:00
Imre Deak
a40ee54e9a drm/i915: Disconnect PHYs left connected by BIOS on disabled ports
BIOS may leave a TypeC PHY in a connected state even though the
corresponding port is disabled. This will prevent any hotplug events
from being signalled (after the monitor deasserts and then reasserts its
HPD) until the PHY is disconnected and so the driver will not detect a
connected sink. Rebooting with the PHY in the connected state also
results in a system hang.

Fix the above by disconnecting TypeC PHYs on disabled ports.

Before commit 64851a32c4 the PHY connected state was read out even
for disabled ports and later the PHY got disconnected as a side effect
of a tc_port_lock/unlock() sequence (during connector probing), hence
recovering the port's hotplug functionality.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5014
Fixes: 64851a32c4 ("drm/i915/tc: Add a mode for the TypeC PHY's disconnected state")
Cc: <stable@vger.kernel.org> # v5.16+
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220217152237.670220-1-imre.deak@intel.com
(cherry picked from commit ed0ccf349f)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-02-21 09:37:19 +00:00
Ville Syrjälä
3f33364836 drm/i915: Widen the QGV point mask
adlp+ adds some extra bits to the QGV point mask. The code attempts
to handle that but forgot to actually make sure we can store those
bits in the bw state. Fix it.

Cc: stable@vger.kernel.org
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Fixes: 192fbfb767 ("drm/i915: Implement PSF GV point support")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220214091811.13725-4-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
(cherry picked from commit c0299cc984)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-02-21 09:37:19 +00:00
Josh Poimboeuf
44a3918c82 x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
With unprivileged eBPF enabled, eIBRS (without retpoline) is vulnerable
to Spectre v2 BHB-based attacks.

When both are enabled, print a warning message and report it in the
'spectre_v2' sysfs vulnerabilities file.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2022-02-21 10:21:47 +01:00
Peter Zijlstra
5ad3eb1132 Documentation/hw-vuln: Update spectre doc
Update the doc with the new fun.

  [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2022-02-21 10:21:41 +01:00
Peter Zijlstra
1e19da8522 x86/speculation: Add eIBRS + Retpoline options
Thanks to the chaps at VUsec it is now clear that eIBRS is not
sufficient, therefore allow enabling of retpolines along with eIBRS.

Add spectre_v2=eibrs, spectre_v2=eibrs,lfence and
spectre_v2=eibrs,retpoline options to explicitly pick your preferred
means of mitigation.

Since there's new mitigations there's also user visible changes in
/sys/devices/system/cpu/vulnerabilities/spectre_v2 to reflect these
new mitigations.

  [ bp: Massage commit message, trim error messages,
    do more precise eIBRS mode checking. ]

Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Patrick Colp <patrick.colp@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2022-02-21 10:21:35 +01:00
Peter Zijlstra (Intel)
d45476d983 x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
The RETPOLINE_AMD name is unfortunate since it isn't necessarily
AMD only, in fact Hygon also uses it. Furthermore it will likely be
sufficient for some Intel processors. Therefore rename the thing to
RETPOLINE_LFENCE to better describe what it is.

Add the spectre_v2=retpoline,lfence option as an alias to
spectre_v2=retpoline,amd to preserve existing setups. However, the output
of /sys/devices/system/cpu/vulnerabilities/spectre_v2 will be changed.

  [ bp: Fix typos, massage. ]

Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2022-02-21 10:21:28 +01:00
Daniele Palmas
cfc4442c64 USB: serial: option: add Telit LE910R1 compositions
Add support for the following Telit LE910R1 compositions:

0x701a: rndis, tty, tty, tty
0x701b: ecm, tty, tty, tty
0x9201: tty

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Link: https://lore.kernel.org/r/20220218134552.4051-1-dnlplm@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2022-02-21 10:20:26 +01:00
Slark Xiao
6ecb3f0b18 USB: serial: option: add support for DW5829e
Dell DW5829e same as DW5821e except CAT level.
DW5821e supports CAT16 but DW5829e supports CAT9.
There are 2 types product of DW5829e: normal and eSIM.
So we will add 2 PID for DW5829e.
And for each PID, it support MBIM or RMNET.
Let's see test evidence as below:

DW5829e MBIM mode:
T:  Bus=04 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  4 Spd=5000 MxCh= 0
D:  Ver= 3.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  2
P:  Vendor=413c ProdID=81e6 Rev=03.18
S:  Manufacturer=Dell Inc.
S:  Product=DW5829e Snapdragon X20 LTE
S:  SerialNumber=0123456789ABCDEF
C:  #Ifs= 7 Cfg#= 2 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I:  If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I:  If#=0x6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)

DW5829e RMNET mode:
T:  Bus=04 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  5 Spd=5000 MxCh= 0
D:  Ver= 3.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  1
P:  Vendor=413c ProdID=81e6 Rev=03.18
S:  Manufacturer=Dell Inc.
S:  Product=DW5829e Snapdragon X20 LTE
S:  SerialNumber=0123456789ABCDEF
C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver=usbhid
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option

DW5829e-eSIM MBIM mode:
T:  Bus=04 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  6 Spd=5000 MxCh= 0
D:  Ver= 3.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  2
P:  Vendor=413c ProdID=81e4 Rev=03.18
S:  Manufacturer=Dell Inc.
S:  Product=DW5829e-eSIM Snapdragon X20 LTE
S:  SerialNumber=0123456789ABCDEF
C:  #Ifs= 7 Cfg#= 2 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I:  If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I:  If#=0x6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)

DW5829e-eSIM RMNET mode:
T:  Bus=04 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  7 Spd=5000 MxCh= 0
D:  Ver= 3.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  1
P:  Vendor=413c ProdID=81e4 Rev=03.18
S:  Manufacturer=Dell Inc.
S:  Product=DW5829e-eSIM Snapdragon X20 LTE
S:  SerialNumber=0123456789ABCDEF
C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver=usbhid
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option

BTW, the interface 0x6 of MBIM mode is GNSS port, which not same as NMEA
port. So it's banned from serial option driver.
The remaining interfaces 0x2-0x5 are: MODEM, MODEM, NMEA, DIAG.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Link: https://lore.kernel.org/r/20220214021401.6264-1-slark_xiao@163.com
[ johan: drop unnecessary reservation of interface 1 ]
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2022-02-21 10:19:31 +01:00
Dmytro Bagrii
198a7ebd5f Revert "USB: serial: ch341: add new Product ID for CH341A"
This reverts commit 46ee4abb10.

CH341 has Product ID 0x5512 in EPP/MEM mode which is used for
I2C/SPI/GPIO interfaces. In asynchronous serial interface mode
CH341 has PID 0x5523 which is already in the table.

Mode is selected by corresponding jumper setting.

Signed-off-by: Dmytro Bagrii <dimich.dmb@gmail.com>
Link: https://lore.kernel.org/r/20220210164137.4376-1-dimich.dmb@gmail.com
Link: https://lore.kernel.org/r/YJ0OCS/sh+1ifD/q@hovoldconsulting.com
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2022-02-21 09:58:14 +01:00
Pavel Skripkin
fc3ef2e329 HID: hid-thrustmaster: fix OOB read in thrustmaster_interrupts
Syzbot reported an slab-out-of-bounds Read in thrustmaster_probe() bug.
The root case is in missing validation check of actual number of endpoints.

Code should not blindly access usb_host_interface::endpoint array, since
it may contain less endpoints than code expects.

Fix it by adding missing validaion check and print an error if
number of endpoints do not match expected number

Fixes: c49c336378 ("HID: support for initialization of some Thrustmaster wheels")
Reported-and-tested-by: syzbot+35eebd505e97d315d01c@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-02-21 09:15:10 +01:00
Laurent Pinchart
fa231bef3b soc: imx: gpcv2: Fix clock disabling imbalance in error path
The imx_pgc_power_down() starts by enabling the domain clocks, and thus
disables them in the error path. Commit 18c98573a4 ("soc: imx: gpcv2:
add domain option to keep domain clocks enabled") made the clock enable
conditional, but forgot to add the same condition to the error path.
This can result in a clock enable/disable imbalance. Fix it.

Fixes: 18c98573a4 ("soc: imx: gpcv2: add domain option to keep domain clocks enabled")
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2022-02-21 15:27:41 +08:00
Thomas Gleixner
ba1366f3d0 PCI: vmd: Prevent recursive locking on interrupt allocation
Tejas reported the following recursive locking issue:

 swapper/0/1 is trying to acquire lock:
 ffff8881074fd0a0 (&md->mutex){+.+.}-{3:3}, at: msi_get_virq+0x30/0xc0
 
 but task is already holding lock:
 ffff8881017cd6a0 (&md->mutex){+.+.}-{3:3}, at: __pci_enable_msi_range+0xf2/0x290
 
 stack backtrace:
  __mutex_lock+0x9d/0x920
  msi_get_virq+0x30/0xc0
  pci_irq_vector+0x26/0x30
  vmd_msi_init+0xcc/0x210
  msi_domain_alloc+0xbf/0x150
  msi_domain_alloc_irqs_descs_locked+0x3e/0xb0
  __pci_enable_msi_range+0x155/0x290
  pci_alloc_irq_vectors_affinity+0xba/0x100
  pcie_port_device_register+0x307/0x550
  pcie_portdrv_probe+0x3c/0xd0
  pci_device_probe+0x95/0x110

This is caused by the VMD MSI code which does a lookup of the Linux
interrupt number for an VMD managed MSI[X] vector. The lookup function
tries to acquire the already held mutex.

Avoid that by caching the Linux interrupt number at initialization time
instead of looking it up over and over.

Fixes: 82ff8e6b78 ("PCI/MSI: Use msi_get_virq() in pci_get_vector()")
Reported-by: "Surendrakumar Upadhyay, TejaskumarX" <tejaskumarx.surendrakumar.upadhyay@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: "Surendrakumar Upadhyay, TejaskumarX" <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: linux-pci@vger.kernel.org
Link: https://lore.kernel.org/r/87a6euub2a.ffs@tglx
2022-02-21 08:26:53 +01:00
David S. Miller
5a3449734b Merge branch 'bnxt_en-fixes'
Michael Chan says:

====================
bnxt_en: Bug fixes

This series contains bug fixes for FEC reporting, ethtool self test,
multicast setup, devlink health reporting and live patching, and
a firmware response timeout.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-20 13:47:15 +00:00
Kalesh AP
1278d17a1f bnxt_en: Fix devlink fw_activate
To install a livepatch, first flash the package to NVM, and then
activate the patch through the "HWRM_FW_LIVEPATCH" fw command.
To uninstall a patch from NVM, flash the removal package and then
activate it through the "HWRM_FW_LIVEPATCH" fw command.

The "HWRM_FW_LIVEPATCH" fw command has to consider following scenarios:

1. no patch in NVM and no patch active. Do nothing.
2. patch in NVM, but not active. Activate the patch currently in NVM.
3. patch is not in NVM, but active. Deactivate the patch.
4. patch in NVM and the patch active. Do nothing.

Fix the code to handle these scenarios during devlink "fw_activate".

To install and activate a live patch:
devlink dev flash pci/0000:c1:00.0 file thor_patch.pkg
devlink -f dev reload pci/0000:c1:00.0 action fw_activate limit no_reset

To remove and deactivate a live patch:
devlink dev flash pci/0000:c1:00.0 file thor_patch_rem.pkg
devlink -f dev reload pci/0000:c1:00.0 action fw_activate limit no_reset

Fixes: 3c4153394e ("bnxt_en: implement firmware live patching")
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-20 13:47:15 +00:00
Michael Chan
b891106da5 bnxt_en: Increase firmware message response DMA wait time
When polling for the firmware message response, we first poll for the
response message header.  Once the valid length is detected in the
header, we poll for the valid bit at the end of the message which
signals DMA completion.  Normally, this poll time for DMA completion
is extremely short (0 to a few usec).  But on some devices under some
rare conditions, it can be up to about 20 msec.

Increase this delay to 50 msec and use udelay() for the first 10 usec
for the common case, and usleep_range() beyond that.

Also, change the error message to include the above delay time when
printing the timeout value.

Fixes: 3c8c20db76 ("bnxt_en: move HWRM API implementation into separate file")
Reviewed-by: Vladimir Olovyannikov <vladimir.olovyannikov@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-20 13:47:15 +00:00
Kalesh AP
0e0e3c5358 bnxt_en: Restore the resets_reliable flag in bnxt_open()
During ifdown, we call bnxt_inv_fw_health_reg() which will clear
both the status_reliable and resets_reliable flags if these
registers are mapped.  This is correct because a FW reset during
ifdown will clear these register mappings.  If we detect that FW
has gone through reset during the next ifup, we will remap these
registers.

But during normal ifup with no FW reset, we need to restore the
resets_reliable flag otherwise we will not show the reset counter
during devlink diagnose.

Fixes: 8cc95ceb70 ("bnxt_en: improve fw diagnose devlink health messages")
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-20 13:47:15 +00:00
Pavan Chebbi
8cdb159242 bnxt_en: Fix incorrect multicast rx mask setting when not requested
We should setup multicast only when net_device flags explicitly
has IFF_MULTICAST set. Otherwise we will incorrectly turn it on
even when not asked.  Fix it by only passing the multicast table
to the firmware if IFF_MULTICAST is set.

Fixes: 7d2837dd7a ("bnxt_en: Setup multicast properly after resetting device.")
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-20 13:47:14 +00:00
Michael Chan
cfcab3b3b6 bnxt_en: Fix occasional ethtool -t loopback test failures
In the current code, we setup the port to PHY or MAC loopback mode
and then transmit a test broadcast packet for the loopback test.  This
scheme fails sometime if the port is shared with management firmware
that can also send packets.  The driver may receive the management
firmware's packet and the test will fail when the contents don't
match the test packet.

Change the test packet to use it's own MAC address as the destination
and setup the port to only receive it's own MAC address.  This should
filter out other packets sent by management firmware.

Fixes: 91725d89b9 ("bnxt_en: Add PHY loopback to ethtool self-test.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-20 13:47:14 +00:00
Michael Chan
6758f93766 bnxt_en: Fix offline ethtool selftest with RDMA enabled
For offline (destructive) self tests, we need to stop the RDMA driver
first.  Otherwise, the RDMA driver will run into unrecoverable errors
when destructive firmware tests are being performed.

The irq_re_init parameter used in the half close and half open
sequence when preparing the NIC for offline tests should be set to
true because the RDMA driver will free all IRQs before the offline
tests begin.

Fixes: 55fd0cf320 ("bnxt_en: Add external loopback test to ethtool selftest.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Ben Li <ben.li@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-20 13:47:14 +00:00
Somnath Kotur
84d3c83e6e bnxt_en: Fix active FEC reporting to ethtool
ethtool --show-fec <interface> does not show anything when the Active
FEC setting in the chip is set to None.  Fix it to properly return
ETHTOOL_FEC_OFF in that case.

Fixes: 8b2775890a ("bnxt_en: Report FEC settings to ethtool.")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-20 13:47:14 +00:00
Miaohe Lin
c94afc46ca memblock: use kfree() to release kmalloced memblock regions
memblock.{reserved,memory}.regions may be allocated using kmalloc() in
memblock_double_array(). Use kfree() to release these kmalloced regions
indicated by memblock_{reserved,memory}_in_slab.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Fixes: 3010f87650 ("mm: discard memblock data later")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
2022-02-20 08:45:39 +02:00
Linus Walleij
e23e40fd6d hwmon: (ntc_thermistor) Underscore Samsung thermistor
The sysfs does not like that we name the thermistor something
that contains a dash:

ntc-thermistor thermistor: hwmon: 'ssg1404-001221' is not a valid
name attribute, please fix

Fix it up by switching to an underscore.

Fixes: e13e979b2b ("hwmon: (ntc_thermistor) Add Samsung 1404-001221 NTC")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220205005804.123245-1-linus.walleij@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-02-19 16:47:54 -08:00
Pablo Neira Ayuso
b1a5983f56 netfilter: nf_tables_offload: incorrect flow offload action array size
immediate verdict expression needs to allocate one slot in the flow offload
action array, however, immediate data expression does not need to do so.

fwd and dup expression need to allocate one slot, this is missing.

Add a new offload_action interface to report if this expression needs to
allocate one slot in the flow offload action array.

Fixes: be2861dc36 ("netfilter: nft_{fwd,dup}_netdev: add offload support")
Reported-and-tested-by: Nick Gregory <Nick.Gregory@Sophos.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-20 01:22:20 +01:00
Vladimir Oltean
8940e6b669 net: dsa: avoid call to __dev_set_promiscuity() while rtnl_mutex isn't held
If the DSA master doesn't support IFF_UNICAST_FLT, then the following
call path is possible:

dsa_slave_switchdev_event_work
-> dsa_port_host_fdb_add
   -> dev_uc_add
      -> __dev_set_rx_mode
         -> __dev_set_promiscuity

Since the blamed commit, dsa_slave_switchdev_event_work() no longer
holds rtnl_lock(), which triggers the ASSERT_RTNL() from
__dev_set_promiscuity().

Taking rtnl_lock() around dev_uc_add() is impossible, because all the
code paths that call dsa_flush_workqueue() do so from contexts where the
rtnl_mutex is already held - so this would lead to an instant deadlock.

dev_uc_add() in itself doesn't require the rtnl_mutex for protection.
There is this comment in __dev_set_rx_mode() which assumes so:

		/* Unicast addresses changes may only happen under the rtnl,
		 * therefore calling __dev_set_promiscuity here is safe.
		 */

but it is from commit 4417da668c ("[NET]: dev: secondary unicast
address support") dated June 2007, and in the meantime, commit
f1f28aa351 ("netdev: Add addr_list_lock to struct net_device."), dated
July 2008, has added &dev->addr_list_lock to protect this instead of the
global rtnl_mutex.

Nonetheless, __dev_set_promiscuity() does assume rtnl_mutex protection,
but it is the uncommon path of what we typically expect dev_uc_add()
to do. So since only the uncommon path requires rtnl_lock(), just check
ahead of time whether dev_uc_add() would result into a call to
__dev_set_promiscuity(), and handle that condition separately.

DSA already configures the master interface to be promiscuous if the
tagger requires this. We can extend this to also cover the case where
the master doesn't handle dev_uc_add() (doesn't support IFF_UNICAST_FLT),
and on the premise that we'd end up making it promiscuous during
operation anyway, either if a DSA slave has a non-inherited MAC address,
or if the bridge notifies local FDB entries for its own MAC address, the
address of a station learned on a foreign port, etc.

Fixes: 0faf890fc5 ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work")
Reported-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 21:18:33 +00:00
Svenning Sørensen
3d00827a90 net: dsa: microchip: fix bridging with more than two member ports
Commit b3612ccdf2 ("net: dsa: microchip: implement multi-bridge support")
plugged a packet leak between ports that were members of different bridges.
Unfortunately, this broke another use case, namely that of more than two
ports that are members of the same bridge.

After that commit, when a port is added to a bridge, hardware bridging
between other member ports of that bridge will be cleared, preventing
packet exchange between them.

Fix by ensuring that the Port VLAN Membership bitmap includes any existing
ports in the bridge, not just the port being added.

Fixes: b3612ccdf2 ("net: dsa: microchip: implement multi-bridge support")
Signed-off-by: Svenning Sørensen <sss@secomea.com>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 16:22:46 +00:00
Christophe Leroy
5486f5bf79 net: Force inlining of checksum functions in net/checksum.h
All functions defined as static inline in net/checksum.h are
meant to be inlined for performance reason.

But since commit ac7c3e4ff4 ("compiler: enable
CONFIG_OPTIMIZE_INLINING forcibly") the compiler is allowed to
uninline functions when it wants.

Fair enough in the general case, but for tiny performance critical
checksum helpers that's counter-productive.

The problem mainly arises when selecting CONFIG_CC_OPTIMISE_FOR_SIZE,
Those helpers being 'static inline' in header files you suddenly find
them duplicated many times in the resulting vmlinux.

Here is a typical exemple when building powerpc pmac32_defconfig
with CONFIG_CC_OPTIMISE_FOR_SIZE. csum_sub() appears 4 times:

	c04a23cc <csum_sub>:
	c04a23cc:	7c 84 20 f8 	not     r4,r4
	c04a23d0:	7c 63 20 14 	addc    r3,r3,r4
	c04a23d4:	7c 63 01 94 	addze   r3,r3
	c04a23d8:	4e 80 00 20 	blr
		...
	c04a2ce8:	4b ff f6 e5 	bl      c04a23cc <csum_sub>
		...
	c04a2d2c:	4b ff f6 a1 	bl      c04a23cc <csum_sub>
		...
	c04a2d54:	4b ff f6 79 	bl      c04a23cc <csum_sub>
		...
	c04a754c <csum_sub>:
	c04a754c:	7c 84 20 f8 	not     r4,r4
	c04a7550:	7c 63 20 14 	addc    r3,r3,r4
	c04a7554:	7c 63 01 94 	addze   r3,r3
	c04a7558:	4e 80 00 20 	blr
		...
	c04ac930:	4b ff ac 1d 	bl      c04a754c <csum_sub>
		...
	c04ad264:	4b ff a2 e9 	bl      c04a754c <csum_sub>
		...
	c04e3b08 <csum_sub>:
	c04e3b08:	7c 84 20 f8 	not     r4,r4
	c04e3b0c:	7c 63 20 14 	addc    r3,r3,r4
	c04e3b10:	7c 63 01 94 	addze   r3,r3
	c04e3b14:	4e 80 00 20 	blr
		...
	c04e5788:	4b ff e3 81 	bl      c04e3b08 <csum_sub>
		...
	c04e65c8:	4b ff d5 41 	bl      c04e3b08 <csum_sub>
		...
	c0512d34 <csum_sub>:
	c0512d34:	7c 84 20 f8 	not     r4,r4
	c0512d38:	7c 63 20 14 	addc    r3,r3,r4
	c0512d3c:	7c 63 01 94 	addze   r3,r3
	c0512d40:	4e 80 00 20 	blr
		...
	c0512dfc:	4b ff ff 39 	bl      c0512d34 <csum_sub>
		...
	c05138bc:	4b ff f4 79 	bl      c0512d34 <csum_sub>
		...

Restore the expected behaviour by using __always_inline for all
functions defined in net/checksum.h

vmlinux size is even reduced by 256 bytes with this patch:

	   text	   data	    bss	    dec	    hex	filename
	6980022	2515362	 194384	9689768	 93daa8	vmlinux.before
	6979862	2515266	 194384	9689512	 93d9a8	vmlinux.now

Fixes: ac7c3e4ff4 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly")
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 16:07:12 +00:00
David S. Miller
0033fced48 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-02-18

This series contains updates to ice driver only.

Wojciech fixes protocol matching for slow-path switchdev so that all
packets are correctly redirected.

Michal removes accidental unconditional setting of l4 port filtering
flag.

Jake adds locking to protect VF reset and removal to fix various issues
that can be encountered when they race with each other.

Tom Rix propagates an error and initializes a struct to resolve reported
Clang issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:35:20 +00:00
David S. Miller
90141edcd5 Merge branch 'mptcp-fixes'
Mat Martineau says:

====================
mptcp: Fix address advertisement races and stabilize tests

Patches 1, 2, and 7 modify two self tests to give consistent, accurate
results by fixing timing issues and accounting for syncookie behavior.

Paches 3-6 fix two races in overlapping address advertisement send and
receive. Associated self tests are updated, including addition of two
MIBs to enable testing and tracking dropped address events.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:01 +00:00
Paolo Abeni
e35f885b35 selftests: mptcp: be more conservative with cookie MPJ limits
Since commit 2843ff6f36 ("mptcp: remote addresses fullmesh"), an
MPTCP client can attempt creating multiple MPJ subflow simultaneusly.

In such scenario the server, when syncookies are enabled, could end-up
accepting incoming MPJ syn even above the configured subflow limit, as
the such limit can be enforced in a reliable way only after the subflow
creation. In case of syncookie, only after the 3rd ack reception.

As a consequence the related self-tests case sporadically fails, as it
verify that the server always accept the expected number of MPJ syn.

Address the issues relaxing the MPJ syn number constrain. Note that the
check on the accepted number of MPJ 3rd ack still remains intact.

Fixes: 2843ff6f36 ("mptcp: remote addresses fullmesh")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:01 +00:00
Paolo Abeni
6ef84b1517 selftests: mptcp: more robust signal race test
The in kernel MPTCP PM implementation can process a single
incoming add address option at any given time. In the
mentioned test the server can surpass such limit. Let the
setup cope with that allowing a faster add_addr retransmission.

Fixes: a88c9e4969 ("mptcp: do not block subflows creation on errors")
Fixes: f7efc7771e ("mptcp: drop argument port from mptcp_pm_announce_addr")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/254
Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:00 +00:00
Paolo Abeni
f73c119463 mptcp: add mibs counter for ignored incoming options
The MPTCP in kernel path manager has some constraints on incoming
addresses announce processing, so that in edge scenarios it can
end-up dropping (ignoring) some of such announces.

The above is not very limiting in practice since such scenarios are
very uncommon and MPTCP will recover due to ADD_ADDR retransmissions.

This patch adds a few MIB counters to account for such drop events
to allow easier introspection of the critical scenarios.

Fixes: f7efc7771e ("mptcp: drop argument port from mptcp_pm_announce_addr")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:00 +00:00
Paolo Abeni
837cf45df1 mptcp: fix race in incoming ADD_ADDR option processing
If an MPTCP endpoint received multiple consecutive incoming
ADD_ADDR options, mptcp_pm_add_addr_received() can overwrite
the current remote address value after the PM lock is released
in mptcp_pm_nl_add_addr_received() and before such address
is echoed.

Fix the issue caching the remote address value a little earlier
and always using the cached value after releasing the PM lock.

Fixes: f7efc7771e ("mptcp: drop argument port from mptcp_pm_announce_addr")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:00 +00:00
Paolo Abeni
98247bc16a mptcp: fix race in overlapping signal events
After commit a88c9e4969 ("mptcp: do not block subflows
creation on errors"), if a signal address races with a failing
subflow creation, the subflow creation failure control path
can trigger the selection of the next address to be announced
while the current announced is still pending.

The above will cause the unintended suppression of the ADD_ADDR
announce.

Fix the issue skipping the to-be-suppressed announce before it
will mark an endpoint as already used. The relevant announce
will be triggered again when the current one will complete.

Fixes: a88c9e4969 ("mptcp: do not block subflows creation on errors")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:00 +00:00
Paolo Abeni
5b31dda736 selftests: mptcp: improve 'fair usage on close' stability
The mentioned test has to wait for a subflow creation failure.
The current code looks for TCP sockets in TW state and sometimes
misses the relevant event. Switch to a more stable check, looking
for the associated mib counter.

Fixes: 46e967d187 ("selftests: mptcp: add tests for subflow creation failure")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/257
Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:00 +00:00
Paolo Abeni
0cd33c5ffe selftests: mptcp: fix diag instability
Instead of waiting for an arbitrary amount of time for the MPTCP
MP_CAPABLE handshake to complete, explicitly wait for the relevant
socket to enter into the established status.

Additionally let the data transfer application use the slowest
transfer mode available (-r), to cope with very slow host, or
high jitter caused by hosting VMs.

Fixes: df62f2ec3d ("selftests/mptcp: add diag interface tests")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/258
Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 12:28:00 +00:00
Christophe JAILLET
3a14d0888e nfp: flower: Fix a potential leak in nfp_tunnel_add_shared_mac()
ida_simple_get() returns an id between min (0) and max (NFP_MAX_MAC_INDEX)
inclusive.
So NFP_MAX_MAC_INDEX (0xff) is a valid id.

In order for the error handling path to work correctly, the 'invalid'
value for 'ida_idx' should not be in the 0..NFP_MAX_MAC_INDEX range,
inclusive.

So set it to -1.

Fixes: 20cce88650 ("nfp: flower: enable MAC address sharing for offloadable devs")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220218131535.100258-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18 21:08:14 -08:00
Subash Abhinov Kasiviswanathan
ba88b55337 MAINTAINERS: rmnet: Update email addresses
Switch to the quicinc.com ids.

Signed-off-by: Sean Tranchetti <quic_stranche@quicinc.com>
Signed-off-by: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com>
Link: https://lore.kernel.org/r/1645174218-32632-1-git-send-email-quic_subashab@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18 20:42:09 -08:00
Jeremy Linton
5a2aba71cd net: mvpp2: always set port pcs ops
Booting a MACCHIATObin with 5.17, the system OOPs with
a null pointer deref when the network is started. This
is caused by the pcs->ops structure being null in
mcpp2_acpi_start() when it tries to call pcs_config().

Hoisting the code which sets pcs_gmac.ops and pcs_xlg.ops,
assuring they are always set, fixes the problem.

The OOPs looks like:
[   18.687760] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000010
[   18.698561] Mem abort info:
[   18.698564]   ESR = 0x96000004
[   18.698567]   EC = 0x25: DABT (current EL), IL = 32 bits
[   18.709821]   SET = 0, FnV = 0
[   18.714292]   EA = 0, S1PTW = 0
[   18.718833]   FSC = 0x04: level 0 translation fault
[   18.725126] Data abort info:
[   18.729408]   ISV = 0, ISS = 0x00000004
[   18.734655]   CM = 0, WnR = 0
[   18.738933] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000111bbf000
[   18.745409] [0000000000000010] pgd=0000000000000000, p4d=0000000000000000
[   18.752235] Internal error: Oops: 96000004 [#1] SMP
[   18.757134] Modules linked in: rfkill ip_set nf_tables nfnetlink qrtr sunrpc vfat fat omap_rng fuse zram xfs crct10dif_ce mvpp2 ghash_ce sbsa_gwdt phylink xhci_plat_hcd ahci_plam
[   18.773481] CPU: 0 PID: 681 Comm: NetworkManager Not tainted 5.17.0-0.rc3.89.fc36.aarch64 #1
[   18.781954] Hardware name: Marvell                         Armada 7k/8k Family Board      /Armada 7k/8k Family Board      , BIOS EDK II Jun  4 2019
[   18.795222] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   18.802213] pc : mvpp2_start_dev+0x2b0/0x300 [mvpp2]
[   18.807208] lr : mvpp2_start_dev+0x298/0x300 [mvpp2]
[   18.812197] sp : ffff80000b4732c0
[   18.815522] x29: ffff80000b4732c0 x28: 0000000000000000 x27: ffffccab38ae57f8
[   18.822689] x26: ffff6eeb03065a10 x25: ffff80000b473a30 x24: ffff80000b4735b8
[   18.829855] x23: 0000000000000000 x22: 00000000000001e0 x21: ffff6eeb07b6ab68
[   18.837021] x20: ffff6eeb07b6ab30 x19: ffff6eeb07b6a9c0 x18: 0000000000000014
[   18.844187] x17: 00000000f6232bfe x16: ffffccab899b1dc0 x15: 000000006a30f9fa
[   18.851353] x14: 000000003b77bd50 x13: 000006dc896f0e8e x12: 001bbbfccfd0d3a2
[   18.858519] x11: 0000000000001528 x10: 0000000000001548 x9 : ffffccab38ad0fb0
[   18.865685] x8 : ffff80000b473330 x7 : 0000000000000000 x6 : 0000000000000000
[   18.872851] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff80000b4732f8
[   18.880017] x2 : 000000000000001a x1 : 0000000000000002 x0 : ffff6eeb07b6ab68
[   18.887183] Call trace:
[   18.889637]  mvpp2_start_dev+0x2b0/0x300 [mvpp2]
[   18.894279]  mvpp2_open+0x134/0x2b4 [mvpp2]
[   18.898483]  __dev_open+0x128/0x1e4
[   18.901988]  __dev_change_flags+0x17c/0x1d0
[   18.906187]  dev_change_flags+0x30/0x70
[   18.910038]  do_setlink+0x278/0xa7c
[   18.913540]  __rtnl_newlink+0x44c/0x7d0
[   18.917391]  rtnl_newlink+0x5c/0x8c
[   18.920892]  rtnetlink_rcv_msg+0x254/0x314
[   18.925006]  netlink_rcv_skb+0x48/0x10c
[   18.928858]  rtnetlink_rcv+0x24/0x30
[   18.932449]  netlink_unicast+0x290/0x2f4
[   18.936386]  netlink_sendmsg+0x1d0/0x41c
[   18.940323]  sock_sendmsg+0x60/0x70
[   18.943825]  ____sys_sendmsg+0x248/0x260
[   18.947762]  ___sys_sendmsg+0x74/0xa0
[   18.951438]  __sys_sendmsg+0x64/0xcc
[   18.955027]  __arm64_sys_sendmsg+0x30/0x40
[   18.959140]  invoke_syscall+0x50/0x120
[   18.962906]  el0_svc_common.constprop.0+0x4c/0xf4
[   18.967629]  do_el0_svc+0x30/0x9c
[   18.970958]  el0_svc+0x28/0xb0
[   18.974025]  el0t_64_sync_handler+0x10c/0x140
[   18.978400]  el0t_64_sync+0x1a4/0x1a8
[   18.982078] Code: 52800004 b9416262 aa1503e0 52800041 (f94008a5)
[   18.988196] ---[ end trace 0000000000000000 ]---

Fixes: cff0563223 ("net: mvpp2: use .mac_select_pcs() interface")
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Marcin Wojtas <mw@semihalf.com>
Link: https://lore.kernel.org/r/20220214231852.3331430-1-jeremy.linton@arm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18 20:14:16 -08:00
Linus Walleij
486c2d15aa Merge tag 'intel-pinctrl-v5.17-5' of gitolite.kernel.org:pub/scm/linux/kernel/git/pinctrl/intel into fixes
intel-pinctrl for v5.17-5

* Revert misplaced ID

The following is an automated git shortlog grouped by driver:

tigerlake:
 -  Revert "Add Alder Lake-M ACPI ID"
2022-02-19 02:03:58 +01:00
Marc Zyngier
d1e972ace4 gpio: tegra186: Fix chip_data type confusion
The tegra186 GPIO driver makes the assumption that the pointer
returned by irq_data_get_irq_chip_data() is a pointer to a
tegra_gpio structure. Unfortunately, it is actually a pointer
to the inner gpio_chip structure, as mandated by the gpiolib
infrastructure. Nice try.

The saving grace is that the gpio_chip is the first member of
tegra_gpio, so the bug has gone undetected since... forever.

Fix it by performing a container_of() on the pointer. This results
in no additional code, and makes it possible to understand how
the whole thing works.

Fixes: 5b2b135a87 ("gpio: Add Tegra186 support")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Link: https://lore.kernel.org/r/20220211093904.1112679-1-maz@kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-02-19 01:51:24 +01:00
Marc Zyngier
64fd52a4d3 pinctrl: starfive: Use a static name for the GPIO irq_chip
Drop the device name used for the GPIO irq_chip and replace it
with something static. The information is still available from
debugfs and carried as part of the irqdomain.

Suggested-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Link: https://lore.kernel.org/r/20220211092345.1093332-1-maz@kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-02-19 01:50:23 +01:00
Jiasheng Jiang
a222fd8541 soc: fsl: qe: Check of ioremap return value
As the possible failure of the ioremap(), the par_io could be NULL.
Therefore it should be better to check it and return error in order to
guarantee the success of the initiation.
But, I also notice that all the caller like mpc85xx_qe_par_io_init() in
`arch/powerpc/platforms/85xx/common.c` don't check the return value of
the par_io_init().
Actually, par_io_init() needs to check to handle the potential error.
I will submit another patch to fix that.
Anyway, par_io_init() itsely should be fixed.

Fixes: 7aa1aa6ece ("QE: Move QE from arch/powerpc to drivers/soc")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
2022-02-18 17:11:23 -06:00
Jason Wang
6385960501 soc: fsl: qe: fix typo in a comment
The double `is' in the comment in line 150 is repeated. Remove one
of them from the comment.  Also removes a redundant tab in a new line.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
2022-02-18 17:11:21 -06:00
Christophe JAILLET
b9abe942cd soc: fsl: guts: Add a missing memory allocation failure check
If 'devm_kstrdup()' fails, we should return -ENOMEM.

While at it, move the 'of_node_put()' call in the error handling path and
after the 'machine' has been copied.
Better safe than sorry.

Fixes: a6fc3b6981 ("soc: fsl: add GUTS driver for QorIQ platforms")
Depends-on: fddacc7ff4dd ("soc: fsl: guts: Revert commit 3c0d64e867ed")
Suggested-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
2022-02-18 17:11:20 -06:00
Christophe JAILLET
b113737cf1 soc: fsl: guts: Revert commit 3c0d64e867
This reverts commit 3c0d64e867
("soc: fsl: guts: reuse machine name from device tree").

A following patch will fix the missing memory allocation failure check
instead.

Suggested-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
2022-02-18 17:11:20 -06:00
Andy Shevchenko
b80af75644 soc: fsl: Correct MAINTAINERS database (SOC)
MAINTAINERS lacks of proper coverage for FSL headers. Fix it accordingly.

Fixes: 1b48706f02 ("MAINTAINERS: add entry for Freescale SoC drivers")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
2022-02-18 17:11:19 -06:00
Andy Shevchenko
f2b70418ec soc: fsl: Correct MAINTAINERS database (QUICC ENGINE LIBRARY)
MAINTAINERS lacks of proper coverage for FSL headers. Fix it accordingly.

Fixes: 7aa1aa6ece ("QE: Move QE from arch/powerpc to drivers/soc")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
2022-02-18 17:11:18 -06:00
Andy Shevchenko
988f0a9045 soc: fsl: Replace kernel.h with the necessary inclusions
When kernel.h is used in the headers it adds a lot into dependency hell,
especially when there are circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
2022-02-18 17:11:17 -06:00
Li Yang
6b4266b8de dt-bindings: fsl,layerscape-dcfg: add missing compatible for lx2160a
The compatible string is already in use, fix the chip list in binding to
include it.

Signed-off-by: Li Yang <leoyang.li@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
2022-02-18 17:11:16 -06:00
Li Yang
efd12405f1 dt-bindings: qoriq-clock: add missing compatible for lx2160a
The compatible string is already in use, fix the binding to include it.

Signed-off-by: Li Yang <leoyang.li@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
2022-02-18 17:11:16 -06:00
Tom Rix
5950bdc88d ice: initialize local variable 'tlv'
Clang static analysis reports this issues
ice_common.c:5008:21: warning: The left expression of the compound
  assignment is an uninitialized value. The computed value will
  also be garbage
  ldo->phy_type_low |= ((u64)buf << (i * 16));
  ~~~~~~~~~~~~~~~~~ ^

When called from ice_cfg_phy_fec() ldo is the uninitialized local
variable tlv.  So initialize.

Fixes: ea78ce4dab ("ice: add link lenient and default override support")
Signed-off-by: Tom Rix <trix@redhat.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-18 13:28:39 -08:00
Tom Rix
ed22d9c8d1 ice: check the return of ice_ptp_gettimex64
Clang static analysis reports this issue
time64.h:69:50: warning: The left operand of '+'
  is a garbage value
  set_normalized_timespec64(&ts_delta, lhs.tv_sec + rhs.tv_sec,
                                       ~~~~~~~~~~ ^
In ice_ptp_adjtime_nonatomic(), the timespec64 variable 'now'
is set by ice_ptp_gettimex64().  This function can fail
with -EBUSY, so 'now' can have a gargbage value.
So check the return.

Fixes: 06c16d89d2 ("ice: register 1588 PTP clock device object for E810 devices")
Signed-off-by: Tom Rix <trix@redhat.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-18 13:28:39 -08:00
Jacob Keller
fadead80fe ice: fix concurrent reset and removal of VFs
Commit c503e63200 ("ice: Stop processing VF messages during teardown")
introduced a driver state flag, ICE_VF_DEINIT_IN_PROGRESS, which is
intended to prevent some issues with concurrently handling messages from
VFs while tearing down the VFs.

This change was motivated by crashes caused while tearing down and
bringing up VFs in rapid succession.

It turns out that the fix actually introduces issues with the VF driver
caused because the PF no longer responds to any messages sent by the VF
during its .remove routine. This results in the VF potentially removing
its DMA memory before the PF has shut down the device queues.

Additionally, the fix doesn't actually resolve concurrency issues within
the ice driver. It is possible for a VF to initiate a reset just prior
to the ice driver removing VFs. This can result in the remove task
concurrently operating while the VF is being reset. This results in
similar memory corruption and panics purportedly fixed by that commit.

Fix this concurrency at its root by protecting both the reset and
removal flows using the existing VF cfg_lock. This ensures that we
cannot remove the VF while any outstanding critical tasks such as a
virtchnl message or a reset are occurring.

This locking change also fixes the root cause originally fixed by commit
c503e63200 ("ice: Stop processing VF messages during teardown"), so we
can simply revert it.

Note that I kept these two changes together because simply reverting the
original commit alone would leave the driver vulnerable to worse race
conditions.

Fixes: c503e63200 ("ice: Stop processing VF messages during teardown")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-18 13:28:38 -08:00
Michal Swiatkowski
932645c298 ice: fix setting l4 port flag when adding filter
Accidentally filter flag for none encapsulated l4 port field is always
set. Even if user wants to add encapsulated l4 port field.

Remove this unnecessary flag setting.

Fixes: 9e300987d4 ("ice: VXLAN and Geneve TC support")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-18 13:28:18 -08:00
Wojciech Drewek
b70bc066d7 ice: Match on all profiles in slow-path
In switchdev mode, slow-path rules need to match all protocols, in order
to correctly redirect unfiltered or missed packets to the uplink. To set
this up for the virtual function to uplink flow, the rule that redirects
packets to the control VSI must have the tunnel type set to
ICE_SW_TUN_AND_NON_TUN. As a result of that new tunnel type being set,
ice_get_compat_fv_bitmap will select ICE_PROF_ALL. At that point all
profiles would be selected for this rule, resulting in the desired
behavior. Without this change slow-path would not work with
tunnel protocols.

Fixes: 8b032a55c1 ("ice: low level support for tunnels")
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-18 13:22:06 -08:00
Arnd Bergmann
98e437f134 Merge tag 'scmi-fix-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm SCMI fix for v5.17

A simple fix to remove space in the MODULE_ALIAS name used in the
SCMI driver as userspace expect no spaces in these names.

* tag 'scmi-fix-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  firmware: arm_scmi: Remove space in MODULE_ALIAS name

Link: https://lore.kernel.org/r/20220214144245.2376150-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-18 17:31:55 +01:00
Arnd Bergmann
f159f2941d Merge tag 'juno-fix-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm Juno fix for v5.17

Just a single fix to address coherency issue reported[1] by removing the
GICv2m address from the DMA ranges as it loose coherency if mapped as
cacheable at the SMMU due to the attribute combining rules. The GICv2m
range is normally programmed for Device memory attributes.

[1] https://lore.kernel.org/stable/0a1d437d-9ea0-de83-3c19-e07f560ad37c@arm.com/

* tag 'juno-fix-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  arm64: dts: juno: Remove GICv2m dma-range

Link: https://lore.kernel.org/r/20220214142615.2375269-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-18 17:30:32 +01:00
Arnd Bergmann
4f6668f052 Merge tag 'optee-fix2-for-v5.17' of git://git.linaro.org/people/jens.wiklander/linux-tee into arm/fixes
OP-TEE fix error return code in probe functions

* tag 'optee-fix2-for-v5.17' of git://git.linaro.org/people/jens.wiklander/linux-tee:
  tee: optee: fix error return code in probe function

Link: https://lore.kernel.org/r/20220214125931.GA1332792@jade
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-18 17:30:01 +01:00
Arnd Bergmann
35f5417911 Merge tag 'socfpga_dts_update_for_v5.18_part2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into arm/fixes
SoCFPGA dts updates for v5.18, part 2
- Add the "intel,socfpga-agilex-hsotg" compatible for Agilex platform

* tag 'socfpga_dts_update_for_v5.18_part2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  arm64: dts: agilex: use the compatible "intel,socfpga-agilex-hsotg"
  dt-bindings: usb: dwc2: add compatible "intel,socfpga-agilex-hsotg"

Link: https://lore.kernel.org/r/20220211112556.98940-2-dinguyen@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-18 17:28:44 +01:00
Md Haris Iqbal
c46fa8911b RDMA/rtrs-clt: Move free_permit from free_clt to rtrs_clt_close
Error path of rtrs_clt_open() calls free_clt(), where free_permit is
called.  This is wrong since error path of rtrs_clt_open() does not need
to call free_permit().

Also, moving free_permits() call to rtrs_clt_close(), makes it more
aligned with the call to alloc_permit() in rtrs_clt_open().

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20220217030929.323849-2-haris.iqbal@ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-18 11:59:33 -04:00
Md Haris Iqbal
8700af2cc1 RDMA/rtrs-clt: Fix possible double free in error case
Callback function rtrs_clt_dev_release() for put_device() calls kfree(clt)
to free memory. We shouldn't call kfree(clt) again, and we can't use the
clt after kfree too.

Replace device_register() with device_initialize() and device_add() so that
dev_set_name can() be used appropriately.

Move mutex_destroy() to the release function so it can be called in
the alloc_clt err path.

Fixes: eab0982466 ("RDMA/rtrs-clt: Refactor the failure cases in alloc_clt")
Link: https://lore.kernel.org/r/20220217030929.323849-1-haris.iqbal@ionos.com
Reported-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-18 11:59:33 -04:00
Zhengjun Xing
8a3d2ee0de perf evlist: Fix failed to use cpu list for uncore events
The 'perf record' and 'perf stat' commands have supported the option
'-C/--cpus' to count or collect only on the list of CPUs provided.

Commit 1d3351e631 ("perf tools: Enable on a list of CPUs for
hybrid") add it to be supported for hybrid. For hybrid support, it
checks the cpu list are available on hybrid PMU. But when we test only
uncore events(or events not in cpu_core and cpu_atom), there is a bug:

Before:

 # perf stat -C0  -e uncore_clock/clockticks/ sleep 1
   failed to use cpu list 0

In this case, for uncore event, its pmu_name is not cpu_core or
cpu_atom, so in evlist__fix_hybrid_cpus, perf_pmu__find_hybrid_pmu
should return NULL,both events_nr and unmatched_count should be 0 ,then
the cpu list check function evlist__fix_hybrid_cpus return -1 and the
error "failed to use cpu list 0" will happen. Bypass "events_nr=0" case
then the issue is fixed.

After:

 # perf stat -C0  -e uncore_clock/clockticks/ sleep 1

 Performance counter stats for 'CPU(s) 0':

       195,476,873      uncore_clock/clockticks/

       1.004518677 seconds time elapsed

When testing with at least one core event and uncore events, it has no
issue.

 # perf stat -C0  -e cpu_core/cpu-cycles/,uncore_clock/clockticks/ sleep 1

 Performance counter stats for 'CPU(s) 0':

         5,993,774      cpu_core/cpu-cycles/
       301,025,912      uncore_clock/clockticks/

       1.003964934 seconds time elapsed

Fixes: 1d3351e631 ("perf tools: Enable on a list of CPUs for hybrid")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: alexander.shishkin@intel.com
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220218093127.1844241-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-18 09:59:26 -03:00
John Garry
f268088f65 perf test: Skip failing sigtrap test for arm+aarch64
Skip the Sigtrap test for arm + arm64, same as was done for s390 in
commit a840974e96 ("perf test: Test 73 Sig_trap fails on s390"). For
this, reuse BP_SIGNAL_IS_SUPPORTED - meaning that the arch can use BP to
generate signals - instead of BP_ACCOUNT_IS_SUPPORTED, which is
appropriate.

As described by Will at [0], in the test we get stuck in a loop of
handling the HW breakpoint exception and never making progress. GDB
handles this by stepping over the faulting instruction, but with perf
the kernel is expected to handle the step (which it doesn't for arm).

Dmitry made an attempt to get this work, also mentioned in the same
thread as [0], which was appreciated. But the best thing to do is skip
the test for now.

[0] https://lore.kernel.org/linux-perf-users/20220118124343.GC98966@leoy-ThinkPad-X240s/T/#m13b06c39d2a5100d340f009435df6f4d8ee57b5a

Fixes: 5504f67944 ("perf test sigtrap: Add basic stress test for sigtrap handling")
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Marco Elver <elver@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux@armlinux.org.uk
Link: https://lore.kernel.org/r/1645176813-202756-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-18 09:54:50 -03:00
Xiaoke Wang
b352c3465b net: ll_temac: check the return value of devm_kmalloc()
devm_kmalloc() returns a pointer to allocated memory on success, NULL
on failure. While lp->indirect_lock is allocated by devm_kmalloc()
without proper check. It is better to check the value of it to
prevent potential wrong memory access.

Fixes: f14f5c11f0 ("net: ll_temac: Support indirect_mutex share within TEMAC IP")
Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18 12:00:44 +00:00
Eric Dumazet
a1cdec57e0 net-timestamp: convert sk->sk_tskey to atomic_t
UDP sendmsg() can be lockless, this is causing all kinds
of data races.

This patch converts sk->sk_tskey to remove one of these races.

BUG: KCSAN: data-race in __ip_append_data / __ip_append_data

read to 0xffff8881035d4b6c of 4 bytes by task 8877 on cpu 1:
 __ip_append_data+0x1c1/0x1de0 net/ipv4/ip_output.c:994
 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636
 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249
 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
 sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg net/socket.c:725 [inline]
 ____sys_sendmsg+0x39a/0x510 net/socket.c:2413
 ___sys_sendmsg net/socket.c:2467 [inline]
 __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
 __do_sys_sendmmsg net/socket.c:2582 [inline]
 __se_sys_sendmmsg net/socket.c:2579 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

write to 0xffff8881035d4b6c of 4 bytes by task 8880 on cpu 0:
 __ip_append_data+0x1d8/0x1de0 net/ipv4/ip_output.c:994
 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636
 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249
 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
 sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg net/socket.c:725 [inline]
 ____sys_sendmsg+0x39a/0x510 net/socket.c:2413
 ___sys_sendmsg net/socket.c:2467 [inline]
 __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
 __do_sys_sendmmsg net/socket.c:2582 [inline]
 __se_sys_sendmmsg net/socket.c:2579 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000054d -> 0x0000054e

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 8880 Comm: syz-executor.5 Not tainted 5.17.0-rc2-syzkaller-00167-gdcb85f85fa6f-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 09c2d251b7 ("net-timestamp: add key to disambiguate concurrent datagrams")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18 11:14:52 +00:00
Oliver Neukum
e9da0b56fe sr9700: sanity check for packet length
A malicious device can leak heap data to user space
providing bogus frame lengths. Introduce a sanity check.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18 11:05:08 +00:00
Paul Blakey
2f131de361 net/sched: act_ct: Fix flow table lookup after ct clear or switching zones
Flow table lookup is skipped if packet either went through ct clear
action (which set the IP_CT_UNTRACKED flag on the packet), or while
switching zones and there is already a connection associated with
the packet. This will result in no SW offload of the connection,
and the and connection not being removed from flow table with
TCP teardown (fin/rst packet).

To fix the above, remove these unneccary checks in flow
table lookup.

Fixes: 46475bb20f ("net/sched: act_ct: Software offload of established flows")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18 11:02:48 +00:00
suresh kumar
4224cfd7fb net-sysfs: add check for netdevice being present to speed_show
When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18 10:59:11 +00:00
Duoming Zhou
efe4186e6a drivers: hamradio: 6pack: fix UAF bug caused by mod_timer()
When a 6pack device is detaching, the sixpack_close() will act to cleanup
necessary resources. Although del_timer_sync() in sixpack_close()
won't return if there is an active timer, one could use mod_timer() in
sp_xmit_on_air() to wake up timer again by calling userspace syscall such
as ax25_sendmsg(), ax25_connect() and ax25_ioctl().

This unexpected waked handler, sp_xmit_on_air(), realizes nothing about
the undergoing cleanup and may still call pty_write() to use driver layer
resources that have already been released.

One of the possible race conditions is shown below:

      (USE)                      |      (FREE)
ax25_sendmsg()                   |
 ax25_queue_xmit()               |
  ...                            |
  sp_xmit()                      |
   sp_encaps()                   | sixpack_close()
    sp_xmit_on_air()             |  del_timer_sync(&sp->tx_t)
     mod_timer(&sp->tx_t,...)    |  ...
                                 |  unregister_netdev()
                                 |  ...
     (wait a while)              | tty_release()
                                 |  tty_release_struct()
                                 |   release_tty()
    sp_xmit_on_air()             |    tty_kref_put(tty_struct) //FREE
     pty_write(tty_struct) //USE |    ...

The corresponding fail log is shown below:
===============================================================
BUG: KASAN: use-after-free in __run_timers.part.0+0x170/0x470
Write of size 8 at addr ffff88800a652ab8 by task swapper/2/0
...
Call Trace:
  ...
  queue_work_on+0x3f/0x50
  pty_write+0xcd/0xe0pty_write+0xcd/0xe0
  sp_xmit_on_air+0xb2/0x1f0
  call_timer_fn+0x28/0x150
  __run_timers.part.0+0x3c2/0x470
  run_timer_softirq+0x3b/0x80
  __do_softirq+0xf1/0x380
  ...

This patch reorders the del_timer_sync() after the unregister_netdev()
to avoid UAF bugs. Because the unregister_netdev() is well synchronized,
it flushs out any pending queues, waits the refcount of net_device
decreases to zero and removes net_device from kernel. There is not any
running routines after executing unregister_netdev(). Therefore, we could
not arouse timer from userspace again.

Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18 10:58:17 +00:00
Miklos Szeredi
a679a61520 fuse: fix fileattr op failure
The fileattr API conversion broke lsattr on ntfs3g.

Previously the ioctl(... FS_IOC_GETFLAGS) returned an EINVAL error, but
after the conversion the error returned by the fuse filesystem was not
propagated back to the ioctl() system call, resulting in success being
returned with bogus values.

Fix by checking for outarg.result in fuse_priv_ioctl(), just as generic
ioctl code does.

Reported-by: Jean-Pierre André <jean-pierre.andre@wanadoo.fr>
Fixes: 72227eac17 ("fuse: convert to fileattr")
Cc: <stable@vger.kernel.org> # v5.13
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-02-18 11:47:51 +01:00
Rudi Heitbaum
1aae05754f drm/imx/dcss: i.MX8MQ DCSS select DRM_GEM_CMA_HELPER
Without DRM_GEM_CMA_HELPER i.MX8MQ DCSS won't build. This needs to be
there.

Signed-off-by: Rudi Heitbaum <rudi@heitbaum.com>
Reviewed-by: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Signed-off-by: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220216212228.1217831-1-rudi@heitbaum.com
2022-02-18 09:25:16 +00:00
Wanpeng Li
ec756e40e2 x86/kvm: Don't use pv tlb/ipi/sched_yield if on 1 vCPU
Inspired by commit 3553ae5690 (x86/kvm: Don't use pvqspinlock code if
only 1 vCPU), on a VM with only 1 vCPU, there is no need to enable
pv tlb/ipi/sched_yield and we can save the memory for __pv_cpu_mask.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1645171838-2855-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-18 03:36:24 -05:00
Leonardo Bras
ba1f77c546 x86/kvm: Fix compilation warning in non-x86_64 builds
On non-x86_64 builds, helpers gtod_is_based_on_tsc() and
kvm_guest_supported_xfd() are defined but never used.  Because these are
static inline but are in a .c file, some compilers do warn for them with
-Wunused-function, which becomes an error if -Werror is present.

Add #ifdef so they are only defined in x86_64 builds.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20220218034100.115702-1-leobras@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-18 03:33:45 -05:00
Anthoine Bourgeois
8840f5460a ARM: dts: Use 32KiHz oscillator on devkit8000
Devkit8000 board seems to always used 32k_counter as clocksource.
Restore this behavior.

If clocksource is back to 32k_counter, timer12 is now the clockevent
source (as before) and timer2 is not longer needed here.

This commit fixes the same issue observed with commit 23885389db
("ARM: dts: Fix timer regression for beagleboard revision c") when sleep
is blocked until hitting keys over serial console.

Fixes: aba1ad05da ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support")
Fixes: e428e250fd ("ARM: dts: Configure system timers for omap3")
Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2022-02-18 10:08:45 +02:00
Anthoine Bourgeois
64324ef337 ARM: dts: switch timer config to common devkit8000 devicetree
This patch allow lcd43 and lcd70 flavors to benefit from timer
evolution.

Fixes: e428e250fd ("ARM: dts: Configure system timers for omap3")
Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2022-02-18 10:07:14 +02:00
Siarhei Volkau
2f0754f27a clk: jz4725b: fix mmc0 clock gating
The mmc0 clock gate bit was mistakenly assigned to "i2s" clock.
You can find that the same bit is assigned to "mmc0" too.
It leads to mmc0 hang for a long time after any sound activity
also it  prevented PM_SLEEP to work properly.
I guess it was introduced by copy-paste from jz4740 driver
where it is really controls I2S clock gate.

Fixes: 226dfa4726 ("clk: Add Ingenic jz4725b CGU driver")
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Tested-by: Siarhei Volkau <lis8215@gmail.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220205171849.687805-2-lis8215@gmail.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-02-17 17:05:07 -08:00
Konrad Dybcio
3494894aff clk: qcom: gcc-msm8994: Remove NoC clocks
Just like in commit 05cf3ec00d ("clk: qcom: gcc-msm8996: Drop (again)
gcc_aggre1_pnoc_ahb_clk") adding NoC clocks turned out to be a huge
mistake, as they cause a lot of issues at little benefit (basically
letting Linux know about their children's frequencies), especially when
mishandled or misconfigured.

Adding these ones broke SDCC approx 99 out of 100 times, but that somehow
went unnoticed. To prevent further issues like this one, remove them.

This commit is effectively a revert of 74a33fac3a ("clk: qcom:
gcc-msm8994: Add missing NoC clocks") with ABI preservation.

Fixes: 74a33fac3a ("clk: qcom: gcc-msm8994: Add missing NoC clocks")
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Link: https://lore.kernel.org/r/20220217232408.78932-1-konrad.dybcio@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-02-17 16:56:43 -08:00
Nikhil Gupta
132507ed04 of/fdt: move elfcorehdr reservation early for crash dump kernel
elfcorehdr_addr is fixed address passed to Second kernel which may be conflicted
with potential reserved memory in Second kernel,so fdt_reserve_elfcorehdr() ahead
of fdt_init_reserved_mem() can relieve this situation.

Signed-off-by: Nikhil Gupta <nikhil.gupta@nxp.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220128042321.15228-1-nikhil.gupta@nxp.com
2022-02-17 17:13:52 -06:00
Jakub Kicinski
7a2fb91285 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2022-02-17

We've added 8 non-merge commits during the last 7 day(s) which contain
a total of 8 files changed, 119 insertions(+), 15 deletions(-).

The main changes are:

1) Add schedule points in map batch ops, from Eric.

2) Fix bpf_msg_push_data with len 0, from Felix.

3) Fix crash due to incorrect copy_map_value, from Kumar.

4) Fix crash due to out of bounds access into reg2btf_ids, from Kumar.

5) Fix a bpf_timer initialization issue with clang, from Yonghong.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add schedule points in batch ops
  bpf: Fix crash due to out of bounds access into reg2btf_ids.
  selftests: bpf: Check bpf_msg_push_data return value
  bpf: Fix a bpf_timer initialization issue
  bpf: Emit bpf_timer in vmlinux BTF
  selftests/bpf: Add test for bpf_timer overwriting crash
  bpf: Fix crash due to incorrect copy_map_value
  bpf: Do not try bpf_msg_push_data with len 0
====================

Link: https://lore.kernel.org/r/20220217190000.37925-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 12:01:55 -08:00
Eric Dumazet
75134f16e7 bpf: Add schedule points in batch ops
syzbot reported various soft lockups caused by bpf batch operations.

 INFO: task kworker/1:1:27 blocked for more than 140 seconds.
 INFO: task hung in rcu_barrier

Nothing prevents batch ops to process huge amount of data,
we need to add schedule points in them.

Note that maybe_wait_bpf_programs(map) calls from
generic_map_delete_batch() can be factorized by moving
the call after the loop.

This will be done later in -next tree once we get this fix merged,
unless there is strong opinion doing this optimization sooner.

Fixes: aa2e93b8e5 ("bpf: Add generic support for update and delete batch ops")
Fixes: cb4d03ab49 ("bpf: Add generic support for lookup batch op")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Brian Vazquez <brianvv@google.com>
Link: https://lore.kernel.org/bpf/20220217181902.808742-1-eric.dumazet@gmail.com
2022-02-17 10:48:26 -08:00
Maxime Ripard
6764eb690e drm/vc4: crtc: Fix runtime_pm reference counting
At boot on the BCM2711, if the HDMI controllers are running, the CRTC
driver will disable itself and its associated HDMI controller to work
around a hardware bug that would leave some pixels stuck in a FIFO.

In order to avoid that issue, we need to run some operations in lockstep
between the CRTC and HDMI controller, and we need to make sure the HDMI
controller will be powered properly.

However, since we haven't enabled it through KMS, the runtime_pm state
is off at this point so we need to make sure the device is powered
through pm_runtime_resume_and_get, and once the operations are complete,
we call pm_runtime_put.

However, the HDMI controller will do that itself in its
post_crtc_powerdown, which means we'll end up calling pm_runtime_put for
a single pm_runtime_get, throwing the reference counting off. Let's
remove the pm_runtime_put call in the CRTC code in order to have the
proper counting.

Fixes: bca10db67b ("drm/vc4: crtc: Make sure the HDMI controller is powered when disabling")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220203102003.1114673-1-maxime@cerno.tech
2022-02-17 17:27:59 +01:00
Maxime Ripard
e40945ab7c drm/vc4: hdmi: Unregister codec device on unbind
On bind we will register the HDMI codec device but we don't unregister
it on unbind, leading to a device leakage. Unregister our device at
unbind.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127111452.222002-1-maxime@cerno.tech
2022-02-17 17:27:54 +01:00
Mike Marciniszyn
32f57cb1b2 IB/qib: Fix duplicate sysfs directory name
The qib driver load has been failing with the following message:

  sysfs: cannot create duplicate filename '/devices/pci0000:80/0000:80:02.0/0000:81:00.0/infiniband/qib0/ports/1/linkcontrol'

The patch below has two "linkcontrol" names causing the duplication.

Fix by using the correct "diag_counters" name on the second instance.

Fixes: 4a7aaf88c8 ("RDMA/qib: Use attributes for the port sysfs")
Link: https://lore.kernel.org/r/1645106372-23004-1-git-send-email-mike.marciniszyn@cornelisnetworks.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-17 11:54:08 -04:00
Jon Lin
80808768e4 spi: rockchip: terminate dma transmission when slave abort
After slave abort, all DMA should be stopped, or it will affect the
next transmission and maybe abort again.

Signed-off-by: Jon Lin <jon.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20220216014028.8123-3-jon.lin@rock-chips.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-17 15:33:18 +00:00
Jon Lin
9382df0a98 spi: rockchip: Fix error in getting num-cs property
Get num-cs u32 from dts of_node property rather than u16.

Signed-off-by: Jon Lin <jon.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20220216014028.8123-2-jon.lin@rock-chips.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-17 15:33:17 +00:00
Prasad Kumpatla
d04ad245d6 regmap-irq: Update interrupt clear register for proper reset
With the existing logic where clear_ack is true (HW doesn’t support
auto clear for ICR), interrupt clear register reset is not handled
properly. Due to this only the first interrupts get processed properly
and further interrupts are blocked due to not resetting interrupt
clear register.

Example for issue case where Invert_ack is false and clear_ack is true:

    Say Default ISR=0x00 & ICR=0x00 and ISR is triggered with 2
    interrupts making ISR = 0x11.

    Step 1: Say ISR is set 0x11 (store status_buff = ISR). ISR needs to
            be cleared with the help of ICR once the Interrupt is processed.

    Step 2: Write ICR = 0x11 (status_buff), this will clear the ISR to 0x00.

    Step 3: Issue - In the existing code, ICR is written with ICR =
            ~(status_buff) i.e ICR = 0xEE -> This will block all the interrupts
            from raising except for interrupts 0 and 4. So expectation here is to
            reset ICR, which will unblock all the interrupts.

            if (chip->clear_ack) {
                 if (chip->ack_invert && !ret)
                  ........
                 else if (!ret)
                     ret = regmap_write(map, reg,
                            ~data->status_buf[i]);

So writing 0 and 0xff (when ack_invert is true) should have no effect, other
than clearing the ACKs just set.

Fixes: 3a6f0fb7b8 ("regmap: irq: Add support to clear ack registers")
Signed-off-by: Prasad Kumpatla <quic_pkumpatl@quicinc.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20220217085007.30218-1-quic_pkumpatl@quicinc.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-17 15:33:15 +00:00
Fabrice Gasnier
32fde84362 usb: dwc2: drd: fix soft connect when gadget is unconfigured
When the gadget driver hasn't been (yet) configured, and the cable is
connected to a HOST, the SFTDISCON gets cleared unconditionally, so the
HOST tries to enumerate it.
At the host side, this can result in a stuck USB port or worse. When
getting lucky, some dmesg can be observed at the host side:
 new high-speed USB device number ...
 device descriptor read/64, error -110

Fix it in drd, by checking the enabled flag before calling
dwc2_hsotg_core_connect(). It will be called later, once configured,
by the normal flow:
- udc_bind_to_driver
 - usb_gadget_connect
   - dwc2_hsotg_pullup
     - dwc2_hsotg_core_connect

Fixes: 17f934024e ("usb: dwc2: override PHY input signals with usb role switch support")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/1644999135-13478-1-git-send-email-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-17 16:10:21 +01:00
Hans de Goede
62e3f0afe2 usb: dwc3: pci: Fix Bay Trail phy GPIO mappings
When the Bay Trail phy GPIO mappings where added cs and reset were swapped,
this did not cause any issues sofar, because sofar they were always driven
high/low at the same time.

Note the new mapping has been verified both in /sys/kernel/debug/gpio
output on Android factory images on multiple devices, as well as in
the schematics for some devices.

Fixes: 5741022cbd ("usb: dwc3: pci: Add GPIO lookup table on platforms without ACPI GPIO resources")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220213130524.18748-3-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-17 16:10:03 +01:00
Leonardo Bras
988896bb61 x86/kvm/fpu: Remove kvm_vcpu_arch.guest_supported_xcr0
kvm_vcpu_arch currently contains the guest supported features in both
guest_supported_xcr0 and guest_fpu.fpstate->user_xfeatures field.

Currently both fields are set to the same value in
kvm_vcpu_after_set_cpuid() and are not changed anywhere else after that.

Since it's not good to keep duplicated data, remove guest_supported_xcr0.

To keep the code more readable, introduce kvm_guest_supported_xcr()
and kvm_guest_supported_xfd() to replace the previous usages of
guest_supported_xcr0.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20220217053028.96432-3-leobras@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-17 10:06:49 -05:00
Leonardo Bras
ad856280dd x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0
During host/guest switch (like in kvm_arch_vcpu_ioctl_run()), the kernel
swaps the fpu between host/guest contexts, by using fpu_swap_kvm_fpstate().

When xsave feature is available, the fpu swap is done by:
- xsave(s) instruction, with guest's fpstate->xfeatures as mask, is used
  to store the current state of the fpu registers to a buffer.
- xrstor(s) instruction, with (fpu_kernel_cfg.max_features &
  XFEATURE_MASK_FPSTATE) as mask, is used to put the buffer into fpu regs.

For xsave(s) the mask is used to limit what parts of the fpu regs will
be copied to the buffer. Likewise on xrstor(s), the mask is used to
limit what parts of the fpu regs will be changed.

The mask for xsave(s), the guest's fpstate->xfeatures, is defined on
kvm_arch_vcpu_create(), which (in summary) sets it to all features
supported by the cpu which are enabled on kernel config.

This means that xsave(s) will save to guest buffer all the fpu regs
contents the cpu has enabled when the guest is paused, even if they
are not used.

This would not be an issue, if xrstor(s) would also do that.

xrstor(s)'s mask for host/guest swap is basically every valid feature
contained in kernel config, except XFEATURE_MASK_PKRU.
Accordingto kernel src, it is instead switched in switch_to() and
flush_thread().

Then, the following happens with a host supporting PKRU starts a
guest that does not support it:
1 - Host has XFEATURE_MASK_PKRU set. 1st switch to guest,
2 - xsave(s) fpu regs to host fpustate (buffer has XFEATURE_MASK_PKRU)
3 - xrstor(s) guest fpustate to fpu regs (fpu regs have XFEATURE_MASK_PKRU)
4 - guest runs, then switch back to host,
5 - xsave(s) fpu regs to guest fpstate (buffer now have XFEATURE_MASK_PKRU)
6 - xrstor(s) host fpstate to fpu regs.
7 - kvm_vcpu_ioctl_x86_get_xsave() copy guest fpstate to userspace (with
    XFEATURE_MASK_PKRU, which should not be supported by guest vcpu)

On 5, even though the guest does not support PKRU, it does have the flag
set on guest fpstate, which is transferred to userspace via vcpu ioctl
KVM_GET_XSAVE.

This becomes a problem when the user decides on migrating the above guest
to another machine that does not support PKRU: the new host restores
guest's fpu regs to as they were before (xrstor(s)), but since the new
host don't support PKRU, a general-protection exception ocurs in xrstor(s)
and that crashes the guest.

This can be solved by making the guest's fpstate->user_xfeatures hold
a copy of guest_supported_xcr0. This way, on 7 the only flags copied to
userspace will be the ones compatible to guest requirements, and thus
there will be no issue during migration.

As a bonus, it will also fail if userspace tries to set fpu features
(with the KVM_SET_XSAVE ioctl) that are not compatible to the guest
configuration.  Such features will never be returned by KVM_GET_XSAVE
or KVM_GET_XSAVE2.

Also, since kvm_vcpu_after_set_cpuid() now sets fpstate->user_xfeatures,
there is not need to set it in kvm_check_cpuid(). So, change
fpstate_realloc() so it does not touch fpstate->user_xfeatures if a
non-NULL guest_fpu is passed, which is the case when kvm_check_cpuid()
calls it.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20220217053028.96432-2-leobras@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-17 10:05:57 -05:00
Jens Axboe
aba2081e0a tps6598x: clear int mask on probe failure
The interrupt mask is enabled before any potential failure points in
the driver, which can leave a failure path where we exit with
interrupts enabled but the device not live. This causes an infinite
stream of interrupts on an Apple M1 Pro laptop on USB-C.

Add a failure label that's used post enabling interrupts, where we
mask them again before returning an error.

Suggested-by: Sven Peter <sven@svenpeter.dev>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/e6b80669-20f3-06e7-9ed5-8951a9c6db6f@kernel.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-17 16:02:27 +01:00
Anton Romanov
3a55f72924 kvm: x86: Disable KVM_HC_CLOCK_PAIRING if tsc is in always catchup mode
If vcpu has tsc_always_catchup set each request updates pvclock data.
KVM_HC_CLOCK_PAIRING consumers such as ptp_kvm_x86 rely on tsc read on
host's side and do hypercall inside pvclock_read_retry loop leading to
infinite loop in such situation.

v3:
    Removed warn
    Changed return code to KVM_EFAULT
v2:
    Added warn

Signed-off-by: Anton Romanov <romanton@google.com>
Message-Id: <20220216182653.506850-1-romanton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-17 09:52:50 -05:00
Wanpeng Li
4cb9a998b1 KVM: Fix lockdep false negative during host resume
I saw the below splatting after the host suspended and resumed.

   WARNING: CPU: 0 PID: 2943 at kvm/arch/x86/kvm/../../../virt/kvm/kvm_main.c:5531 kvm_resume+0x2c/0x30 [kvm]
   CPU: 0 PID: 2943 Comm: step_after_susp Tainted: G        W IOE     5.17.0-rc3+ #4
   RIP: 0010:kvm_resume+0x2c/0x30 [kvm]
   Call Trace:
    <TASK>
    syscore_resume+0x90/0x340
    suspend_devices_and_enter+0xaee/0xe90
    pm_suspend.cold+0x36b/0x3c2
    state_store+0x82/0xf0
    kernfs_fop_write_iter+0x1b6/0x260
    new_sync_write+0x258/0x370
    vfs_write+0x33f/0x510
    ksys_write+0xc9/0x160
    do_syscall_64+0x3b/0xc0
    entry_SYSCALL_64_after_hwframe+0x44/0xae

lockdep_is_held() can return -1 when lockdep is disabled which triggers
this warning. Let's use lockdep_assert_not_held() which can detect
incorrect calls while holding a lock and it also avoids false negatives
when lockdep is disabled.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1644920142-81249-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-17 09:52:50 -05:00
Aaron Lewis
127770ac0d KVM: x86: Add KVM_CAP_ENABLE_CAP to x86
Follow the precedent set by other architectures that support the VCPU
ioctl, KVM_ENABLE_CAP, and advertise the VM extension, KVM_CAP_ENABLE_CAP.
This way, userspace can ensure that KVM_ENABLE_CAP is available on a
vcpu before using it.

Fixes: 5c919412fe ("kvm/x86: Hyper-V synthetic interrupt controller")
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Message-Id: <20220214212950.1776943-1-aaronlewis@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-17 09:52:50 -05:00
Oliver Upton
a867e9d0cc KVM: arm64: Don't miss pending interrupts for suspended vCPU
In order to properly emulate the WFI instruction, KVM reads back
ICH_VMCR_EL2 and enables doorbells for GICv4. These preparations are
necessary in order to recognize pending interrupts in
kvm_arch_vcpu_runnable() and return to the guest. Until recently, this
work was done by kvm_arch_vcpu_{blocking,unblocking}(). Since commit
6109c5a6ab ("KVM: arm64: Move vGIC v4 handling for WFI out arch
callback hook"), these callbacks were gutted and superseded by
kvm_vcpu_wfi().

It is important to note that KVM implements PSCI CPU_SUSPEND calls as
a WFI within the guest. However, the implementation calls directly into
kvm_vcpu_halt(), which skips the needed work done in kvm_vcpu_wfi()
to detect pending interrupts. Fix the issue by calling the WFI helper.

Fixes: 6109c5a6ab ("KVM: arm64: Move vGIC v4 handling for WFI out arch callback hook")
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220217101242.3013716-1-oupton@google.com
2022-02-17 14:36:50 +00:00
Jiri Kosina
ac89895213 HID: elo: Revert USB reference counting
Commit 817b8b9c53 ("HID: elo: fix memory leak in elo_probe") introduced
memory leak on error path, but more importantly the whole USB reference
counting is not needed at all in the first place, as the driver itself
doesn't change the reference counting in any way, and the associated
usb_device is guaranteed to be kept around by USB core as long as the
driver binding exists.

Reported-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: fbf42729d0 ("HID: elo: update the reference count of the usb device structure")
Fixes: 817b8b9c53 ("HID: elo: fix memory leak in elo_probe")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-02-17 14:14:41 +01:00
Jon Hunter
16693c1b2d drm/tegra: Fix cast to restricted __le32
Sparse warns about the following cast in the function
falcon_copy_firmware_image() ...

 drivers/gpu/drm/tegra/falcon.c:66:27: warning: cast to restricted __le32

Fix this by casting the firmware data array to __le32 instead of u32.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-02-17 07:32:57 +01:00
Kumar Kartikeya Dwivedi
45ce4b4f90 bpf: Fix crash due to out of bounds access into reg2btf_ids.
When commit e6ac2450d6 ("bpf: Support bpf program calling kernel function") added
kfunc support, it defined reg2btf_ids as a cheap way to translate the verifier
reg type to the appropriate btf_vmlinux BTF ID, however
commit c25b2ae136 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL")
moved the __BPF_REG_TYPE_MAX from the last member of bpf_reg_type enum to after
the base register types, and defined other variants using type flag
composition. However, now, the direct usage of reg->type to index into
reg2btf_ids may no longer fall into __BPF_REG_TYPE_MAX range, and hence lead to
out of bounds access and kernel crash on dereference of bad pointer.

Fixes: c25b2ae136 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220216201943.624869-1-memxor@gmail.com
2022-02-16 12:46:51 -08:00
Alexander Lobakin
f2703def33 MIPS: smp: fill in sibling and core maps earlier
After enabling CONFIG_SCHED_CORE (landed during 5.14 cycle),
2-core 2-thread-per-core interAptiv (CPS-driven) started emitting
the following:

[    0.025698] CPU1 revision is: 0001a120 (MIPS interAptiv (multi))
[    0.048183] ------------[ cut here ]------------
[    0.048187] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:6025 sched_core_cpu_starting+0x198/0x240
[    0.048220] Modules linked in:
[    0.048233] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc3+ #35 b7b319f24073fd9a3c2aa7ad15fb7993eec0b26f
[    0.048247] Stack : 817f0000 00000004 327804c8 810eb050 00000000 00000004 00000000 c314fdd1
[    0.048278]         830cbd64 819c0000 81800000 817f0000 83070bf4 00000001 830cbd08 00000000
[    0.048307]         00000000 00000000 815fcbc4 00000000 00000000 00000000 00000000 00000000
[    0.048334]         00000000 00000000 00000000 00000000 817f0000 00000000 00000000 817f6f34
[    0.048361]         817f0000 818a3c00 817f0000 00000004 00000000 00000000 4dc33260 0018c933
[    0.048389]         ...
[    0.048396] Call Trace:
[    0.048399] [<8105a7bc>] show_stack+0x3c/0x140
[    0.048424] [<8131c2a0>] dump_stack_lvl+0x60/0x80
[    0.048440] [<8108b5c0>] __warn+0xc0/0xf4
[    0.048454] [<8108b658>] warn_slowpath_fmt+0x64/0x10c
[    0.048467] [<810bd418>] sched_core_cpu_starting+0x198/0x240
[    0.048483] [<810c6514>] sched_cpu_starting+0x14/0x80
[    0.048497] [<8108c0f8>] cpuhp_invoke_callback_range+0x78/0x140
[    0.048510] [<8108d914>] notify_cpu_starting+0x94/0x140
[    0.048523] [<8106593c>] start_secondary+0xbc/0x280
[    0.048539]
[    0.048543] ---[ end trace 0000000000000000 ]---
[    0.048636] Synchronize counters for CPU 1: done.

...for each but CPU 0/boot.
Basic debug printks right before the mentioned line say:

[    0.048170] CPU: 1, smt_mask:

So smt_mask, which is sibling mask obviously, is empty when entering
the function.
This is critical, as sched_core_cpu_starting() calculates
core-scheduling parameters only once per CPU start, and it's crucial
to have all the parameters filled in at that moment (at least it
uses cpu_smt_mask() which in fact is `&cpu_sibling_map[cpu]` on
MIPS).

A bit of debugging led me to that set_cpu_sibling_map() performing
the actual map calculation, was being invocated after
notify_cpu_start(), and exactly the latter function starts CPU HP
callback round (sched_core_cpu_starting() is basically a CPU HP
callback).
While the flow is same on ARM64 (maps after the notifier, although
before calling set_cpu_online()), x86 started calculating sibling
maps earlier than starting the CPU HP callbacks in Linux 4.14 (see
[0] for the reference). Neither me nor my brief tests couldn't find
any potential caveats in calculating the maps right after performing
delay calibration, but the WARN splat is now gone.
The very same debug prints now yield exactly what I expected from
them:

[    0.048433] CPU: 1, smt_mask: 0-1

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git/commit/?id=76ce7cfe35ef

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-02-16 20:48:10 +01:00
Chuanhong Guo
cc19db8b31 MIPS: ralink: mt7621: do memory detection on KSEG1
It's reported that current memory detection code occasionally detects
larger memory under some bootloaders.
Current memory detection code tests whether address space wraps around
on KSEG0, which is unreliable because it's cached.

Rewrite memory size detection to perform the same test on KSEG1 instead.
While at it, this patch also does the following two things:
1. use a fixed pattern instead of a random function pointer as the magic
   value.
2. add an additional memory write and a second comparison as part of the
   test to prevent possible smaller memory detection result due to
   leftover values in memory.

Fixes: 139c949f7f MIPS: ("ralink: mt7621: add memory detection support")
Reported-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Chuanhong Guo <gch981213@gmail.com>
Tested-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-02-16 20:46:50 +01:00
Fabio Estevam
c5487b9cde ASoC: cs4265: Fix the duplicated control name
Currently, the following error messages are seen during boot:

asoc-simple-card sound: control 2:0:0:SPDIF Switch:0 is already present
cs4265 1-004f: ASoC: failed to add widget SPDIF dapm kcontrol SPDIF Switch: -16

Quoting Mark Brown:

"The driver is just plain buggy, it defines both a regular SPIDF Switch
control and a SND_SOC_DAPM_SWITCH() called SPDIF both of which will
create an identically named control, it can never have loaded without
error.  One or both of those has to be renamed or they need to be
merged into one thing."

Fix the duplicated control name by combining the two SPDIF controls here
and move the register bits onto the DAPM widget and have DAPM control them.

Fixes: f853d6b3ba ("ASoC: cs4265: Add a S/PDIF enable switch")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220215120514.1760628-1-festevam@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-16 16:34:16 +00:00
Marek Vasut
9bdd10d57a ASoC: ops: Shift tested values in snd_soc_put_volsw() by +min
While the $val/$val2 values passed in from userspace are always >= 0
integers, the limits of the control can be signed integers and the $min
can be non-zero and less than zero. To correctly validate $val/$val2
against platform_max, add the $min offset to val first.

Fixes: 817f7c9335 ("ASoC: ops: Reject out of bounds values in snd_soc_put_volsw()")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220215130645.164025-1-marex@denx.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-16 16:33:48 +00:00
Mikko Perttunen
184b58fa81 gpu: host1x: Always return syncpoint value when waiting
The new TegraDRM UAPI uses syncpoint waiting with timeout set to
zero to indicate reading the syncpoint value. To support that we
need to return the syncpoint value always when waiting.

Fixes: 44e9613813 ("drm/tegra: Implement syncpoint wait UAPI")
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-02-16 17:20:53 +01:00
Michael Hübner
0a5a587501 HID: Add support for open wheel and no attachment to T300
Different add ons to the wheel base report different models. Having
no wheel mounted to the base and using the open wheel attachment is
added here.

Signed-off-by: Michael Hübner <michaelh.95@t-online.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-02-16 17:09:33 +01:00
Lucas Zampieri
25666e8ccd HID: logitech-dj: add new lightspeed receiver id
As of logitech lightspeed receiver fw version 04.02.B0009,
HIDPP_PARAM_DEVICE_INFO is being reported as 0x11.

With patch "HID: logitech-dj: add support for the new lightspeed receiver
iteration", the mouse starts to error out with:
  logitech-djreceiver: unusable device of type UNKNOWN (0x011) connected on
  slot 1
and becomes unusable.

This has been noticed on a Logitech G Pro X Superlight fw MPM 25.01.B0018.

Signed-off-by: Lucas Zampieri <lzampier@redhat.com>
Acked-by: Nestor Lopez Casado <nlopezcasad@logitech.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-02-16 16:26:21 +01:00
Samuel Holland
7920af5c82 gpio: rockchip: Reset int_bothedge when changing trigger
With v2 hardware, an IRQ can be configured to trigger on both edges via
a bit in the int_bothedge register. Currently, the driver sets this bit
when changing the trigger type to IRQ_TYPE_EDGE_BOTH, but fails to reset
this bit if the trigger type is later changed to something else. This
causes spurious IRQs, and when using gpio-keys with wakeup-event-action
set to EV_ACT_(DE)ASSERTED, those IRQs translate into spurious wakeups.

Fixes: 3bcbd1a85b ("gpio/rockchip: support next version gpio controller")
Reported-by: Guillaume Savaton <guillaume@baierouge.fr>
Tested-by: Guillaume Savaton <guillaume@baierouge.fr>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-02-16 15:52:22 +01:00
Nicolas Escande
859ae70183 mac80211: fix forwarded mesh frames AC & queue selection
There are two problems with the current code that have been highlighted
with the AQL feature that is now enbaled by default.

First problem is in ieee80211_rx_h_mesh_fwding(),
ieee80211_select_queue_80211() is used on received packets to choose
the sending AC queue of the forwarding packet although this function
should only be called on TX packet (it uses ieee80211_tx_info).
This ends with forwarded mesh packets been sent on unrelated random AC
queue. To fix that, AC queue can directly be infered from skb->priority
which has been extracted from QOS info (see ieee80211_parse_qos()).

Second problem is the value of queue_mapping set on forwarded mesh
frames via skb_set_queue_mapping() is not the AC of the packet but a
hardware queue index. This may or may not work depending on AC to HW
queue mapping which is driver specific.

Both of these issues lead to improper AC selection while forwarding
mesh packets but more importantly due to improper airtime accounting
(which is done on a per STA, per AC basis) caused traffic stall with
the introduction of AQL.

Fixes: cf44012810 ("mac80211: fix unnecessary frame drops in mesh fwding")
Fixes: d3c1597b8d ("mac80211: fix forwarded mesh frame queue mapping")
Co-developed-by: Remi Pommarel <repk@triplefau.lt>
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Link: https://lore.kernel.org/r/20220214173214.368862-1-nico.escande@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-16 15:27:33 +01:00
Johannes Berg
a6bce78262 mac80211: refuse aggregations sessions before authorized
If an MFP station isn't authorized, the receiver will (or
at least should) drop the action frame since it's a robust
management frame, but if we're not authorized we haven't
installed keys yet. Refuse attempts to start a session as
they'd just time out.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20220203201528.ff4d5679dce9.I34bb1f2bc341e161af2d6faf74f91b332ba11285@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-16 15:26:34 +01:00
Deren Wu
610d086d6d mac80211: fix EAPoL rekey fail in 802.3 rx path
mac80211 set capability NL80211_EXT_FEATURE_CONTROL_PORT_OVER_NL80211
to upper layer by default. That means we should pass EAPoL packets through
nl80211 path only, and should not send the EAPoL skb to netdevice diretly.
At the meanwhile, wpa_supplicant would not register sock to listen EAPoL
skb on the netdevice.

However, there is no control_port_protocol handler in mac80211 for 802.3 RX
packets, mac80211 driver would pass up the EAPoL rekey frame to netdevice
and wpa_supplicant would be never interactive with this kind of packets,
if SUPPORTS_RX_DECAP_OFFLOAD is enabled. This causes STA always rekey fail
if EAPoL frame go through 802.3 path.

To avoid this problem, align the same process as 802.11 type to handle
this frame before put it into network stack.

This also addresses a potential security issue in 802.3 RX mode that was
previously fixed in commit a8c4d76a8d ("mac80211: do not accept/forward
invalid EAPOL frames").

Cc: stable@vger.kernel.org # 5.12+
Fixes: 80a915ec44 ("mac80211: add rx decapsulation offload support")
Signed-off-by: Deren Wu <deren.wu@mediatek.com>
Link: https://lore.kernel.org/r/6889c9fced5859ebb088564035f84fd0fa792a49.1644680751.git.deren.wu@mediatek.com
[fix typos, update comment and add note about security issue]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-16 15:25:02 +01:00
James Morse
dee435be76 arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2
Speculation attacks against some high-performance processors can
make use of branch history to influence future speculation as part of
a spectre-v2 attack. This is not mitigated by CSV2, meaning CPUs that
previously reported 'Not affected' are now moderately mitigated by CSV2.

Update the value in /sys/devices/system/cpu/vulnerabilities/spectre_v2
to also show the state of the BHB mitigation.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-16 13:22:26 +00:00
James Morse
bd09128d16 arm64: Add percpu vectors for EL1
The Spectre-BHB workaround adds a firmware call to the vectors. This
is needed on some CPUs, but not others. To avoid the unaffected CPU in
a big/little pair from making the firmware call, create per cpu vectors.

The per-cpu vectors only apply when returning from EL0.

Systems using KPTI can use the canonical 'full-fat' vectors directly at
EL1, the trampoline exit code will switch to this_cpu_vector on exit to
EL0. Systems not using KPTI should always use this_cpu_vector.

this_cpu_vector will point at a vector in tramp_vecs or
__bp_harden_el1_vectors, depending on whether KPTI is in use.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-16 13:17:30 +00:00
James Morse
b28a8eebe8 arm64: entry: Add macro for reading symbol addresses from the trampoline
The trampoline code needs to use the address of symbols in the wider
kernel, e.g. vectors. PC-relative addressing wouldn't work as the
trampoline code doesn't run at the address the linker expected.

tramp_ventry uses a literal pool, unless CONFIG_RANDOMIZE_BASE is
set, in which case it uses the data page as a literal pool because
the data page can be unmapped when running in user-space, which is
required for CPUs vulnerable to meltdown.

Pull this logic out as a macro, instead of adding a third copy
of it.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-16 13:16:27 +00:00
James Morse
ba2689234b arm64: entry: Add vectors that have the bhb mitigation sequences
Some CPUs affected by Spectre-BHB need a sequence of branches, or a
firmware call to be run before any indirect branch. This needs to go
in the vectors. No CPU needs both.

While this can be patched in, it would run on all CPUs as there is a
single set of vectors. If only one part of a big/little combination is
affected, the unaffected CPUs have to run the mitigation too.

Create extra vectors that include the sequence. Subsequent patches will
allow affected CPUs to select this set of vectors. Later patches will
modify the loop count to match what the CPU requires.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-16 13:16:08 +00:00
Qu Wenruo
966d879baf btrfs: defrag: allow defrag_one_cluster() to skip large extent which is not a target
In the rework of btrfs_defrag_file(), we always call
defrag_one_cluster() and increase the offset by cluster size, which is
only 256K.

But there are cases where we have a large extent (e.g. 128M) which
doesn't need to be defragged at all.

Before the refactor, we can directly skip the range, but now we have to
scan that extent map again and again until the cluster moves after the
non-target extent.

Fix the problem by allow defrag_one_cluster() to increase
btrfs_defrag_ctrl::last_scanned to the end of an extent, if and only if
the last extent of the cluster is not a target.

The test script looks like this:

	mkfs.btrfs -f $dev > /dev/null

	mount $dev $mnt

	# As btrfs ioctl uses 32M as extent_threshold
	xfs_io -f -c "pwrite 0 64M" $mnt/file1
	sync
	# Some fragemented range to defrag
	xfs_io -s -c "pwrite 65548k 4k" \
		  -c "pwrite 65544k 4k" \
		  -c "pwrite 65540k 4k" \
		  -c "pwrite 65536k 4k" \
		  $mnt/file1
	sync

	echo "=== before ==="
	xfs_io -c "fiemap -v" $mnt/file1
	echo "=== after ==="
	btrfs fi defrag $mnt/file1
	sync
	xfs_io -c "fiemap -v" $mnt/file1
	umount $mnt

With extra ftrace put into defrag_one_cluster(), before the patch it
would result tons of loops:

(As defrag_one_cluster() is inlined, the function name is its caller)

  btrfs-126062  [005] .....  4682.816026: btrfs_defrag_file: r/i=5/257 start=0 len=262144
  btrfs-126062  [005] .....  4682.816027: btrfs_defrag_file: r/i=5/257 start=262144 len=262144
  btrfs-126062  [005] .....  4682.816028: btrfs_defrag_file: r/i=5/257 start=524288 len=262144
  btrfs-126062  [005] .....  4682.816028: btrfs_defrag_file: r/i=5/257 start=786432 len=262144
  btrfs-126062  [005] .....  4682.816028: btrfs_defrag_file: r/i=5/257 start=1048576 len=262144
  ...
  btrfs-126062  [005] .....  4682.816043: btrfs_defrag_file: r/i=5/257 start=67108864 len=262144

But with this patch there will be just one loop, then directly to the
end of the extent:

  btrfs-130471  [014] .....  5434.029558: defrag_one_cluster: r/i=5/257 start=0 len=262144
  btrfs-130471  [014] .....  5434.029559: defrag_one_cluster: r/i=5/257 start=67108864 len=16384

CC: stable@vger.kernel.org # 5.16
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-15 19:59:30 +01:00
Dāvis Mosāns
741b23a970 btrfs: prevent copying too big compressed lzo segment
Compressed length can be corrupted to be a lot larger than memory
we have allocated for buffer.
This will cause memcpy in copy_compressed_segment to write outside
of allocated memory.

This mostly results in stuck read syscall but sometimes when using
btrfs send can get #GP

  kernel: general protection fault, probably for non-canonical address 0x841551d5c1000: 0000 [#1] PREEMPT SMP NOPTI
  kernel: CPU: 17 PID: 264 Comm: kworker/u256:7 Tainted: P           OE     5.17.0-rc2-1 #12
  kernel: Workqueue: btrfs-endio btrfs_work_helper [btrfs]
  kernel: RIP: 0010:lzo_decompress_bio (./include/linux/fortify-string.h:225 fs/btrfs/lzo.c:322 fs/btrfs/lzo.c:394) btrfs
  Code starting with the faulting instruction
  ===========================================
     0:*  48 8b 06                mov    (%rsi),%rax              <-- trapping instruction
     3:   48 8d 79 08             lea    0x8(%rcx),%rdi
     7:   48 83 e7 f8             and    $0xfffffffffffffff8,%rdi
     b:   48 89 01                mov    %rax,(%rcx)
     e:   44 89 f0                mov    %r14d,%eax
    11:   48 8b 54 06 f8          mov    -0x8(%rsi,%rax,1),%rdx
  kernel: RSP: 0018:ffffb110812efd50 EFLAGS: 00010212
  kernel: RAX: 0000000000001000 RBX: 000000009ca264c8 RCX: ffff98996e6d8ff8
  kernel: RDX: 0000000000000064 RSI: 000841551d5c1000 RDI: ffffffff9500435d
  kernel: RBP: ffff989a3be856c0 R08: 0000000000000000 R09: 0000000000000000
  kernel: R10: 0000000000000000 R11: 0000000000001000 R12: ffff98996e6d8000
  kernel: R13: 0000000000000008 R14: 0000000000001000 R15: 000841551d5c1000
  kernel: FS:  0000000000000000(0000) GS:ffff98a09d640000(0000) knlGS:0000000000000000
  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  kernel: CR2: 00001e9f984d9ea8 CR3: 000000014971a000 CR4: 00000000003506e0
  kernel: Call Trace:
  kernel:  <TASK>
  kernel: end_compressed_bio_read (fs/btrfs/compression.c:104 fs/btrfs/compression.c:1363 fs/btrfs/compression.c:323) btrfs
  kernel: end_workqueue_fn (fs/btrfs/disk-io.c:1923) btrfs
  kernel: btrfs_work_helper (fs/btrfs/async-thread.c:326) btrfs
  kernel: process_one_work (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:212 ./include/trace/events/workqueue.h:108 kernel/workqueue.c:2312)
  kernel: worker_thread (./include/linux/list.h:292 kernel/workqueue.c:2455)
  kernel: ? process_one_work (kernel/workqueue.c:2397)
  kernel: kthread (kernel/kthread.c:377)
  kernel: ? kthread_complete_and_exit (kernel/kthread.c:332)
  kernel: ret_from_fork (arch/x86/entry/entry_64.S:301)
  kernel:  </TASK>

CC: stable@vger.kernel.org # 4.9+
Signed-off-by: Dāvis Mosāns <davispuh@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-15 19:59:09 +01:00
Felix Maurer
61d06f01f9 selftests: bpf: Check bpf_msg_push_data return value
bpf_msg_push_data may return a non-zero value to indicate an error. The
return value should be checked to prevent undetected errors.

To indicate an error, the BPF programs now perform a different action
than their intended one to make the userspace test program notice the
error, i.e., the programs supposed to pass/redirect drop, the program
supposed to drop passes.

Fixes: 84fbfe026a ("bpf: test_sockmap add options to use msg_push_data")
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/89f767bb44005d6b4dd1f42038c438f76b3ebfad.1644601294.git.fmaurer@redhat.com
2022-02-15 10:10:31 -08:00
James Morse
aff65393fa arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
kpti is an optional feature, for systems not using kpti a set of
vectors for the spectre-bhb mitigations is needed.

Add another set of vectors, __bp_harden_el1_vectors, that will be
used if a mitigation is needed and kpti is not in use.

The EL1 ventries are repeated verbatim as there is no additional
work needed for entry from EL1.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:40:43 +00:00
James Morse
a9c406e646 arm64: entry: Allow the trampoline text to occupy multiple pages
Adding a second set of vectors to .entry.tramp.text will make it
larger than a single 4K page.

Allow the trampoline text to occupy up to three pages by adding two
more fixmap slots. Previous changes to tramp_valias allowed it to reach
beyond a single page.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:40:28 +00:00
James Morse
c47e4d04ba arm64: entry: Make the kpti trampoline's kpti sequence optional
Spectre-BHB needs to add sequences to the vectors. Having one global
set of vectors is a problem for big/little systems where the sequence
is costly on cpus that are not vulnerable.

Making the vectors per-cpu in the style of KVM's bh_harden_hyp_vecs
requires the vectors to be generated by macros.

Make the kpti re-mapping of the kernel optional, so the macros can be
used without kpti.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:40:16 +00:00
James Morse
13d7a08352 arm64: entry: Move trampoline macros out of ifdef'd section
The macros for building the kpti trampoline are all behind
CONFIG_UNMAP_KERNEL_AT_EL0, and in a region that outputs to the
.entry.tramp.text section.

Move the macros out so they can be used to generate other kinds of
trampoline. Only the symbols need to be guarded by
CONFIG_UNMAP_KERNEL_AT_EL0 and appear in the .entry.tramp.text section.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:40:03 +00:00
James Morse
ed50da7764 arm64: entry: Don't assume tramp_vectors is the start of the vectors
The tramp_ventry macro uses tramp_vectors as the address of the vectors
when calculating which ventry in the 'full fat' vectors to branch to.

While there is one set of tramp_vectors, this will be true.
Adding multiple sets of vectors will break this assumption.

Move the generation of the vectors to a macro, and pass the start
of the vectors as an argument to tramp_ventry.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:39:49 +00:00
James Morse
6c5bf79b69 arm64: entry: Allow tramp_alias to access symbols after the 4K boundary
Systems using kpti enter and exit the kernel through a trampoline mapping
that is always mapped, even when the kernel is not. tramp_valias is a macro
to find the address of a symbol in the trampoline mapping.

Adding extra sets of vectors will expand the size of the entry.tramp.text
section to beyond 4K. tramp_valias will be unable to generate addresses
for symbols beyond 4K as it uses the 12 bit immediate of the add
instruction.

As there are now two registers available when tramp_alias is called,
use the extra register to avoid the 4K limit of the 12 bit immediate.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:39:34 +00:00
James Morse
c091fb6ae0 arm64: entry: Move the trampoline data page before the text page
The trampoline code has a data page that holds the address of the vectors,
which is unmapped when running in user-space. This ensures that with
CONFIG_RANDOMIZE_BASE, the randomised address of the kernel can't be
discovered until after the kernel has been mapped.

If the trampoline text page is extended to include multiple sets of
vectors, it will be larger than a single page, making it tricky to
find the data page without knowing the size of the trampoline text
pages, which will vary with PAGE_SIZE.

Move the data page to appear before the text page. This allows the
data page to be found without knowing the size of the trampoline text
pages. 'tramp_vectors' is used to refer to the beginning of the
.entry.tramp.text section, do that explicitly.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:39:14 +00:00
James Morse
03aff3a77a arm64: entry: Free up another register on kpti's tramp_exit path
Kpti stashes x30 in far_el1 while it uses x30 for all its work.

Making the vectors a per-cpu data structure will require a second
register.

Allow tramp_exit two registers before it unmaps the kernel, by
leaving x30 on the stack, and stashing x29 in far_el1.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:39:05 +00:00
James Morse
d739da1694 arm64: entry: Make the trampoline cleanup optional
Subsequent patches will add additional sets of vectors that use
the same tricks as the kpti vectors to reach the full-fat vectors.
The full-fat vectors contain some cleanup for kpti that is patched
in by alternatives when kpti is in use. Once there are additional
vectors, the cleanup will be needed in more cases.

But on big/little systems, the cleanup would be harmful if no
trampoline vector were in use. Instead of forcing CPUs that don't
need a trampoline vector to use one, make the trampoline cleanup
optional.

Entry at the top of the vectors will skip the cleanup. The trampoline
vectors can then skip the first instruction, triggering the cleanup
to run.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:38:46 +00:00
James Morse
5bdf343760 KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A
CPUs vulnerable to Spectre-BHB either need to make an SMC-CC firmware
call from the vectors, or run a sequence of branches. This gets added
to the hyp vectors. If there is no support for arch-workaround-1 in
firmware, the indirect vector will be used.

kvm_init_vector_slots() only initialises the two indirect slots if
the platform is vulnerable to Spectre-v3a. pKVM's hyp_map_vectors()
only initialises __hyp_bp_vect_base if the platform is vulnerable to
Spectre-v3a.

As there are about to more users of the indirect vectors, ensure
their entries in hyp_spectre_vector_selector[] are always initialised,
and __hyp_bp_vect_base defaults to the regular VA mapping.

The Spectre-v3a check is moved to a helper
kvm_system_needs_idmapped_vectors(), and merged with the code
that creates the hyp mappings.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:38:25 +00:00
James Morse
1b33d4860d arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit
The spectre-v4 sequence includes an SMC from the assembly entry code.
spectre_v4_patch_fw_mitigation_conduit is the patching callback that
generates an HVC or SMC depending on the SMCCC conduit type.

As this isn't specific to spectre-v4, rename it
smccc_patch_fw_mitigation_conduit so it can be re-used.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:38:09 +00:00
James Morse
4330e2c5c0 arm64: entry.S: Add ventry overflow sanity checks
Subsequent patches add even more code to the ventry slots.
Ensure kernels that overflow a ventry slot don't get built.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:37:44 +00:00
Andy Shevchenko
6f66db29e2 pinctrl: tigerlake: Revert "Add Alder Lake-M ACPI ID"
It appears that last minute change moved ACPI ID of Alder Lake-M
to the INTC1055, which is already in the driver.

This ID on the other hand will be used elsewhere.

This reverts commit 258435a1c8.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2022-02-15 19:02:46 +02:00
Oliver Graute
b6821b0d9b staging: fbtft: fb_st7789v: reset display before initialization
In rare cases the display is flipped or mirrored. This was observed more
often in a low temperature environment. A clean reset on init_display()
should help to get registers in a sane state.

Fixes: ef8f317795 (staging: fbtft: use init function instead of init sequence)
Cc: stable@vger.kernel.org
Signed-off-by: Oliver Graute <oliver.graute@kococonnector.com>
Link: https://lore.kernel.org/r/20220210085322.15676-1-oliver.graute@kococonnector.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-15 17:14:22 +01:00
Eric Dumazet
f240762f88 io_uring: add a schedule point in io_add_buffers()
Looping ~65535 times doing kmalloc() calls can trigger soft lockups,
especially with DEBUG features (like KASAN).

[  253.536212] watchdog: BUG: soft lockup - CPU#64 stuck for 26s! [b219417889:12575]
[  253.544433] Modules linked in: vfat fat i2c_mux_pca954x i2c_mux spidev cdc_acm xhci_pci xhci_hcd sha3_generic gq(O)
[  253.544451] CPU: 64 PID: 12575 Comm: b219417889 Tainted: G S         O      5.17.0-smp-DEV #801
[  253.544457] RIP: 0010:kernel_text_address (./include/asm-generic/sections.h:192 ./include/linux/kallsyms.h:29 kernel/extable.c:67 kernel/extable.c:98)
[  253.544464] Code: 0f 93 c0 48 c7 c1 e0 63 d7 a4 48 39 cb 0f 92 c1 20 c1 0f b6 c1 5b 5d c3 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 48 89 fb <48> c7 c0 00 00 80 a0 41 be 01 00 00 00 48 39 c7 72 0c 48 c7 c0 40
[  253.544468] RSP: 0018:ffff8882d8baf4c0 EFLAGS: 00000246
[  253.544471] RAX: 1ffff1105b175e00 RBX: ffffffffa13ef09a RCX: 00000000a13ef001
[  253.544474] RDX: ffffffffa13ef09a RSI: ffff8882d8baf558 RDI: ffffffffa13ef09a
[  253.544476] RBP: ffff8882d8baf4d8 R08: ffff8882d8baf5e0 R09: 0000000000000004
[  253.544479] R10: ffff8882d8baf5e8 R11: ffffffffa0d59a50 R12: ffff8882eab20380
[  253.544481] R13: ffffffffa0d59a50 R14: dffffc0000000000 R15: 1ffff1105b175eb0
[  253.544483] FS:  00000000016d3380(0000) GS:ffff88af48c00000(0000) knlGS:0000000000000000
[  253.544486] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  253.544488] CR2: 00000000004af0f0 CR3: 00000002eabfa004 CR4: 00000000003706e0
[  253.544491] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  253.544492] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  253.544494] Call Trace:
[  253.544496]  <TASK>
[  253.544498] ? io_queue_sqe (fs/io_uring.c:7143)
[  253.544505] __kernel_text_address (kernel/extable.c:78)
[  253.544508] unwind_get_return_address (arch/x86/kernel/unwind_frame.c:19)
[  253.544514] arch_stack_walk (arch/x86/kernel/stacktrace.c:27)
[  253.544517] ? io_queue_sqe (fs/io_uring.c:7143)
[  253.544521] stack_trace_save (kernel/stacktrace.c:123)
[  253.544527] ____kasan_kmalloc (mm/kasan/common.c:39 mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:515)
[  253.544531] ? ____kasan_kmalloc (mm/kasan/common.c:39 mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:515)
[  253.544533] ? __kasan_kmalloc (mm/kasan/common.c:524)
[  253.544535] ? kmem_cache_alloc_trace (./include/linux/kasan.h:270 mm/slab.c:3567)
[  253.544541] ? io_issue_sqe (fs/io_uring.c:4556 fs/io_uring.c:4589 fs/io_uring.c:6828)
[  253.544544] ? __io_queue_sqe (fs/io_uring.c:?)
[  253.544551] __kasan_kmalloc (mm/kasan/common.c:524)
[  253.544553] kmem_cache_alloc_trace (./include/linux/kasan.h:270 mm/slab.c:3567)
[  253.544556] ? io_issue_sqe (fs/io_uring.c:4556 fs/io_uring.c:4589 fs/io_uring.c:6828)
[  253.544560] io_issue_sqe (fs/io_uring.c:4556 fs/io_uring.c:4589 fs/io_uring.c:6828)
[  253.544564] ? __kasan_slab_alloc (mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:469)
[  253.544567] ? __kasan_slab_alloc (mm/kasan/common.c:39 mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:469)
[  253.544569] ? kmem_cache_alloc_bulk (mm/slab.h:732 mm/slab.c:3546)
[  253.544573] ? __io_alloc_req_refill (fs/io_uring.c:2078)
[  253.544578] ? io_submit_sqes (fs/io_uring.c:7441)
[  253.544581] ? __se_sys_io_uring_enter (fs/io_uring.c:10154 fs/io_uring.c:10096)
[  253.544584] ? __x64_sys_io_uring_enter (fs/io_uring.c:10096)
[  253.544587] ? do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[  253.544590] ? entry_SYSCALL_64_after_hwframe (??:?)
[  253.544596] __io_queue_sqe (fs/io_uring.c:?)
[  253.544600] io_queue_sqe (fs/io_uring.c:7143)
[  253.544603] io_submit_sqe (fs/io_uring.c:?)
[  253.544608] io_submit_sqes (fs/io_uring.c:?)
[  253.544612] __se_sys_io_uring_enter (fs/io_uring.c:10154 fs/io_uring.c:10096)
[  253.544616] __x64_sys_io_uring_enter (fs/io_uring.c:10096)
[  253.544619] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[  253.544623] entry_SYSCALL_64_after_hwframe (??:?)

Fixes: ddf0322db7 ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring <io-uring@vger.kernel.org>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220215041003.2394784-1-eric.dumazet@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-15 07:47:16 -07:00
Hongyu Xie
243a1dd7ba xhci: Prevent futile URB re-submissions due to incorrect return value.
The -ENODEV return value from xhci_check_args() is incorrectly changed
to -EINVAL in a couple places before propagated further.

xhci_check_args() returns 4 types of value, -ENODEV, -EINVAL, 1 and 0.
xhci_urb_enqueue and xhci_check_streams_endpoint return -EINVAL if
the return value of xhci_check_args <= 0.
This causes problems for example r8152_submit_rx, calling usb_submit_urb
in drivers/net/usb/r8152.c.
r8152_submit_rx will never get -ENODEV after submiting an urb when xHC
is halted because xhci_urb_enqueue returns -EINVAL in the very beginning.

[commit message and header edit -Mathias]

Fixes: 203a86613f ("xhci: Avoid NULL pointer deref when host dies.")
Cc: stable@vger.kernel.org
Signed-off-by: Hongyu Xie <xiehongyu1@kylinos.cn>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20220215123320.1253947-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-15 15:13:20 +01:00
Puma Hsu
8b328f8002 xhci: re-initialize the HC during resume if HCE was set
When HCE(Host Controller Error) is set, it means an internal
error condition has been detected. Software needs to re-initialize
the HC, so add this check in xhci resume.

Cc: stable@vger.kernel.org
Signed-off-by: Puma Hsu <pumahsu@google.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20220215123320.1253947-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-15 15:13:20 +01:00
Hans de Goede
d7c93a903f usb: dwc3: pci: Add "snps,dis_u2_susphy_quirk" for Intel Bay Trail
Commit e0082698b6 ("usb: dwc3: ulpi: conditionally resume ULPI PHY")
fixed an issue where ULPI transfers would timeout if any requests where
send to the phy sometime after init, giving it enough time to auto-suspend.

Commit e5f4ca3fce ("usb: dwc3: ulpi: Fix USB2.0 HS/FS/LS PHY suspend
regression") changed the behavior to instead of clearing the
DWC3_GUSB2PHYCFG_SUSPHY bit, add an extra sleep when it is set.

But on Bay Trail devices, when phy_set_mode() gets called during init,
this leads to errors like these:
[   28.451522] tusb1210 dwc3.ulpi: error -110 writing val 0x01 to reg 0x0a
[   28.464089] tusb1210 dwc3.ulpi: error -110 writing val 0x01 to reg 0x0a

Add "snps,dis_u2_susphy_quirk" to the settings for Bay Trail devices to
fix this. This restores the old behavior for Bay Trail devices, since
previously the DWC3_GUSB2PHYCFG_SUSPHY bit would get cleared on the first
ulpi_read/_write() and then was never set again.

Fixes: e5f4ca3fce ("usb: dwc3: ulpi: Fix USB2.0 HS/FS/LS PHY suspend regression")
Cc: stable@kernel.org
Cc: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220213130524.18748-2-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-15 15:12:38 +01:00
Heikki Krogerus
038438a25c usb: dwc3: pci: add support for the Intel Raptor Lake-S
This patch adds the necessary PCI ID for Intel Raptor Lake-S
devices.

Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20220214141948.18637-1-heikki.krogerus@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-15 15:12:10 +01:00
Steev Klimaszewski
382e3e0eb6 arm64: dts: qcom: c630: disable crypto due to serror
Disable the crypto block due to it causing an SError in qce_start() on
the C630, which happens upon every boot when cryptomanager tests are
enabled.

Signed-off-by: Steev Klimaszewski <steev@kali.org>
[bjorn: Reworked commit message]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211105035235.2392-1-steev@kali.org
2022-02-14 21:50:11 -06:00
Zhang Qiao
05c7b7a92c cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug
As previously discussed(https://lkml.org/lkml/2022/1/20/51),
cpuset_attach() is affected with similar cpu hotplug race,
as follow scenario:

     cpuset_attach()				cpu hotplug
    ---------------------------            ----------------------
    down_write(cpuset_rwsem)
    guarantee_online_cpus() // (load cpus_attach)
					sched_cpu_deactivate
					  set_cpu_active()
					  // will change cpu_active_mask
    set_cpus_allowed_ptr(cpus_attach)
      __set_cpus_allowed_ptr_locked()
       // (if the intersection of cpus_attach and
         cpu_active_mask is empty, will return -EINVAL)
    up_write(cpuset_rwsem)

To avoid races such as described above, protect cpuset_attach() call
with cpu_hotplug_lock.

Fixes: be367d0992 ("cgroups: let ss->can_attach and ss->attach do whole threadgroups at a time")
Cc: stable@vger.kernel.org # v2.6.32+
Reported-by: Zhao Gongyi <zhaogongyi@huawei.com>
Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-02-14 09:48:04 -10:00
Pali Rohár
c49ae61990 PCI: mvebu: Fix device enumeration regression
Jan reported that on Turris Omnia (Armada 385), no PCIe devices were
detected after upgrading from v5.16.1 to v5.16.3 and identified the cause
as the backport of 91a8d79fc7 ("PCI: mvebu: Fix configuring secondary bus
of PCIe Root Port via emulated bridge"), which appeared in v5.17-rc1.

91a8d79fc7 was incorrectly applied from mailing list patch [1] to the
linux git repository [2] probably due to resolving merge conflicts
incorrectly. Fix it now.

[1] https://lore.kernel.org/r/20211125124605.25915-12-pali@kernel.org
[2] https://git.kernel.org/linus/91a8d79fc797

[bhelgaas: commit log]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215540
Fixes: 91a8d79fc7 ("PCI: mvebu: Fix configuring secondary bus of PCIe Root Port via emulated bridge")
Link: https://lore.kernel.org/r/20220214110228.25825-1-pali@kernel.org
Link: https://lore.kernel.org/r/20220127234917.GA150851@bhelgaas
Reported-by: Jan Palus <jpalus@fastmail.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-02-14 09:34:23 -06:00
Suravee Suthikulpanit
6b0b2d9a6a iommu/amd: Fix I/O page table memory leak
The current logic updates the I/O page table mode for the domain
before calling the logic to free memory used for the page table.
This results in IOMMU page table memory leak, and can be observed
when launching VM w/ pass-through devices.

Fix by freeing the memory used for page table before updating the mode.

Cc: Joerg Roedel <joro@8bytes.org>
Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Tested-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Fixes: e42ba06330 ("iommu/amd: Restructure code for freeing page table")
Link: https://lore.kernel.org/all/20220118194720.urjgi73b7c3tq2o6@oracle.com/
Link: https://lore.kernel.org/r/20220210154745.11524-1-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-14 12:52:40 +01:00
Yang Yingliang
40eb0dcf41 tee: optee: fix error return code in probe function
If teedev_open() fails, probe function need return
error code.

Fixes: aceeafefff ("optee: use driver internal tee_context for some rpc")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2022-02-14 12:36:48 +01:00
Lennert Buytenhek
5ce97f4ec5 iommu/amd: Recover from event log overflow
The AMD IOMMU logs I/O page faults and such to a ring buffer in
system memory, and this ring buffer can overflow.  The AMD IOMMU
spec has the following to say about the interrupt status bit that
signals this overflow condition:

	EventOverflow: Event log overflow. RW1C. Reset 0b. 1 = IOMMU
	event log overflow has occurred. This bit is set when a new
	event is to be written to the event log and there is no usable
	entry in the event log, causing the new event information to
	be discarded. An interrupt is generated when EventOverflow = 1b
	and MMIO Offset 0018h[EventIntEn] = 1b. No new event log
	entries are written while this bit is set. Software Note: To
	resume logging, clear EventOverflow (W1C), and write a 1 to
	MMIO Offset 0018h[EventLogEn].

The AMD IOMMU driver doesn't currently implement this recovery
sequence, meaning that if a ring buffer overflow occurs, logging
of EVT/PPR/GA events will cease entirely.

This patch implements the spec-mandated reset sequence, with the
minor tweak that the hardware seems to want to have a 0 written to
MMIO Offset 0018h[EventLogEn] first, before writing an 1 into this
field, or the IOMMU won't actually resume logging events.

Signed-off-by: Lennert Buytenhek <buytenh@arista.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/YVrSXEdW2rzEfOvk@wantstofly.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-14 12:06:55 +01:00
Halil Pasic
ddbd89deb7 swiotlb: fix info leak with DMA_FROM_DEVICE
The problem I'm addressing was discovered by the LTP test covering
cve-2018-1000204.

A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
   interface with: dxfer_len == 524288, dxdfer_dir == SG_DXFER_FROM_DEV
   and a corresponding dxferp. The peculiar thing about this is that TUR
   is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively
   bounces the user-space buffer. As if the device was to transfer into
   it. Since commit a45b599ad8 ("scsi: sg: allocate with __GFP_ZERO in
   sg_build_indirect()") we make sure this first bounce buffer is
   allocated with GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the
   device won't touch the buffer we prepare as if the we had a
   DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
   and the  buffer allocated by SG is mapped by the function
   virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here
   scatter-gather and not scsi generics). This mapping involves bouncing
   via the swiotlb (we need swiotlb to do virtio in protected guest like
   s390 Secure Execution, or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second
   (that is swiotlb) bounce buffer (which most likely contains some
   previous IO data), to the first bounce buffer, which contains all
   zeros.  Then we copy back the content of the first bounce buffer to
   the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized,
  ain't all zeros and fails.

One can argue that this is an swiotlb problem, because without swiotlb
we leak all zeros, and the swiotlb should be transparent in a sense that
it does not affect the outcome (if all other participants are well
behaved).

Copying the content of the original buffer into the swiotlb buffer is
the only way I can think of to make swiotlb transparent in such
scenarios. So let's do just that if in doubt, but allow the driver
to tell us that the whole mapped buffer is going to be overwritten,
in which case we can preserve the old behavior and avoid the performance
impact of the extra bounce.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-02-14 10:22:28 +01:00
Sudeep Holla
45d941f67b arm64: dts: imx8ulp: Set #thermal-sensor-cells to 1 as required
The SCMI binding clearly states the value of #thermal-sensor-cells must
be 1. However arch/arm64/boot/dts/freescale/imx8ulp.dtsi sets it 0 which
results in the following warning with dtbs_check:

  |  arch/arm64/boot/dts/freescale/imx8ulp-evk.dt.yaml: scmi:
  | 		protocol@15:#thermal-sensor-cells:0:0: 1 was expected
  |	From schema: Documentation/devicetree/bindings/firmware/arm,scmi.yaml

Fix it by setting it to 1 as required.

Cc:Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Acked-by: Peng Fan <peng.fan@nxp.com>
Fixes: a38771d7a4 ("arm64: dts: imx8ulp: add scmi firmware node")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2022-02-14 08:39:12 +08:00
Adam Ford
ef3075d663 arm64: dts: imx8mm: Fix VPU Hanging
The vpumix power domain has a reset assigned to it, however
when used, it causes a system hang.  Testing has shown that
it does not appear to be needed anywhere.

Fixes: d39d4bb153 ("arm64: dts: imx8mm: add GPC node")
Signed-off-by: Adam Ford <aford173@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2022-02-14 08:39:12 +08:00
Pablo Neira Ayuso
2874b79111 netfilter: xt_socket: missing ifdef CONFIG_IP6_NF_IPTABLES dependency
nf_defrag_ipv6_disable() requires CONFIG_IP6_NF_IPTABLES.

Fixes: 75063c9294 ("netfilter: xt_socket: fix a typo in socket_mt_destroy()")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Eric Dumazet<edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-13 23:55:48 +01:00
Corentin Labbe
3916c36195 ARM: dts: rockchip: fix a typo on rk3288 crypto-controller
crypto-controller had a typo, fix it.
In the same time, rename it to just crypto

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20220209120355.1985707-1-clabbe@baylibre.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-02-12 01:27:33 +01:00
Sascha Hauer
be4e65bdff ARM: dts: rockchip: reorder rk322x hmdi clocks
The binding specifies the clock order to "iahb", "isfr", "cec". Reorder
the clocks accordingly.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20220210142353.3420859-1-s.hauer@pengutronix.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-02-12 01:26:16 +01:00
Alexei Starovoitov
3df9d80316 Merge branch 'bpf: fix a bpf_timer initialization issue'
Yonghong Song says:

====================

The patch [1] exposed a bpf_timer initialization bug in function
check_and_init_map_value(). With bug fix here, the patch [1]
can be applied with all selftests passed. Please see individual
patches for fix details.

  [1] https://lore.kernel.org/bpf/20220209070324.1093182-2-memxor@gmail.com/

Changelog:
  v3 -> v4:
    . move header file in patch #1 to avoid bpf-next merge conflict
  v2 -> v3:
    . switch patch #1 and patch #2 for better bisecting
  v1 -> v2:
    . add Fixes tag for patch #1
    . rebase against bpf tree
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-02-11 13:21:47 -08:00
Yonghong Song
5eaed6eedb bpf: Fix a bpf_timer initialization issue
The patch in [1] intends to fix a bpf_timer related issue,
but the fix caused existing 'timer' selftest to fail with
hang or some random errors. After some debug, I found
an issue with check_and_init_map_value() in the hashtab.c.
More specifically, in hashtab.c, we have code
  l_new = bpf_map_kmalloc_node(&htab->map, ...)
  check_and_init_map_value(&htab->map, l_new...)
Note that bpf_map_kmalloc_node() does not do initialization
so l_new contains random value.

The function check_and_init_map_value() intends to zero the
bpf_spin_lock and bpf_timer if they exist in the map.
But I found bpf_spin_lock is zero'ed but bpf_timer is not zero'ed.
With [1], later copy_map_value() skips copying of
bpf_spin_lock and bpf_timer. The non-zero bpf_timer caused
random failures for 'timer' selftest.
Without [1], for both bpf_spin_lock and bpf_timer case,
bpf_timer will be zero'ed, so 'timer' self test is okay.

For check_and_init_map_value(), why bpf_spin_lock is zero'ed
properly while bpf_timer not. In bpf uapi header, we have
  struct bpf_spin_lock {
        __u32   val;
  };
  struct bpf_timer {
        __u64 :64;
        __u64 :64;
  } __attribute__((aligned(8)));

The initialization code:
  *(struct bpf_spin_lock *)(dst + map->spin_lock_off) =
      (struct bpf_spin_lock){};
  *(struct bpf_timer *)(dst + map->timer_off) =
      (struct bpf_timer){};
It appears the compiler has no obligation to initialize anonymous fields.
For example, let us use clang with bpf target as below:
  $ cat t.c
  struct bpf_timer {
        unsigned long long :64;
  };
  struct bpf_timer2 {
        unsigned long long a;
  };

  void test(struct bpf_timer *t) {
    *t = (struct bpf_timer){};
  }
  void test2(struct bpf_timer2 *t) {
    *t = (struct bpf_timer2){};
  }
  $ clang -target bpf -O2 -c -g t.c
  $ llvm-objdump -d t.o
   ...
   0000000000000000 <test>:
       0:       95 00 00 00 00 00 00 00 exit
   0000000000000008 <test2>:
       1:       b7 02 00 00 00 00 00 00 r2 = 0
       2:       7b 21 00 00 00 00 00 00 *(u64 *)(r1 + 0) = r2
       3:       95 00 00 00 00 00 00 00 exit

gcc11.2 does not have the above issue. But from
  INTERNATIONAL STANDARD ©ISO/IEC ISO/IEC 9899:201x
  Programming languages — C
  http://www.open-std.org/Jtc1/sc22/wg14/www/docs/n1547.pdf
  page 157:
  Except where explicitly stated otherwise, for the purposes of
  this subclause unnamed members of objects of structure and union
  type do not participate in initialization. Unnamed members of
  structure objects have indeterminate value even after initialization.

To fix the problem, let use memset for bpf_timer case in
check_and_init_map_value(). For consistency, memset is also
used for bpf_spin_lock case.

  [1] https://lore.kernel.org/bpf/20220209070324.1093182-2-memxor@gmail.com/

Fixes: 68134668c1 ("bpf: Add map side support for bpf timers.")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220211194953.3142152-1-yhs@fb.com
2022-02-11 13:21:47 -08:00
Yonghong Song
3bd916ee0e bpf: Emit bpf_timer in vmlinux BTF
Currently the following code in check_and_init_map_value()
  *(struct bpf_timer *)(dst + map->timer_off) =
      (struct bpf_timer){};
can help generate bpf_timer definition in vmlinuxBTF.
But the code above may not zero the whole structure
due to anonymour members and that code will be replaced
by memset in the subsequent patch and
bpf_timer definition will disappear from vmlinuxBTF.
Let us emit the type explicitly so bpf program can continue
to use it from vmlinux.h.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220211194948.3141529-1-yhs@fb.com
2022-02-11 13:21:47 -08:00
Arnd Bergmann
a8cd28553f Merge tag 'at91-fixes-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes
AT91 fixes #1 for 5.17:

- MAINTAINERS file update.

* tag 'at91-fixes-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
  dt-bindings: ARM: at91: update maintainers entry
  MAINTAINERS: replace a Microchip AT91 maintainer

Link: https://lore.kernel.org/r/20220211133515.15314-1-nicolas.ferre@microchip.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-11 22:13:18 +01:00
Alexei Starovoitov
acc3c47394 Merge branch 'Fix for crash due to overwrite in copy_map_value'
Kumar Kartikeya says:

====================

A fix for an oversight in copy_map_value that leads to kernel crash.

Also, a question for BPF developers:
It seems in arraymap.c, we always do check_and_free_timer_in_array after we do
copy_map_value in map_update_elem callback, but the same is not done for
hashtab.c. Is there a specific reason for this difference in behavior, or did I
miss that it happens for hashtab.c as well?

Changlog:
---------
v1 -> v2:
v1: https://lore.kernel.org/bpf/20220209051113.870717-1-memxor@gmail.com

 * Fix build error for selftests patch due to missing SYS_PREFIX in bpf tree
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-02-11 13:13:04 -08:00
Kumar Kartikeya Dwivedi
a7e75016a0 selftests/bpf: Add test for bpf_timer overwriting crash
Add a test that validates that timer value is not overwritten when doing
a copy_map_value call in the kernel. Without the prior fix, this test
triggers a crash.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220209070324.1093182-3-memxor@gmail.com
2022-02-11 13:13:04 -08:00
Kumar Kartikeya Dwivedi
a8abb0c3dc bpf: Fix crash due to incorrect copy_map_value
When both bpf_spin_lock and bpf_timer are present in a BPF map value,
copy_map_value needs to skirt both objects when copying a value into and
out of the map. However, the current code does not set both s_off and
t_off in copy_map_value, which leads to a crash when e.g. bpf_spin_lock
is placed in map value with bpf_timer, as bpf_map_update_elem call will
be able to overwrite the other timer object.

When the issue is not fixed, an overwriting can produce the following
splat:

[root@(none) bpf]# ./test_progs -t timer_crash
[   15.930339] bpf_testmod: loading out-of-tree module taints kernel.
[   16.037849] ==================================================================
[   16.038458] BUG: KASAN: user-memory-access in __pv_queued_spin_lock_slowpath+0x32b/0x520
[   16.038944] Write of size 8 at addr 0000000000043ec0 by task test_progs/325
[   16.039399]
[   16.039514] CPU: 0 PID: 325 Comm: test_progs Tainted: G           OE     5.16.0+ #278
[   16.039983] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ArchLinux 1.15.0-1 04/01/2014
[   16.040485] Call Trace:
[   16.040645]  <TASK>
[   16.040805]  dump_stack_lvl+0x59/0x73
[   16.041069]  ? __pv_queued_spin_lock_slowpath+0x32b/0x520
[   16.041427]  kasan_report.cold+0x116/0x11b
[   16.041673]  ? __pv_queued_spin_lock_slowpath+0x32b/0x520
[   16.042040]  __pv_queued_spin_lock_slowpath+0x32b/0x520
[   16.042328]  ? memcpy+0x39/0x60
[   16.042552]  ? pv_hash+0xd0/0xd0
[   16.042785]  ? lockdep_hardirqs_off+0x95/0xd0
[   16.043079]  __bpf_spin_lock_irqsave+0xdf/0xf0
[   16.043366]  ? bpf_get_current_comm+0x50/0x50
[   16.043608]  ? jhash+0x11a/0x270
[   16.043848]  bpf_timer_cancel+0x34/0xe0
[   16.044119]  bpf_prog_c4ea1c0f7449940d_sys_enter+0x7c/0x81
[   16.044500]  bpf_trampoline_6442477838_0+0x36/0x1000
[   16.044836]  __x64_sys_nanosleep+0x5/0x140
[   16.045119]  do_syscall_64+0x59/0x80
[   16.045377]  ? lock_is_held_type+0xe4/0x140
[   16.045670]  ? irqentry_exit_to_user_mode+0xa/0x40
[   16.046001]  ? mark_held_locks+0x24/0x90
[   16.046287]  ? asm_exc_page_fault+0x1e/0x30
[   16.046569]  ? asm_exc_page_fault+0x8/0x30
[   16.046851]  ? lockdep_hardirqs_on+0x7e/0x100
[   16.047137]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   16.047405] RIP: 0033:0x7f9e4831718d
[   16.047602] Code: b4 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b3 6c 0c 00 f7 d8 64 89 01 48
[   16.048764] RSP: 002b:00007fff488086b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000023
[   16.049275] RAX: ffffffffffffffda RBX: 00007f9e48683740 RCX: 00007f9e4831718d
[   16.049747] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fff488086d0
[   16.050225] RBP: 00007fff488086f0 R08: 00007fff488085d7 R09: 00007f9e4cb594a0
[   16.050648] R10: 0000000000000000 R11: 0000000000000206 R12: 00007f9e484cde30
[   16.051124] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   16.051608]  </TASK>
[   16.051762] ==================================================================

Fixes: 68134668c1 ("bpf: Add map side support for bpf timers.")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220209070324.1093182-2-memxor@gmail.com
2022-02-11 13:13:04 -08:00
Felix Maurer
4a11678f68 bpf: Do not try bpf_msg_push_data with len 0
If bpf_msg_push_data() is called with len 0 (as it happens during
selftests/bpf/test_sockmap), we do not need to do anything and can
return early.

Calling bpf_msg_push_data() with len 0 previously lead to a wrong ENOMEM
error: we later called get_order(copy + len); if len was 0, copy + len
was also often 0 and get_order() returned some undefined value (at the
moment 52). alloc_pages() caught that and failed, but then bpf_msg_push_data()
returned ENOMEM. This was wrong because we are most probably not out of
memory and actually do not need any additional memory.

Fixes: 6fff607e2f ("bpf: sk_msg program helper bpf_msg_push_data")
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/df69012695c7094ccb1943ca02b4920db3537466.1644421921.git.fmaurer@redhat.com
2022-02-11 13:49:23 +01:00
Alyssa Ross
1ba603f565 firmware: arm_scmi: Remove space in MODULE_ALIAS name
modprobe can't handle spaces in aliases. Get rid of it to fix the issue.

Link: https://lore.kernel.org/r/20220211102704.128354-1-sudeep.holla@arm.com
Fixes: aa4f886f38 ("firmware: arm_scmi: add basic driver infrastructure for SCMI")
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-02-11 10:37:16 +00:00
Sean Anderson
e9f7b9228a pinctrl: k210: Fix bias-pull-up
Using bias-pull-up would actually cause the pin to have its pull-down
enabled. Fix this.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Fixes: d4c34d09ab ("pinctrl: Add RISC-V Canaan Kendryte K210 FPIOA driver")
Link: https://lore.kernel.org/r/20220209182822.640905-1-seanga2@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-02-11 02:19:47 +01:00
Dan Carpenter
ba2ab85951 pinctrl: fix loop in k210_pinconf_get_drive()
The loop exited too early so the k210_pinconf_drive_strength[0] array
element was never used.

Fixes: d4c34d09ab ("pinctrl: Add RISC-V Canaan Kendryte K210 FPIOA driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Sean Anderson <seanga2@gmail.com>
Link: https://lore.kernel.org/r/20220209180804.GA18385@kili
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-02-11 02:18:35 +01:00
Darrick J. Wong
b97cca3ba9 xfs: only bother with sync_filesystem during readonly remount
In commit 02b9984d64, we pushed a sync_filesystem() call from the VFS
into xfs_fs_remount.  The only time that we ever need to push dirty file
data or metadata to disk for a remount is if we're remounting the
filesystem read only, so this really could be moved to xfs_remount_ro.

Once we've moved the call site, actually check the return value from
sync_filesystem.

Fixes: 02b9984d64 ("fs: push sync_filesystem() down to the file system's remount_fs()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-02-09 21:07:24 -08:00
Dinh Nguyen
268a491aeb arm64: dts: agilex: use the compatible "intel,socfpga-agilex-hsotg"
The DWC2 USB controller on the Agilex platform does not support clock
gating, so use the chip specific "intel,socfpga-agilex-hsotg"
compatible.

Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-02-09 13:18:48 -06:00
Dinh Nguyen
728390fce4 dt-bindings: usb: dwc2: add compatible "intel,socfpga-agilex-hsotg"
Add the compatible "intel,socfpga-agilex-hsotg" to the DWC2
implementation, because the Agilex DWC2 implementation does not support
clock gating.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-02-09 13:18:13 -06:00
Nicolas Ferre
26077968f8 dt-bindings: ARM: at91: update maintainers entry
Align the binding documentation with the newly updated MAINTAINERS
entry.

Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Acked-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Link: https://lore.kernel.org/r/5bf9873eeee3cd49c52a8952a7cd4cb60b61d50a.1643553501.git.nicolas.ferre@microchip.com
2022-02-09 11:30:01 +01:00
Nicolas Ferre
6620e311ae MAINTAINERS: replace a Microchip AT91 maintainer
As Ludovic is more focusing on other aspects of the Microchip
Linux-based development, replace him with Claudiu.
Entry is added to the CREDITS file.

Thanks Ludovic for these great contributions in the kernel space!

Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Acked-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Link: https://lore.kernel.org/r/23819d8baa635815d0893955197561fe4f044d5e.1643553501.git.nicolas.ferre@microchip.com
2022-02-09 11:30:01 +01:00
Leon Romanovsky
7c76ecd9c9 xfrm: enforce validity of offload input flags
struct xfrm_user_offload has flags variable that received user input,
but kernel didn't check if valid bits were provided. It caused a situation
where not sanitized input was forwarded directly to the drivers.

For example, XFRM_OFFLOAD_IPV6 define that was exposed, was used by
strongswan, but not implemented in the kernel at all.

As a solution, check and sanitize input flags to forward
XFRM_OFFLOAD_INBOUND to the drivers.

Fixes: d77e38e612 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-02-09 09:00:40 +01:00
Bjorn Andersson
ef8ee1cb8f cpufreq: qcom-hw: Delay enabling throttle_irq
In the event that the SoC is under thermal pressure while booting it's
possible for the dcvs notification to happen inbetween the cpufreq
framework calling init and it actually updating the policy's
related_cpus cpumask.

Prior to the introduction of the thermal pressure update helper an empty
cpumask would simply result in the thermal pressure of no cpus being
updated, but the new code will attempt to dereference an invalid per_cpu
variable.

Avoid this problem by using the newly reintroduced "ready" callback, to
postpone enabling the IRQ until the related_cpus cpumask is filled in.

Fixes: 0258cb19c7 ("cpufreq: qcom-cpufreq-hw: Use new thermal pressure update function")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-02-09 13:18:49 +05:30
Bjorn Andersson
4f774c4a65 cpufreq: Reintroduce ready() callback
This effectively revert '4bf8e582119e ("cpufreq: Remove ready()
callback")', in order to reintroduce the ready callback.

This is needed in order to be able to leave the thermal pressure
interrupts in the Qualcomm CPUfreq driver disabled during
initialization, so that it doesn't fire while related_cpus are still 0.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[ Viresh: Added the Chinese translation as well and updated commit msg ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-02-09 13:18:49 +05:30
Zhou Qingyang
ab3824427b spi: spi-zynq-qspi: Fix a NULL pointer dereference in zynq_qspi_exec_mem_op()
In zynq_qspi_exec_mem_op(), kzalloc() is directly used in memset(),
which could lead to a NULL pointer dereference on failure of
kzalloc().

Fix this bug by adding a check of tmpbuf.

This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.

Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.

Builds with CONFIG_SPI_ZYNQ_QSPI=m show no new warnings,
and our static analyzer no longer warns about this code.

Fixes: 67dca5e580 ("spi: spi-mem: Add support for Zynq QSPI controller")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Link: https://lore.kernel.org/r/20211130172253.203700-1-zhou1615@umn.edu
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-08 13:37:50 +00:00
Sascha Hauer
2e8a8b5955 arm64: dts: rockchip: reorder rk3399 hdmi clocks
The binding specifies the clock order to "cec", "grf", "vpll". Reorder
the clocks accordingly.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20220126145549.617165-19-s.hauer@pengutronix.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-02-08 13:21:09 +01:00
Lorenzo Bianconi
ea85bf9064 iio: imu: st_lsm6dsx: wait for settling time in st_lsm6dsx_read_oneshot
We need to wait for sensor settling time (~ 3/ODR) before reading data
in st_lsm6dsx_read_oneshot routine in order to avoid corrupted samples.

Fixes: 290a6ce11d ("iio: imu: add support to lsm6dsx driver")
Reported-by: Mario Tesi <mario.tesi@st.com>
Tested-by: Mario Tesi <mario.tesi@st.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/b41ebda5535895298716c76d939f9f165fcd2d13.1644098120.git.lorenzo@kernel.org
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-02-07 20:45:49 +00:00
Hans de Goede
e3d13da7f7 platform/x86: asus-wmi: Fix regression when probing for fan curve control
The fan curve control patches introduced a regression for at least the
TUF FX506 and possibly other TUF series laptops that do not have support
for fan curve control.

As part of the probing process, asus_wmi_evaluate_method_buf is called
to get the factory default fan curve . The WMI management function
returns 0 on certain laptops to indicate lack of fan curve control
instead of ASUS_WMI_UNSUPPORTED_METHOD. This 0 is transformed to
-ENODATA which results in failure when probing.

Fixes: 0f0ac158d2 ("platform/x86: asus-wmi: Add support for custom fan curves")
Reported-and-tested-by: Abhijeet V <abhijeetviswa@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220205112840.33095-1-hdegoede@redhat.com
2022-02-05 14:46:08 +01:00
Hans de Goede
868d7618d7 platform/x86: thinkpad_acpi: Add dual-fan quirk for T15g (2nd gen)
The ThinkPad T15g Gen 2 has 2 fan, add a TPACPI_FAN_2CTL quirk entry for
it to the fan_quirk_table[] so that both fans can be controllerd.

Reported-and-tested-by: David Dreschner <david@dreschner.net>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220203103302.49401-1-hdegoede@redhat.com
2022-02-03 11:34:06 +01:00
Antony Antony
6d0d95a1c2 xfrm: fix the if_id check in changelink
if_id will be always 0, because it was not yet initialized.

Fixes: 8dce439195 ("xfrm: interface with if_id 0 should return error")
Reported-by: Pavel Machek <pavel@denx.de>
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-02-03 08:02:05 +01:00
Dave Jiang
9b818634f8 MAINTAINERS: update mailing list address for NTB subsystem
NTB mailing list is moving from linux-ntb@googlegroups.com to
ntb@lists.linux.dev in order to get better archive and lore support.
Update all entries in MAINTAINERS.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2022-02-02 17:34:18 -05:00
Jonathan Marek
7baa00bef3 arm64: dts: qcom: sm8450: fix apps_smmu interrupts
Update interrupts in apps_smmu to match downstream. This is fixes apps_smmu
failing to probe when running at EL2 (expects 96 context interrupts)

Fixes: 892d539539 ("arm64: dts: qcom: sm8450: add smmu nodes")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220122162932.7686-2-jonathan@marek.ca
2022-01-31 17:44:43 -06:00
Jonathan Marek
197769fede arm64: dts: qcom: sm8450: enable GCC_USB3_0_CLKREF_EN for usb
USB doesn't work at all without this clock enabled. This fixes USB when not
using clk_ignore_unused.

Fixes: 19fd04fb92 ("arm64: dts: qcom: sm8450: Add usb nodes")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220122162932.7686-1-jonathan@marek.ca
2022-01-31 17:44:43 -06:00
Bjorn Andersson
0fd4dcb607 arm64: dts: qcom: sm8350: Correct UFS symbol clocks
The introduction of '9a61f813fcc8 ("clk: qcom: regmap-mux: fix parent
clock lookup")' broke UFS support on SM8350.

The cause for this is that the symbol clocks have a specified rate in
the "freq-table-hz" table in the UFS node, which causes the UFS code to
request a rate change, for which the "bi_tcxo" happens to provide the
closest rate.  Prior to the change in regmap-mux it was determined
(incorrectly) that no change was needed and everything worked.

The rates of 75 and 300MHz matches the documentation for the symbol
clocks, but we don't represent the parent clocks today. So let's mimic
the configuration found in other platforms, by omitting the rate for the
symbol clocks as well to avoid the rate change.

While at it also fill in the dummy symbol clocks that was dropped from
the GCC driver as it was upstreamed.

Fixes: 59c7cf8147 ("arm64: dts: qcom: sm8350: Add UFS nodes")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20211222162058.3418902-1-bjorn.andersson@linaro.org
2022-01-31 17:44:38 -06:00
Frank Rowand
fa4300f060 of: unittest: update text of expected warnings
The text of various warning messages triggered by unittest has
changed.  Update the text of expected warnings to match.

The expected vs actual warnings are most easily seen by filtering
the boot console messages with the of_unittest_expect program at
https://github.com/frowand/dt_tools.git.  The filter prefixes
problem lines with '***', and prefixes lines that match expected
errors with 'ok '.  All other lines are prefixed with '   '.
Unrelated lines have been deleted in the following examples.

The mismatch appears as:

-> ### dt-test ### start of unittest - you will see error messages
      OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found 1
   ** of_unittest_expect WARNING - not found ---> OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found -1
      OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found 1
   ** of_unittest_expect WARNING - not found ---> OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found -1
      OF: /testcase-data/phandle-tests/consumer-b: #phandle-cells = 2 found 1
   ** of_unittest_expect WARNING - not found ---> OF: /testcase-data/phandle-tests/consumer-b: #phandle-cells = 2 found -1
      platform testcase-data:testcase-device2: error -ENXIO: IRQ index 0 not found
   ** of_unittest_expect WARNING - not found ---> platform testcase-data:testcase-device2: IRQ index 0 not found
   -> ### dt-test ### end of unittest - 254 passed, 0 failed
   ** EXPECT statistics:
   **
   **   EXPECT found          :   42
   **   EXPECT not found      :    4

With this commit applied, the mismatch is resolved:

   -> ### dt-test ### start of unittest - you will see error messages
   ok OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found 1
   ok OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found 1
   ok OF: /testcase-data/phandle-tests/consumer-b: #phandle-cells = 2 found 1
   ok platform testcase-data:testcase-device2: error -ENXIO: IRQ index 0 not found
   -> ### dt-test ### end of unittest - 254 passed, 0 failed
   ** EXPECT statistics:
   **
   **   EXPECT found          :   46
   **   EXPECT not found      :    0

Fixes: 2043727c28 ("driver core: platform: Make use of the helper function dev_err_probe()")
Fixes: 94a4950a4a ("of: base: Fix phandle argument length mismatch error message")
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220127192643.2534941-1-frowand.list@gmail.com
2022-01-31 08:35:57 -06:00
Miaoqian Lin
632fe0bb8c iio: Fix error handling for PM
The pm_runtime_enable will increase power disable depth.
If the probe fails, we should use pm_runtime_disable() to balance
pm_runtime_enable(). In the PM Runtime docs:
    Drivers in ->remove() callback should undo the runtime PM changes done
    in ->probe(). Usually this means calling pm_runtime_disable(),
    pm_runtime_dont_use_autosuspend() etc.
We should do this in error handling.

Fix this problem for the following drivers: bmc150, bmg160, kmx61,
kxcj-1013, mma9551, mma9553.

Fixes: 7d0ead5c3f ("iio: Reconcile operation order between iio_register/unregister and pm functions")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220106112309.16879-1-linmq006@gmail.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-30 14:50:34 +00:00
Cosmin Tanislav
4165456fe6 iio: addac: ad74413r: correct comparator gpio getters mask usage
The value of the GPIOs is currently altered using offsets rather
than masks. Make use of __assign_bit and the BIT macro to turn
the offsets into masks.

Fixes: fea251b6a5 ("iio: addac: add AD74413R driver")
Signed-off-by: Cosmin Tanislav <cosmin.tanislav@analog.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220111074703.3677392-2-cosmin.tanislav@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-30 14:23:33 +00:00
Cosmin Tanislav
8a3e4a5614 iio: addac: ad74413r: use ngpio size when iterating over mask
ngpio is the actual number of GPIOs handled by the GPIO chip,
as opposed to the max number of GPIOs.

Fixes: fea251b6a5 ("iio: addac: add AD74413R driver")
Signed-off-by: Cosmin Tanislav <cosmin.tanislav@analog.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220111074703.3677392-1-cosmin.tanislav@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-30 14:22:21 +00:00
Kees Cook
e7a3290d33 iio: addac: ad74413r: Do not reference negative array offsets
Instead of aiming rx_buf at an invalid array-boundary-crossing location,
just skip the first increment. Seen when building with -Warray-bounds:

drivers/iio/addac/ad74413r.c: In function 'ad74413r_update_scan_mode':
drivers/iio/addac/ad74413r.c:843:22: warning: array subscript -4 is below array bounds of 'u8[16]' { aka 'unsigned char[16]'} [-Warray-bounds]
  843 |         u8 *rx_buf = &st->adc_samples_buf.rx_buf[-1 * AD74413R_FRAME_SIZE];
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/iio/addac/ad74413r.c:84:20: note: while referencing 'rx_buf'
   84 |                 u8 rx_buf[AD74413R_FRAME_SIZE * AD74413R_CHANNEL_MAX];
      |                    ^~~~~~

Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Michael Hennerich <Michael.Hennerich@analog.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: linux-iio@vger.kernel.org
Fixes: fea251b6a5 ("iio: addac: add AD74413R driver")
Reviewed-by: Cosmin Tanislav <cosmin.tanislav@analog.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220112203456.3950884-1-keescook@chromium.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-30 14:19:20 +00:00
Christophe JAILLET
e0a2e37f30 iio: adc: men_z188_adc: Fix a resource leak in an error handling path
If iio_device_register() fails, a previous ioremap() is left unbalanced.

Update the error handling path and add the missing iounmap() call, as
already done in the remove function.

Fixes: 74aeac4da6 ("iio: adc: Add MEN 16z188 ADC driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/320fc777863880247c2aff4a9d1a54ba69abf080.1643445149.git.christophe.jaillet@wanadoo.fr
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-30 13:51:24 +00:00
Muhammad Usama Anjum
a1cba0e2de iio: frequency: admv1013: remove the always true condition
unsigned int variable is always greater than or equal to zero. Make the
if condition simple.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Fixes: da35a7b526 ("iio: frequency: admv1013: add support for ADMV1013")
Link: https://lore.kernel.org/r/YdS3gJYtECMaDDjA@debian-BULLSEYE-live-builder-AMD64
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-30 11:50:16 +00:00
Krzysztof Kozlowski
8fd9415042 arm64: dts: rockchip: align pl330 node name with dtschema
Fixes dtbs_check warnings like:

  dmac@ff240000: $nodename:0: 'dmac@ff240000' does not match '^dma-controller(@.*)?$'

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20220129175429.298836-1-krzysztof.kozlowski@canonical.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-01-30 10:02:00 +01:00
Jakob Unterwurzacher
62966cbdda arm64: dts: rockchip: fix rk3399-puma eMMC HS400 signal integrity
There are signal integrity issues running the eMMC at 200MHz on Puma
RK3399-Q7.

Similar to the work-around found for RK3399 Gru boards, lowering the
frequency to 100MHz made the eMMC much more stable, so let's lower the
frequency to 100MHz.

It might be possible to run at 150MHz as on RK3399 Gru boards but only
100MHz was extensively tested.

Cc: Quentin Schulz <foss+kernel@0leil.net>
Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
Signed-off-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Link: https://lore.kernel.org/r/20220119134948.1444965-1-quentin.schulz@theobroma-systems.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-01-30 10:01:59 +01:00
Peter Geis
ad02776cf8 arm64: dts: rockchip: fix Quartz64-A ddr regulator voltage
The Quartz64 Model A uses a voltage divider to ensure ddr voltage is
within specification from the default regulator configuration.
Adjusting this voltage is detrimental, and currently causes the ddr
voltage to be about 0.8v.

Remove the min and max voltage setpoints for the ddr regulator.

Fixes: b33a22a1e7 ("arm64: dts: rockchip: add basic dts for Pine64 Quartz64-A")
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Link: https://lore.kernel.org/r/20220128003809.3291407-2-pgwipeout@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-01-30 10:01:44 +01:00
Dave Jiang
d5081bf5dc ntb: intel: fix port config status offset for SPR
The field offset for port configuration status on SPR has been changed to
bit 14 from ICX where it resides at bit 12. By chance link status detection
continued to work on SPR. This is due to bit 12 being a configuration bit
which is in sync with the status bit. Fix this by checking for a SPR device
and checking correct status bit.

Fixes: 26bfe3d0b2 ("ntb: intel: Add Icelake (gen4) support for Intel NTB")
Tested-by: Jerry Dai <jerry.dai@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2022-01-28 10:19:16 -05:00
Jon Hunter
ebea268ea5 arm64: tegra: Disable ISO SMMU for Tegra194
Commit e762232f94 ("arm64: tegra: Add ISO SMMU controller for Tegra194")
added the ISO SMMU for display devices on Tegra194. The SMMU is enabled by
default but not hooked up to the display controllers yet because we do not
have a way to pass frame-buffer memory from the bootloader to the kernel.
However, even though the SMMU is not hooked up to the display controllers'
SMMU faults are being seen if a display is connected. Therefore, keep the
ISO SMMU disabled by default for now.

Fixes: e762232f94 ("arm64: tegra: Add ISO SMMU controller for Tegra194")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-01-27 18:35:39 +01:00
Dmitry Osipenko
22d7ee32f1 gpu: host1x: Fix hang on Tegra186+
Tegra186+ hangs if host1x hardware is disabled at a kernel boot time
because we touch hardware before runtime PM is resumed. Move sync point
assignment initialization to the RPM-resume callback. Older SoCs were
unaffected because they skip that sync point initialization.

Tested-by: Jon Hunter <jonathanh@nvidia.com> # T186
Reported-by: Jon Hunter <jonathanh@nvidia.com> # T186
Fixes: 6b6776e2ab ("gpu: host1x: Add initial runtime PM and OPP support")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-01-27 18:27:02 +01:00
Jiri Bohac
a6d95c5a62 Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6"
This reverts commit b515d26372.

Commit b515d26372 ("xfrm: xfrm_state_mtu
should return at least 1280 for ipv6") in v5.14 breaks the TCP MSS
calculation in ipsec transport mode, resulting complete stalls of TCP
connections. This happens when the (P)MTU is 1280 or slighly larger.

The desired formula for the MSS is:
MSS = (MTU - ESP_overhead) - IP header - TCP header

However, the above commit clamps the (MTU - ESP_overhead) to a
minimum of 1280, turning the formula into
MSS = max(MTU - ESP overhead, 1280) -  IP header - TCP header

With the (P)MTU near 1280, the calculated MSS is too large and the
resulting TCP packets never make it to the destination because they
are over the actual PMTU.

The above commit also causes suboptimal double fragmentation in
xfrm tunnel mode, as described in
https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/

The original problem the above commit was trying to fix is now fixed
by commit 6596a02295 ("xfrm: fix MTU
regression").

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-01-27 07:34:06 +01:00
Robin Murphy
31eeb6b09f arm64: dts: juno: Remove GICv2m dma-range
Although it is painstakingly honest to describe all 3 PCI windows in
"dma-ranges", it misses the the subtle distinction that the window for
the GICv2m range is normally programmed for Device memory attributes
rather than Normal Cacheable like the DRAM windows. Since MMU-401 only
offers stage 2 translation, this means that when the PCI SMMU is
enabled, accesses through that IPA range unexpectedly lose coherency if
mapped as cacheable at the SMMU, due to the attribute combining rules.
Since an extra 256KB is neither here nor there when we still have 10GB
worth of usable address space, rather than attempting to describe and
cope with this detail let's just remove the offending range. If the SMMU
is not used then it makes no difference anyway.

Link: https://lore.kernel.org/r/856c3f7192c6c3ce545ba67462f2ce9c86ed6b0c.1643046936.git.robin.murphy@arm.com
Fixes: 4ac4d146cb ("arm64: dts: juno: Describe PCI dma-ranges")
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-01-26 10:23:04 +00:00
Yan Yan
e03c3bba35 xfrm: Fix xfrm migrate issues when address family changes
xfrm_migrate cannot handle address family change of an xfrm_state.
The symptons are the xfrm_state will be migrated to a wrong address,
and sending as well as receiving packets wil be broken.

This commit fixes it by breaking the original xfrm_state_clone
method into two steps so as to update the props.family before
running xfrm_init_state. As the result, xfrm_state's inner mode,
outer mode, type and IP header length in xfrm_state_migrate can
be updated with the new address family.

Tested with additions to Android's kernel unit test suite:
https://android-review.googlesource.com/c/kernel/tests/+/1885354

Signed-off-by: Yan Yan <evitayan@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-01-26 07:44:02 +01:00
Yan Yan
c1aca3080e xfrm: Check if_id in xfrm_migrate
This patch enables distinguishing SAs and SPs based on if_id during
the xfrm_migrate flow. This ensures support for xfrm interfaces
throughout the SA/SP lifecycle.

When there are multiple existing SPs with the same direction,
the same xfrm_selector and different endpoint addresses,
xfrm_migrate might fail with ENODATA.

Specifically, the code path for performing xfrm_migrate is:
  Stage 1: find policy to migrate with
    xfrm_migrate_policy_find(sel, dir, type, net)
  Stage 2: find and update state(s) with
    xfrm_migrate_state_find(mp, net)
  Stage 3: update endpoint address(es) of template(s) with
    xfrm_policy_migrate(pol, m, num_migrate)

Currently "Stage 1" always returns the first xfrm_policy that
matches, and "Stage 3" looks for the xfrm_tmpl that matches the
old endpoint address. Thus if there are multiple xfrm_policy
with same selector, direction, type and net, "Stage 1" might
rertun a wrong xfrm_policy and "Stage 3" will fail with ENODATA
because it cannot find a xfrm_tmpl with the matching endpoint
address.

The fix is to allow userspace to pass an if_id and add if_id
to the matching rule in Stage 1 and Stage 2 since if_id is a
unique ID for xfrm_policy and xfrm_state. For compatibility,
if_id will only be checked if the attribute is set.

Tested with additions to Android's kernel unit test suite:
https://android-review.googlesource.com/c/kernel/tests/+/1668886

Signed-off-by: Yan Yan <evitayan@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-01-26 07:44:01 +01:00
Jiri Bohac
6596a02295 xfrm: fix MTU regression
Commit 749439bfac ("ipv6: fix udpv6
sendmsg crash caused by too small MTU") breaks PMTU for xfrm.

A Packet Too Big ICMPv6 message received in response to an ESP
packet will prevent all further communication through the tunnel
if the reported MTU minus the ESP overhead is smaller than 1280.

E.g. in a case of a tunnel-mode ESP with sha256/aes the overhead
is 92 bytes. Receiving a PTB with MTU of 1371 or less will result
in all further packets in the tunnel dropped. A ping through the
tunnel fails with "ping: sendmsg: Invalid argument".

Apparently the MTU on the xfrm route is smaller than 1280 and
fails the check inside ip6_setup_cork() added by 749439bf.

We found this by debugging USGv6/ipv6ready failures. Failing
tests are: "Phase-2 Interoperability Test Scenario IPsec" /
5.3.11 and 5.4.11 (Tunnel Mode: Fragmentation).

Commit b515d26372 ("xfrm:
xfrm_state_mtu should return at least 1280 for ipv6") attempted
to fix this but caused another regression in TCP MSS calculations
and had to be reverted.

The patch below fixes the situation by dropping the MTU
check and instead checking for the underflows described in the
749439bf commit message.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Fixes: 749439bfac ("ipv6: fix udpv6 sendmsg crash caused by too small MTU")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-01-24 16:51:36 +01:00
Gustavo A. R. Silva
305325688f NTB/msi: Use struct_size() helper in devm_kzalloc()
Make use of the struct_size() helper instead of an open-coded version,
in order to avoid any potential type mistakes or integer overflows that,
in the worst scenario, could lead to heap overflows.

Also, address the following sparse warnings:
drivers/ntb/msi.c:46:23: warning: using sizeof on a flexible structure

Link: https://github.com/KSPP/linux/issues/174
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2022-01-23 16:15:15 -05:00
Sean Nyekjaer
ccbed9d8d2 iio: accel: fxls8962af: add padding to regmap for SPI
Add missing don't care padding between address and
data for SPI transfers

Fixes: a3e0b51884 ("iio: accel: add support for FXLS8962AF/FXLS8964AF accelerometers")
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Link: https://lore.kernel.org/r/20211220125144.3630539-1-sean@geanix.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-23 18:23:48 +00:00
Nuno Sá
b0e85f95e3 iio:imu:adis16480: fix buffering for devices with no burst mode
The trigger handler defined in the driver assumes that burst mode is
being used. Hence, for devices that do not support it, we have to use
the adis library default trigger implementation.

Tested-by: Julia Pineda <julia.pineda@analog.com>
Fixes: 941f130881 ("iio: adis16480: support burst read function")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20220114132608.241-1-nuno.sa@analog.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-23 18:23:48 +00:00
Cosmin Tanislav
0e33d15f1d iio: adc: ad7124: fix mask used for setting AIN_BUFP & AIN_BUFM bits
According to page 90 of the datasheet [1], AIN_BUFP is bit 6 and
AIN_BUFM is bit 5 of the CONFIG_0 -> CONFIG_7 registers.

Fix the mask used for setting these bits.

[1]: https://www.analog.com/media/en/technical-documentation/data-sheets/ad7124-8.pdf

Fixes: 0eaecea6e4 ("iio: adc: ad7124: Add buffered input support")
Signed-off-by: Cosmin Tanislav <cosmin.tanislav@analog.com>
Link: https://lore.kernel.org/r/20220112200036.694490-1-cosmin.tanislav@analog.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-23 18:23:48 +00:00
Oleksij Rempel
b7a78a8ada iio: adc: tsc2046: fix memory corruption by preventing array overflow
On one side we have indio_dev->num_channels includes all physical channels +
timestamp channel. On other side we have an array allocated only for
physical channels. So, fix memory corruption by ARRAY_SIZE() instead of
num_channels variable.

Note the first case is a cleanup rather than a fix as the software
timestamp channel bit in active_scanmask is never set by the IIO core.

Fixes: 9374e8f5a3 ("iio: adc: add ADC driver for the TI TSC2046 controller")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20220107081401.2816357-1-o.rempel@pengutronix.de
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-01-23 18:23:48 +00:00
Brian Norris
b5fbaf7d77 arm64: dts: rockchip: Switch RK3399-Gru DP to SPDIF output
Commit b18c6c3c77 ("ASoC: rockchip: cdn-dp sound output use spdif")
switched the platform to SPDIF, but we didn't fix up the device tree.

Drop the pinctrl settings, because the 'spdif_bus' pins are either:
 * unused (on kevin, bob), so the settings is ~harmless
 * used by a different function (on scarlet), which causes probe
   failures (!!)

Fixes: b18c6c3c77 ("ASoC: rockchip: cdn-dp sound output use spdif")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Link: https://lore.kernel.org/r/20220114150129.v2.1.I46f64b00508d9dff34abe1c3e8d2defdab4ea1e5@changeid
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-01-23 15:35:42 +01:00
Quentin Schulz
ed2c66a95c arm64: dts: rockchip: fix rk3399-puma-haikou USB OTG mode
The micro USB3.0 port available on the Haikou evaluation kit for Puma
RK3399-Q7 SoM supports dual-role model (aka drd or OTG) but its support
was broken until now because of missing logic around the ID pin.

This adds proper support for USB OTG on Puma Haikou by "connecting" the
GPIO used for USB ID to the USB3 controller device.

Cc: Quentin Schulz <foss+kernel@0leil.net>
Signed-off-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Link: https://lore.kernel.org/r/20220120125156.16217-1-quentin.schulz@theobroma-systems.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-01-23 15:21:00 +01:00
Frank Wunderlich
85a8bccfa9 arm64: dts: rockchip: drop pclk_xpcs from gmac0 on rk3568
pclk_xpcs is not supported by mainline driver and breaks dtbs_check

following warnings occour, and many more

rk3568-evb1-v10.dt.yaml: ethernet@fe2a0000: clocks:
    [[15, 386], [15, 389], [15, 389], [15, 184], [15, 180], [15, 181],
    [15, 389], [15, 185], [15, 172]] is too long
	From schema: Documentation/devicetree/bindings/net/snps,dwmac.yaml
rk3568-evb1-v10.dt.yaml: ethernet@fe2a0000: clock-names:
    ['stmmaceth', 'mac_clk_rx', 'mac_clk_tx', 'clk_mac_refout', 'aclk_mac',
    'pclk_mac', 'clk_mac_speed', 'ptp_ref', 'pclk_xpcs'] is too long
	From schema: Documentation/devicetree/bindings/net/snps,dwmac.yaml

after removing it, the clock and other warnings are gone.

pclk_xpcs on gmac is used to support QSGMII, but this requires a driver
supporting it.
Once xpcs support is introduced, the clock can be added to the
documentation and both controllers.

Fixes: b8d41e5053 ("arm64: dts: rockchip: add gmac0 node to rk3568")
Co-developed-by: Peter Geis <pgwipeout@gmail.com>
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Acked-by: Michael Riesch <michael.riesch@wolfvision.net>
Link: https://lore.kernel.org/r/20220123133510.135651-1-linux@fw-web.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-01-23 14:56:42 +01:00
Frank Wunderlich
2ddd96aadb arm64: dts: rockchip: fix dma-controller node names on rk356x
DMA-Cotrollers defined in rk356x.dtsi do not match the pattern in bindings.

arch/arm64/boot/dts/rockchip/rk3568-evb1-v10.dt.yaml:
  dmac@fe530000: $nodename:0: 'dmac@fe530000' does not match '^dma-controller(@.*)?$'
	From schema: Documentation/devicetree/bindings/dma/arm,pl330.yaml
arch/arm64/boot/dts/rockchip/rk3568-evb1-v10.dt.yaml:
  dmac@fe550000: $nodename:0: 'dmac@fe550000' does not match '^dma-controller(@.*)?$'
	From schema: Documentation/devicetree/bindings/dma/arm,pl330.yaml

This Patch fixes it.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Link: https://lore.kernel.org/r/20220123133615.135789-1-linux@fw-web.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2022-01-23 14:56:04 +01:00
Stanislav Jakubek
fc5a40694b Revert "dt-bindings: arm: qcom: Document SDX65 platform and boards"
This reverts commit 3b338c9a6a.

This was a duplicate of 61339f368d,
causing the sdx65 compatible and its board to be documented twice.

Signed-off-by: Stanislav Jakubek <stano.jakubek@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211223173339.GA3925@standask-GA-A55M-S2HP
2021-12-23 12:00:24 -06:00
742 changed files with 7748 additions and 3292 deletions

View File

@@ -187,6 +187,8 @@ Jiri Slaby <jirislaby@kernel.org> <jslaby@novell.com>
Jiri Slaby <jirislaby@kernel.org> <jslaby@suse.com>
Jiri Slaby <jirislaby@kernel.org> <jslaby@suse.cz>
Jiri Slaby <jirislaby@kernel.org> <xslaby@fi.muni.cz>
Jisheng Zhang <jszhang@kernel.org> <jszhang@marvell.com>
Jisheng Zhang <jszhang@kernel.org> <Jisheng.Zhang@synaptics.com>
Johan Hovold <johan@kernel.org> <jhovold@gmail.com>
Johan Hovold <johan@kernel.org> <johan@hovoldconsulting.com>
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
@@ -216,6 +218,7 @@ Koushik <raghavendra.koushik@neterion.com>
Krishna Manikandan <quic_mkrishn@quicinc.com> <mkrishn@codeaurora.org>
Krzysztof Kozlowski <krzk@kernel.org> <k.kozlowski.k@gmail.com>
Krzysztof Kozlowski <krzk@kernel.org> <k.kozlowski@samsung.com>
Krzysztof Kozlowski <krzk@kernel.org> <krzysztof.kozlowski@canonical.com>
Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Kuogee Hsieh <quic_khsieh@quicinc.com> <khsieh@codeaurora.org>
Leonardo Bras <leobras.c@gmail.com> <leonardo@linux.ibm.com>
@@ -333,6 +336,9 @@ Rémi Denis-Courmont <rdenis@simphalempin.com>
Ricardo Ribalda <ribalda@kernel.org> <ricardo@ribalda.com>
Ricardo Ribalda <ribalda@kernel.org> Ricardo Ribalda Delgado <ribalda@kernel.org>
Ricardo Ribalda <ribalda@kernel.org> <ricardo.ribalda@gmail.com>
Roman Gushchin <roman.gushchin@linux.dev> <guro@fb.com>
Roman Gushchin <roman.gushchin@linux.dev> <guroan@gmail.com>
Roman Gushchin <roman.gushchin@linux.dev> <klamm@yandex-team.ru>
Ross Zwisler <zwisler@kernel.org> <ross.zwisler@linux.intel.com>
Rudolf Marek <R.Marek@sh.cvut.cz>
Rui Saraiva <rmps@joel.ist.utl.pt>

View File

@@ -895,6 +895,12 @@ S: 3000 FORE Drive
S: Warrendale, Pennsylvania 15086
S: USA
N: Ludovic Desroches
E: ludovic.desroches@microchip.com
D: Maintainer for ARM/Microchip (AT91) SoC support
D: Author of ADC, pinctrl, XDMA and SDHCI drivers for this platform
S: France
N: Martin Devera
E: devik@cdi.cz
W: http://luxik.cdi.cz/~devik/qos/

View File

@@ -60,8 +60,8 @@ privileged data touched during the speculative execution.
Spectre variant 1 attacks take advantage of speculative execution of
conditional branches, while Spectre variant 2 attacks use speculative
execution of indirect branches to leak privileged memory.
See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[7] <spec_ref7>`
:ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[6] <spec_ref6>`
:ref:`[7] <spec_ref7>` :ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
Spectre variant 1 (Bounds Check Bypass)
---------------------------------------
@@ -131,6 +131,19 @@ steer its indirect branch speculations to gadget code, and measure the
speculative execution's side effects left in level 1 cache to infer the
victim's data.
Yet another variant 2 attack vector is for the attacker to poison the
Branch History Buffer (BHB) to speculatively steer an indirect branch
to a specific Branch Target Buffer (BTB) entry, even if the entry isn't
associated with the source address of the indirect branch. Specifically,
the BHB might be shared across privilege levels even in the presence of
Enhanced IBRS.
Currently the only known real-world BHB attack vector is via
unprivileged eBPF. Therefore, it's highly recommended to not enable
unprivileged eBPF, especially when eIBRS is used (without retpolines).
For a full mitigation against BHB attacks, it's recommended to use
retpolines (or eIBRS combined with retpolines).
Attack scenarios
----------------
@@ -364,13 +377,15 @@ The possible values in this file are:
- Kernel status:
==================================== =================================
'Not affected' The processor is not vulnerable
'Vulnerable' Vulnerable, no mitigation
'Mitigation: Full generic retpoline' Software-focused mitigation
'Mitigation: Full AMD retpoline' AMD-specific software mitigation
'Mitigation: Enhanced IBRS' Hardware-focused mitigation
==================================== =================================
======================================== =================================
'Not affected' The processor is not vulnerable
'Mitigation: None' Vulnerable, no mitigation
'Mitigation: Retpolines' Use Retpoline thunks
'Mitigation: LFENCE' Use LFENCE instructions
'Mitigation: Enhanced IBRS' Hardware-focused mitigation
'Mitigation: Enhanced IBRS + Retpolines' Hardware-focused + Retpolines
'Mitigation: Enhanced IBRS + LFENCE' Hardware-focused + LFENCE
======================================== =================================
- Firmware status: Show if Indirect Branch Restricted Speculation (IBRS) is
used to protect against Spectre variant 2 attacks when calling firmware (x86 only).
@@ -583,12 +598,13 @@ kernel command line.
Specific mitigations can also be selected manually:
retpoline
replace indirect branches
retpoline,generic
google's original retpoline
retpoline,amd
AMD-specific minimal thunk
retpoline auto pick between generic,lfence
retpoline,generic Retpolines
retpoline,lfence LFENCE; indirect branch
retpoline,amd alias for retpoline,lfence
eibrs enhanced IBRS
eibrs,retpoline enhanced IBRS + Retpolines
eibrs,lfence enhanced IBRS + LFENCE
Not specifying this option is equivalent to
spectre_v2=auto.
@@ -599,7 +615,7 @@ kernel command line.
spectre_v2=off. Spectre variant 1 mitigations
cannot be disabled.
For spectre_v2_user see :doc:`/admin-guide/kernel-parameters`.
For spectre_v2_user see Documentation/admin-guide/kernel-parameters.txt
Mitigation selection guide
--------------------------
@@ -681,7 +697,7 @@ AMD white papers:
.. _spec_ref6:
[6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/90343-B_SoftwareTechniquesforManagingSpeculation_WP_7-18Update_FNL.pdf>`_.
[6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf>`_.
ARM white papers:

View File

@@ -5361,8 +5361,12 @@
Specific mitigations can also be selected manually:
retpoline - replace indirect branches
retpoline,generic - google's original retpoline
retpoline,amd - AMD-specific minimal thunk
retpoline,generic - Retpolines
retpoline,lfence - LFENCE; indirect branch
retpoline,amd - alias for retpoline,lfence
eibrs - enhanced IBRS
eibrs,retpoline - enhanced IBRS + Retpolines
eibrs,lfence - enhanced IBRS + LFENCE
Not specifying this option is equivalent to
spectre_v2=auto.

View File

@@ -23,7 +23,7 @@ There are four components to pagemap:
* Bit 56 page exclusively mapped (since 4.2)
* Bit 57 pte is uffd-wp write-protected (since 5.13) (see
:ref:`Documentation/admin-guide/mm/userfaultfd.rst <userfaultfd>`)
* Bits 57-60 zero
* Bits 58-60 zero
* Bit 61 page is file-page or shared-anon (since 3.5)
* Bit 62 page swapped
* Bit 63 page present

View File

@@ -75,6 +75,9 @@ And optionally
.resume - A pointer to a per-policy resume function which is called
with interrupts disabled and _before_ the governor is started again.
.ready - A pointer to a per-policy ready function which is called after
the policy is fully initialized.
.attr - A pointer to a NULL-terminated list of "struct freq_attr" which
allow to export values to sysfs.

View File

@@ -8,7 +8,8 @@ title: Atmel AT91 device tree bindings.
maintainers:
- Alexandre Belloni <alexandre.belloni@bootlin.com>
- Ludovic Desroches <ludovic.desroches@microchip.com>
- Claudiu Beznea <claudiu.beznea@microchip.com>
- Nicolas Ferre <nicolas.ferre@microchip.com>
description: |
Boards with a SoC of the Atmel AT91 or SMART family shall have the following

View File

@@ -8,7 +8,7 @@ Required properties:
- compatible: Should contain a chip-specific compatible string,
Chip-specific strings are of the form "fsl,<chip>-dcfg",
The following <chip>s are known to be supported:
ls1012a, ls1021a, ls1043a, ls1046a, ls2080a.
ls1012a, ls1021a, ls1043a, ls1046a, ls2080a, lx2160a
- reg : should contain base address and length of DCFG memory-mapped registers

View File

@@ -48,7 +48,6 @@ description: |
sdx65
sm7225
sm8150
sdx65
sm8250
sm8350
sm8450
@@ -228,11 +227,6 @@ properties:
- qcom,sdx65-mtp
- const: qcom,sdx65
- items:
- enum:
- qcom,sdx65-mtp
- const: qcom,sdx65
- items:
- enum:
- qcom,ipq6018-cp01

View File

@@ -44,6 +44,7 @@ Required properties:
* "fsl,ls1046a-clockgen"
* "fsl,ls1088a-clockgen"
* "fsl,ls2080a-clockgen"
* "fsl,lx2160a-clockgen"
Chassis-version clock strings include:
* "fsl,qoriq-clockgen-1.0": for chassis 1.0 clocks
* "fsl,qoriq-clockgen-2.0": for chassis 2.0 clocks

View File

@@ -91,22 +91,7 @@ properties:
$ref: /schemas/graph.yaml#/$defs/port-base
unevaluatedProperties: false
description:
MIPI DSI/DPI input.
properties:
endpoint:
$ref: /schemas/media/video-interfaces.yaml#
type: object
additionalProperties: false
properties:
remote-endpoint: true
bus-type:
enum: [1, 5]
default: 1
data-lanes: true
Video port for MIPI DSI input.
port@1:
$ref: /schemas/graph.yaml#/properties/port
@@ -155,8 +140,6 @@ examples:
reg = <0>;
anx7625_in: endpoint {
remote-endpoint = <&mipi_dsi>;
bus-type = <5>;
data-lanes = <0 1 2 3>;
};
};

View File

@@ -7,7 +7,6 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: SiFive GPIO controller
maintainers:
- Yash Shah <yash.shah@sifive.com>
- Paul Walmsley <paul.walmsley@sifive.com>
properties:

View File

@@ -39,7 +39,7 @@ patternProperties:
'^phy@[a-f0-9]+$':
$ref: ../phy/bcm-ns-usb2-phy.yaml
'^pin-controller@[a-f0-9]+$':
'^pinctrl@[a-f0-9]+$':
$ref: ../pinctrl/brcm,ns-pinmux.yaml
'^syscon@[a-f0-9]+$':
@@ -94,7 +94,7 @@ examples:
reg = <0x180 0x4>;
};
pin-controller@1c0 {
pinctrl@1c0 {
compatible = "brcm,bcm4708-pinmux";
reg = <0x1c0 0x24>;
reg-names = "cru_gpio_control";

View File

@@ -126,7 +126,7 @@ properties:
clock-frequency:
const: 12288000
lochnagar-pinctrl:
pinctrl:
type: object
$ref: /schemas/pinctrl/cirrus,lochnagar.yaml#
@@ -255,7 +255,7 @@ required:
- reg
- reset-gpios
- lochnagar-clk
- lochnagar-pinctrl
- pinctrl
additionalProperties: false
@@ -293,7 +293,7 @@ examples:
clock-frequency = <32768>;
};
lochnagar-pinctrl {
pinctrl {
compatible = "cirrus,lochnagar-pinctrl";
gpio-controller;

View File

@@ -20,7 +20,7 @@ description: |
maintainers:
- Kishon Vijay Abraham I <kishon@ti.com>
- Roger Quadros <rogerq@ti.com
- Roger Quadros <rogerq@kernel.org>
properties:
compatible:

View File

@@ -8,7 +8,7 @@ title: OMAP USB2 PHY
maintainers:
- Kishon Vijay Abraham I <kishon@ti.com>
- Roger Quadros <rogerq@ti.com>
- Roger Quadros <rogerq@kernel.org>
properties:
compatible:

View File

@@ -37,6 +37,12 @@ properties:
max bit rate supported in bps
minimum: 1
mux-states:
description:
mux controller node to route the signals from controller to
transceiver.
maxItems: 1
required:
- compatible
- '#phy-cells'
@@ -53,4 +59,5 @@ examples:
max-bitrate = <5000000>;
standby-gpios = <&wakeup_gpio1 16 GPIO_ACTIVE_LOW>;
enable-gpios = <&main_gpio1 67 GPIO_ACTIVE_HIGH>;
mux-states = <&mux0 1>;
};

View File

@@ -107,9 +107,6 @@ properties:
additionalProperties: false
allOf:
- $ref: "pinctrl.yaml#"
required:
- pinctrl-0
- pinctrl-names

View File

@@ -8,7 +8,6 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: SiFive PWM controller
maintainers:
- Yash Shah <yash.shah@sifive.com>
- Sagar Kadam <sagar.kadam@sifive.com>
- Paul Walmsley <paul.walmsley@sifive.com>

View File

@@ -9,7 +9,6 @@ title: SiFive L2 Cache Controller
maintainers:
- Sagar Kadam <sagar.kadam@sifive.com>
- Yash Shah <yash.shah@sifive.com>
- Paul Walmsley <paul.walmsley@sifive.com>
description:

View File

@@ -53,6 +53,7 @@ properties:
- const: st,stm32mp15-hsotg
- const: snps,dwc2
- const: samsung,s3c6400-hsotg
- const: intel,socfpga-agilex-hsotg
reg:
maxItems: 1

View File

@@ -7,7 +7,7 @@ $schema: "http://devicetree.org/meta-schemas/core.yaml#"
title: Bindings for the TI wrapper module for the Cadence USBSS-DRD controller
maintainers:
- Roger Quadros <rogerq@ti.com>
- Roger Quadros <rogerq@kernel.org>
properties:
compatible:

View File

@@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: TI Keystone Soc USB Controller
maintainers:
- Roger Quadros <rogerq@ti.com>
- Roger Quadros <rogerq@kernel.org>
properties:
compatible:

View File

@@ -2,7 +2,7 @@
Set the histogram bucket size (default *1*).
**-e**, **--entries** *N*
**-E**, **--entries** *N*
Set the number of entries of the histogram (default 256).

View File

@@ -1,7 +1,7 @@
The **rtla osnoise** tool is an interface for the *osnoise* tracer. The
*osnoise* tracer dispatches a kernel thread per-cpu. These threads read the
time in a loop while with preemption, softirq and IRQs enabled, thus
allowing all the sources of operating systme noise during its execution.
allowing all the sources of operating system noise during its execution.
The *osnoise*'s tracer threads take note of the delta between each time
read, along with an interference counter of all sources of interference.
At the end of each period, the *osnoise* tracer displays a summary of

View File

@@ -36,7 +36,7 @@ default). The reason for reducing the runtime is to avoid starving the
**rtla** tool. The tool is also set to run for *one minute*. The output
histogram is set to group outputs in buckets of *10us* and *25* entries::
[root@f34 ~/]# rtla osnoise hist -P F:1 -c 0-11 -r 900000 -d 1M -b 10 -e 25
[root@f34 ~/]# rtla osnoise hist -P F:1 -c 0-11 -r 900000 -d 1M -b 10 -E 25
# RTLA osnoise histogram
# Time unit is microseconds (us)
# Duration: 0 00:01:00

View File

@@ -84,6 +84,8 @@ CPUfreq核心层注册一个cpufreq_driver结构体。
.resume - 一个指向per-policy恢复函数的指针该函数在关中断且在调节器再一次启动前被
调用。
.ready - 一个指向per-policy准备函数的指针该函数在策略完全初始化之后被调用。
.attr - 一个指向NULL结尾的"struct freq_attr"列表的指针,该列表允许导出值到
sysfs。

View File

@@ -1394,7 +1394,7 @@ documentation when it pops into existence).
-------------------
:Capability: KVM_CAP_ENABLE_CAP
:Architectures: mips, ppc, s390
:Architectures: mips, ppc, s390, x86
:Type: vcpu ioctl
:Parameters: struct kvm_enable_cap (in)
:Returns: 0 on success; -1 on error
@@ -6997,6 +6997,20 @@ indicated by the fd to the VM this is called on.
This is intended to support intra-host migration of VMs between userspace VMMs,
upgrading the VMM process without interrupting the guest.
7.30 KVM_CAP_PPC_AIL_MODE_3
-------------------------------
:Capability: KVM_CAP_PPC_AIL_MODE_3
:Architectures: ppc
:Type: vm
This capability indicates that the kernel supports the mode 3 setting for the
"Address Translation Mode on Interrupt" aka "Alternate Interrupt Location"
resource that is controlled with the H_SET_MODE hypercall.
This capability allows a guest kernel to use a better-performance mode for
handling interrupts and system calls.
8. Other capabilities.
======================

View File

@@ -2254,7 +2254,7 @@ F: drivers/phy/mediatek/
ARM/Microchip (AT91) SoC support
M: Nicolas Ferre <nicolas.ferre@microchip.com>
M: Alexandre Belloni <alexandre.belloni@bootlin.com>
M: Ludovic Desroches <ludovic.desroches@microchip.com>
M: Claudiu Beznea <claudiu.beznea@microchip.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Supported
W: http://www.linux4sam.org
@@ -2572,7 +2572,7 @@ F: sound/soc/rockchip/
N: rockchip
ARM/SAMSUNG S3C, S5P AND EXYNOS ARM ARCHITECTURES
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
R: Alim Akhtar <alim.akhtar@samsung.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: linux-samsung-soc@vger.kernel.org
@@ -2739,7 +2739,7 @@ N: stm32
N: stm
ARM/Synaptics SoC support
M: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
M: Jisheng Zhang <jszhang@kernel.org>
M: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
@@ -3905,7 +3905,7 @@ M: Scott Branden <sbranden@broadcom.com>
M: bcm-kernel-feedback-list@broadcom.com
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
T: git git://github.com/broadcom/cygnus-linux.git
T: git git://github.com/broadcom/stblinux.git
F: arch/arm64/boot/dts/broadcom/northstar2/*
F: arch/arm64/boot/dts/broadcom/stingray/*
F: drivers/clk/bcm/clk-ns*
@@ -4913,7 +4913,8 @@ F: kernel/cgroup/cpuset.c
CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)
M: Johannes Weiner <hannes@cmpxchg.org>
M: Michal Hocko <mhocko@kernel.org>
M: Vladimir Davydov <vdavydov.dev@gmail.com>
M: Roman Gushchin <roman.gushchin@linux.dev>
M: Shakeel Butt <shakeelb@google.com>
L: cgroups@vger.kernel.org
L: linux-mm@kvack.org
S: Maintained
@@ -7011,12 +7012,6 @@ L: linux-edac@vger.kernel.org
S: Maintained
F: drivers/edac/sb_edac.c
EDAC-SIFIVE
M: Yash Shah <yash.shah@sifive.com>
L: linux-edac@vger.kernel.org
S: Supported
F: drivers/edac/sifive_edac.c
EDAC-SKYLAKE
M: Tony Luck <tony.luck@intel.com>
L: linux-edac@vger.kernel.org
@@ -7749,8 +7744,7 @@ M: Qiang Zhao <qiang.zhao@nxp.com>
L: linuxppc-dev@lists.ozlabs.org
S: Maintained
F: drivers/soc/fsl/qe/
F: include/soc/fsl/*qe*.h
F: include/soc/fsl/*ucc*.h
F: include/soc/fsl/qe/
FREESCALE QUICC ENGINE UCC ETHERNET DRIVER
M: Li Yang <leoyang.li@nxp.com>
@@ -7781,6 +7775,7 @@ F: Documentation/devicetree/bindings/misc/fsl,dpaa2-console.yaml
F: Documentation/devicetree/bindings/soc/fsl/
F: drivers/soc/fsl/
F: include/linux/fsl/
F: include/soc/fsl/
FREESCALE SOC FS_ENET DRIVER
M: Pantelis Antoniou <pantelis.antoniou@gmail.com>
@@ -11680,7 +11675,7 @@ F: drivers/iio/proximity/mb1232.c
MAXIM MAX17040 FAMILY FUEL GAUGE DRIVERS
R: Iskren Chernev <iskren.chernev@gmail.com>
R: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
R: Krzysztof Kozlowski <krzk@kernel.org>
R: Marek Szyprowski <m.szyprowski@samsung.com>
R: Matheus Castello <matheus@castello.eng.br>
L: linux-pm@vger.kernel.org
@@ -11690,7 +11685,7 @@ F: drivers/power/supply/max17040_battery.c
MAXIM MAX17042 FAMILY FUEL GAUGE DRIVERS
R: Hans de Goede <hdegoede@redhat.com>
R: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
R: Krzysztof Kozlowski <krzk@kernel.org>
R: Marek Szyprowski <m.szyprowski@samsung.com>
R: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
R: Purism Kernel Team <kernel@puri.sm>
@@ -11735,7 +11730,7 @@ F: Documentation/devicetree/bindings/power/supply/maxim,max77976.yaml
F: drivers/power/supply/max77976_charger.c
MAXIM MUIC CHARGER DRIVERS FOR EXYNOS BASED BOARDS
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
L: linux-pm@vger.kernel.org
S: Supported
@@ -11744,7 +11739,7 @@ F: drivers/power/supply/max77693_charger.c
MAXIM PMIC AND MUIC DRIVERS FOR EXYNOS BASED BOARDS
M: Chanwoo Choi <cw00.choi@samsung.com>
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
L: linux-kernel@vger.kernel.org
S: Supported
@@ -12433,7 +12428,7 @@ F: include/linux/memblock.h
F: mm/memblock.c
MEMORY CONTROLLER DRIVERS
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl.git
@@ -13381,6 +13376,7 @@ F: net/core/drop_monitor.c
NETWORKING DRIVERS
M: "David S. Miller" <davem@davemloft.net>
M: Jakub Kicinski <kuba@kernel.org>
M: Paolo Abeni <pabeni@redhat.com>
L: netdev@vger.kernel.org
S: Maintained
Q: https://patchwork.kernel.org/project/netdevbpf/list/
@@ -13427,6 +13423,7 @@ F: tools/testing/selftests/drivers/net/dsa/
NETWORKING [GENERAL]
M: "David S. Miller" <davem@davemloft.net>
M: Jakub Kicinski <kuba@kernel.org>
M: Paolo Abeni <pabeni@redhat.com>
L: netdev@vger.kernel.org
S: Maintained
Q: https://patchwork.kernel.org/project/netdevbpf/list/
@@ -13570,7 +13567,7 @@ F: include/uapi/linux/nexthop.h
F: net/ipv4/nexthop.c
NFC SUBSYSTEM
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-nfc@lists.01.org (subscribers-only)
L: netdev@vger.kernel.org
S: Maintained
@@ -13704,7 +13701,7 @@ F: scripts/nsdeps
NTB AMD DRIVER
M: Sanjay R Mehta <sanju.mehta@amd.com>
M: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
L: linux-ntb@googlegroups.com
L: ntb@lists.linux.dev
S: Supported
F: drivers/ntb/hw/amd/
@@ -13712,7 +13709,7 @@ NTB DRIVER CORE
M: Jon Mason <jdmason@kudzu.us>
M: Dave Jiang <dave.jiang@intel.com>
M: Allen Hubbe <allenbh@gmail.com>
L: linux-ntb@googlegroups.com
L: ntb@lists.linux.dev
S: Supported
W: https://github.com/jonmason/ntb/wiki
T: git git://github.com/jonmason/ntb.git
@@ -13724,13 +13721,13 @@ F: tools/testing/selftests/ntb/
NTB IDT DRIVER
M: Serge Semin <fancer.lancer@gmail.com>
L: linux-ntb@googlegroups.com
L: ntb@lists.linux.dev
S: Supported
F: drivers/ntb/hw/idt/
NTB INTEL DRIVER
M: Dave Jiang <dave.jiang@intel.com>
L: linux-ntb@googlegroups.com
L: ntb@lists.linux.dev
S: Supported
W: https://github.com/davejiang/linux/wiki
T: git https://github.com/davejiang/linux.git
@@ -13884,7 +13881,7 @@ F: Documentation/devicetree/bindings/regulator/nxp,pf8x00-regulator.yaml
F: drivers/regulator/pf8x00-regulator.c
NXP PTN5150A CC LOGIC AND EXTCON DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/extcon/extcon-ptn5150.yaml
@@ -15310,7 +15307,7 @@ F: drivers/pinctrl/renesas/
PIN CONTROLLER - SAMSUNG
M: Tomasz Figa <tomasz.figa@gmail.com>
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Sylwester Nawrocki <s.nawrocki@samsung.com>
R: Alim Akhtar <alim.akhtar@samsung.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
@@ -15573,6 +15570,7 @@ M: Iurii Zaikin <yzaikin@google.com>
L: linux-kernel@vger.kernel.org
L: linux-fsdevel@vger.kernel.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git sysctl-next
F: fs/proc/proc_sysctl.c
F: include/linux/sysctl.h
F: kernel/sysctl-test.c
@@ -16079,8 +16077,8 @@ F: Documentation/devicetree/bindings/mtd/qcom,nandc.yaml
F: drivers/mtd/nand/raw/qcom_nandc.c
QUALCOMM RMNET DRIVER
M: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
M: Sean Tranchetti <stranche@codeaurora.org>
M: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com>
M: Sean Tranchetti <quic_stranche@quicinc.com>
L: netdev@vger.kernel.org
S: Maintained
F: Documentation/networking/device_drivers/cellular/qualcomm/rmnet.rst
@@ -16372,6 +16370,7 @@ F: drivers/watchdog/realtek_otto_wdt.c
REALTEK RTL83xx SMI DSA ROUTER CHIPS
M: Linus Walleij <linus.walleij@linaro.org>
M: Alvin Šipraga <alsi@bang-olufsen.dk>
S: Maintained
F: Documentation/devicetree/bindings/net/dsa/realtek-smi.txt
F: drivers/net/dsa/realtek-smi*
@@ -16950,7 +16949,7 @@ W: http://www.ibm.com/developerworks/linux/linux390/
F: drivers/s390/scsi/zfcp_*
S3C ADC BATTERY DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-samsung-soc@vger.kernel.org
S: Odd Fixes
F: drivers/power/supply/s3c_adc_battery.c
@@ -16995,7 +16994,7 @@ F: Documentation/admin-guide/LSM/SafeSetID.rst
F: security/safesetid/
SAMSUNG AUDIO (ASoC) DRIVERS
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Sylwester Nawrocki <s.nawrocki@samsung.com>
L: alsa-devel@alsa-project.org (moderated for non-subscribers)
S: Supported
@@ -17003,7 +17002,7 @@ F: Documentation/devicetree/bindings/sound/samsung*
F: sound/soc/samsung/
SAMSUNG EXYNOS PSEUDO RANDOM NUMBER GENERATOR (RNG) DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-crypto@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
S: Maintained
@@ -17038,7 +17037,7 @@ S: Maintained
F: drivers/platform/x86/samsung-laptop.c
SAMSUNG MULTIFUNCTION PMIC DEVICE DRIVERS
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
L: linux-kernel@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
@@ -17064,7 +17063,7 @@ F: drivers/media/platform/s3c-camif/
F: include/media/drv-intf/s3c_camif.h
SAMSUNG S3FWRN5 NFC DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Krzysztof Opasiak <k.opasiak@samsung.com>
L: linux-nfc@lists.01.org (subscribers-only)
S: Maintained
@@ -17086,7 +17085,7 @@ S: Supported
F: drivers/media/i2c/s5k5baf.c
SAMSUNG S5P Security SubSystem (SSS) DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Vladimir Zapolskiy <vz@mleia.com>
L: linux-crypto@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
@@ -17121,7 +17120,7 @@ F: include/linux/clk/samsung.h
F: include/linux/platform_data/clk-s3c2410.h
SAMSUNG SPI DRIVERS
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Andi Shyti <andi@etezian.org>
L: linux-spi@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
@@ -17765,8 +17764,10 @@ M: David Rientjes <rientjes@google.com>
M: Joonsoo Kim <iamjoonsoo.kim@lge.com>
M: Andrew Morton <akpm@linux-foundation.org>
M: Vlastimil Babka <vbabka@suse.cz>
R: Roman Gushchin <roman.gushchin@linux.dev>
L: linux-mm@kvack.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git
F: include/linux/sl?b*.h
F: mm/sl?b*
@@ -21469,7 +21470,6 @@ THE REST
M: Linus Torvalds <torvalds@linux-foundation.org>
L: linux-kernel@vger.kernel.org
S: Buried alive in reporters
Q: http://patchwork.kernel.org/project/LKML/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
F: *
F: */

View File

@@ -2,7 +2,7 @@
VERSION = 5
PATCHLEVEL = 17
SUBLEVEL = 0
EXTRAVERSION = -rc5
EXTRAVERSION =
NAME = Superb Owl
# *DOCUMENTATION*

View File

@@ -118,7 +118,7 @@
};
pinctrl_fwqspid_default: fwqspid_default {
function = "FWQSPID";
function = "FWSPID";
groups = "FWQSPID";
};

View File

@@ -290,6 +290,7 @@
hvs: hvs@7e400000 {
compatible = "brcm,bcm2711-hvs";
reg = <0x7e400000 0x8000>;
interrupts = <GIC_SPI 97 IRQ_TYPE_LEVEL_HIGH>;
};

View File

@@ -158,6 +158,24 @@
status = "disabled";
};
/* Unusable as clockevent because if unreliable oscillator, allow to idle */
&timer1_target {
/delete-property/ti,no-reset-on-init;
/delete-property/ti,no-idle;
timer@0 {
/delete-property/ti,timer-alwon;
};
};
/* Preferred timer for clockevent */
&timer12_target {
ti,no-reset-on-init;
ti,no-idle;
timer@0 {
/* Always clocked by secure_32k_fck */
};
};
&twl_gpio {
ti,use-leds;
/*

View File

@@ -14,36 +14,3 @@
display2 = &tv0;
};
};
/* Unusable as clocksource because of unreliable oscillator */
&counter32k {
status = "disabled";
};
/* Unusable as clockevent because if unreliable oscillator, allow to idle */
&timer1_target {
/delete-property/ti,no-reset-on-init;
/delete-property/ti,no-idle;
timer@0 {
/delete-property/ti,timer-alwon;
};
};
/* Preferred always-on timer for clocksource */
&timer12_target {
ti,no-reset-on-init;
ti,no-idle;
timer@0 {
/* Always clocked by secure_32k_fck */
};
};
/* Preferred timer for clockevent */
&timer2_target {
ti,no-reset-on-init;
ti,no-idle;
timer@0 {
assigned-clocks = <&gpt2_fck>;
assigned-clock-parents = <&sys_ck>;
};
};

View File

@@ -718,8 +718,8 @@
interrupts = <GIC_SPI 35 IRQ_TYPE_LEVEL_HIGH>;
assigned-clocks = <&cru SCLK_HDMI_PHY>;
assigned-clock-parents = <&hdmi_phy>;
clocks = <&cru SCLK_HDMI_HDCP>, <&cru PCLK_HDMI_CTRL>, <&cru SCLK_HDMI_CEC>;
clock-names = "isfr", "iahb", "cec";
clocks = <&cru PCLK_HDMI_CTRL>, <&cru SCLK_HDMI_HDCP>, <&cru SCLK_HDMI_CEC>;
clock-names = "iahb", "isfr", "cec";
pinctrl-names = "default";
pinctrl-0 = <&hdmii2c_xfer &hdmi_hpd &hdmi_cec>;
resets = <&cru SRST_HDMI_P>;

View File

@@ -971,7 +971,7 @@
status = "disabled";
};
crypto: cypto-controller@ff8a0000 {
crypto: crypto@ff8a0000 {
compatible = "rockchip,rk3288-crypto";
reg = <0x0 0xff8a0000 0x0 0x4000>;
interrupts = <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;

View File

@@ -5,7 +5,13 @@
/ {
/* Version of Nyan Big with 1080p panel */
panel {
compatible = "auo,b133htn01";
host1x@50000000 {
dpaux@545c0000 {
aux-bus {
panel: panel {
compatible = "auo,b133htn01";
};
};
};
};
};

View File

@@ -13,12 +13,15 @@
"google,nyan-big-rev1", "google,nyan-big-rev0",
"google,nyan-big", "google,nyan", "nvidia,tegra124";
panel: panel {
compatible = "auo,b133xtn01";
power-supply = <&vdd_3v3_panel>;
backlight = <&backlight>;
ddc-i2c-bus = <&dpaux>;
host1x@50000000 {
dpaux@545c0000 {
aux-bus {
panel: panel {
compatible = "auo,b133xtn01";
backlight = <&backlight>;
};
};
};
};
mmc@700b0400 { /* SD Card on this bus */

View File

@@ -15,12 +15,15 @@
"google,nyan-blaze-rev0", "google,nyan-blaze",
"google,nyan", "nvidia,tegra124";
panel: panel {
compatible = "samsung,ltn140at29-301";
power-supply = <&vdd_3v3_panel>;
backlight = <&backlight>;
ddc-i2c-bus = <&dpaux>;
host1x@50000000 {
dpaux@545c0000 {
aux-bus {
panel: panel {
compatible = "samsung,ltn140at29-301";
backlight = <&backlight>;
};
};
};
};
sound {

View File

@@ -48,6 +48,13 @@
dpaux@545c0000 {
vdd-supply = <&vdd_3v3_panel>;
status = "okay";
aux-bus {
panel: panel {
compatible = "lg,lp129qe";
backlight = <&backlight>;
};
};
};
};
@@ -1080,13 +1087,6 @@
};
};
panel: panel {
compatible = "lg,lp129qe";
power-supply = <&vdd_3v3_panel>;
backlight = <&backlight>;
ddc-i2c-bus = <&dpaux>;
};
vdd_mux: regulator-mux {
compatible = "regulator-fixed";
regulator-name = "+VDD_MUX";

View File

@@ -107,6 +107,16 @@
.endm
#endif
#if __LINUX_ARM_ARCH__ < 7
.macro dsb, args
mcr p15, 0, r0, c7, c10, 4
.endm
.macro isb, args
mcr p15, 0, r0, c7, c5, 4
.endm
#endif
.macro asm_trace_hardirqs_off, save=1
#if defined(CONFIG_TRACE_IRQFLAGS)
.if \save

View File

@@ -0,0 +1,38 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#ifndef __ASM_SPECTRE_H
#define __ASM_SPECTRE_H
enum {
SPECTRE_UNAFFECTED,
SPECTRE_MITIGATED,
SPECTRE_VULNERABLE,
};
enum {
__SPECTRE_V2_METHOD_BPIALL,
__SPECTRE_V2_METHOD_ICIALLU,
__SPECTRE_V2_METHOD_SMC,
__SPECTRE_V2_METHOD_HVC,
__SPECTRE_V2_METHOD_LOOP8,
};
enum {
SPECTRE_V2_METHOD_BPIALL = BIT(__SPECTRE_V2_METHOD_BPIALL),
SPECTRE_V2_METHOD_ICIALLU = BIT(__SPECTRE_V2_METHOD_ICIALLU),
SPECTRE_V2_METHOD_SMC = BIT(__SPECTRE_V2_METHOD_SMC),
SPECTRE_V2_METHOD_HVC = BIT(__SPECTRE_V2_METHOD_HVC),
SPECTRE_V2_METHOD_LOOP8 = BIT(__SPECTRE_V2_METHOD_LOOP8),
};
#ifdef CONFIG_GENERIC_CPU_VULNERABILITIES
void spectre_v2_update_state(unsigned int state, unsigned int methods);
#else
static inline void spectre_v2_update_state(unsigned int state,
unsigned int methods)
{}
#endif
int spectre_bhb_update_vectors(unsigned int method);
#endif

View File

@@ -26,6 +26,19 @@
#define ARM_MMU_DISCARD(x) x
#endif
/*
* ld.lld does not support NOCROSSREFS:
* https://github.com/ClangBuiltLinux/linux/issues/1609
*/
#ifdef CONFIG_LD_IS_LLD
#define NOCROSSREFS
#endif
/* Set start/end symbol names to the LMA for the section */
#define ARM_LMA(sym, section) \
sym##_start = LOADADDR(section); \
sym##_end = LOADADDR(section) + SIZEOF(section)
#define PROC_INFO \
. = ALIGN(4); \
__proc_info_begin = .; \
@@ -110,19 +123,31 @@
* only thing that matters is their relative offsets
*/
#define ARM_VECTORS \
__vectors_start = .; \
.vectors 0xffff0000 : AT(__vectors_start) { \
*(.vectors) \
__vectors_lma = .; \
OVERLAY 0xffff0000 : NOCROSSREFS AT(__vectors_lma) { \
.vectors { \
*(.vectors) \
} \
.vectors.bhb.loop8 { \
*(.vectors.bhb.loop8) \
} \
.vectors.bhb.bpiall { \
*(.vectors.bhb.bpiall) \
} \
} \
. = __vectors_start + SIZEOF(.vectors); \
__vectors_end = .; \
ARM_LMA(__vectors, .vectors); \
ARM_LMA(__vectors_bhb_loop8, .vectors.bhb.loop8); \
ARM_LMA(__vectors_bhb_bpiall, .vectors.bhb.bpiall); \
. = __vectors_lma + SIZEOF(.vectors) + \
SIZEOF(.vectors.bhb.loop8) + \
SIZEOF(.vectors.bhb.bpiall); \
\
__stubs_start = .; \
.stubs ADDR(.vectors) + 0x1000 : AT(__stubs_start) { \
__stubs_lma = .; \
.stubs ADDR(.vectors) + 0x1000 : AT(__stubs_lma) { \
*(.stubs) \
} \
. = __stubs_start + SIZEOF(.stubs); \
__stubs_end = .; \
ARM_LMA(__stubs, .stubs); \
. = __stubs_lma + SIZEOF(.stubs); \
\
PROVIDE(vector_fiq_offset = vector_fiq - ADDR(.vectors));

View File

@@ -106,4 +106,6 @@ endif
obj-$(CONFIG_HAVE_ARM_SMCCC) += smccc-call.o
obj-$(CONFIG_GENERIC_CPU_VULNERABILITIES) += spectre.o
extra-y := $(head-y) vmlinux.lds

View File

@@ -1002,12 +1002,11 @@ vector_\name:
sub lr, lr, #\correction
.endif
@
@ Save r0, lr_<exception> (parent PC) and spsr_<exception>
@ (parent CPSR)
@
@ Save r0, lr_<exception> (parent PC)
stmia sp, {r0, lr} @ save r0, lr
mrs lr, spsr
@ Save spsr_<exception> (parent CPSR)
2: mrs lr, spsr
str lr, [sp, #8] @ save spsr
@
@@ -1028,6 +1027,44 @@ vector_\name:
movs pc, lr @ branch to handler in SVC mode
ENDPROC(vector_\name)
#ifdef CONFIG_HARDEN_BRANCH_HISTORY
.subsection 1
.align 5
vector_bhb_loop8_\name:
.if \correction
sub lr, lr, #\correction
.endif
@ Save r0, lr_<exception> (parent PC)
stmia sp, {r0, lr}
@ bhb workaround
mov r0, #8
3: b . + 4
subs r0, r0, #1
bne 3b
dsb
isb
b 2b
ENDPROC(vector_bhb_loop8_\name)
vector_bhb_bpiall_\name:
.if \correction
sub lr, lr, #\correction
.endif
@ Save r0, lr_<exception> (parent PC)
stmia sp, {r0, lr}
@ bhb workaround
mcr p15, 0, r0, c7, c5, 6 @ BPIALL
@ isb not needed due to "movs pc, lr" in the vector stub
@ which gives a "context synchronisation".
b 2b
ENDPROC(vector_bhb_bpiall_\name)
.previous
#endif
.align 2
@ handler addresses follow this label
1:
@@ -1036,6 +1073,10 @@ ENDPROC(vector_\name)
.section .stubs, "ax", %progbits
@ This must be the first word
.word vector_swi
#ifdef CONFIG_HARDEN_BRANCH_HISTORY
.word vector_bhb_loop8_swi
.word vector_bhb_bpiall_swi
#endif
vector_rst:
ARM( swi SYS_ERROR0 )
@@ -1150,8 +1191,10 @@ vector_addrexcptn:
* FIQ "NMI" handler
*-----------------------------------------------------------------------------
* Handle a FIQ using the SVC stack allowing FIQ act like NMI on x86
* systems.
* systems. This must be the last vector stub, so lets place it in its own
* subsection.
*/
.subsection 2
vector_stub fiq, FIQ_MODE, 4
.long __fiq_usr @ 0 (USR_26 / USR_32)
@@ -1184,6 +1227,30 @@ vector_addrexcptn:
W(b) vector_irq
W(b) vector_fiq
#ifdef CONFIG_HARDEN_BRANCH_HISTORY
.section .vectors.bhb.loop8, "ax", %progbits
.L__vectors_bhb_loop8_start:
W(b) vector_rst
W(b) vector_bhb_loop8_und
W(ldr) pc, .L__vectors_bhb_loop8_start + 0x1004
W(b) vector_bhb_loop8_pabt
W(b) vector_bhb_loop8_dabt
W(b) vector_addrexcptn
W(b) vector_bhb_loop8_irq
W(b) vector_bhb_loop8_fiq
.section .vectors.bhb.bpiall, "ax", %progbits
.L__vectors_bhb_bpiall_start:
W(b) vector_rst
W(b) vector_bhb_bpiall_und
W(ldr) pc, .L__vectors_bhb_bpiall_start + 0x1008
W(b) vector_bhb_bpiall_pabt
W(b) vector_bhb_bpiall_dabt
W(b) vector_addrexcptn
W(b) vector_bhb_bpiall_irq
W(b) vector_bhb_bpiall_fiq
#endif
.data
.align 2

View File

@@ -153,6 +153,29 @@ ENDPROC(ret_from_fork)
*-----------------------------------------------------------------------------
*/
.align 5
#ifdef CONFIG_HARDEN_BRANCH_HISTORY
ENTRY(vector_bhb_loop8_swi)
sub sp, sp, #PT_REGS_SIZE
stmia sp, {r0 - r12}
mov r8, #8
1: b 2f
2: subs r8, r8, #1
bne 1b
dsb
isb
b 3f
ENDPROC(vector_bhb_loop8_swi)
.align 5
ENTRY(vector_bhb_bpiall_swi)
sub sp, sp, #PT_REGS_SIZE
stmia sp, {r0 - r12}
mcr p15, 0, r8, c7, c5, 6 @ BPIALL
isb
b 3f
ENDPROC(vector_bhb_bpiall_swi)
#endif
.align 5
ENTRY(vector_swi)
#ifdef CONFIG_CPU_V7M
@@ -160,6 +183,7 @@ ENTRY(vector_swi)
#else
sub sp, sp, #PT_REGS_SIZE
stmia sp, {r0 - r12} @ Calling r0 - r12
3:
ARM( add r8, sp, #S_PC )
ARM( stmdb r8, {sp, lr}^ ) @ Calling sp, lr
THUMB( mov r8, sp )

View File

@@ -154,22 +154,38 @@ static int kgdb_compiled_brk_fn(struct pt_regs *regs, unsigned int instr)
return 0;
}
static struct undef_hook kgdb_brkpt_hook = {
static struct undef_hook kgdb_brkpt_arm_hook = {
.instr_mask = 0xffffffff,
.instr_val = KGDB_BREAKINST,
.cpsr_mask = MODE_MASK,
.cpsr_mask = PSR_T_BIT | MODE_MASK,
.cpsr_val = SVC_MODE,
.fn = kgdb_brk_fn
};
static struct undef_hook kgdb_compiled_brkpt_hook = {
static struct undef_hook kgdb_brkpt_thumb_hook = {
.instr_mask = 0xffff,
.instr_val = KGDB_BREAKINST & 0xffff,
.cpsr_mask = PSR_T_BIT | MODE_MASK,
.cpsr_val = PSR_T_BIT | SVC_MODE,
.fn = kgdb_brk_fn
};
static struct undef_hook kgdb_compiled_brkpt_arm_hook = {
.instr_mask = 0xffffffff,
.instr_val = KGDB_COMPILED_BREAK,
.cpsr_mask = MODE_MASK,
.cpsr_mask = PSR_T_BIT | MODE_MASK,
.cpsr_val = SVC_MODE,
.fn = kgdb_compiled_brk_fn
};
static struct undef_hook kgdb_compiled_brkpt_thumb_hook = {
.instr_mask = 0xffff,
.instr_val = KGDB_COMPILED_BREAK & 0xffff,
.cpsr_mask = PSR_T_BIT | MODE_MASK,
.cpsr_val = PSR_T_BIT | SVC_MODE,
.fn = kgdb_compiled_brk_fn
};
static int __kgdb_notify(struct die_args *args, unsigned long cmd)
{
struct pt_regs *regs = args->regs;
@@ -210,8 +226,10 @@ int kgdb_arch_init(void)
if (ret != 0)
return ret;
register_undef_hook(&kgdb_brkpt_hook);
register_undef_hook(&kgdb_compiled_brkpt_hook);
register_undef_hook(&kgdb_brkpt_arm_hook);
register_undef_hook(&kgdb_brkpt_thumb_hook);
register_undef_hook(&kgdb_compiled_brkpt_arm_hook);
register_undef_hook(&kgdb_compiled_brkpt_thumb_hook);
return 0;
}
@@ -224,8 +242,10 @@ int kgdb_arch_init(void)
*/
void kgdb_arch_exit(void)
{
unregister_undef_hook(&kgdb_brkpt_hook);
unregister_undef_hook(&kgdb_compiled_brkpt_hook);
unregister_undef_hook(&kgdb_brkpt_arm_hook);
unregister_undef_hook(&kgdb_brkpt_thumb_hook);
unregister_undef_hook(&kgdb_compiled_brkpt_arm_hook);
unregister_undef_hook(&kgdb_compiled_brkpt_thumb_hook);
unregister_die_notifier(&kgdb_notifier);
}

71
arch/arm/kernel/spectre.c Normal file
View File

@@ -0,0 +1,71 @@
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/bpf.h>
#include <linux/cpu.h>
#include <linux/device.h>
#include <asm/spectre.h>
static bool _unprivileged_ebpf_enabled(void)
{
#ifdef CONFIG_BPF_SYSCALL
return !sysctl_unprivileged_bpf_disabled;
#else
return false;
#endif
}
ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr,
char *buf)
{
return sprintf(buf, "Mitigation: __user pointer sanitization\n");
}
static unsigned int spectre_v2_state;
static unsigned int spectre_v2_methods;
void spectre_v2_update_state(unsigned int state, unsigned int method)
{
if (state > spectre_v2_state)
spectre_v2_state = state;
spectre_v2_methods |= method;
}
ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr,
char *buf)
{
const char *method;
if (spectre_v2_state == SPECTRE_UNAFFECTED)
return sprintf(buf, "%s\n", "Not affected");
if (spectre_v2_state != SPECTRE_MITIGATED)
return sprintf(buf, "%s\n", "Vulnerable");
if (_unprivileged_ebpf_enabled())
return sprintf(buf, "Vulnerable: Unprivileged eBPF enabled\n");
switch (spectre_v2_methods) {
case SPECTRE_V2_METHOD_BPIALL:
method = "Branch predictor hardening";
break;
case SPECTRE_V2_METHOD_ICIALLU:
method = "I-cache invalidation";
break;
case SPECTRE_V2_METHOD_SMC:
case SPECTRE_V2_METHOD_HVC:
method = "Firmware call";
break;
case SPECTRE_V2_METHOD_LOOP8:
method = "History overwrite";
break;
default:
method = "Multiple mitigations";
break;
}
return sprintf(buf, "Mitigation: %s\n", method);
}

View File

@@ -30,6 +30,7 @@
#include <linux/atomic.h>
#include <asm/cacheflush.h>
#include <asm/exception.h>
#include <asm/spectre.h>
#include <asm/unistd.h>
#include <asm/traps.h>
#include <asm/ptrace.h>
@@ -789,10 +790,59 @@ static inline void __init kuser_init(void *vectors)
}
#endif
#ifndef CONFIG_CPU_V7M
static void copy_from_lma(void *vma, void *lma_start, void *lma_end)
{
memcpy(vma, lma_start, lma_end - lma_start);
}
static void flush_vectors(void *vma, size_t offset, size_t size)
{
unsigned long start = (unsigned long)vma + offset;
unsigned long end = start + size;
flush_icache_range(start, end);
}
#ifdef CONFIG_HARDEN_BRANCH_HISTORY
int spectre_bhb_update_vectors(unsigned int method)
{
extern char __vectors_bhb_bpiall_start[], __vectors_bhb_bpiall_end[];
extern char __vectors_bhb_loop8_start[], __vectors_bhb_loop8_end[];
void *vec_start, *vec_end;
if (system_state >= SYSTEM_FREEING_INITMEM) {
pr_err("CPU%u: Spectre BHB workaround too late - system vulnerable\n",
smp_processor_id());
return SPECTRE_VULNERABLE;
}
switch (method) {
case SPECTRE_V2_METHOD_LOOP8:
vec_start = __vectors_bhb_loop8_start;
vec_end = __vectors_bhb_loop8_end;
break;
case SPECTRE_V2_METHOD_BPIALL:
vec_start = __vectors_bhb_bpiall_start;
vec_end = __vectors_bhb_bpiall_end;
break;
default:
pr_err("CPU%u: unknown Spectre BHB state %d\n",
smp_processor_id(), method);
return SPECTRE_VULNERABLE;
}
copy_from_lma(vectors_page, vec_start, vec_end);
flush_vectors(vectors_page, 0, vec_end - vec_start);
return SPECTRE_MITIGATED;
}
#endif
void __init early_trap_init(void *vectors_base)
{
#ifndef CONFIG_CPU_V7M
unsigned long vectors = (unsigned long)vectors_base;
extern char __stubs_start[], __stubs_end[];
extern char __vectors_start[], __vectors_end[];
unsigned i;
@@ -813,17 +863,20 @@ void __init early_trap_init(void *vectors_base)
* into the vector page, mapped at 0xffff0000, and ensure these
* are visible to the instruction stream.
*/
memcpy((void *)vectors, __vectors_start, __vectors_end - __vectors_start);
memcpy((void *)vectors + 0x1000, __stubs_start, __stubs_end - __stubs_start);
copy_from_lma(vectors_base, __vectors_start, __vectors_end);
copy_from_lma(vectors_base + 0x1000, __stubs_start, __stubs_end);
kuser_init(vectors_base);
flush_icache_range(vectors, vectors + PAGE_SIZE * 2);
flush_vectors(vectors_base, 0, PAGE_SIZE * 2);
}
#else /* ifndef CONFIG_CPU_V7M */
void __init early_trap_init(void *vectors_base)
{
/*
* on V7-M there is no need to copy the vector table to a dedicated
* memory area. The address is configurable and so a table in the kernel
* image can be used.
*/
#endif
}
#endif

View File

@@ -3,6 +3,7 @@ menuconfig ARCH_MSTARV7
depends on ARCH_MULTI_V7
select ARM_GIC
select ARM_HEAVY_MB
select HAVE_ARM_ARCH_TIMER
select MST_IRQ
select MSTAR_MSC313_MPLL
help

View File

@@ -830,6 +830,7 @@ config CPU_BPREDICT_DISABLE
config CPU_SPECTRE
bool
select GENERIC_CPU_VULNERABILITIES
config HARDEN_BRANCH_PREDICTOR
bool "Harden the branch predictor against aliasing attacks" if EXPERT
@@ -850,6 +851,16 @@ config HARDEN_BRANCH_PREDICTOR
If unsure, say Y.
config HARDEN_BRANCH_HISTORY
bool "Harden Spectre style attacks against branch history" if EXPERT
depends on CPU_SPECTRE
default y
help
Speculation attacks against some high-performance processors can
make use of branch history to influence future speculation. When
taking an exception, a sequence of branches overwrites the branch
history, or branch history is invalidated.
config TLS_REG_EMUL
bool
select NEED_KUSER_HELPERS

View File

@@ -212,12 +212,14 @@ early_param("ecc", early_ecc);
static int __init early_cachepolicy(char *p)
{
pr_warn("cachepolicy kernel parameter not supported without cp15\n");
return 0;
}
early_param("cachepolicy", early_cachepolicy);
static int __init noalign_setup(char *__unused)
{
pr_warn("noalign kernel parameter not supported without cp15\n");
return 1;
}
__setup("noalign", noalign_setup);

View File

@@ -6,8 +6,35 @@
#include <asm/cp15.h>
#include <asm/cputype.h>
#include <asm/proc-fns.h>
#include <asm/spectre.h>
#include <asm/system_misc.h>
#ifdef CONFIG_ARM_PSCI
static int __maybe_unused spectre_v2_get_cpu_fw_mitigation_state(void)
{
struct arm_smccc_res res;
arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_ARCH_WORKAROUND_1, &res);
switch ((int)res.a0) {
case SMCCC_RET_SUCCESS:
return SPECTRE_MITIGATED;
case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
return SPECTRE_UNAFFECTED;
default:
return SPECTRE_VULNERABLE;
}
}
#else
static int __maybe_unused spectre_v2_get_cpu_fw_mitigation_state(void)
{
return SPECTRE_VULNERABLE;
}
#endif
#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
DEFINE_PER_CPU(harden_branch_predictor_fn_t, harden_branch_predictor_fn);
@@ -36,13 +63,61 @@ static void __maybe_unused call_hvc_arch_workaround_1(void)
arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
}
static void cpu_v7_spectre_init(void)
static unsigned int spectre_v2_install_workaround(unsigned int method)
{
const char *spectre_v2_method = NULL;
int cpu = smp_processor_id();
if (per_cpu(harden_branch_predictor_fn, cpu))
return;
return SPECTRE_MITIGATED;
switch (method) {
case SPECTRE_V2_METHOD_BPIALL:
per_cpu(harden_branch_predictor_fn, cpu) =
harden_branch_predictor_bpiall;
spectre_v2_method = "BPIALL";
break;
case SPECTRE_V2_METHOD_ICIALLU:
per_cpu(harden_branch_predictor_fn, cpu) =
harden_branch_predictor_iciallu;
spectre_v2_method = "ICIALLU";
break;
case SPECTRE_V2_METHOD_HVC:
per_cpu(harden_branch_predictor_fn, cpu) =
call_hvc_arch_workaround_1;
cpu_do_switch_mm = cpu_v7_hvc_switch_mm;
spectre_v2_method = "hypervisor";
break;
case SPECTRE_V2_METHOD_SMC:
per_cpu(harden_branch_predictor_fn, cpu) =
call_smc_arch_workaround_1;
cpu_do_switch_mm = cpu_v7_smc_switch_mm;
spectre_v2_method = "firmware";
break;
}
if (spectre_v2_method)
pr_info("CPU%u: Spectre v2: using %s workaround\n",
smp_processor_id(), spectre_v2_method);
return SPECTRE_MITIGATED;
}
#else
static unsigned int spectre_v2_install_workaround(unsigned int method)
{
pr_info("CPU%u: Spectre V2: workarounds disabled by configuration\n",
smp_processor_id());
return SPECTRE_VULNERABLE;
}
#endif
static void cpu_v7_spectre_v2_init(void)
{
unsigned int state, method = 0;
switch (read_cpuid_part()) {
case ARM_CPU_PART_CORTEX_A8:
@@ -51,69 +126,133 @@ static void cpu_v7_spectre_init(void)
case ARM_CPU_PART_CORTEX_A17:
case ARM_CPU_PART_CORTEX_A73:
case ARM_CPU_PART_CORTEX_A75:
per_cpu(harden_branch_predictor_fn, cpu) =
harden_branch_predictor_bpiall;
spectre_v2_method = "BPIALL";
state = SPECTRE_MITIGATED;
method = SPECTRE_V2_METHOD_BPIALL;
break;
case ARM_CPU_PART_CORTEX_A15:
case ARM_CPU_PART_BRAHMA_B15:
per_cpu(harden_branch_predictor_fn, cpu) =
harden_branch_predictor_iciallu;
spectre_v2_method = "ICIALLU";
state = SPECTRE_MITIGATED;
method = SPECTRE_V2_METHOD_ICIALLU;
break;
#ifdef CONFIG_ARM_PSCI
case ARM_CPU_PART_BRAHMA_B53:
/* Requires no workaround */
state = SPECTRE_UNAFFECTED;
break;
default:
/* Other ARM CPUs require no workaround */
if (read_cpuid_implementor() == ARM_CPU_IMP_ARM)
if (read_cpuid_implementor() == ARM_CPU_IMP_ARM) {
state = SPECTRE_UNAFFECTED;
break;
fallthrough;
/* Cortex A57/A72 require firmware workaround */
case ARM_CPU_PART_CORTEX_A57:
case ARM_CPU_PART_CORTEX_A72: {
struct arm_smccc_res res;
}
arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_ARCH_WORKAROUND_1, &res);
if ((int)res.a0 != 0)
return;
fallthrough;
/* Cortex A57/A72 require firmware workaround */
case ARM_CPU_PART_CORTEX_A57:
case ARM_CPU_PART_CORTEX_A72:
state = spectre_v2_get_cpu_fw_mitigation_state();
if (state != SPECTRE_MITIGATED)
break;
switch (arm_smccc_1_1_get_conduit()) {
case SMCCC_CONDUIT_HVC:
per_cpu(harden_branch_predictor_fn, cpu) =
call_hvc_arch_workaround_1;
cpu_do_switch_mm = cpu_v7_hvc_switch_mm;
spectre_v2_method = "hypervisor";
method = SPECTRE_V2_METHOD_HVC;
break;
case SMCCC_CONDUIT_SMC:
per_cpu(harden_branch_predictor_fn, cpu) =
call_smc_arch_workaround_1;
cpu_do_switch_mm = cpu_v7_smc_switch_mm;
spectre_v2_method = "firmware";
method = SPECTRE_V2_METHOD_SMC;
break;
default:
state = SPECTRE_VULNERABLE;
break;
}
}
#endif
if (state == SPECTRE_MITIGATED)
state = spectre_v2_install_workaround(method);
spectre_v2_update_state(state, method);
}
#ifdef CONFIG_HARDEN_BRANCH_HISTORY
static int spectre_bhb_method;
static const char *spectre_bhb_method_name(int method)
{
switch (method) {
case SPECTRE_V2_METHOD_LOOP8:
return "loop";
case SPECTRE_V2_METHOD_BPIALL:
return "BPIALL";
default:
return "unknown";
}
}
static int spectre_bhb_install_workaround(int method)
{
if (spectre_bhb_method != method) {
if (spectre_bhb_method) {
pr_err("CPU%u: Spectre BHB: method disagreement, system vulnerable\n",
smp_processor_id());
return SPECTRE_VULNERABLE;
}
if (spectre_bhb_update_vectors(method) == SPECTRE_VULNERABLE)
return SPECTRE_VULNERABLE;
spectre_bhb_method = method;
}
if (spectre_v2_method)
pr_info("CPU%u: Spectre v2: using %s workaround\n",
smp_processor_id(), spectre_v2_method);
pr_info("CPU%u: Spectre BHB: using %s workaround\n",
smp_processor_id(), spectre_bhb_method_name(method));
return SPECTRE_MITIGATED;
}
#else
static void cpu_v7_spectre_init(void)
static int spectre_bhb_install_workaround(int method)
{
return SPECTRE_VULNERABLE;
}
#endif
static void cpu_v7_spectre_bhb_init(void)
{
unsigned int state, method = 0;
switch (read_cpuid_part()) {
case ARM_CPU_PART_CORTEX_A15:
case ARM_CPU_PART_BRAHMA_B15:
case ARM_CPU_PART_CORTEX_A57:
case ARM_CPU_PART_CORTEX_A72:
state = SPECTRE_MITIGATED;
method = SPECTRE_V2_METHOD_LOOP8;
break;
case ARM_CPU_PART_CORTEX_A73:
case ARM_CPU_PART_CORTEX_A75:
state = SPECTRE_MITIGATED;
method = SPECTRE_V2_METHOD_BPIALL;
break;
default:
state = SPECTRE_UNAFFECTED;
break;
}
if (state == SPECTRE_MITIGATED)
state = spectre_bhb_install_workaround(method);
spectre_v2_update_state(state, method);
}
static __maybe_unused bool cpu_v7_check_auxcr_set(bool *warned,
u32 mask, const char *msg)
{
@@ -142,16 +281,17 @@ static bool check_spectre_auxcr(bool *warned, u32 bit)
void cpu_v7_ca8_ibe(void)
{
if (check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(6)))
cpu_v7_spectre_init();
cpu_v7_spectre_v2_init();
}
void cpu_v7_ca15_ibe(void)
{
if (check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(0)))
cpu_v7_spectre_init();
cpu_v7_spectre_v2_init();
}
void cpu_v7_bugs_init(void)
{
cpu_v7_spectre_init();
cpu_v7_spectre_v2_init();
cpu_v7_spectre_bhb_init();
}

View File

@@ -1252,9 +1252,6 @@ config HW_PERF_EVENTS
def_bool y
depends on ARM_PMU
config ARCH_HAS_FILTER_PGPROT
def_bool y
# Supported by clang >= 7.0
config CC_HAVE_SHADOW_CALL_STACK
def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
@@ -1383,6 +1380,15 @@ config UNMAP_KERNEL_AT_EL0
If unsure, say Y.
config MITIGATE_SPECTRE_BRANCH_HISTORY
bool "Mitigate Spectre style attacks against branch history" if EXPERT
default y
help
Speculation attacks against some high-performance processors can
make use of branch history to influence future speculation.
When taking an exception from user-space, a sequence of branches
or a firmware call overwrites the branch history.
config RODATA_FULL_DEFAULT_ENABLED
bool "Apply r/o permissions of VM areas also to their linear aliases"
default y

View File

@@ -543,8 +543,7 @@
<0x02000000 0x00 0x50000000 0x00 0x50000000 0x0 0x08000000>,
<0x42000000 0x40 0x00000000 0x40 0x00000000 0x1 0x00000000>;
/* Standard AXI Translation entries as programmed by EDK2 */
dma-ranges = <0x02000000 0x0 0x2c1c0000 0x0 0x2c1c0000 0x0 0x00040000>,
<0x02000000 0x0 0x80000000 0x0 0x80000000 0x0 0x80000000>,
dma-ranges = <0x02000000 0x0 0x80000000 0x0 0x80000000 0x0 0x80000000>,
<0x43000000 0x8 0x00000000 0x8 0x00000000 0x2 0x00000000>;
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 7>;

View File

@@ -253,18 +253,18 @@
interrupt-controller;
reg = <0x14 4>;
interrupt-map =
<0 0 &gic 0 0 GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,
<1 0 &gic 0 0 GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>,
<2 0 &gic 0 0 GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>,
<3 0 &gic 0 0 GIC_SPI 3 IRQ_TYPE_LEVEL_HIGH>,
<4 0 &gic 0 0 GIC_SPI 4 IRQ_TYPE_LEVEL_HIGH>,
<5 0 &gic 0 0 GIC_SPI 5 IRQ_TYPE_LEVEL_HIGH>,
<6 0 &gic 0 0 GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
<7 0 &gic 0 0 GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
<8 0 &gic 0 0 GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<9 0 &gic 0 0 GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>,
<10 0 &gic 0 0 GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>,
<11 0 &gic 0 0 GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
<0 0 &gic GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,
<1 0 &gic GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>,
<2 0 &gic GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>,
<3 0 &gic GIC_SPI 3 IRQ_TYPE_LEVEL_HIGH>,
<4 0 &gic GIC_SPI 4 IRQ_TYPE_LEVEL_HIGH>,
<5 0 &gic GIC_SPI 5 IRQ_TYPE_LEVEL_HIGH>,
<6 0 &gic GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
<7 0 &gic GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
<8 0 &gic GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<9 0 &gic GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>,
<10 0 &gic GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>,
<11 0 &gic GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
interrupt-map-mask = <0xffffffff 0x0>;
};
};

View File

@@ -293,18 +293,18 @@
interrupt-controller;
reg = <0x14 4>;
interrupt-map =
<0 0 &gic 0 0 GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,
<1 0 &gic 0 0 GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>,
<2 0 &gic 0 0 GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>,
<3 0 &gic 0 0 GIC_SPI 3 IRQ_TYPE_LEVEL_HIGH>,
<4 0 &gic 0 0 GIC_SPI 4 IRQ_TYPE_LEVEL_HIGH>,
<5 0 &gic 0 0 GIC_SPI 5 IRQ_TYPE_LEVEL_HIGH>,
<6 0 &gic 0 0 GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
<7 0 &gic 0 0 GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
<8 0 &gic 0 0 GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<9 0 &gic 0 0 GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>,
<10 0 &gic 0 0 GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>,
<11 0 &gic 0 0 GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
<0 0 &gic GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,
<1 0 &gic GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>,
<2 0 &gic GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>,
<3 0 &gic GIC_SPI 3 IRQ_TYPE_LEVEL_HIGH>,
<4 0 &gic GIC_SPI 4 IRQ_TYPE_LEVEL_HIGH>,
<5 0 &gic GIC_SPI 5 IRQ_TYPE_LEVEL_HIGH>,
<6 0 &gic GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
<7 0 &gic GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
<8 0 &gic GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<9 0 &gic GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>,
<10 0 &gic GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>,
<11 0 &gic GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
interrupt-map-mask = <0xffffffff 0x0>;
};
};

View File

@@ -680,18 +680,18 @@
interrupt-controller;
reg = <0x14 4>;
interrupt-map =
<0 0 &gic 0 0 GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,
<1 0 &gic 0 0 GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>,
<2 0 &gic 0 0 GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>,
<3 0 &gic 0 0 GIC_SPI 3 IRQ_TYPE_LEVEL_HIGH>,
<4 0 &gic 0 0 GIC_SPI 4 IRQ_TYPE_LEVEL_HIGH>,
<5 0 &gic 0 0 GIC_SPI 5 IRQ_TYPE_LEVEL_HIGH>,
<6 0 &gic 0 0 GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
<7 0 &gic 0 0 GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
<8 0 &gic 0 0 GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<9 0 &gic 0 0 GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>,
<10 0 &gic 0 0 GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>,
<11 0 &gic 0 0 GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
<0 0 &gic GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,
<1 0 &gic GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>,
<2 0 &gic GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>,
<3 0 &gic GIC_SPI 3 IRQ_TYPE_LEVEL_HIGH>,
<4 0 &gic GIC_SPI 4 IRQ_TYPE_LEVEL_HIGH>,
<5 0 &gic GIC_SPI 5 IRQ_TYPE_LEVEL_HIGH>,
<6 0 &gic GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>,
<7 0 &gic GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
<8 0 &gic GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<9 0 &gic GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>,
<10 0 &gic GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>,
<11 0 &gic GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
interrupt-map-mask = <0xffffffff 0x0>;
};
};

View File

@@ -707,7 +707,6 @@
clocks = <&clk IMX8MM_CLK_VPU_DEC_ROOT>;
assigned-clocks = <&clk IMX8MM_CLK_VPU_BUS>;
assigned-clock-parents = <&clk IMX8MM_SYS_PLL1_800M>;
resets = <&src IMX8MQ_RESET_VPU_RESET>;
};
pgc_vpu_g1: power-domain@7 {

View File

@@ -132,7 +132,7 @@
scmi_sensor: protocol@15 {
reg = <0x15>;
#thermal-sensor-cells = <0>;
#thermal-sensor-cells = <1>;
};
};
};

View File

@@ -502,7 +502,7 @@
};
usb0: usb@ffb00000 {
compatible = "snps,dwc2";
compatible = "intel,socfpga-agilex-hsotg", "snps,dwc2";
reg = <0xffb00000 0x40000>;
interrupts = <GIC_SPI 93 IRQ_TYPE_LEVEL_HIGH>;
phys = <&usbphy0>;
@@ -515,7 +515,7 @@
};
usb1: usb@ffb40000 {
compatible = "snps,dwc2";
compatible = "intel,socfpga-agilex-hsotg", "snps,dwc2";
reg = <0xffb40000 0x40000>;
interrupts = <GIC_SPI 94 IRQ_TYPE_LEVEL_HIGH>;
phys = <&usbphy0>;

View File

@@ -18,6 +18,7 @@
aliases {
spi0 = &spi0;
ethernet0 = &eth0;
ethernet1 = &eth1;
mmc0 = &sdhci0;
mmc1 = &sdhci1;
@@ -138,7 +139,9 @@
/*
* U-Boot port for Turris Mox has a bug which always expects that "ranges" DT property
* contains exactly 2 ranges with 3 (child) address cells, 2 (parent) address cells and
* 2 size cells and also expects that the second range starts at 16 MB offset. If these
* 2 size cells and also expects that the second range starts at 16 MB offset. Also it
* expects that first range uses same address for PCI (child) and CPU (parent) cells (so
* no remapping) and that this address is the lowest from all specified ranges. If these
* conditions are not met then U-Boot crashes during loading kernel DTB file. PCIe address
* space is 128 MB long, so the best split between MEM and IO is to use fixed 16 MB window
* for IO and the rest 112 MB (64+32+16) for MEM, despite that maximal IO size is just 64 kB.
@@ -147,6 +150,9 @@
* https://source.denx.de/u-boot/u-boot/-/commit/cb2ddb291ee6fcbddd6d8f4ff49089dfe580f5d7
* https://source.denx.de/u-boot/u-boot/-/commit/c64ac3b3185aeb3846297ad7391fc6df8ecd73bf
* https://source.denx.de/u-boot/u-boot/-/commit/4a82fca8e330157081fc132a591ebd99ba02ee33
* Bug related to requirement of same child and parent addresses for first range is fixed
* in U-Boot version 2022.04 by following commit:
* https://source.denx.de/u-boot/u-boot/-/commit/1fd54253bca7d43d046bba4853fe5fafd034bc17
*/
#address-cells = <3>;
#size-cells = <2>;

View File

@@ -499,7 +499,7 @@
* (totaling 127 MiB) for MEM.
*/
ranges = <0x82000000 0 0xe8000000 0 0xe8000000 0 0x07f00000 /* Port 0 MEM */
0x81000000 0 0xefff0000 0 0xefff0000 0 0x00010000>; /* Port 0 IO */
0x81000000 0 0x00000000 0 0xefff0000 0 0x00010000>; /* Port 0 IO */
interrupt-map-mask = <0 0 0 7>;
interrupt-map = <0 0 0 1 &pcie_intc 0>,
<0 0 0 2 &pcie_intc 1>,

View File

@@ -1584,7 +1584,7 @@
#iommu-cells = <1>;
nvidia,memory-controller = <&mc>;
status = "okay";
status = "disabled";
};
smmu: iommu@12000000 {

View File

@@ -807,3 +807,8 @@
qcom,snoc-host-cap-8bit-quirk;
};
&crypto {
/* FIXME: qce_start triggers an SError */
status= "disable";
};

View File

@@ -35,6 +35,24 @@
clock-frequency = <32000>;
#clock-cells = <0>;
};
ufs_phy_rx_symbol_0_clk: ufs-phy-rx-symbol-0 {
compatible = "fixed-clock";
clock-frequency = <1000>;
#clock-cells = <0>;
};
ufs_phy_rx_symbol_1_clk: ufs-phy-rx-symbol-1 {
compatible = "fixed-clock";
clock-frequency = <1000>;
#clock-cells = <0>;
};
ufs_phy_tx_symbol_0_clk: ufs-phy-tx-symbol-0 {
compatible = "fixed-clock";
clock-frequency = <1000>;
#clock-cells = <0>;
};
};
cpus {
@@ -603,9 +621,9 @@
<0>,
<0>,
<0>,
<0>,
<0>,
<0>,
<&ufs_phy_rx_symbol_0_clk>,
<&ufs_phy_rx_symbol_1_clk>,
<&ufs_phy_tx_symbol_0_clk>,
<0>,
<0>;
};
@@ -1923,8 +1941,8 @@
<75000000 300000000>,
<0 0>,
<0 0>,
<75000000 300000000>,
<75000000 300000000>;
<0 0>,
<0 0>;
status = "disabled";
};

View File

@@ -726,7 +726,7 @@
compatible = "qcom,sm8450-smmu-500", "arm,mmu-500";
reg = <0 0x15000000 0 0x100000>;
#iommu-cells = <2>;
#global-interrupts = <2>;
#global-interrupts = <1>;
interrupts = <GIC_SPI 65 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 97 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 98 IRQ_TYPE_LEVEL_HIGH>,
@@ -813,6 +813,7 @@
<GIC_SPI 412 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 421 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 707 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 423 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 424 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 425 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 690 IRQ_TYPE_LEVEL_HIGH>,
@@ -1072,9 +1073,10 @@
<&gcc GCC_USB30_PRIM_MASTER_CLK>,
<&gcc GCC_AGGRE_USB3_PRIM_AXI_CLK>,
<&gcc GCC_USB30_PRIM_MOCK_UTMI_CLK>,
<&gcc GCC_USB30_PRIM_SLEEP_CLK>;
<&gcc GCC_USB30_PRIM_SLEEP_CLK>,
<&gcc GCC_USB3_0_CLKREF_EN>;
clock-names = "cfg_noc", "core", "iface", "mock_utmi",
"sleep";
"sleep", "xo";
assigned-clocks = <&gcc GCC_USB30_PRIM_MOCK_UTMI_CLK>,
<&gcc GCC_USB30_PRIM_MASTER_CLK>;

View File

@@ -711,7 +711,7 @@
clock-names = "pclk", "timer";
};
dmac: dmac@ff240000 {
dmac: dma-controller@ff240000 {
compatible = "arm,pl330", "arm,primecell";
reg = <0x0 0xff240000 0x0 0x4000>;
interrupts = <GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>,

View File

@@ -489,7 +489,7 @@
status = "disabled";
};
dmac: dmac@ff1f0000 {
dmac: dma-controller@ff1f0000 {
compatible = "arm,pl330", "arm,primecell";
reg = <0x0 0xff1f0000 0x0 0x4000>;
interrupts = <GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,

View File

@@ -286,7 +286,7 @@
sound: sound {
compatible = "rockchip,rk3399-gru-sound";
rockchip,cpu = <&i2s0 &i2s2>;
rockchip,cpu = <&i2s0 &spdif>;
};
};
@@ -437,10 +437,6 @@ ap_i2c_audio: &i2c8 {
status = "okay";
};
&i2s2 {
status = "okay";
};
&io_domains {
status = "okay";
@@ -537,6 +533,17 @@ ap_i2c_audio: &i2c8 {
vqmmc-supply = <&ppvar_sd_card_io>;
};
&spdif {
status = "okay";
/*
* SPDIF is routed internally to DP; we either don't use these pins, or
* mux them to something else.
*/
/delete-property/ pinctrl-0;
/delete-property/ pinctrl-names;
};
&spi1 {
status = "okay";

View File

@@ -232,6 +232,7 @@
&usbdrd_dwc3_0 {
dr_mode = "otg";
extcon = <&extcon_usb3>;
status = "okay";
};

View File

@@ -25,6 +25,13 @@
};
};
extcon_usb3: extcon-usb3 {
compatible = "linux,extcon-usb-gpio";
id-gpio = <&gpio1 RK_PC2 GPIO_ACTIVE_HIGH>;
pinctrl-names = "default";
pinctrl-0 = <&usb3_id>;
};
clkin_gmac: external-gmac-clock {
compatible = "fixed-clock";
clock-frequency = <125000000>;
@@ -422,9 +429,22 @@
<4 RK_PA3 RK_FUNC_GPIO &pcfg_pull_none>;
};
};
usb3 {
usb3_id: usb3-id {
rockchip,pins =
<1 RK_PC2 RK_FUNC_GPIO &pcfg_pull_none>;
};
};
};
&sdhci {
/*
* Signal integrity isn't great at 200MHz but 100MHz has proven stable
* enough.
*/
max-frequency = <100000000>;
bus-width = <8>;
mmc-hs400-1_8v;
mmc-hs400-enhanced-strobe;

View File

@@ -1881,10 +1881,10 @@
interrupts = <GIC_SPI 23 IRQ_TYPE_LEVEL_HIGH 0>;
clocks = <&cru PCLK_HDMI_CTRL>,
<&cru SCLK_HDMI_SFR>,
<&cru PLL_VPLL>,
<&cru SCLK_HDMI_CEC>,
<&cru PCLK_VIO_GRF>,
<&cru SCLK_HDMI_CEC>;
clock-names = "iahb", "isfr", "vpll", "grf", "cec";
<&cru PLL_VPLL>;
clock-names = "iahb", "isfr", "cec", "grf", "vpll";
power-domains = <&power RK3399_PD_HDCP>;
reg-io-width = <4>;
rockchip,grf = <&grf>;

View File

@@ -285,8 +285,6 @@
vcc_ddr: DCDC_REG3 {
regulator-always-on;
regulator-boot-on;
regulator-min-microvolt = <1100000>;
regulator-max-microvolt = <1100000>;
regulator-initial-mode = <0x2>;
regulator-name = "vcc_ddr";
regulator-state-mem {

View File

@@ -32,13 +32,11 @@
clocks = <&cru SCLK_GMAC0>, <&cru SCLK_GMAC0_RX_TX>,
<&cru SCLK_GMAC0_RX_TX>, <&cru CLK_MAC0_REFOUT>,
<&cru ACLK_GMAC0>, <&cru PCLK_GMAC0>,
<&cru SCLK_GMAC0_RX_TX>, <&cru CLK_GMAC0_PTP_REF>,
<&cru PCLK_XPCS>;
<&cru SCLK_GMAC0_RX_TX>, <&cru CLK_GMAC0_PTP_REF>;
clock-names = "stmmaceth", "mac_clk_rx",
"mac_clk_tx", "clk_mac_refout",
"aclk_mac", "pclk_mac",
"clk_mac_speed", "ptp_ref",
"pclk_xpcs";
"clk_mac_speed", "ptp_ref";
resets = <&cru SRST_A_GMAC0>;
reset-names = "stmmaceth";
rockchip,grf = <&grf>;

View File

@@ -651,7 +651,7 @@
status = "disabled";
};
dmac0: dmac@fe530000 {
dmac0: dma-controller@fe530000 {
compatible = "arm,pl330", "arm,primecell";
reg = <0x0 0xfe530000 0x0 0x4000>;
interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>,
@@ -662,7 +662,7 @@
#dma-cells = <1>;
};
dmac1: dmac@fe550000 {
dmac1: dma-controller@fe550000 {
compatible = "arm,pl330", "arm,primecell";
reg = <0x0 0xfe550000 0x0 0x4000>;
interrupts = <GIC_SPI 16 IRQ_TYPE_LEVEL_HIGH>,

View File

@@ -108,6 +108,13 @@
hint #20
.endm
/*
* Clear Branch History instruction
*/
.macro clearbhb
hint #22
.endm
/*
* Speculation barrier
*/
@@ -850,4 +857,50 @@ alternative_endif
#endif /* GNU_PROPERTY_AARCH64_FEATURE_1_DEFAULT */
.macro __mitigate_spectre_bhb_loop tmp
#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
alternative_cb spectre_bhb_patch_loop_iter
mov \tmp, #32 // Patched to correct the immediate
alternative_cb_end
.Lspectre_bhb_loop\@:
b . + 4
subs \tmp, \tmp, #1
b.ne .Lspectre_bhb_loop\@
sb
#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
.endm
.macro mitigate_spectre_bhb_loop tmp
#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
alternative_cb spectre_bhb_patch_loop_mitigation_enable
b .L_spectre_bhb_loop_done\@ // Patched to NOP
alternative_cb_end
__mitigate_spectre_bhb_loop \tmp
.L_spectre_bhb_loop_done\@:
#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
.endm
/* Save/restores x0-x3 to the stack */
.macro __mitigate_spectre_bhb_fw
#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
stp x0, x1, [sp, #-16]!
stp x2, x3, [sp, #-16]!
mov w0, #ARM_SMCCC_ARCH_WORKAROUND_3
alternative_cb smccc_patch_fw_mitigation_conduit
nop // Patched to SMC/HVC #0
alternative_cb_end
ldp x2, x3, [sp], #16
ldp x0, x1, [sp], #16
#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
.endm
.macro mitigate_spectre_bhb_clear_insn
#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
alternative_cb spectre_bhb_patch_clearbhb
/* Patched to NOP when not supported */
clearbhb
isb
alternative_cb_end
#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
.endm
#endif /* __ASM_ASSEMBLER_H */

View File

@@ -637,6 +637,35 @@ static inline bool cpu_supports_mixed_endian_el0(void)
return id_aa64mmfr0_mixed_endian_el0(read_cpuid(ID_AA64MMFR0_EL1));
}
static inline bool supports_csv2p3(int scope)
{
u64 pfr0;
u8 csv2_val;
if (scope == SCOPE_LOCAL_CPU)
pfr0 = read_sysreg_s(SYS_ID_AA64PFR0_EL1);
else
pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
csv2_val = cpuid_feature_extract_unsigned_field(pfr0,
ID_AA64PFR0_CSV2_SHIFT);
return csv2_val == 3;
}
static inline bool supports_clearbhb(int scope)
{
u64 isar2;
if (scope == SCOPE_LOCAL_CPU)
isar2 = read_sysreg_s(SYS_ID_AA64ISAR2_EL1);
else
isar2 = read_sanitised_ftr_reg(SYS_ID_AA64ISAR2_EL1);
return cpuid_feature_extract_unsigned_field(isar2,
ID_AA64ISAR2_CLEARBHB_SHIFT);
}
const struct cpumask *system_32bit_el0_cpumask(void);
DECLARE_STATIC_KEY_FALSE(arm64_mismatched_32bit_el0);

View File

@@ -73,10 +73,14 @@
#define ARM_CPU_PART_CORTEX_A76 0xD0B
#define ARM_CPU_PART_NEOVERSE_N1 0xD0C
#define ARM_CPU_PART_CORTEX_A77 0xD0D
#define ARM_CPU_PART_NEOVERSE_V1 0xD40
#define ARM_CPU_PART_CORTEX_A78 0xD41
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A510 0xD46
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_X2 0xD48
#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define ARM_CPU_PART_CORTEX_A78C 0xD4B
#define APM_CPU_PART_POTENZA 0x000
@@ -117,10 +121,14 @@
#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
#define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
#define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
#define MIDR_NEOVERSE_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V1)
#define MIDR_CORTEX_A78 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
#define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C)
#define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
#define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
#define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)

View File

@@ -62,9 +62,11 @@ enum fixed_addresses {
#endif /* CONFIG_ACPI_APEI_GHES */
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
FIX_ENTRY_TRAMP_TEXT3,
FIX_ENTRY_TRAMP_TEXT2,
FIX_ENTRY_TRAMP_TEXT1,
FIX_ENTRY_TRAMP_DATA,
FIX_ENTRY_TRAMP_TEXT,
#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT))
#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT1))
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
__end_of_permanent_fixed_addresses,

View File

@@ -65,6 +65,7 @@ enum aarch64_insn_hint_cr_op {
AARCH64_INSN_HINT_PSB = 0x11 << 5,
AARCH64_INSN_HINT_TSB = 0x12 << 5,
AARCH64_INSN_HINT_CSDB = 0x14 << 5,
AARCH64_INSN_HINT_CLEARBHB = 0x16 << 5,
AARCH64_INSN_HINT_BTI = 0x20 << 5,
AARCH64_INSN_HINT_BTIC = 0x22 << 5,

View File

@@ -714,6 +714,11 @@ static inline void kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt)
ctxt_sys_reg(cpu_ctxt, MPIDR_EL1) = read_cpuid_mpidr();
}
static inline bool kvm_system_needs_idmapped_vectors(void)
{
return cpus_have_const_cap(ARM64_SPECTRE_V3A);
}
void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
static inline void kvm_arch_hardware_unsetup(void) {}

View File

@@ -5,6 +5,7 @@
#ifndef __ASM_MTE_KASAN_H
#define __ASM_MTE_KASAN_H
#include <asm/compiler.h>
#include <asm/mte-def.h>
#ifndef __ASSEMBLY__

View File

@@ -92,7 +92,7 @@ extern bool arm64_use_ng_mappings;
#define __P001 PAGE_READONLY
#define __P010 PAGE_READONLY
#define __P011 PAGE_READONLY
#define __P100 PAGE_EXECONLY
#define __P100 PAGE_READONLY_EXEC /* PAGE_EXECONLY if Enhanced PAN */
#define __P101 PAGE_READONLY_EXEC
#define __P110 PAGE_READONLY_EXEC
#define __P111 PAGE_READONLY_EXEC
@@ -101,7 +101,7 @@ extern bool arm64_use_ng_mappings;
#define __S001 PAGE_READONLY
#define __S010 PAGE_SHARED
#define __S011 PAGE_SHARED
#define __S100 PAGE_EXECONLY
#define __S100 PAGE_READONLY_EXEC /* PAGE_EXECONLY if Enhanced PAN */
#define __S101 PAGE_READONLY_EXEC
#define __S110 PAGE_SHARED_EXEC
#define __S111 PAGE_SHARED_EXEC

View File

@@ -1017,17 +1017,6 @@ static inline bool arch_wants_old_prefaulted_pte(void)
}
#define arch_wants_old_prefaulted_pte arch_wants_old_prefaulted_pte
static inline pgprot_t arch_filter_pgprot(pgprot_t prot)
{
if (cpus_have_const_cap(ARM64_HAS_EPAN))
return prot;
if (pgprot_val(prot) != pgprot_val(PAGE_EXECONLY))
return prot;
return PAGE_READONLY_EXEC;
}
static inline bool pud_sect_supported(void)
{
return PAGE_SIZE == SZ_4K;

View File

@@ -5,7 +5,7 @@
#ifndef __ASM_RWONCE_H
#define __ASM_RWONCE_H
#ifdef CONFIG_LTO
#if defined(CONFIG_LTO) && !defined(__ASSEMBLY__)
#include <linux/compiler_types.h>
#include <asm/alternative-macros.h>
@@ -66,7 +66,7 @@
})
#endif /* !BUILD_VDSO */
#endif /* CONFIG_LTO */
#endif /* CONFIG_LTO && !__ASSEMBLY__ */
#include <asm-generic/rwonce.h>

View File

@@ -23,4 +23,9 @@ extern char __mmuoff_data_start[], __mmuoff_data_end[];
extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
extern char __relocate_new_kernel_start[], __relocate_new_kernel_end[];
static inline size_t entry_tramp_text_size(void)
{
return __entry_tramp_text_end - __entry_tramp_text_start;
}
#endif /* __ASM_SECTIONS_H */

View File

@@ -93,5 +93,9 @@ void spectre_v4_enable_task_mitigation(struct task_struct *tsk);
enum mitigation_state arm64_get_meltdown_state(void);
enum mitigation_state arm64_get_spectre_bhb_state(void);
bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, int scope);
u8 spectre_bhb_loop_affected(int scope);
void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *__unused);
#endif /* __ASSEMBLY__ */
#endif /* __ASM_SPECTRE_H */

View File

@@ -773,6 +773,7 @@
#define ID_AA64ISAR1_GPI_IMP_DEF 0x1
/* id_aa64isar2 */
#define ID_AA64ISAR2_CLEARBHB_SHIFT 28
#define ID_AA64ISAR2_RPRES_SHIFT 4
#define ID_AA64ISAR2_WFXT_SHIFT 0
@@ -904,6 +905,7 @@
#endif
/* id_aa64mmfr1 */
#define ID_AA64MMFR1_ECBHB_SHIFT 60
#define ID_AA64MMFR1_AFP_SHIFT 44
#define ID_AA64MMFR1_ETS_SHIFT 36
#define ID_AA64MMFR1_TWED_SHIFT 32

View File

@@ -0,0 +1,73 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2022 ARM Ltd.
*/
#ifndef __ASM_VECTORS_H
#define __ASM_VECTORS_H
#include <linux/bug.h>
#include <linux/percpu.h>
#include <asm/fixmap.h>
extern char vectors[];
extern char tramp_vectors[];
extern char __bp_harden_el1_vectors[];
/*
* Note: the order of this enum corresponds to two arrays in entry.S:
* tramp_vecs and __bp_harden_el1_vectors. By default the canonical
* 'full fat' vectors are used directly.
*/
enum arm64_bp_harden_el1_vectors {
#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
/*
* Perform the BHB loop mitigation, before branching to the canonical
* vectors.
*/
EL1_VECTOR_BHB_LOOP,
/*
* Make the SMC call for firmware mitigation, before branching to the
* canonical vectors.
*/
EL1_VECTOR_BHB_FW,
/*
* Use the ClearBHB instruction, before branching to the canonical
* vectors.
*/
EL1_VECTOR_BHB_CLEAR_INSN,
#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
/*
* Remap the kernel before branching to the canonical vectors.
*/
EL1_VECTOR_KPTI,
};
#ifndef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
#define EL1_VECTOR_BHB_LOOP -1
#define EL1_VECTOR_BHB_FW -1
#define EL1_VECTOR_BHB_CLEAR_INSN -1
#endif /* !CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
/* The vectors to use on return from EL0. e.g. to remap the kernel */
DECLARE_PER_CPU_READ_MOSTLY(const char *, this_cpu_vector);
#ifndef CONFIG_UNMAP_KERNEL_AT_EL0
#define TRAMP_VALIAS 0ul
#endif
static inline const char *
arm64_get_bp_hardening_vector(enum arm64_bp_harden_el1_vectors slot)
{
if (arm64_kernel_unmapped_at_el0())
return (char *)(TRAMP_VALIAS + SZ_2K * slot);
WARN_ON_ONCE(slot == EL1_VECTOR_KPTI);
return __bp_harden_el1_vectors + SZ_2K * slot;
}
#endif /* __ASM_VECTORS_H */

View File

@@ -281,6 +281,11 @@ struct kvm_arm_copy_mte_tags {
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED 3
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED (1U << 4)
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3 KVM_REG_ARM_FW_REG(3)
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL 0
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL 1
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED 2
/* SVE registers */
#define KVM_REG_ARM64_SVE (0x15 << KVM_REG_ARM_COPROC_SHIFT)

View File

@@ -502,6 +502,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
.matches = has_spectre_v4,
.cpu_enable = spectre_v4_enable_mitigation,
},
{
.desc = "Spectre-BHB",
.capability = ARM64_SPECTRE_BHB,
.type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
.matches = is_spectre_bhb_affected,
.cpu_enable = spectre_bhb_enable_mitigation,
},
#ifdef CONFIG_ARM64_ERRATUM_1418040
{
.desc = "ARM erratum 1418040",
@@ -604,7 +611,6 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
{
.desc = "ARM erratum 2077057",
.capability = ARM64_WORKAROUND_2077057,
.type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2),
},
#endif

View File

@@ -73,6 +73,8 @@
#include <linux/mm.h>
#include <linux/cpu.h>
#include <linux/kasan.h>
#include <linux/percpu.h>
#include <asm/cpu.h>
#include <asm/cpufeature.h>
#include <asm/cpu_ops.h>
@@ -85,6 +87,7 @@
#include <asm/smp.h>
#include <asm/sysreg.h>
#include <asm/traps.h>
#include <asm/vectors.h>
#include <asm/virt.h>
/* Kernel representation of AT_HWCAP and AT_HWCAP2 */
@@ -110,6 +113,8 @@ DECLARE_BITMAP(boot_capabilities, ARM64_NPATCHABLE);
bool arm64_use_ng_mappings = false;
EXPORT_SYMBOL(arm64_use_ng_mappings);
DEFINE_PER_CPU_READ_MOSTLY(const char *, this_cpu_vector) = vectors;
/*
* Permit PER_LINUX32 and execve() of 32-bit binaries even if not all CPUs
* support it?
@@ -226,6 +231,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
};
static const struct arm64_ftr_bits ftr_id_aa64isar2[] = {
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_CLEARBHB_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_RPRES_SHIFT, 4, 0),
ARM64_FTR_END,
};
@@ -1590,6 +1596,12 @@ kpti_install_ng_mappings(const struct arm64_cpu_capabilities *__unused)
int cpu = smp_processor_id();
if (__this_cpu_read(this_cpu_vector) == vectors) {
const char *v = arm64_get_bp_hardening_vector(EL1_VECTOR_KPTI);
__this_cpu_write(this_cpu_vector, v);
}
/*
* We don't need to rewrite the page-tables if either we've done
* it already or we have KASLR enabled and therefore have not

View File

@@ -37,18 +37,21 @@
.macro kernel_ventry, el:req, ht:req, regsize:req, label:req
.align 7
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
.Lventry_start\@:
.if \el == 0
alternative_if ARM64_UNMAP_KERNEL_AT_EL0
/*
* This must be the first instruction of the EL0 vector entries. It is
* skipped by the trampoline vectors, to trigger the cleanup.
*/
b .Lskip_tramp_vectors_cleanup\@
.if \regsize == 64
mrs x30, tpidrro_el0
msr tpidrro_el0, xzr
.else
mov x30, xzr
.endif
alternative_else_nop_endif
.Lskip_tramp_vectors_cleanup\@:
.endif
#endif
sub sp, sp, #PT_REGS_SIZE
#ifdef CONFIG_VMAP_STACK
@@ -95,11 +98,15 @@ alternative_else_nop_endif
mrs x0, tpidrro_el0
#endif
b el\el\ht\()_\regsize\()_\label
.org .Lventry_start\@ + 128 // Did we overflow the ventry slot?
.endm
.macro tramp_alias, dst, sym
.macro tramp_alias, dst, sym, tmp
mov_q \dst, TRAMP_VALIAS
add \dst, \dst, #(\sym - .entry.tramp.text)
adr_l \tmp, \sym
add \dst, \dst, \tmp
adr_l \tmp, .entry.tramp.text
sub \dst, \dst, \tmp
.endm
/*
@@ -116,7 +123,7 @@ alternative_cb_end
tbnz \tmp2, #TIF_SSBD, .L__asm_ssbd_skip\@
mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2
mov w1, #\state
alternative_cb spectre_v4_patch_fw_mitigation_conduit
alternative_cb smccc_patch_fw_mitigation_conduit
nop // Patched to SMC/HVC #0
alternative_cb_end
.L__asm_ssbd_skip\@:
@@ -413,21 +420,26 @@ alternative_else_nop_endif
ldp x24, x25, [sp, #16 * 12]
ldp x26, x27, [sp, #16 * 13]
ldp x28, x29, [sp, #16 * 14]
ldr lr, [sp, #S_LR]
add sp, sp, #PT_REGS_SIZE // restore sp
.if \el == 0
alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
ldr lr, [sp, #S_LR]
add sp, sp, #PT_REGS_SIZE // restore sp
eret
alternative_else_nop_endif
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
bne 4f
msr far_el1, x30
tramp_alias x30, tramp_exit_native
msr far_el1, x29
tramp_alias x30, tramp_exit_native, x29
br x30
4:
tramp_alias x30, tramp_exit_compat
tramp_alias x30, tramp_exit_compat, x29
br x30
#endif
.else
ldr lr, [sp, #S_LR]
add sp, sp, #PT_REGS_SIZE // restore sp
/* Ensure any device/NC reads complete */
alternative_insn nop, "dmb sy", ARM64_WORKAROUND_1508412
@@ -594,12 +606,6 @@ SYM_CODE_END(ret_to_user)
.popsection // .entry.text
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
/*
* Exception vectors trampoline.
*/
.pushsection ".entry.tramp.text", "ax"
// Move from tramp_pg_dir to swapper_pg_dir
.macro tramp_map_kernel, tmp
mrs \tmp, ttbr1_el1
@@ -633,12 +639,47 @@ alternative_else_nop_endif
*/
.endm
.macro tramp_ventry, regsize = 64
.macro tramp_data_page dst
adr_l \dst, .entry.tramp.text
sub \dst, \dst, PAGE_SIZE
.endm
.macro tramp_data_read_var dst, var
#ifdef CONFIG_RANDOMIZE_BASE
tramp_data_page \dst
add \dst, \dst, #:lo12:__entry_tramp_data_\var
ldr \dst, [\dst]
#else
ldr \dst, =\var
#endif
.endm
#define BHB_MITIGATION_NONE 0
#define BHB_MITIGATION_LOOP 1
#define BHB_MITIGATION_FW 2
#define BHB_MITIGATION_INSN 3
.macro tramp_ventry, vector_start, regsize, kpti, bhb
.align 7
1:
.if \regsize == 64
msr tpidrro_el0, x30 // Restored in kernel_ventry
.endif
.if \bhb == BHB_MITIGATION_LOOP
/*
* This sequence must appear before the first indirect branch. i.e. the
* ret out of tramp_ventry. It appears here because x30 is free.
*/
__mitigate_spectre_bhb_loop x30
.endif // \bhb == BHB_MITIGATION_LOOP
.if \bhb == BHB_MITIGATION_INSN
clearbhb
isb
.endif // \bhb == BHB_MITIGATION_INSN
.if \kpti == 1
/*
* Defend against branch aliasing attacks by pushing a dummy
* entry onto the return stack and using a RET instruction to
@@ -648,46 +689,75 @@ alternative_else_nop_endif
b .
2:
tramp_map_kernel x30
#ifdef CONFIG_RANDOMIZE_BASE
adr x30, tramp_vectors + PAGE_SIZE
alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
ldr x30, [x30]
#else
ldr x30, =vectors
#endif
tramp_data_read_var x30, vectors
alternative_if_not ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM
prfm plil1strm, [x30, #(1b - tramp_vectors)]
prfm plil1strm, [x30, #(1b - \vector_start)]
alternative_else_nop_endif
msr vbar_el1, x30
add x30, x30, #(1b - tramp_vectors)
isb
.else
ldr x30, =vectors
.endif // \kpti == 1
.if \bhb == BHB_MITIGATION_FW
/*
* The firmware sequence must appear before the first indirect branch.
* i.e. the ret out of tramp_ventry. But it also needs the stack to be
* mapped to save/restore the registers the SMC clobbers.
*/
__mitigate_spectre_bhb_fw
.endif // \bhb == BHB_MITIGATION_FW
add x30, x30, #(1b - \vector_start + 4)
ret
.org 1b + 128 // Did we overflow the ventry slot?
.endm
.macro tramp_exit, regsize = 64
adr x30, tramp_vectors
tramp_data_read_var x30, this_cpu_vector
get_this_cpu_offset x29
ldr x30, [x30, x29]
msr vbar_el1, x30
tramp_unmap_kernel x30
ldr lr, [sp, #S_LR]
tramp_unmap_kernel x29
.if \regsize == 64
mrs x30, far_el1
mrs x29, far_el1
.endif
add sp, sp, #PT_REGS_SIZE // restore sp
eret
sb
.endm
.align 11
SYM_CODE_START_NOALIGN(tramp_vectors)
.macro generate_tramp_vector, kpti, bhb
.Lvector_start\@:
.space 0x400
tramp_ventry
tramp_ventry
tramp_ventry
tramp_ventry
.rept 4
tramp_ventry .Lvector_start\@, 64, \kpti, \bhb
.endr
.rept 4
tramp_ventry .Lvector_start\@, 32, \kpti, \bhb
.endr
.endm
tramp_ventry 32
tramp_ventry 32
tramp_ventry 32
tramp_ventry 32
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
/*
* Exception vectors trampoline.
* The order must match __bp_harden_el1_vectors and the
* arm64_bp_harden_el1_vectors enum.
*/
.pushsection ".entry.tramp.text", "ax"
.align 11
SYM_CODE_START_NOALIGN(tramp_vectors)
#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_LOOP
generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_FW
generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_INSN
#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
generate_tramp_vector kpti=1, bhb=BHB_MITIGATION_NONE
SYM_CODE_END(tramp_vectors)
SYM_CODE_START(tramp_exit_native)
@@ -704,12 +774,56 @@ SYM_CODE_END(tramp_exit_compat)
.pushsection ".rodata", "a"
.align PAGE_SHIFT
SYM_DATA_START(__entry_tramp_data_start)
__entry_tramp_data_vectors:
.quad vectors
#ifdef CONFIG_ARM_SDE_INTERFACE
__entry_tramp_data___sdei_asm_handler:
.quad __sdei_asm_handler
#endif /* CONFIG_ARM_SDE_INTERFACE */
__entry_tramp_data_this_cpu_vector:
.quad this_cpu_vector
SYM_DATA_END(__entry_tramp_data_start)
.popsection // .rodata
#endif /* CONFIG_RANDOMIZE_BASE */
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
/*
* Exception vectors for spectre mitigations on entry from EL1 when
* kpti is not in use.
*/
.macro generate_el1_vector, bhb
.Lvector_start\@:
kernel_ventry 1, t, 64, sync // Synchronous EL1t
kernel_ventry 1, t, 64, irq // IRQ EL1t
kernel_ventry 1, t, 64, fiq // FIQ EL1h
kernel_ventry 1, t, 64, error // Error EL1t
kernel_ventry 1, h, 64, sync // Synchronous EL1h
kernel_ventry 1, h, 64, irq // IRQ EL1h
kernel_ventry 1, h, 64, fiq // FIQ EL1h
kernel_ventry 1, h, 64, error // Error EL1h
.rept 4
tramp_ventry .Lvector_start\@, 64, 0, \bhb
.endr
.rept 4
tramp_ventry .Lvector_start\@, 32, 0, \bhb
.endr
.endm
/* The order must match tramp_vecs and the arm64_bp_harden_el1_vectors enum. */
.pushsection ".entry.text", "ax"
.align 11
SYM_CODE_START(__bp_harden_el1_vectors)
#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
generate_el1_vector bhb=BHB_MITIGATION_LOOP
generate_el1_vector bhb=BHB_MITIGATION_FW
generate_el1_vector bhb=BHB_MITIGATION_INSN
#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
SYM_CODE_END(__bp_harden_el1_vectors)
.popsection
/*
* Register switch for AArch64. The callee-saved registers need to be saved
* and restored. On entry:
@@ -835,14 +949,7 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
* Remember whether to unmap the kernel on exit.
*/
1: str x4, [x1, #(SDEI_EVENT_INTREGS + S_SDEI_TTBR1)]
#ifdef CONFIG_RANDOMIZE_BASE
adr x4, tramp_vectors + PAGE_SIZE
add x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
ldr x4, [x4]
#else
ldr x4, =__sdei_asm_handler
#endif
tramp_data_read_var x4, __sdei_asm_handler
br x4
SYM_CODE_END(__sdei_asm_entry_trampoline)
NOKPROBE(__sdei_asm_entry_trampoline)
@@ -865,13 +972,6 @@ SYM_CODE_END(__sdei_asm_exit_trampoline)
NOKPROBE(__sdei_asm_exit_trampoline)
.ltorg
.popsection // .entry.tramp.text
#ifdef CONFIG_RANDOMIZE_BASE
.pushsection ".rodata", "a"
SYM_DATA_START(__sdei_asm_trampoline_next_handler)
.quad __sdei_asm_handler
SYM_DATA_END(__sdei_asm_trampoline_next_handler)
.popsection // .rodata
#endif /* CONFIG_RANDOMIZE_BASE */
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
/*
@@ -981,7 +1081,7 @@ alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
alternative_else_nop_endif
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
tramp_alias dst=x5, sym=__sdei_asm_exit_trampoline
tramp_alias dst=x5, sym=__sdei_asm_exit_trampoline, tmp=x3
br x5
#endif
SYM_CODE_END(__sdei_asm_handler)

View File

@@ -66,6 +66,10 @@ KVM_NVHE_ALIAS(kvm_patch_vector_branch);
KVM_NVHE_ALIAS(kvm_update_va_mask);
KVM_NVHE_ALIAS(kvm_get_kimage_voffset);
KVM_NVHE_ALIAS(kvm_compute_final_ctr_el0);
KVM_NVHE_ALIAS(spectre_bhb_patch_loop_iter);
KVM_NVHE_ALIAS(spectre_bhb_patch_loop_mitigation_enable);
KVM_NVHE_ALIAS(spectre_bhb_patch_wa3);
KVM_NVHE_ALIAS(spectre_bhb_patch_clearbhb);
/* Global kernel state accessed by nVHE hyp code. */
KVM_NVHE_ALIAS(kvm_vgic_global_state);

View File

@@ -18,15 +18,18 @@
*/
#include <linux/arm-smccc.h>
#include <linux/bpf.h>
#include <linux/cpu.h>
#include <linux/device.h>
#include <linux/nospec.h>
#include <linux/prctl.h>
#include <linux/sched/task_stack.h>
#include <asm/debug-monitors.h>
#include <asm/insn.h>
#include <asm/spectre.h>
#include <asm/traps.h>
#include <asm/vectors.h>
#include <asm/virt.h>
/*
@@ -96,14 +99,51 @@ static bool spectre_v2_mitigations_off(void)
return ret;
}
static const char *get_bhb_affected_string(enum mitigation_state bhb_state)
{
switch (bhb_state) {
case SPECTRE_UNAFFECTED:
return "";
default:
case SPECTRE_VULNERABLE:
return ", but not BHB";
case SPECTRE_MITIGATED:
return ", BHB";
}
}
static bool _unprivileged_ebpf_enabled(void)
{
#ifdef CONFIG_BPF_SYSCALL
return !sysctl_unprivileged_bpf_disabled;
#else
return false;
#endif
}
ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr,
char *buf)
{
enum mitigation_state bhb_state = arm64_get_spectre_bhb_state();
const char *bhb_str = get_bhb_affected_string(bhb_state);
const char *v2_str = "Branch predictor hardening";
switch (spectre_v2_state) {
case SPECTRE_UNAFFECTED:
return sprintf(buf, "Not affected\n");
if (bhb_state == SPECTRE_UNAFFECTED)
return sprintf(buf, "Not affected\n");
/*
* Platforms affected by Spectre-BHB can't report
* "Not affected" for Spectre-v2.
*/
v2_str = "CSV2";
fallthrough;
case SPECTRE_MITIGATED:
return sprintf(buf, "Mitigation: Branch predictor hardening\n");
if (bhb_state == SPECTRE_MITIGATED && _unprivileged_ebpf_enabled())
return sprintf(buf, "Vulnerable: Unprivileged eBPF enabled\n");
return sprintf(buf, "Mitigation: %s%s\n", v2_str, bhb_str);
case SPECTRE_VULNERABLE:
fallthrough;
default:
@@ -554,9 +594,9 @@ void __init spectre_v4_patch_fw_mitigation_enable(struct alt_instr *alt,
* Patch a NOP in the Spectre-v4 mitigation code with an SMC/HVC instruction
* to call into firmware to adjust the mitigation state.
*/
void __init spectre_v4_patch_fw_mitigation_conduit(struct alt_instr *alt,
__le32 *origptr,
__le32 *updptr, int nr_inst)
void __init smccc_patch_fw_mitigation_conduit(struct alt_instr *alt,
__le32 *origptr,
__le32 *updptr, int nr_inst)
{
u32 insn;
@@ -770,3 +810,344 @@ int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
return -ENODEV;
}
}
/*
* Spectre BHB.
*
* A CPU is either:
* - Mitigated by a branchy loop a CPU specific number of times, and listed
* in our "loop mitigated list".
* - Mitigated in software by the firmware Spectre v2 call.
* - Has the ClearBHB instruction to perform the mitigation.
* - Has the 'Exception Clears Branch History Buffer' (ECBHB) feature, so no
* software mitigation in the vectors is needed.
* - Has CSV2.3, so is unaffected.
*/
static enum mitigation_state spectre_bhb_state;
enum mitigation_state arm64_get_spectre_bhb_state(void)
{
return spectre_bhb_state;
}
enum bhb_mitigation_bits {
BHB_LOOP,
BHB_FW,
BHB_HW,
BHB_INSN,
};
static unsigned long system_bhb_mitigations;
/*
* This must be called with SCOPE_LOCAL_CPU for each type of CPU, before any
* SCOPE_SYSTEM call will give the right answer.
*/
u8 spectre_bhb_loop_affected(int scope)
{
u8 k = 0;
static u8 max_bhb_k;
if (scope == SCOPE_LOCAL_CPU) {
static const struct midr_range spectre_bhb_k32_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
{},
};
static const struct midr_range spectre_bhb_k24_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
{},
};
static const struct midr_range spectre_bhb_k8_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
{},
};
if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k32_list))
k = 32;
else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list))
k = 24;
else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
k = 8;
max_bhb_k = max(max_bhb_k, k);
} else {
k = max_bhb_k;
}
return k;
}
static enum mitigation_state spectre_bhb_get_cpu_fw_mitigation_state(void)
{
int ret;
struct arm_smccc_res res;
arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_ARCH_WORKAROUND_3, &res);
ret = res.a0;
switch (ret) {
case SMCCC_RET_SUCCESS:
return SPECTRE_MITIGATED;
case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
return SPECTRE_UNAFFECTED;
default:
fallthrough;
case SMCCC_RET_NOT_SUPPORTED:
return SPECTRE_VULNERABLE;
}
}
static bool is_spectre_bhb_fw_affected(int scope)
{
static bool system_affected;
enum mitigation_state fw_state;
bool has_smccc = arm_smccc_1_1_get_conduit() != SMCCC_CONDUIT_NONE;
static const struct midr_range spectre_bhb_firmware_mitigated_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
{},
};
bool cpu_in_list = is_midr_in_range_list(read_cpuid_id(),
spectre_bhb_firmware_mitigated_list);
if (scope != SCOPE_LOCAL_CPU)
return system_affected;
fw_state = spectre_bhb_get_cpu_fw_mitigation_state();
if (cpu_in_list || (has_smccc && fw_state == SPECTRE_MITIGATED)) {
system_affected = true;
return true;
}
return false;
}
static bool supports_ecbhb(int scope)
{
u64 mmfr1;
if (scope == SCOPE_LOCAL_CPU)
mmfr1 = read_sysreg_s(SYS_ID_AA64MMFR1_EL1);
else
mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
return cpuid_feature_extract_unsigned_field(mmfr1,
ID_AA64MMFR1_ECBHB_SHIFT);
}
bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
int scope)
{
WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
if (supports_csv2p3(scope))
return false;
if (supports_clearbhb(scope))
return true;
if (spectre_bhb_loop_affected(scope))
return true;
if (is_spectre_bhb_fw_affected(scope))
return true;
return false;
}
static void this_cpu_set_vectors(enum arm64_bp_harden_el1_vectors slot)
{
const char *v = arm64_get_bp_hardening_vector(slot);
if (slot < 0)
return;
__this_cpu_write(this_cpu_vector, v);
/*
* When KPTI is in use, the vectors are switched when exiting to
* user-space.
*/
if (arm64_kernel_unmapped_at_el0())
return;
write_sysreg(v, vbar_el1);
isb();
}
void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry)
{
bp_hardening_cb_t cpu_cb;
enum mitigation_state fw_state, state = SPECTRE_VULNERABLE;
struct bp_hardening_data *data = this_cpu_ptr(&bp_hardening_data);
if (!is_spectre_bhb_affected(entry, SCOPE_LOCAL_CPU))
return;
if (arm64_get_spectre_v2_state() == SPECTRE_VULNERABLE) {
/* No point mitigating Spectre-BHB alone. */
} else if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY)) {
pr_info_once("spectre-bhb mitigation disabled by compile time option\n");
} else if (cpu_mitigations_off()) {
pr_info_once("spectre-bhb mitigation disabled by command line option\n");
} else if (supports_ecbhb(SCOPE_LOCAL_CPU)) {
state = SPECTRE_MITIGATED;
set_bit(BHB_HW, &system_bhb_mitigations);
} else if (supports_clearbhb(SCOPE_LOCAL_CPU)) {
/*
* Ensure KVM uses the indirect vector which will have ClearBHB
* added.
*/
if (!data->slot)
data->slot = HYP_VECTOR_INDIRECT;
this_cpu_set_vectors(EL1_VECTOR_BHB_CLEAR_INSN);
state = SPECTRE_MITIGATED;
set_bit(BHB_INSN, &system_bhb_mitigations);
} else if (spectre_bhb_loop_affected(SCOPE_LOCAL_CPU)) {
/*
* Ensure KVM uses the indirect vector which will have the
* branchy-loop added. A57/A72-r0 will already have selected
* the spectre-indirect vector, which is sufficient for BHB
* too.
*/
if (!data->slot)
data->slot = HYP_VECTOR_INDIRECT;
this_cpu_set_vectors(EL1_VECTOR_BHB_LOOP);
state = SPECTRE_MITIGATED;
set_bit(BHB_LOOP, &system_bhb_mitigations);
} else if (is_spectre_bhb_fw_affected(SCOPE_LOCAL_CPU)) {
fw_state = spectre_bhb_get_cpu_fw_mitigation_state();
if (fw_state == SPECTRE_MITIGATED) {
/*
* Ensure KVM uses one of the spectre bp_hardening
* vectors. The indirect vector doesn't include the EL3
* call, so needs upgrading to
* HYP_VECTOR_SPECTRE_INDIRECT.
*/
if (!data->slot || data->slot == HYP_VECTOR_INDIRECT)
data->slot += 1;
this_cpu_set_vectors(EL1_VECTOR_BHB_FW);
/*
* The WA3 call in the vectors supersedes the WA1 call
* made during context-switch. Uninstall any firmware
* bp_hardening callback.
*/
cpu_cb = spectre_v2_get_sw_mitigation_cb();
if (__this_cpu_read(bp_hardening_data.fn) != cpu_cb)
__this_cpu_write(bp_hardening_data.fn, NULL);
state = SPECTRE_MITIGATED;
set_bit(BHB_FW, &system_bhb_mitigations);
}
}
update_mitigation_state(&spectre_bhb_state, state);
}
/* Patched to NOP when enabled */
void noinstr spectre_bhb_patch_loop_mitigation_enable(struct alt_instr *alt,
__le32 *origptr,
__le32 *updptr, int nr_inst)
{
BUG_ON(nr_inst != 1);
if (test_bit(BHB_LOOP, &system_bhb_mitigations))
*updptr++ = cpu_to_le32(aarch64_insn_gen_nop());
}
/* Patched to NOP when enabled */
void noinstr spectre_bhb_patch_fw_mitigation_enabled(struct alt_instr *alt,
__le32 *origptr,
__le32 *updptr, int nr_inst)
{
BUG_ON(nr_inst != 1);
if (test_bit(BHB_FW, &system_bhb_mitigations))
*updptr++ = cpu_to_le32(aarch64_insn_gen_nop());
}
/* Patched to correct the immediate */
void noinstr spectre_bhb_patch_loop_iter(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst)
{
u8 rd;
u32 insn;
u16 loop_count = spectre_bhb_loop_affected(SCOPE_SYSTEM);
BUG_ON(nr_inst != 1); /* MOV -> MOV */
if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY))
return;
insn = le32_to_cpu(*origptr);
rd = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RD, insn);
insn = aarch64_insn_gen_movewide(rd, loop_count, 0,
AARCH64_INSN_VARIANT_64BIT,
AARCH64_INSN_MOVEWIDE_ZERO);
*updptr++ = cpu_to_le32(insn);
}
/* Patched to mov WA3 when supported */
void noinstr spectre_bhb_patch_wa3(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst)
{
u8 rd;
u32 insn;
BUG_ON(nr_inst != 1); /* MOV -> MOV */
if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY) ||
!test_bit(BHB_FW, &system_bhb_mitigations))
return;
insn = le32_to_cpu(*origptr);
rd = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RD, insn);
insn = aarch64_insn_gen_logical_immediate(AARCH64_INSN_LOGIC_ORR,
AARCH64_INSN_VARIANT_32BIT,
AARCH64_INSN_REG_ZR, rd,
ARM_SMCCC_ARCH_WORKAROUND_3);
if (WARN_ON_ONCE(insn == AARCH64_BREAK_FAULT))
return;
*updptr++ = cpu_to_le32(insn);
}
/* Patched to NOP when not supported */
void __init spectre_bhb_patch_clearbhb(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst)
{
BUG_ON(nr_inst != 2);
if (test_bit(BHB_INSN, &system_bhb_mitigations))
return;
*updptr++ = cpu_to_le32(aarch64_insn_gen_nop());
*updptr++ = cpu_to_le32(aarch64_insn_gen_nop());
}
#ifdef CONFIG_BPF_SYSCALL
#define EBPF_WARN "Unprivileged eBPF is enabled, data leaks possible via Spectre v2 BHB attacks!\n"
void unpriv_ebpf_notify(int new_state)
{
if (spectre_v2_state == SPECTRE_VULNERABLE ||
spectre_bhb_state != SPECTRE_MITIGATED)
return;
if (!new_state)
pr_err("WARNING: %s", EBPF_WARN);
}
#endif

View File

@@ -341,7 +341,7 @@ ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
<= SZ_4K, "Hibernate exit text too big or misaligned")
#endif
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE,
ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) <= 3*PAGE_SIZE,
"Entry trampoline text too big")
#endif
#ifdef CONFIG_KVM

View File

@@ -1491,10 +1491,7 @@ static int kvm_init_vector_slots(void)
base = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs));
kvm_init_vector_slot(base, HYP_VECTOR_SPECTRE_DIRECT);
if (!cpus_have_const_cap(ARM64_SPECTRE_V3A))
return 0;
if (!has_vhe()) {
if (kvm_system_needs_idmapped_vectors() && !has_vhe()) {
err = create_hyp_exec_mappings(__pa_symbol(__bp_harden_hyp_vecs),
__BP_HARDEN_HYP_VECS_SZ, &base);
if (err)

View File

@@ -62,6 +62,10 @@ el1_sync: // Guest trapped into EL2
/* ARM_SMCCC_ARCH_WORKAROUND_2 handling */
eor w1, w1, #(ARM_SMCCC_ARCH_WORKAROUND_1 ^ \
ARM_SMCCC_ARCH_WORKAROUND_2)
cbz w1, wa_epilogue
eor w1, w1, #(ARM_SMCCC_ARCH_WORKAROUND_2 ^ \
ARM_SMCCC_ARCH_WORKAROUND_3)
cbnz w1, el1_trap
wa_epilogue:
@@ -192,7 +196,10 @@ SYM_CODE_END(__kvm_hyp_vector)
sub sp, sp, #(8 * 4)
stp x2, x3, [sp, #(8 * 0)]
stp x0, x1, [sp, #(8 * 2)]
alternative_cb spectre_bhb_patch_wa3
/* Patched to mov WA3 when supported */
mov w0, #ARM_SMCCC_ARCH_WORKAROUND_1
alternative_cb_end
smc #0
ldp x2, x3, [sp, #(8 * 0)]
add sp, sp, #(8 * 2)
@@ -205,6 +212,8 @@ SYM_CODE_END(__kvm_hyp_vector)
spectrev2_smccc_wa1_smc
.else
stp x0, x1, [sp, #-16]!
mitigate_spectre_bhb_loop x0
mitigate_spectre_bhb_clear_insn
.endif
.if \indirect != 0
alternative_cb kvm_patch_vector_branch

View File

@@ -148,8 +148,10 @@ int hyp_map_vectors(void)
phys_addr_t phys;
void *bp_base;
if (!cpus_have_const_cap(ARM64_SPECTRE_V3A))
if (!kvm_system_needs_idmapped_vectors()) {
__hyp_bp_vect_base = __bp_harden_hyp_vecs;
return 0;
}
phys = __hyp_pa(__bp_harden_hyp_vecs);
bp_base = (void *)__pkvm_create_private_mapping(phys,

Some files were not shown because too many files have changed in this diff Show More