Compare commits

..

684 Commits

Author SHA1 Message Date
Linus Torvalds
7d2a07b769 Linux 5.14 2021-08-29 15:04:50 -07:00
Linus Torvalds
90ac80dcd3 Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
 "One hotfix for a NULL pointer deref in the Renesas usb clk driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: renesas: rcar-usb2-clock-sel: Fix kernel NULL pointer dereference
2021-08-29 12:52:17 -07:00
Linus Torvalds
537b57bd5a Merge tag 'sched_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:

 - Have get_push_task() check whether current has migration disabled and
   thus avoid useless invocations of the migration thread

 - Rework initialization flow so that all rq->core's are initialized,
   even of CPUs which have not been onlined yet, so that iterating over
   them all works as expected

* tag 'sched_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix get_push_task() vs migrate_disable()
  sched: Fix Core-wide rq->lock for uninitialized CPUs
2021-08-29 10:54:14 -07:00
Linus Torvalds
f20a2637b1 Merge tag 'irq_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Borislav Petkov:

 - Have msix_mask_all() check a global control which says whether MSI-X
   masking should be done and thus make it usable on Xen-PV too

* tag 'irq_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI/MSI: Skip masking MSI-X on Xen PV
2021-08-29 10:47:02 -07:00
Linus Torvalds
98d006eb49 Merge tag 'perf_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:

 - Prevent the amd/power module from being removed while in use

 - Mark AMD IBS as not supporting content exclusion

 - Add a workaround for AMD erratum #1197 where IBS registers might not
   be restored properly after exiting CC6 state

 - Fix a potential truncation of a 32-bit variable due to shifting

 - Read the correct bits describing the number of configurable address
   ranges on Intel PT

* tag 'perf_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/power: Assign pmu.module
  perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op
  perf/x86/amd/ibs: Work around erratum #1197
  perf/x86/intel/uncore: Fix integer overflow on 23 bit left shift of a u32
  perf/x86/intel/pt: Fix mask of num_address_ranges
2021-08-29 10:36:32 -07:00
Linus Torvalds
072a276745 Merge tag 'x86_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Fix build error on RHEL where -Werror=maybe-uninitialized is set.

 - Restore the firmware's IDT when calling EFI boot services and before
   ExitBootServices() has been called. This fixes a boot failure on what
   appears to be a tablet with 32-bit UEFI running a 64-bit kernel.

* tag 'x86_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Fix a maybe-uninitialized build warning treated as error
  x86/efi: Restore Firmware IDT before calling ExitBootServices()
2021-08-29 10:26:00 -07:00
Helge Deller
f6a3308d6f Revert "parisc: Add assembly implementations for memset, strlen, strcpy, strncpy and strcat"
This reverts commit 83af58f806.

It turns out that at least the assembly implementation for strncpy() was
buggy.  Revert the whole commit and return back to the default coding.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.4+
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-29 10:13:32 -07:00
Adam Ford
1669a941f7 clk: renesas: rcar-usb2-clock-sel: Fix kernel NULL pointer dereference
The probe was manually passing NULL instead of dev to devm_clk_hw_register.
This caused a Unable to handle kernel NULL pointer dereference error.
Fix this by passing 'dev'.

Signed-off-by: Adam Ford <aford173@gmail.com>
Fixes: a20a40a8bb ("clk: renesas: rcar-usb2-clock-sel: Fix error handling in .probe()")
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-08-28 21:29:36 -07:00
Linus Torvalds
3f5ad13cb0 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
 "A single fix for a race introduced by a fix that went into 5.14-rc5"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: core: Fix hang of freezing queue between blocking and running device
2021-08-28 11:39:16 -07:00
Linus Torvalds
447e238f14 Merge tag 'usb-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are a few tiny USB fixes for reported issues with some USB
  drivers.

  These fixes include:

   - gadget driver fixes for regressions

   - tcpm driver fix

   - dwc3 driver fixes

   - xhci renesas firmware loading fix, again.

   - usb serial option driver device id addition

   - usb serial ch341 revert for regression

  All all of these have been in linux-next with no reported problems"

* tag 'usb-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: gadget: u_audio: fix race condition on endpoint stop
  usb: gadget: f_uac2: fixup feedback endpoint stop
  usb: typec: tcpm: Raise vdm_sm_running flag only when VDM SM is running
  usb: renesas-xhci: Prefer firmware loading on unknown ROM state
  usb: dwc3: gadget: Stop EP0 transfers during pullup disable
  usb: dwc3: gadget: Fix dwc3_calc_trbs_left()
  Revert "USB: serial: ch341: fix character loss at high transfer rates"
  USB: serial: option: add new VID/PID to support Fibocom FG150
2021-08-28 11:32:16 -07:00
Linus Torvalds
9f73eacde7 Merge tag 'powerpc-5.14-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - Fix scv implicit soft-mask table for relocated (eg. kdump) kernels

 - Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK, which was disabled due to a
   typo

Thanks to Lukas Bulwahn, Nicholas Piggin, and Daniel Axtens.

* tag 'powerpc-5.14-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Fix scv implicit soft-mask table for relocated kernels
  powerpc: Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK
2021-08-28 10:40:41 -07:00
Linus Torvalds
64b4fc45be Merge tag 'block-5.14-2021-08-27' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Revert the mq-deadline priority handling, it's causing serious
   performance regressions. While experimental patches exists to fix
   this up, it's too late to do so now. Revert it and re-do it properly
   for 5.15 instead.

 - Fix a NULL vs IS_ERR() regression in this release (Dan)

 - Fix a mq-deadline accounting regression in this release (Bart)

 - Mark cryptoloop as deprecated. It's broken and dm-crypt fully
   supports it, and it's actively intefering with loop. Plan on removal
   for 5.16 (Christoph)

* tag 'block-5.14-2021-08-27' of git://git.kernel.dk/linux-block:
  cryptoloop: add a deprecation warning
  pd: fix a NULL vs IS_ERR() check
  Revert "block/mq-deadline: Prioritize high-priority requests"
  mq-deadline: Fix request accounting
2021-08-27 16:08:29 -07:00
Linus Torvalds
6f18b82b41 Merge tag 'soc-fixes-5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "Just two trivial fixes from the reset driver tree, nothing else came
  up since the last soc fixes"

* tag 'soc-fixes-5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  reset: reset-zynqmp: Fixed the argument data type
  reset: RESET_MCHP_SPARX5 should depend on ARCH_SPARX5
2021-08-27 15:59:00 -07:00
Linus Torvalds
8f9d034984 Merge tag 'acpi-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
 "Fix a regression introduced during this cycle that has been partially
  addressed by an earlier commit (Andy Shevchenko)"

* tag 'acpi-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  media: ipu3-cio2: Drop reference on error path in cio2_bridge_connect_sensor()
2021-08-27 12:18:09 -07:00
Linus Torvalds
c0006dc695 Merge tag 'pm-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix two issues introduced during this cycle, one of which is a
  regression and the other one affects new code.

  Specifics:

   - Prevent the operating performance points (OPP) code from crashing
     when some entries in the table of required OPPs are set to error
     pointer values (Marijn Suijten)

   - Prevent the generic power domains (genpd) framework from
     incorrectly overriding the performance state of a device set by its
     driver while it is runtime-suspended or when runtime PM of it is
     disabled (Dmitry Osipenko)"

* tag 'pm-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: domains: Improve runtime PM performance state handling
  opp: core: Check for pending links before reading required_opp pointers
2021-08-27 12:06:51 -07:00
David Hildenbrand
425bec0032 virtio-mem: fix sleeping in RCU read side section in virtio_mem_online_page_cb()
virtio_mem_set_fake_offline() might sleep now, and we call it under
rcu_read_lock(). To fix it, simply move the rcu_read_unlock() further
up, as we're done with the device.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 6cc26d7761: "virtio-mem: use page_offline_(start|end) when setting PageOffline()
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-27 11:39:36 -07:00
Rafael J. Wysocki
7ee5fd12e8 Merge branch 'pm-opp'
* pm-opp:
  opp: core: Check for pending links before reading required_opp pointers
2021-08-27 20:27:01 +02:00
Linus Torvalds
5a61b7a296 Merge tag 'riscv-for-linus-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - device tree updates for the Microsemi Polarfire development kit that
   fix some mismatches between the u-boot and Linux enternet entries

 - ensure that the F register state is correctly reflected in core dumps

* tag 'riscv-for-linus-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: dts: microchip: Add ethernet0 to the aliases node
  riscv: dts: microchip: Use 'local-mac-address' for emac1
  riscv: Ensure the value of FP registers in the core dump file is up to date
2021-08-27 11:04:57 -07:00
Linus Torvalds
1a6436f375 Merge tag 'mmc-v5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fix from Ulf Hansson:

 - sdhci-iproc: Fix clock error for ACPI rpi's

* tag 'mmc-v5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  Revert "mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711"
2021-08-27 09:52:48 -07:00
Christoph Hellwig
222013f9ac cryptoloop: add a deprecation warning
Support for cryptoloop has been officially marked broken and deprecated
in favor of dm-crypt (which supports the same broken algorithms if
needed) in Linux 2.6.4 (released in March 2004), and support for it has
been entirely removed from losetup in util-linux 2.23 (released in April
2013).  Add a warning and a deprecation schedule.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210827163250.255325-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-27 10:44:54 -06:00
Linus Torvalds
94606b893f Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fix from Russell King:
 "Resolve a Keystone 2 kernel mapping regression"

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9104/2: Fix Keystone 2 kernel mapping regression
2021-08-27 09:00:43 -07:00
Ulf Hansson
885814a97f Revert "mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711"
This reverts commit 419dd626e3.

It turned out that the change from the reverted commit breaks the ACPI
based rpi's because it causes the 100Mhz max clock to be overridden to the
return from sdhci_iproc_get_max_clock(), which is 0 because there isn't a
OF/DT based clock device.

Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Fixes: 419dd626e3 ("mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711")
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-27 16:30:36 +02:00
Jerome Brunet
068fdad204 usb: gadget: u_audio: fix race condition on endpoint stop
If the endpoint completion callback is call right after the ep_enabled flag
is cleared and before usb_ep_dequeue() is call, we could do a double free
on the request and the associated buffer.

Fix this by clearing ep_enabled after all the endpoint requests have been
dequeued.

Fixes: 7de8681be2 ("usb: gadget: u_audio: Free requests only after callback")
Cc: stable <stable@vger.kernel.org>
Reported-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20210827092927.366482-1-jbrunet@baylibre.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-27 16:07:23 +02:00
Jerome Brunet
75432ba583 usb: gadget: f_uac2: fixup feedback endpoint stop
When the uac2 function is stopped, there seems to be an issue reported on
some platforms (Intel Merrifield at least)

BUG: kernel NULL pointer dereference, address: 0000000000000008
...
RIP: 0010:dwc3_gadget_del_and_unmap_request+0x19/0xe0
...
Call Trace:
 dwc3_remove_requests.constprop.0+0x12f/0x170
 __dwc3_gadget_ep_disable+0x7a/0x160
 dwc3_gadget_ep_disable+0x3d/0xd0
 usb_ep_disable+0x1c/0x70
 u_audio_stop_capture+0x79/0x120 [u_audio]
 afunc_set_alt+0x73/0x80 [usb_f_uac2]
 composite_setup+0x224/0x1b90 [libcomposite]

The issue happens only when the gadget is using the sync type "async", not
"adaptive". This indicates that problem is coming from the feedback
endpoint, which is only used with async synchronization mode.

The problem is that request is freed regardless of usb_ep_dequeue(), which
ends up badly if the request is not actually dequeued yet.

Update the feedback endpoint free function to release the endpoint the same
way it is done for the data endpoint, which takes care of the problem.

Fixes: 24f779dac8 ("usb: gadget: f_uac2/u_audio: add feedback endpoint support")
Reported-by: Ferry Toth <ftoth@exalondelft.nl>
Tested-by: Ferry Toth <ftoth@exalondelft.nl>
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20210827075853.266912-1-jbrunet@baylibre.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-27 16:06:53 +02:00
Dan Carpenter
3375dca0b5 pd: fix a NULL vs IS_ERR() check
blk_mq_alloc_disk() returns error pointers, it doesn't return NULL
so correct the check.

Fixes: 262d431f90 ("pd: use blk_mq_alloc_disk and blk_cleanup_disk")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20210827100023.GB9449@kili
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-27 07:45:48 -06:00
Linus Torvalds
77dd11439b Merge tag 'drm-fixes-2021-08-27' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Last set of fixes for 5.14, nothing major a couple of i915, couple of
  imx and a few amdgpu. All pretty small.

  i915:
   - Fix syncmap memory leak
   - Drop redundant display port debug print

  amdgpu:
   - Fix for pinning display buffers multiple times
   - Fix delayed work handling for GFXOFF
   - Fix build when CONFIG_SUSPEND is not set

  imx:
   - fix planar offset calculations
   - fix accidental partial revert"

* tag 'drm-fixes-2021-08-27' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/dp: Drop redundant debug print
  drm/i915: Fix syncmap memory leak
  drm/amdgpu: Fix build with missing pm_suspend_target_state module export
  drm/amdgpu: Cancel delayed work when GFXOFF is disabled
  drm/amdgpu: use the preferred pin domain after the check
  drm/imx: ipuv3-plane: fix accidental partial revert of 8 pixel alignment fix
  gpu: ipu-v3: Fix i.MX IPU-v3 offset calculations for (semi)planar U/V formats
2021-08-26 18:44:25 -07:00
Dave Airlie
9fe4f5a24f Merge tag 'imx-drm-fixes-2021-08-18' of git://git.pengutronix.de/pza/linux into drm-fixes
drm/imx: imx-drm alignment and plane offset fixes

Fix an accidental partial revert of commit 94dfec48fc ("drm/imx: Add 8
pixel alignment fix") and plane offset calculations for capture of
non-aligned resolutions.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Philipp Zabel <p.zabel@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/85a41af99beb2c9e7d6020435a135bf9f205a5ff.camel@pengutronix.de
2021-08-27 10:49:53 +10:00
Dave Airlie
589744dbdd Merge tag 'amd-drm-fixes-5.14-2021-08-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.14-2021-08-25:

amdgpu:
- Fix for pinning display buffers multiple times
- Fix delayed work handling for GFXOFF
- Fix build when CONFIG_SUSPEND is not set

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210826032658.4068-1-alexander.deucher@amd.com
2021-08-27 10:24:07 +10:00
Dave Airlie
4f33239615 Merge tag 'drm-intel-fixes-2021-08-26' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix syncmap memory leak
- Drop redundant display port debug print

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YSfSeHbyS5wBZtNJ@intel.com
2021-08-27 10:13:51 +10:00
Marek Marczykowski-Górecki
1a519dc7a7 PCI/MSI: Skip masking MSI-X on Xen PV
When running as Xen PV guest, masking MSI-X is a responsibility of the
hypervisor. The guest has no write access to the relevant BAR at all - when
it tries to, it results in a crash like this:

    BUG: unable to handle page fault for address: ffffc9004069100c
    #PF: supervisor write access in kernel mode
    #PF: error_code(0x0003) - permissions violation
    RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0
     e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e]
     e1000_probe+0x41f/0xdb0 [e1000e]
     local_pci_probe+0x42/0x80
    (...)

The recently introduced function msix_mask_all() does not check the global
variable pci_msi_ignore_mask which is set by XEN PV to bypass the masking
of MSI[-X] interrupts.

Add the check to make this function XEN PV compatible.

Fixes: 7d5ec3d361 ("PCI/MSI: Mask all unused MSI-X entries")
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210826170342.135172-1-marmarek@invisiblethingslab.com
2021-08-27 00:27:15 +02:00
Linus Torvalds
73367f05b2 Merge tag 'nfsd-5.14-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd fix from Bruce Fields:
 "This is a one-liner fix for a serious bug that can cause the server to
  become unresponsive to a client, so I think it's worth the last-minute
  inclusion for 5.14"

* tag 'nfsd-5.14-1' of git://linux-nfs.org/~bfields/linux:
  SUNRPC: Fix XPT_BUSY flag leakage in svc_handle_xprt()...
2021-08-26 13:26:40 -07:00
Linus Torvalds
8a2cb8bd06 Merge tag 'net-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from can and bpf.

  Closing three hw-dependent regressions. Any fixes of note are in the
  'old code' category. Nothing blocking release from our perspective.

  Current release - regressions:

   - stmmac: revert "stmmac: align RX buffers"

   - usb: asix: ax88772: move embedded PHY detection as early as
     possible

   - usb: asix: do not call phy_disconnect() for ax88178

   - Revert "net: really fix the build...", from Kalle to fix QCA6390

  Current release - new code bugs:

   - phy: mediatek: add the missing suspend/resume callbacks

  Previous releases - regressions:

   - qrtr: fix another OOB Read in qrtr_endpoint_post

   - stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings

  Previous releases - always broken:

   - inet: use siphash in exception handling

   - ip_gre: add validation for csum_start

   - bpf: fix ringbuf helper function compatibility

   - rtnetlink: return correct error on changing device netns

   - e1000e: do not try to recover the NVM checksum on Tiger Lake"

* tag 'net-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
  Revert "net: really fix the build..."
  net: hns3: fix get wrong pfc_en when query PFC configuration
  net: hns3: fix GRO configuration error after reset
  net: hns3: change the method of getting cmd index in debugfs
  net: hns3: fix duplicate node in VLAN list
  net: hns3: fix speed unknown issue in bond 4
  net: hns3: add waiting time before cmdq memory is released
  net: hns3: clear hardware resource when loading driver
  net: fix NULL pointer reference in cipso_v4_doi_free
  rtnetlink: Return correct error on changing device netns
  net: dsa: hellcreek: Adjust schedule look ahead window
  net: dsa: hellcreek: Fix incorrect setting of GCL
  cxgb4: dont touch blocked freelist bitmap after free
  ipv4: use siphash instead of Jenkins in fnhe_hashfun()
  ipv6: use siphash in rt6_exception_hash()
  can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters
  net: usb: asix: ax88772: fix boolconv.cocci warnings
  net/sched: ets: fix crash when flipping from 'strict' to 'quantum'
  qede: Fix memset corruption
  net: stmmac: fix kernel panic due to NULL pointer dereference of buf->xdp
  ...
2021-08-26 13:20:22 -07:00
Jens Axboe
7b05bf7710 Revert "block/mq-deadline: Prioritize high-priority requests"
This reverts commit fb926032b3.

Zhen reports that this commit slows down mq-deadline on a 128 thread
box, going from 258K IOPS to 170-180K. My testing shows that Optane
gen2 IOPS goes from 2.3M IOPS to 1.2M IOPS on a 64 thread box.

Looking in detail at the code, the main culprit here is needing to sum
percpu counters in the dispatch hot path, leading to very high CPU
utilization there. To make matters worse, the code currently needs to
sum 2 percpu counters, and it does so in the most naive way of iterating
possible CPUs _twice_.

Since we're close to release, revert this commit and we can re-do it
with regular per-priority counters instead for the 5.15 kernel.

Link: https://lore.kernel.org/linux-block/20210826144039.2143-1-thunder.leizhen@huawei.com/
Reported-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-26 12:59:44 -06:00
Linus Torvalds
1a6d80ff24 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
 "We received a report this week that the generic version of
  pfn_valid(), which we switched to this merge window in 16c9afc776
  ("arm64/mm: drop HAVE_ARCH_PFN_VALID"), interacts badly with
  dma_map_resource() due to the following check:

        /* Don't allow RAM to be mapped */
        if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
                return DMA_MAPPING_ERROR;

  Since the ongoing saga to determine the semantics of pfn_valid() is
  unlikely to be resolved this week (does it indicate valid memory, or
  just the presence of a struct page, or whether that struct page has
  been initialised?), just revert back to our old version of pfn_valid()
  for 5.14.

  Summary:

   - Fix dma_map_resource() by reverting back to old pfn_valid() code"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  Partially revert "arm64/mm: drop HAVE_ARCH_PFN_VALID"
2021-08-26 11:26:00 -07:00
Linus Torvalds
97d8cc2008 Merge tag 'ceph-for-5.14-rc8' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "Two memory management fixes for the filesystem"

* tag 'ceph-for-5.14-rc8' of git://github.com/ceph/ceph-client:
  ceph: fix possible null-pointer dereference in ceph_mdsmap_decode()
  ceph: correctly handle releasing an embedded cap flush
2021-08-26 11:18:30 -07:00
Kalle Valo
9ebc2758d0 Revert "net: really fix the build..."
This reverts commit ce78ffa3ef.

Wren and Nicolas reported that ath11k was failing to initialise QCA6390
Wi-Fi 6 device with error:

qcom_mhi_qrtr: probe of mhi0_IPCR failed with error -22

Commit ce78ffa3ef ("net: really fix the build..."), introduced in
v5.14-rc5, caused this regression in qrtr. Most likely all ath11k
devices are broken, but I only tested QCA6390. Let's revert the broken
commit so that ath11k works again.

Reported-by: Wren Turkal <wt@penguintechs.org>
Reported-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210826172816.24478-1-kvalo@codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 11:08:32 -07:00
Linus Torvalds
9b49ceb854 Merge tag 'for-5.14-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
 "One more fix that I think qualifies for a late merge. It's a revert of
  a one-liner fix that meanwhile got backported to stable kernels and we
  got reports from users.

  The broken fix prevents creating compressed inline extents, which
  could be noticeable on space consumption.

  Technically it's a regression as the patch was merged in 5.14-rc1 but
  got propagated to several stable kernels and has higher exposure than
  a 'typical' development cycle bug"

* tag 'for-5.14-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Revert "btrfs: compression: don't try to compress if we don't have enough pages"
2021-08-26 11:05:11 -07:00
Sebastian Andrzej Siewior
e681dcbaa4 sched: Fix get_push_task() vs migrate_disable()
push_rt_task() attempts to move the currently running task away if the
next runnable task has migration disabled and therefore is pinned on the
current CPU.

The current task is retrieved via get_push_task() which only checks for
nr_cpus_allowed == 1, but does not check whether the task has migration
disabled and therefore cannot be moved either. The consequence is a
pointless invocation of the migration thread which correctly observes
that the task cannot be moved.

Return NULL if the task has migration disabled and cannot be moved to
another CPU.

Fixes: a7c81556ec ("sched: Fix migrate_disable() vs rt/dl balancing")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210826133738.yiotqbtdaxzjsnfj@linutronix.de
2021-08-26 19:02:00 +02:00
Andy Shevchenko
294c34e704 media: ipu3-cio2: Drop reference on error path in cio2_bridge_connect_sensor()
The commit 71f6428332 ("ACPI: utils: Fix reference counting in
for_each_acpi_dev_match()") moved adev assignment outside of error
path and hence made acpi_dev_put(sensor->adev) a no-op. We still
need to drop reference count on error path, and to achieve that,
replace sensor->adev by locally assigned adev.

Fixes: 71f6428332 ("ACPI: utils: Fix reference counting in for_each_acpi_dev_match()")
Depends-on: fc68f42aa7 ("ACPI: fix NULL pointer dereference")
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-26 18:52:30 +02:00
Jakub Kicinski
75da63b7a1 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
bpf 2021-08-26

We've added 1 non-merge commit during the last 1 day(s):

1) Fix ringbuf helper function compatibility, from Daniel.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix ringbuf helper function compatibility
====================

Link: https://lore.kernel.org/r/20210826153720.19083-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 08:44:38 -07:00
Jakub Kicinski
57f8178292 Merge branch 'net-hns3-add-some-fixes-for-net'
Guangbin Huang says:

====================
net: hns3: add some fixes for -net

This series adds some fixes for the HNS3 ethernet driver.
====================

Link: https://lore.kernel.org/r/1629976921-43438-1-git-send-email-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 07:24:21 -07:00
Guangbin Huang
8c1671e0d1 net: hns3: fix get wrong pfc_en when query PFC configuration
Currently, when query PFC configuration by dcbtool, driver will return
PFC enable status based on TC. As all priorities are mapped to TC0 by
default, if TC0 is enabled, then all priorities mapped to TC0 will be
shown as enabled status when query PFC setting, even though some
priorities have never been set.

for example:
$ dcb pfc show dev eth0
pfc-cap 4 macsec-bypass off delay 0
prio-pfc 0:off 1:off 2:off 3:off 4:off 5:off 6:off 7:off
$ dcb pfc set dev eth0 prio-pfc 0:on 1:on 2:on 3:on
$ dcb pfc show dev eth0
pfc-cap 4 macsec-bypass off delay 0
prio-pfc 0:on 1:on 2:on 3:on 4:on 5:on 6:on 7:on

To fix this problem, just returns user's PFC config parameter saved in
driver.

Fixes: cacde272dd ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 07:24:17 -07:00
Yufeng Mo
3462207d2d net: hns3: fix GRO configuration error after reset
The GRO configuration is enabled by default after reset. This
is incorrect and should be restored to the user-configured value.
So this restoration is added during reset initialization.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 07:24:17 -07:00
Yufeng Mo
55649d5654 net: hns3: change the method of getting cmd index in debugfs
Currently, the cmd index is obtained in debugfs by comparing file names.
However, this method may cause errors when processing more complex file
names. So, change this method by saving cmd in private data and comparing
it when getting cmd index in debugfs for optimization.

Fixes: 5e69ea7ee2 ("net: hns3: refactor the debugfs process")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 07:24:17 -07:00
Guojia Liao
94391fae82 net: hns3: fix duplicate node in VLAN list
VLAN list should not be added duplicate VLAN node, otherwise it would
cause "add failed" when restore VLAN from VLAN list, so this patch adds
VLAN ID check before adding node into VLAN list.

Fixes: c6075b1934 ("net: hns3: Record VF vlan tables")
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 07:24:17 -07:00
Yonglong Liu
b15c072a9f net: hns3: fix speed unknown issue in bond 4
In bond 4, when the link goes down and up repeatedly, the bond may get an
unknown speed, and then this port can not work.

The driver notify netif_carrier_on() before update the link state, when the
bond receive carrier on, will query the speed of the port, if the query
operation happens before updating the link state, will get an unknown
speed. So need to notify netif_carrier_on() after update the link state.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 07:24:16 -07:00
Yufeng Mo
a96d9330b0 net: hns3: add waiting time before cmdq memory is released
After the cmdq registers are cleared, the firmware may take time to
clear out possible left over commands in the cmdq. Driver must release
cmdq memory only after firmware has completed processing of left over
commands.

Fixes: 232d0d55fc ("net: hns3: uninitialize command queue while unloading PF driver")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 07:24:16 -07:00
Yufeng Mo
1a6d281946 net: hns3: clear hardware resource when loading driver
If a PF is bonded to a virtual machine and the virtual machine exits
unexpectedly, some hardware resource cannot be cleared. In this case,
loading driver may cause exceptions. Therefore, the hardware resource
needs to be cleared when the driver is loaded.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26 07:24:16 -07:00
Kyle Tso
ef52b4a9fc usb: typec: tcpm: Raise vdm_sm_running flag only when VDM SM is running
If the port is going to send Discover_Identity Message, vdm_sm_running
flag was intentionally set before entering Ready States in order to
avoid the conflict because the port and the port partner might start
AMS at almost the same time after entering Ready States.

However, the original design has a problem. When the port is doing
DR_SWAP from Device to Host, it raises the flag. Later in the
tcpm_send_discover_work, the flag blocks the procedure of sending the
Discover_Identity and it might never be cleared until disconnection.

Since there exists another flag send_discover representing that the port
is going to send Discover_Identity or not, it is enough to use that flag
to prevent the conflict. Also change the timing of the set/clear of
vdm_sm_running to indicate whether the VDM SM is actually running or
not.

Fixes: c34e85fa69 ("usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work")
Cc: stable <stable@vger.kernel.org>
Cc: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210826124201.1562502-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-26 14:59:04 +02:00
Takashi Iwai
c82cacd2f1 usb: renesas-xhci: Prefer firmware loading on unknown ROM state
The recent attempt to handle an unknown ROM state in the commit
d143825baf ("usb: renesas-xhci: Fix handling of unknown ROM state")
resulted in a regression and reverted later by the commit 44cf53602f
("Revert "usb: renesas-xhci: Fix handling of unknown ROM state"").
The problem of the former fix was that it treated the failure of
firmware loading as a fatal error.  Since the firmware files aren't
included in the standard linux-firmware tree, most users don't have
them, hence they got the non-working system after that.  The revert
fixed the regression, but also it didn't make the firmware loading
triggered even on the devices that do need it.  So we need still a fix
for them.

This is another attempt to handle the unknown ROM state.  Like the
previous fix, this also tries to load the firmware when ROM shows
unknown state.  In this patch, however, the failure of a firmware
loading (such as a missing firmware file) isn't handled as a fatal
error any longer when ROM has been already detected, but it falls back
to the ROM mode like before.  The error is returned only when no ROM
is detected and the firmware loading failed.

Along with it, for simplifying the code flow, the detection and the
check of ROM is factored out from renesas_fw_check_running() and done
in the caller side, renesas_xhci_check_request_fw().  It avoids the
redundant ROM checks.

The patch was tested on Lenovo Thinkpad T14 gen (BIOS 1.34).  Also it
was confirmed that no regression is seen on another Thinkpad T14
machine that has worked without the patch, too.

Fixes: 44cf53602f ("Revert "usb: renesas-xhci: Fix handling of unknown ROM state"")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1189207
Link: https://lore.kernel.org/r/20210826124127.14789-1-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-26 14:56:59 +02:00
Wesley Cheng
4a1e25c0a0 usb: dwc3: gadget: Stop EP0 transfers during pullup disable
During a USB cable disconnect, or soft disconnect scenario, a pending
SETUP transaction may not be completed, leading to the following
error:

    dwc3 a600000.dwc3: timed out waiting for SETUP phase

If this occurs, then the entire pullup disable routine is skipped and
proper cleanup and halting of the controller does not complete.

Instead of returning an error (which is ignored from the UDC
perspective), allow the pullup disable routine to continue, which
will also handle disabling of EP0/1.  This will end any active
transfers as well.  Ensure to clear any delayed_status also, as the
timeout could happen within the STATUS stage.

Fixes: bb01473648 ("usb: dwc3: gadget: don't clear RUN/STOP when it's invalid to do so")
Cc: <stable@vger.kernel.org>
Reviewed-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
Link: https://lore.kernel.org/r/20210825042855.7977-1-wcheng@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-26 13:48:58 +02:00
Arnd Bergmann
6c35ca0697 Merge tag 'reset-fixes-for-v5.14' of git://git.pengutronix.de/pza/linux into arm/fixes
Reset controller fixes for v5.14

Hide the Sparx5 reset driver unless the ARCH_SPARX5 or COMPILE_TEST
options are enabled, to avoid unnecessarily asking users about this
driver. Fix a return value argument type in the ZynqMP reset driver.

* tag 'reset-fixes-for-v5.14' of git://git.pengutronix.de/pza/linux:
  reset: reset-zynqmp: Fixed the argument data type
  reset: RESET_MCHP_SPARX5 should depend on ARCH_SPARX5

Link: https://lore.kernel.org/r/e543959c5b5ee7b25686f81049bf187d602daeda.camel@pengutronix.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-26 13:47:45 +02:00
Thinh Nguyen
51f1954ad8 usb: dwc3: gadget: Fix dwc3_calc_trbs_left()
We can't depend on the TRB's HWO bit to determine if the TRB ring is
"full". A TRB is only available when the driver had processed it, not
when the controller consumed and relinquished the TRB's ownership to the
driver. Otherwise, the driver may overwrite unprocessed TRBs. This can
happen when many transfer events accumulate and the system is slow to
process them and/or when there are too many small requests.

If a request is in the started_list, that means there is one or more
unprocessed TRBs remained. Check this instead of the TRB's HWO bit
whether the TRB ring is full.

Fixes: c4233573f6 ("usb: dwc3: gadget: prepare TRBs on update transfers too")
Cc: <stable@vger.kernel.org>
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/e91e975affb0d0d02770686afc3a5b9eb84409f6.1629335416.git.Thinh.Nguyen@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-26 13:47:03 +02:00
Swati Sharma
71de496cc4 drm/i915/dp: Drop redundant debug print
drm_dp_dpcd_read/write already has debug error message.
Drop redundant error messages which gives false
status even if correct value is read in drm_dp_dpcd_read().

v2: -Added fixes tag (Ankit)
v3: -Fixed build error (CI)

Fixes: 9488a030ac ("drm/i915: Add support for enabling link status and recovery")
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812131107.5531-1-swati2.sharma@intel.com
(cherry picked from commit b6dfa41617)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-26 07:31:52 -04:00
Matthew Brost
a63bcf08f0 drm/i915: Fix syncmap memory leak
A small race exists between intel_gt_retire_requests_timeout and
intel_timeline_exit which could result in the syncmap not getting
free'd. Rather than work to hard to seal this race, simply cleanup the
syncmap on fini.

unreferenced object 0xffff88813bc53b18 (size 96):
  comm "gem_close_race", pid 5410, jiffies 4294917818 (age 1105.600s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 00 00 00 0a 00 00 00  ................
    00 00 00 00 00 00 00 00 6b 6b 6b 6b 06 00 00 00  ........kkkk....
  backtrace:
    [<00000000120b863a>] __sync_alloc_leaf+0x1e/0x40 [i915]
    [<00000000042f6959>] __sync_set+0x1bb/0x240 [i915]
    [<0000000090f0e90f>] i915_request_await_dma_fence+0x1c7/0x400 [i915]
    [<0000000056a48219>] i915_request_await_object+0x222/0x360 [i915]
    [<00000000aaac4ee3>] i915_gem_do_execbuffer+0x1bd0/0x2250 [i915]
    [<000000003c9d830f>] i915_gem_execbuffer2_ioctl+0x405/0xce0 [i915]
    [<00000000fd7a8e68>] drm_ioctl_kernel+0xb0/0xf0 [drm]
    [<00000000e721ee87>] drm_ioctl+0x305/0x3c0 [drm]
    [<000000008b0d8986>] __x64_sys_ioctl+0x71/0xb0
    [<0000000076c362a4>] do_syscall_64+0x33/0x80
    [<00000000eb7a4831>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Fixes: 531958f6f3 ("drm/i915/gt: Track timeline activeness in enter/exit")
Cc: <stable@vger.kernel.org>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210730195342.110234-1-matthew.brost@intel.com
(cherry picked from commit faf890985e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-26 07:31:52 -04:00
王贇
733c99ee8b net: fix NULL pointer reference in cipso_v4_doi_free
In netlbl_cipsov4_add_std() when 'doi_def->map.std' alloc
failed, we sometime observe panic:

  BUG: kernel NULL pointer dereference, address:
  ...
  RIP: 0010:cipso_v4_doi_free+0x3a/0x80
  ...
  Call Trace:
   netlbl_cipsov4_add_std+0xf4/0x8c0
   netlbl_cipsov4_add+0x13f/0x1b0
   genl_family_rcv_msg_doit.isra.15+0x132/0x170
   genl_rcv_msg+0x125/0x240

This is because in cipso_v4_doi_free() there is no check
on 'doi_def->map.std' when 'doi_def->type' equal 1, which
is possibe, since netlbl_cipsov4_add_std() haven't initialize
it before alloc 'doi_def->map.std'.

This patch just add the check to prevent panic happen for similar
cases.

Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 12:20:47 +01:00
Andrey Ignatov
96a6b93b69 rtnetlink: Return correct error on changing device netns
Currently when device is moved between network namespaces using
RTM_NEWLINK message type and one of netns attributes (FLA_NET_NS_PID,
IFLA_NET_NS_FD, IFLA_TARGET_NETNSID) but w/o specifying IFLA_IFNAME, and
target namespace already has device with same name, userspace will get
EINVAL what is confusing and makes debugging harder.

Fix it so that userspace gets more appropriate EEXIST instead what makes
debugging much easier.

Before:

  # ./ifname.sh
  + ip netns add ns0
  + ip netns exec ns0 ip link add l0 type dummy
  + ip netns exec ns0 ip link show l0
  8: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/ether 66:90:b5:d5:78:69 brd ff:ff:ff:ff:ff:ff
  + ip link add l0 type dummy
  + ip link show l0
  10: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/ether 6e:c6:1f:15:20:8d brd ff:ff:ff:ff:ff:ff
  + ip link set l0 netns ns0
  RTNETLINK answers: Invalid argument

After:

  # ./ifname.sh
  + ip netns add ns0
  + ip netns exec ns0 ip link add l0 type dummy
  + ip netns exec ns0 ip link show l0
  8: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/ether 1e:4a:72:e3:e3:8f brd ff:ff:ff:ff:ff:ff
  + ip link add l0 type dummy
  + ip link show l0
  10: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/ether f2:fc:fe:2b:7d:a6 brd ff:ff:ff:ff:ff:ff
  + ip link set l0 netns ns0
  RTNETLINK answers: File exists

The problem is that do_setlink() passes its `char *ifname` argument,
that it gets from a caller, to __dev_change_net_namespace() as is (as
`const char *pat`), but semantics of ifname and pat can be different.

For example, __rtnl_newlink() does this:

net/core/rtnetlink.c
    3270	char ifname[IFNAMSIZ];
     ...
    3286	if (tb[IFLA_IFNAME])
    3287		nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
    3288	else
    3289		ifname[0] = '\0';
     ...
    3364	if (dev) {
     ...
    3394		return do_setlink(skb, dev, ifm, extack, tb, ifname, status);
    3395	}

, i.e. do_setlink() gets ifname pointer that is always valid no matter
if user specified IFLA_IFNAME or not and then do_setlink() passes this
ifname pointer as is to __dev_change_net_namespace() as pat argument.

But the pat (pattern) in __dev_change_net_namespace() is used as:

net/core/dev.c
   11198	err = -EEXIST;
   11199	if (__dev_get_by_name(net, dev->name)) {
   11200		/* We get here if we can't use the current device name */
   11201		if (!pat)
   11202			goto out;
   11203		err = dev_get_valid_name(net, dev, pat);
   11204		if (err < 0)
   11205			goto out;
   11206	}

As the result the `goto out` path on line 11202 is neven taken and
instead of returning EEXIST defined on line 11198,
__dev_change_net_namespace() returns an error from dev_get_valid_name()
and this, in turn, will be EINVAL for ifname[0] = '\0' set earlier.

Fixes: d8a5ec6727 ("[NET]: netlink support for moving devices between network namespaces.")
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 12:08:08 +01:00
David S. Miller
a423cbe0f2 Merge branch 'dsa-hellcreek-fixes'
Kurt Kanzenbach says:

====================
net: dsa: hellcreek: 802.1Qbv Fixes

while using TAPRIO offloading on the Hirschmann hellcreek switch, I've noticed
two issues in the current implementation:

1. The gate control list is incorrectly programmed
2. The admin base time is not set properly

Fix it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 10:26:06 +01:00
Kurt Kanzenbach
b7658ed35a net: dsa: hellcreek: Adjust schedule look ahead window
Traffic schedules can only be started up to eight seconds within the
future. Therefore, the driver periodically checks every two seconds whether the
admin base time provided by the user is inside that window. If so the schedule
is started. Otherwise the check is deferred.

However, according to the programming manual the look ahead window size should
be four - not eight - seconds. By using the proposed value of four seconds
starting a schedule at a specified admin base time actually works as expected.

Fixes: 24dfc6eb39 ("net: dsa: hellcreek: Add TAPRIO offloading support")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 10:26:06 +01:00
Kurt Kanzenbach
a7db5ed863 net: dsa: hellcreek: Fix incorrect setting of GCL
Currently the gate control list which is programmed into the hardware is
incorrect resulting in wrong traffic schedules. The problem is the loop
variables are incremented before they are referenced. Therefore, move the
increment to the end of the loop.

Fixes: 24dfc6eb39 ("net: dsa: hellcreek: Add TAPRIO offloading support")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 10:26:05 +01:00
Rahul Lakkireddy
43fed4d48d cxgb4: dont touch blocked freelist bitmap after free
When adapter init fails, the blocked freelist bitmap is already freed
up and should not be touched. So, move the bitmap zeroing closer to
where it was successfully allocated. Also handle adapter init failure
unwind path immediately and avoid setting up RDMA memory windows.

Fixes: 5b377d114f ("cxgb4: Add debugfs facility to inject FL starvation")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 10:23:24 +01:00
David S. Miller
38d57551dd Merge branch 'inet-siphash'
Eric Dumazet says:

====================
inet: use siphash in exception handling

A group of security researchers brought to our attention
the weakness of hash functions used in rt6_exception_hash()
and fnhe_hashfun()

I made two distinct patches to help backports, since IPv6
part was added in 4.15
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 10:20:34 +01:00
Eric Dumazet
6457378fe7 ipv4: use siphash instead of Jenkins in fnhe_hashfun()
A group of security researchers brought to our attention
the weakness of hash function used in fnhe_hashfun().

Lets use siphash instead of Jenkins Hash, to considerably
reduce security risks.

Also remove the inline keyword, this really is distracting.

Fixes: d546c62154 ("ipv4: harden fnhe_hashfun()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Keyu Man <kman001@ucr.edu>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 10:20:34 +01:00
Eric Dumazet
4785305c05 ipv6: use siphash in rt6_exception_hash()
A group of security researchers brought to our attention
the weakness of hash function used in rt6_exception_hash()

Lets use siphash instead of Jenkins Hash, to considerably
reduce security risks.

Following patch deals with IPv4.

Fixes: 35732d01fe ("ipv6: introduce a hash table to store dst cache")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Keyu Man <kman001@ucr.edu>
Cc: Wei Wang <weiwan@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Acked-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 10:20:34 +01:00
David S. Miller
92ea47fe09 Merge tag 'linux-can-fixes-for-5.14-20210826' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine says:

====================
pull-request: can 2021-08-26

this is a pull request of a single patch for net/master.

Stefan Mätje's patch fixes the interchange of RX and TX error counters
inthe esd_usb2 CAN driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 09:37:40 +01:00
Greg Kroah-Hartman
662b932915 Merge tag 'usb-serial-5.14-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:

USB-serial fixes for 5.14-rc8

Here's a fix for a regression in 5.14 (also backported to stable) which
caused reads to stall for ch341 devices.

Included is also a new modem device id.

All but the revert have been in linux-next, and with no reported issues.

* tag 'usb-serial-5.14-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  Revert "USB: serial: ch341: fix character loss at high transfer rates"
  USB: serial: option: add new VID/PID to support Fibocom FG150
2021-08-26 10:27:18 +02:00
Kim Phillips
ccf2648341 perf/x86/amd/power: Assign pmu.module
Assign pmu.module so the driver can't be unloaded whilst in use.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210817221048.88063-4-kim.phillips@amd.com
2021-08-26 09:12:57 +02:00
Kim Phillips
f11dd0d805 perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op
Commit:

   2ff4025069 ("perf/core, arch/x86: Use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs")

neglected to do so.

Fixes: 2ff4025069 ("perf/core, arch/x86: Use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210817221048.88063-2-kim.phillips@amd.com
2021-08-26 08:58:02 +02:00
Kim Phillips
26db2e0c51 perf/x86/amd/ibs: Work around erratum #1197
Erratum #1197 "IBS (Instruction Based Sampling) Register State May be
Incorrect After Restore From CC6" is published in a document:

  "Revision Guide for AMD Family 19h Models 00h-0Fh Processors" 56683 Rev. 1.04 July 2021

  https://bugzilla.kernel.org/show_bug.cgi?id=206537

Implement the erratum's suggested workaround and ignore IBS samples if
MSRC001_1031 == 0.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210817221048.88063-3-kim.phillips@amd.com
2021-08-26 08:58:02 +02:00
Colin Ian King
0b3a8738b7 perf/x86/intel/uncore: Fix integer overflow on 23 bit left shift of a u32
The u32 variable pci_dword is being masked with 0x1fffffff and then left
shifted 23 places. The shift is a u32 operation,so a value of 0x200 or
more in pci_dword will overflow the u32 and only the bottow 32 bits
are assigned to addr. I don't believe this was the original intent.
Fix this by casting pci_dword to a resource_size_t to ensure no
overflow occurs.

Note that the mask and 12 bit left shift operation does not need this
because the mask SNR_IMC_MMIO_MEM0_MASK and shift is always a 32 bit
value.

Fixes: ee49532b38 ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge")
Addresses-Coverity: ("Unintentional integer overflow")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20210706114553.28249-1-colin.king@canonical.com
2021-08-26 08:58:02 +02:00
Stefan Mätje
044012b520 can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters
This patch fixes the interchanged fetch of the CAN RX and TX error
counters from the ESD_EV_CAN_ERROR_EXT message. The RX error counter
is really in struct rx_msg::data[2] and the TX error counter is in
struct rx_msg::data[3].

Fixes: 96d8e90382 ("can: Add driver for esd CAN-USB/2 device")
Link: https://lore.kernel.org/r/20210825215227.4947-2-stefan.maetje@esd.eu
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-26 08:37:13 +02:00
kernel test robot
ec92e524ee net: usb: asix: ax88772: fix boolconv.cocci warnings
drivers/net/usb/asix_devices.c:757:60-65: WARNING: conversion to bool not needed here

 Remove unneeded conversion to bool

Semantic patch information:
 Relational and logical operators evaluate to bool,
 explicit conversion is overly verbose and unneeded.

Generated by: scripts/coccinelle/misc/boolconv.cocci

Fixes: 7a141e64cf ("net: usb: asix: ax88772: move embedded PHY detection as early as possible")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20210825183538.13070-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-25 16:35:51 -07:00
Trond Myklebust
062b829c52 SUNRPC: Fix XPT_BUSY flag leakage in svc_handle_xprt()...
If the attempt to reserve a slot fails, we currently leak the XPT_BUSY
flag on the socket. Among other things, this make it impossible to close
the socket.

Fixes: 82011c80b3 ("SUNRPC: Move svc_xprt_received() call sites")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-08-25 16:58:09 -04:00
Linus Torvalds
73f3af7b46 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "2 patches.

  Subsystems affected by this patch series: mm/memory-hotplug and
  MAINTAINERS"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  MAINTAINERS: exfat: update my email address
  mm/memory_hotplug: fix potential permanent lru cache disable
2021-08-25 12:45:31 -07:00
Namjae Jeon
a34cc13add MAINTAINERS: exfat: update my email address
My email address in exfat entry will be not available in a few days.
Update it to my own kernel.org address.

Link: https://lkml.kernel.org/r/20210825044833.16806-1-namjae.jeon@samsung.com
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-25 12:25:12 -07:00
Miaohe Lin
946746d1ad mm/memory_hotplug: fix potential permanent lru cache disable
If offline_pages failed after lru_cache_disable(), it forgot to do
lru_cache_enable() in error path.  So we would have lru cache disabled
permanently in this case.

Link: https://lkml.kernel.org/r/20210821094246.10149-3-linmiaohe@huawei.com
Fixes: d479960e44 ("mm: disable LRU pagevec during the migration temporarily")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-25 12:25:12 -07:00
Dmitry Osipenko
3c5a272202 PM: domains: Improve runtime PM performance state handling
GENPD core doesn't support handling performance state changes while
consumer device is runtime-suspended or when runtime PM is disabled.
GENPD core may override performance state that was configured by device
driver while RPM of the device was disabled or device was RPM-suspended.

Let's close that gap by allowing drivers to control performance state
while RPM of a consumer device is disabled and to set up performance
state of RPM-suspended device that will be applied by GENPD core on
RPM-resume of the device.

Fixes: 5937c3ce21 ("PM: domains: Drop/restore performance state votes for devices at runtime PM")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-25 20:15:54 +02:00
Linus Torvalds
fe67f4dd8d pipe: do FASYNC notifications for every pipe IO, not just state changes
It turns out that the SIGIO/FASYNC situation is almost exactly the same
as the EPOLLET case was: user space really wants to be notified after
every operation.

Now, in a perfect world it should be sufficient to only notify user
space on "state transitions" when the IO state changes (ie when a pipe
goes from unreadable to readable, or from unwritable to writable).  User
space should then do as much as possible - fully emptying the buffer or
what not - and we'll notify it again the next time the state changes.

But as with EPOLLET, we have at least one case (stress-ng) where the
kernel sent SIGIO due to the pipe being marked for asynchronous
notification, but the user space signal handler then didn't actually
necessarily read it all before returning (it read more than what was
written, but since there could be multiple writes, it could leave data
pending).

The user space code then expected to get another SIGIO for subsequent
writes - even though the pipe had been readable the whole time - and
would only then read more.

This is arguably a user space bug - and Colin King already fixed the
stress-ng code in question - but the kernel regression rules are clear:
it doesn't matter if kernel people think that user space did something
silly and wrong.  What matters is that it used to work.

So if user space depends on specific historical kernel behavior, it's a
regression when that behavior changes.  It's on us: we were silly to
have that non-optimal historical behavior, and our old kernel behavior
was what user space was tested against.

Because of how the FASYNC notification was tied to wakeup behavior, this
was first broken by commits f467a6a664 and 1b6b26ae70 ("pipe: fix
and clarify pipe read/write wakeup logic"), but at the time it seems
nobody noticed.  Probably because the stress-ng problem case ends up
being timing-dependent too.

It was then unwittingly fixed by commit 3a34b13a88 ("pipe: make pipe
writes always wake up readers") only to be broken again when by commit
3b844826b6 ("pipe: avoid unnecessary EPOLLET wakeups under normal
loads").

And at that point the kernel test robot noticed the performance
refression in the stress-ng.sigio.ops_per_sec case.  So the "Fixes" tag
below is somewhat ad hoc, but it matches when the issue was noticed.

Fix it for good (knock wood) by simply making the kill_fasync() case
separate from the wakeup case.  FASYNC is quite rare, and we clearly
shouldn't even try to use the "avoid unnecessary wakeups" logic for it.

Link: https://lore.kernel.org/lkml/20210824151337.GC27667@xsang-OptiPlex-9020/
Fixes: 3b844826b6 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads")
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Oliver Sang <oliver.sang@intel.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-25 10:27:16 -07:00
Linus Torvalds
62add98208 Merge branch 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ucount fixes from Eric Biederman:
 "This branch fixes a regression that made it impossible to increase
  rlimits that had been converted to the ucount infrastructure, and also
  fixes a reference counting bug where the reference was not incremented
  soon enough.

  The fixes are trivial and the bugs have been encountered in the wild,
  and the fixes have been tested"

* 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: Increase ucounts reference counter before the security hook
  ucounts: Fix regression preventing increasing of rlimits in init_user_ns
2021-08-25 09:56:10 -07:00
Tuo Li
a9e6ffbc5b ceph: fix possible null-pointer dereference in ceph_mdsmap_decode()
kcalloc() is called to allocate memory for m->m_info, and if it fails,
ceph_mdsmap_destroy() behind the label out_err will be called:
  ceph_mdsmap_destroy(m);

In ceph_mdsmap_destroy(), m->m_info is dereferenced through:
  kfree(m->m_info[i].export_targets);

To fix this possible null-pointer dereference, check m->m_info before the
for loop to free m->m_info[i].export_targets.

[ jlayton: fix up whitespace damage
	   only kfree(m->m_info) if it's non-NULL ]

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-08-25 16:34:11 +02:00
Xiubo Li
b2f9fa1f3b ceph: correctly handle releasing an embedded cap flush
The ceph_cap_flush structures are usually dynamically allocated, but
the ceph_cap_snap has an embedded one.

When force umounting, the client will try to remove all the session
caps. During this, it will free them, but that should not be done
with the ones embedded in a capsnap.

Fix this by adding a new boolean that indicates that the cap flush is
embedded in a capsnap, and skip freeing it if that's set.

At the same time, switch to using list_del_init() when detaching the
i_list and g_list heads.  It's possible for a forced umount to remove
these objects but then handle_cap_flushsnap_ack() races in and does the
list_del_init() again, corrupting memory.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/52283
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-08-25 16:34:11 +02:00
Xiaoyao Li
c53c6b7409 perf/x86/intel/pt: Fix mask of num_address_ranges
Per SDM, bit 2:0 of CPUID(0x14,1).EAX[2:0] reports the number of
configurable address ranges for filtering, not bit 1:0.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Link: https://lkml.kernel.org/r/20210824040622.4081502-1-xiaoyao.li@intel.com
2021-08-25 15:42:31 +02:00
Qu Wenruo
4e9655763b Revert "btrfs: compression: don't try to compress if we don't have enough pages"
This reverts commit f216562731.

[BUG]
It's no longer possible to create compressed inline extent after commit
f216562731 ("btrfs: compression: don't try to compress if we don't
have enough pages").

[CAUSE]
For compression code, there are several possible reasons we have a range
that needs to be compressed while it's no more than one page.

- Compressed inline write
  The data is always smaller than one sector and the test lacks the
  condition to properly recognize a non-inline extent.

- Compressed subpage write
  For the incoming subpage compressed write support, we require page
  alignment of the delalloc range.
  And for 64K page size, we can compress just one page into smaller
  sectors.

For those reasons, the requirement for the data to be more than one page
is not correct, and is already causing regression for compressed inline
data writeback.  The idea of skipping one page to avoid wasting CPU time
could be revisited in the future.

[FIX]
Fix it by reverting the offending commit.

Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Link: https://lore.kernel.org/linux-btrfs/afa2742.c084f5d6.17b6b08dffc@tnonline.net
Fixes: f216562731 ("btrfs: compression: don't try to compress if we don't have enough pages")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-25 15:08:19 +02:00
Will Deacon
3eb9cdffb3 Partially revert "arm64/mm: drop HAVE_ARCH_PFN_VALID"
This partially reverts commit 16c9afc776.

Alex Bee reports a regression in 5.14 on their RK3328 SoC when
configuring the PL330 DMA controller:

 | ------------[ cut here ]------------
 | WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
 | Modules linked in: spi_rockchip(+) fuse
 | CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
 | Hardware name: Pine64 Rock64 (DT)
 | pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
 | pc : dma_map_resource+0x68/0xc0
 | lr : pl330_prep_slave_fifo+0x78/0xd0

This appears to be because dma_map_resource() is being called for a
physical address which does not correspond to a memory address yet does
have a valid 'struct page' due to the way in which the vmemmap is
constructed.

Prior to 16c9afc776 ("arm64/mm: drop HAVE_ARCH_PFN_VALID"), the arm64
implementation of pfn_valid() called memblock_is_memory() to return
'false' for such regions and the DMA mapping request would proceed.
However, now that we are using the generic implementation where only the
presence of the memory map entry is considered, we return 'true' and
erroneously fail with DMA_MAPPING_ERROR because we identify the region
as DRAM.

Although fixing this in the DMA mapping code is arguably the right fix,
it is a risky, cross-architecture change at this stage in the cycle. So
just revert arm64 back to its old pfn_valid() implementation for v5.14.
The change to the generic pfn_valid() code is preserved from the original
patch, so as to avoid impacting other architectures.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Reported-by: Alex Bee <knaerzche@gmail.com>
Link: https://lore.kernel.org/r/d3a3c828-b777-faf8-e901-904995688437@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-08-25 11:33:24 +01:00
Davide Caratti
cd9b50adc6 net/sched: ets: fix crash when flipping from 'strict' to 'quantum'
While running kselftests, Hangbin observed that sch_ets.sh often crashes,
and splats like the following one are seen in the output of 'dmesg':

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 159f12067 P4D 159f12067 PUD 159f13067 PMD 0
 Oops: 0000 [#1] SMP NOPTI
 CPU: 2 PID: 921 Comm: tc Not tainted 5.14.0-rc6+ #458
 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
 RIP: 0010:__list_del_entry_valid+0x2d/0x50
 Code: 48 8b 57 08 48 b9 00 01 00 00 00 00 ad de 48 39 c8 0f 84 ac 6e 5b 00 48 b9 22 01 00 00 00 00 ad de 48 39 ca 0f 84 cf 6e 5b 00 <48> 8b 32 48 39 fe 0f 85 af 6e 5b 00 48 8b 50 08 48 39 f2 0f 85 94
 RSP: 0018:ffffb2da005c3890 EFLAGS: 00010217
 RAX: 0000000000000000 RBX: ffff9073ba23f800 RCX: dead000000000122
 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff9073ba23fbc8
 RBP: ffff9073ba23f890 R08: 0000000000000001 R09: 0000000000000001
 R10: 0000000000000001 R11: 0000000000000001 R12: dead000000000100
 R13: ffff9073ba23fb00 R14: 0000000000000002 R15: 0000000000000002
 FS:  00007f93e5564e40(0000) GS:ffff9073bba00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 000000014ad34000 CR4: 0000000000350ee0
 Call Trace:
  ets_qdisc_reset+0x6e/0x100 [sch_ets]
  qdisc_reset+0x49/0x1d0
  tbf_reset+0x15/0x60 [sch_tbf]
  qdisc_reset+0x49/0x1d0
  dev_reset_queue.constprop.42+0x2f/0x90
  dev_deactivate_many+0x1d3/0x3d0
  dev_deactivate+0x56/0x90
  qdisc_graft+0x47e/0x5a0
  tc_get_qdisc+0x1db/0x3e0
  rtnetlink_rcv_msg+0x164/0x4c0
  netlink_rcv_skb+0x50/0x100
  netlink_unicast+0x1a5/0x280
  netlink_sendmsg+0x242/0x480
  sock_sendmsg+0x5b/0x60
  ____sys_sendmsg+0x1f2/0x260
  ___sys_sendmsg+0x7c/0xc0
  __sys_sendmsg+0x57/0xa0
  do_syscall_64+0x3a/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f93e44b8338
 Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 43 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55
 RSP: 002b:00007ffc0db737a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 0000000061255c06 RCX: 00007f93e44b8338
 RDX: 0000000000000000 RSI: 00007ffc0db73810 RDI: 0000000000000003
 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
 R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000001
 R13: 0000000000687880 R14: 0000000000000000 R15: 0000000000000000
 Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt iTCO_vendor_support intel_rapl_msr intel_rapl_common joydev i2c_i801 pcspkr i2c_smbus lpc_ich virtio_balloon ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel libata serio_raw virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod
 CR2: 0000000000000000

When the change() function decreases the value of 'nstrict', we must take
into account that packets might be already enqueued on a class that flips
from 'strict' to 'quantum': otherwise that class will not be added to the
bandwidth-sharing list. Then, a call to ets_qdisc_reset() will attempt to
do list_del(&alist) with 'alist' filled with zero, hence the NULL pointer
dereference.
For classes flipping from 'strict' to 'quantum', initialize an empty list
and eventually add it to the bandwidth-sharing list, if there are packets
already enqueued. In this way, the kernel will:
 a) prevent crashing as described above.
 b) avoid retaining the backlog packets (for an arbitrarily long time) in
    case no packet is enqueued after a change from 'strict' to 'quantum'.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: dcc68b4d80 ("net: sch_ets: Add a new Qdisc")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25 11:15:30 +01:00
Shai Malin
e543468869 qede: Fix memset corruption
Thanks to Kees Cook who detected the problem of memset that starting
from not the first member, but sized for the whole struct.
The better change will be to remove the redundant memset and to clear
only the msix_cnt member.

Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Reported-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25 11:07:55 +01:00
Song Yoong Siang
2b9fff64f0 net: stmmac: fix kernel panic due to NULL pointer dereference of buf->xdp
Ensure a valid XSK buffer before proceed to free the xdp buffer.

The following kernel panic is observed without this patch:

RIP: 0010:xp_free+0x5/0x40
Call Trace:
stmmac_napi_poll_rxtx+0x332/0xb30 [stmmac]
? stmmac_tx_timer+0x3c/0xb0 [stmmac]
net_rx_action+0x13d/0x3d0
__do_softirq+0xfc/0x2fb
? smpboot_register_percpu_thread+0xe0/0xe0
run_ksoftirqd+0x32/0x70
smpboot_thread_fn+0x1d8/0x2c0
kthread+0x169/0x1a0
? kthread_park+0x90/0x90
ret_from_fork+0x1f/0x30
---[ end trace 0000000000000002 ]---

Fixes: bba2556efa ("net: stmmac: Enable RX via AF_XDP zero-copy")
Cc: <stable@vger.kernel.org> # 5.13.x
Suggested-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25 10:59:39 +01:00
Song Yoong Siang
a6451192da net: stmmac: fix kernel panic due to NULL pointer dereference of xsk_pool
After free xsk_pool, there is possibility that napi polling is still
running in the middle, thus causes a kernel crash due to kernel NULL
pointer dereference of rx_q->xsk_pool and tx_q->xsk_pool.

Fix this by changing the XDP pool setup sequence to:
 1. disable napi before free xsk_pool
 2. enable napi after init xsk_pool

The following kernel panic is observed without this patch:

RIP: 0010:xsk_uses_need_wakeup+0x5/0x10
Call Trace:
stmmac_napi_poll_rxtx+0x3a9/0xae0 [stmmac]
__napi_poll+0x27/0x130
net_rx_action+0x233/0x280
__do_softirq+0xe2/0x2b6
run_ksoftirqd+0x1a/0x20
smpboot_thread_fn+0xac/0x140
? sort_range+0x20/0x20
kthread+0x124/0x150
? set_kthread_struct+0x40/0x40
ret_from_fork+0x1f/0x30
---[ end trace a77c8956b79ac107 ]---

Fixes: bba2556efa ("net: stmmac: Enable RX via AF_XDP zero-copy")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25 10:59:39 +01:00
Harini Katakam
85520079af net: macb: Add a NULL check on desc_ptp
macb_ptp_desc will not return NULL under most circumstances with correct
Kconfig and IP design config register. But for the sake of the extreme
corner case, check for NULL when using the helper. In case of rx_tstamp,
no action is necessary except to return (similar to timestamp disabled)
and warn. In case of TX, return -EINVAL to let the skb be free. Perform
this check before marking skb in progress.
Fixes coverity warning:
(4) Event dereference:
Dereferencing a null pointer "desc_ptp"

Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25 10:39:17 +01:00
Michael Riesch
2d26f6e39a net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings
This reverts commit 2c896fb02e
"net: stmmac: dwmac-rk: add pd_gmac support for rk3399" and fixes
unbalanced pm_runtime_enable warnings.

In the commit to be reverted, support for power management was
introduced to the Rockchip glue code. Later, power management support
was introduced to the stmmac core code, resulting in multiple
invocations of pm_runtime_{enable,disable,get_sync,put_sync}.

The multiple invocations happen in rk_gmac_powerup and
stmmac_{dvr_probe, resume} as well as in rk_gmac_powerdown and
stmmac_{dvr_remove, suspend}, respectively, which are always called
in conjunction.

Fixes: 5ec5582343 ("net: stmmac: add clocks management for gmac driver")
Signed-off-by: Michael Riesch <michael.riesch@wolfvision.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25 10:37:17 +01:00
Johan Hovold
df7b16d1c0 Revert "USB: serial: ch341: fix character loss at high transfer rates"
This reverts commit 3c18e9baee.

These devices do not appear to send a zero-length packet when the
transfer size is a multiple of the bulk-endpoint max-packet size. This
means that incoming data may not be processed by the driver until a
short packet is received or the receive buffer is full.

Revert back to using endpoint-sized receive buffers to avoid stalled
reads.

Reported-by: Paul Größel <pb.g@gmx.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214131
Fixes: 3c18e9baee ("USB: serial: ch341: fix character loss at high transfer rates")
Cc: stable@vger.kernel.org
Cc: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20210824121926.19311-1-johan@kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-08-25 09:13:33 +02:00
Bin Meng
417166ddec riscv: dts: microchip: Add ethernet0 to the aliases node
U-Boot expects this alias to be in place in order to fix up the mac
address of the ethernet node.

Note on the Icicle Kit board, currently only emac1 is enabled so it
becomes the 'ethernet0'.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-24 20:57:32 -07:00
Bin Meng
719588dee2 riscv: dts: microchip: Use 'local-mac-address' for emac1
Per the DT spec, 'local-mac-address' is used to specify MAC address
that was assigned to the network device, while 'mac-address' is used
to specify the MAC address that was last used by the boot program,
and shall be used only if the value differs from 'local-mac-address'
property value.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: conor dooley <conor.dooley@microchip.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-24 20:56:47 -07:00
Vincent Chen
379eb01c21 riscv: Ensure the value of FP registers in the core dump file is up to date
The value of FP registers in the core dump file comes from the
thread.fstate. However, kernel saves the FP registers to the thread.fstate
only before scheduling out the process. If no process switch happens
during the exception handling process, kernel will not have a chance to
save the latest value of FP registers to thread.fstate. It will cause the
value of FP registers in the core dump file may be incorrect. To solve this
problem, this patch force lets kernel save the FP register into the
thread.fstate if the target task_struct equals the current.

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: b8c8a9590e ("RISC-V: Add FP register ptrace support for gdb.")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-24 20:54:10 -07:00
Li Jinlin
02c6dcd543 scsi: core: Fix hang of freezing queue between blocking and running device
We found a hang, the steps to reproduce  are as follows:

  1. blocking device via scsi_device_set_state()

  2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10

  3. echo none > /sys/block/sda/queue/scheduler

  4. echo "running" >/sys/block/sda/device/state

Step 3 and 4 should complete after step 4, but they hang.

  CPU#0               CPU#1                CPU#2
  ---------------     ----------------     ----------------
                                           Step 1: blocking device

                                           Step 2: dd xxxx
                                                  ^^^^^^ get request
                                                         q_usage_counter++

                      Step 3: switching scheculer
                      elv_iosched_store
                        elevator_switch
                          blk_mq_freeze_queue
                            blk_freeze_queue
                              > blk_freeze_queue_start
                                ^^^^^^ mq_freeze_depth++

                              > blk_mq_run_hw_queues
                                ^^^^^^ can't run queue when dev blocked

                              > blk_mq_freeze_queue_wait
                                ^^^^^^ Hang here!!!
                                       wait q_usage_counter==0

  Step 4: running device
  store_state_field
    scsi_rescan_device
      scsi_attach_vpd
        scsi_vpd_inquiry
          __scsi_execute
            blk_get_request
              blk_mq_alloc_request
                blk_queue_enter
                ^^^^^^ Hang here!!!
                       wait mq_freeze_depth==0

    blk_mq_run_hw_queues
    ^^^^^^ dispatch IO, q_usage_counter will reduce to zero

                            blk_mq_unfreeze_queue
                            ^^^^^ mq_freeze_depth--

To fix this, we need to run queue before rescanning device when the device
state changes to SDEV_RUNNING.

Link: https://lore.kernel.org/r/20210824025921.3277629-1-lijinlin3@huawei.com
Fixes: f0f82e2476 ("scsi: core: Fix capacity set to zero after offlinining device")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Li Jinlin <lijinlin3@huawei.com>
Signed-off-by: Qiu Laibin <qiulaibin@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 23:06:19 -04:00
DENG Qingfang
93100d6817 net: phy: mediatek: add the missing suspend/resume callbacks
Without suspend/resume callbacks, the PHY cannot be powered down/up
administratively.

Fixes: e40d2cca01 ("net: phy: add MediaTek Gigabit Ethernet PHY driver")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210823044422.164184-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-24 16:52:43 -07:00
Bart Van Assche
b6d2b054e8 mq-deadline: Fix request accounting
The block layer may call the I/O scheduler .finish_request() callback
without having called the .insert_requests() callback. Make sure that the
mq-deadline I/O statistics are correct if the block layer inserts an I/O
request that bypasses the I/O scheduler. This patch prevents that lower
priority I/O is delayed longer than necessary for mixed I/O priority
workloads.

Cc: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Fixes: 08a9ad8bf6 ("block/mq-deadline: Add cgroup support")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210824170520.1659173-1-bvanassche@acm.org
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-24 16:18:01 -06:00
Linus Torvalds
6e764bcd1c Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Several small fixes, the first three are significant:

   - mlx5 crash unloading drivers with a rare HW config

   - missing userspace reporting for the new dmabuf objects

   - random rxe failure due to missing memory zeroing

   - static checker/etc reports: missing spin lock init, null pointer
     deref on error, extra unlock on error path, memory allocation under
     spinlock, missing IRQ vector cleanup

   - kconfig typo in the new irdma driver"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/rxe: Zero out index member of struct rxe_queue
  RDMA/efa: Free IRQ vectors on error flow
  RDMA/rxe: Fix memory allocation while in a spin lock
  RDMA/bnxt_re: Remove unpaired rtnl unlock in bnxt_re_dev_init()
  IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs()
  RDMA/irdma: Use correct kconfig symbol for AUXILIARY_BUS
  RDMA/bnxt_re: Add missing spin lock initialization
  RDMA/uverbs: Track dmabuf memory regions
  RDMA/mlx5: Fix crash when unbind multiport slave
2021-08-24 09:55:50 -07:00
Borislav Petkov
c41a4e877a drm/amdgpu: Fix build with missing pm_suspend_target_state module export
Building a randconfig here triggered:

  ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

because the module export of that symbol happens in
kernel/power/suspend.c which is enabled with CONFIG_SUSPEND.

The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for
CONFIG_PM_SLEEP which is defined like this:

  config PM_SLEEP
          def_bool y
          depends on SUSPEND || HIBERNATE_CALLBACKS

and that randconfig has:

  # CONFIG_SUSPEND is not set
  CONFIG_HIBERNATE_CALLBACKS=y

leading to the module export missing.

Change the ifdeffery to depend directly on CONFIG_SUSPEND.

Fixes: 5706cb3c91 ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/YSP6Lv53QV0cOAsd@zn.tnic
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-24 11:57:44 -04:00
Zhengjun Zhang
2829a4e3cf USB: serial: option: add new VID/PID to support Fibocom FG150
Fibocom FG150 is a 5G module based on Qualcomm SDX55 platform,
support Sub-6G band.

Here are the outputs of lsusb -v and usb-devices:

> T:  Bus=02 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  2 Spd=5000 MxCh= 0
> D:  Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs=  1
> P:  Vendor=2cb7 ProdID=010b Rev=04.14
> S:  Manufacturer=Fibocom
> S:  Product=Fibocom Modem_SN:XXXXXXXX
> S:  SerialNumber=XXXXXXXX
> C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA
> I:  If#=0x0 Alt= 0 #EPs= 1 Cls=ef(misc ) Sub=04 Prot=01 Driver=rndis_host
> I:  If#=0x1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host
> I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
> I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none)
> I:  If#=0x4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)

> Bus 002 Device 002: ID 2cb7:010b Fibocom Fibocom Modem_SN:XXXXXXXX
> Device Descriptor:
>   bLength                18
>   bDescriptorType         1
>   bcdUSB               3.20
>   bDeviceClass            0
>   bDeviceSubClass         0
>   bDeviceProtocol         0
>   bMaxPacketSize0         9
>   idVendor           0x2cb7 Fibocom
>   idProduct          0x010b
>   bcdDevice            4.14
>   iManufacturer           1 Fibocom
>   iProduct                2 Fibocom Modem_SN:XXXXXXXX
>   iSerial                 3 XXXXXXXX
>   bNumConfigurations      1
>   Configuration Descriptor:
>     bLength                 9
>     bDescriptorType         2
>     wTotalLength       0x00e6
>     bNumInterfaces          5
>     bConfigurationValue     1
>     iConfiguration          4 RNDIS_DUN_DIAG_ADB
>     bmAttributes         0xa0
>       (Bus Powered)
>       Remote Wakeup
>     MaxPower              896mA
>     Interface Association:
>       bLength                 8
>       bDescriptorType        11
>       bFirstInterface         0
>       bInterfaceCount         2
>       bFunctionClass        239 Miscellaneous Device
>       bFunctionSubClass       4
>       bFunctionProtocol       1
>       iFunction               7 RNDIS
>     Interface Descriptor:
>       bLength                 9
>       bDescriptorType         4
>       bInterfaceNumber        0
>       bAlternateSetting       0
>       bNumEndpoints           1
>       bInterfaceClass       239 Miscellaneous Device
>       bInterfaceSubClass      4
>       bInterfaceProtocol      1
>       iInterface              0
>       ** UNRECOGNIZED:  05 24 00 10 01
>       ** UNRECOGNIZED:  05 24 01 00 01
>       ** UNRECOGNIZED:  04 24 02 00
>       ** UNRECOGNIZED:  05 24 06 00 01
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x81  EP 1 IN
>         bmAttributes            3
>           Transfer Type            Interrupt
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x0008  1x 8 bytes
>         bInterval               9
>         bMaxBurst               0
>     Interface Descriptor:
>       bLength                 9
>       bDescriptorType         4
>       bInterfaceNumber        1
>       bAlternateSetting       0
>       bNumEndpoints           2
>       bInterfaceClass        10 CDC Data
>       bInterfaceSubClass      0
>       bInterfaceProtocol      0
>       iInterface              0
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x8e  EP 14 IN
>         bmAttributes            2
>           Transfer Type            Bulk
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x0400  1x 1024 bytes
>         bInterval               0
>         bMaxBurst               6
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x0f  EP 15 OUT
>         bmAttributes            2
>           Transfer Type            Bulk
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x0400  1x 1024 bytes
>         bInterval               0
>         bMaxBurst               6
>     Interface Descriptor:
>       bLength                 9
>       bDescriptorType         4
>       bInterfaceNumber        2
>       bAlternateSetting       0
>       bNumEndpoints           3
>       bInterfaceClass       255 Vendor Specific Class
>       bInterfaceSubClass      0
>       bInterfaceProtocol      0
>       iInterface              0
>       ** UNRECOGNIZED:  05 24 00 10 01
>       ** UNRECOGNIZED:  05 24 01 00 00
>       ** UNRECOGNIZED:  04 24 02 02
>       ** UNRECOGNIZED:  05 24 06 00 00
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x83  EP 3 IN
>         bmAttributes            3
>           Transfer Type            Interrupt
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x000a  1x 10 bytes
>         bInterval               9
>         bMaxBurst               0
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x82  EP 2 IN
>         bmAttributes            2
>           Transfer Type            Bulk
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x0400  1x 1024 bytes
>         bInterval               0
>         bMaxBurst               0
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x01  EP 1 OUT
>         bmAttributes            2
>           Transfer Type            Bulk
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x0400  1x 1024 bytes
>         bInterval               0
>         bMaxBurst               0
>     Interface Descriptor:
>       bLength                 9
>       bDescriptorType         4
>       bInterfaceNumber        3
>       bAlternateSetting       0
>       bNumEndpoints           2
>       bInterfaceClass       255 Vendor Specific Class
>       bInterfaceSubClass    255 Vendor Specific Subclass
>       bInterfaceProtocol     48
>       iInterface              0
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x84  EP 4 IN
>         bmAttributes            2
>           Transfer Type            Bulk
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x0400  1x 1024 bytes
>         bInterval               0
>         bMaxBurst               0
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x02  EP 2 OUT
>         bmAttributes            2
>           Transfer Type            Bulk
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x0400  1x 1024 bytes
>         bInterval               0
>         bMaxBurst               0
>     Interface Descriptor:
>       bLength                 9
>       bDescriptorType         4
>       bInterfaceNumber        4
>       bAlternateSetting       0
>       bNumEndpoints           2
>       bInterfaceClass       255 Vendor Specific Class
>       bInterfaceSubClass     66
>       bInterfaceProtocol      1
>       iInterface              0
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x03  EP 3 OUT
>         bmAttributes            2
>           Transfer Type            Bulk
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x0400  1x 1024 bytes
>         bInterval               0
>         bMaxBurst               0
>       Endpoint Descriptor:
>         bLength                 7
>         bDescriptorType         5
>         bEndpointAddress     0x85  EP 5 IN
>         bmAttributes            2
>           Transfer Type            Bulk
>           Synch Type               None
>           Usage Type               Data
>         wMaxPacketSize     0x0400  1x 1024 bytes
>         bInterval               0
>         bMaxBurst               0
> Binary Object Store Descriptor:
>   bLength                 5
>   bDescriptorType        15
>   wTotalLength       0x0016
>   bNumDeviceCaps          2
>   USB 2.0 Extension Device Capability:
>     bLength                 7
>     bDescriptorType        16
>     bDevCapabilityType      2
>     bmAttributes   0x00000006
>       BESL Link Power Management (LPM) Supported
>   SuperSpeed USB Device Capability:
>     bLength                10
>     bDescriptorType        16
>     bDevCapabilityType      3
>     bmAttributes         0x00
>     wSpeedsSupported   0x000f
>       Device can operate at Low Speed (1Mbps)
>       Device can operate at Full Speed (12Mbps)
>       Device can operate at High Speed (480Mbps)
>       Device can operate at SuperSpeed (5Gbps)
>     bFunctionalitySupport   1
>       Lowest fully-functional device speed is Full Speed (12Mbps)
>     bU1DevExitLat           1 micro seconds
>     bU2DevExitLat         500 micro seconds
> Device Status:     0x0000
>   (Bus Powered)

Signed-off-by: Zhengjun Zhang <zhangzhengjun@aicrobo.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-08-24 15:46:09 +02:00
Nathan Rossi
3b0720ba00 net: dsa: mv88e6xxx: Update mv88e6393x serdes errata
In early erratas this issue only covered port 0 when changing from
[x]MII (rev A 3.6). In subsequent errata versions this errata changed to
cover the additional "Hardware reset in CPU managed mode" condition, and
removed the note specifying that it only applied to port 0.

In designs where the device is configured with CPU managed mode
(CPU_MGD), on reset all SERDES ports (p0, p9, p10) have a stuck power
down bit and require this initial power up procedure. As such apply this
errata to all three SERDES ports of the mv88e6393x.

Signed-off-by: Nathan Rossi <nathan.rossi@digi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24 10:48:46 +01:00
zhang kai
446e7f218b ipv6: correct comments about fib6_node sernum
correct comments in set and get fn_sernum

Signed-off-by: zhang kai <zhangkaiheb@126.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24 09:59:01 +01:00
Shai Malin
b0cd08537d qed: Fix the VF msix vectors flow
For VFs we should return with an error in case we didn't get the exact
number of msix vectors as we requested.
Not doing that will lead to a crash when starting queues for this VF.

Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24 09:22:37 +01:00
Alexey Gladkov
bbb6d0f3e1 ucounts: Increase ucounts reference counter before the security hook
We need to increment the ucounts reference counter befor security_prepare_creds()
because this function may fail and abort_creds() will try to decrement
this reference.

[   96.465056][ T8641] FAULT_INJECTION: forcing a failure.
[   96.465056][ T8641] name fail_page_alloc, interval 1, probability 0, space 0, times 0
[   96.478453][ T8641] CPU: 1 PID: 8641 Comm: syz-executor668 Not tainted 5.14.0-rc6-syzkaller #0
[   96.487215][ T8641] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[   96.497254][ T8641] Call Trace:
[   96.500517][ T8641]  dump_stack_lvl+0x1d3/0x29f
[   96.505758][ T8641]  ? show_regs_print_info+0x12/0x12
[   96.510944][ T8641]  ? log_buf_vmcoreinfo_setup+0x498/0x498
[   96.516652][ T8641]  should_fail+0x384/0x4b0
[   96.521141][ T8641]  prepare_alloc_pages+0x1d1/0x5a0
[   96.526236][ T8641]  __alloc_pages+0x14d/0x5f0
[   96.530808][ T8641]  ? __rmqueue_pcplist+0x2030/0x2030
[   96.536073][ T8641]  ? lockdep_hardirqs_on_prepare+0x3e2/0x750
[   96.542056][ T8641]  ? alloc_pages+0x3f3/0x500
[   96.546635][ T8641]  allocate_slab+0xf1/0x540
[   96.551120][ T8641]  ___slab_alloc+0x1cf/0x350
[   96.555689][ T8641]  ? kzalloc+0x1d/0x30
[   96.559740][ T8641]  __kmalloc+0x2e7/0x390
[   96.563980][ T8641]  ? kzalloc+0x1d/0x30
[   96.568029][ T8641]  kzalloc+0x1d/0x30
[   96.571903][ T8641]  security_prepare_creds+0x46/0x220
[   96.577174][ T8641]  prepare_creds+0x411/0x640
[   96.581747][ T8641]  __sys_setfsuid+0xe2/0x3a0
[   96.586333][ T8641]  do_syscall_64+0x3d/0xb0
[   96.590739][ T8641]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   96.596611][ T8641] RIP: 0033:0x445a69
[   96.600483][ T8641] Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
[   96.620152][ T8641] RSP: 002b:00007f1054173318 EFLAGS: 00000246 ORIG_RAX: 000000000000007a
[   96.628543][ T8641] RAX: ffffffffffffffda RBX: 00000000004ca4c8 RCX: 0000000000445a69
[   96.636600][ T8641] RDX: 0000000000000010 RSI: 00007f10541732f0 RDI: 0000000000000000
[   96.644550][ T8641] RBP: 00000000004ca4c0 R08: 0000000000000001 R09: 0000000000000000
[   96.652500][ T8641] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004ca4cc
[   96.660631][ T8641] R13: 00007fffffe0b62f R14: 00007f1054173400 R15: 0000000000022000

Fixes: 905ae01c4a ("Add a reference to ucounts for each cred")
Reported-by: syzbot+01985d7909f9468f013c@syzkaller.appspotmail.com
Signed-off-by: Alexey Gladkov <legion@kernel.org>
Link: https://lkml.kernel.org/r/97433b1742c3331f02ad92de5a4f07d673c90613.1629735352.git.legion@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-08-23 16:13:04 -05:00
Eric W. Biederman
5ddf994fa2 ucounts: Fix regression preventing increasing of rlimits in init_user_ns
"Ma, XinjianX" <xinjianx.ma@intel.com> reported:

> When lkp team run kernel selftests, we found after these series of patches, testcase mqueue: mq_perf_tests
> in kselftest failed with following message.
>
> # selftests: mqueue: mq_perf_tests
> #
> # Initial system state:
> #       Using queue path:                       /mq_perf_tests
> #       RLIMIT_MSGQUEUE(soft):                  819200
> #       RLIMIT_MSGQUEUE(hard):                  819200
> #       Maximum Message Size:                   8192
> #       Maximum Queue Size:                     10
> #       Nice value:                             0
> #
> # Adjusted system state for testing:
> #       RLIMIT_MSGQUEUE(soft):                  (unlimited)
> #       RLIMIT_MSGQUEUE(hard):                  (unlimited)
> #       Maximum Message Size:                   16777216
> #       Maximum Queue Size:                     65530
> #       Nice value:                             -20
> #       Continuous mode:                        (disabled)
> #       CPUs to pin:                            3
> # ./mq_perf_tests: mq_open() at 296: Too many open files
> not ok 2 selftests: mqueue: mq_perf_tests # exit=1
> ```
>
> Test env:
> rootfs: debian-10
> gcc version: 9

After investigation the problem turned out to be that ucount_max for
the rlimits in init_user_ns was being set to the initial rlimit value.
The practical problem is that ucount_max provides a limit that
applications inside the user namespace can not exceed.  Which means in
practice that rlimits that have been converted to use the ucount
infrastructure were not able to exceend their initial rlimits.

Solve this by setting the relevant values of ucount_max to
RLIM_INIFINITY.  A limit in init_user_ns is pointless so the code
should allow the values to grow as large as possible without riscking
an underflow or an overflow.

As the ltp test case was a bit of a pain I have reproduced the rlimit failure
and tested the fix with the following little C program:
> #include <stdio.h>
> #include <fcntl.h>
> #include <sys/stat.h>
> #include <mqueue.h>
> #include <sys/time.h>
> #include <sys/resource.h>
> #include <errno.h>
> #include <string.h>
> #include <stdlib.h>
> #include <limits.h>
> #include <unistd.h>
>
> int main(int argc, char **argv)
> {
> 	struct mq_attr mq_attr;
> 	struct rlimit rlim;
> 	mqd_t mqd;
> 	int ret;
>
> 	ret = getrlimit(RLIMIT_MSGQUEUE, &rlim);
> 	if (ret != 0) {
> 		fprintf(stderr, "getrlimit(RLIMIT_MSGQUEUE) failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
> 	printf("RLIMIT_MSGQUEUE %lu %lu\n",
> 	       rlim.rlim_cur, rlim.rlim_max);
> 	rlim.rlim_cur = RLIM_INFINITY;
> 	rlim.rlim_max = RLIM_INFINITY;
> 	ret = setrlimit(RLIMIT_MSGQUEUE, &rlim);
> 	if (ret != 0) {
> 		fprintf(stderr, "setrlimit(RLIMIT_MSGQUEUE, RLIM_INFINITY) failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
>
> 	memset(&mq_attr, 0, sizeof(struct mq_attr));
> 	mq_attr.mq_maxmsg = 65536 - 1;
> 	mq_attr.mq_msgsize = 16*1024*1024 - 1;
>
> 	mqd = mq_open("/mq_rlimit_test", O_RDONLY|O_CREAT, 0600, &mq_attr);
> 	if (mqd == (mqd_t)-1) {
> 		fprintf(stderr, "mq_open failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
> 	ret = mq_close(mqd);
> 	if (ret) {
> 		fprintf(stderr, "mq_close failed; %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
>
> 	return EXIT_SUCCESS;
> }

Fixes: 6e52a9f053 ("Reimplement RLIMIT_MSGQUEUE on top of ucounts")
Fixes: d7c9e99aee ("Reimplement RLIMIT_MEMLOCK on top of ucounts")
Fixes: d646969055 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
Fixes: 21d1c5e386 ("Reimplement RLIMIT_NPROC on top of ucounts")
Reported-by: kernel test robot lkp@intel.com
Acked-by: Alexey Gladkov <legion@kernel.org>
Link: https://lkml.kernel.org/r/87eeajswfc.fsf_-_@disp2133
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-08-23 16:10:42 -05:00
Daniel Borkmann
5b029a32cf bpf: Fix ringbuf helper function compatibility
Commit 457f44363a ("bpf: Implement BPF ring buffer and verifier support
for it") extended check_map_func_compatibility() by enforcing map -> helper
function match, but not helper -> map type match.

Due to this all of the bpf_ringbuf_*() helper functions could be used with
a wrong map type such as array or hash map, leading to invalid access due
to type confusion.

Also, both BPF_FUNC_ringbuf_{submit,discard} have ARG_PTR_TO_ALLOC_MEM as
argument and not a BPF map. Therefore, their check_map_func_compatibility()
presence is incorrect since it's only for map type checking.

Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Ryota Shiga (Flatt Security)
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-08-23 23:09:10 +02:00
Linus Torvalds
d5ae8d7f85 Revert "media: dvb header files: move some headers to staging"
This reverts commit 819fbd3d8e.

It turns out that some user-space applications use these uapi header
files, so even though the only user of the interface is an old driver
that was moved to staging, moving the header files causes unnecessary
pain.

Generally, we really don't want user space to use kernel headers
directly (exactly because it causes pain when we re-organize), and
instead copy them as needed.  But these things happen, and the headers
were in the uapi directory, so I guess it's not entirely unreasonable.

Link: https://lore.kernel.org/lkml/4e3e0d40-df4a-94f8-7c2d-85010b0873c4@web.de/
Reported-by: Soeren Moch <smoch@web.de>
Cc: stable@kernel.org  # 5.13
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-23 09:49:09 -07:00
Rafael J. Wysocki
1f8b66d965 Merge branch 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm into pm-opp
Pull regression fix for the operating performance points (OPP)
framework for v5.15 from Viresh Kumar:

"This fixes regression in the OPP core for a corner case."

* 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  opp: core: Check for pending links before reading required_opp pointers
2021-08-23 13:51:30 +02:00
David S. Miller
14315498f5 Merge branch 'asix-fixes'
Oleksij Rempel says:

====================
asix fixes

changes v2:
- rebase against current net
- add one more fix for the ax88178 variant
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23 12:39:42 +01:00
Oleksij Rempel
1406e8cb4b net: usb: asix: do not call phy_disconnect() for ax88178
Fix crash on reboot on a system with ASIX AX88178 USB adapter attached
to it:
| asix 1-1.4:1.0 eth0: unregister 'asix' usb-ci_hdrc.0-1.4, ASIX AX88178 USB 2.0 Ethernet
| 8<--- cut here ---
| Unable to handle kernel NULL pointer dereference at virtual address 0000028c
| pgd = 5ec93aee
| [0000028c] *pgd=00000000
| Internal error: Oops: 5 [#1] PREEMPT SMP ARM
| Modules linked in:
| CPU: 1 PID: 1 Comm: systemd-shutdow Not tainted 5.14.0-rc1-20210811-1 #4
| Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
| PC is at phy_disconnect+0x8/0x48
| LR is at ax88772_unbind+0x14/0x20
| [<80650d04>] (phy_disconnect) from [<80741aa4>] (ax88772_unbind+0x14/0x20)
| [<80741aa4>] (ax88772_unbind) from [<8074e250>] (usbnet_disconnect+0x48/0xd8)
| [<8074e250>] (usbnet_disconnect) from [<807655e0>] (usb_unbind_interface+0x78/0x25c)
| [<807655e0>] (usb_unbind_interface) from [<805b03a0>] (__device_release_driver+0x154/0x20c)
| [<805b03a0>] (__device_release_driver) from [<805b0478>] (device_release_driver+0x20/0x2c)
| [<805b0478>] (device_release_driver) from [<805af944>] (bus_remove_device+0xcc/0xf8)
| [<805af944>] (bus_remove_device) from [<805ab26c>] (device_del+0x178/0x4b0)
| [<805ab26c>] (device_del) from [<807634a4>] (usb_disable_device+0xcc/0x178)
| [<807634a4>] (usb_disable_device) from [<8075a060>] (usb_disconnect+0xd8/0x238)
| [<8075a060>] (usb_disconnect) from [<8075a02c>] (usb_disconnect+0xa4/0x238)
| [<8075a02c>] (usb_disconnect) from [<8075a02c>] (usb_disconnect+0xa4/0x238)
| [<8075a02c>] (usb_disconnect) from [<80af3520>] (usb_remove_hcd+0xa0/0x198)
| [<80af3520>] (usb_remove_hcd) from [<807902e0>] (host_stop+0x38/0xa8)
| [<807902e0>] (host_stop) from [<8078d9e4>] (ci_hdrc_remove+0x3c/0x118)
| [<8078d9e4>] (ci_hdrc_remove) from [<805b27ec>] (platform_remove+0x20/0x50)
| [<805b27ec>] (platform_remove) from [<805b03a0>] (__device_release_driver+0x154/0x20c)
| [<805b03a0>] (__device_release_driver) from [<805b0478>] (device_release_driver+0x20/0x2c)
| [<805b0478>] (device_release_driver) from [<805af944>] (bus_remove_device+0xcc/0xf8)
| [<805af944>] (bus_remove_device) from [<805ab26c>] (device_del+0x178/0x4b0)

For this adapter we call ax88178_bind() and ax88772_unbind(), which is
related to different chip version and different counter part *bind()
function.

Since this chip is currently not ported to the PHYLIB, we do not need to
call phy_disconnect() here. So, to fix this crash, we need to add
ax88178_unbind().

Fixes: e532a096be ("net: usb: asix: ax88772: add phylib support")
Reported-by: Robin van der Gracht <robin@protonic.nl>
Tested-by: Robin van der Gracht <robin@protonic.nl>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23 12:39:42 +01:00
Oleksij Rempel
7a141e64cf net: usb: asix: ax88772: move embedded PHY detection as early as possible
Some HW revisions need additional MAC configuration before the embedded PHY
can be enabled. If this is not done, we won't be able to get response
from the internal PHY.

This issue was detected on chipcode == AX_AX88772_CHIPCODE variant,
where ax88772_hw_reset() was executed with missing embd_phy flag.

Fixes: e532a096be ("net: usb: asix: ax88772: add phylib support")
Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23 12:39:41 +01:00
Sai Krishna Potthuri
ed104ca4bd reset: reset-zynqmp: Fixed the argument data type
This patch changes the data type of the variable 'val' from
int to u32.

Addresses-Coverity: argument of type "int *" is incompatible with parameter of type "u32 *"
Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lore.kernel.org/r/925cebbe4eb73c7d0a536da204748d33c7100d8c.1624448778.git.michal.simek@xilinx.com
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2021-08-23 12:55:18 +02:00
Maxim Kiselev
359f4cdd7d net: marvell: fix MVNETA_TX_IN_PRGRS bit number
According to Armada XP datasheet bit at 0 position is corresponding for
TxInProg indication.

Fixes: c5aff18204 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23 11:51:26 +01:00
Wong Vee Khee
82a44ae113 net: stmmac: fix kernel panic due to NULL pointer dereference of plat->est
In the case of taprio offload is not enabled, the error handling path
causes a kernel crash due to kernel NULL pointer deference.

Fix this by adding check for NULL before attempt to access 'plat->est'
on the mutex_lock() call.

The following kernel panic is observed without this patch:

RIP: 0010:mutex_lock+0x10/0x20
Call Trace:
tc_setup_taprio+0x482/0x560 [stmmac]
kmem_cache_alloc_trace+0x13f/0x490
taprio_disable_offload.isra.0+0x9d/0x180 [sch_taprio]
taprio_destroy+0x6c/0x100 [sch_taprio]
qdisc_create+0x2e5/0x4f0
tc_modify_qdisc+0x126/0x740
rtnetlink_rcv_msg+0x12b/0x380
_raw_spin_lock_irqsave+0x19/0x40
_raw_spin_unlock_irqrestore+0x18/0x30
create_object+0x212/0x340
rtnl_calcit.isra.0+0x110/0x110
netlink_rcv_skb+0x50/0x100
netlink_unicast+0x191/0x230
netlink_sendmsg+0x243/0x470
sock_sendmsg+0x5e/0x60
____sys_sendmsg+0x20b/0x280
copy_msghdr_from_user+0x5c/0x90
__mod_memcg_state+0x87/0xf0
 ___sys_sendmsg+0x7c/0xc0
lru_cache_add+0x7f/0xa0
_raw_spin_unlock+0x16/0x30
wp_page_copy+0x449/0x890
handle_mm_fault+0x921/0xfc0
__sys_sendmsg+0x59/0xa0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace b1f19b24368a96aa ]---

Fixes: b60189e039 ("net: stmmac: Integrate EST with TAPRIO scheduler API")
Cc: <stable@vger.kernel.org> # 5.10.x
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23 11:49:34 +01:00
David S. Miller
46002bf300 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-08-20

This series contains updates to igc and e1000e drivers.

Aaron Ma resolves a page fault which occurs when thunderbolt is
unplugged for igc.

Toshiki Nishioka fixes Tx queue looping to use actual number of queues
instead of max value for igc.

Sasha fixes an incorrect latency comparison by decoding the values before
comparing and prevents attempted writes to read-only NVMs for e1000e.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23 11:45:37 +01:00
Christophe JAILLET
5ed74b03eb xgene-v2: Fix a resource leak in the error handling path of 'xge_probe()'
A successful 'xge_mdio_config()' call should be balanced by a corresponding
'xge_mdio_remove()' call in the error handling path of the probe, as
already done in the remove function.

Update the error handling path accordingly.

Fixes: ea8ab16ab2 ("drivers: net: xgene-v2: Add MDIO support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23 11:23:48 +01:00
Marijn Suijten
19526d092c opp: core: Check for pending links before reading required_opp pointers
Commit 4fa82a87ba ("opp: Allow required-opps to be used for non genpd
use cases") dereferences the pointers in required_opp_tables but these
might be set to an ERR_PTR if the list still has lazy links pending,
resulting in segfaults.  Prior to this patch IS_ERR was also checked on
required_opp_tables[i] before reading ->is_genpd inside
_opp_table_alloc_required_tables, which is at the same time the
predicate to add this table to the lazy list.  This segfault is solved
by reordering the checks to bail on lazy pending tables before reading
->is_genpd.

Fixes: 4fa82a87ba ("opp: Allow required-opps to be used for non genpd use cases")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-08-23 12:44:55 +05:30
Linus Torvalds
e22ce8eb63 Linux 5.14-rc7 2021-08-22 14:24:56 -07:00
Shreyansh Chouhan
9cf448c200 ip6_gre: add validation for csum_start
Validate csum_start in gre_handle_offloads before we call _gre_xmit so
that we do not crash later when the csum_start value is used in the
lco_csum function call.

This patch deals with ipv6 code.

Fixes: Fixes: b05229f442 ("gre6: Cleanup GREv6 transmit path, call common
GRE functions")
Reported-by: syzbot+ff8e1b9f2f36481e2efc@syzkaller.appspotmail.com
Signed-off-by: Shreyansh Chouhan <chouhan.shreyansh630@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-22 21:24:41 +01:00
Shreyansh Chouhan
1d011c4803 ip_gre: add validation for csum_start
Validate csum_start in gre_handle_offloads before we call _gre_xmit so
that we do not crash later when the csum_start value is used in the
lco_csum function call.

This patch deals with ipv4 code.

Fixes: c544193214 ("GRE: Refactor GRE tunneling code.")
Reported-by: syzbot+ff8e1b9f2f36481e2efc@syzkaller.appspotmail.com
Signed-off-by: Shreyansh Chouhan <chouhan.shreyansh630@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-22 21:24:40 +01:00
Linus Torvalds
1bdc3d5be7 Merge tag 'powerpc-5.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - Fix random crashes on some 32-bit CPUs by adding isync() after
   locking/unlocking KUEP

 - Fix intermittent crashes when loading modules with strict module RWX

 - Fix a section mismatch introduce by a previous fix.

Thanks to Christophe Leroy, Fabiano Rosas, Laurent Vivier, Murilo
Opsfelder Araújo, Nathan Chancellor, and Stan Johnson.

h# -----BEGIN PGP SIGNATURE-----

* tag 'powerpc-5.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Fix set_memory_*() against concurrent accesses
  powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP
  powerpc/xive: Do not mark xive_request_ipi() as __init
2021-08-22 09:49:31 -07:00
Babu Moger
527f721478 x86/resctrl: Fix a maybe-uninitialized build warning treated as error
The recent commit

  064855a690 ("x86/resctrl: Fix default monitoring groups reporting")

caused a RHEL build failure with an uninitialized variable warning
treated as an error because it removed the default case snippet.

The RHEL Makefile uses '-Werror=maybe-uninitialized' to force possibly
uninitialized variable warnings to be treated as errors. This is also
reported by smatch via the 0day robot.

The error from the RHEL build is:

  arch/x86/kernel/cpu/resctrl/monitor.c: In function ‘__mon_event_count’:
  arch/x86/kernel/cpu/resctrl/monitor.c:261:12: error: ‘m’ may be used
  uninitialized in this function [-Werror=maybe-uninitialized]
    m->chunks += chunks;
              ^~

The upstream Makefile does not build using '-Werror=maybe-uninitialized'.
So, the problem is not seen there. Fix the problem by putting back the
default case snippet.

 [ bp: note that there's nothing wrong with the code and other compilers
   do not trigger this warning - this is being done just so the RHEL compiler
   is happy. ]

Fixes: 064855a690 ("x86/resctrl: Fix default monitoring groups reporting")
Reported-by: Terry Bowman <Terry.Bowman@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/162949631908.23903.17090272726012848523.stgit@bmoger-ubuntu
2021-08-22 09:11:29 +02:00
Linus Torvalds
9ff50bf2f2 Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk driver fixes from Stephen Boyd:

 - Make the regulator state match the GDSC power domain state at boot on
   Qualcomm SoCs so that the regulator isn't turned off inadvertently.

 - Fix earlycon on i.MX6Q SoCs

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: qcom: gdsc: Ensure regulator init state matches GDSC state
  clk: imx6q: fix uart earlycon unwork
2021-08-21 11:27:16 -07:00
Linus Torvalds
9085423f0e Merge tag 'char-misc-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are some small driver fixes for 5.14-rc7.

  They consist of:

   - revert for an interconnect patch that was found to have problems

   - ipack tpci200 driver fixes for reported problems

   - slimbus messaging and ngd fixes for reported problems

  All are small and have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  ipack: tpci200: fix memory leak in the tpci200_register
  ipack: tpci200: fix many double free issues in tpci200_pci_probe
  slimbus: ngd: reset dma setup during runtime pm
  slimbus: ngd: set correct device for pm
  slimbus: messaging: check for valid transaction id
  slimbus: messaging: start transaction ids from 1 instead of zero
  Revert "interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate"
2021-08-21 11:22:10 -07:00
Linus Torvalds
f4ff9e6b01 Merge tag 'usb-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fix from Greg KH:
 "Here is a single USB typec tcpm fix for a reported problem for
  5.14-rc7. It showed up in 5.13 and resolves an issue that Hans found.
  It has been in linux-next this week with no reported problems"

* tag 'usb-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: tcpm: Fix VDMs sometimes not being forwarded to alt-mode drivers
2021-08-21 11:10:06 -07:00
Linus Torvalds
a09434f181 Merge tag 'riscv-for-linus-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - fix the sifive-l2-cache device tree bindings for json-schema
   compatibility. This does not change the intended behavior of the
   binding.

 - avoid improperly freeing necessary resources during early boot.

* tag 'riscv-for-linus-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix a number of free'd resources in init_resources()
  dt-bindings: sifive-l2-cache: Fix 'select' matching
2021-08-21 11:04:26 -07:00
Linus Torvalds
5479a7fe89 Merge tag 's390-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fix from Vasily Gorbik:

 - fix use after free of zpci_dev in pci code

* tag 's390-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: fix use after free of zpci_dev
2021-08-21 10:56:06 -07:00
Linus Torvalds
15517c724c Merge tag 'locks-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull mandatory file locking deprecation warning from Jeff Layton:
 "As discussed on the list, this patch just adds a new warning for folks
  who still have mandatory locking enabled and actually mount with '-o
  mand'. I'd like to get this in for v5.14 so we can push this out into
  stable kernels and hopefully reach folks who have mounts with -o mand.

  For now, I'm operating under the assumption that we'll fully remove
  this support in v5.15, but we can move that out if any legitimate
  users of this facility speak up between now and then"

* tag 'locks-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  fs: warn about impending deprecation of mandatory locks
2021-08-21 10:50:22 -07:00
Joerg Roedel
22aa45cb46 x86/efi: Restore Firmware IDT before calling ExitBootServices()
Commit

  79419e13e8 ("x86/boot/compressed/64: Setup IDT in startup_32 boot path")

introduced an IDT into the 32-bit boot path of the decompressor stub.
But the IDT is set up before ExitBootServices() is called, and some UEFI
firmwares rely on their own IDT.

Save the firmware IDT on boot and restore it before calling into EFI
functions to fix boot failures introduced by above commit.

Fixes: 79419e13e8 ("x86/boot/compressed/64: Setup IDT in startup_32 boot path")
Reported-by: Fabio Aiuto <fabioaiuto83@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Cc: stable@vger.kernel.org # 5.13+
Link: https://lkml.kernel.org/r/20210820125703.32410-1-joro@8bytes.org
2021-08-21 17:57:04 +02:00
Linus Torvalds
002c0aef10 Merge tag 'block-5.14-2021-08-20' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Three fixes from Ming Lei that should go into 5.14:

   - Fix for a kernel panic when iterating over tags for some cases
     where a flush request is present, a regression in this cycle.

   - Request timeout fix

   - Fix flush request checking"

* tag 'block-5.14-2021-08-20' of git://git.kernel.dk/linux-block:
  blk-mq: fix is_flush_rq
  blk-mq: fix kernel panic during iterating over flush request
  blk-mq: don't grab rq's refcount in blk_mq_check_expired()
2021-08-21 08:11:22 -07:00
Linus Torvalds
1e6907d58c Merge tag 'io_uring-5.14-2021-08-20' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "A few small fixes that should go into this release:

   - Fix never re-assigning an initial error value for io_uring_enter()
     for SQPOLL, if asked to do nothing

   - Fix xa_alloc_cycle() return value checking, for cases where we have
     wrapped around

   - Fix for a ctx pin issue introduced in this cycle (Pavel)"

* tag 'io_uring-5.14-2021-08-20' of git://git.kernel.dk/linux-block:
  io_uring: fix xa_alloc_cycle() error return value check
  io_uring: pin ctx on fallback execution
  io_uring: only assign io_uring_enter() SQPOLL error in actual error case
2021-08-21 08:06:26 -07:00
Jeff Layton
fdd92b64d1 fs: warn about impending deprecation of mandatory locks
We've had CONFIG_MANDATORY_FILE_LOCKING since 2015 and a lot of distros
have disabled it. Warn the stragglers that still use "-o mand" that
we'll be dropping support for that mount option.

Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-08-21 07:32:45 -04:00
Marc Zyngier
12d125b457 stmmac: Revert "stmmac: align RX buffers"
This reverts commit a955318fe6 ("stmmac: align RX buffers"),
which breaks at least one platform (Nvidia Jetson-X1), causing
packet corruption. This is 100% reproducible, and reverting
the patch results in a working system again.

Given that it is "only" a performance optimisation, let's
return to a known working configuration until we can have a
good understanding of what is happening here.

Fixes: a955318fe6 ("stmmac: align RX buffers")
Cc: Matteo Croce <mcroce@linux.microsoft.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Link: https://lore.kernel.org/netdev/871r71azjw.wl-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210820183002.457226-1-maz@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-20 14:44:49 -07:00
Jens Axboe
a30f895ad3 io_uring: fix xa_alloc_cycle() error return value check
We currently check for ret != 0 to indicate error, but '1' is a valid
return and just indicates that the allocation succeeded with a wrap.
Correct the check to be for < 0, like it was before the xarray
conversion.

Cc: stable@vger.kernel.org
Fixes: 61cf93700f ("io_uring: Convert personality_idr to XArray")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-20 14:59:58 -06:00
Linus Torvalds
fa54d366a6 Merge tag 'acpi-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "These fix two mistakes in new code.

  Specifics:

   - Prevent confusing messages from being printed if the PRMT table is
     not present or there are no PRM modules (Aubrey Li).

   - Fix the handling of suspend-to-idle entry and exit in the case when
     the Microsoft UUID is used with the Low-Power S0 Idle _DSM
     interface (Mario Limonciello)"

* tag 'acpi-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: PM: s2idle: Invert Microsoft UUID entry and exit
  ACPI: PRM: Deal with table not present or no module found
2021-08-20 13:44:25 -07:00
Linus Torvalds
cae6876458 Merge tag 'pm-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix some issues in the ARM cpufreq drivers and in the operating
  performance points (OPP) framework.

  Specifics:

   - Fix useless WARN() in the OPP core and prevent a noisy warning
     from being printed by OPP _put functions (Dmitry Osipenko).

   - Fix error path when allocation failed in the arm_scmi cpufreq
     driver (Lukasz Luba).

   - Blacklist Qualcomm sc8180x and Qualcomm sm8150 in
     cpufreq-dt-platdev (Bjorn Andersson, Thara Gopinath).

   - Forbid cpufreq for 1.2 GHz variant in the armada-37xx cpufreq
     driver (Marek Behún)"

* tag 'pm-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  opp: Drop empty-table checks from _put functions
  cpufreq: armada-37xx: forbid cpufreq for 1.2 GHz variant
  cpufreq: blocklist Qualcomm sm8150 in cpufreq-dt-platdev
  cpufreq: arm_scmi: Fix error path when allocation failed
  opp: remove WARN when no valid OPPs remain
  cpufreq: blacklist Qualcomm sc8180x in cpufreq-dt-platdev
2021-08-20 13:38:42 -07:00
Linus Torvalds
ed3bad2e4f Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "10 patches.

  Subsystems affected by this patch series: MAINTAINERS and mm (shmem,
  pagealloc, tracing, memcg, memory-failure, vmscan, kfence, and
  hugetlb)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  hugetlb: don't pass page cache pages to restore_reserve_on_error
  kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE
  mm: vmscan: fix missing psi annotation for node_reclaim()
  mm/hwpoison: retry with shake_page() for unhandlable pages
  mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim
  MAINTAINERS: update ClangBuiltLinux IRC chat
  mmflags.h: add missing __GFP_ZEROTAGS and __GFP_SKIP_KASAN_POISON names
  mm/page_alloc: don't corrupt pcppage_migratetype
  Revert "mm: swap: check if swap backing device is congested or not"
  Revert "mm/shmem: fix shmem_swapin() race with swapoff"
2021-08-20 13:08:56 -07:00
Linus Torvalds
8ba9fbe1e4 Merge tag 'drm-fixes-2021-08-20-3' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Regularly scheduled fixes. The ttm one solves a problem of GPU drivers
  failing to load if debugfs is off in Kconfig, otherwise the i915 and
  mediatek, and amdgpu fixes all fairly normal.

  Nouveau has a couple of display fixes, but it has a fix for a
  longstanding race condition in it's memory manager code, and the fix
  mostly removes some code that wasn't working properly and has no
  userspace users. This fix makes the diffstat kinda larger but in a
  good (negative line-count) way.

  core:
   - fix drm_wait_vblank uapi copying bug

  ttm:
   - fix debugfs init when debugfs is off

  amdgpu:
   - vega10 SMU workload fix
   - DCN VM fix
   - DCN 3.01 watermark fix

  amdkfd:
   - SVM fix

  nouveau:
   - ampere display fixes
   - remove MM misfeature to fix a longstanding race condition

  i915:
   - tweaked display workaround for all PCHs
   - eDP MSO pipe sanity for ADL-P fix
   - remove unused symbol export

  mediatek:
   - AAL output size setting
   - Delete component in remove function"

* tag 'drm-fixes-2021-08-20-3' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Use DCN30 watermark calc for DCN301
  drm/i915/dp: remove superfluous EXPORT_SYMBOL()
  drm/i915/edp: fix eDP MSO pipe sanity checks for ADL-P
  drm/i915: Tweaked Wa_14010685332 for all PCHs
  drm/nouveau: rip out nvkm_client.super
  drm/nouveau: block a bunch of classes from userspace
  drm/nouveau/fifo/nv50-: rip out dma channels
  drm/nouveau/kms/nv50: workaround EFI GOP window channel format differences
  drm/nouveau/disp: power down unused DP links during init
  drm/nouveau: recognise GA107
  drm: Copy drm_wait_vblank to user before returning
  drm/amd/display: Ensure DCN save after VM setup
  drm/amdkfd: fix random KFDSVMRangeTest.SetGetAttributesTest test failure
  drm/amd/pm: change the workload type for some cards
  Revert "drm/amd/pm: fix workload mismatch on vega10"
  drm: ttm: Don't bail from ttm_global_init if debugfs_create_dir fails
  drm/mediatek: Add component_del in OVL and COLOR remove function
  drm/mediatek: Add AAL output size configuration
2021-08-20 12:59:54 -07:00
Linus Torvalds
3db903a8ea Merge tag 'pci-v5.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:

 - Add Rahul Tanwar as Intel LGM Gateway PCIe maintainer (Rahul Tanwar)

 - Add Jim Quinlan et al as Broadcom STB PCIe maintainers (Jim Quinlan)

 - Increase D3hot-to-D0 delay for AMD Renoir/Cezanne XHCI (Marcin
   Bachry)

 - Correct iomem_get_mapping() usage for legacy_mem sysfs (Krzysztof
   Wilczyński)

* tag 'pci-v5.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI/sysfs: Use correct variable for the legacy_mem sysfs object
  PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI
  MAINTAINERS: Add Jim Quinlan et al as Broadcom STB PCIe maintainers
  MAINTAINERS: Add Rahul Tanwar as Intel LGM Gateway PCIe maintainer
2021-08-20 12:51:37 -07:00
Linus Torvalds
a27c75e554 Merge tag 'mmc-v5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:

 - dw_mmc: Fix hang on data CRC error

 - mmci: Fix voltage switch procedure for the stm32 variant

 - sdhci-iproc: Fix some clock issues for BCM2711

 - sdhci-msm: Fixup software timeout value

* tag 'mmc-v5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711
  mmc: sdhci-iproc: Cap min clock frequency on BCM2711
  mmc: sdhci-msm: Update the software timeout value for sdhc
  mmc: mmci: stm32: Check when the voltage switch procedure should be done
  mmc: dw_mmc: Fix hang on data CRC error
2021-08-20 12:46:00 -07:00
Linus Torvalds
43a6473e47 Merge tag 'sound-5.14-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull more sound fixes from Takashi Iwai:
 "This is a quick follow up for 5.14: a fix for a very recently
  introduced regression on ASoC Intel Atom driver, and another trivial
  HD-audio quirk for HP laptops"

* tag 'sound-5.14-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: intel: atom: Fix breakage for PCM buffer address setup
  ALSA: hda/realtek: Limit mic boost on HP ProBook 445 G8
2021-08-20 12:31:10 -07:00
Linus Torvalds
54e9ea3cdb Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:

 - Fix cleaning of vDSO directories

 - Ensure CNTHCTL_EL2 is fully initialised when booting at EL2

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: initialize all of CNTHCTL_EL2
  arm64: clean vdso & vdso32 files
2021-08-20 12:18:49 -07:00
Rafael J. Wysocki
0f09f4c481 Merge branch 'acpi-pm'
* acpi-pm:
  ACPI: PM: s2idle: Invert Microsoft UUID entry and exit
2021-08-20 21:11:43 +02:00
Linus Torvalds
b7d184d37e Merge tag 'iommu-fixes-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:

 - Fix for a potential NULL-ptr dereference in IOMMU core code

 - Two resource leak fixes

 - Cache flush fix in the Intel VT-d driver

* tag 'iommu-fixes-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Fix incomplete cache flush in intel_pasid_tear_down_entry()
  iommu/vt-d: Fix PASID reference leak
  iommu: Check if group is NULL before remove device
  iommu/dma: Fix leak in non-contiguous API
2021-08-20 12:11:33 -07:00
Rafael J. Wysocki
f2963c7ec7 Merge branch 'pm-opp'
* pm-opp:
  opp: Drop empty-table checks from _put functions
  opp: remove WARN when no valid OPPs remain
2021-08-20 21:11:16 +02:00
Xiao Yang
cc4f596cf8 RDMA/rxe: Zero out index member of struct rxe_queue
1) New index member of struct rxe_queue was introduced but not zeroed so
   the initial value of index may be random.

2) The current index is not masked off to index_mask.

In this case producer_addr() and consumer_addr() will get an invalid
address by the random index and then accessing the invalid address
triggers the following panic:

"BUG: unable to handle page fault for address: ffff9ae2c07a1414"

Fix the issue by using kzalloc() to zero out index member.

Fixes: 5bcf5a59c4 ("RDMA/rxe: Protext kernel index from user space")
Link: https://lore.kernel.org/r/20210820111509.172500-1-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-20 15:48:58 -03:00
Mike Kravetz
c7b1850dfb hugetlb: don't pass page cache pages to restore_reserve_on_error
syzbot hit kernel BUG at fs/hugetlbfs/inode.c:532 as described in [1].
This BUG triggers if the HPageRestoreReserve flag is set on a page in
the page cache.  It should never be set, as the routine
huge_add_to_page_cache explicitly clears the flag after adding a page to
the cache.

The only code other than huge page allocation which sets the flag is
restore_reserve_on_error.  It will potentially set the flag in rare out
of memory conditions.  syzbot was injecting errors to cause memory
allocation errors which exercised this specific path.

The code in restore_reserve_on_error is doing the right thing.  However,
there are instances where pages in the page cache were being passed to
restore_reserve_on_error.  This is incorrect, as once a page goes into
the cache reservation information will not be modified for the page
until it is removed from the cache.  Error paths do not remove pages
from the cache, so even in the case of error, the page will remain in
the cache and no reservation adjustment is needed.

Modify routines that potentially call restore_reserve_on_error with a
page cache page to no longer do so.

Note on fixes tag: Prior to commit 846be08578 ("mm/hugetlb: expand
restore_reserve_on_error functionality") the routine would not process
page cache pages because the HPageRestoreReserve flag is not set on such
pages.  Therefore, this issue could not be trigggered.  The code added
by commit 846be08578 ("mm/hugetlb: expand restore_reserve_on_error
functionality") is needed and correct.  It exposed incorrect calls to
restore_reserve_on_error which is the root cause addressed by this
commit.

[1] https://lore.kernel.org/linux-mm/00000000000050776d05c9b7c7f0@google.com/

Link: https://lkml.kernel.org/r/20210818213304.37038-1-mike.kravetz@oracle.com
Fixes: 846be08578 ("mm/hugetlb: expand restore_reserve_on_error functionality")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: <syzbot+67654e51e54455f1c585@syzkaller.appspotmail.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:42 -07:00
Marco Elver
a7cb5d23ea kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE
Originally the addr != NULL check was meant to take care of the case
where __kfence_pool == NULL (KFENCE is disabled).  However, this does
not work for addresses where addr > 0 && addr < KFENCE_POOL_SIZE.

This can be the case on NULL-deref where addr > 0 && addr < PAGE_SIZE or
any other faulting access with addr < KFENCE_POOL_SIZE.  While the
kernel would likely crash, the stack traces and report might be
confusing due to double faults upon KFENCE's attempt to unprotect such
an address.

Fix it by just checking that __kfence_pool != NULL instead.

Link: https://lkml.kernel.org/r/20210818130300.2482437-1-elver@google.com
Fixes: 0ce20dd840 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>    [5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:42 -07:00
Johannes Weiner
57f29762cd mm: vmscan: fix missing psi annotation for node_reclaim()
In a debugging session the other day, Rik noticed that node_reclaim()
was missing memstall annotations.  This means we'll miss pressure and
lost productivity resulting from reclaim on an overloaded local NUMA
node when vm.zone_reclaim_mode is enabled.

There haven't been any reports, but that's likely because
vm.zone_reclaim_mode hasn't been a commonly used feature recently, and
the intersection between such setups and psi users is probably nil.

But secondary memory such as CXL-connected DIMMS, persistent memory etc,
and the page demotion patches that handle them
(https://lore.kernel.org/lkml/20210401183216.443C4443@viggo.jf.intel.com/)
could soon make this a more common codepath again.

Link: https://lkml.kernel.org/r/20210818152457.35846-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:42 -07:00
Naoya Horiguchi
fcc00621d8 mm/hwpoison: retry with shake_page() for unhandlable pages
HWPoisonHandlable() sometimes returns false for typical user pages due
to races with average memory events like transfers over LRU lists.  This
causes failures in hwpoison handling.

There's retry code for such a case but does not work because the retry
loop reaches the retry limit too quickly before the page settles down to
handlable state.  Let get_any_page() call shake_page() to fix it.

[naoya.horiguchi@nec.com: get_any_page(): return -EIO when retry limit reached]
  Link: https://lkml.kernel.org/r/20210819001958.2365157-1-naoya.horiguchi@linux.dev

Link: https://lkml.kernel.org/r/20210817053703.2267588-1-naoya.horiguchi@linux.dev
Fixes: 25182f05ff ("mm,hwpoison: fix race with hugetlb page allocation")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>		[5.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:42 -07:00
Johannes Weiner
f56ce412a5 mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim
We've noticed occasional OOM killing when memory.low settings are in
effect for cgroups.  This is unexpected and undesirable as memory.low is
supposed to express non-OOMing memory priorities between cgroups.

The reason for this is proportional memory.low reclaim.  When cgroups
are below their memory.low threshold, reclaim passes them over in the
first round, and then retries if it couldn't find pages anywhere else.
But when cgroups are slightly above their memory.low setting, page scan
force is scaled down and diminished in proportion to the overage, to the
point where it can cause reclaim to fail as well - only in that case we
currently don't retry, and instead trigger OOM.

To fix this, hook proportional reclaim into the same retry logic we have
in place for when cgroups are skipped entirely.  This way if reclaim
fails and some cgroups were scanned with diminished pressure, we'll try
another full-force cycle before giving up and OOMing.

[akpm@linux-foundation.org: coding-style fixes]

Link: https://lkml.kernel.org/r/20210817180506.220056-1-hannes@cmpxchg.org
Fixes: 9783aa9917 ("mm, memcg: proportional memory.{low,min} reclaim")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Leon Yang <lnyng@fb.com>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>		[5.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:42 -07:00
Nathan Chancellor
91ed3ed0f7 MAINTAINERS: update ClangBuiltLinux IRC chat
Everyone has moved from Freenode to Libera so updated the channel entry
for MAINTAINERS.

Link: https://github.com/ClangBuiltLinux/linux/issues/1402
Link: https://lkml.kernel.org/r/20210818022339.3863058-1-nathan@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:42 -07:00
Mike Rapoport
b16ee0f9ed mmflags.h: add missing __GFP_ZEROTAGS and __GFP_SKIP_KASAN_POISON names
printk("%pGg") outputs these two flags as hexadecimal number, rather
than as a string, e.g:

	GFP_KERNEL|0x1800000

Fix this by adding missing names of __GFP_ZEROTAGS and
__GFP_SKIP_KASAN_POISON flags to __def_gfpflag_names.

Link: https://lkml.kernel.org/r/20210816133502.590-1-rppt@kernel.org
Fixes: 013bb59dbb ("arm64: mte: handle tags zeroing at page allocation time")
Fixes: c275c5c6d5 ("kasan: disable freed user page poisoning with HW tags")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:42 -07:00
Doug Berger
47aef6010b mm/page_alloc: don't corrupt pcppage_migratetype
When placing pages on a pcp list, migratetype values over
MIGRATE_PCPTYPES get added to the MIGRATE_MOVABLE pcp list.

However, the actual migratetype is preserved in the page and should
not be changed to MIGRATE_MOVABLE or the page may end up on the wrong
free_list.

The impact is that HIGHATOMIC or CMA pages getting bulk freed from the
PCP lists could potentially end up on the wrong buddy list.  There are
various consequences but minimally NR_FREE_CMA_PAGES accounting could
get screwed up.

[mgorman@techsingularity.net: changelog update]

Link: https://lkml.kernel.org/r/20210811182917.2607994-1-opendmb@gmail.com
Fixes: df1acc8569 ("mm/page_alloc: avoid conflating IRQs disabled with zone->lock")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:42 -07:00
Yang Shi
c04b3d0690 Revert "mm: swap: check if swap backing device is congested or not"
Due to the change about how block layer detects congestion the
justification of commit 8fd2e0b505 ("mm: swap: check if swap backing
device is congested or not") doesn't stand anymore, so the commit could
be just reverted in order to solve the race reported by commit
2efa33fc7f ("mm/shmem: fix shmem_swapin() race with swapoff").  The
fix was reverted by the previous patch.

Link: https://lkml.kernel.org/r/20210810202936.2672-3-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:42 -07:00
Yang Shi
b1e1ef3454 Revert "mm/shmem: fix shmem_swapin() race with swapoff"
Due to the change about how block layer detects congestion the
justification of commit 8fd2e0b505 ("mm: swap: check if swap backing
device is congested or not") doesn't stand anymore, so the commit could
be just reverted in order to solve the race reported by commit
2efa33fc7f ("mm/shmem: fix shmem_swapin() race with swapoff"), so the
fix commit could be just reverted as well.

And that fix is also kind of buggy as discussed by [1] and [2].

[1] https://lore.kernel.org/linux-mm/24187e5e-069-9f3f-cefe-39ac70783753@google.com/
[2] https://lore.kernel.org/linux-mm/e82380b9-3ad4-4a52-be50-6d45c7f2b5da@google.com/

Link: https://lkml.kernel.org/r/20210810202936.2672-2-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20 11:31:41 -07:00
Gal Pressman
dbe986bdfd RDMA/efa: Free IRQ vectors on error flow
Make sure to free the IRQ vectors in case the allocation doesn't return
the expected number of IRQs.

Fixes: b7f5e880f3 ("RDMA/efa: Add the efa module")
Link: https://lore.kernel.org/r/20210811151131.39138-2-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-20 15:27:47 -03:00
Michel Dänzer
32bc8f8373 drm/amdgpu: Cancel delayed work when GFXOFF is disabled
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20 13:35:42 -04:00
Christian König
2a7b9a8437 drm/amdgpu: use the preferred pin domain after the check
For some reason we run into an use case where a BO is already pinned
into GTT, but should be pinned into VRAM|GTT again.

Handle that case gracefully as well.

Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-20 13:35:35 -04:00
Petr Pavlu
aa3e1ba32e riscv: Fix a number of free'd resources in init_resources()
Function init_resources() allocates a boot memory block to hold an array of
resources which it adds to iomem_resource. The array is filled in from its
end and the function then attempts to free any unused memory at the
beginning. The problem is that size of the unused memory is incorrectly
calculated and this can result in releasing memory which is in use by
active resources. Their data then gets corrupted later when the memory is
reused by a different part of the system.

Fix the size of the released memory to correctly match the number of unused
resource entries.

Fixes: ffe0e52612 ("RISC-V: Improve init_resources()")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Nick Kossifidis <mick@ics.forth.gr>
Tested-by: Sunil V L <sunilvl@ventanamicro.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-20 10:15:51 -07:00
Sasha Neftin
4051f68318 e1000e: Do not take care about recovery NVM checksum
On new platforms, the NVM is read-only. Attempting to update the NVM
is causing a lockup to occur. Do not attempt to write to the NVM
on platforms where it's not supported.
Emit an error message when the NVM checksum is invalid.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213667
Fixes: fb776f5d57 ("e1000e: Add support for Tiger Lake")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Suggested-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-20 08:38:01 -07:00
Sasha Neftin
44a13a5d99 e1000e: Fix the max snoop/no-snoop latency for 10M
We should decode the latency and the max_latency before directly compare.
The latency should be presented as lat_enc = scale x value:
lat_enc_d = (lat_enc & 0x0x3ff) x (1U << (5*((max_ltr_enc & 0x1c00)
>> 10)))

Fixes: cf8fb73c23 ("e1000e: add support for LTR on I217/I218")
Suggested-by: Yee Li <seven.yi.lee@gmail.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-20 08:38:01 -07:00
Toshiki Nishioka
691bd4d776 igc: Use num_tx_queues when iterating over tx_ring queue
Use num_tx_queues rather than the IGC_MAX_TX_QUEUES fixed number 4 when
iterating over tx_ring queue since instantiated queue count could be
less than 4 where on-line cpu count is less than 4.

Fixes: ec50a9d437 ("igc: Add support for taprio offloading")
Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com>
Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Tested-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-20 08:37:49 -07:00
Aaron Ma
4b79959510 igc: fix page fault when thunderbolt is unplugged
After unplug thunderbolt dock with i225, pciehp interrupt is triggered,
remove call will read/write mmio address which is already disconnected,
then cause page fault and make system hang.

Check PCI state to remove device safely.

Trace:
BUG: unable to handle page fault for address: 000000000000b604
Oops: 0000 [#1] SMP NOPTI
RIP: 0010:igc_rd32+0x1c/0x90 [igc]
Call Trace:
igc_ptp_suspend+0x6c/0xa0 [igc]
igc_ptp_stop+0x12/0x50 [igc]
igc_remove+0x7f/0x1c0 [igc]
pci_device_remove+0x3e/0xb0
__device_release_driver+0x181/0x240

Fixes: 13b5b7fd6a ("igc: Add support for Tx/Rx rings")
Fixes: b03c49cde6 ("igc: Save PTP time before a reset")
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-20 07:36:22 -07:00
Petko Manolov
ffc9c3ebb4 net: usb: pegasus: fixes of set_register(s) return value evaluation;
- restore the behavior in enable_net_traffic() to avoid regressions - Jakub
    Kicinski;
  - hurried up and removed redundant assignment in pegasus_open() before yet
    another checker complains;

Fixes: 8a160e2e9a ("net: usb: pegasus: Check the return value of get_geristers() and friends;")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Petko Manolov <petko.manolov@konsulko.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-20 14:57:05 +01:00
Xiaolong Huang
7e78c597c3 net: qrtr: fix another OOB Read in qrtr_endpoint_post
This check was incomplete, did not consider size is 0:

	if (len != ALIGN(size, 4) + hdrlen)
                    goto err;

if size from qrtr_hdr is 0, the result of ALIGN(size, 4)
will be 0, In case of len == hdrlen and size == 0
in header this check won't fail and

	if (cb->type == QRTR_TYPE_NEW_SERVER) {
                /* Remote node endpoint can bridge other distant nodes */
                const struct qrtr_ctrl_pkt *pkt = data + hdrlen;

                qrtr_node_assign(node, le32_to_cpu(pkt->server.node));
        }

will also read out of bound from data, which is hdrlen allocated block.

Fixes: 194ccc8829 ("net: qrtr: Support decoding incoming v2 packets")
Fixes: ad9d24c942 ("net: qrtr: fix OOB Read in qrtr_endpoint_post")
Signed-off-by: Xiaolong Huang <butterflyhuangxx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-20 14:41:03 +01:00
Nicholas Piggin
787c70f2f9 powerpc/64s: Fix scv implicit soft-mask table for relocated kernels
The implict soft-mask table addresses get relocated if they use a
relative symbol like a label. This is right for code that runs relocated
but not for unrelocated. The scv interrupt vectors run unrelocated, so
absolute addresses are required for their soft-mask table entry.

This fixes crashing with relocated kernels, usually an asynchronous
interrupt hitting in the scv handler, then hitting the trap that checks
whether r1 is in userspace.

Fixes: 325678fd05 ("powerpc/64s: add a table of implicit soft-masked addresses")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210820103431.1701240-1-npiggin@gmail.com
2021-08-20 22:35:18 +10:00
Peter Zijlstra
3c474b3239 sched: Fix Core-wide rq->lock for uninitialized CPUs
Eugene tripped over the case where rq_lock(), as called in a
for_each_possible_cpu() loop came apart because rq->core hadn't been
setup yet.

This is a somewhat unusual, but valid case.

Rework things such that rq->core is initialized to point at itself. IOW
initialize each CPU as a single threaded Core. CPU online will then join
the new CPU (thread) to an existing Core where needed.

For completeness sake, have CPU offline fully undo the state so as to
not presume the topology will match the next time it comes online.

Fixes: 9edeaea1bc ("sched: Core-wide rq->lock")
Reported-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Don <joshdon@google.com>
Tested-by: Eugene Syromiatnikov <esyr@redhat.com>
Link: https://lkml.kernel.org/r/YR473ZGeKqMs6kw+@hirez.programming.kicks-ass.net
2021-08-20 12:32:53 +02:00
Dave Airlie
daa7772d47 Merge tag 'amd-drm-fixes-5.14-2021-08-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.14-2021-08-18:

amdgpu:
- vega10 SMU workload fix
- DCN VM fix
- DCN 3.01 watermark fix

amdkfd:
- SVM fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210818225137.4070-1-alexander.deucher@amd.com
2021-08-20 15:13:56 +10:00
Rob Herring
1c8094e394 dt-bindings: sifive-l2-cache: Fix 'select' matching
When the schema fixups are applied to 'select' the result is a single
entry is required for a match, but that will never match as there should
be 2 entries. Also, a 'select' schema should have the widest possible
match, so use 'contains' which matches the compatible string(s) in any
position and not just the first position.

Fixes: 993dcfac64 ("dt-bindings: riscv: sifive-l2-cache: convert bindings to json-schema")
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-19 20:55:49 -07:00
Lukas Bulwahn
310d2e83cb powerpc: Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK
Commit 66f24fa766 ("mm: drop redundant ARCH_ENABLE_SPLIT_PMD_PTLOCK")
broke PMD split page table lock for powerpc.

It selects the non-existent config ARCH_ENABLE_PMD_SPLIT_PTLOCK in
arch/powerpc/platforms/Kconfig.cputype, but clearly intended to
select ARCH_ENABLE_SPLIT_PMD_PTLOCK (notice the word swapping!), as
that commit did for all other architectures.

Fix it by selecting the correct symbol ARCH_ENABLE_SPLIT_PMD_PTLOCK.

Fixes: 66f24fa766 ("mm: drop redundant ARCH_ENABLE_SPLIT_PMD_PTLOCK")
Cc: stable@vger.kernel.org # v5.13+
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
[mpe: Reword change log to make it clear this is a bug fix]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210819113954.17515-3-lukas.bulwahn@gmail.com
2021-08-20 12:38:50 +10:00
Jacob Keller
a8f89fa277 ice: do not abort devlink info if board identifier can't be found
The devlink dev info command reports version information about the
device and firmware running on the board. This includes the "board.id"
field which is supposed to represent an identifier of the board design.
The ice driver uses the Product Board Assembly identifier for this.

In some cases, the PBA is not present in the NVM. If this happens,
devlink dev info will fail with an error. Instead, modify the
ice_info_pba function to just exit without filling in the context
buffer. This will cause the board.id field to be skipped. Log a dev_dbg
message in case someone wants to confirm why board.id is not showing up
for them.

Fixes: e961b679fb ("ice: add board identifier info to devlink .info_get")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20210819223451.245613-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19 18:20:01 -07:00
Dave Airlie
f5b27f7f8d Merge tag 'mediatek-drm-fixes-5.14-2' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes
Mediatek DRM Fixes for Linux 5.14-2

1. Fix AAL output size setting.
2. Delete component in remove function.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210819001635.14803-1-chunkuang.hu@kernel.org
2021-08-20 10:15:04 +10:00
Dave Airlie
5ce5cef019 Merge tag 'drm-intel-fixes-2021-08-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Expand a tweaked display workaround for all PCHs. (Anshuman)
- Fix eDP MSO pipe sanity checks for ADL-P. (Jani)
- Remove superfluous EXPORT_SYMBOL(). (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YR137zkSAIbun1Ed@intel.com
2021-08-20 09:43:31 +10:00
Dave Airlie
b88aefc51c Merge branch 'linux-5.14' of git://github.com/skeggsb/linux into drm-fixes
- Ampere display fixes
- Fix longstanding MM race issue by removing unused code.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CACAvsv5jtUFkHsGe-pf-=RceDOgKygjPnCi=6d5vCLM_f5aeMQ@mail.gmail.com
2021-08-20 09:18:14 +10:00
Bob Pearson
65a81b61d8 RDMA/rxe: Fix memory allocation while in a spin lock
rxe_mcast_add_grp_elem() in rxe_mcast.c calls rxe_alloc() while holding
spinlocks which in turn calls kzalloc(size, GFP_KERNEL) which is
incorrect.  This patch replaces rxe_alloc() by rxe_alloc_locked() which
uses GFP_ATOMIC.  This bug was caused by the below mentioned commit and
failing to handle the need for the atomic allocate.

Fixes: 4276fd0ddd ("RDMA/rxe: Remove RXE_POOL_ATOMIC")
Link: https://lore.kernel.org/r/20210813210625.4484-1-rpearsonhpe@gmail.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19 20:11:16 -03:00
Linus Torvalds
d992fe5318 Merge tag 'soc-fixes-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "Not much to see here. Half the fixes this time are for Qualcomm dts
  files, fixing small mistakes on certain machines. The other fixes are:

   - A 5.13 regression fix for freescale QE interrupt controller\

   - A fix for TI OMAP gpt12 timer error handling

   - A randconfig build regression fix for ixp4xx

   - Another defconfig fix following the CONFIG_FB dependency rework"

* tag 'soc-fixes-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  soc: fsl: qe: fix static checker warning
  ARM: ixp4xx: fix building both pci drivers
  ARM: configs: Update the nhk8815_defconfig
  bus: ti-sysc: Fix error handling for sysc_check_active_timer()
  soc: fsl: qe: convert QE interrupt controller to platform_device
  arm64: dts: qcom: sdm845-oneplus: fix reserved-mem
  arm64: dts: qcom: msm8994-angler: Disable cont_splash_mem
  arm64: dts: qcom: sc7280: Fixup cpufreq domain info for cpu7
  arm64: dts: qcom: msm8992-bullhead: Fix cont_splash_mem mapping
  arm64: dts: qcom: msm8992-bullhead: Remove PSCI
  arm64: dts: qcom: c630: fix correct powerdown pin for WSA881x
2021-08-19 15:32:58 -07:00
Dave Airlie
e213bd1e72 Merge tag 'drm-misc-fixes-2021-08-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:

 * UAPI: Return results for failed drm_wait_vblank_ioctl()
 * ttm: Fix debugfs initialization

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YR1c7cG1IaL+g8EN@linux-uq9g.fritz.box
2021-08-20 06:01:34 +10:00
Linus Torvalds
f87d64319e Merge tag 'net-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from bpf, wireless and mac80211
  trees.

  Current release - regressions:

   - tipc: call tipc_wait_for_connect only when dlen is not 0

   - mac80211: fix locking in ieee80211_restart_work()

  Current release - new code bugs:

   - bpf: add rcu_read_lock in bpf_get_current_[ancestor_]cgroup_id()

   - ethernet: ice: fix perout start time rounding

   - wwan: iosm: prevent underflow in ipc_chnl_cfg_get()

  Previous releases - regressions:

   - bpf: clear zext_dst of dead insns

   - sch_cake: fix srchost/dsthost hashing mode

   - vrf: reset skb conntrack connection on VRF rcv

   - net/rds: dma_map_sg is entitled to merge entries

  Previous releases - always broken:

   - ethernet: bnxt: fix Tx path locking and races, add Rx path
     barriers"

* tag 'net-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (42 commits)
  net: dpaa2-switch: disable the control interface on error path
  Revert "flow_offload: action should not be NULL when it is referenced"
  iavf: Fix ping is lost after untrusted VF had tried to change MAC
  i40e: Fix ATR queue selection
  r8152: fix the maximum number of PLA bp for RTL8153C
  r8152: fix writing USB_BP2_EN
  mptcp: full fully established support after ADD_ADDR
  mptcp: fix memory leak on address flush
  net/rds: dma_map_sg is entitled to merge entries
  net: mscc: ocelot: allow forwarding from bridge ports to the tag_8021q CPU port
  net: asix: fix uninit value bugs
  ovs: clear skb->tstamp in forwarding path
  net: mdio-mux: Handle -EPROBE_DEFER correctly
  net: mdio-mux: Don't ignore memory allocation errors
  net: mdio-mux: Delete unnecessary devm_kfree
  net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, or worse
  sch_cake: fix srchost/dsthost hashing mode
  ixgbe, xsk: clean up the resources in ixgbe_xsk_pool_enable error path
  net: qlcnic: add missed unlock in qlcnic_83xx_flash_read32
  mac80211: fix locking in ieee80211_restart_work()
  ...
2021-08-19 12:33:43 -07:00
Linus Torvalds
e649e4c806 Merge tag 'platform-drivers-x86-v5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:

 - Enable SW_TABLET_MODE support for the TP200s

 - Enable WMI on two more Gigabyte motherboards

* tag 'platform-drivers-x86-v5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: gigabyte-wmi: add support for B450M S2H V2
  platform/x86: gigabyte-wmi: add support for X570 GAMING X
  platform/x86: asus-nb-wmi: Add tablet_mode_sw=lid-flip quirk for the TP200s
  platform/x86: asus-nb-wmi: Allow configuring SW_TABLET_MODE method with a module option
2021-08-19 12:19:58 -07:00
Dinghao Liu
a036ad0883 RDMA/bnxt_re: Remove unpaired rtnl unlock in bnxt_re_dev_init()
The fixed commit removes all rtnl_lock() and rtnl_unlock() calls in
function bnxt_re_dev_init(), but forgets to remove a rtnl_unlock() in the
error handling path of bnxt_re_register_netdev(), which may cause a
deadlock. This bug is suggested by a static analysis tool.

Fixes: c2b777a959 ("RDMA/bnxt_re: Refactor device add/remove functionalities")
Link: https://lore.kernel.org/r/20210816085531.12167-1-dinghao.liu@zju.edu.cn
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19 14:48:09 -03:00
Vladimir Oltean
cd0a719fbd net: dpaa2-switch: disable the control interface on error path
Currently dpaa2_switch_takedown has a funny name and does not do the
opposite of dpaa2_switch_init, which makes probing fail when we need to
handle an -EPROBE_DEFER.

A sketch of what dpaa2_switch_init does:

	dpsw_open

	dpaa2_switch_detect_features

	dpsw_reset

	for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
		dpsw_if_disable

		dpsw_if_set_stp

		dpsw_vlan_remove_if_untagged

		dpsw_if_set_tci

		dpsw_vlan_remove_if
	}

	dpsw_vlan_remove

	alloc_ordered_workqueue

	dpsw_fdb_remove

	dpaa2_switch_ctrl_if_setup

When dpaa2_switch_takedown is called from the error path of
dpaa2_switch_probe(), the control interface, enabled by
dpaa2_switch_ctrl_if_setup from dpaa2_switch_init, remains enabled,
because dpaa2_switch_takedown does not call
dpaa2_switch_ctrl_if_teardown.

Since dpaa2_switch_probe might fail due to EPROBE_DEFER of a PHY, this
means that a second probe of the driver will happen with the control
interface directly enabled.

This will trigger a second error:

[   93.273528] fsl_dpaa2_switch dpsw.0: dpsw_ctrl_if_set_pools() failed
[   93.281966] fsl_dpaa2_switch dpsw.0: fsl_mc_driver_probe failed: -13
[   93.288323] fsl_dpaa2_switch: probe of dpsw.0 failed with error -13

Which if we investigate the /dev/dpaa2_mc_console log, we find out is
caused by:

[E, ctrl_if_set_pools:2211, DPMNG]  ctrl_if must be disabled

So make dpaa2_switch_takedown do the opposite of dpaa2_switch_init (in
reasonable limits, no reason to change STP state, re-add VLANs etc), and
rename it to something more conventional, like dpaa2_switch_teardown.

Fixes: 613c0a5810 ("staging: dpaa2-switch: enable the control interface")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20210819141755.1931423-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19 10:00:59 -07:00
Ido Schimmel
fa05bdb89b Revert "flow_offload: action should not be NULL when it is referenced"
This reverts commit 9ea3e52c5b.

Cited commit added a check to make sure 'action' is not NULL, but
'action' is already dereferenced before the check, when calling
flow_offload_has_one_action().

Therefore, the check does not make any sense and results in a smatch
warning:

include/net/flow_offload.h:322 flow_action_mixed_hw_stats_check() warn:
variable dereferenced before check 'action' (see line 319)

Fix by reverting this commit.

Cc: gushengxian <gushengxian@yulong.com>
Fixes: 9ea3e52c5b ("flow_offload: action should not be NULL when it is referenced")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20210819105842.1315705-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19 10:00:38 -07:00
Jakub Kicinski
d584566c4b Merge branch 'intel-wired-lan-driver-updates-2021-08-18'
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-08-18

This series contains updates to i40e and iavf drivers.

Arkadiusz fixes Flow Director not using the correct queue due to calling
the wrong pick Tx function for i40e.

Sylwester resolves traffic loss for iavf when it attempts to change its
MAC address when it does not have permissions to do so.
====================

Link: https://lore.kernel.org/r/20210818174217.4138922-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19 09:56:42 -07:00
Sylwester Dziedziuch
8da80c9d50 iavf: Fix ping is lost after untrusted VF had tried to change MAC
Make changes to MAC address dependent on the response of PF.
Disallow changes to HW MAC address and MAC filter from untrusted
VF, thanks to that ping is not lost if VF tries to change MAC.
Add a new field in iavf_mac_filter, to indicate whether there
was response from PF for given filter. Based on this field pass
or discard the filter.
If untrusted VF tried to change it's address, it's not changed.
Still filter was changed, because of that ping couldn't go through.

Fixes: c5c922b3e0 ("iavf: fix MAC address setting for VFs when filter is rejected")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan G <Gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19 09:56:15 -07:00
Arkadiusz Kubalewski
a222be597e i40e: Fix ATR queue selection
Without this patch, ATR does not work. Receive/transmit uses queue
selection based on SW DCB hashing method.

If traffic classes are not configured for PF, then use
netdev_pick_tx function for selecting queue for packet transmission.
Instead of calling i40e_swdcb_skb_tx_hash, call netdev_pick_tx,
which ensures that packet is transmitted/received from CPU that is
running the application.

Reproduction steps:
1. Load i40e driver
2. Map each MSI interrupt of i40e port for each CPU
3. Disable ntuple, enable ATR i.e.:
ethtool -K $interface ntuple off
ethtool --set-priv-flags $interface flow-director-atr
4. Run application that is generating traffic and is bound to a
single CPU, i.e.:
taskset -c 9 netperf -H 1.1.1.1 -t TCP_RR -l 10
5. Observe behavior:
Application's traffic should be restricted to the CPU provided in
taskset.

Fixes: 89ec1f0886 ("i40e: Fix queue-to-TC mapping on Tx")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19 09:55:23 -07:00
Jakub Kicinski
316749009f Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-08-19

We've added 3 non-merge commits during the last 3 day(s) which contain
a total of 3 files changed, 29 insertions(+), 6 deletions(-).

The main changes are:

1) Fix to clear zext_dst for dead instructions which was causing invalid program
   rejections on JITs with bpf_jit_needs_zext such as s390x, from Ilya Leoshkevich.

2) Fix RCU splat in bpf_get_current_{ancestor_,}cgroup_id() helpers when they are
   invoked from sleepable programs, from Yonghong Song.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests, bpf: Test that dead ldx_w insns are accepted
  bpf: Clear zext_dst of dead insns
  bpf: Add rcu_read_lock in bpf_get_current_[ancestor_]cgroup_id() helpers
====================

Link: https://lore.kernel.org/r/20210819144904.20069-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-19 08:58:17 -07:00
Takashi Iwai
65ca89c2b1 ASoC: intel: atom: Fix breakage for PCM buffer address setup
The commit 2e6b836312 ("ASoC: intel: atom: Fix reference to PCM
buffer address") changed the reference of PCM buffer address to
substream->runtime->dma_addr as the buffer address may change
dynamically.  However, I forgot that the dma_addr field is still not
set up for the CONTINUOUS buffer type (that this driver uses) yet in
5.14 and earlier kernels, and it resulted in garbage I/O.  The problem
will be fixed in 5.15, but we need to address it quickly for now.

The fix is to deduce the address again from the DMA pointer with
virt_to_phys(), but from the right one, substream->runtime->dma_area.

Fixes: 2e6b836312 ("ASoC: intel: atom: Fix reference to PCM buffer address")
Reported-and-tested-by: Hans de Goede <hdegoede@redhat.com>
Cc: <stable@vger.kernel.org>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/2048c6aa-2187-46bd-6772-36a4fb3c5aeb@redhat.com
Link: https://lore.kernel.org/r/20210819152945.8510-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-19 17:57:51 +02:00
Kai-Heng Feng
8903376dc6 ALSA: hda/realtek: Limit mic boost on HP ProBook 445 G8
The mic has lots of noises if mic boost is enabled. So disable mic boost
to get crystal clear audio capture.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210818144119.121738-1-kai.heng.feng@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-19 17:52:25 +02:00
Arnd Bergmann
1e16a40211 Merge tag 'omap-for-v5.14/gpt12-fix-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes
Fix for omap gpt12 timer error handling

Two of the recent fixes for ti-sysc driver had bad interaction for a
function return value that caused one of the fixes to not work so we
need to change the return value handling. Otherwise early beagleboard
variants still have a boot issue.

* tag 'omap-for-v5.14/gpt12-fix-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  bus: ti-sysc: Fix error handling for sysc_check_active_timer()

Link: https://lore.kernel.org/r/pull-1629354796-830948@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-19 17:22:47 +02:00
Krzysztof Wilczyński
045a9277b5 PCI/sysfs: Use correct variable for the legacy_mem sysfs object
Two legacy PCI sysfs objects "legacy_io" and "legacy_mem" were updated
to use an unified address space in the commit 636b21b501 ("PCI: Revoke
mappings like devmem").  This allows for revocations to be managed from
a single place when drivers want to take over and mmap() a /dev/mem
range.

Following the update, both of the sysfs objects should leverage the
iomem_get_mapping() function to get an appropriate address range, but
only the "legacy_io" has been correctly updated - the second attribute
seems to be using a wrong variable to pass the iomem_get_mapping()
function to.

Thus, correct the variable name used so that the "legacy_mem" sysfs
object would also correctly call the iomem_get_mapping() function.

Fixes: 636b21b501 ("PCI: Revoke mappings like devmem")
Link: https://lore.kernel.org/r/20210812132144.791268-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-08-19 10:21:53 -05:00
Marcin Bachry
e0bff43220 PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI
The Renoir XHCI controller apparently doesn't resume reliably with the
standard D3hot-to-D0 delay.  Increase it to 20ms.

[Alex: I talked to the AMD USB hardware team and the AMD Windows team and
they are not aware of any HW errata or specific issues.  The HW works fine
in Windows.  I was told Windows uses a rather generous default delay of
100ms for PCI state transitions.]

Link: https://lore.kernel.org/r/20210722025858.220064-1-alexander.deucher@amd.com
Signed-off-by: Marcin Bachry <hegel666@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Prike Liang <prike.liang@amd.com>
Cc: Shyam Sundar S K <shyam-sundar.s-k@amd.com>
2021-08-19 10:21:53 -05:00
Jim Quinlan
e647eff574 MAINTAINERS: Add Jim Quinlan et al as Broadcom STB PCIe maintainers
Add Jim Quinlan, Nicolas Saenz Julienne, and Florian Fainelli as
maintainers of the Broadcom STB PCIe controller driver.

This driver is also included in these entries:

  BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE
  BROADCOM BCM7XXX ARM ARCHITECTURE

which cover the Raspberry Pi specifics of the PCIe driver.

Link: https://lore.kernel.org/r/20210818225031.8502-1-jim2101024@gmail.com
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
2021-08-19 10:21:53 -05:00
Tuo Li
cbe71c6199 IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs()
kmalloc_array() is called to allocate memory for tx->descp. If it fails,
the function __sdma_txclean() is called:
  __sdma_txclean(dd, tx);

However, in the function __sdma_txclean(), tx-descp is dereferenced if
tx->num_desc is not zero:
  sdma_unmap_desc(dd, &tx->descp[0]);

To fix this possible null-pointer dereference, assign the return value of
kmalloc_array() to a local variable descp, and then assign it to tx->descp
if it is not NULL. Otherwise, go to enomem.

Fixes: 7724105686 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210806133029.194964-1-islituo@gmail.com
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19 11:33:35 -03:00
Lukas Bulwahn
0032640204 RDMA/irdma: Use correct kconfig symbol for AUXILIARY_BUS
In Kconfig, references to config symbols do not use the prefix "CONFIG_".

Commit fa0cf568fd ("RDMA/irdma: Add irdma Kconfig/Makefile and remove
i40iw") selects config CONFIG_AUXILIARY_BUS in config INFINIBAND_IRDMA,
but intended to select config AUXILIARY_BUS.

Fixes: fa0cf568fd ("RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw")
Link: https://lore.kernel.org/r/20210817084158.10095-1-lukas.bulwahn@gmail.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19 10:28:49 -03:00
Naresh Kumar PBS
17f2569dce RDMA/bnxt_re: Add missing spin lock initialization
Add the missing initialization of srq lock.

Fixes: 37cb11acf1 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Link: https://lore.kernel.org/r/1629343553-5843-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19 10:19:10 -03:00
Gal Pressman
f6018cc460 RDMA/uverbs: Track dmabuf memory regions
The dmabuf memory registrations are missing the restrack handling and
hence do not appear in rdma tool.

Fixes: bfe0cc6eb2 ("RDMA/uverbs: Add uverbs command for dma-buf based MR registration")
Link: https://lore.kernel.org/r/20210812135607.6228-1-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19 09:59:53 -03:00
Maor Gottlieb
da78fe5fb3 RDMA/mlx5: Fix crash when unbind multiport slave
Fix the below crash when deleting a slave from the unaffiliated list
twice. First time when the slave is bound to the master and the second
when the slave is unloaded.

Fix it by checking if slave is unaffiliated (doesn't have ib device)
before removing from the list.

  RIP: 0010:mlx5r_mp_remove+0x4e/0xa0 [mlx5_ib]
  Call Trace:
   auxiliary_bus_remove+0x18/0x30
   __device_release_driver+0x177/x220
   device_release_driver+0x24/0x30
   bus_remove_device+0xd8/0x140
   device_del+0x18a/0x3e0
   mlx5_rescan_drivers_locked+0xa9/0x210 [mlx5_core]
   mlx5_unregister_device+0x34/0x60 [mlx5_core]
   mlx5_uninit_one+0x32/0x100 [mlx5_core]
   remove_one+0x6e/0xe0 [mlx5_core]
   pci_device_remove+0x36/0xa0
   __device_release_driver+0x177/0x220
   device_driver_detach+0x3c/0xa0
   unbind_store+0x113/0x130
   kernfs_fop_write_iter+0x110/0x1a0
   new_sync_write+0x116/0x1a0
   vfs_write+0x1ba/0x260
   ksys_write+0x5f/0xe0
   do_syscall_64+0x3d/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 93f8244431 ("RDMA/mlx5: Convert mlx5_ib to use auxiliary bus")
Link: https://lore.kernel.org/r/17ec98989b0ba88f7adfbad68eb20bce8d567b44.1628587493.git.leonro@nvidia.com
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19 09:59:20 -03:00
David S. Miller
c15128c97b Merge branch 'r8152-bp-settings'
Hayes Wang says:

====================
r8152: fix bp settings

Fix the wrong bp settings of the firmware.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:19:30 +01:00
Hayes Wang
6633fb83f1 r8152: fix the maximum number of PLA bp for RTL8153C
The maximum PLA bp number of RTL8153C is 16, not 8. That is, the
bp 0 ~ 15 are at 0xfc28 ~ 0xfc46, and the bp_en is at 0xfc48.

Fixes: 195aae321c ("r8152: support new chips")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:19:30 +01:00
Hayes Wang
a876a33d2a r8152: fix writing USB_BP2_EN
The register of USB_BP2_EN is 16 bits, so we should use
ocp_write_word(), not ocp_write_byte().

Fixes: 9370f2d05a ("support request_firmware for RTL8153")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:19:30 +01:00
David S. Miller
d98c821067 Merge branch 'mptcp-fixes'
Mat Martineau says:

====================
mptcp: Bug fixes

Here are two bug fixes for the net tree:

Patch 1 fixes a memory leak that could be encountered when clearing the
list of advertised MPTCP addresses.

Patch 2 fixes a protocol issue early in an MPTCP connection, to ensure
both peers correctly understand that the full MPTCP connection handshake
has completed even when the server side quickly sends an ADD_ADDR
option.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:17:05 +01:00
Matthieu Baerts
67b12f792d mptcp: full fully established support after ADD_ADDR
If directly after an MP_CAPABLE 3WHS, the client receives an ADD_ADDR
with HMAC from the server, it is enough to switch to a "fully
established" mode because it has received more MPTCP options.

It was then OK to enable the "fully_established" flag on the MPTCP
socket. Still, best to check if the ADD_ADDR looks valid by looking if
it contains an HMAC (no 'echo' bit). If an ADD_ADDR echo is received
while we are not in "fully established" mode, it is strange and then
we should not switch to this mode now.

But that is not enough. On one hand, the path-manager has be notified
the state has changed. On the other hand, the "fully_established" flag
on the subflow socket should be turned on as well not to re-send the
MP_CAPABLE 3rd ACK content with the next ACK.

Fixes: 84dfe3677a ("mptcp: send out dedicated ADD_ADDR packet")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:16:54 +01:00
Paolo Abeni
a0eea5f10e mptcp: fix memory leak on address flush
The endpoint cleanup path is prone to a memory leak, as reported
by syzkaller:

 BUG: memory leak
 unreferenced object 0xffff88810680ea00 (size 64):
   comm "syz-executor.6", pid 6191, jiffies 4295756280 (age 24.138s)
   hex dump (first 32 bytes):
     58 75 7d 3c 80 88 ff ff 22 01 00 00 00 00 ad de  Xu}<....".......
     01 00 02 00 00 00 00 00 ac 1e 00 07 00 00 00 00  ................
   backtrace:
     [<0000000072a9f72a>] kmalloc include/linux/slab.h:591 [inline]
     [<0000000072a9f72a>] mptcp_nl_cmd_add_addr+0x287/0x9f0 net/mptcp/pm_netlink.c:1170
     [<00000000f6e931bf>] genl_family_rcv_msg_doit.isra.0+0x225/0x340 net/netlink/genetlink.c:731
     [<00000000f1504a2c>] genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
     [<00000000f1504a2c>] genl_rcv_msg+0x341/0x5b0 net/netlink/genetlink.c:792
     [<0000000097e76f6a>] netlink_rcv_skb+0x148/0x430 net/netlink/af_netlink.c:2504
     [<00000000ceefa2b8>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
     [<000000008ff91aec>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
     [<000000008ff91aec>] netlink_unicast+0x537/0x750 net/netlink/af_netlink.c:1340
     [<0000000041682c35>] netlink_sendmsg+0x846/0xd80 net/netlink/af_netlink.c:1929
     [<00000000df3aa8e7>] sock_sendmsg_nosec net/socket.c:704 [inline]
     [<00000000df3aa8e7>] sock_sendmsg+0x14e/0x190 net/socket.c:724
     [<000000002154c54c>] ____sys_sendmsg+0x709/0x870 net/socket.c:2403
     [<000000001aab01d7>] ___sys_sendmsg+0xff/0x170 net/socket.c:2457
     [<00000000fa3b1446>] __sys_sendmsg+0xe5/0x1b0 net/socket.c:2486
     [<00000000db2ee9c7>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     [<00000000db2ee9c7>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
     [<000000005873517d>] entry_SYSCALL_64_after_hwframe+0x44/0xae

We should not require an allocation to cleanup stuff.

Rework the code a bit so that the additional RCU work is no more needed.

Fixes: 1729cf186d ("mptcp: create the listening socket for new port")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:16:54 +01:00
Mark Rutland
bde8fff82e arm64: initialize all of CNTHCTL_EL2
In __init_el2_timers we initialize CNTHCTL_EL2.{EL1PCEN,EL1PCTEN} with a
RMW sequence, leaving all other bits UNKNOWN.

In general, we should initialize all bits in a register rather than
using an RMW sequence, since most bits are UNKNOWN out of reset, and as
new bits are added to the reigster their reset value might not result in
expected behaviour.

In the case of CNTHCTL_EL2, FEAT_ECV added a number of new control bits
in previously RES0 bits, which reset to UNKNOWN values, and may cause
issues for EL1 and EL0:

* CNTHCTL_EL2.ECV enables the CNTPOFF_EL2 offset (which itself resets to
  an UNKNOWN value) at EL0 and EL1. Since the offset could reset to
  distinct values across CPUs, when the control bit resets to 1 this
  could break timekeeping generally.

* CNTHCTL_EL2.{EL1TVT,EL1TVCT} trap EL0 and EL1 accesses to the EL1
  virtual timer/counter registers to EL2. When reset to 1, this could
  cause unexpected traps to EL2.

Initializing these bits to zero avoids these problems, and all other
bits in CNTHCTL_EL2 other than EL1PCEN and EL1PCTEN can safely be reset
to zero.

This patch ensures we initialize CNTHCTL_EL2 accordingly, only setting
EL1PCEN and EL1PCTEN, and setting all other bits to zero.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oupton@google.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Oliver Upton <oupton@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210818161535.52786-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-08-19 10:02:10 +01:00
Michael Ellerman
9f7853d760 powerpc/mm: Fix set_memory_*() against concurrent accesses
Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:

  kernel tried to execute exec-protected page (c008000004073278) - exploit attempt? (uid: 0)
  BUG: Unable to handle kernel instruction fetch
  Faulting instruction address: 0xc008000004073278
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
  CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
  Workqueue: events control_work_handler [virtio_console]
  NIP:  c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
  REGS: c00000002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
  MSR:  800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002822 XER: 200400cf
  ...
  NIP fill_queue+0xf0/0x210 [virtio_console]
  LR  fill_queue+0xf0/0x210 [virtio_console]
  Call Trace:
    fill_queue+0xb4/0x210 [virtio_console] (unreliable)
    add_port+0x1a8/0x470 [virtio_console]
    control_work_handler+0xbc/0x1e8 [virtio_console]
    process_one_work+0x290/0x590
    worker_thread+0x88/0x620
    kthread+0x194/0x1a0
    ret_from_kernel_thread+0x5c/0x64

Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.

Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window in time where
there is no valid PTE for the page, if another CPU tries to access the
page at that time we see something like the fault above.

We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].

But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.

[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/

Fixes: 1f9ad21c3b ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Murilo Opsfelder Araújo <muriloo@linux.ibm.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210818120518.3603172-1-mpe@ellerman.id.au
2021-08-19 09:41:54 +10:00
Christophe Leroy
ef486bf448 powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP
Commit b5efec00b6 ("powerpc/32s: Move KUEP locking/unlocking in C")
removed the 'isync' instruction after adding/removing NX bit in user
segments. The reasoning behind this change was that when setting the
NX bit we don't mind it taking effect with delay as the kernel never
executes text from userspace, and when clearing the NX bit this is
to return to userspace and then the 'rfi' should synchronise the
context.

However, it looks like on book3s/32 having a hash page table, at least
on the G3 processor, we get an unexpected fault from userspace, then
this is followed by something wrong in the verification of MSR_PR
at end of another interrupt.

This is fixed by adding back the removed isync() following update
of NX bit in user segment registers. Only do it for cores with an
hash table, as 603 cores don't exhibit that problem and the two isync
increase ./null_syscall selftest by 6 cycles on an MPC 832x.

First problem: unexpected WARN_ON() for mysterious PROTFAULT

  WARNING: CPU: 0 PID: 1660 at arch/powerpc/mm/fault.c:354 do_page_fault+0x6c/0x5b0
  Modules linked in:
  CPU: 0 PID: 1660 Comm: Xorg Not tainted 5.13.0-pmac-00028-gb3c15b60339a #40
  NIP:  c001b5c8 LR: c001b6f8 CTR: 00000000
  REGS: e2d09e40 TRAP: 0700   Not tainted  (5.13.0-pmac-00028-gb3c15b60339a)
  MSR:  00021032 <ME,IR,DR,RI>  CR: 42d04f30  XER: 20000000
  GPR00: c000424c e2d09f00 c301b680 e2d09f40 0000001e 42000000 00cba028 00000000
  GPR08: 08000000 48000010 c301b680 e2d09f30 22d09f30 00c1fff0 00cba000 a7b7ba4c
  GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
  GPR24: a7b7b64c a7b7d2f0 00000004 00000000 c1efa6c0 00cba02c 00000300 e2d09f40
  NIP [c001b5c8] do_page_fault+0x6c/0x5b0
  LR [c001b6f8] do_page_fault+0x19c/0x5b0
  Call Trace:
  [e2d09f00] [e2d09f04] 0xe2d09f04 (unreliable)
  [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4
  --- interrupt: 300 at 0xa7a261dc
  NIP:  a7a261dc LR: a7a253bc CTR: 00000000
  REGS: e2d09f40 TRAP: 0300   Not tainted  (5.13.0-pmac-00028-gb3c15b60339a)
  MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 228428e2  XER: 20000000
  DAR: 00cba02c DSISR: 42000000
  GPR00: a7a27448 afa6b0e0 a74c35c0 a7b7b614 0000001e a7b7b614 00cba028 00000000
  GPR08: 00020fd9 00000031 00cb9ff8 a7a273b0 220028e2 00c1fff0 00cba000 a7b7ba4c
  GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
  GPR24: a7b7b64c a7b7d2f0 00000004 00000002 0000001e a7b7b614 a7b7aff4 00000030
  NIP [a7a261dc] 0xa7a261dc
  LR [a7a253bc] 0xa7a253bc
  --- interrupt: 300
  Instruction dump:
  7c4a1378 810300a0 75278410 83820298 83a300a4 553b018c 551e0036 4082038c
  2e1b0000 40920228 75280800 41820220 <0fe00000> 3b600000 41920214 81420594

Second problem: MSR PR is seen unset allthough the interrupt frame shows it set

  kernel BUG at arch/powerpc/kernel/interrupt.c:458!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
  Modules linked in:
  CPU: 0 PID: 1660 Comm: Xorg Tainted: G        W         5.13.0-pmac-00028-gb3c15b60339a #40
  NIP:  c0011434 LR: c001629c CTR: 00000000
  REGS: e2d09e70 TRAP: 0700   Tainted: G        W          (5.13.0-pmac-00028-gb3c15b60339a)
  MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 42d09f30  XER: 00000000
  GPR00: 00000000 e2d09f30 c301b680 e2d09f40 83440000 c44d0e68 e2d09e8c 00000000
  GPR08: 00000002 00dc228a 00004000 e2d09f30 22d09f30 00c1fff0 afa6ceb4 00c26144
  GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
  GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
  NIP [c0011434] interrupt_exit_kernel_prepare+0x10/0x70
  LR [c001629c] interrupt_return+0x9c/0x144
  Call Trace:
  [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4 (unreliable)
  --- interrupt: 300 at 0xa09be008
  NIP:  a09be008 LR: a09bdfe8 CTR: a09bdfc0
  REGS: e2d09f40 TRAP: 0300   Tainted: G        W          (5.13.0-pmac-00028-gb3c15b60339a)
  MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 420028e2  XER: 20000000
  DAR: a539a308 DSISR: 0a000000
  GPR00: a7b90d50 afa6b2d0 a74c35c0 a0a8b690 a0a8b698 a5365d70 a4fa82a8 00000004
  GPR08: 00000000 a09bdfc0 00000000 a5360000 a09bde7c 00c1fff0 afa6ceb4 00c26144
  GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
  GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
  NIP [a09be008] 0xa09be008
  LR [a09bdfe8] 0xa09bdfe8
  --- interrupt: 300
  Instruction dump:
  80010024 83e1001c 7c0803a6 4bffff80 3bc00800 4bffffd0 486b42fd 4bffffcc
  81430084 71480002 41820038 554a0462 <0f0a0000> 80620060 74630001 40820034

Fixes: b5efec00b6 ("powerpc/32s: Move KUEP locking/unlocking in C")
Cc: stable@vger.kernel.org # v5.13+
Reported-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4856f5574906e2aec0522be17bf3848a22b2cd0b.1629269345.git.christophe.leroy@csgroup.eu
2021-08-19 09:41:54 +10:00
Gerd Rausch
fb4b1373dc net/rds: dma_map_sg is entitled to merge entries
Function "dma_map_sg" is entitled to merge adjacent entries
and return a value smaller than what was passed as "nents".

Subsequently "ib_map_mr_sg" needs to work with this value ("sg_dma_len")
rather than the original "nents" parameter ("sg_len").

This old RDS bug was exposed and reliably causes kernel panics
(using RDMA operations "rds-stress -D") on x86_64 starting with:
commit c588072bba ("iommu/vt-d: Convert intel iommu driver to the iommu ops")

Simply put: Linux 5.11 and later.

Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Link: https://lore.kernel.org/r/60efc69f-1f35-529d-a7ef-da0549cad143@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-18 15:35:50 -07:00
Vladimir Oltean
c1930148a3 net: mscc: ocelot: allow forwarding from bridge ports to the tag_8021q CPU port
Currently we are unable to ping a bridge on top of a felix switch which
uses the ocelot-8021q tagger. The packets are dropped on the ingress of
the user port and the 'drop_local' counter increments (the counter which
denotes drops due to no valid destinations).

Dumping the PGID tables, it becomes clear that the PGID_SRC of the user
port is zero, so it has no valid destinations.

But looking at the code, the cpu_fwd_mask (the bit mask of DSA tag_8021q
ports) is clearly missing from the forwarding mask of ports that are
under a bridge. So this has always been broken.

Looking at the version history of the patch, in v7
https://patchwork.kernel.org/project/netdevbpf/patch/20210125220333.1004365-12-olteanv@gmail.com/
the code looked like this:

	/* Standalone ports forward only to DSA tag_8021q CPU ports */
	unsigned long mask = cpu_fwd_mask;

(...)
	} else if (ocelot->bridge_fwd_mask & BIT(port)) {
		mask |= ocelot->bridge_fwd_mask & ~BIT(port);

while in v8 (the merged version)
https://patchwork.kernel.org/project/netdevbpf/patch/20210129010009.3959398-12-olteanv@gmail.com/
it looked like this:

	unsigned long mask;

(...)
	} else if (ocelot->bridge_fwd_mask & BIT(port)) {
		mask = ocelot->bridge_fwd_mask & ~BIT(port);

So the breakage was introduced between v7 and v8 of the patch.

Fixes: e21268efbe ("net: dsa: felix: perform switch setup for tag_8021q")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210817160425.3702809-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-18 15:34:52 -07:00
Zhan Liu
37717b8c9f drm/amd/display: Use DCN30 watermark calc for DCN301
[why]
dcn301_calculate_wm_and_dl() causes flickering when external monitor is
connected.

This issue has been fixed before by commit 0e4c0ae59d
("drm/amdgpu/display: drop dcn301_calculate_wm_and_dl for now"), however
part of the fix was gone after commit 2cbcb78c9e ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next").

[how]
Use dcn30_calculate_wm_and_dlg() instead as in the original fix.

Fixes: 2cbcb78c9e ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next")

Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Tested-by: Zhan Liu <zhan.liu@amd.com>
Tested-by: Oliver Logush <oliver.logush@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:30:00 -04:00
Linus Torvalds
d6d09a6942 Merge tag 'for-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
 "One more fix for cross-rename, adding a missing check for directory
  and subvolume, this could lead to a crash"

* tag 'for-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: prevent rename2 from exchanging a subvol with a directory from different parents
2021-08-18 12:06:42 -07:00
Linus Torvalds
01f15f3773 Merge tag 'sound-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Only a few regression fixes and trivial device quirks"

* tag 'sound-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/via: Apply runtime PM workaround for ASUS B23E
  ALSA: hda: Fix hang during shutdown due to link reset
  ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15 9510 laptop
  ALSA: oxfw: fix functioal regression for silence in Apogee Duet FireWire
  ALSA: hda - fix the 'Capture Switch' value change notifications
2021-08-18 12:00:27 -07:00
Linus Torvalds
a83955bdad Merge tag 'cfi-v5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull clang cfi fix from Kees Cook:

 - Use rcu_read_{un}lock_sched_notrace to avoid recursion (Elliot Berman)

* tag 'cfi-v5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  cfi: Use rcu_read_{un}lock_sched_notrace
2021-08-18 11:55:50 -07:00
Linus Torvalds
3b844826b6 pipe: avoid unnecessary EPOLLET wakeups under normal loads
I had forgotten just how sensitive hackbench is to extra pipe wakeups,
and commit 3a34b13a88 ("pipe: make pipe writes always wake up
readers") ended up causing a quite noticeable regression on larger
machines.

Now, hackbench isn't necessarily a hugely meaningful benchmark, and it's
not clear that this matters in real life all that much, but as Mel
points out, it's used often enough when comparing kernels and so the
performance regression shows up like a sore thumb.

It's easy enough to fix at least for the common cases where pipes are
used purely for data transfer, and you never have any exciting poll
usage at all.  So set a special 'poll_usage' flag when there is polling
activity, and make the ugly "EPOLLET has crazy legacy expectations"
semantics explicit to only that case.

I would love to limit it to just the broken EPOLLET case, but the pipe
code can't see the difference between epoll and regular select/poll, so
any non-read/write waiting will trigger the extra wakeup behavior.  That
is sufficient for at least the hackbench case.

Apart from making the odd extra wakeup cases more explicitly about
EPOLLET, this also makes the extra wakeup be at the _end_ of the pipe
write, not at the first write chunk.  That is actually much saner
semantics (as much as you can call any of the legacy edge-triggered
expectations for EPOLLET "sane") since it means that you know the wakeup
will happen once the write is done, rather than possibly in the middle
of one.

[ For stable people: I'm putting a "Fixes" tag on this, but I leave it
  up to you to decide whether you actually want to backport it or not.
  It likely has no impact outside of synthetic benchmarks  - Linus ]

Link: https://lore.kernel.org/lkml/20210802024945.GA8372@xsang-OptiPlex-9020/
Fixes: 3a34b13a88 ("pipe: make pipe writes always wake up readers")
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Sandeep Patil <sspatil@android.com>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-18 11:39:46 -07:00
Thomas Weißschuh
1e35b8a778 platform/x86: gigabyte-wmi: add support for B450M S2H V2
Reported as working here:
https://github.com/t-8ch/linux-gigabyte-wmi-driver/issues/1#issuecomment-901207693

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20210818164435.99821-1-linux@weissschuh.net
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-08-18 19:39:31 +02:00
Hans de Goede
5571ea3117 usb: typec: tcpm: Fix VDMs sometimes not being forwarded to alt-mode drivers
Commit a20dcf53ea ("usb: typec: tcpm: Respond Not_Supported if no
snk_vdo"), stops tcpm_pd_data_request() calling tcpm_handle_vdm_request()
when port->nr_snk_vdo is not set. But the VDM might be intended for an
altmode-driver, in which case nr_snk_vdo does not matter.

This change breaks the forwarding of connector hotplug (HPD) events
for displayport altmode on devices which don't set nr_snk_vdo.

tcpm_pd_data_request() is the only caller of tcpm_handle_vdm_request(),
so we can move the nr_snk_vdo check to inside it, at which point we
have already looked up the altmode device so we can check for this too.

Doing this check here also ensures that vdm_state gets set to
VDM_STATE_DONE if it was VDM_STATE_BUSY, even if we end up with
responding with PD_MSG_CTRL_NOT_SUPP later.

Note that tcpm_handle_vdm_request() was already sending
PD_MSG_CTRL_NOT_SUPP in some circumstances, after moving the nr_snk_vdo
check the same error-path is now taken when that check fails. So that
we have only one error-path for this and not two. Replace the
tcpm_queue_message(PD_MSG_CTRL_NOT_SUPP) used by the existing error-path
with the more robust tcpm_pd_handle_msg() from the (now removed) second
error-path.

Fixes: a20dcf53ea ("usb: typec: tcpm: Respond Not_Supported if no snk_vdo")
Cc: stable <stable@vger.kernel.org>
Cc: Kyle Tso <kyletso@google.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Kyle Tso <kyletso@google.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210816154632.381968-1-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-18 15:59:23 +02:00
Nathan Chancellor
3f78c90f9e powerpc/xive: Do not mark xive_request_ipi() as __init
Compiling ppc64le_defconfig with clang-14 shows a modpost warning:

WARNING: modpost: vmlinux.o(.text+0xa74e0): Section mismatch in
reference from the function xive_setup_cpu_ipi() to the function
.init.text:xive_request_ipi()
The function xive_setup_cpu_ipi() references
the function __init xive_request_ipi().
This is often because xive_setup_cpu_ipi lacks a __init
annotation or the annotation of xive_request_ipi is wrong.

xive_request_ipi() is called from xive_setup_cpu_ipi(), which is not
__init, so xive_request_ipi() should not be marked __init. Remove the
attribute so there is no more warning.

Fixes: cbc06f051c ("powerpc/xive: Do not skip CPU-less nodes when creating the IPIs")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210816185711.21563-1-nathan@kernel.org
2021-08-18 23:52:13 +10:00
Jani Nikula
e3e86f4138 drm/i915/dp: remove superfluous EXPORT_SYMBOL()
The symbol isn't needed outside of i915.ko.

Fixes: b30edfd8d0 ("drm/i915: Switch to LTTPR non-transparent mode link training")
Fixes: 264613b406 ("drm/i915: Disable LTTPR support when the DPCD rev < 1.4")
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210816071737.2917-1-jani.nikula@intel.com
(cherry picked from commit d8959fb338)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-18 07:23:57 -04:00
Jani Nikula
baa2152dae drm/i915/edp: fix eDP MSO pipe sanity checks for ADL-P
ADL-P supports stream splitter on pipe B in addition to pipe A. Update
the sanity check in intel_ddi_mso_get_config() to reflect this, and
remove the check in intel_ddi_mso_configure() as redundant with
encoder->pipe_mask. Abstract the splitter pipe mask to a single point of
truth while at it to avoid similar mistakes in the future.

Fixes: 7bc188cc2c ("drm/i915/adl_p: enable MSO on pipe B")
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Swati Sharma <swati2.sharma@intel.com>
Tested-by: Swati Sharma <swati2.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812132354.10885-1-jani.nikula@intel.com
(cherry picked from commit f6864b27d6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-18 07:23:54 -04:00
Anshuman Gupta
b8441b288d drm/i915: Tweaked Wa_14010685332 for all PCHs
dispcnlunit1_cp_xosc_clkreq clock observed to be active on TGL-H platform
despite Wa_14010685332 original sequence,
thus blocks entry to deeper s0ix state.

The Tweaked Wa_14010685332 sequence fixes this issue, therefore use tweaked
Wa_14010685332 sequence for every PCH since PCH_CNP.

v2:
- removed RKL from comment and simplified condition. [Rodrigo]

Fixes: b896898c73 ("drm/i915: Tweaked Wa_14010685332 for PCHs used on gen11 platforms")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210810113112.31739-2-anshuman.gupta@intel.com
(cherry picked from commit 8b46cc6577)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-18 07:23:50 -04:00
Liu Yi L
8798d36411 iommu/vt-d: Fix incomplete cache flush in intel_pasid_tear_down_entry()
This fixes improper iotlb invalidation in intel_pasid_tear_down_entry().
When a PASID was used as nested mode, released and reused, the following
error message will appear:

[  180.187556] Unexpected page request in Privilege Mode
[  180.187565] Unexpected page request in Privilege Mode
[  180.279933] Unexpected page request in Privilege Mode
[  180.279937] Unexpected page request in Privilege Mode

Per chapter 6.5.3.3 of VT-d spec 3.3, when tear down a pasid entry, the
software should use Domain selective IOTLB flush if the PGTT of the pasid
entry is SL only or Nested, while for the pasid entries whose PGTT is FL
only or PT using PASID-based IOTLB flush is enough.

Fixes: 2cd1311a26 ("iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr")
Signed-off-by: Kumar Sanjay K <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Tested-by: Yi Sun <yi.y.sun@intel.com>
Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210817124321.1517985-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18 13:15:58 +02:00
Fenghua Yu
62ef907a04 iommu/vt-d: Fix PASID reference leak
A PASID reference is increased whenever a device is bound to an mm (and
its PASID) successfully (i.e. the device's sdev user count is increased).
But the reference is not dropped every time the device is unbound
successfully from the mm (i.e. the device's sdev user count is decreased).
The reference is dropped only once by calling intel_svm_free_pasid() when
there isn't any device bound to the mm. intel_svm_free_pasid() drops the
reference and only frees the PASID on zero reference.

Fix the issue by dropping the PASID reference and freeing the PASID when
no reference on successful unbinding the device by calling
intel_svm_free_pasid() .

Fixes: 4048377414 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers")
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20210813181345.1870742-1-fenghua.yu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210817124321.1517985-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18 13:15:58 +02:00
Pavel Skripkin
a786e3195d net: asix: fix uninit value bugs
Syzbot reported uninit-value in asix_mdio_read(). The problem was in
missing error handling. asix_read_cmd() should initialize passed stack
variable smsr, but it can fail in some cases. Then while condidition
checks possibly uninit smsr variable.

Since smsr is uninitialized stack variable, driver can misbehave,
because smsr will be random in case of asix_read_cmd() failure.
Fix it by adding error handling and just continue the loop instead of
checking uninit value.

Added helper function for checking Host_En bit, since wrong loop was used
in 4 functions and there is no need in copy-pasting code parts.

Cc: Robert Foss <robert.foss@collabora.com>
Fixes: d9fe64e511 ("net: asix: Add in_pm parameter")
Reported-by: syzbot+a631ec9e717fb0423053@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18 11:46:52 +01:00
kaixi.fan
01634047bf ovs: clear skb->tstamp in forwarding path
fq qdisc requires tstamp to be cleared in the forwarding path. Now ovs
doesn't clear skb->tstamp. We encountered a problem with linux
version 5.4.56 and ovs version 2.14.1, and packets failed to
dequeue from qdisc when fq qdisc was attached to ovs port.

Fixes: fb420d5d91 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: kaixi.fan <fankaixi.li@bytedance.com>
Signed-off-by: xiexiaohui <xiexiaohui.xxh@bytedance.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18 11:31:13 +01:00
David S. Miller
97712f8f91 Merge branch 'mdio-fixes'
Saravana Kannan says:

====================
Clean up and fix error handling in mdio_mux_init()

This patch series was started due to -EPROBE_DEFER not being handled
correctly in mdio_mux_init() and causing issues [1]. While at it, I also
did some more error handling fixes and clean ups. The -EPROBE_DEFER fix is
the last patch.

Ideally, in the last patch we'd treat any error similar to -EPROBE_DEFER
but I'm not sure if it'll break any board/platforms where some child
mdiobus never successfully registers. If we treated all errors similar to
-EPROBE_DEFER, then none of the child mdiobus will work and that might be a
regression. If people are sure this is not a real case, then I can fix up
the last patch to always fail the entire mdio-mux init if any of the child
mdiobus registration fails.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18 10:48:52 +01:00
Saravana Kannan
7bd0cef5da net: mdio-mux: Handle -EPROBE_DEFER correctly
When registering mdiobus children, if we get an -EPROBE_DEFER, we shouldn't
ignore it and continue registering the rest of the mdiobus children. This
would permanently prevent the deferring child mdiobus from working instead
of reattempting it in the future. So, if a child mdiobus needs to be
reattempted in the future, defer the entire mdio-mux initialization.

This fixes the issue where PHYs sitting under the mdio-mux aren't
initialized correctly if the PHY's interrupt controller is not yet ready
when the mdio-mux is being probed. Additional context in the link below.

Fixes: 0ca2997d14 ("netdev/of/phy: Add MDIO bus multiplexer support.")
Link: https://lore.kernel.org/lkml/CAGETcx95kHrv8wA-O+-JtfH7H9biJEGJtijuPVN0V5dUKUAB3A@mail.gmail.com/#t
Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18 10:48:52 +01:00
Saravana Kannan
99d81e9424 net: mdio-mux: Don't ignore memory allocation errors
If we are seeing memory allocation errors, don't try to continue
registering child mdiobus devices. It's unlikely they'll succeed.

Fixes: 342fa19644 ("mdio: mux: make child bus walking more permissive and errors more verbose")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18 10:48:52 +01:00
Saravana Kannan
663d946af5 net: mdio-mux: Delete unnecessary devm_kfree
The whole point of devm_* APIs is that you don't have to undo them if you
are returning an error that's going to get propagated out of a probe()
function. So delete unnecessary devm_kfree() call in the error return path.

Fixes: b601616681 ("mdio: mux: Correct mdio_mux_init error path issues")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18 10:48:52 +01:00
Vladimir Oltean
ed5d2937a6 net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, or worse
It seems that of_find_compatible_node has a weird calling convention in
which it calls of_node_put() on the "from" node argument, instead of
leaving that up to the caller. This comes from the fact that
of_find_compatible_node with a non-NULL "from" argument it only supposed
to be used as the iterator function of for_each_compatible_node(). OF
iterator functions call of_node_get on the next OF node and of_node_put()
on the previous one.

When of_find_compatible_node calls of_node_put, it actually never
expects the refcount to drop to zero, because the call is done under the
atomic devtree_lock context, and when the refcount drops to zero it
triggers a kobject and a sysfs file deletion, which assume blocking
context.

So any driver call to of_find_compatible_node is probably buggy because
an unexpected of_node_put() takes place.

What should be done is to use the of_get_compatible_child() function.

Fixes: 5a8f09748e ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
Link: https://lore.kernel.org/netdev/20210814010139.kzryimmp4rizlznt@skbuf/
Suggested-by: Frank Rowand <frowand.list@gmail.com>
Suggested-by: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18 10:21:01 +01:00
Toke Høiland-Jørgensen
86b9bbd332 sch_cake: fix srchost/dsthost hashing mode
When adding support for using the skb->hash value as the flow hash in CAKE,
I accidentally introduced a logic error that broke the host-only isolation
modes of CAKE (srchost and dsthost keywords). Specifically, the flow_hash
variable should stay initialised to 0 in cake_hash() in pure host-based
hashing mode. Add a check for this before using the skb->hash value as
flow_hash.

Fixes: b0c19ed608 ("sch_cake: Take advantage of skb->hash where appropriate")
Reported-by: Pete Heist <pete@heistp.net>
Tested-by: Pete Heist <pete@heistp.net>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18 10:14:00 +01:00
Ben Skeggs
59f216cf04 drm/nouveau: rip out nvkm_client.super
No longer required now that userspace can't touch anything that might
need it, and should fix DRM MM operations racing with each other, and
the random hangs/crashes that come with that.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:15 +10:00
Ben Skeggs
148a865378 drm/nouveau: block a bunch of classes from userspace
Long ago, there had been plans for making use of a bunch of these APIs
from userspace and there's various checks in place to stop misbehaving.

Countless other projects have occurred in the meantime, and the pieces
didn't finish falling into place for that to happen.

They will (hopefully) in the not-too-distant future, but it won't look
quite as insane.  The super checks are causing problems right now, and
are going to be removed.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:13 +10:00
Ben Skeggs
50c4a64491 drm/nouveau/fifo/nv50-: rip out dma channels
I honestly don't even know why...  These have never been used.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:11 +10:00
Ben Skeggs
e78b1b545c drm/nouveau/kms/nv50: workaround EFI GOP window channel format differences
Should fix some initial modeset failures on (at least) Ampere boards.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:08 +10:00
Ben Skeggs
6eaa1f3c59 drm/nouveau/disp: power down unused DP links during init
When booted with multiple displays attached, the EFI GOP driver on (at
least) Ampere, can leave DP links powered up that aren't being used to
display anything.  This confuses our tracking of SOR routing, with the
likely result being a failed modeset and display engine hang.

Fix this by (ab?)using the DisableLT IED script to power-down the link,
restoring HW to a state the driver expects.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:04 +10:00
Ben Skeggs
fa25f28ef2 drm/nouveau: recognise GA107
Still no GA106 as I don't have HW to verif.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 18:59:41 +10:00
Niklas Schnelle
2a671f77ee s390/pci: fix use after free of zpci_dev
The struct pci_dev uses reference counting but zPCI assumed erroneously
that the last reference would always be the local reference after
calling pci_stop_and_remove_bus_device(). This is usually the case but
not how reference counting works and thus inherently fragile.

In fact one case where this causes a NULL pointer dereference when on an
SRIOV device the function 0 was hot unplugged before another function of
the same multi-function device. In this case the second function's
pdev->sriov->dev reference keeps the struct pci_dev of function 0 alive
even after the unplug. This bug was previously hidden by the fact that
we were leaking the struct pci_dev which in turn means that it always
outlived the struct zpci_dev. This was fixed in commit 0b13525c20
("s390/pci: fix leak of PCI device structure") exposing the broken
behavior.

Fix this by accounting for the long living reference a struct pci_dev
has to its underlying struct zpci_dev via the zbus->function[] array and
only release that in pcibios_release_device() ensuring that the struct
pci_dev is not left with a dangling reference. This is a minimal fix in
the future it would probably better to use fine grained reference
counting for struct zpci_dev.

Fixes: 05bc1be6db ("s390/pci: create zPCI bus")
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-08-18 10:12:42 +02:00
Thomas Weißschuh
b9570f5c92 platform/x86: gigabyte-wmi: add support for X570 GAMING X
Reported as working here:
https://github.com/t-8ch/linux-gigabyte-wmi-driver/issues/1#issuecomment-900263115

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20210817154628.84992-1-linux@weissschuh.net
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-08-18 09:43:51 +02:00
Ming Lei
a9ed27a764 blk-mq: fix is_flush_rq
is_flush_rq() is called from bt_iter()/bt_tags_iter(), and runs the
following check:

	hctx->fq->flush_rq == req

but the passed hctx from bt_iter()/bt_tags_iter() may be NULL because:

1) memory re-order in blk_mq_rq_ctx_init():

	rq->mq_hctx = data->hctx;
	...
	refcount_set(&rq->ref, 1);

OR

2) tag re-use and ->rqs[] isn't updated with new request.

Fix the issue by re-writing is_flush_rq() as:

	return rq->end_io == flush_end_io;

which turns out simpler to follow and immune to data race since we have
ordered WRITE rq->end_io and refcount_set(&rq->ref, 1).

Fixes: 2e315dc07d ("blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter")
Cc: "Blank-Burian, Markus, Dr." <blankburian@uni-muenster.de>
Cc: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210818010925.607383-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-17 20:17:34 -06:00
Wang Hai
1b80fec7b0 ixgbe, xsk: clean up the resources in ixgbe_xsk_pool_enable error path
In ixgbe_xsk_pool_enable(), if ixgbe_xsk_wakeup() fails,
We should restore the previous state and clean up the
resources. Add the missing clear af_xdp_zc_qps and unmap dma
to fix this bug.

Fixes: d49e286d35 ("ixgbe: add tracking of AF_XDP zero-copy state for each queue pair")
Fixes: 4a9b32f30f ("ixgbe: fix potential RX buffer starvation for AF_XDP")
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20210817203736.3529939-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-17 17:47:52 -07:00
Jakub Kicinski
e5e487a2ec Merge tag 'wireless-drivers-2021-08-17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:

====================
wireless-drivers fixes for v5.14

First set of fixes for v5.14 and nothing major this time. New devices
for iwlwifi and one fix for a compiler warning.

iwlwifi
 * support for new devices

mt76
 * fix compiler warning about MT_CIPHER_NONE

* tag 'wireless-drivers-2021-08-17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers:
  mt76: fix enum type mismatch
  iwlwifi: add new so-jf devices
  iwlwifi: add new SoF with JF devices
  iwlwifi: pnvm: accept multiple HW-type TLVs
====================

Link: https://lore.kernel.org/r/20210817171027.EC1E6C43460@smtp.codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-17 15:08:14 -07:00
Pavel Begunkov
9cb0073b30 io_uring: pin ctx on fallback execution
Pin ring in io_fallback_req_func() by briefly elevating ctx->refs in
case any task_work handler touches ctx after releasing a request.

Fixes: 9011bf9a13 ("io_uring: fix stuck fallback reqs")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/833a494713d235ec144284a9bbfe418df4f6b61c.1629235576.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-17 16:06:14 -06:00
Linus Torvalds
614cb2751d Merge tag 'trace-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
 "Limit the shooting in the foot of tp_printk

  The "tp_printk" option redirects the trace event output to printk at
  boot up. This is useful when a machine crashes before boot where the
  trace events can not be retrieved by the in kernel ring buffer. But it
  can be "dangerous" because trace events can be located in high
  frequency locations such as interrupts and the scheduler, where a
  printk can slow it down that it live locks the machine (because by the
  time the printk finishes, the next event is triggered). Thus tp_printk
  must be used with care.

  It was discovered that the filter logic to trace events does not apply
  to the tp_printk events. This can cause a surprise and live lock when
  the user expects it to be filtered to limit the amount of events
  printed to the console when in fact it still prints everything"

* tag 'trace-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Apply trace filters on all output channels
2021-08-17 09:47:18 -10:00
Rafael J. Wysocki
a87a10961a Merge branch 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull ARM cpufreq fixes for v5.14 from Viresh Kumar:

"This contains:

 - Addition of SoCs to blocklist for cpufreq-dt driver (Bjorn Andersson
   and Thara Gopinath).

 - Fix error path for scmi driver (Lukasz Luba).

 - Temporarily disable highest frequency for armada, its unsafe and
   breaks stuff."

* 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: armada-37xx: forbid cpufreq for 1.2 GHz variant
  cpufreq: blocklist Qualcomm sm8150 in cpufreq-dt-platdev
  cpufreq: arm_scmi: Fix error path when allocation failed
  cpufreq: blacklist Qualcomm sc8180x in cpufreq-dt-platdev
2021-08-17 20:52:07 +02:00
Mark Yacoub
fa0b1ef5f7 drm: Copy drm_wait_vblank to user before returning
[Why]
Userspace should get back a copy of drm_wait_vblank that's been modified
even when drm_wait_vblank_ioctl returns a failure.

Rationale:
drm_wait_vblank_ioctl modifies the request and expects the user to read
it back. When the type is RELATIVE, it modifies it to ABSOLUTE and updates
the sequence to become current_vblank_count + sequence (which was
RELATIVE), but now it became ABSOLUTE.
drmWaitVBlank (in libdrm) expects this to be the case as it modifies
the request to be Absolute so it expects the sequence to would have been
updated.

The change is in compat_drm_wait_vblank, which is called by
drm_compat_ioctl. This change of copying the data back regardless of the
return number makes it en par with drm_ioctl, which always copies the
data before returning.

[How]
Return from the function after everything has been copied to user.

Fixes IGT:kms_flip::modeset-vs-vblank-race-interruptible
Tested on ChromeOS Trogdor(msm)

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Mark Yacoub <markyacoub@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812194917.1703356-1-markyacoub@chromium.org
2021-08-17 13:56:03 -04:00
Dinghao Liu
0a298d1338 net: qlcnic: add missed unlock in qlcnic_83xx_flash_read32
qlcnic_83xx_unlock_flash() is called on all paths after we call
qlcnic_83xx_lock_flash(), except for one error path on failure
of QLCRD32(), which may cause a deadlock. This bug is suggested
by a static analysis tool, please advise.

Fixes: 81d0aeb0a4 ("qlcnic: flash template based firmware reset recovery")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20210816131405.24024-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-17 08:27:31 -07:00
Ming Lei
c2da19ed50 blk-mq: fix kernel panic during iterating over flush request
For fixing use-after-free during iterating over requests, we grabbed
request's refcount before calling ->fn in commit 2e315dc07d ("blk-mq:
grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter").
Turns out this way may cause kernel panic when iterating over one flush
request:

1) old flush request's tag is just released, and this tag is reused by
one new request, but ->rqs[] isn't updated yet

2) the flush request can be re-used for submitting one new flush command,
so blk_rq_init() is called at the same time

3) meantime blk_mq_queue_tag_busy_iter() is called, and old flush request
is retrieved from ->rqs[tag]; when blk_mq_put_rq_ref() is called,
flush_rq->end_io may not be updated yet, so NULL pointer dereference
is triggered in blk_mq_put_rq_ref().

Fix the issue by calling refcount_set(&flush_rq->ref, 1) after
flush_rq->end_io is set. So far the only other caller of blk_rq_init() is
scsi_ioctl_reset() in which the request doesn't enter block IO stack and
the request reference count isn't used, so the change is safe.

Fixes: 2e315dc07d ("blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter")
Reported-by: "Blank-Burian, Markus, Dr." <blankburian@uni-muenster.de>
Tested-by: "Blank-Burian, Markus, Dr." <blankburian@uni-muenster.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20210811142624.618598-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-17 08:33:32 -06:00
Ming Lei
c797b40ccc blk-mq: don't grab rq's refcount in blk_mq_check_expired()
Inside blk_mq_queue_tag_busy_iter() we already grabbed request's
refcount before calling ->fn(), so needn't to grab it one more time
in blk_mq_check_expired().

Meantime remove extra request expire check in blk_mq_check_expired().

Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20210811155202.629575-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-17 08:32:45 -06:00
Johannes Berg
276e189f8e mac80211: fix locking in ieee80211_restart_work()
Ilan's change to move locking around accidentally lost the
wiphy_lock() during some porting, add it back.

Fixes: 45daaa1318 ("mac80211: Properly WARN on HW scan before restart")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20210817121210.47bdb177064f.Ib1ef79440cd27f318c028ddfc0c642406917f512@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-17 06:51:43 -07:00
Jason Wang
dbcf24d153 virtio-net: use NETIF_F_GRO_HW instead of NETIF_F_LRO
Commit a02e8964ea ("virtio-net: ethtool configurable LRO")
maps LRO to virtio guest offloading features and allows the
administrator to enable and disable those features via ethtool.

This leads to several issues:

- For a device that doesn't support control guest offloads, the "LRO"
  can't be disabled triggering WARN in dev_disable_lro() when turning
  off LRO or when enabling forwarding bridging etc.

- For a device that supports control guest offloads, the guest
  offloads are disabled in cases of bridging, forwarding etc slowing
  down the traffic.

Fix this by using NETIF_F_GRO_HW instead. Though the spec does not
guarantee packets to be re-segmented as the original ones,
we can add that to the spec, possibly with a flag for devices to
differentiate between GRO and LRO.

Further, we never advertised LRO historically before a02e8964ea
("virtio-net: ethtool configurable LRO") and so bridged/forwarded
configs effectively always relied on virtio receive offloads behaving
like GRO - thus even if this breaks any configs it is at least not
a regression.

Fixes: a02e8964ea ("virtio-net: ethtool configurable LRO")
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Ivan <ivan@prestigetransportation.com>
Tested-by: Ivan <ivan@prestigetransportation.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:45:09 +01:00
Takashi Iwai
4bf61ad5f0 ALSA: hda/via: Apply runtime PM workaround for ASUS B23E
ASUS B23E requires the same workaround like other machines with
VT1802, otherwise it looses the codec power on a few nodes and the
sound kept silence.

Fixes: a0645daf16 ("ALSA: HDA: Early Forbid of runtime PM")
Link: https://lore.kernel.org/r/ac2232f142efcd67fe6ac38897f704f7176bd200.camel@gmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210817052432.14751-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-17 08:02:44 +02:00
Imre Deak
0165c4e19f ALSA: hda: Fix hang during shutdown due to link reset
During system shutdown codecs may be still active, and resetting the
controller->codec HW link in this state - based on the bug reporter's
tests - leads to the shutdown sequence to get stuck. This happens at
least on the reporter's KBL system with an ALC662 codec.

For now fix the issue by skipping the link reset step.

Fixes: 472e18f63c ("ALSA: hda: Release controller display power during shutdown/reboot")
References: https://bugzilla.kernel.org/show_bug.cgi?id=214045
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3618#note_1024665
Reported-and-tested-by: youling257@gmail.com
Cc: youling257@gmail.com
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/20210816174259.2759103-1-imre.deak@intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-17 07:14:30 +02:00
Linus Torvalds
794c7931a2 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This contains a fix for a potential boot failure due to a missing
  Kconfig dependency for people upgrading with the DRBG enabled"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: drbg - select SHA512
2021-08-16 15:42:09 -10:00
Lahav Schlesinger
09e856d54b vrf: Reset skb conntrack connection on VRF rcv
To fix the "reverse-NAT" for replies.

When a packet is sent over a VRF, the POST_ROUTING hooks are called
twice: Once from the VRF interface, and once from the "actual"
interface the packet will be sent from:
1) First SNAT: l3mdev_l3_out() -> vrf_l3_out() -> .. -> vrf_output_direct()
     This causes the POST_ROUTING hooks to run.
2) Second SNAT: 'ip_output()' calls POST_ROUTING hooks again.

Similarly for replies, first ip_rcv() calls PRE_ROUTING hooks, and
second vrf_l3_rcv() calls them again.

As an example, consider the following SNAT rule:
> iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j SNAT --to-source 2.2.2.2 -o vrf_1

In this case sending over a VRF will create 2 conntrack entries.
The first is from the VRF interface, which performs the IP SNAT.
The second will run the SNAT, but since the "expected reply" will remain
the same, conntrack randomizes the source port of the packet:
e..g With a socket bound to 1.1.1.1:10000, sending to 3.3.3.3:53, the conntrack
rules are:
udp      17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=0 bytes=0 mark=0 use=1
udp      17 29 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1

i.e. First SNAT IP from 1.1.1.1 --> 2.2.2.2, and second the src port is
SNAT-ed from 10000 --> 61033.

But when a reply is sent (3.3.3.3:53 -> 2.2.2.2:61033) only the later
conntrack entry is matched:
udp      17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=1 bytes=49 mark=0 use=1
udp      17 28 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1

And a "port 61033 unreachable" ICMP packet is sent back.

The issue is that when PRE_ROUTING hooks are called from vrf_l3_rcv(),
the skb already has a conntrack flow attached to it, which means
nf_conntrack_in() will not resolve the flow again.

This means only the dest port is "reverse-NATed" (61033 -> 10000) but
the dest IP remains 2.2.2.2, and since the socket is bound to 1.1.1.1 it's
not received.
This can be verified by logging the 4-tuple of the packet in '__udp4_lib_rcv()'.

The fix is then to reset the flow when skb is received on a VRF, to let
conntrack resolve the flow again (which now will hit the earlier flow).

To reproduce: (Without the fix "Got pkt_to_nat_port" will not be printed by
  running 'bash ./repro'):
  $ cat run_in_A1.py
  import logging
  logging.getLogger("scapy.runtime").setLevel(logging.ERROR)
  from scapy.all import *
  import argparse

  def get_packet_to_send(udp_dst_port, msg_name):
      return Ether(src='11:22:33:44:55:66', dst=iface_mac)/ \
          IP(src='3.3.3.3', dst='2.2.2.2')/ \
          UDP(sport=53, dport=udp_dst_port)/ \
          Raw(f'{msg_name}\x0012345678901234567890')

  parser = argparse.ArgumentParser()
  parser.add_argument('-iface_mac', dest="iface_mac", type=str, required=True,
                      help="From run_in_A3.py")
  parser.add_argument('-socket_port', dest="socket_port", type=str,
                      required=True, help="From run_in_A3.py")
  parser.add_argument('-v1_mac', dest="v1_mac", type=str, required=True,
                      help="From script")

  args, _ = parser.parse_known_args()
  iface_mac = args.iface_mac
  socket_port = int(args.socket_port)
  v1_mac = args.v1_mac

  print(f'Source port before NAT: {socket_port}')

  while True:
      pkts = sniff(iface='_v0', store=True, count=1, timeout=10)
      if 0 == len(pkts):
          print('Something failed, rerun the script :(', flush=True)
          break
      pkt = pkts[0]
      if not pkt.haslayer('UDP'):
          continue

      pkt_sport = pkt.getlayer('UDP').sport
      print(f'Source port after NAT: {pkt_sport}', flush=True)

      pkt_to_send = get_packet_to_send(pkt_sport, 'pkt_to_nat_port')
      sendp(pkt_to_send, '_v0', verbose=False) # Will not be received

      pkt_to_send = get_packet_to_send(socket_port, 'pkt_to_socket_port')
      sendp(pkt_to_send, '_v0', verbose=False)
      break

  $ cat run_in_A2.py
  import socket
  import netifaces

  print(f"{netifaces.ifaddresses('e00000')[netifaces.AF_LINK][0]['addr']}",
        flush=True)
  s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
  s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
               str('vrf_1' + '\0').encode('utf-8'))
  s.connect(('3.3.3.3', 53))
  print(f'{s. getsockname()[1]}', flush=True)
  s.settimeout(5)

  while True:
      try:
          # Periodically send in order to keep the conntrack entry alive.
          s.send(b'a'*40)
          resp = s.recvfrom(1024)
          msg_name = resp[0].decode('utf-8').split('\0')[0]
          print(f"Got {msg_name}", flush=True)
      except Exception as e:
          pass

  $ cat repro.sh
  ip netns del A1 2> /dev/null
  ip netns del A2 2> /dev/null
  ip netns add A1
  ip netns add A2

  ip -n A1 link add _v0 type veth peer name _v1 netns A2
  ip -n A1 link set _v0 up

  ip -n A2 link add e00000 type bond
  ip -n A2 link add lo0 type dummy
  ip -n A2 link add vrf_1 type vrf table 10001
  ip -n A2 link set vrf_1 up
  ip -n A2 link set e00000 master vrf_1

  ip -n A2 addr add 1.1.1.1/24 dev e00000
  ip -n A2 link set e00000 up
  ip -n A2 link set _v1 master e00000
  ip -n A2 link set _v1 up
  ip -n A2 link set lo0 up
  ip -n A2 addr add 2.2.2.2/32 dev lo0

  ip -n A2 neigh add 1.1.1.10 lladdr 77:77:77:77:77:77 dev e00000
  ip -n A2 route add 3.3.3.3/32 via 1.1.1.10 dev e00000 table 10001

  ip netns exec A2 iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j \
	SNAT --to-source 2.2.2.2 -o vrf_1

  sleep 5
  ip netns exec A2 python3 run_in_A2.py > x &
  XPID=$!
  sleep 5

  IFACE_MAC=`sed -n 1p x`
  SOCKET_PORT=`sed -n 2p x`
  V1_MAC=`ip -n A2 link show _v1 | sed -n 2p | awk '{print $2'}`
  ip netns exec A1 python3 run_in_A1.py -iface_mac ${IFACE_MAC} -socket_port \
          ${SOCKET_PORT} -v1_mac ${SOCKET_PORT}
  sleep 5

  kill -9 $XPID
  wait $XPID 2> /dev/null
  ip netns del A1
  ip netns del A2
  tail x -n 2
  rm x
  set +x

Fixes: 73e20b761a ("net: vrf: Add support for PREROUTING rules on vrf device")
Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210815120002.2787653-1-lschlesinger@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-16 16:37:01 -07:00
Arnd Bergmann
d0dc706ab1 Merge tag 'qcom-arm64-fixes-for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm ARM64 fixes for v5.14

This fixes three regressions across Angler and Bullhead, introduced by
advancements in the platform definition. It then corrects the powerdown
GPIOs for the speaker amps on C630 and lastly fixes a typo that assigned
CPU7 in SC7280 to the wrong CPUfreq domain.

* tag 'qcom-arm64-fixes-for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  arm64: dts: qcom: sdm845-oneplus: fix reserved-mem
  arm64: dts: qcom: msm8994-angler: Disable cont_splash_mem
  arm64: dts: qcom: sc7280: Fixup cpufreq domain info for cpu7
  arm64: dts: qcom: msm8992-bullhead: Fix cont_splash_mem mapping
  arm64: dts: qcom: msm8992-bullhead: Remove PSCI
  arm64: dts: qcom: c630: fix correct powerdown pin for WSA881x

Link: https://lore.kernel.org/r/20210816205030.576348-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-16 23:22:03 +02:00
Arnd Bergmann
df97e5f3b2 Merge tag 'soc-fsl-fix-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux into arm/fixes
NXP/FSL SoC driver fixes for v5.14

QE interrupt controller driver
- Convert it to platform_device driver to make it work with fw_devlink
- Fix static analysis issue

* tag 'soc-fsl-fix-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux:
  soc: fsl: qe: fix static checker warning
  soc: fsl: qe: convert QE interrupt controller to platform_device

Link: https://lore.kernel.org/r/20210813222305.13663-1-leoyang.li@nxp.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-16 22:42:02 +02:00
Jake Wang
71ae580f31 drm/amd/display: Ensure DCN save after VM setup
[Why]
DM initializes VM context after DMCUB initialization.
This results in loss of DCN_VM_CONTEXT registers after z10.

[How]
Notify DMCUB when VM setup is complete, and have DMCUB
save init registers.

v2: squash in CONFIG_DRM_AMD_DC_DCN3_1 fix

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Jake Wang <haonan.wang2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 16:04:24 -04:00
Yifan Zhang
f924f3a1f0 drm/amdkfd: fix random KFDSVMRangeTest.SetGetAttributesTest test failure
KFDSVMRangeTest.SetGetAttributesTest randomly fails in stress test.

Note: Google Test filter = KFDSVMRangeTest.*
[==========] Running 18 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 18 tests from KFDSVMRangeTest
[ RUN      ] KFDSVMRangeTest.BasicSystemMemTest
[       OK ] KFDSVMRangeTest.BasicSystemMemTest (30 ms)
[ RUN      ] KFDSVMRangeTest.SetGetAttributesTest
[          ] Get default atrributes
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure
Value of: expectedDefaultResults[i]
  Actual: 4294967295
Expected: outputAttributes[i].value
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure
Value of: expectedDefaultResults[i]
  Actual: 4294967295
Expected: outputAttributes[i].value
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:152: Failure
Value of: expectedDefaultResults[i]
  Actual: 4
Expected: outputAttributes[i].type
Which is: 2
[          ] Setting/Getting atrributes
[  FAILED  ]

the root cause is that svm work queue has not finished when svm_range_get_attr is called, thus
some garbage svm interval tree data make svm_range_get_attr get wrong result. Flush work queue before
iterate svm interval tree.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:59:41 -04:00
Kenneth Feng
93c5701b00 drm/amd/pm: change the workload type for some cards
change the workload type for some cards as it is needed.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:58:57 -04:00
Kenneth Feng
2fd31689f9 Revert "drm/amd/pm: fix workload mismatch on vega10"
This reverts commit 0979d43259.
Revert this because it does not apply to all the cards.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:58:44 -04:00
Linus Torvalds
a2824f19e6 Merge tag 'mtd/fixes-for-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fixes from Miquel Raynal:
 "MTD core fixes:
   - Fix lock hierarchy in deregister_mtd_blktrans
   - Handle flashes without OTP gracefully
   - Break circular locks in register_mtd_blktrans

  MTD device fixes:
   - mchp48l640:
      - Fix memory leak on cmd
      - Silence some uninitialized variable warnings
   - blkdevs:
      - Initialize rq.limits.discard_granularity

  CFI fixes:
   - Fix crash when erasing/writing AMD cards

  Raw NAND fixes:
   - Fix of_get_nand_secure_regions():
      - Add a missing check
      - Avoid an unwanted probe failure when a DT property is missing"

* tag 'mtd/fixes-for-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: Fix probe failure due to of_get_nand_secure_regions()
  mtd: fix lock hierarchy in deregister_mtd_blktrans
  mtd: devices: mchp48l640: Fix memory leak on cmd
  mtd: cfi_cmdset_0002: fix crash when erasing/writing AMD cards
  mtd: core: handle flashes without OTP gracefully
  mtd: mchp48l640: silence some uninitialized variable warnings
  mtd: break circular locks in register_mtd_blktrans
  mtd: rawnand: Add a check in of_get_nand_secure_regions()
  mtd: mtd_blkdevs: Initialize rq.limits.discard_granularity
2021-08-16 06:36:01 -10:00
Linus Torvalds
b88bcc7d54 Merge tag 'trace-v5.14-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
 "Fixes and clean ups to tracing:

   - Fix header alignment when PREEMPT_RT is enabled for osnoise tracer

   - Inject "stop" event to see where osnoise stopped the trace

   - Define DYNAMIC_FTRACE_WITH_ARGS as some code had an #ifdef for it

   - Fix erroneous message for bootconfig cmdline parameter

   - Fix crash caused by not found variable in histograms"

* tag 'trace-v5.14-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing / histogram: Fix NULL pointer dereference on strcmp() on NULL event name
  init: Suppress wrong warning for bootconfig cmdline parameter
  tracing: define needed config DYNAMIC_FTRACE_WITH_ARGS
  trace/osnoise: Print a stop tracing message
  trace/timerlat: Add a header with PREEMPT_RT additional fields
  trace/osnoise: Add a header with PREEMPT_RT additional fields
2021-08-16 06:31:06 -10:00
Mario Limonciello
4753b46e16 ACPI: PM: s2idle: Invert Microsoft UUID entry and exit
It was reported by a user with a Dell m15 R5 (5800H) that
the keyboard backlight was turning on when entering suspend
and turning off when exiting (the opposite of how it should be).

The user bisected it back to commit 5dbf509975 ("ACPI: PM:
s2idle: Add support for new Microsoft UUID").  Previous to that
commit the LEDs didn't turn off at all.  Confirming in the spec,
these were reversed when introduced.

Fix them to match the spec.

BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1230#note_1021836
Fixes: 5dbf509975 ("ACPI: PM: s2idle: Add support for new Microsoft UUID")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-16 18:28:12 +02:00
Linus Torvalds
02a3715449 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "Two nested virtualization fixes for AMD processors"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: nSVM: always intercept VMLOAD/VMSAVE when nested (CVE-2021-3656)
  KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl (CVE-2021-3653)
2021-08-16 06:23:26 -10:00
Linus Torvalds
94e95d5899 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
 "Fixes in virtio, vhost, and vdpa drivers"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa/mlx5: Fix queue type selection logic
  vdpa/mlx5: Avoid destroying MR on empty iotlb
  tools/virtio: fix build
  virtio_ring: pull in spinlock header
  vringh: pull in spinlock header
  virtio-blk: Add validation for block size in config space
  vringh: Use wiov->used to check for read/write desc order
  virtio_vdpa: reject invalid vq indices
  vdpa: Add documentation for vdpa_alloc_device() macro
  vDPA/ifcvf: Fix return value check for vdpa_alloc_device()
  vp_vdpa: Fix return value check for vdpa_alloc_device()
  vdpa_sim: Fix return value check for vdpa_alloc_device()
  vhost: Fix the calculation in vhost_overflow()
  vhost-vdpa: Fix integer overflow in vhost_vdpa_process_iotlb_update()
  virtio_pci: Support surprise removal of virtio pci device
  virtio: Protect vqs list access
  virtio: Keep vring_del_virtqueue() mirror of VQ create
  virtio: Improve vq->broken access to avoid any compiler optimization
2021-08-16 06:16:25 -10:00
Aubrey Li
2bbfa0addd ACPI: PRM: Deal with table not present or no module found
On the system PRMT table is not present, dmesg output:

	$ dmesg | grep PRM
	[    1.532237] ACPI: PRMT not present
	[    1.532237] PRM: found 4294967277 modules

The result of acpi_table_parse_entries need to be checked and return
immediately if PRMT table is not present or no PRM module found.

Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-16 17:06:40 +02:00
Pingfan Liu
6c34df6f35 tracing: Apply trace filters on all output channels
The event filters are not applied on all of the output, which results in
the flood of printk when using tp_printk. Unfolding
event_trigger_unlock_commit_regs() into trace_event_buffer_commit(), so
the filters can be applied on every output.

Link: https://lkml.kernel.org/r/20210814034538.8428-1-kernelfans@gmail.com

Cc: stable@vger.kernel.org
Fixes: 0daa230296 ("tracing: Add tp_printk cmdline to have tracepoints go to printk()")
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-16 11:01:52 -04:00
Rafael J. Wysocki
0da04f884a Merge branch 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull operating performance points (OPP) framework fixes for v5.14
from iresh Kumar:

"This removes few WARN() statements from the OPP core."

* 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  opp: Drop empty-table checks from _put functions
  opp: remove WARN when no valid OPPs remain
2021-08-16 16:43:46 +02:00
Maxim Levitsky
c7dfa40099 KVM: nSVM: always intercept VMLOAD/VMSAVE when nested (CVE-2021-3656)
If L1 disables VMLOAD/VMSAVE intercepts, and doesn't enable
Virtual VMLOAD/VMSAVE (currently not supported for the nested hypervisor),
then VMLOAD/VMSAVE must operate on the L1 physical memory, which is only
possible by making L0 intercept these instructions.

Failure to do so allowed the nested guest to run VMLOAD/VMSAVE unintercepted,
and thus read/write portions of the host physical memory.

Fixes: 89c8a4984f ("KVM: SVM: Enable Virtual VMLOAD VMSAVE feature")

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-16 09:48:37 -04:00
Maxim Levitsky
0f923e0712 KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl (CVE-2021-3653)
* Invert the mask of bits that we pick from L2 in
  nested_vmcb02_prepare_control

* Invert and explicitly use VIRQ related bits bitmask in svm_clear_vintr

This fixes a security issue that allowed a malicious L1 to run L2 with
AVIC enabled, which allowed the L2 to exploit the uninitialized and enabled
AVIC to read/write the host physical memory at some offsets.

Fixes: 3d6368ef58 ("KVM: SVM: Add VMRUN handler")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-16 09:48:27 -04:00
Philipp Zabel
72fc2752f9 drm/imx: ipuv3-plane: fix accidental partial revert of 8 pixel alignment fix
This fixes an accidental partial revert of commit 94dfec48fc
("drm/imx: Add 8 pixel alignment fix") during a rebase of
commit fc1e985b67 ("drm/imx: ipuv3-plane: add color encoding and range
properties").

Fixes: fc1e985b67 ("drm/imx: ipuv3-plane: add color encoding and range properties")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Link: https://lore.kernel.org/r/20210816131728.30987-1-p.zabel@pengutronix.de
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2021-08-16 15:18:31 +02:00
Krzysztof Hałasa
7cca7c8096 gpu: ipu-v3: Fix i.MX IPU-v3 offset calculations for (semi)planar U/V formats
Video captured in 1400x1050 resolution (bytesperline aka stride = 1408
bytes) is invalid. Fix it.

Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
Link: https://lore.kernel.org/r/m3y2bmq7a4.fsf@t19.piap.pl
[p.zabel@pengutronix.de: added "gpu: ipu-v3:" prefix to commit description]
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2021-08-16 15:05:22 +02:00
Dan Carpenter
4f3f2e3fa0 net: iosm: Prevent underflow in ipc_chnl_cfg_get()
The bounds check on "index" doesn't catch negative values.  Using
ARRAY_SIZE() directly is more readable and more robust because it prevents
negative values for "index".  Fortunately we only pass valid values to
ipc_chnl_cfg_get() so this patch does not affect runtime.

Reported-by: Solomon Ucko <solly.ucko@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 13:40:18 +01:00
Dan Moulding
958f442550 drm: ttm: Don't bail from ttm_global_init if debugfs_create_dir fails
In 69de4421bb ("drm/ttm: Initialize debugfs from
ttm_global_init()"), ttm_global_init was changed so that if creation
of the debugfs global root directory fails, ttm_global_init will bail
out early and return an error, leading to initialization failure of
DRM drivers. However, not every system will be using debugfs. On such
a system, debugfs directory creation can be expected to fail, but DRM
drivers must still be usable. This changes it so that if creation of
TTM's debugfs root directory fails, then no biggie: keep calm and
carry on.

Fixes: 69de4421bb ("drm/ttm: Initialize debugfs from ttm_global_init()")
Signed-off-by: Dan Moulding <dmoulding@me.com>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210810195906.22220-2-dmoulding@me.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-08-16 14:11:51 +02:00
NeilBrown
3f79f6f624 btrfs: prevent rename2 from exchanging a subvol with a directory from different parents
Cross-rename lacks a check when that would prevent exchanging a
directory and subvolume from different parent subvolume. This causes
data inconsistencies and is caught before commit by tree-checker,
turning the filesystem to read-only.

Calling the renameat2 with RENAME_EXCHANGE flags like

  renameat2(AT_FDCWD, namesrc, AT_FDCWD, namedest, (1 << 1))

on two paths:

  namesrc = dir1/subvol1/dir2
 namedest = subvol2/subvol3

will cause key order problem with following write time tree-checker
report:

  [1194842.307890] BTRFS critical (device loop1): corrupt leaf: root=5 block=27574272 slot=10 ino=258, invalid previous key objectid, have 257 expect 258
  [1194842.322221] BTRFS info (device loop1): leaf 27574272 gen 8 total ptrs 11 free space 15444 owner 5
  [1194842.331562] BTRFS info (device loop1): refs 2 lock_owner 0 current 26561
  [1194842.338772]        item 0 key (256 1 0) itemoff 16123 itemsize 160
  [1194842.338793]                inode generation 3 size 16 mode 40755
  [1194842.338801]        item 1 key (256 12 256) itemoff 16111 itemsize 12
  [1194842.338809]        item 2 key (256 84 2248503653) itemoff 16077 itemsize 34
  [1194842.338817]                dir oid 258 type 2
  [1194842.338823]        item 3 key (256 84 2363071922) itemoff 16043 itemsize 34
  [1194842.338830]                dir oid 257 type 2
  [1194842.338836]        item 4 key (256 96 2) itemoff 16009 itemsize 34
  [1194842.338843]        item 5 key (256 96 3) itemoff 15975 itemsize 34
  [1194842.338852]        item 6 key (257 1 0) itemoff 15815 itemsize 160
  [1194842.338863]                inode generation 6 size 8 mode 40755
  [1194842.338869]        item 7 key (257 12 256) itemoff 15801 itemsize 14
  [1194842.338876]        item 8 key (257 84 2505409169) itemoff 15767 itemsize 34
  [1194842.338883]                dir oid 256 type 2
  [1194842.338888]        item 9 key (257 96 2) itemoff 15733 itemsize 34
  [1194842.338895]        item 10 key (258 12 256) itemoff 15719 itemsize 14
  [1194842.339163] BTRFS error (device loop1): block=27574272 write time tree block corruption detected
  [1194842.339245] ------------[ cut here ]------------
  [1194842.443422] WARNING: CPU: 6 PID: 26561 at fs/btrfs/disk-io.c:449 csum_one_extent_buffer+0xed/0x100 [btrfs]
  [1194842.511863] CPU: 6 PID: 26561 Comm: kworker/u17:2 Not tainted 5.14.0-rc3-git+ #793
  [1194842.511870] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008
  [1194842.511876] Workqueue: btrfs-worker-high btrfs_work_helper [btrfs]
  [1194842.511976] RIP: 0010:csum_one_extent_buffer+0xed/0x100 [btrfs]
  [1194842.512068] RSP: 0018:ffffa2c284d77da0 EFLAGS: 00010282
  [1194842.512074] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffff928867bd9978
  [1194842.512078] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff928867bd9970
  [1194842.512081] RBP: ffff92876b958000 R08: 0000000000000001 R09: 00000000000c0003
  [1194842.512085] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
  [1194842.512088] R13: ffff92875f989f98 R14: 0000000000000000 R15: 0000000000000000
  [1194842.512092] FS:  0000000000000000(0000) GS:ffff928867a00000(0000) knlGS:0000000000000000
  [1194842.512095] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [1194842.512099] CR2: 000055f5384da1f0 CR3: 0000000102fe4000 CR4: 00000000000006e0
  [1194842.512103] Call Trace:
  [1194842.512128]  ? run_one_async_free+0x10/0x10 [btrfs]
  [1194842.631729]  btree_csum_one_bio+0x1ac/0x1d0 [btrfs]
  [1194842.631837]  run_one_async_start+0x18/0x30 [btrfs]
  [1194842.631938]  btrfs_work_helper+0xd5/0x1d0 [btrfs]
  [1194842.647482]  process_one_work+0x262/0x5e0
  [1194842.647520]  worker_thread+0x4c/0x320
  [1194842.655935]  ? process_one_work+0x5e0/0x5e0
  [1194842.655946]  kthread+0x135/0x160
  [1194842.655953]  ? set_kthread_struct+0x40/0x40
  [1194842.655965]  ret_from_fork+0x1f/0x30
  [1194842.672465] irq event stamp: 1729
  [1194842.672469] hardirqs last  enabled at (1735): [<ffffffffbd1104f5>] console_trylock_spinning+0x185/0x1a0
  [1194842.672477] hardirqs last disabled at (1740): [<ffffffffbd1104cc>] console_trylock_spinning+0x15c/0x1a0
  [1194842.672482] softirqs last  enabled at (1666): [<ffffffffbdc002e1>] __do_softirq+0x2e1/0x50a
  [1194842.672491] softirqs last disabled at (1651): [<ffffffffbd08aab7>] __irq_exit_rcu+0xa7/0xd0

The corrupted data will not be written, and filesystem can be unmounted
and mounted again (all changes since the last commit will be lost).

Add the missing check for new_ino so that all non-subvolumes must reside
under the same parent subvolume. There's an exception allowing to
exchange two subvolumes from any parents as the directory representing a
subvolume is only a logical link and does not have any other structures
related to the parent subvolume, unlike files, directories etc, that
are always in the inode namespace of the parent subvolume.

Fixes: cdd1fedf82 ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT")
CC: stable@vger.kernel.org # 4.7+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-16 13:33:23 +02:00
David S. Miller
517c54d282 Merge branch 'bnxt_en-fixes'
Michael Chan says:

====================
bnxt_en: 2 bug fixes

The first one disables aRFS/NTUPLE on an older broken firmware version.
The second one adds missing memory barriers related to completion ring
handling.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:31:41 +01:00
Michael Chan
828affc27e bnxt_en: Add missing DMA memory barriers
Each completion ring entry has a valid bit to indicate that the entry
contains a valid completion event.  The driver's main poll loop
__bnxt_poll_work() has the proper dma_rmb() to make sure the valid
bit of the next entry has been checked before proceeding further.
But when we call bnxt_rx_pkt() to process the RX event, the RX
completion event consists of two completion entries and only the
first entry has been checked to be valid.  We need the same barrier
after checking the next completion entry.  Add missing dma_rmb()
barriers in bnxt_rx_pkt() and other similar locations.

Fixes: 67a95e2022 ("bnxt_en: Need memory barrier when processing the completion ring.")
Reported-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:31:41 +01:00
Michael Chan
976e52b718 bnxt_en: Disable aRFS if running on 212 firmware
212 firmware broke aRFS, so disable it.  Traffic may stop after ntuple
filters are inserted and deleted by the 212 firmware.

Fixes: ae10ae740a ("bnxt_en: Add new hardware RFS mode.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:31:41 +01:00
Shai Malin
d33d19d313 qed: Fix null-pointer dereference in qed_rdma_create_qp()
Fix a possible null-pointer dereference in qed_rdma_create_qp().

Changes from V2:
- Revert checkpatch fixes.

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:23:32 +01:00
Shai Malin
37110237f3 qed: qed ll2 race condition fixes
Avoiding qed ll2 race condition and NULL pointer dereference as part
of the remove and recovery flows.

Changes form V1:
- Change (!p_rx->set_prod_addr).
- qed_ll2.c checkpatch fixes.

Change from V2:
- Revert "qed_ll2.c checkpatch fixes".

Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:22:57 +01:00
Xin Long
7387a72c5f tipc: call tipc_wait_for_connect only when dlen is not 0
__tipc_sendmsg() is called to send SYN packet by either tipc_sendmsg()
or tipc_connect(). The difference is in tipc_connect(), it will call
tipc_wait_for_connect() after __tipc_sendmsg() to wait until connecting
is done. So there's no need to wait in __tipc_sendmsg() for this case.

This patch is to fix it by calling tipc_wait_for_connect() only when dlen
is not 0 in __tipc_sendmsg(), which means it's called by tipc_connect().

Note this also fixes the failure in tipcutils/test/ptts/:

  # ./tipcTS &
  # ./tipcTC 9
  (hang)

Fixes: 36239dab6da7 ("tipc: fix implicit-connect for SYN+")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:20:56 +01:00
Nicolas Saenz Julienne
419dd626e3 mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711
The controller doesn't seem to pick-up on clock changes, so set the
SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN flag to query the clock frequency
directly from the clock.

Fixes: f84e411c85 ("mmc: sdhci-iproc: Add support for emmc2 of the BCM2711")
Signed-off-by: Nicolas Saenz Julienne <nsaenz@kernel.org>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1628334401-6577-6-git-send-email-stefan.wahren@i2se.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-16 12:12:05 +02:00
Andy Shevchenko
55c8fca1da ptp_pch: Restore dependency on PCI
During the swap dependency on PCH_GBE to selection PTP_1588_CLOCK_PCH
incidentally dropped the implicit dependency on the PCI. Restore it.

Fixes: 18d359ceb0 ("pch_gbe, ptp_pch: Fix the dependency direction between these drivers")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:11:06 +01:00
Nicolas Saenz Julienne
c9107dd0b8 mmc: sdhci-iproc: Cap min clock frequency on BCM2711
There is a known bug on BCM2711's SDHCI core integration where the
controller will hang when the difference between the core clock and the
bus clock is too great. Specifically this can be reproduced under the
following conditions:

- No SD card plugged in, polling thread is running, probing cards at
  100 kHz.
- BCM2711's core clock configured at 500MHz or more.

So set 200 kHz as the minimum clock frequency available for that board.

For more information on the issue see this:
https://lore.kernel.org/linux-mmc/20210322185816.27582-1-nsaenz@kernel.org/T/#m11f2783a09b581da6b8a15f302625b43a6ecdeca

Fixes: f84e411c85 ("mmc: sdhci-iproc: Add support for emmc2 of the BCM2711")
Signed-off-by: Nicolas Saenz Julienne <nsaenz@kernel.org>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1628334401-6577-5-git-send-email-stefan.wahren@i2se.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-16 12:09:11 +02:00
Pavel Skripkin
19d1532a18 net: 6pack: fix slab-out-of-bounds in decode_data
Syzbot reported slab-out-of bounds write in decode_data().
The problem was in missing validation checks.

Syzbot's reproducer generated malicious input, which caused
decode_data() to be called a lot in sixpack_decode(). Since
rx_count_cooked is only 400 bytes and noone reported before,
that 400 bytes is not enough, let's just check if input is malicious
and complain about buffer overrun.

Fail log:
==================================================================
BUG: KASAN: slab-out-of-bounds in drivers/net/hamradio/6pack.c:843
Write of size 1 at addr ffff888087c5544e by task kworker/u4:0/7

CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.6.0-rc3-syzkaller #0
...
Workqueue: events_unbound flush_to_ldisc
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
 __kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506
 kasan_report+0x12/0x20 mm/kasan/common.c:641
 __asan_report_store1_noabort+0x17/0x20 mm/kasan/generic_report.c:137
 decode_data.part.0+0x23b/0x270 drivers/net/hamradio/6pack.c:843
 decode_data drivers/net/hamradio/6pack.c:965 [inline]
 sixpack_decode drivers/net/hamradio/6pack.c:968 [inline]

Reported-and-tested-by: syzbot+fc8cd9a673d4577fb2e4@syzkaller.appspotmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:08:05 +01:00
Dmitry Osipenko
c3ddfe66d2 opp: Drop empty-table checks from _put functions
The current_opp is released only when whole OPP table is released,
otherwise it's only marked as removed by dev_pm_opp_remove_table().
Functions like dev_pm_opp_put_clkname() and dev_pm_opp_put_supported_hw()
are checking whether OPP table is empty and it's not if current_opp is
set since it holds the refcount of OPP, this produces a noisy warning
from these functions about busy OPP table. Remove the checks to fix it.

Cc: stable@vger.kernel.org
Fixes: 81c4d8a3c4 ("opp: Keep track of currently programmed OPP")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-08-16 09:42:08 +05:30
Linus Torvalds
7c60610d47 Linux 5.14-rc6 2021-08-15 13:40:53 -10:00
Linus Torvalds
ecf9343196 Merge tag 'powerpc-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - Fix crashes coming out of nap on 32-bit Book3s (eg. powerbooks).

 - Fix critical and debug interrupts on BookE, seen as crashes when
   using ptrace.

 - Fix an oops when running an SMP kernel on a UP system.

 - Update pseries LPAR security flavor after partition migration.

 - Fix an oops when using kprobes on BookE.

 - Fix oops on 32-bit pmac by not calling do_IRQ() from
   timer_interrupt().

 - Fix softlockups on CPU hotplug into a CPU-less node with xive (P9).

Thanks to Cédric Le Goater, Christophe Leroy, Finn Thain, Geetika
Moolchandani, Laurent Dufour, Laurent Vivier, Nicholas Piggin, Pu Lehui,
Radu Rendec, Srikar Dronamraju, and Stan Johnson.

* tag 'powerpc-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/xive: Do not skip CPU-less nodes when creating the IPIs
  powerpc/interrupt: Do not call single_step_exception() from other exceptions
  powerpc/interrupt: Fix OOPS by not calling do_IRQ() from timer_interrupt()
  powerpc/kprobes: Fix kprobe Oops happens in booke
  powerpc/pseries: Fix update of LPAR security flavor after LPM
  powerpc/smp: Fix OOPS in topology_init()
  powerpc/32: Fix critical and debug interrupts on BOOKE
  powerpc/32s: Fix napping restore in data storage interrupt (DSI)
2021-08-15 06:57:43 -10:00
Linus Torvalds
c4f14eac22 Merge tag 'irq-urgent-2021-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A set of fixes for PCI/MSI and x86 interrupt startup:

   - Mask all MSI-X entries when enabling MSI-X otherwise stale unmasked
     entries stay around e.g. when a crashkernel is booted.

   - Enforce masking of a MSI-X table entry when updating it, which
     mandatory according to speification

   - Ensure that writes to MSI[-X} tables are flushed.

   - Prevent invalid bits being set in the MSI mask register

   - Properly serialize modifications to the mask cache and the mask
     register for multi-MSI.

   - Cure the violation of the affinity setting rules on X86 during
     interrupt startup which can cause lost and stale interrupts. Move
     the initial affinity setting ahead of actualy enabling the
     interrupt.

   - Ensure that MSI interrupts are completely torn down before freeing
     them in the error handling case.

   - Prevent an array out of bounds access in the irq timings code"

* tag 'irq-urgent-2021-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  driver core: Add missing kernel doc for device::msi_lock
  genirq/msi: Ensure deactivation on teardown
  genirq/timings: Prevent potential array overflow in __irq_timings_store()
  x86/msi: Force affinity setup before startup
  x86/ioapic: Force affinity setup before startup
  genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP
  PCI/MSI: Protect msi_desc::masked for multi-MSI
  PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown()
  PCI/MSI: Correct misleading comments
  PCI/MSI: Do not set invalid bits in MSI mask
  PCI/MSI: Enforce MSI[X] entry updates to be visible
  PCI/MSI: Enforce that MSI-X table entry is masked for update
  PCI/MSI: Mask all unused MSI-X entries
  PCI/MSI: Enable and mask MSI-X early
2021-08-15 06:49:40 -10:00
Linus Torvalds
839da25385 Merge tag 'locking_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:

 - Fix a CONFIG symbol's spelling

* tag 'locking_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Use the correct rtmutex debugging config option
2021-08-15 06:46:04 -10:00
Linus Torvalds
12aef8acf0 Merge tag 'efi_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Borislav Petkov:
 "A batch of fixes for the arm64 stub image loader:

   - fix a logic bug that can make the random page allocator fail
     spuriously

   - force reallocation of the Image when it overlaps with firmware
     reserved memory regions

   - fix an oversight that defeated on optimization introduced earlier
     where images loaded at a suitable offset are never moved if booting
     without randomization

   - complain about images that were not loaded at the right offset by
     the firmware image loader"

* tag 'efi_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/libstub: arm64: Double check image alignment at entry
  efi/libstub: arm64: Warn when efi_random_alloc() fails
  efi/libstub: arm64: Relax 2M alignment again for relocatable kernels
  efi/libstub: arm64: Force Image reallocation if BSS was not reserved
  arm64: efi: kaslr: Fix occasional random alloc (and boot) failure
2021-08-15 06:38:26 -10:00
Linus Torvalds
b045b8cc86 Merge tag 'x86_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
 "Two fixes:

   - An objdump checker fix to ignore parenthesized strings in the
     objdump version

   - Fix resctrl default monitoring groups reporting when new subgroups
     get created"

* tag 'x86_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Fix default monitoring groups reporting
  x86/tools: Fix objdump version check again
2021-08-15 06:30:24 -10:00
Linus Torvalds
3e763ec791 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "ARM:

   - Plug race between enabling MTE and creating vcpus

   - Fix off-by-one bug when checking whether an address range is RAM

  x86:

   - Fixes for the new MMU, especially a memory leak on hosts with <39
     physical address bits

   - Remove bogus EFER.NX checks on 32-bit non-PAE hosts

   - WAITPKG fix"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
  KVM: x86/mmu: Don't step down in the TDP iterator when zapping all SPTEs
  KVM: x86/mmu: Don't leak non-leaf SPTEs when zapping all SPTEs
  KVM: nVMX: Use vmx_need_pf_intercept() when deciding if L0 wants a #PF
  kvm: vmx: Sync all matching EPTPs when injecting nested EPT fault
  KVM: x86: remove dead initialization
  KVM: x86: Allow guest to set EFER.NX=1 on non-PAE 32-bit kernels
  KVM: VMX: Use current VMCS to query WAITPKG support for MSR emulation
  KVM: arm64: Fix race when enabling KVM_ARM_CAP_MTE
  KVM: arm64: Fix off-by-one in range_is_memory
2021-08-15 06:21:30 -10:00
Greg Kroah-Hartman
d30836a952 Merge tag 'icc-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-linus
Georgi writes:

interconnect fix for v5.14

This contains a revert for a patch that has been causing issues:
- Revert: qcom: rpmh: Add BCMs to commit list in pre_aggregate

Signed-off-by: Georgi Djakov <djakov@kernel.org>

* tag 'icc-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc:
  Revert "interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate"
2021-08-15 11:21:02 +02:00
Kristin Paget
da94692001 ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15 9510 laptop
The 2021-model XPS 15 appears to use the same 4-speakers-on-ALC289 audio
setup as the Precision models, so requires the same quirk to enable woofer
output. Tested on my own 9510.

Signed-off-by: Kristin Paget <kristin@tombom.co.uk>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/e1fc95c5-c10a-1f98-a5c2-dd6e336157e1@tombom.co.uk
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-15 09:27:37 +02:00
Linus Torvalds
0aa78d1709 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Three minor fixes, all in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: mpt3sas: Fix incorrectly assigned error return and check
  scsi: storvsc: Log TEST_UNIT_READY errors as warnings
  scsi: lpfc: Move initialization of phba->poll_list earlier to avoid crash
2021-08-14 19:51:58 -10:00
Linus Torvalds
7ba34c0cba Merge tag 'libnvdimm-fixes-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
 "A couple of fixes for long standing bugs, a warning fixup, and some
  miscellaneous dax cleanups.

  The bugs were recently found due to new platforms looking to use the
  ACPI NFIT "virtual" device definition, and new error injection
  capabilities to trigger error responses to label area requests. Ira's
  cleanups have been long pending, I neglected to send them earlier, and
  see no harm in including them now. This has all appeared in -next with
  no reported issues.

  Summary:

   - Fix support for NFIT "virtual" ranges (BIOS-defined memory disks)

   - Fix recovery from failed label storage areas on NVDIMM devices

   - Miscellaneous cleanups from Ira's investigation of
     dax_direct_access paths preparing for stray-write protection"

* tag 'libnvdimm-fixes-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  tools/testing/nvdimm: Fix missing 'fallthrough' warning
  libnvdimm/region: Fix label activation vs errors
  ACPI: NFIT: Fix support for virtual SPA ranges
  dax: Ensure errno is returned from dax_direct_access
  fs/dax: Clarify nr_pages to dax_direct_access()
  fs/fuse: Remove unneeded kaddr parameter
2021-08-14 19:46:39 -10:00
Linus Torvalds
12f41321ce Merge tag 'usb-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fix from Greg KH:
 "A single revert of a commit that caused problems in 5.14-rc5 for
  5.14-rc6. It has been in linux-next almost all week, and has resolved
  the issues that were reported on lots of different systems that were
  not the platform that the change was originally tested on (gotta love
  SoC cores used in multiple devices from multiple vendors...)"

* tag 'usb-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  Revert "usb: dwc3: gadget: Use list_replace_init() before traversing lists"
2021-08-14 19:22:33 -10:00
Linus Torvalds
56aee57345 Merge tag 'staging-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull IIO driver fixes from Greg KH:
 "Here are some small IIO driver fixes for reported problems for
  5.14-rc6 (no staging driver fixes at the moment).

  All of them resolve reported issues and have been in linux-next all
  week with no reported problems. Full details are in the shortlog"

* tag 'staging-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: adc: Fix incorrect exit of for-loop
  iio: humidity: hdc100x: Add margin to the conversion time
  dt-bindings: iio: st: Remove wrong items length check
  iio: accel: fxls8962af: fix i2c dependency
  iio: adis: set GPIO reset pin direction
  iio: adc: ti-ads7950: Ensure CS is deasserted after reading channels
  iio: accel: fxls8962af: fix potential use of uninitialized symbol
2021-08-14 19:16:30 -10:00
Linus Torvalds
76c9e465dd Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "One driver bugfix, a documentation bugfix, and an "uninitialized data"
  leak fix for the core"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  Documentation: i2c: add i2c-sysfs into index
  i2c: dev: zero out array used for i2c reads from userspace
  i2c: iproc: fix race between client unreg and tasklet
2021-08-14 18:59:53 -10:00
Jens Axboe
21f965221e io_uring: only assign io_uring_enter() SQPOLL error in actual error case
If an SQPOLL based ring is newly created and an application issues an
io_uring_enter(2) system call on it, then we can return a spurious
-EOWNERDEAD error. This happens because there's nothing to submit, and
if the caller doesn't specify any other action, the initial error
assignment of -EOWNERDEAD never gets overwritten. This causes us to
return it directly, even if it isn't valid.

Move the error assignment into the actual failure case instead.

Cc: stable@vger.kernel.org
Fixes: d9d05217cb ("io_uring: stop SQPOLL submit on creator's death")
Reported-by: Sherlock Holo sherlockya@gmail.com
Link: https://github.com/axboe/liburing/issues/413
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-14 12:38:21 -06:00
Linus Torvalds
ba31f97d43 Merge tag 'for-linus-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
 "A small cleanup patch and a fix of a rare race in the Xen evtchn
  driver"

* tag 'for-linus-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/events: Fix race in set_evtchn_to_irq
  xen/events: remove redundant initialization of variable irq
2021-08-14 06:31:22 -10:00
Linus Torvalds
a7a4f1c0c8 Merge tag 'riscv-for-linus-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - avoid passing -mno-relax to compilers that don't support it

 - a comment fix

* tag 'riscv-for-linus-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix comment regarding kernel mapping overlapping with IS_ERR_VALUE
  riscv: kexec: do not add '-mno-relax' flag if compiler doesn't support it
2021-08-14 06:28:19 -10:00
Linus Torvalds
118516e212 Merge tag 'configfs-5.14' of git://git.infradead.org/users/hch/configfs
Pull configfs fix from Christoph Hellwig:

 - fix to revert to the historic write behavior (Bart Van Assche)

* tag 'configfs-5.14' of git://git.infradead.org/users/hch/configfs:
  configfs: restore the kernel v5.13 text attribute write behavior
2021-08-14 06:22:42 -10:00
Linus Torvalds
dfa377c35d Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "7 patches.

  Subsystems affected by this patch series: mm (kasan, mm/slub,
  mm/madvise, and memcg), and lib"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  lib: use PFN_PHYS() in devmem_is_allowed()
  mm/memcg: fix incorrect flushing of lruvec data in obj_stock
  mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)
  mm: slub: fix slub_debug disabling for list of slabs
  slub: fix kmalloc_pagealloc_invalid_free unit test
  kasan, slub: reset tag when printing address
  kasan, kmemleak: reset tags when scanning block
2021-08-13 15:05:23 -10:00
Linus Torvalds
27b2eaa118 Merge tag '5.14-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "Four CIFS/SMB3 Fixes, all for stable, two relating to deferred close,
  and one for the 'modefromsid' mount option (when 'idsfromsid' not
  specified)"

* tag '5.14-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Call close synchronously during unlink/rename/lease break.
  cifs: Handle race conditions during rename
  cifs: use the correct max-length for dentry_path_raw()
  cifs: create sd context must be a multiple of 8
2021-08-13 14:44:32 -10:00
Linus Torvalds
a83ed22577 Merge tag 'linux-kselftest-fixes-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fix from Shuah Khan:
 "A single patch to sgx test to fix Q1 and Q2 calculation"

* tag 'linux-kselftest-fixes-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/sgx: Fix Q1 and Q2 calculation in sigstruct.c
2021-08-13 14:32:38 -10:00
Maciej Machnikowski
5f77351963 ice: Fix perout start time rounding
Internal tests found out that the latest code doesn't bring up 1PPS out
as expected. As a result of incorrect define used to round the time up
the time was round down to the past second boundary.

Fix define used for rounding to properly round up to the next Top of
second in ice_ptp_cfg_clkout to fix it.

Fixes: 172db5f91d ("ice: add support for auxiliary input/output pins")
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20210813165018.2196013-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13 17:22:53 -07:00
Liang Wang
854f32648b lib: use PFN_PHYS() in devmem_is_allowed()
The physical address may exceed 32 bits on 32-bit systems with more than
32 bits of physcial address.  Use PFN_PHYS() in devmem_is_allowed(), or
the physical address may overflow and be truncated.

We found this bug when mapping a high addresses through devmem tool,
when CONFIG_STRICT_DEVMEM is enabled on the ARM with ARM_LPAE and devmem
is used to map a high address that is not in the iomem address range, an
unexpected error indicating no permission is returned.

This bug was initially introduced from v2.6.37, and the function was
moved to lib in v5.11.

Link: https://lkml.kernel.org/r/20210731025057.78825-1-wangliang101@huawei.com
Fixes: 087aaffcdf ("ARM: implement CONFIG_STRICT_DEVMEM by disabling access to RAM via /dev/mem")
Fixes: 527701eda5 ("lib: Add a generic version of devmem_is_allowed()")
Signed-off-by: Liang Wang <wangliang101@huawei.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Liang Wang <wangliang101@huawei.com>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: <stable@vger.kernel.org>	[2.6.37+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-13 14:09:32 -10:00
Waiman Long
7fa0dacbaf mm/memcg: fix incorrect flushing of lruvec data in obj_stock
When mod_objcg_state() is called with a pgdat that is different from
that in the obj_stock, the old lruvec data cached in obj_stock are
flushed out.  Unfortunately, they were flushed to the new pgdat and so
the data go to the wrong node.  This will screw up the slab data
reported in /sys/devices/system/node/node*/meminfo.

Fix that by flushing the data to the cached pgdat instead.

Link: https://lkml.kernel.org/r/20210802143834.30578-1-longman@redhat.com
Fixes: 68ac5b3c8d ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp")
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-13 14:09:32 -10:00
David Hildenbrand
eb2faa513c mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)
Doing some extended tests and polishing the man page update for
MADV_POPULATE_(READ|WRITE), I realized that we end up converting also
SIGBUS (via -EFAULT) to -EINVAL, making it look like yet another
madvise() user error.

We want to report only problematic mappings and permission problems that
the user could have know as -EINVAL.

Let's not convert -EFAULT arising due to SIGBUS (or SIGSEGV) to -EINVAL,
but instead indicate -EFAULT to user space.  While we could also convert
it to -ENOMEM, using -EFAULT looks more helpful when user space might
want to troubleshoot what's going wrong: MADV_POPULATE_(READ|WRITE) is
not part of an final Linux release and we can still adjust the behavior.

Link: https://lkml.kernel.org/r/20210726154932.102880-1-david@redhat.com
Fixes: 4ca9b3859d ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables")
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-13 14:09:31 -10:00
Vlastimil Babka
a7f1d48585 mm: slub: fix slub_debug disabling for list of slabs
Vijayanand Jitta reports:

  Consider the scenario where CONFIG_SLUB_DEBUG_ON is set and we would
  want to disable slub_debug for few slabs. Using boot parameter with
  slub_debug=-,slab_name syntax doesn't work as expected i.e; only
  disabling debugging for the specified list of slabs. Instead it
  disables debugging for all slabs, which is wrong.

This patch fixes it by delaying the moment when the global slub_debug
flags variable is updated.  In case a "slub_debug=-,slab_name" has been
passed, the global flags remain as initialized (depending on
CONFIG_SLUB_DEBUG_ON enabled or disabled) and are not simply reset to 0.

Link: https://lkml.kernel.org/r/8a3d992a-473a-467b-28a0-4ad2ff60ab82@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Vijayanand Jitta <vjitta@codeaurora.org>
Reviewed-by: Vijayanand Jitta <vjitta@codeaurora.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-13 14:09:31 -10:00
Shakeel Butt
1ed7ce574c slub: fix kmalloc_pagealloc_invalid_free unit test
The unit test kmalloc_pagealloc_invalid_free makes sure that for the
higher order slub allocation which goes to page allocator, the free is
called with the correct address i.e.  the virtual address of the head
page.

Commit f227f0faf6 ("slub: fix unreclaimable slab stat for bulk free")
unified the free code paths for page allocator based slub allocations
but instead of using the address passed by the caller, it extracted the
address from the page.  Thus making the unit test
kmalloc_pagealloc_invalid_free moot.  So, fix this by using the address
passed by the caller.

Should we fix this? I think yes because dev expect kasan to catch these
type of programming bugs.

Link: https://lkml.kernel.org/r/20210802180819.1110165-1-shakeelb@google.com
Fixes: f227f0faf6 ("slub: fix unreclaimable slab stat for bulk free")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-13 14:09:31 -10:00
Kuan-Ying Lee
340caf178d kasan, slub: reset tag when printing address
The address still includes the tags when it is printed.  With hardware
tag-based kasan enabled, we will get a false positive KASAN issue when
we access metadata.

Reset the tag before we access the metadata.

Link: https://lkml.kernel.org/r/20210804090957.12393-3-Kuan-Ying.Lee@mediatek.com
Fixes: aa1ef4d7b3 ("kasan, mm: reset tags when accessing metadata")
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-13 14:09:31 -10:00
Kuan-Ying Lee
6c7a00b843 kasan, kmemleak: reset tags when scanning block
Patch series "kasan, slub: reset tag when printing address", v3.

With hardware tag-based kasan enabled, we reset the tag when we access
metadata to avoid from false alarm.

This patch (of 2):

Kmemleak needs to scan kernel memory to check memory leak.  With hardware
tag-based kasan enabled, when it scans on the invalid slab and
dereference, the issue will occur as below.

Hardware tag-based KASAN doesn't use compiler instrumentation, we can not
use kasan_disable_current() to ignore tag check.

Based on the below report, there are 11 0xf7 granules, which amounts to
176 bytes, and the object is allocated from the kmalloc-256 cache.  So
when kmemleak accesses the last 256-176 bytes, it causes faults, as those
are marked with KASAN_KMALLOC_REDZONE == KASAN_TAG_INVALID == 0xfe.

Thus, we reset tags before accessing metadata to avoid from false positives.

  BUG: KASAN: out-of-bounds in scan_block+0x58/0x170
  Read at addr f7ff0000c0074eb0 by task kmemleak/138
  Pointer tag: [f7], memory tag: [fe]

  CPU: 7 PID: 138 Comm: kmemleak Not tainted 5.14.0-rc2-00001-g8cae8cd89f05-dirty #134
  Hardware name: linux,dummy-virt (DT)
  Call trace:
   dump_backtrace+0x0/0x1b0
   show_stack+0x1c/0x30
   dump_stack_lvl+0x68/0x84
   print_address_description+0x7c/0x2b4
   kasan_report+0x138/0x38c
   __do_kernel_fault+0x190/0x1c4
   do_tag_check_fault+0x78/0x90
   do_mem_abort+0x44/0xb4
   el1_abort+0x40/0x60
   el1h_64_sync_handler+0xb4/0xd0
   el1h_64_sync+0x78/0x7c
   scan_block+0x58/0x170
   scan_gray_list+0xdc/0x1a0
   kmemleak_scan+0x2ac/0x560
   kmemleak_scan_thread+0xb0/0xe0
   kthread+0x154/0x160
   ret_from_fork+0x10/0x18

  Allocated by task 0:
   kasan_save_stack+0x2c/0x60
   __kasan_kmalloc+0xec/0x104
   __kmalloc+0x224/0x3c4
   __register_sysctl_paths+0x200/0x290
   register_sysctl_table+0x2c/0x40
   sysctl_init+0x20/0x34
   proc_sys_init+0x3c/0x48
   proc_root_init+0x80/0x9c
   start_kernel+0x648/0x6a4
   __primary_switched+0xc0/0xc8

  Freed by task 0:
   kasan_save_stack+0x2c/0x60
   kasan_set_track+0x2c/0x40
   kasan_set_free_info+0x44/0x54
   ____kasan_slab_free.constprop.0+0x150/0x1b0
   __kasan_slab_free+0x14/0x20
   slab_free_freelist_hook+0xa4/0x1fc
   kfree+0x1e8/0x30c
   put_fs_context+0x124/0x220
   vfs_kern_mount.part.0+0x60/0xd4
   kern_mount+0x24/0x4c
   bdev_cache_init+0x70/0x9c
   vfs_caches_init+0xdc/0xf4
   start_kernel+0x638/0x6a4
   __primary_switched+0xc0/0xc8

  The buggy address belongs to the object at ffff0000c0074e00
   which belongs to the cache kmalloc-256 of size 256
  The buggy address is located 176 bytes inside of
   256-byte region [ffff0000c0074e00, ffff0000c0074f00)
  The buggy address belongs to the page:
  page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100074
  head:(____ptrval____) order:2 compound_mapcount:0 compound_pincount:0
  flags: 0xbfffc0000010200(slab|head|node=0|zone=2|lastcpupid=0xffff|kasantag=0x0)
  raw: 0bfffc0000010200 0000000000000000 dead000000000122 f5ff0000c0002300
  raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff0000c0074c00: f0 f0 f0 f0 f0 f0 f0 f0 f0 fe fe fe fe fe fe fe
   ffff0000c0074d00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
  >ffff0000c0074e00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 fe fe fe fe fe
                                                      ^
   ffff0000c0074f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
   ffff0000c0075000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ==================================================================
  Disabling lock debugging due to kernel taint
  kmemleak: 181 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

Link: https://lkml.kernel.org/r/20210804090957.12393-1-Kuan-Ying.Lee@mediatek.com
Link: https://lkml.kernel.org/r/20210804090957.12393-2-Kuan-Ying.Lee@mediatek.com
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Marco Elver <elver@google.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-13 14:09:31 -10:00
Linus Torvalds
020efdadd8 Merge tag 'block-5.14-2021-08-13' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A few fixes for block that should go into 5.14:

   - Revert the mq-deadline cgroup addition. More work is needed on this
     front, let's revert it for now and get it right before having it in
     a released kernel (Tejun)

   - blk-iocost lockdep fix (Ming)

   - nbd double completion fix (Xie)

   - Fix for non-idling when clearing the shared tag flag (Yu)"

* tag 'block-5.14-2021-08-13' of git://git.kernel.dk/linux-block:
  nbd: Aovid double completion of a request
  blk-mq: clear active_queues before clearing BLK_MQ_F_TAG_QUEUE_SHARED
  Revert "block/mq-deadline: Add cgroup support"
  blk-iocost: fix lockdep warning on blkcg->lock
2021-08-13 13:36:42 -10:00
Linus Torvalds
42995cee61 Merge tag 'io_uring-5.14-2021-08-13' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "A bit bigger than the previous weeks, but mostly just a few stable
  bound fixes. In detail:

   - Followup fixes to patches from last week for io-wq, turns out they
     weren't complete (Hao)

   - Two lockdep reported fixes out of the RT camp (me)

   - Sync the io_uring-cp example with liburing, as a few bug fixes
     never made it to the kernel carried version (me)

   - SQPOLL related TIF_NOTIFY_SIGNAL fix (Nadav)

   - Use WRITE_ONCE() when writing sq flags (Nadav)

   - io_rsrc_put_work() deadlock fix (Pavel)"

* tag 'io_uring-5.14-2021-08-13' of git://git.kernel.dk/linux-block:
  tools/io_uring/io_uring-cp: sync with liburing example
  io_uring: fix ctx-exit io_rsrc_put_work() deadlock
  io_uring: drop ctx->uring_lock before flushing work item
  io-wq: fix IO_WORKER_F_FIXED issue in create_io_worker()
  io-wq: fix bug of creating io-wokers unconditionally
  io_uring: rsrc ref lock needs to be IRQ safe
  io_uring: Use WRITE_ONCE() when writing to sq_flags
  io_uring: clear TIF_NOTIFY_SIGNAL when running task work
2021-08-13 13:25:08 -10:00
Linus Torvalds
462938cd48 Merge tag 'pinctrl-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
 "An assortment of pin control fixes of varying importance, the most
  important ones affecting Intel and AMD laptops turned up the recent
  few days so it's time to push this to your tree.

   - Fix the Kconfig dependency for Qualcomm SM8350 pin controller

   - Fix pin biasing fallback behaviour on the Mediatek pin controller

   - Fix the GPIO numbering scheme for Intel Tiger Lake-H to correspond
     to the products that are now actually out on the market

   - Fix a pin control function itemization in the Sunxi driver
     out-of-bounds access bug

   - Fix disable clocking for the RISC-V K210 pin controller on the
     errorpath

   - Fix a system shutdown bug affecting AMD Ryzen-based laptops, the
     system would not suspend but just bounce back up"

* tag 'pinctrl-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: amd: Fix an issue with shutdown when system set to s0ix
  pinctrl: k210: Fix k210_fpioa_probe()
  pinctrl: sunxi: Don't underestimate number of functions
  pinctrl: tigerlake: Fix GPIO mapping for newer version of software
  pinctrl: mediatek: Fix fallback behavior for bias_set_combo
  pinctrl: qcom: fix GPIOLIB dependencies
2021-08-13 12:41:45 -10:00
Maxim Kochetkov
c1e64c0aec soc: fsl: qe: fix static checker warning
The patch be7ecbd240: "soc: fsl: qe: convert QE interrupt
controller to platform_device" from Aug 3, 2021, leads to the
following static checker warning:

	drivers/soc/fsl/qe/qe_ic.c:438 qe_ic_init()
	warn: unsigned 'qe_ic->virq_low' is never less than zero.

In old variant irq_of_parse_and_map() returns zero if failed so
unsigned int for virq_high/virq_low was ok.
In new variant platform_get_irq() returns negative error codes
if failed so we need to use int for virq_high/virq_low.

Also simplify high_handler checking and remove the curly braces
to make checkpatch happy.

Fixes: be7ecbd240 ("soc: fsl: qe: convert QE interrupt controller to platform_device")
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
2021-08-13 16:56:10 -05:00
Jakub Kicinski
9d5e6a7076 Merge branch 'bnxt-tx-napi-disabling-resiliency-improvements'
Jakub Kicinski says:

====================
bnxt: Tx NAPI disabling resiliency improvements

A lockdep warning was triggered by netpoll because napi poll
was taking the xmit lock. Fix that and a couple more issues
noticed while reading the code.
====================

Link: https://lore.kernel.org/r/20210812214242.578039-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13 10:26:20 -07:00
Jakub Kicinski
fb9f719009 bnxt: count Tx drops
Drivers should count packets they are dropping.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13 10:26:17 -07:00
Jakub Kicinski
e8d8c5d80f bnxt: make sure xmit_more + errors does not miss doorbells
skbs are freed on error and not put on the ring. We may, however,
be in a situation where we're freeing the last skb of a batch,
and there is a doorbell ring pending because of xmit_more() being
true earlier. Make sure we ring the door bell in such situations.

Since errors are rare don't pay attention to xmit_more() and just
always flush the pending frames.

The busy case should be safe to be left alone because it can
only happen if start_xmit races with completions and they
both enable the queue. In that case the kick can't be pending.

Noticed while reading the code.

Fixes: 4d172f21ce ("bnxt_en: Implement xmit_more.")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13 10:26:17 -07:00
Jakub Kicinski
01cca6b933 bnxt: disable napi before canceling DIM
napi schedules DIM, napi has to be disabled first,
then DIM canceled.

Noticed while reading the code.

Fixes: 0bc0b97fca ("bnxt_en: cleanup DIM work on device shutdown")
Fixes: 6a8788f256 ("bnxt_en: add support for software dynamic interrupt moderation")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13 10:26:17 -07:00
Jakub Kicinski
3c603136c9 bnxt: don't lock the tx queue from napi poll
We can't take the tx lock from the napi poll routine, because
netpoll can poll napi at any moment, including with the tx lock
already held.

The tx lock is protecting against two paths - the disable
path, and (as Michael points out) the NETDEV_TX_BUSY case
which may occur if NAPI completions race with start_xmit
and both decide to re-enable the queue.

For the disable/ifdown path use synchronize_net() to make sure
closing the device does not race we restarting the queues.
Annotate accesses to dev_state against data races.

For the NAPI cleanup vs start_xmit path - appropriate barriers
are already in place in the main spot where Tx queue is stopped
but we need to do the same careful dance in the TX_BUSY case.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13 10:26:17 -07:00
Xie Yongji
cddce01160 nbd: Aovid double completion of a request
There is a race between iterating over requests in
nbd_clear_que() and completing requests in recv_work(),
which can lead to double completion of a request.

To fix it, flush the recv worker before iterating over
the requests and don't abort the completed request
while iterating.

Fixes: 96d97e1782 ("nbd: clear_sock on netlink disconnect")
Reported-by: Jiang Yadong <jiangyadong@bytedance.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20210813151330.96-1-xieyongji@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-13 09:46:48 -06:00
Ilya Leoshkevich
3776f3517e selftests, bpf: Test that dead ldx_w insns are accepted
Prevent regressions related to zero-extension metadata handling during
dead code sanitization.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210812151811.184086-3-iii@linux.ibm.com
2021-08-13 17:46:26 +02:00
Ilya Leoshkevich
45c709f8c7 bpf: Clear zext_dst of dead insns
"access skb fields ok" verifier test fails on s390 with the "verifier
bug. zext_dst is set, but no reg is defined" message. The first insns
of the test prog are ...

   0:	61 01 00 00 00 00 00 00 	ldxw %r0,[%r1+0]
   8:	35 00 00 01 00 00 00 00 	jge %r0,0,1
  10:	61 01 00 08 00 00 00 00 	ldxw %r0,[%r1+8]

... and the 3rd one is dead (this does not look intentional to me, but
this is a separate topic).

sanitize_dead_code() converts dead insns into "ja -1", but keeps
zext_dst. When opt_subreg_zext_lo32_rnd_hi32() tries to parse such
an insn, it sees this discrepancy and bails. This problem can be seen
only with JITs whose bpf_jit_needs_zext() returns true.

Fix by clearning dead insns' zext_dst.

The commits that contributed to this problem are:

1. 5aa5bd14c5 ("bpf: add initial suite for selftests"), which
   introduced the test with the dead code.
2. 5327ed3d44 ("bpf: verifier: mark verified-insn with
   sub-register zext flag"), which introduced the zext_dst flag.
3. 83a2881903 ("bpf: Account for BPF_FETCH in
   insn_has_def32()"), which introduced the sanity check.
4. 9183671af6 ("bpf: Fix leakage under speculation on
   mispredicted branches"), which bisect points to.

It's best to fix this on stable branches that contain the second one,
since that's the point where the inconsistency was introduced.

Fixes: 5327ed3d44 ("bpf: verifier: mark verified-insn with sub-register zext flag")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210812151811.184086-2-iii@linux.ibm.com
2021-08-13 17:43:43 +02:00
Jens Axboe
8f40d03707 tools/io_uring/io_uring-cp: sync with liburing example
This example is missing a few fixes that are in the liburing version,
synchronize with the upstream version.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-13 08:58:11 -06:00
Yu Kuai
454bb67752 blk-mq: clear active_queues before clearing BLK_MQ_F_TAG_QUEUE_SHARED
We run a test that delete and recover devcies frequently(two devices on
the same host), and we found that 'active_queues' is super big after a
period of time.

If device a and device b share a tag set, and a is deleted, then
blk_mq_exit_queue() will clear BLK_MQ_F_TAG_QUEUE_SHARED because there
is only one queue that are using the tag set. However, if b is still
active, the active_queues of b might never be cleared even if b is
deleted.

Thus clear active_queues before BLK_MQ_F_TAG_QUEUE_SHARED is cleared.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210731062130.1533893-1-yukuai3@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-13 08:01:34 -06:00
Thomas Gleixner
7a3dc4f35b driver core: Add missing kernel doc for device::msi_lock
Fixes: 77e89afc25 ("PCI/MSI: Protect msi_desc::masked for multi-MSI")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2021-08-13 12:38:48 +02:00
Dongliang Mu
50f05bd114 ipack: tpci200: fix memory leak in the tpci200_register
The error handling code in tpci200_register does not free interface_regs
allocated by ioremap and the current version of error handling code is
problematic.

Fix this by refactoring the error handling code and free interface_regs
when necessary.

Fixes: 43986798fd ("ipack: add error handling for ioremap_nocache")
Cc: stable@vger.kernel.org
Reported-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Link: https://lore.kernel.org/r/20210810100323.3938492-2-mudongliangabcd@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-13 10:24:37 +02:00
Dongliang Mu
57a1681095 ipack: tpci200: fix many double free issues in tpci200_pci_probe
The function tpci200_register called by tpci200_install and
tpci200_unregister called by tpci200_uninstall are in pair. However,
tpci200_unregister has some cleanup operations not in the
tpci200_register. So the error handling code of tpci200_pci_probe has
many different double free issues.

Fix this problem by moving those cleanup operations out of
tpci200_unregister, into tpci200_pci_remove and reverting
the previous commit 9272e5d002 ("ipack/carriers/tpci200:
Fix a double free in tpci200_pci_probe").

Fixes: 9272e5d002 ("ipack/carriers/tpci200: Fix a double free in tpci200_pci_probe")
Cc: stable@vger.kernel.org
Reported-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Link: https://lore.kernel.org/r/20210810100323.3938492-1-mudongliangabcd@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-13 10:24:37 +02:00
Srinivas Kandagatla
d77772538f slimbus: ngd: reset dma setup during runtime pm
During suspend/resume NGD remote instance is power cycled along
with remotely controlled bam dma engine.
So Reset the dma configuration during this suspend resume path
so that we are not dealing with any stale dma setup.

Without this transactions timeout after first suspend resume path.

Fixes: 917809e228 ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20210809082428.11236-5-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-13 10:22:30 +02:00
Srinivas Kandagatla
c0e38eaa8d slimbus: ngd: set correct device for pm
For some reason we ended up using wrong device in some places for pm_runtime calls.
Fix this so that NGG driver can do runtime pm correctly.

Fixes: 917809e228 ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20210809082428.11236-4-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-13 10:22:30 +02:00
Srinivas Kandagatla
a263c1ff6a slimbus: messaging: check for valid transaction id
In some usecases transaction ids are dynamically allocated inside
the controller driver after sending the messages which have generic
acknowledge responses. So check for this before refcounting pm_runtime.

Without this we would end up imbalancing runtime pm count by
doing pm_runtime_put() in both slim_do_transfer() and slim_msg_response()
for a single  pm_runtime_get() in slim_do_transfer()

Fixes: d3062a2109 ("slimbus: messaging: add slim_alloc/free_txn_tid()")
Cc: <stable@vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20210809082428.11236-3-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-13 10:22:30 +02:00
Srinivas Kandagatla
9659281ce7 slimbus: messaging: start transaction ids from 1 instead of zero
As tid is unsigned its hard to figure out if the tid is valid or
invalid. So Start the transaction ids from 1 instead of zero
so that we could differentiate between a valid tid and invalid tids

This is useful in cases where controller would add a tid for controller
specific transfers.

Fixes: d3062a2109 ("slimbus: messaging: add slim_alloc/free_txn_tid()")
Cc: <stable@vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20210809082428.11236-2-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-13 10:22:30 +02:00
Paolo Bonzini
6e949ddb0a Merge branch 'kvm-tdpmmu-fixes' into kvm-master
Merge topic branch with fixes for both 5.14-rc6 and 5.15.
2021-08-13 03:33:13 -04:00
Sean Christopherson
ce25681d59 KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
Add yet another spinlock for the TDP MMU and take it when marking indirect
shadow pages unsync.  When using the TDP MMU and L1 is running L2(s) with
nested TDP, KVM may encounter shadow pages for the TDP entries managed by
L1 (controlling L2) when handling a TDP MMU page fault.  The unsync logic
is not thread safe, e.g. the kvm_mmu_page fields are not atomic, and
misbehaves when a shadow page is marked unsync via a TDP MMU page fault,
which runs with mmu_lock held for read, not write.

Lack of a critical section manifests most visibly as an underflow of
unsync_children in clear_unsync_child_bit() due to unsync_children being
corrupted when multiple CPUs write it without a critical section and
without atomic operations.  But underflow is the best case scenario.  The
worst case scenario is that unsync_children prematurely hits '0' and
leads to guest memory corruption due to KVM neglecting to properly sync
shadow pages.

Use an entirely new spinlock even though piggybacking tdp_mmu_pages_lock
would functionally be ok.  Usurping the lock could degrade performance when
building upper level page tables on different vCPUs, especially since the
unsync flow could hold the lock for a comparatively long time depending on
the number of indirect shadow pages and the depth of the paging tree.

For simplicity, take the lock for all MMUs, even though KVM could fairly
easily know that mmu_lock is held for write.  If mmu_lock is held for
write, there cannot be contention for the inner spinlock, and marking
shadow pages unsync across multiple vCPUs will be slow enough that
bouncing the kvm_arch cacheline should be in the noise.

Note, even though L2 could theoretically be given access to its own EPT
entries, a nested MMU must hold mmu_lock for write and thus cannot race
against a TDP MMU page fault.  I.e. the additional spinlock only _needs_ to
be taken by the TDP MMU, as opposed to being taken by any MMU for a VM
that is running with the TDP MMU enabled.  Holding mmu_lock for read also
prevents the indirect shadow page from being freed.  But as above, keep
it simple and always take the lock.

Alternative #1, the TDP MMU could simply pass "false" for can_unsync and
effectively disable unsync behavior for nested TDP.  Write protecting leaf
shadow pages is unlikely to noticeably impact traditional L1 VMMs, as such
VMMs typically don't modify TDP entries, but the same may not hold true for
non-standard use cases and/or VMMs that are migrating physical pages (from
L1's perspective).

Alternative #2, the unsync logic could be made thread safe.  In theory,
simply converting all relevant kvm_mmu_page fields to atomics and using
atomic bitops for the bitmap would suffice.  However, (a) an in-depth audit
would be required, (b) the code churn would be substantial, and (c) legacy
shadow paging would incur additional atomic operations in performance
sensitive paths for no benefit (to legacy shadow paging).

Fixes: a2855afc7e ("KVM: x86/mmu: Allow parallel page faults for the TDP MMU")
Cc: stable@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210812181815.3378104-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13 03:32:14 -04:00
Sean Christopherson
0103098fb4 KVM: x86/mmu: Don't step down in the TDP iterator when zapping all SPTEs
Set the min_level for the TDP iterator at the root level when zapping all
SPTEs to optimize the iterator's try_step_down().  Zapping a non-leaf
SPTE will recursively zap all its children, thus there is no need for the
iterator to attempt to step down.  This avoids rereading the top-level
SPTEs after they are zapped by causing try_step_down() to short-circuit.

In most cases, optimizing try_step_down() will be in the noise as the cost
of zapping SPTEs completely dominates the overall time.  The optimization
is however helpful if the zap occurs with relatively few SPTEs, e.g. if KVM
is zapping in response to multiple memslot updates when userspace is adding
and removing read-only memslots for option ROMs.  In that case, the task
doing the zapping likely isn't a vCPU thread, but it still holds mmu_lock
for read and thus can be a noisy neighbor of sorts.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210812181414.3376143-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13 03:31:56 -04:00
Sean Christopherson
524a1e4e38 KVM: x86/mmu: Don't leak non-leaf SPTEs when zapping all SPTEs
Pass "all ones" as the end GFN to signal "zap all" for the TDP MMU and
really zap all SPTEs in this case.  As is, zap_gfn_range() skips non-leaf
SPTEs whose range exceeds the range to be zapped.  If shadow_phys_bits is
not aligned to the range size of top-level SPTEs, e.g. 512gb with 4-level
paging, the "zap all" flows will skip top-level SPTEs whose range extends
beyond shadow_phys_bits and leak their SPs when the VM is destroyed.

Use the current upper bound (based on host.MAXPHYADDR) to detect that the
caller wants to zap all SPTEs, e.g. instead of using the max theoretical
gfn, 1 << (52 - 12).  The more precise upper bound allows the TDP iterator
to terminate its walk earlier when running on hosts with MAXPHYADDR < 52.

Add a WARN on kmv->arch.tdp_mmu_pages when the TDP MMU is destroyed to
help future debuggers should KVM decide to leak SPTEs again.

The bug is most easily reproduced by running (and unloading!) KVM in a
VM whose host.MAXPHYADDR < 39, as the SPTE for gfn=0 will be skipped.

  =============================================================================
  BUG kvm_mmu_page_header (Not tainted): Objects remaining in kvm_mmu_page_header on __kmem_cache_shutdown()
  -----------------------------------------------------------------------------
  Slab 0x000000004d8f7af1 objects=22 used=2 fp=0x00000000624d29ac flags=0x4000000000000200(slab|zone=1)
  CPU: 0 PID: 1582 Comm: rmmod Not tainted 5.14.0-rc2+ #420
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  Call Trace:
   dump_stack_lvl+0x45/0x59
   slab_err+0x95/0xc9
   __kmem_cache_shutdown.cold+0x3c/0x158
   kmem_cache_destroy+0x3d/0xf0
   kvm_mmu_module_exit+0xa/0x30 [kvm]
   kvm_arch_exit+0x5d/0x90 [kvm]
   kvm_exit+0x78/0x90 [kvm]
   vmx_exit+0x1a/0x50 [kvm_intel]
   __x64_sys_delete_module+0x13f/0x220
   do_syscall_64+0x3b/0xc0
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: faaf05b00a ("kvm: x86/mmu: Support zapping SPTEs in the TDP MMU")
Cc: stable@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210812181414.3376143-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13 03:31:46 -04:00
Paolo Bonzini
c5e2bf0b4a Merge tag 'kvmarm-fixes-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.14, take #2

- Plug race between enabling MTE and creating vcpus
- Fix off-by-one bug when checking whether an address range is RAM
2021-08-13 03:21:13 -04:00
Sean Christopherson
18712c1370 KVM: nVMX: Use vmx_need_pf_intercept() when deciding if L0 wants a #PF
Use vmx_need_pf_intercept() when determining if L0 wants to handle a #PF
in L2 or if the VM-Exit should be forwarded to L1.  The current logic fails
to account for the case where #PF is intercepted to handle
guest.MAXPHYADDR < host.MAXPHYADDR and ends up reflecting all #PFs into
L1.  At best, L1 will complain and inject the #PF back into L2.  At
worst, L1 will eat the unexpected fault and cause L2 to hang on infinite
page faults.

Note, while the bug was technically introduced by the commit that added
support for the MAXPHYADDR madness, the shame is all on commit
a0c134347b ("KVM: VMX: introduce vmx_need_pf_intercept").

Fixes: 1dbf5d68af ("KVM: VMX: Add guest physical address check in EPT violation and misconfig")
Cc: stable@vger.kernel.org
Cc: Peter Shier <pshier@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210812045615.3167686-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13 03:20:58 -04:00
Junaid Shahid
85aa8889b8 kvm: vmx: Sync all matching EPTPs when injecting nested EPT fault
When a nested EPT violation/misconfig is injected into the guest,
the shadow EPT PTEs associated with that address need to be synced.
This is done by kvm_inject_emulated_page_fault() before it calls
nested_ept_inject_page_fault(). However, that will only sync the
shadow EPT PTE associated with the current L1 EPTP. Since the ASID
is based on EP4TA rather than the full EPTP, so syncing the current
EPTP is not enough. The SPTEs associated with any other L1 EPTPs
in the prev_roots cache with the same EP4TA also need to be synced.

Signed-off-by: Junaid Shahid <junaids@google.com>
Message-Id: <20210806222229.1645356-1-junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13 03:20:58 -04:00
Paolo Bonzini
375d1adebc Merge branch 'kvm-vmx-secctl' into kvm-master
Merge common topic branch for 5.14-rc6 and 5.15 merge window.
2021-08-13 03:20:18 -04:00
Paolo Bonzini
ffbe17cada KVM: x86: remove dead initialization
hv_vcpu is initialized again a dozen lines below, and at this
point vcpu->arch.hyperv is not valid.  Remove the initializer.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13 03:20:18 -04:00
Sean Christopherson
1383279c64 KVM: x86: Allow guest to set EFER.NX=1 on non-PAE 32-bit kernels
Remove an ancient restriction that disallowed exposing EFER.NX to the
guest if EFER.NX=0 on the host, even if NX is fully supported by the CPU.
The motivation of the check, added by commit 2cc51560ae ("KVM: VMX:
Avoid saving and restoring msr_efer on lightweight vmexit"), was to rule
out the case of host.EFER.NX=0 and guest.EFER.NX=1 so that KVM could run
the guest with the host's EFER.NX and thus avoid context switching EFER
if the only divergence was the NX bit.

Fast forward to today, and KVM has long since stopped running the guest
with the host's EFER.NX.  Not only does KVM context switch EFER if
host.EFER.NX=1 && guest.EFER.NX=0, KVM also forces host.EFER.NX=0 &&
guest.EFER.NX=1 when using shadow paging (to emulate SMEP).  Furthermore,
the entire motivation for the restriction was made obsolete over a decade
ago when Intel added dedicated host and guest EFER fields in the VMCS
(Nehalem timeframe), which reduced the overhead of context switching EFER
from 400+ cycles (2 * WRMSR + 1 * RDMSR) to a mere ~2 cycles.

In practice, the removed restriction only affects non-PAE 32-bit kernels,
as EFER.NX is set during boot if NX is supported and the kernel will use
PAE paging (32-bit or 64-bit), regardless of whether or not the kernel
will actually use NX itself (mark PTEs non-executable).

Alternatively and/or complementarily, startup_32_smp() in head_32.S could
be modified to set EFER.NX=1 regardless of paging mode, thus eliminating
the scenario where NX is supported but not enabled.  However, that runs
the risk of breaking non-KVM non-PAE kernels (though the risk is very,
very low as there are no known EFER.NX errata), and also eliminates an
easy-to-use mechanism for stressing KVM's handling of guest vs. host EFER
across nested virtualization transitions.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210805183804.1221554-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13 03:20:17 -04:00
Linus Torvalds
f8e6dfc64f Merge tag 'net-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from netfilter, bpf, can and
  ieee802154.

  The size of this is pretty normal, but we got more fixes for 5.14
  changes this week than last week. Nothing major but the trend is the
  opposite of what we like. We'll see how the next week goes..

  Current release - regressions:

   - r8169: fix ASPM-related link-up regressions

   - bridge: fix flags interpretation for extern learn fdb entries

   - phy: micrel: fix link detection on ksz87xx switch

   - Revert "tipc: Return the correct errno code"

   - ptp: fix possible memory leak caused by invalid cast

  Current release - new code bugs:

   - bpf: add missing bpf_read_[un]lock_trace() for syscall program

   - bpf: fix potentially incorrect results with bpf_get_local_storage()

   - page_pool: mask the page->signature before the checking, avoid dma
     mapping leaks

   - netfilter: nfnetlink_hook: 5 fixes to information in netlink dumps

   - bnxt_en: fix firmware interface issues with PTP

   - mlx5: Bridge, fix ageing time

  Previous releases - regressions:

   - linkwatch: fix failure to restore device state across
     suspend/resume

   - bareudp: fix invalid read beyond skb's linear data

  Previous releases - always broken:

   - bpf: fix integer overflow involving bucket_size

   - ppp: fix issues when desired interface name is specified via
     netlink

   - wwan: mhi_wwan_ctrl: fix possible deadlock

   - dsa: microchip: ksz8795: fix number of VLAN related bugs

   - dsa: drivers: fix broken backpressure in .port_fdb_dump

   - dsa: qca: ar9331: make proper initial port defaults

  Misc:

   - bpf: add lockdown check for probe_write_user helper

   - netfilter: conntrack: remove offload_pickup sysctl before 5.14 is
     out

   - netfilter: conntrack: collect all entries in one cycle,
     heuristically slow down garbage collection scans on idle systems to
     prevent frequent wake ups"

* tag 'net-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
  vsock/virtio: avoid potential deadlock when vsock device remove
  wwan: core: Avoid returning NULL from wwan_create_dev()
  net: dsa: sja1105: unregister the MDIO buses during teardown
  Revert "tipc: Return the correct errno code"
  net: mscc: Fix non-GPL export of regmap APIs
  net: igmp: increase size of mr_ifc_count
  MAINTAINERS: switch to my OMP email for Renesas Ethernet drivers
  tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
  net: pcs: xpcs: fix error handling on failed to allocate memory
  net: linkwatch: fix failure to restore device state across suspend/resume
  net: bridge: fix memleak in br_add_if()
  net: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted by drivers towards the bridge
  net: bridge: fix flags interpretation for extern learn fdb entries
  net: dsa: sja1105: fix broken backpressure in .port_fdb_dump
  net: dsa: lantiq: fix broken backpressure in .port_fdb_dump
  net: dsa: lan9303: fix broken backpressure in .port_fdb_dump
  net: dsa: hellcreek: fix broken backpressure in .port_fdb_dump
  bpf, core: Fix kernel-doc notation
  net: igmp: fix data-race in igmp_ifc_timer_expire()
  net: Fix memory leak in ieee802154_raw_deliver
  ...
2021-08-12 16:24:03 -10:00
Linus Torvalds
3a03c67de2 Merge tag 'ceph-for-5.14-rc6' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "A patch to avoid a soft lockup in ceph_check_delayed_caps() from Luis
  and a reference handling fix from Jeff that should address some memory
  corruption reports in the snaprealm area.

  Both marked for stable"

* tag 'ceph-for-5.14-rc6' of git://github.com/ceph/ceph-client:
  ceph: take snap_empty_lock atomically with snaprealm refcount change
  ceph: reduce contention in ceph_check_delayed_caps()
2021-08-12 16:16:01 -10:00
Linus Torvalds
82cce5f429 Merge tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Another week, another set of pretty regular fixes, nothing really
  stands out too much.

  amdgpu:
   - Yellow carp update
   - RAS EEPROM fixes
   - BACO/BOCO fixes
   - Fix a memory leak in an error path
   - Freesync fix
   - VCN harvesting fix
   - Display fixes

  i915:
   - GVT fix for Windows VM hang.
   - Display fix of 12 BPC bits for display 12 and newer.
   - Don't try to access some media register for fused off domains.
   - Fix kerneldoc build warnings.

  mediatek:
   - Fix dpi bridge bug.
   - Fix cursor plane no update.

  meson:
   - Fix colors when booting with HDR"

* tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drm:
  drm/doc/rfc: drop lmem uapi section
  drm/i915: Only access SFC_DONE when media domain is not fused off
  drm/i915/display: Fix the 12 BPC bits for PIPE_MISC reg
  drm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work
  drm/amd/display: Remove invalid assert for ODM + MPC case
  drm/amd/pm: bug fix for the runtime pm BACO
  drm/amdgpu: handle VCN instances when harvesting (v2)
  drm/meson: fix colour distortion from HDR set during vendor u-boot
  drm/i915/gvt: Fix cached atomics setting for Windows VM
  drm/amdgpu: Add preferred mode in modeset when freesync video mode's enabled.
  drm/amd/pm: Fix a memory leak in an error handling path in 'vangogh_tables_init()'
  drm/amdgpu: don't enable baco on boco platforms in runpm
  drm/amdgpu: set RAS EEPROM address from VBIOS
  drm/amd/pm: update smu v13.0.1 firmware header
  drm/mediatek: Fix cursor plane no update
  drm/mediatek: mtk-dpi: Set out_fmt from config if not the last bridge
  drm/mediatek: dpi: Fix NULL dereference in mtk_dpi_bridge_atomic_check
2021-08-12 16:09:25 -10:00
Arnd Bergmann
cbfece7518 ARM: ixp4xx: fix building both pci drivers
When both the old and the new PCI drivers are enabled
in the same kernel, there are a couple of namespace
conflicts that cause a build failure:

drivers/pci/controller/pci-ixp4xx.c:38: error: "IXP4XX_PCI_CSR" redefined [-Werror]
   38 | #define IXP4XX_PCI_CSR                  0x1c
      |
In file included from arch/arm/mach-ixp4xx/include/mach/hardware.h:23,
                 from arch/arm/mach-ixp4xx/include/mach/io.h:15,
                 from arch/arm/include/asm/io.h:198,
                 from include/linux/io.h:13,
                 from drivers/pci/controller/pci-ixp4xx.c:20:
arch/arm/mach-ixp4xx/include/mach/ixp4xx-regs.h:221: note: this is the location of the previous definition
  221 | #define IXP4XX_PCI_CSR(x) ((volatile u32 *)(IXP4XX_PCI_CFG_BASE_VIRT+(x)))
      |
drivers/pci/controller/pci-ixp4xx.c:148:12: error: 'ixp4xx_pci_read' redeclared as different kind of symbol
  148 | static int ixp4xx_pci_read(struct ixp4xx_pci *p, u32 addr, u32 cmd, u32 *data)
      |            ^~~~~~~~~~~~~~~

Rename both the ixp4xx_pci_read/ixp4xx_pci_write functions and the
IXP4XX_PCI_CSR macro. In each case, I went with the version that
has fewer callers to keep the change small.

Fixes: f7821b4934 ("PCI: ixp4xx: Add a new driver for IXP4xx")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: soc@kernel.org
Link: https://lore.kernel.org/r/20210721151546.2325937-1-arnd@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-12 23:10:09 +02:00
Linus Walleij
813bacf410 ARM: configs: Update the nhk8815_defconfig
The platform lost the framebuffer due to a commit solving a
circular dependency in v5.14-rc1, so add it back in by explicitly
selecting the framebuffer.

Also fix up some Kconfig options that got dropped or moved around
while we're at it.

Fixes: f611b1e762 ("drm: Avoid circular dependencies for CONFIG_FB")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210807225518.3607126-1-linus.walleij@linaro.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-12 23:09:47 +02:00
Dave Airlie
a1fa726831 Merge tag 'drm-misc-fixes-2021-08-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:

 * meson: Fix colors when booting with HDR

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YRTb+qUuBYWjJDVg@linux-uq9g.fritz.box
2021-08-13 06:37:40 +10:00
Dave Airlie
3e234e9f7f Merge tag 'drm-intel-fixes-2021-08-12' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- GVT fix for Windows VM hang.
- Display fix of 12 BPC bits for display 12 and newer.
- Don't try to access some media register for fused off domains.
- Fix kerneldoc build warnings.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YRU/hnQ1sNr+j37x@intel.com
2021-08-13 06:31:26 +10:00
Jakub Kicinski
a9a507013a Merge tag 'ieee802154-for-davem-2021-08-12' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
Stefan Schmidt says:

====================
ieee802154 for net 2021-08-12

Mostly fixes coming from bot reports. Dongliang Mu tackled some syzkaller
reports in hwsim again and Takeshi Misawa a memory leak  in  ieee802154 raw.

* tag 'ieee802154-for-davem-2021-08-12' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan:
  net: Fix memory leak in ieee802154_raw_deliver
  ieee802154: hwsim: fix GPF in hwsim_new_edge_nl
  ieee802154: hwsim: fix GPF in hwsim_set_edge_lqi
====================

Link: https://lore.kernel.org/r/20210812183912.1663996-1-stefan@datenfreihafen.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-12 11:50:17 -07:00
Babu Moger
064855a690 x86/resctrl: Fix default monitoring groups reporting
Creating a new sub monitoring group in the root /sys/fs/resctrl leads to
getting the "Unavailable" value for mbm_total_bytes and mbm_local_bytes
on the entire filesystem.

Steps to reproduce:

  1. mount -t resctrl resctrl /sys/fs/resctrl/

  2. cd /sys/fs/resctrl/

  3. cat mon_data/mon_L3_00/mbm_total_bytes
     23189832

  4. Create sub monitor group:
  mkdir mon_groups/test1

  5. cat mon_data/mon_L3_00/mbm_total_bytes
     Unavailable

When a new monitoring group is created, a new RMID is assigned to the
new group. But the RMID is not active yet. When the events are read on
the new RMID, it is expected to report the status as "Unavailable".

When the user reads the events on the default monitoring group with
multiple subgroups, the events on all subgroups are consolidated
together. Currently, if any of the RMID reads report as "Unavailable",
then everything will be reported as "Unavailable".

Fix the issue by discarding the "Unavailable" reads and reporting all
the successful RMID reads. This is not a problem on Intel systems as
Intel reports 0 on Inactive RMIDs.

Fixes: d89b737901 ("x86/intel_rdt/cqm: Add mon_data")
Reported-by: Paweł Szulik <pawel.szulik@intel.com>
Signed-off-by: Babu Moger <Babu.Moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213311
Link: https://lkml.kernel.org/r/162793309296.9224.15871659871696482080.stgit@bmoger-ubuntu
2021-08-12 20:12:20 +02:00
Longpeng(Mike)
49b0b6ffe2 vsock/virtio: avoid potential deadlock when vsock device remove
There's a potential deadlock case when remove the vsock device or
process the RESET event:

  vsock_for_each_connected_socket:
      spin_lock_bh(&vsock_table_lock) ----------- (1)
      ...
          virtio_vsock_reset_sock:
              lock_sock(sk) --------------------- (2)
      ...
      spin_unlock_bh(&vsock_table_lock)

lock_sock() may do initiative schedule when the 'sk' is owned by
other thread at the same time, we would receivce a warning message
that "scheduling while atomic".

Even worse, if the next task (selected by the scheduler) try to
release a 'sk', it need to request vsock_table_lock and the deadlock
occur, cause the system into softlockup state.
  Call trace:
   queued_spin_lock_slowpath
   vsock_remove_bound
   vsock_remove_sock
   virtio_transport_release
   __vsock_release
   vsock_release
   __sock_release
   sock_close
   __fput
   ____fput

So we should not require sk_lock in this case, just like the behavior
in vhost_vsock or vmci.

Fixes: 0ea9e1d3a9 ("VSOCK: Introduce virtio_transport.ko")
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210812053056.1699-1-longpeng2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-12 10:57:27 -07:00
Steven Rostedt (VMware)
5acce0bff2 tracing / histogram: Fix NULL pointer dereference on strcmp() on NULL event name
The following commands:

 # echo 'read_max u64 size;' > synthetic_events
 # echo 'hist:keys=common_pid:count=count:onmax($count).trace(read_max,count)' > events/syscalls/sys_enter_read/trigger

Causes:

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP
 CPU: 4 PID: 1763 Comm: bash Not tainted 5.14.0-rc2-test+ #155
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01
v03.03 07/14/2016
 RIP: 0010:strcmp+0xc/0x20
 Code: 75 f7 31 c0 0f b6 0c 06 88 0c 02 48 83 c0 01 84 c9 75 f1 4c 89 c0
c3 0f 1f 80 00 00 00 00 31 c0 eb 08 48 83 c0 01 84 d2 74 0f <0f> b6 14 07
3a 14 06 74 ef 19 c0 83 c8 01 c3 31 c0 c3 66 90 48 89
 RSP: 0018:ffffb5fdc0963ca8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffffffffb3a4e040 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffff9714c0d0b640 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 00000022986b7cde R09: ffffffffb3a4dff8
 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9714c50603c8
 R13: 0000000000000000 R14: ffff97143fdf9e48 R15: ffff9714c01a2210
 FS:  00007f1fa6785740(0000) GS:ffff9714da400000(0000)
knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 000000002d863004 CR4: 00000000001706e0
 Call Trace:
  __find_event_file+0x4e/0x80
  action_create+0x6b7/0xeb0
  ? kstrdup+0x44/0x60
  event_hist_trigger_func+0x1a07/0x2130
  trigger_process_regex+0xbd/0x110
  event_trigger_write+0x71/0xd0
  vfs_write+0xe9/0x310
  ksys_write+0x68/0xe0
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f1fa6879e87

The problem was the "trace(read_max,count)" where the "count" should be
"$count" as "onmax()" only handles variables (although it really should be
able to figure out that "count" is a field of sys_enter_read). But there's
a path that does not find the variable and ends up passing a NULL for the
event, which ends up getting passed to "strcmp()".

Add a check for NULL to return and error on the command with:

 # cat error_log
  hist:syscalls:sys_enter_read: error: Couldn't create or find variable
  Command: hist:keys=common_pid:count=count:onmax($count).trace(read_max,count)
                                ^
Link: https://lkml.kernel.org/r/20210808003011.4037f8d0@oasis.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 50450603ec tracing: Add 'onmax' hist trigger action support
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12 13:35:57 -04:00
Masami Hiramatsu
d0ac5fbaf7 init: Suppress wrong warning for bootconfig cmdline parameter
Since the 'bootconfig' command line parameter is handled before
parsing the command line, it doesn't use early_param(). But in
this case, kernel shows a wrong warning message about it.

[    0.013714] Kernel command line: ro console=ttyS0  bootconfig console=tty0
[    0.013741] Unknown command line parameters: bootconfig

To suppress this message, add a dummy handler for 'bootconfig'.

Link: https://lkml.kernel.org/r/162812945097.77369.1849780946468010448.stgit@devnote2

Fixes: 86d1919a4f ("init: print out unknown kernel parameters")
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12 13:35:57 -04:00
Lukas Bulwahn
12f9951d3f tracing: define needed config DYNAMIC_FTRACE_WITH_ARGS
Commit 2860cd8a23 ("livepatch: Use the default ftrace_ops instead of
REGS when ARGS is available") intends to enable config LIVEPATCH when
ftrace with ARGS is available. However, the chain of configs to enable
LIVEPATCH is incomplete, as HAVE_DYNAMIC_FTRACE_WITH_ARGS is available,
but the definition of DYNAMIC_FTRACE_WITH_ARGS, combining DYNAMIC_FTRACE
and HAVE_DYNAMIC_FTRACE_WITH_ARGS, needed to enable LIVEPATCH, is missing
in the commit.

Fortunately, ./scripts/checkkconfigsymbols.py detects this and warns:

DYNAMIC_FTRACE_WITH_ARGS
Referencing files: kernel/livepatch/Kconfig

So, define the config DYNAMIC_FTRACE_WITH_ARGS analogously to the already
existing similar configs, DYNAMIC_FTRACE_WITH_REGS and
DYNAMIC_FTRACE_WITH_DIRECT_CALLS, in ./kernel/trace/Kconfig to connect the
chain of configs.

Link: https://lore.kernel.org/kernel-janitors/CAKXUXMwT2zS9fgyQHKUUiqo8ynZBdx2UEUu1WnV_q0OCmknqhw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20210806195027.16808-1-lukas.bulwahn@gmail.com

Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: stable@vger.kernel.org
Fixes: 2860cd8a23 ("livepatch: Use the default ftrace_ops instead of REGS when ARGS is available")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12 13:35:57 -04:00
Daniel Bristot de Oliveira
0e05ba498d trace/osnoise: Print a stop tracing message
When using osnoise/timerlat with stop tracing, sometimes it is
not clear in which CPU the stop condition was hit, mainly
when using some extra events.

Print a message informing in which CPU the trace stopped, like
in the example below:

          <idle>-0       [006] d.h.  2932.676616: #1672599 context    irq timer_latency     34689 ns
          <idle>-0       [006] dNh.  2932.676618: irq_noise: local_timer:236 start 2932.676615639 duration 2391 ns
          <idle>-0       [006] dNh.  2932.676620: irq_noise: virtio0-output.0:47 start 2932.676620180 duration 86 ns
          <idle>-0       [003] d.h.  2932.676621: #1673374 context    irq timer_latency      1200 ns
          <idle>-0       [006] d...  2932.676623: thread_noise: swapper/6:0 start 2932.676615964 duration 4339 ns
          <idle>-0       [003] dNh.  2932.676623: irq_noise: local_timer:236 start 2932.676620597 duration 1881 ns
          <idle>-0       [006] d...  2932.676623: sched_switch: prev_comm=swapper/6 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=timerlat/6 next_pid=852 next_prio=4
      timerlat/6-852     [006] ....  2932.676623: #1672599 context thread timer_latency     41931 ns
          <idle>-0       [003] d...  2932.676623: thread_noise: swapper/3:0 start 2932.676620854 duration 880 ns
          <idle>-0       [003] d...  2932.676624: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=timerlat/3 next_pid=849 next_prio=4
      timerlat/6-852     [006] ....  2932.676624: timerlat_main: stop tracing hit on cpu 6
      timerlat/3-849     [003] ....  2932.676624: #1673374 context thread timer_latency      4310 ns

Link: https://lkml.kernel.org/r/b30a0d7542adba019185f44ee648e60e14923b11.1626598844.git.bristot@kernel.org

Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12 13:35:56 -04:00
Daniel Bristot de Oliveira
e1c4ad4a7f trace/timerlat: Add a header with PREEMPT_RT additional fields
Some extra flags are printed to the trace header when using the
PREEMPT_RT config. The extra flags are: need-resched-lazy,
preempt-lazy-depth, and migrate-disable.

Without printing these fields, the timerlat specific fields are
shifted by three positions, for example:

 # tracer: timerlat
 #
 #                                _-----=> irqs-off
 #                               / _----=> need-resched
 #                              | / _---=> hardirq/softirq
 #                              || / _--=> preempt-depth
 #                              || /
 #                              ||||             ACTIVATION
 #           TASK-PID      CPU# ||||   TIMESTAMP    ID            CONTEXT                LATENCY
 #              | |         |   ||||      |         |                  |                       |
           <idle>-0       [000] d..h...  3279.798871: #1     context    irq timer_latency       830 ns
            <...>-807     [000] .......  3279.798881: #1     context thread timer_latency     11301 ns

Add a new header for timerlat with the missing fields, to be used
when the PREEMPT_RT is enabled.

Link: https://lkml.kernel.org/r/babb83529a3211bd0805be0b8c21608230202c55.1626598844.git.bristot@kernel.org

Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12 13:35:56 -04:00
Daniel Bristot de Oliveira
d03721a6e7 trace/osnoise: Add a header with PREEMPT_RT additional fields
Some extra flags are printed to the trace header when using the
PREEMPT_RT config. The extra flags are: need-resched-lazy,
preempt-lazy-depth, and migrate-disable.

Without printing these fields, the osnoise specific fields are
shifted by three positions, for example:

 # tracer: osnoise
 #
 #                                _-----=> irqs-off
 #                               / _----=> need-resched
 #                              | / _---=> hardirq/softirq
 #                              || / _--=> preempt-depth                            MAX
 #                              || /                                             SINGLE      Interference counters:
 #                              ||||               RUNTIME      NOISE  %% OF CPU  NOISE    +-----------------------------+
 #           TASK-PID      CPU# ||||   TIMESTAMP    IN US       IN US  AVAILABLE  IN US     HW    NMI    IRQ   SIRQ THREAD
 #              | |         |   ||||      |           |             |    |            |      |      |      |      |      |
            <...>-741     [000] .......  1105.690909: 1000000        234  99.97660      36     21      0   1001     22      3
            <...>-742     [001] .......  1105.691923: 1000000        281  99.97190     197      7      0   1012     35     14
            <...>-743     [002] .......  1105.691958: 1000000       1324  99.86760     118     11      0   1016    155    143
            <...>-744     [003] .......  1105.691998: 1000000        109  99.98910      21      4      0   1004     33      7
            <...>-745     [004] .......  1105.692015: 1000000       2023  99.79770      97     37      0   1023     52     18

Add a new header for osnoise with the missing fields, to be used
when the PREEMPT_RT is enabled.

Link: https://lkml.kernel.org/r/1f03289d2a51fde5a58c2e7def063dc630820ad1.1626598844.git.bristot@kernel.org

Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12 13:35:56 -04:00
Linus Torvalds
f8fbb47c6e Merge branch 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ucounts fix from Eric Biederman:
 "This fixes the ucount sysctls on big endian architectures.

  The counts were expanded to be longs instead of ints, and the sysctl
  code was overlooked, so only the low 32bit were being processed. On
  litte endian just processing the low 32bits is fine, but on 64bit big
  endian processing just the low 32bits results in the high order bits
  instead of the low order bits being processed and nothing works
  proper.

  This change took a little bit to mature as we have the SYSCTL_ZERO,
  and SYSCTL_INT_MAX macros that are only usable for sysctls operating
  on ints, but unfortunately are not obviously broken. Which resulted in
  the versions of this change working on big endian and not on little
  endian, because the int SYSCTL_ZERO when extended 64bit wound up being
  0x100000000. So we only allowed values greater than 0x100000000 and
  less than 0faff. Which unfortunately broken everything that tried to
  set the sysctls. (First reported with the windows subsystem for
  linux).

  I have tested this on x86_64 64bit after first reproducing the
  problems with the earlier version of this change, and then verifying
  the problems do not exist when we use appropriate long min and max
  values for extra1 and extra2"

* 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: add missing data type changes
2021-08-12 07:20:16 -10:00
Linus Torvalds
59cd4f435e Merge tag 'sound-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "This seems to be a usual bump in the middle, containing lots of
  pending ASoC fixes:

   - Yet another PCM mmap regression fix

   - Fix for ASoC DAPM prefix handling

   - Various cs42l42 codec fixes

   - PCM buffer reference fixes in a few ASoC drivers

   - Fixes for ASoC SOF, AMD, tlv320, WM

   - HD-audio quirks"

* tag 'sound-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (32 commits)
  ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 650 G8 Notebook PC
  ALSA: pcm: Fix mmap breakage without explicit buffer setup
  ALSA: hda: Add quirk for ASUS Flow x13
  ASoC: cs42l42: Fix mono playback
  ASoC: cs42l42: Constrain sample rate to prevent illegal SCLK
  ASoC: cs42l42: Fix LRCLK frame start edge
  ASoC: cs42l42: PLL must be running when changing MCLK_SRC_SEL
  ASoC: cs42l42: Remove duplicate control for WNF filter frequency
  ASoC: cs42l42: Fix inversion of ADC Notch Switch control
  ASoC: SOF: Intel: hda-ipc: fix reply size checking
  ASoC: SOF: Intel: Kconfig: fix SoundWire dependencies
  ASoC: amd: Fix reference to PCM buffer address
  ASoC: nau8824: Fix open coded prefix handling
  ASoC: kirkwood: Fix reference to PCM buffer address
  ASoC: uniphier: Fix reference to PCM buffer address
  ASoC: xilinx: Fix reference to PCM buffer address
  ASoC: intel: atom: Fix reference to PCM buffer address
  ASoC: cs42l42: Fix bclk calculation for mono
  ASoC: cs42l42: Don't allow SND_SOC_DAIFMT_LEFT_J
  ASoC: cs42l42: Correct definition of ADC Volume control
  ...
2021-08-12 07:06:40 -10:00
Andy Shevchenko
d9d5b89612 wwan: core: Avoid returning NULL from wwan_create_dev()
Make wwan_create_dev() to return either valid or error pointer,
In some cases it may return NULL. Prevent this by converting
it to the respective error pointer.

Fixes: 9a44c1cc63 ("net: Add a WWAN subsystem")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/20210811124845.10955-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-12 09:53:02 -07:00
Rohith Surabattula
9e992755be cifs: Call close synchronously during unlink/rename/lease break.
During unlink/rename/lease break, deferred work for close is
scheduled immediately but in an asynchronous manner which might
lead to race with actual(unlink/rename) commands.

This change will schedule close synchronously which will avoid
the race conditions with other commands.

Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Cc: stable@vger.kernel.org # 5.13
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-12 11:29:58 -05:00
Rohith Surabattula
41535701da cifs: Handle race conditions during rename
When rename is executed on directory which has files for which
close is deferred, then rename will fail with EACCES.

This patch will try to close all deferred files when EACCES is received
and retry rename on a directory.

Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Cc: stable@vger.kernel.org # 5.13
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-12 11:29:54 -05:00
Maximilian Heyne
88ca2521bd xen/events: Fix race in set_evtchn_to_irq
There is a TOCTOU issue in set_evtchn_to_irq. Rows in the evtchn_to_irq
mapping are lazily allocated in this function. The check whether the row
is already present and the row initialization is not synchronized. Two
threads can at the same time allocate a new row for evtchn_to_irq and
add the irq mapping to the their newly allocated row. One thread will
overwrite what the other has set for evtchn_to_irq[row] and therefore
the irq mapping is lost. This will trigger a BUG_ON later in
bind_evtchn_to_cpu:

  INFO: pci 0000:1a:15.4: [1d0f:8061] type 00 class 0x010802
  INFO: nvme 0000:1a:12.1: enabling device (0000 -> 0002)
  INFO: nvme nvme77: 1/0/0 default/read/poll queues
  CRIT: kernel BUG at drivers/xen/events/events_base.c:427!
  WARN: invalid opcode: 0000 [#1] SMP NOPTI
  WARN: Workqueue: nvme-reset-wq nvme_reset_work [nvme]
  WARN: RIP: e030:bind_evtchn_to_cpu+0xc2/0xd0
  WARN: Call Trace:
  WARN:  set_affinity_irq+0x121/0x150
  WARN:  irq_do_set_affinity+0x37/0xe0
  WARN:  irq_setup_affinity+0xf6/0x170
  WARN:  irq_startup+0x64/0xe0
  WARN:  __setup_irq+0x69e/0x740
  WARN:  ? request_threaded_irq+0xad/0x160
  WARN:  request_threaded_irq+0xf5/0x160
  WARN:  ? nvme_timeout+0x2f0/0x2f0 [nvme]
  WARN:  pci_request_irq+0xa9/0xf0
  WARN:  ? pci_alloc_irq_vectors_affinity+0xbb/0x130
  WARN:  queue_request_irq+0x4c/0x70 [nvme]
  WARN:  nvme_reset_work+0x82d/0x1550 [nvme]
  WARN:  ? check_preempt_wakeup+0x14f/0x230
  WARN:  ? check_preempt_curr+0x29/0x80
  WARN:  ? nvme_irq_check+0x30/0x30 [nvme]
  WARN:  process_one_work+0x18e/0x3c0
  WARN:  worker_thread+0x30/0x3a0
  WARN:  ? process_one_work+0x3c0/0x3c0
  WARN:  kthread+0x113/0x130
  WARN:  ? kthread_park+0x90/0x90
  WARN:  ret_from_fork+0x3a/0x50

This patch sets evtchn_to_irq rows via a cmpxchg operation so that they
will be set only once. The row is now cleared before writing it to
evtchn_to_irq in order to not create a race once the row is visible for
other threads.

While at it, do not require the page to be zeroed, because it will be
overwritten with -1's in clear_evtchn_to_irq_row anyway.

Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Fixes: d0b075ffee ("xen/events: Refactor evtchn_to_irq array to be dynamically allocated")
Link: https://lore.kernel.org/r/20210812130930.127134-1-mheyne@amazon.de
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-08-12 10:49:54 -05:00
Hans de Goede
73fcbad691 platform/x86: asus-nb-wmi: Add tablet_mode_sw=lid-flip quirk for the TP200s
The Asus TP200s / E205SA 360 degree hinges 2-in-1 supports reporting
SW_TABLET_MODE info through the ASUS_WMI_DEVID_LID_FLIP WMI device-id.
Add a quirk to enable this.

BugLink: https://gitlab.freedesktop.org/libinput/libinput/-/issues/639
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210812145513.39117-2-hdegoede@redhat.com
2021-08-12 17:18:28 +02:00
Hans de Goede
7f45621c14 platform/x86: asus-nb-wmi: Allow configuring SW_TABLET_MODE method with a module option
Unfortunately we have been unable to find a reliable way to detect if
and how SW_TABLET_MODE reporting is supported, so we are relying on
DMI quirks for this.

Add a module-option to specify the SW_TABLET_MODE method so that this can
be easily tested without needing to rebuild the kernel.

BugLink: https://gitlab.freedesktop.org/libinput/libinput/-/issues/639
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210812145513.39117-1-hdegoede@redhat.com
2021-08-12 17:18:19 +02:00
Randy Dunlap
839ad22f75 x86/tools: Fix objdump version check again
Skip (omit) any version string info that is parenthesized.

Warning: objdump version 15) is older than 2.19
Warning: Skipping posttest.

where 'objdump -v' says:
GNU objdump (GNU Binutils; SUSE Linux Enterprise 15) 2.35.1.20201123-7.18

Fixes: 8bee738bb1 ("x86: Fix objdump version check in chkobjdump.awk for different formats.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20210731000146.2720-1-rdunlap@infradead.org
2021-08-12 17:17:25 +02:00
Alexandre Ghiti
fdf3a7a1e0 riscv: Fix comment regarding kernel mapping overlapping with IS_ERR_VALUE
The current comment states that we check if the 64-bit kernel mapping
overlaps with the last 4K of the address space that is reserved to
error values in create_kernel_page_table, which is not the case since it
is done in setup_vm. But anyway, remove the reference to any function
and simply note that in 64-bit kernel, the check should be done as soon
as the kernel mapping base address is known.

Fixes: db6b84a368 ("riscv: Make sure the kernel mapping does not overlap with IS_ERR_VALUE")
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-12 07:16:58 -07:00
Changbin Du
030d6dbf0c riscv: kexec: do not add '-mno-relax' flag if compiler doesn't support it
The RISC-V special option '-mno-relax' which to disable linker relaxations
is supported by GCC8+. For GCC7 and lower versions do not support this
option.

Fixes: fba8a8674f ("RISC-V: Add kexec support")
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-12 07:16:52 -07:00
Cédric Le Goater
cbc06f051c powerpc/xive: Do not skip CPU-less nodes when creating the IPIs
On PowerVM, CPU-less nodes can be populated with hot-plugged CPUs at
runtime. Today, the IPI is not created for such nodes, and hot-plugged
CPUs use a bogus IPI, which leads to soft lockups.

We can not directly allocate and request the IPI on demand because
bringup_up() is called under the IRQ sparse lock. The alternative is
to allocate the IPIs for all possible nodes at startup and to request
the mapping on demand when the first CPU of a node is brought up.

Fixes: 7dcc37b3ef ("powerpc/xive: Map one IPI interrupt per node")
Cc: stable@vger.kernel.org # v5.13
Reported-by: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210807072057.184698-1-clg@kaod.org
2021-08-12 22:31:41 +10:00
Christophe Leroy
01fcac8e4d powerpc/interrupt: Do not call single_step_exception() from other exceptions
single_step_exception() is called by emulate_single_step() which
is called from (at least) alignment exception() handler and
program_check_exception() handler.

Redefine it as a regular __single_step_exception() which is called
by both single_step_exception() handler and emulate_single_step()
function.

Fixes: 3a96570ffc ("powerpc: convert interrupt handlers to use wrappers")
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/aed174f5cbc06f2cf95233c071d8aac948e46043.1628611921.git.christophe.leroy@csgroup.eu
2021-08-12 22:22:57 +10:00
Christophe Leroy
98694166c2 powerpc/interrupt: Fix OOPS by not calling do_IRQ() from timer_interrupt()
An interrupt handler shall not be called from another interrupt
handler otherwise this leads to problems like the following:

  Kernel attempted to write user page (afd4fa84) - exploit attempt? (uid: 1000)
  ------------[ cut here ]------------
  Bug: Write fault blocked by KUAP!
  WARNING: CPU: 0 PID: 1617 at arch/powerpc/mm/fault.c:230 do_page_fault+0x484/0x720
  Modules linked in:
  CPU: 0 PID: 1617 Comm: sshd Tainted: G        W         5.13.0-pmac-00010-g8393422eb77 #7
  NIP:  c001b77c LR: c001b77c CTR: 00000000
  REGS: cb9e5bc0 TRAP: 0700   Tainted: G        W          (5.13.0-pmac-00010-g8393422eb77)
  MSR:  00021032 <ME,IR,DR,RI>  CR: 24942424  XER: 00000000

  GPR00: c001b77c cb9e5c80 c1582c00 00000021 3ffffbff 085b0000 00000027 c8eb644c
  GPR08: 00000023 00000000 00000000 00000000 24942424 0063f8c8 00000000 000186a0
  GPR16: afd52dd4 afd52dd0 afd52dcc afd52dc8 0065a990 c07640c4 cb9e5e98 cb9e5e90
  GPR24: 00000040 afd4fa96 00000040 02000000 c1fda6c0 afd4fa84 00000300 cb9e5cc0
  NIP [c001b77c] do_page_fault+0x484/0x720
  LR [c001b77c] do_page_fault+0x484/0x720
  Call Trace:
  [cb9e5c80] [c001b77c] do_page_fault+0x484/0x720 (unreliable)
  [cb9e5cb0] [c000424c] DataAccess_virt+0xd4/0xe4
  --- interrupt: 300 at __copy_tofrom_user+0x110/0x20c
  NIP:  c001f9b4 LR: c03250a0 CTR: 00000004
  REGS: cb9e5cc0 TRAP: 0300   Tainted: G        W          (5.13.0-pmac-00010-g8393422eb77)
  MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 48028468  XER: 20000000
  DAR: afd4fa84 DSISR: 0a000000
  GPR00: 20726f6f cb9e5d80 c1582c00 00000004 cb9e5e3a 00000016 afd4fa80 00000000
  GPR08: 3835202d 72777872 2d78722d 00000004 28028464 0063f8c8 00000000 000186a0
  GPR16: afd52dd4 afd52dd0 afd52dcc afd52dc8 0065a990 c07640c4 cb9e5e98 cb9e5e90
  GPR24: 00000040 afd4fa96 00000040 cb9e5e0c 00000daa a0000000 cb9e5e98 afd4fa56
  NIP [c001f9b4] __copy_tofrom_user+0x110/0x20c
  LR [c03250a0] _copy_to_iter+0x144/0x990
  --- interrupt: 300
  [cb9e5d80] [c03e89c0] n_tty_read+0xa4/0x598 (unreliable)
  [cb9e5df0] [c03e2a0c] tty_read+0xdc/0x2b4
  [cb9e5e80] [c0156bf8] vfs_read+0x274/0x340
  [cb9e5f00] [c01571ac] ksys_read+0x70/0x118
  [cb9e5f30] [c0016048] ret_from_syscall+0x0/0x28
  --- interrupt: c00 at 0xa7855c88
  NIP:  a7855c88 LR: a7855c5c CTR: 00000000
  REGS: cb9e5f40 TRAP: 0c00   Tainted: G        W          (5.13.0-pmac-00010-g8393422eb77)
  MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 2402446c  XER: 00000000

  GPR00: 00000003 afd4ec70 a72137d0 0000000b afd4ecac 00004000 0065a990 00000800
  GPR08: 00000000 a7947930 00000000 00000004 c15831b0 0063f8c8 00000000 000186a0
  GPR16: afd52dd4 afd52dd0 afd52dcc afd52dc8 0065a990 0065a9e0 00000001 0065fac0
  GPR24: 00000000 00000089 00664050 00000000 00668e30 a720c8dc a7943ff4 0065f9b0
  NIP [a7855c88] 0xa7855c88
  LR [a7855c5c] 0xa7855c5c
  --- interrupt: c00
  Instruction dump:
  3884aa88 38630178 48076861 807f0080 48042e45 2f830000 419e0148 3c80c079
  3c60c076 38841be4 386301c0 4801f705 <0fe00000> 3860000b 4bfffe30 3c80c06b
  ---[ end trace fd69b91a8046c2e5 ]---

Here the problem is that by re-enterring an exception handler,
kuap_save_and_lock() is called a second time with this time KUAP
access locked, leading to regs->kuap being overwritten hence
KUAP not being unlocked at exception exit as expected.

Do not call do_IRQ() from timer_interrupt() directly. Instead,
redefine do_IRQ() as a standard function named __do_IRQ(), and
call it from both do_IRQ() and time_interrupt() handlers.

Fixes: 3a96570ffc ("powerpc: convert interrupt handlers to use wrappers")
Cc: stable@vger.kernel.org # v5.12+
Reported-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c17d234f4927d39a1d7100864a8e1145323d33a0.1628611927.git.christophe.leroy@csgroup.eu
2021-08-12 22:21:57 +10:00
Takashi Sakamoto
67bb66d329 ALSA: oxfw: fix functioal regression for silence in Apogee Duet FireWire
OXFW 971 has no function to use the value in syt field of received
isochronous packet for playback timing generation. In kernel prepatch for
v5.14, ALSA OXFW driver got change to send NO_INFO value in the field
instead of actual timing value. The change brings Apogee Duet FireWire to
generate no playback sound, while output meter moves.

As long as I investigate, _any_ value in the syt field takes the device to
generate sound. It's reasonable to think that the device just ignores data
blocks in packet with NO_INFO value in its syt field for audio data
processing.

This commit adds a new flag for the quirk to fix regression.

Fixes: 029ffc4294 ("ALSA: oxfw: perform sequence replay for media clock recovery")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20210812022839.42043-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-12 13:40:27 +02:00
Jaroslav Kysela
a2befe9380 ALSA: hda - fix the 'Capture Switch' value change notifications
The original code in the cap_put_caller() function does not
handle correctly the positive values returned from the passed
function for multiple iterations. It means that the change
notifications may be lost.

Fixes: 352f7f914e ("ALSA: hda - Merge Realtek parser code to generic parser")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213851
Cc: <stable@kernel.org>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20210811161441.1325250-1-perex@perex.cz
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-12 13:39:30 +02:00
Daniel Vetter
ffd5caa26f drm/doc/rfc: drop lmem uapi section
We still have quite a bit more work to do with overall reworking of
the ttm-based dg1 code, but the uapi stuff is now finalized with the
latest pull. So remove that.

This also fixes kerneldoc build warnings because we've included the
same headers in two places, resulting in sphinx complaining about
duplicated symbols. This regression has been created when we moved the
uapi definitions to the real include/uapi/ folder in 727ecd99a4
("drm/doc/rfc: drop the i915_gem_lmem.h header")

v2: Fix a few references that I missed, the htmldocs build took
forever.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Tested-by Stephen Rothwell <sfr@canb.auug.org.au> (v1)
References: https://lore.kernel.org/dri-devel/20210603193242.1ce99344@canb.auug.org.au/
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 727ecd99a4 ("drm/doc/rfc: drop the i915_gem_lmem.h header")
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210810142748.1983271-1-daniel.vetter@ffwll.ch
(cherry picked from commit dae2d28832)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-12 06:05:45 -04:00
Matt Roper
24d032e235 drm/i915: Only access SFC_DONE when media domain is not fused off
The SFC_DONE register lives within the corresponding VD0/VD2/VD4/VD6
forcewake domain and is not accessible if the vdbox in that domain is
fused off and the forcewake is not initialized.

This mistake went unnoticed because until recently we were using the
wrong register offset for the SFC_DONE register; once the register
offset was corrected, we started hitting errors like

  <4> [544.989065] i915 0000:cc:00.0: Uninitialized forcewake domain(s) 0x80 accessed at 0x1ce000

on parts with fused-off vdbox engines.

Fixes: e50dbdbfd9 ("drm/i915/tgl: Add SFC instdone to error state")
Fixes: 9c9c6d0ab0 ("drm/i915: Correct SFC_DONE register offset")
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210806174130.1058960-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit c5589bb5dc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Changed Fixes tag to match the cherry-picked 82929a2140]
2021-08-12 06:04:38 -04:00
Ankit Nautiyal
abd9d66a05 drm/i915/display: Fix the 12 BPC bits for PIPE_MISC reg
Till DISPLAY12 the PIPE_MISC bits 5-7 are used to set the
Dithering BPC, with valid values of 6, 8, 10 BPC.
For ADLP+ these bits are used to set the PORT OUTPUT BPC, with valid
values of: 6, 8, 10, 12 BPC, and need to be programmed whether
dithering is enabled or not.

This patch:
-corrects the bits 5-7 for PIPE MISC register for 12 BPC.
-renames the bits and mask to have generic names for these bits for
dithering bpc and port output bpc.

v3: Added a note for MIPI DSI which uses the PIPE_MISC for readout
for pipe_bpp. (Uma Shankar)

v2: Added 'display' to the subject and fixes tag. (Uma Shankar)

Fixes: 756f85cffe ("drm/i915/bdw: Broadwell has PIPEMISC")
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> (v1)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v3.13+

Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210811051857.109723-1-ankit.k.nautiyal@intel.com
(cherry picked from commit 70418a6871)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-12 05:54:44 -04:00
Vladimir Oltean
700fa08da4 net: dsa: sja1105: unregister the MDIO buses during teardown
The call to sja1105_mdiobus_unregister is present in the error path but
absent from the main driver unbind path.

Fixes: 5a8f09748e ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-12 10:53:40 +01:00
Mario Limonciello
c4b68e5139 pinctrl: amd: Fix an issue with shutdown when system set to s0ix
IRQs are getting armed on shutdown causing the system to immediately
wake back up.

Link: https://lkml.org/lkml/2021/8/2/1114
Reported-by: nix.or.die@googlemail.com
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Tested-by: Gabriel Craciunescu <nix.or.die@gmail.com>
CC: Raul E Rangel <rrangel@chromium.org>
Fixes: d62bd5ce12 ("pinctrl: amd: Implement irq_set_wake")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20210809201513.12367-1-mario.limonciello@amd.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-08-12 11:16:40 +02:00
Hoang Le
86704993e6 Revert "tipc: Return the correct errno code"
This reverts commit 0efea3c649 because of:
- The returning -ENOBUF error is fine on socket buffer allocation.
- There is side effect in the calling path
tipc_node_xmit()->tipc_link_xmit() when checking error code returning.

Fixes: 0efea3c649 ("tipc: Return the correct errno code")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-12 09:44:31 +01:00
Mark Brown
48c812e032 net: mscc: Fix non-GPL export of regmap APIs
The ocelot driver makes use of regmap, wrapping it with driver specific
operations that are thin wrappers around the core regmap APIs. These are
exported with EXPORT_SYMBOL, dropping the _GPL from the core regmap
exports which is frowned upon. Add _GPL suffixes to at least the APIs that
are doing register I/O.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-12 09:44:31 +01:00
Georgi Djakov
f753067494 Revert "interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate"
This reverts commit f84f5b6f72, which is
causing regressions on some platforms, preventing them to boot or do a
clean reboot. This is because the above commit is sending also all the
zero bandwidth requests to turn off any resources that might be enabled
unnecessarily, but currently this may turn off interconnects that are
enabled by default, but with no consumer to keep them on.

Let's revert this for now as some platforms are not ready for such
change yet. In the future we can introduce some _ignore_unused option
that could keep also the unused resources on platforms that have only
partial interconnect support and also add .shutdown callbacks to deal
with disabling the resources in the right order.

Reported-by: Stephen Boyd <swboyd@chromium.org>
Reported-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/CAE-0n52iVgX0JjjnYi=NDg49xP961p=+W5R2bmO+2xwRceFhfA@mail.gmail.com
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2021-08-12 09:24:39 +03:00
Linus Torvalds
1746f4db51 Merge tag 'orphans-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull orphan section linker fix from Kees Cook:

 - Handle changes to Clang's Sanitizer section layout (Nathan
   Chancellor)

* tag 'orphans-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  vmlinux.lds.h: Handle clang's module.{c,d}tor sections
2021-08-11 20:00:55 -10:00
Linus Torvalds
fd66ad69ef Merge tag 'seccomp-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp fixes from Kees Cook:

 - Fix typo in user notification documentation (Rodrigo Campos)

 - Fix userspace counter report when using TSYNC (Hsuan-Chi Kuo, Wiktor
   Garbacz)

* tag 'seccomp-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: Fix setting loaded filter count during TSYNC
  Documentation: seccomp: Fix typo in user notification
2021-08-11 19:56:10 -10:00
Dave Airlie
bf71bde473 Merge tag 'amd-drm-fixes-5.14-2021-08-11' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.14-2021-08-11:

amdgpu:
- Yellow carp update
- RAS EEPROM fixes
- BACO/BOCO fixes
- Fix a memory leak in an error path
- Freesync fix
- VCN harvesting fix
- Display fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812022153.4005-1-alexander.deucher@amd.com
2021-08-12 13:38:13 +10:00
jason-jh.lin
da4d4517ba drm/mediatek: Add component_del in OVL and COLOR remove function
Add component_del in OVL and COLOR remove function.

Fixes: ff1395609e ("drm/mediatek: Move mtk_ddp_comp_init() from sub driver to DRM driver")
Signed-off-by: jason-jh.lin <jason-jh.lin@mediatek.com>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2021-08-12 07:00:50 +08:00
Eric Dumazet
b69dd5b378 net: igmp: increase size of mr_ifc_count
Some arches support cmpxchg() on 4-byte and 8-byte only.
Increase mr_ifc_count width to 32bit to fix this problem.

Fixes: 4a2b285e7e ("net: igmp: fix data-race in igmp_ifc_timer_expire()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210811195715.3684218-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-11 15:54:10 -07:00
jason-jh.lin
71ac6f390f drm/mediatek: Add AAL output size configuration
To avoid the output width and height is incorrect,
AAL_OUTPUT_SIZE configuration should be set.

Fixes: 0664d1392c ("drm/mediatek: Add AAL engine basic function")
Signed-off-by: jason-jh.lin <jason-jh.lin@mediatek.com>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2021-08-12 06:43:28 +08:00
Sergey Shtylyov
0271824d9e MAINTAINERS: switch to my OMP email for Renesas Ethernet drivers
I'm still going to continue looking after the Renesas Ethernet drivers and
device tree bindings. Now my new employer, Open Mobile Platform (OMP), will
pay for all my upstream work. Let's switch to my OMP email for the reviews.

Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/9c212711-a0d7-39cd-7840-ff7abf938da1@omp.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-11 15:01:24 -07:00
Neal Cardwell
6de035fec0 tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
Currently if BBR congestion control is initialized after more than 2B
packets have been delivered, depending on the phase of the
tp->delivered counter the tracking of BBR round trips can get stuck.

The bug arises because if tp->delivered is between 2^31 and 2^32 at
the time the BBR congestion control module is initialized, then the
initialization of bbr->next_rtt_delivered to 0 will cause the logic to
believe that the end of the round trip is still billions of packets in
the future. More specifically, the following check will fail
repeatedly:

  !before(rs->prior_delivered, bbr->next_rtt_delivered)

and thus the connection will take up to 2B packets delivered before
that check will pass and the connection will set:

  bbr->round_start = 1;

This could cause many mechanisms in BBR to fail to trigger, for
example bbr_check_full_bw_reached() would likely never exit STARTUP.

This bug is 5 years old and has not been observed, and as a practical
matter this would likely rarely trigger, since it would require
transferring at least 2B packets, or likely more than 3 terabytes of
data, before switching congestion control algorithms to BBR.

This patch is a stable candidate for kernels as far back as v4.9,
when tcp_bbr.c was added.

Fixes: 0f8782ea14 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Kevin Yang <yyd@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20210811024056.235161-1-ncardwell@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-11 15:00:15 -07:00
Wong Vee Khee
2cad5d2ed1 net: pcs: xpcs: fix error handling on failed to allocate memory
Drivers such as sja1105 and stmmac that call xpcs_create() expects an
error returned by the pcs-xpcs module, but this was not the case on
failed to allocate memory.

Fixed this by returning an -ENOMEM instead of a NULL pointer.

Fixes: 3ad1d17154 ("net: dsa: sja1105: migrate to xpcs for SGMII")
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210810085812.1808466-1-vee.khee.wong@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-11 14:50:54 -07:00
Willy Tarreau
6922110d15 net: linkwatch: fix failure to restore device state across suspend/resume
After migrating my laptop from 4.19-LTS to 5.4-LTS a while ago I noticed
that my Ethernet port to which a bond and a VLAN interface are attached
appeared to remain up after resuming from suspend with the cable unplugged
(and that problem still persists with 5.10-LTS).

It happens that the following happens:

  - the network driver (e1000e here) prepares to suspend, calls e1000e_down()
    which calls netif_carrier_off() to signal that the link is going down.
  - netif_carrier_off() adds a link_watch event to the list of events for
    this device
  - the device is completely stopped.
  - the machine suspends
  - the cable is unplugged and the machine brought to another location
  - the machine is resumed
  - the queued linkwatch events are processed for the device
  - the device doesn't yet have the __LINK_STATE_PRESENT bit and its events
    are silently dropped
  - the device is resumed with its link down
  - the upper VLAN and bond interfaces are never notified that the link had
    been turned down and remain up
  - the only way to provoke a change is to physically connect the machine
    to a port and possibly unplug it.

The state after resume looks like this:
  $ ip -br li | egrep 'bond|eth'
  bond0            UP             e8:6a:64:64:64:64 <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP>
  eth0             DOWN           e8:6a:64:64:64:64 <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP>
  eth0.2@eth0      UP             e8:6a:64:64:64:64 <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP>

Placing an explicit call to netdev_state_change() either in the suspend
or the resume code in the NIC driver worked around this but the solution
is not satisfying.

The issue in fact really is in link_watch that loses events while it
ought not to. It happens that the test for the device being present was
added by commit 124eee3f69 ("net: linkwatch: add check for netdevice
being present to linkwatch_do_dev") in 4.20 to avoid an access to
devices that are not present.

Instead of dropping events, this patch proceeds slightly differently by
postponing their handling so that they happen after the device is fully
resumed.

Fixes: 124eee3f69 ("net: linkwatch: add check for netdevice being present to linkwatch_do_dev")
Link: https://lists.openwall.net/netdev/2018/03/15/62
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20210809160628.22623-1-w@1wt.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-11 14:43:16 -07:00
Elliot Berman
14c4c8e415 cfi: Use rcu_read_{un}lock_sched_notrace
If rcu_read_lock_sched tracing is enabled, the tracing subsystem can
perform a jump which needs to be checked by CFI. For example, stm_ftrace
source is enabled as a module and hooks into enabled ftrace events. This
can cause an recursive loop where find_shadow_check_fn ->
rcu_read_lock_sched -> (call to stm_ftrace generates cfi slowpath) ->
find_shadow_check_fn -> rcu_read_lock_sched -> ...

To avoid the recursion, either the ftrace codes needs to be marked with
__no_cfi or CFI should not trace. Use the "_notrace" in CFI to avoid
tracing so that CFI can guard ftrace.

Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Cc: stable@vger.kernel.org
Fixes: cf68fffb66 ("add support for Clang CFI")
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210811155914.19550-1-quic_eberman@quicinc.com
2021-08-11 13:11:12 -07:00
Tejun Heo
0f78399551 Revert "block/mq-deadline: Add cgroup support"
This reverts commit 08a9ad8bf6 ("block/mq-deadline: Add cgroup support")
and a follow-up commit c06bc5a3fb ("block/mq-deadline: Remove a
WARN_ON_ONCE() call"). The added cgroup support has the following issues:

* It breaks cgroup interface file format rule by adding custom elements to a
  nested key-value file.

* It registers mq-deadline as a cgroup-aware policy even though all it's
  doing is collecting per-cgroup stats. Even if we need these stats, this
  isn't the right way to add them.

* It hasn't been reviewed from cgroup side.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-11 13:47:26 -06:00
Nathan Chancellor
848378812e vmlinux.lds.h: Handle clang's module.{c,d}tor sections
A recent change in LLVM causes module_{c,d}tor sections to appear when
CONFIG_K{A,C}SAN are enabled, which results in orphan section warnings
because these are not handled anywhere:

ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_ctor) is being placed in '.text.asan.module_ctor'
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_dtor) is being placed in '.text.asan.module_dtor'
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.tsan.module_ctor) is being placed in '.text.tsan.module_ctor'

Fangrui explains: "the function asan.module_ctor has the SHF_GNU_RETAIN
flag, so it is in a separate section even with -fno-function-sections
(default)".

Place them in the TEXT_TEXT section so that these technologies continue
to work with the newer compiler versions. All of the KASAN and KCSAN
KUnit tests continue to pass after this change.

Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1432
Link: 7b78956224
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Fangrui Song <maskray@google.com>
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210731023107.1932981-1-nathan@kernel.org
2021-08-11 12:19:58 -07:00
Dan Williams
96dcb97d0a Merge branch 'for-5.14/dax' into libnvdimm-fixes
Pick up some small dax cleanups that make some of Ira's follow on work
easier.
2021-08-11 12:04:43 -07:00
Dan Williams
f21453b0ff tools/testing/nvdimm: Fix missing 'fallthrough' warning
Use "fallthrough;" to address:

tools/testing/nvdimm/test/nfit.c: In function ‘nd_intel_test_finish_query’:
tools/testing/nvdimm/test/nfit.c:436:37: warning: this statement may
	fall through [-Wimplicit-fallthrough=]
  436 |                 fw->missed_activate = false;
      |                 ~~~~~~~~~~~~~~~~~~~~^~~~~~~
tools/testing/nvdimm/test/nfit.c:438:9: note: here
  438 |         case FW_STATE_UPDATED:
      |         ^~~~

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Link: https://lore.kernel.org/r/162767522046.3313209.14767278726893995797.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-11 11:55:54 -07:00
Dan Williams
d9cee9f85b libnvdimm/region: Fix label activation vs errors
There are a few scenarios where init_active_labels() can return without
registering deactivate_labels() to run when the region is disabled. In
particular label error injection creates scenarios where a DIMM is
disabled, but labels on other DIMMs in the region become activated.

Arrange for init_active_labels() to always register deactivate_labels().

Reported-by: Krzysztof Kensicki <krzysztof.kensicki@intel.com>
Cc: <stable@vger.kernel.org>
Fixes: bf9bccc14c ("libnvdimm: pmem label sets and namespace instantiation.")
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Link: https://lore.kernel.org/r/162766356450.3223041.1183118139023841447.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-11 11:54:43 -07:00
Dan Williams
b93dfa6bda ACPI: NFIT: Fix support for virtual SPA ranges
Fix the NFIT parsing code to treat a 0 index in a SPA Range Structure as
a special case and not match Region Mapping Structures that use 0 to
indicate that they are not mapped. Without this fix some platform BIOS
descriptions of "virtual disk" ranges do not result in the pmem driver
attaching to the range.

Details:
In addition to typical persistent memory ranges, the ACPI NFIT may also
convey "virtual" ranges. These ranges are indicated by a UUID in the SPA
Range Structure of UUID_VOLATILE_VIRTUAL_DISK, UUID_VOLATILE_VIRTUAL_CD,
UUID_PERSISTENT_VIRTUAL_DISK, or UUID_PERSISTENT_VIRTUAL_CD. The
critical difference between virtual ranges and UUID_PERSISTENT_MEMORY,
is that virtual do not support associations with Region Mapping
Structures.  For this reason the "index" value of virtual SPA Range
Structures is allowed to be 0. If a platform BIOS decides to represent
NVDIMMs with disconnected "Region Mapping Structures" (range-index ==
0), the kernel may falsely associate them with standalone ranges where
the "SPA Range Structure Index" is also zero. When this happens the
driver may falsely require labels where "virtual disks" are expected to
be label-less. I.e. "label-less" is where the namespace-range ==
region-range and the pmem driver attaches with no user action to create
a namespace.

Cc: Jacek Zloch <jacek.zloch@intel.com>
Cc: Lukasz Sobieraj <lukasz.sobieraj@intel.com>
Cc: "Lee, Chun-Yi" <jlee@suse.com>
Cc: <stable@vger.kernel.org>
Fixes: c2f32acdf8 ("acpi, nfit: treat virtual ramdisk SPA as pmem region")
Reported-by: Krzysztof Rusocki <krzysztof.rusocki@intel.com>
Reported-by: Damian Bassa <damian.bassa@intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Link: https://lore.kernel.org/r/162870796589.2521182.1240403310175570220.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-11 11:54:33 -07:00
Hsuan-Chi Kuo
b4d8a58f8d seccomp: Fix setting loaded filter count during TSYNC
The desired behavior is to set the caller's filter count to thread's.
This value is reported via /proc, so this fixes the inaccurate count
exposed to userspace; it is not used for reference counting, etc.

Signed-off-by: Hsuan-Chi Kuo <hsuanchikuo@gmail.com>
Link: https://lore.kernel.org/r/20210304233708.420597-1-hsuanchikuo@gmail.com
Co-developed-by: Wiktor Garbacz <wiktorg@google.com>
Signed-off-by: Wiktor Garbacz <wiktorg@google.com>
Link: https://lore.kernel.org/lkml/20210810125158.329849-1-wiktorg@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Fixes: c818c03b66 ("seccomp: Report number of loaded filters in /proc/$pid/status")
2021-08-11 11:48:28 -07:00
Yonghong Song
2d3a1e3615 bpf: Add rcu_read_lock in bpf_get_current_[ancestor_]cgroup_id() helpers
Currently, if bpf_get_current_cgroup_id() or
bpf_get_current_ancestor_cgroup_id() helper is
called with sleepable programs e.g., sleepable
fentry/fmod_ret/fexit/lsm programs, a rcu warning
may appear. For example, if I added the following
hack to test_progs/test_lsm sleepable fentry program
test_sys_setdomainname:

  --- a/tools/testing/selftests/bpf/progs/lsm.c
  +++ b/tools/testing/selftests/bpf/progs/lsm.c
  @@ -168,6 +168,10 @@ int BPF_PROG(test_sys_setdomainname, struct pt_regs *regs)
          int buf = 0;
          long ret;

  +       __u64 cg_id = bpf_get_current_cgroup_id();
  +       if (cg_id == 1000)
  +               copy_test++;
  +
          ret = bpf_copy_from_user(&buf, sizeof(buf), ptr);
          if (len == -2 && ret == 0 && buf == 1234)
                  copy_test++;

I will hit the following rcu warning:

  include/linux/cgroup.h:481 suspicious rcu_dereference_check() usage!
  other info that might help us debug this:
    rcu_scheduler_active = 2, debug_locks = 1
    1 lock held by test_progs/260:
      #0: ffffffffa5173360 (rcu_read_lock_trace){....}-{0:0}, at: __bpf_prog_enter_sleepable+0x0/0xa0
    stack backtrace:
    CPU: 1 PID: 260 Comm: test_progs Tainted: G           O      5.14.0-rc2+ #176
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
    Call Trace:
      dump_stack_lvl+0x56/0x7b
      bpf_get_current_cgroup_id+0x9c/0xb1
      bpf_prog_a29888d1c6706e09_test_sys_setdomainname+0x3e/0x89c
      bpf_trampoline_6442469132_0+0x2d/0x1000
      __x64_sys_setdomainname+0x5/0x110
      do_syscall_64+0x3a/0x80
      entry_SYSCALL_64_after_hwframe+0x44/0xae

I can get similar warning using bpf_get_current_ancestor_cgroup_id() helper.
syzbot reported a similar issue in [1] for syscall program. Helper
bpf_get_current_cgroup_id() or bpf_get_current_ancestor_cgroup_id()
has the following callchain:
   task_dfl_cgroup
     task_css_set
       task_css_set_check
and we have
   #define task_css_set_check(task, __c)                                   \
           rcu_dereference_check((task)->cgroups,                          \
                   lockdep_is_held(&cgroup_mutex) ||                       \
                   lockdep_is_held(&css_set_lock) ||                       \
                   ((task)->flags & PF_EXITING) || (__c))
Since cgroup_mutex/css_set_lock is not held and the task
is not existing and rcu read_lock is not held, a warning
will be issued. Note that bpf sleepable program is protected by
rcu_read_lock_trace().

The above sleepable bpf programs are already protected
by migrate_disable(). Adding rcu_read_lock() in these
two helpers will silence the above warning.
I marked the patch fixing 95b861a793
("bpf: Allow bpf_get_current_ancestor_cgroup_id for tracing")
which added bpf_get_current_ancestor_cgroup_id() to tracing programs
in 5.14. I think backporting 5.14 is probably good enough as sleepable
progrems are not widely used.

This patch should fix [1] as well since syscall program is a sleepable
program protected with migrate_disable().

 [1] https://lore.kernel.org/bpf/0000000000006d5cab05c7d9bb87@google.com/

Fixes: 95b861a793 ("bpf: Allow bpf_get_current_ancestor_cgroup_id for tracing")
Reported-by: syzbot+7ee5c2c09c284495371f@syzkaller.appspotmail.com
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210810230537.2864668-1-yhs@fb.com
2021-08-11 11:45:43 -07:00
Linus Walleij
86e5fbcaf7 Merge tag 'intel-pinctrl-v5.14-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/pinctrl/intel into fixes
intel-pinctrl for v5.14-2

* Fix the software mapping of GPIOs on Intel Tiger Lake-H

The following is an automated git shortlog grouped by driver:

tigerlake:
 -  Fix GPIO mapping for newer version of software
2021-08-11 15:10:32 +02:00
Damien Le Moal
31697ef7f3 pinctrl: k210: Fix k210_fpioa_probe()
In k210_fpioa_probe(), add missing calls to clk_disable_unprepare() in
case of error after cenabling the clk and pclk clocks. Also add missing
error handling when enabling pclk.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: d4c34d09ab ("pinctrl: Add RISC-V Canaan Kendryte K210 FPIOA driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Link: https://lore.kernel.org/r/20210806004311.52859-1-damien.lemoal@wdc.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-08-11 15:03:53 +02:00
Eli Cohen
879753c816 vdpa/mlx5: Fix queue type selection logic
get_queue_type() comments that splict virtqueue is preferred, however,
the actual logic preferred packed virtqueues. Since firmware has not
supported packed virtqueues we ended up using split virtqueues as was
desired.

Since we do not advertise support for packed virtqueues, we add a check
to verify split virtqueues are indeed supported.

Fixes: 1a86b377aa ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210811053759.66752-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-11 06:44:43 -04:00
Eli Cohen
08dbd56602 vdpa/mlx5: Avoid destroying MR on empty iotlb
The current code treats an empty iotlb provdied in set_map() as a
special case and destroy the memory region object. This must not be done
since the virtqueue objects reference this MR. Doing so will cause the
driver unload to emit errors and log timeouts caused by the firmware
complaining on busy resources.

This patch treats an empty iotlb as any other change of mapping. In this
case, mlx5_vdpa_create_mr() will fail and the entire set_map() call to
fail.

This issue has not been encountered before but was seen to occur in a
non-official version of qemu. Since qemu is a userspace program, the
driver must protect against such case.

Fixes: 94abbccdf2 ("vdpa/mlx5: Add shared memory registration code")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210811053713.66658-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-11 06:44:41 -04:00
Michael S. Tsirkin
a24ce06c70 tools/virtio: fix build
We use a spinlock now so add a stub.
Ignore bogus uninitialized variable warnings.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-11 06:44:24 -04:00
Michael S. Tsirkin
f8ce72632f virtio_ring: pull in spinlock header
we use a spinlock now pull in the correct header to
make virtio_ring.c self sufficient.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-11 06:44:24 -04:00
Michael S. Tsirkin
ea2f6af165 vringh: pull in spinlock header
we use a spinlock now pull in the correct header to
make vring.h self sufficient.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-11 06:44:24 -04:00
Xie Yongji
82e89ea077 virtio-blk: Add validation for block size in config space
An untrusted device might presents an invalid block size
in configuration space. This tries to add validation for it
in the validate callback and clear the VIRTIO_BLK_F_BLK_SIZE
feature bit if the value is out of the supported range.

And we also double check the value in virtblk_probe() in
case that it's changed after the validation.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210809101609.148-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-08-11 06:44:24 -04:00
Neeraj Upadhyay
e74cfa91f4 vringh: Use wiov->used to check for read/write desc order
As __vringh_iov() traverses a descriptor chain, it populates
each descriptor entry into either read or write vring iov
and increments that iov's ->used member. So, as we iterate
over a descriptor chain, at any point, (riov/wriov)->used
value gives the number of descriptor enteries available,
which are to be read or written by the device. As all read
iovs must precede the write iovs, wiov->used should be zero
when we are traversing a read descriptor. Current code checks
for wiov->i, to figure out whether any previous entry in the
current descriptor chain was a write descriptor. However,
iov->i is only incremented, when these vring iovs are consumed,
at a later point, and remain 0 in __vringh_iov(). So, correct
the check for read and write descriptor order, to use
wiov->used.

Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Link: https://lore.kernel.org/r/1624591502-4827-1-git-send-email-neeraju@codeaurora.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-11 06:44:24 -04:00
Vincent Whitchurch
cb5d2c1f6c virtio_vdpa: reject invalid vq indices
Do not call vDPA drivers' callbacks with vq indicies larger than what
the drivers indicate that they support.  vDPA drivers do not bounds
check the indices.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/20210701114652.21956-1-vincent.whitchurch@axis.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-08-11 06:44:23 -04:00
Xie Yongji
c8d182bd38 vdpa: Add documentation for vdpa_alloc_device() macro
The return value of vdpa_alloc_device() macro is not very
clear, so that most of callers did the wrong check. Let's
add some comments to better document it.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210715080026.242-4-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-08-11 06:44:23 -04:00
Xie Yongji
1057afa012 vDPA/ifcvf: Fix return value check for vdpa_alloc_device()
The vdpa_alloc_device() returns an error pointer upon
failure, not NULL. To handle the failure correctly, this
replaces NULL check with IS_ERR() check and propagate the
error upwards.

Fixes: 5a2414bc45 ("virtio: Intel IFC VF driver for VDPA")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210715080026.242-3-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-08-11 06:44:23 -04:00
Xie Yongji
9632e78e82 vp_vdpa: Fix return value check for vdpa_alloc_device()
The vdpa_alloc_device() returns an error pointer upon
failure, not NULL. To handle the failure correctly, this
replaces NULL check with IS_ERR() check and propagate the
error upwards.

Fixes: 64b9f64f80 ("vdpa: introduce virtio pci driver")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210715080026.242-2-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-08-11 06:44:23 -04:00
Xie Yongji
2b847f2114 vdpa_sim: Fix return value check for vdpa_alloc_device()
The vdpa_alloc_device() returns an error pointer upon
failure, not NULL. To handle the failure correctly, this
replaces NULL check with IS_ERR() check and propagate the
error upwards.

Fixes: 2c53d0f64c ("vdpasim: vDPA device simulator")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210715080026.242-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-08-11 06:44:23 -04:00
Xie Yongji
f7ad318ea0 vhost: Fix the calculation in vhost_overflow()
This fixes the incorrect calculation for integer overflow
when the last address of iova range is 0xffffffff.

Fixes: ec33d031a1 ("vhost: detect 32 bit integer wrap around")
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210728130756.97-2-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-11 06:44:15 -04:00
Andrew Delgadillo
017f5fb9ce arm64: clean vdso & vdso32 files
commit a5b8ca97fb ("arm64: do not descend to vdso directories twice")
changes the cleaning behavior of arm64's vdso files, in that vdso.lds,
vdso.so, and vdso.so.dbg are not removed upon a 'make clean/mrproper':

$ make defconfig ARCH=arm64
$ make ARCH=arm64
$ make mrproper ARCH=arm64
$ git clean -nxdf
Would remove arch/arm64/kernel/vdso/vdso.lds
Would remove arch/arm64/kernel/vdso/vdso.so
Would remove arch/arm64/kernel/vdso/vdso.so.dbg

To remedy this, manually descend into arch/arm64/kernel/vdso upon
cleaning.

After this commit:
$ make defconfig ARCH=arm64
$ make ARCH=arm64
$ make mrproper ARCH=arm64
$ git clean -nxdf
<empty>

Similar results are obtained for the vdso32 equivalent.

Signed-off-by: Andrew Delgadillo <adelg@google.com>
Cc: stable@vger.kernel.org
Fixes: a5b8ca97fb ("arm64: do not descend to vdso directories twice")
Link: https://lore.kernel.org/r/20210810231755.1743524-1-adelg@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-08-11 11:04:55 +01:00
Tony Lindgren
06a089ef64 bus: ti-sysc: Fix error handling for sysc_check_active_timer()
We have changed the return type for sysc_check_active_timer() from -EBUSY
to -ENXIO, but the gpt12 system timer fix still checks for -EBUSY. We are
also not returning on other errors like we did earlier as noted by
Pavel Machek <pavel@denx.de>.

Commit 3ff340e24c ("bus: ti-sysc: Fix gpt12 system timer issue with
reserved status") should have been updated for commit 65fb736761
("bus: ti-sysc: suppress err msg for timers used as clockevent/source").

Let's fix the issue by checking for -ENXIO and returning on any other
errors as suggested by Pavel Machek <pavel@denx.de>.

Fixes: 3ff340e24c ("bus: ti-sysc: Fix gpt12 system timer issue with reserved status")
Depends-on: 65fb736761 ("bus: ti-sysc: suppress err msg for timers used as clockevent/source")
Reported-by: Pavel Machek <pavel@denx.de>
Reviewed-by: Pavel Machek (CIP) <pavel@denx.de>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Jarkko Nikula <jarkko.nikula@bitmer.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2021-08-11 08:34:46 +03:00
Dave Airlie
1648740b2e Merge tag 'mediatek-drm-fixes-5.14' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes
Mediatek DRM Fixes for Linux 5.14

1. Fix dpi bridge bug.
2. Fix cursor plane no update.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210809150604.32426-1-chunkuang.hu@kernel.org
2021-08-11 14:11:51 +10:00
Linus Torvalds
761c6d7ec8 Merge tag 'arc-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:

 - Fix FPU_STATUS update

 - Update my email address

 - Other spellos and fixes

* tag 'arc-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  MAINTAINERS: update Vineet's email address
  ARC: fp: set FPU_STATUS.FWE to enable FPU_STATUS update on context switch
  ARC: Fix CONFIG_STACKDEPOT
  arc: Fix spelling mistake and grammar in Kconfig
  arc: Prefer unsigned int to bare use of unsigned
2021-08-10 16:34:34 -10:00
Hu Haowen
3f12cc4bb0 Documentation: i2c: add i2c-sysfs into index
Append i2c-sysfs to toctree in order to get rid of building warnings.

Fixes: 31df7195b1 ("Documentation: i2c: Add doc for I2C sysfs")
Signed-off-by: Hu Haowen <src.res@email.cn>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-08-10 22:58:32 +02:00
Greg Kroah-Hartman
86ff25ed6c i2c: dev: zero out array used for i2c reads from userspace
If an i2c driver happens to not provide the full amount of data that a
user asks for, it is possible that some uninitialized data could be sent
to userspace.  While all in-kernel drivers look to be safe, just be sure
by initializing the buffer to zero before it is passed to the i2c driver
so that any future drivers will not have this issue.

Also properly copy the amount of data recvieved to the userspace buffer,
as pointed out by Dan Carpenter.

Reported-by: Eric Dumazet <edumazet@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-08-10 22:54:10 +02:00
Dhananjay Phadke
bba676cc0b i2c: iproc: fix race between client unreg and tasklet
Similar NULL deref was originally fixed by graceful teardown sequence -

https://lore.kernel.org/linux-i2c/1597106560-79693-1-git-send-email-dphadke@linux.microsoft.com

After this, a tasklet was added to take care of FIFO full condition for large i2c
transaction.

https://lore.kernel.org/linux-arm-kernel/20201102035433.6774-1-rayagonda.kokatanur@broadcom.com/

This introduced regression, a new race condition between tasklet enabling
interrupts and client unreg teardown sequence.

Kill tasklet before unreg_slave() masks bits in IE_OFFSET.
Updated teardown sequence -
(1) disable_irq()
(2) Kill tasklet
(3) Mask event enable bits in control reg
(4) Erase slave address (avoid further writes to rx fifo)
(5) Flush tx and rx FIFOs
(6) Clear pending event (interrupt) bits in status reg
(7) Set client pointer to NULL
(8) enable_irq()

 --

 Unable to handle kernel read from unreadable memory at virtual address 0000000000000320
 Mem abort info:
   ESR = 0x96000004
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
 Data abort info:
   ISV = 0, ISS = 0x00000004
   CM = 0, WnR = 0
 user pgtable: 4k pages, 48-bit VAs, pgdp=000000009212a000
 [0000000000000320] pgd=0000000000000000, p4d=0000000000000000
 Internal error: Oops: 96000004 [#1] SMP
 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O
 Hardware name: Overlake (DT)
 pstate: 40400085 (nZcv daIf +PAN -UAO -TCO BTYPE=--)
 pc : bcm_iproc_i2c_slave_isr+0x2b8/0x8e4
 lr : bcm_iproc_i2c_slave_isr+0x1c8/0x8e4
 sp : ffff800010003e70
 x29: ffff800010003e80 x28: ffffda017acdc000
 x27: ffffda017b0ae000 x26: ffff800010004000
 x25: ffff800010000000 x24: ffffda017af4a168
 x23: 0000000000000073 x22: 0000000000000000
 x21: 0000000001400000 x20: 0000000001000000
 x19: ffff06f09583f880 x18: 00000000fa83b2da
 x17: 000000000000b67e x16: 0000000002edb2f3
 x15: 00000000000002c7 x14: 00000000000002c7
 x13: 0000000000000006 x12: 0000000000000033
 x11: 0000000000000000 x10: 0000000001000000
 x9 : 0000000003289312 x8 : 0000000003289311
 x7 : 02d0cd03a303adbc x6 : 02d18e7f0a4dfc6c
 x5 : 02edb2f33f76ea68 x4 : 00000000fa83b2da
 x3 : ffffda017af43cd0 x2 : ffff800010003e74
 x1 : 0000000001400000 x0 : 0000000000000000
 Call trace:
  bcm_iproc_i2c_slave_isr+0x2b8/0x8e4
  bcm_iproc_i2c_isr+0x178/0x290
  __handle_irq_event_percpu+0xd0/0x200
  handle_irq_event+0x60/0x1a0
  handle_fasteoi_irq+0x130/0x220
  __handle_domain_irq+0x8c/0xcc
  gic_handle_irq+0xc0/0x120
  el1_irq+0xcc/0x180
  finish_task_switch+0x100/0x1d8
  __schedule+0x61c/0x7a0
  schedule_idle+0x28/0x44
  do_idle+0x254/0x28c
  cpu_startup_entry+0x28/0x2c
  rest_init+0xc4/0xd0
  arch_call_rest_init+0x14/0x1c
  start_kernel+0x33c/0x3b8
 Code: f9423260 910013e2 11000509 b9047a69 (f9419009)
 ---[ end trace 4781455b2a7bec15 ]---

Fixes: 4d658451c9 ("i2c: iproc: handle rx fifo full interrupt")

Signed-off-by: Dhananjay Phadke <dphadke@linux.microsoft.com>
Acked-by: Ray Jui <ray.jui@broadcom.com>
Acked-by: Rayagonda Kokatanur <rayagonda.kokatanur@broadcom.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-08-10 22:40:55 +02:00
Yang Yingliang
519133debc net: bridge: fix memleak in br_add_if()
I got a memleak report:

BUG: memory leak
unreferenced object 0x607ee521a658 (size 240):
comm "syz-executor.0", pid 955, jiffies 4294780569 (age 16.449s)
hex dump (first 32 bytes, cpu 1):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000d830ea5a>] br_multicast_add_port+0x1c2/0x300 net/bridge/br_multicast.c:1693
[<00000000274d9a71>] new_nbp net/bridge/br_if.c:435 [inline]
[<00000000274d9a71>] br_add_if+0x670/0x1740 net/bridge/br_if.c:611
[<0000000012ce888e>] do_set_master net/core/rtnetlink.c:2513 [inline]
[<0000000012ce888e>] do_set_master+0x1aa/0x210 net/core/rtnetlink.c:2487
[<0000000099d1cafc>] __rtnl_newlink+0x1095/0x13e0 net/core/rtnetlink.c:3457
[<00000000a01facc0>] rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3488
[<00000000acc9186c>] rtnetlink_rcv_msg+0x369/0xa10 net/core/rtnetlink.c:5550
[<00000000d4aabb9c>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504
[<00000000bc2e12a3>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
[<00000000bc2e12a3>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340
[<00000000e4dc2d0e>] netlink_sendmsg+0x789/0xc70 net/netlink/af_netlink.c:1929
[<000000000d22c8b3>] sock_sendmsg_nosec net/socket.c:654 [inline]
[<000000000d22c8b3>] sock_sendmsg+0x139/0x170 net/socket.c:674
[<00000000e281417a>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350
[<00000000237aa2ab>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404
[<000000004f2dc381>] __sys_sendmsg+0xd3/0x190 net/socket.c:2433
[<0000000005feca6c>] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:47
[<000000007304477d>] entry_SYSCALL_64_after_hwframe+0x44/0xae

On error path of br_add_if(), p->mcast_stats allocated in
new_nbp() need be freed, or it will be leaked.

Fixes: 1080ab95e3 ("net: bridge: add support for IGMP/MLD stats and export them via netlink")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20210809132023.978546-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-10 13:25:14 -07:00
Vladimir Oltean
c35b57ceff net: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted by drivers towards the bridge
The blamed commit added a new field to struct switchdev_notifier_fdb_info,
but did not make sure that all call paths set it to something valid.
For example, a switchdev driver may emit a SWITCHDEV_FDB_ADD_TO_BRIDGE
notifier, and since the 'is_local' flag is not set, it contains junk
from the stack, so the bridge might interpret those notifications as
being for local FDB entries when that was not intended.

To avoid that now and in the future, zero-initialize all
switchdev_notifier_fdb_info structures created by drivers such that all
newly added fields to not need to touch drivers again.

Fixes: 2c4eca3ef7 ("net: bridge: switchdev: include local flag in FDB notifications")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20210810115024.1629983-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-10 13:22:57 -07:00
Nikolay Aleksandrov
45a687879b net: bridge: fix flags interpretation for extern learn fdb entries
Ignore fdb flags when adding port extern learn entries and always set
BR_FDB_LOCAL flag when adding bridge extern learn entries. This is
closest to the behaviour we had before and avoids breaking any use cases
which were allowed.

This patch fixes iproute2 calls which assume NUD_PERMANENT and were
allowed before, example:
$ bridge fdb add 00:11:22:33:44:55 dev swp1 extern_learn

Extern learn entries are allowed to roam, but do not expire, so static
or dynamic flags make no sense for them.

Also add a comment for future reference.

Fixes: eb100e0e24 ("net: bridge: allow to add externally learned entries from user-space")
Fixes: 0541a62932 ("net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry")
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210810110010.43859-1-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-10 11:29:39 -07:00
Sean Christopherson
7b9cae027b KVM: VMX: Use current VMCS to query WAITPKG support for MSR emulation
Use the secondary_exec_controls_get() accessor in vmx_has_waitpkg() to
effectively get the controls for the current VMCS, as opposed to using
vmx->secondary_exec_controls, which is the cached value of KVM's desired
controls for vmcs01 and truly not reflective of any particular VMCS.

While the waitpkg control is not dynamic, i.e. vmcs01 will always hold
the same waitpkg configuration as vmx->secondary_exec_controls, the same
does not hold true for vmcs02 if the L1 VMM hides the feature from L2.
If L1 hides the feature _and_ does not intercept MSR_IA32_UMWAIT_CONTROL,
L2 could incorrectly read/write L1's virtual MSR instead of taking a #GP.

Fixes: 6e3ba4abce ("KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210810171952.2758100-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-10 13:32:09 -04:00
Linus Torvalds
9e723c5380 Merge tag 'platform-drivers-x86-v5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
 "Small set of pdx86 fixes for 5.14"

* tag 'platform-drivers-x86-v5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: pcengines-apuv2: Add missing terminating entries to gpio-lookup tables
  platform/x86: Make dual_accel_detect() KIOX010A + KIOX020A detect more robust
  platform/x86: Add and use a dual_accel_detect() helper
2021-08-10 09:46:33 -07:00
Linus Torvalds
b3f0ccc59c Merge tag 'ovl-fixes-5.14-rc6-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
 "Fix several bugs in overlayfs"

* tag 'ovl-fixes-5.14-rc6-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: prevent private clone if bind mount is not allowed
  ovl: fix uninitialized pointer read in ovl_lookup_real_one()
  ovl: fix deadlock in splice write
  ovl: skip stale entries in merge dir cache iteration
2021-08-10 09:40:09 -07:00
Xie Yongji
0e398290cf vhost-vdpa: Fix integer overflow in vhost_vdpa_process_iotlb_update()
The "msg->iova + msg->size" addition can have an integer overflow
if the iotlb message is from a malicious user space application.
So let's fix it.

Fixes: 1b48dc03e5 ("vhost: vdpa: report iova range")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210728130756.97-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-10 11:52:23 -04:00
Parav Pandit
43bb40c5b9 virtio_pci: Support surprise removal of virtio pci device
When a virtio pci device undergo surprise removal (aka async removal in
PCIe spec), mark the device as broken so that any upper layer drivers can
abort any outstanding operation.

When a virtio net pci device undergo surprise removal which is used by a
NetworkManager, a below call trace was observed.

kernel:watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [kworker/1:1:27059]
watchdog: BUG: soft lockup - CPU#1 stuck for 52s! [kworker/1:1:27059]
CPU: 1 PID: 27059 Comm: kworker/1:1 Tainted: G S      W I  L    5.13.0-hotplug+ #8
Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.9.4 11/06/2020
Workqueue: events linkwatch_event
RIP: 0010:virtnet_send_command+0xfc/0x150 [virtio_net]
Call Trace:
 virtnet_set_rx_mode+0xcf/0x2a7 [virtio_net]
 ? __hw_addr_create_ex+0x85/0xc0
 __dev_mc_add+0x72/0x80
 igmp6_group_added+0xa7/0xd0
 ipv6_mc_up+0x3c/0x60
 ipv6_find_idev+0x36/0x80
 addrconf_add_dev+0x1e/0xa0
 addrconf_dev_config+0x71/0x130
 addrconf_notify+0x1f5/0xb40
 ? rtnl_is_locked+0x11/0x20
 ? __switch_to_asm+0x42/0x70
 ? finish_task_switch+0xaf/0x2c0
 ? raw_notifier_call_chain+0x3e/0x50
 raw_notifier_call_chain+0x3e/0x50
 netdev_state_change+0x67/0x90
 linkwatch_do_dev+0x3c/0x50
 __linkwatch_run_queue+0xd2/0x220
 linkwatch_event+0x21/0x30
 process_one_work+0x1c8/0x370
 worker_thread+0x30/0x380
 ? process_one_work+0x370/0x370
 kthread+0x118/0x140
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x1f/0x30

Hence, add the ability to abort the command on surprise removal
which prevents infinite loop and system lockup.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Link: https://lore.kernel.org/r/20210721142648.1525924-5-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-10 11:50:55 -04:00
Parav Pandit
0e566c8f0f virtio: Protect vqs list access
VQs may be accessed to mark the device broken while they are
created/destroyed. Hence protect the access to the vqs list.

Fixes: e2dcdfe95c ("virtio: virtio_break_device() to mark all virtqueues broken.")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Link: https://lore.kernel.org/r/20210721142648.1525924-4-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-10 11:50:55 -04:00
Parav Pandit
249f255476 virtio: Keep vring_del_virtqueue() mirror of VQ create
Keep the vring_del_virtqueue() mirror of the create routines.
i.e. to delete list entry first as it is added last during the create
routine.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Link: https://lore.kernel.org/r/20210721142648.1525924-3-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-10 11:50:55 -04:00
Parav Pandit
60f0779862 virtio: Improve vq->broken access to avoid any compiler optimization
Currently vq->broken field is read by virtqueue_is_broken() in busy
loop in one context by virtnet_send_command().

vq->broken is set to true in other process context by
virtio_break_device(). Reader and writer are accessing it without any
synchronization. This may lead to a compiler optimization which may
result to optimize reading vq->broken only once.

Hence, force reading vq->broken on each invocation of
virtqueue_is_broken() and also force writing it so that such
update is visible to the readers.

It is a theoretical fix that isn't yet encountered in the field.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Link: https://lore.kernel.org/r/20210721142648.1525924-2-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-10 11:50:54 -04:00
Ronnie Sahlberg
981567bd96 cifs: use the correct max-length for dentry_path_raw()
RHBZ: 1972502

PATH_MAX is 4096 but PAGE_SIZE can be >4096 on some architectures
such as ppc and would thus write beyond the end of the actual object.

Cc: <stable@vger.kernel.org>
Reported-by: Xiaoli Feng <xifeng@redhat.com>
Suggested-by: Brian foster <bfoster@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-10 10:45:50 -05:00
Jakub Kicinski
2e273b0996 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
bpf 2021-08-10

We've added 5 non-merge commits during the last 2 day(s) which contain
a total of 7 files changed, 27 insertions(+), 15 deletions(-).

1) Fix missing bpf_read_lock_trace() context for BPF loader progs, from Yonghong Song.

2) Fix corner case where BPF prog retrieves wrong local storage, also from Yonghong Song.

3) Restrict availability of BPF write_user helper behind lockdown, from Daniel Borkmann.

4) Fix multiple kernel-doc warnings in BPF core, from Randy Dunlap.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, core: Fix kernel-doc notation
  bpf: Fix potentially incorrect results with bpf_get_local_storage()
  bpf: Add missing bpf_read_[un]lock_trace() for syscall program
  bpf: Add lockdown check for probe_write_user helper
  bpf: Add _kernel suffix to internal lockdown_bpf_read
====================

Link: https://lore.kernel.org/r/20210810144025.22814-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-10 07:53:11 -07:00
Anson Jacob
0cde63a8fc drm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work
Replace GFP_KERNEL with GFP_ATOMIC as amdgpu_dm_irq_schedule_work
can't sleep.

BUG: sleeping function called from invalid context at include/linux/sched/mm.h:196
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 253, name: kworker/6:1H
CPU: 6 PID: 253 Comm: kworker/6:1H Tainted: G        W  OE     5.11.0-promotion_2021_06_07-18_36_28_prelim_revert_retrain #8
Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 3405 02/01/2021
Workqueue: events_highpri dm_irq_work_func [amdgpu]
Call Trace:
 <IRQ>
 dump_stack+0x5e/0x74
 ___might_sleep.cold+0x87/0x98
 __might_sleep+0x4b/0x80
 kmem_cache_alloc_trace+0x390/0x4f0
 amdgpu_dm_irq_handler+0x171/0x230 [amdgpu]
 amdgpu_irq_dispatch+0xc0/0x1e0 [amdgpu]
 amdgpu_ih_process+0x81/0x100 [amdgpu]
 amdgpu_irq_handler+0x26/0xa0 [amdgpu]
 __handle_irq_event_percpu+0x49/0x190
 ? __hrtimer_get_next_event+0x4d/0x80
 handle_irq_event_percpu+0x33/0x80
 handle_irq_event+0x33/0x60
 handle_edge_irq+0x82/0x190
 asm_call_irq_on_stack+0x12/0x20
 </IRQ>
 common_interrupt+0xbb/0x140
 asm_common_interrupt+0x1e/0x40
RIP: 0010:amdgpu_device_rreg.part.0+0x44/0xf0 [amdgpu]
Code: 53 48 89 fb 4c 3b af c8 08 00 00 73 6d 83 e2 02 75 0d f6 87 40 62 01 00 10 0f 85 83 00 00 00 4c 03 ab d0 08 00 00 45 8b 6d 00 <8b> 05 3e b6 52 00 85 c0 7e 62 48 8b 43 08 0f b7 70 3e 65 8b 05 e3
RSP: 0018:ffffae7740fff9e8 EFLAGS: 00000286
RAX: ffffffffc05ee610 RBX: ffff8aaf8f620000 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000005430 RDI: ffff8aaf8f620000
RBP: ffffae7740fffa08 R08: 0000000000000001 R09: 000000000000000a
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000005430
R13: 0000000071000000 R14: 0000000000000001 R15: 0000000000005430
 ? amdgpu_cgs_write_register+0x20/0x20 [amdgpu]
 amdgpu_device_rreg+0x17/0x20 [amdgpu]
 amdgpu_cgs_read_register+0x14/0x20 [amdgpu]
 dm_read_reg_func+0x38/0xb0 [amdgpu]
 generic_reg_wait+0x80/0x160 [amdgpu]
 dce_aux_transfer_raw+0x324/0x7c0 [amdgpu]
 dc_link_aux_transfer_raw+0x43/0x50 [amdgpu]
 dm_dp_aux_transfer+0x87/0x110 [amdgpu]
 drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
 drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
 drm_dp_get_one_sb_msg+0x349/0x480 [drm_kms_helper]
 drm_dp_mst_hpd_irq+0xc5/0xe40 [drm_kms_helper]
 ? drm_dp_mst_hpd_irq+0xc5/0xe40 [drm_kms_helper]
 dm_handle_hpd_rx_irq+0x184/0x1a0 [amdgpu]
 ? dm_handle_hpd_rx_irq+0x184/0x1a0 [amdgpu]
 handle_hpd_rx_irq+0x195/0x240 [amdgpu]
 ? __switch_to_asm+0x42/0x70
 ? __switch_to+0x131/0x450
 dm_irq_work_func+0x19/0x20 [amdgpu]
 process_one_work+0x209/0x400
 worker_thread+0x4d/0x3e0
 ? cancel_delayed_work+0xa0/0xa0
 kthread+0x124/0x160
 ? kthread_park+0x90/0x90
 ret_from_fork+0x22/0x30

Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-10 10:43:06 -04:00
Eric Bernstein
c90f6263f5 drm/amd/display: Remove invalid assert for ODM + MPC case
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Eric Bernstein <eric.bernstein@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-10 10:40:47 -04:00
Kenneth Feng
3042f80c6c drm/amd/pm: bug fix for the runtime pm BACO
In some systems only MACO is supported. This is to fix the problem
that runtime pm is enabled but BACO is not supported. MACO will be
handled seperately.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-10 10:39:33 -04:00
Alex Deucher
7cbe08a930 drm/amdgpu: handle VCN instances when harvesting (v2)
There may be multiple instances and only one is harvested.

v2: fix typo in commit message

Fixes: 83a0b86391 ("drm/amdgpu: add judgement when add ip blocks (v2)")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-10 10:38:10 -04:00
Bixuan Cui
dbbc93576e genirq/msi: Ensure deactivation on teardown
msi_domain_alloc_irqs() invokes irq_domain_activate_irq(), but
msi_domain_free_irqs() does not enforce deactivation before tearing down
the interrupts.

This happens when PCI/MSI interrupts are set up and never used before being
torn down again, e.g. in error handling pathes. The only place which cleans
that up is the error handling path in msi_domain_alloc_irqs().

Move the cleanup from msi_domain_alloc_irqs() into msi_domain_free_irqs()
to cure that.

Fixes: f3b0946d62 ("genirq/msi: Make sure PCI MSIs are activated early")
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210518033117.78104-1-cuibixuan@huawei.com
2021-08-10 15:55:19 +02:00
Rodrigo Vivi
d927ae73e1 Merge tag 'gvt-fixes-2021-08-10' of https://github.com/intel/gvt-linux into drm-intel-fixes
gvt-fixes-2021-08-10

- Fix windows VM hang issue for atomics workaround (Zhenyu)

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210810050133.GO13928@zhen-hp.sh.intel.com
2021-08-10 09:49:15 -04:00
Ben Dai
b9cc7d8a46 genirq/timings: Prevent potential array overflow in __irq_timings_store()
When the interrupt interval is greater than 2 ^ PREDICTION_BUFFER_SIZE *
PREDICTION_FACTOR us and less than 1s, the calculated index will be greater
than the length of irqs->ema_time[]. Check the calculated index before
using it to prevent array overflow.

Fixes: 23aa3b9a6b ("genirq/timings: Encapsulate storing function")
Signed-off-by: Ben Dai <ben.dai@unisoc.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210425150903.25456-1-ben.dai9703@gmail.com
2021-08-10 15:39:00 +02:00
Andre Przywara
d1dee81416 pinctrl: sunxi: Don't underestimate number of functions
When we are building all the various pinctrl structures for the
Allwinner pinctrl devices, we do some estimation about the maximum
number of distinct function (names) that we will need.

So far we take the number of pins as an upper bound, even though we
can actually have up to four special functions per pin. This wasn't a
problem until now, since we indeed have typically far more pins than
functions, and most pins share common functions.

However the H616 "-r" pin controller has only two pins, but four
functions, so we run over the end of the array when we are looking for
a matching function name in sunxi_pinctrl_add_function - there is no
NULL sentinel left that would terminate the loop:

[    8.200648] Unable to handle kernel paging request at virtual address fffdff7efbefaff5
[    8.209179] Mem abort info:
....
[    8.368456] Call trace:
[    8.370925]  __pi_strcmp+0x90/0xf0
[    8.374559]  sun50i_h616_r_pinctrl_probe+0x1c/0x28
[    8.379557]  platform_probe+0x68/0xd8

Do an actual worst case allocation (4 functions per pin, three common
functions and the sentinel) for the initial array allocation. This is
now heavily overestimating the number of functions in the common case,
but we will reallocate this array later with the actual number of
functions, so it's only temporarily.

Fixes: 561c1cf17c ("pinctrl: sunxi: Add support for the Allwinner H616-R pin controller")
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20210722132548.22121-1-andre.przywara@arm.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-08-10 14:55:35 +02:00
Jeremy Szu
d07149aba2 ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 650 G8 Notebook PC
The HP ProBook 650 G8 Notebook PC is using ALC236 codec which is
using 0x02 to control mute LED and 0x01 to control micmute LED.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210810100846.65844-1-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-10 14:23:10 +02:00
David S. Miller
09c7fd5218 Merge branch 'fdb-backpressure-fixes'
Vladimir Oltean says:

====================
Fix broken backpressure during FDB dump in DSA drivers

rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into
multiple netlink skbs if the buffer provided by user space is too small
(one buffer will typically handle a few hundred FDB entries).

When the current buffer becomes full, nlmsg_put() in
dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index
of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that
point, and then the dump resumes on the same port with a new skb, and
FDB entries up to the saved index are simply skipped.

Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to
drivers, then drivers must check for the -EMSGSIZE error code returned
by it. Otherwise, when a netlink skb becomes full, DSA will no longer
save newly dumped FDB entries to it, but the driver will continue
dumping. So FDB entries will be missing from the dump.

DSA is one of the few switchdev drivers that have an .ndo_fdb_dump
implementation, because of the assumption that the hardware and software
FDBs cannot be efficiently kept in sync via SWITCHDEV_FDB_ADD_TO_BRIDGE.
Other drivers with a home-cooked .ndo_fdb_dump implementation are
ocelot and dpaa2-switch. These appear to do the correct thing, as do the
other DSA drivers, so nothing else appears to need fixing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 13:17:22 +01:00
Vladimir Oltean
21b52fed92 net: dsa: sja1105: fix broken backpressure in .port_fdb_dump
rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into
multiple netlink skbs if the buffer provided by user space is too small
(one buffer will typically handle a few hundred FDB entries).

When the current buffer becomes full, nlmsg_put() in
dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index
of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that
point, and then the dump resumes on the same port with a new skb, and
FDB entries up to the saved index are simply skipped.

Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to
drivers, then drivers must check for the -EMSGSIZE error code returned
by it. Otherwise, when a netlink skb becomes full, DSA will no longer
save newly dumped FDB entries to it, but the driver will continue
dumping. So FDB entries will be missing from the dump.

Fix the broken backpressure by propagating the "cb" return code and
allow rtnl_fdb_dump() to restart the FDB dump with a new skb.

Fixes: 291d1e72b7 ("net: dsa: sja1105: Add support for FDB and MDB management")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 13:17:22 +01:00
Vladimir Oltean
871a73a1c8 net: dsa: lantiq: fix broken backpressure in .port_fdb_dump
rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into
multiple netlink skbs if the buffer provided by user space is too small
(one buffer will typically handle a few hundred FDB entries).

When the current buffer becomes full, nlmsg_put() in
dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index
of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that
point, and then the dump resumes on the same port with a new skb, and
FDB entries up to the saved index are simply skipped.

Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to
drivers, then drivers must check for the -EMSGSIZE error code returned
by it. Otherwise, when a netlink skb becomes full, DSA will no longer
save newly dumped FDB entries to it, but the driver will continue
dumping. So FDB entries will be missing from the dump.

Fix the broken backpressure by propagating the "cb" return code and
allow rtnl_fdb_dump() to restart the FDB dump with a new skb.

Fixes: 58c59ef9e9 ("net: dsa: lantiq: Add Forwarding Database access")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 13:17:22 +01:00
Vladimir Oltean
ada2fee185 net: dsa: lan9303: fix broken backpressure in .port_fdb_dump
rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into
multiple netlink skbs if the buffer provided by user space is too small
(one buffer will typically handle a few hundred FDB entries).

When the current buffer becomes full, nlmsg_put() in
dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index
of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that
point, and then the dump resumes on the same port with a new skb, and
FDB entries up to the saved index are simply skipped.

Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to
drivers, then drivers must check for the -EMSGSIZE error code returned
by it. Otherwise, when a netlink skb becomes full, DSA will no longer
save newly dumped FDB entries to it, but the driver will continue
dumping. So FDB entries will be missing from the dump.

Fix the broken backpressure by propagating the "cb" return code and
allow rtnl_fdb_dump() to restart the FDB dump with a new skb.

Fixes: ab335349b8 ("net: dsa: lan9303: Add port_fast_age and port_fdb_dump methods")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 13:17:22 +01:00
Vladimir Oltean
cd391280bf net: dsa: hellcreek: fix broken backpressure in .port_fdb_dump
rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into
multiple netlink skbs if the buffer provided by user space is too small
(one buffer will typically handle a few hundred FDB entries).

When the current buffer becomes full, nlmsg_put() in
dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index
of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that
point, and then the dump resumes on the same port with a new skb, and
FDB entries up to the saved index are simply skipped.

Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to
drivers, then drivers must check for the -EMSGSIZE error code returned
by it. Otherwise, when a netlink skb becomes full, DSA will no longer
save newly dumped FDB entries to it, but the driver will continue
dumping. So FDB entries will be missing from the dump.

Fix the broken backpressure by propagating the "cb" return code and
allow rtnl_fdb_dump() to restart the FDB dump with a new skb.

Fixes: e4b27ebc78 ("net: dsa: Add DSA driver for Hirschmann Hellcreek switches")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 13:17:22 +01:00
Linus Walleij
463dbba4d1 ARM: 9104/2: Fix Keystone 2 kernel mapping regression
This fixes a Keystone 2 regression discovered as a side effect of
defining an passing the physical start/end sections of the kernel
to the MMU remapping code.

As the Keystone applies an offset to all physical addresses,
including those identified and patches by phys2virt, we fail to
account for this offset in the kernel_sec_start and kernel_sec_end
variables.

Further these offsets can extend into the 64bit range on LPAE
systems such as the Keystone 2.

Fix it like this:
- Extend kernel_sec_start and kernel_sec_end to be 64bit
- Add the offset also to kernel_sec_start and kernel_sec_end

As passing kernel_sec_start and kernel_sec_end as 64bit invariably
incurs BE8 endianness issues I have attempted to dry-code around
these.

Tested on the Vexpress QEMU model both with and without LPAE
enabled.

Fixes: 6e121df14c ("ARM: 9090/1: Map the lowmem and kernel separately")
Reported-by: Nishanth Menon <nmenon@kernel.org>
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Nishanth Menon <nmenon@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-08-10 12:17:25 +01:00
Randy Dunlap
019d0454c6 bpf, core: Fix kernel-doc notation
Fix kernel-doc warnings in kernel/bpf/core.c (found by scripts/kernel-doc
and W=1 builds). That is, correct a function name in a comment and add
return descriptions for 2 functions.

Fixes these kernel-doc warnings:

  kernel/bpf/core.c:1372: warning: expecting prototype for __bpf_prog_run(). Prototype was for ___bpf_prog_run() instead
  kernel/bpf/core.c:1372: warning: No description found for return value of '___bpf_prog_run'
  kernel/bpf/core.c:1883: warning: No description found for return value of 'bpf_prog_select_runtime'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210809215229.7556-1-rdunlap@infradead.org
2021-08-10 13:09:28 +02:00
Eric Dumazet
4a2b285e7e net: igmp: fix data-race in igmp_ifc_timer_expire()
Fix the data-race reported by syzbot [1]
Issue here is that igmp_ifc_timer_expire() can update in_dev->mr_ifc_count
while another change just occured from another context.

in_dev->mr_ifc_count is only 8bit wide, so the race had little
consequences.

[1]
BUG: KCSAN: data-race in igmp_ifc_event / igmp_ifc_timer_expire

write to 0xffff8881051e3062 of 1 bytes by task 12547 on cpu 0:
 igmp_ifc_event+0x1d5/0x290 net/ipv4/igmp.c:821
 igmp_group_added+0x462/0x490 net/ipv4/igmp.c:1356
 ____ip_mc_inc_group+0x3ff/0x500 net/ipv4/igmp.c:1461
 __ip_mc_join_group+0x24d/0x2c0 net/ipv4/igmp.c:2199
 ip_mc_join_group_ssm+0x20/0x30 net/ipv4/igmp.c:2218
 do_ip_setsockopt net/ipv4/ip_sockglue.c:1285 [inline]
 ip_setsockopt+0x1827/0x2a80 net/ipv4/ip_sockglue.c:1423
 tcp_setsockopt+0x8c/0xa0 net/ipv4/tcp.c:3657
 sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3362
 __sys_setsockopt+0x18f/0x200 net/socket.c:2159
 __do_sys_setsockopt net/socket.c:2170 [inline]
 __se_sys_setsockopt net/socket.c:2167 [inline]
 __x64_sys_setsockopt+0x62/0x70 net/socket.c:2167
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff8881051e3062 of 1 bytes by interrupt on cpu 1:
 igmp_ifc_timer_expire+0x706/0xa30 net/ipv4/igmp.c:808
 call_timer_fn+0x2e/0x1d0 kernel/time/timer.c:1419
 expire_timers+0x135/0x250 kernel/time/timer.c:1464
 __run_timers+0x358/0x420 kernel/time/timer.c:1732
 run_timer_softirq+0x19/0x30 kernel/time/timer.c:1745
 __do_softirq+0x12c/0x26e kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu+0x9a/0xb0 kernel/softirq.c:636
 sysvec_apic_timer_interrupt+0x69/0x80 arch/x86/kernel/apic/apic.c:1100
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
 console_unlock+0x8e8/0xb30 kernel/printk/printk.c:2646
 vprintk_emit+0x125/0x3d0 kernel/printk/printk.c:2174
 vprintk_default+0x22/0x30 kernel/printk/printk.c:2185
 vprintk+0x15a/0x170 kernel/printk/printk_safe.c:392
 printk+0x62/0x87 kernel/printk/printk.c:2216
 selinux_netlink_send+0x399/0x400 security/selinux/hooks.c:6041
 security_netlink_send+0x42/0x90 security/security.c:2070
 netlink_sendmsg+0x59e/0x7c0 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:703 [inline]
 sock_sendmsg net/socket.c:723 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
 ___sys_sendmsg net/socket.c:2446 [inline]
 __sys_sendmsg+0x1ed/0x270 net/socket.c:2475
 __do_sys_sendmsg net/socket.c:2484 [inline]
 __se_sys_sendmsg net/socket.c:2482 [inline]
 __x64_sys_sendmsg+0x42/0x50 net/socket.c:2482
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x01 -> 0x02

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 12539 Comm: syz-executor.1 Not tainted 5.14.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 11:56:52 +01:00
Takeshi Misawa
1090340f7e net: Fix memory leak in ieee802154_raw_deliver
If IEEE-802.15.4-RAW is closed before receive skb, skb is leaked.
Fix this, by freeing sk_receive_queue in sk->sk_destruct().

syzbot report:
BUG: memory leak
unreferenced object 0xffff88810f644600 (size 232):
  comm "softirq", pid 0, jiffies 4294967032 (age 81.270s)
  hex dump (first 32 bytes):
    10 7d 4b 12 81 88 ff ff 10 7d 4b 12 81 88 ff ff  .}K......}K.....
    00 00 00 00 00 00 00 00 40 7c 4b 12 81 88 ff ff  ........@|K.....
  backtrace:
    [<ffffffff83651d4a>] skb_clone+0xaa/0x2b0 net/core/skbuff.c:1496
    [<ffffffff83fe1b80>] ieee802154_raw_deliver net/ieee802154/socket.c:369 [inline]
    [<ffffffff83fe1b80>] ieee802154_rcv+0x100/0x340 net/ieee802154/socket.c:1070
    [<ffffffff8367cc7a>] __netif_receive_skb_one_core+0x6a/0xa0 net/core/dev.c:5384
    [<ffffffff8367cd07>] __netif_receive_skb+0x27/0xa0 net/core/dev.c:5498
    [<ffffffff8367cdd9>] netif_receive_skb_internal net/core/dev.c:5603 [inline]
    [<ffffffff8367cdd9>] netif_receive_skb+0x59/0x260 net/core/dev.c:5662
    [<ffffffff83fe6302>] ieee802154_deliver_skb net/mac802154/rx.c:29 [inline]
    [<ffffffff83fe6302>] ieee802154_subif_frame net/mac802154/rx.c:102 [inline]
    [<ffffffff83fe6302>] __ieee802154_rx_handle_packet net/mac802154/rx.c:212 [inline]
    [<ffffffff83fe6302>] ieee802154_rx+0x612/0x620 net/mac802154/rx.c:284
    [<ffffffff83fe59a6>] ieee802154_tasklet_handler+0x86/0xa0 net/mac802154/main.c:35
    [<ffffffff81232aab>] tasklet_action_common.constprop.0+0x5b/0x100 kernel/softirq.c:557
    [<ffffffff846000bf>] __do_softirq+0xbf/0x2ab kernel/softirq.c:345
    [<ffffffff81232f4c>] do_softirq kernel/softirq.c:248 [inline]
    [<ffffffff81232f4c>] do_softirq+0x5c/0x80 kernel/softirq.c:235
    [<ffffffff81232fc1>] __local_bh_enable_ip+0x51/0x60 kernel/softirq.c:198
    [<ffffffff8367a9a4>] local_bh_enable include/linux/bottom_half.h:32 [inline]
    [<ffffffff8367a9a4>] rcu_read_unlock_bh include/linux/rcupdate.h:745 [inline]
    [<ffffffff8367a9a4>] __dev_queue_xmit+0x7f4/0xf60 net/core/dev.c:4221
    [<ffffffff83fe2db4>] raw_sendmsg+0x1f4/0x2b0 net/ieee802154/socket.c:295
    [<ffffffff8363af16>] sock_sendmsg_nosec net/socket.c:654 [inline]
    [<ffffffff8363af16>] sock_sendmsg+0x56/0x80 net/socket.c:674
    [<ffffffff8363deec>] __sys_sendto+0x15c/0x200 net/socket.c:1977
    [<ffffffff8363dfb6>] __do_sys_sendto net/socket.c:1989 [inline]
    [<ffffffff8363dfb6>] __se_sys_sendto net/socket.c:1985 [inline]
    [<ffffffff8363dfb6>] __x64_sys_sendto+0x26/0x30 net/socket.c:1985

Fixes: 9ec7671603 ("net: add IEEE 802.15.4 socket family implementation")
Reported-and-tested-by: syzbot+1f68113fa907bf0695a8@syzkaller.appspotmail.com
Signed-off-by: Takeshi Misawa <jeliantsurux@gmail.com>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210805075414.GA15796@DESKTOP
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-08-10 12:18:10 +02:00
Thomas Gleixner
ff363f480e x86/msi: Force affinity setup before startup
The X86 MSI mechanism cannot handle interrupt affinity changes safely after
startup other than from an interrupt handler, unless interrupt remapping is
enabled. The startup sequence in the generic interrupt code violates that
assumption.

Mark the irq chips with the new IRQCHIP_AFFINITY_PRE_STARTUP flag so that
the default interrupt setting happens before the interrupt is started up
for the first time.

While the interrupt remapping MSI chip does not require this, there is no
point in treating it differently as this might spare an interrupt to a CPU
which is not in the default affinity mask.

For the non-remapping case go to the direct write path when the interrupt
is not yet started similar to the not yet activated case.

Fixes: 1840475676 ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.886722080@linutronix.de
2021-08-10 10:59:21 +02:00
Thomas Gleixner
0c0e37dc11 x86/ioapic: Force affinity setup before startup
The IO/APIC cannot handle interrupt affinity changes safely after startup
other than from an interrupt handler. The startup sequence in the generic
interrupt code violates that assumption.

Mark the irq chip with the new IRQCHIP_AFFINITY_PRE_STARTUP flag so that
the default interrupt setting happens before the interrupt is started up
for the first time.

Fixes: 1840475676 ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.832143400@linutronix.de
2021-08-10 10:59:21 +02:00
Thomas Gleixner
826da77129 genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP
X86 IO/APIC and MSI interrupts (when used without interrupts remapping)
require that the affinity setup on startup is done before the interrupt is
enabled for the first time as the non-remapped operation mode cannot safely
migrate enabled interrupts from arbitrary contexts. Provide a new irq chip
flag which allows affected hardware to request this.

This has to be opt-in because there have been reports in the past that some
interrupt chips cannot handle affinity setting before startup.

Fixes: 1840475676 ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.779791738@linutronix.de
2021-08-10 10:59:20 +02:00
Thomas Gleixner
77e89afc25 PCI/MSI: Protect msi_desc::masked for multi-MSI
Multi-MSI uses a single MSI descriptor and there is a single mask register
when the device supports per vector masking. To avoid reading back the mask
register the value is cached in the MSI descriptor and updates are done by
clearing and setting bits in the cache and writing it to the device.

But nothing protects msi_desc::masked and the mask register from being
modified concurrently on two different CPUs for two different Linux
interrupts which belong to the same multi-MSI descriptor.

Add a lock to struct device and protect any operation on the mask and the
mask register with it.

This makes the update of msi_desc::masked unconditional, but there is no
place which requires a modification of the hardware register without
updating the masked cache.

msi_mask_irq() is now an empty wrapper which will be cleaned up in follow
up changes.

The problem goes way back to the initial support of multi-MSI, but picking
the commit which introduced the mask cache is a valid cut off point
(2.6.30).

Fixes: f2440d9acb ("PCI MSI: Refactor interrupt masking code")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.726833414@linutronix.de
2021-08-10 10:59:20 +02:00
Thomas Gleixner
d28d4ad2a1 PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown()
No point in using the raw write function from shutdown. Preparatory change
to introduce proper serialization for the msi_desc::masked cache.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.674391354@linutronix.de
2021-08-10 10:59:20 +02:00
Thomas Gleixner
689e6b5351 PCI/MSI: Correct misleading comments
The comments about preserving the cached state in pci_msi[x]_shutdown() are
misleading as the MSI descriptors are freed right after those functions
return. So there is nothing to restore. Preparatory change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.621609423@linutronix.de
2021-08-10 10:59:20 +02:00
Thomas Gleixner
361fd37397 PCI/MSI: Do not set invalid bits in MSI mask
msi_mask_irq() takes a mask and a flags argument. The mask argument is used
to mask out bits from the cached mask and the flags argument to set bits.

Some places invoke it with a flags argument which sets bits which are not
used by the device, i.e. when the device supports up to 8 vectors a full
unmask in some places sets the mask to 0xFFFFFF00. While devices probably
do not care, it's still bad practice.

Fixes: 7ba1930db0 ("PCI MSI: Unmask MSI if setup failed")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.568173099@linutronix.de
2021-08-10 10:59:20 +02:00
Thomas Gleixner
b9255a7cb5 PCI/MSI: Enforce MSI[X] entry updates to be visible
Nothing enforces the posted writes to be visible when the function
returns. Flush them even if the flush might be redundant when the entry is
masked already as the unmask will flush as well. This is either setup or a
rare affinity change event so the extra flush is not the end of the world.

While this is more a theoretical issue especially the logic in the X86
specific msi_set_affinity() function relies on the assumption that the
update has reached the hardware when the function returns.

Again, as this never has been enforced the Fixes tag refers to a commit in:
   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.515188147@linutronix.de
2021-08-10 10:59:20 +02:00
Thomas Gleixner
da181dc974 PCI/MSI: Enforce that MSI-X table entry is masked for update
The specification (PCIe r5.0, sec 6.1.4.5) states:

    For MSI-X, a function is permitted to cache Address and Data values
    from unmasked MSI-X Table entries. However, anytime software unmasks a
    currently masked MSI-X Table entry either by clearing its Mask bit or
    by clearing the Function Mask bit, the function must update any Address
    or Data values that it cached from that entry. If software changes the
    Address or Data value of an entry while the entry is unmasked, the
    result is undefined.

The Linux kernel's MSI-X support never enforced that the entry is masked
before the entry is modified hence the Fixes tag refers to a commit in:
      git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Enforce the entry to be masked across the update.

There is no point in enforcing this to be handled at all possible call
sites as this is just pointless code duplication and the common update
function is the obvious place to enforce this.

Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Reported-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.462096385@linutronix.de
2021-08-10 10:59:20 +02:00
Thomas Gleixner
7d5ec3d361 PCI/MSI: Mask all unused MSI-X entries
When MSI-X is enabled the ordering of calls is:

  msix_map_region();
  msix_setup_entries();
  pci_msi_setup_msi_irqs();
  msix_program_entries();

This has a few interesting issues:

 1) msix_setup_entries() allocates the MSI descriptors and initializes them
    except for the msi_desc:masked member which is left zero initialized.

 2) pci_msi_setup_msi_irqs() allocates the interrupt descriptors and sets
    up the MSI interrupts which ends up in pci_write_msi_msg() unless the
    interrupt chip provides its own irq_write_msi_msg() function.

 3) msix_program_entries() does not do what the name suggests. It solely
    updates the entries array (if not NULL) and initializes the masked
    member for each MSI descriptor by reading the hardware state and then
    masks the entry.

Obviously this has some issues:

 1) The uninitialized masked member of msi_desc prevents the enforcement
    of masking the entry in pci_write_msi_msg() depending on the cached
    masked bit. Aside of that half initialized data is a NONO in general

 2) msix_program_entries() only ensures that the actually allocated entries
    are masked. This is wrong as experimentation with crash testing and
    crash kernel kexec has shown.

    This limited testing unearthed that when the production kernel had more
    entries in use and unmasked when it crashed and the crash kernel
    allocated a smaller amount of entries, then a full scan of all entries
    found unmasked entries which were in use in the production kernel.

    This is obviously a device or emulation issue as the device reset
    should mask all MSI-X table entries, but obviously that's just part
    of the paper specification.

Cure this by:

 1) Masking all table entries in hardware
 2) Initializing msi_desc::masked in msix_setup_entries()
 3) Removing the mask dance in msix_program_entries()
 4) Renaming msix_program_entries() to msix_update_entries() to
    reflect the purpose of that function.

As the masking of unused entries has never been done the Fixes tag refers
to a commit in:
   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.403833459@linutronix.de
2021-08-10 10:59:20 +02:00
Thomas Gleixner
438553958b PCI/MSI: Enable and mask MSI-X early
The ordering of MSI-X enable in hardware is dysfunctional:

 1) MSI-X is disabled in the control register
 2) Various setup functions
 3) pci_msi_setup_msi_irqs() is invoked which ends up accessing
    the MSI-X table entries
 4) MSI-X is enabled and masked in the control register with the
    comment that enabling is required for some hardware to access
    the MSI-X table

Step #4 obviously contradicts #3. The history of this is an issue with the
NIU hardware. When #4 was introduced the table access actually happened in
msix_program_entries() which was invoked after enabling and masking MSI-X.

This was changed in commit d71d6432e1 ("PCI/MSI: Kill redundant call of
irq_set_msi_desc() for MSI-X interrupts") which removed the table write
from msix_program_entries().

Interestingly enough nobody noticed and either NIU still works or it did
not get any testing with a kernel 3.19 or later.

Nevertheless this is inconsistent and there is no reason why MSI-X can't be
enabled and masked in the control register early on, i.e. move step #4
above to step #1. This preserves the NIU workaround and has no side effects
on other hardware.

Fixes: d71d6432e1 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.344136412@linutronix.de
2021-08-10 10:59:20 +02:00
David S. Miller
37c86c4a0b Merge branch 'ks8795-vlan-fixes'
Ben Hutchings says:

====================
ksz8795 VLAN fixes

This series fixes a number of bugs in the ksz8795 driver that affect
VLAN filtering, tag insertion, and tag removal.

I've tested these on the KSZ8795CLXD evaluation board, and checked the
register usage against the datasheets for the other supported chips.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:58:15 +01:00
Ben Hutchings
411d466d94 net: dsa: microchip: ksz8795: Don't use phy_port_cnt in VLAN table lookup
The magic number 4 in VLAN table lookup was the number of entries we
can read and write at once.  Using phy_port_cnt here doesn't make
sense and presumably broke VLAN filtering for 3-port switches.  Change
it back to 4.

Fixes: 4ce2a984ab ("net: dsa: microchip: ksz8795: use phy_port_cnt ...")
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:58:15 +01:00
Ben Hutchings
164844135a net: dsa: microchip: ksz8795: Fix VLAN filtering
Currently ksz8_port_vlan_filtering() sets or clears the VLAN Enable
hardware flag.  That controls discarding of packets with a VID that
has not been enabled for any port on the switch.

Since it is a global flag, set the dsa_switch::vlan_filtering_is_global
flag so that the DSA core understands this can't be controlled per
port.

When VLAN filtering is enabled, the switch should also discard packets
with a VID that's not enabled on the ingress port.  Set or clear each
external port's VLAN Ingress Filter flag in ksz8_port_vlan_filtering()
to make that happen.

Fixes: e66f840c08 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver")
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:58:15 +01:00
Ben Hutchings
9130c2d30c net: dsa: microchip: ksz8795: Use software untagging on CPU port
On the CPU port, we can support both tagged and untagged VLANs at the
same time by doing any necessary untagging in software rather than
hardware.  To enable that, keep the CPU port's Remove Tag flag cleared
and set the dsa_switch::untag_bridge_pvid flag.

Fixes: e66f840c08 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver")
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:58:15 +01:00
Ben Hutchings
af01754f9e net: dsa: microchip: ksz8795: Fix VLAN untagged flag change on deletion
When a VLAN is deleted from a port, the flags in struct
switchdev_obj_port_vlan are always 0.  ksz8_port_vlan_del() copies the
BRIDGE_VLAN_INFO_UNTAGGED flag to the port's Tag Removal flag, and
therefore always clears it.

In case there are multiple VLANs configured as untagged on this port -
which seems useless, but is allowed - deleting one of them changes the
remaining VLANs to be tagged.

It's only ever necessary to change this flag when a VLAN is added to
the port, so leave it unchanged in ksz8_port_vlan_del().

Fixes: e66f840c08 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver")
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:58:15 +01:00
Ben Hutchings
8f4f58f88f net: dsa: microchip: ksz8795: Reject unsupported VLAN configuration
The switches supported by ksz8795 only have a per-port flag for Tag
Removal.  This means it is not possible to support both tagged and
untagged VLANs on the same port.  Reject attempts to add a VLAN that
requires the flag to be changed, unless there are no VLANs currently
configured.

VID 0 is excluded from this check since it is untagged regardless of
the state of the flag.

On the CPU port we could support tagged and untagged VLANs at the same
time.  This will be enabled by a later patch.

Fixes: e66f840c08 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver")
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:58:14 +01:00
Ben Hutchings
ef3b02a1d7 net: dsa: microchip: ksz8795: Fix PVID tag insertion
ksz8795 has never actually enabled PVID tag insertion, and it also
programmed the PVID incorrectly.  To fix this:

* Allow tag insertion to be controlled per ingress port.  On most
  chips, set bit 2 in Global Control 19.  On KSZ88x3 this control
  flag doesn't exist.

* When adding a PVID:
  - Set the appropriate register bits to enable tag insertion on
    egress at every other port if this was the packet's ingress port.
  - Mask *out* the VID from the default tag, before or-ing in the new
    PVID.

* When removing a PVID:
  - Clear the same control bits to disable tag insertion.
  - Don't update the default tag.  This wasn't doing anything useful.

Fixes: e66f840c08 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver")
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:58:14 +01:00
Ben Hutchings
c34f674c88 net: dsa: microchip: Fix ksz_read64()
ksz_read64() currently does some dubious byte-swapping on the two
halves of a 64-bit register, and then only returns the high bits.
Replace this with a straightforward expression.

Fixes: e66f840c08 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver")
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:58:14 +01:00
David S. Miller
31782a01d1 Merge tag 'linux-can-fixes-for-5.14-20210810' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
linux-can-fixes-for-5.14-20210810

Marc Kleine-Budde says:

====================
pull-request: can 2021-08-10

this is a pull request of 2 patches for net/master.

Baruch Siach's patch fixes a typo for the Microchip CAN BUS Analyzer
Tool entry in the MAINTAINERS file.

Hussein Alasadi fixes the setting of the M_CAN_DBTP register in the
m_can driver. The regression git mainline in v5.14-rc1, so no backport
to stable is needed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:53:15 +01:00
David S. Miller
6a279f61e2 Merge tag 'mlx5-fixes-2021-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5 fixes 2021-08-09

This series introduces fixes to mlx5 driver.
Please pull and let me know if there is any problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:42:39 +01:00
David S. Miller
ea377dca46 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-08-09

This series contains updates to ice and iavf drivers.

Ani prevents the ice driver from accidentally being probed to a virtual
function and stops processing of VF messages when VFs are being torn
down.

Brett prevents the ice driver from deleting is own MAC address.

Fahad ensures the RSS LUT and key are always set following reset for
iavf.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10 09:30:04 +01:00
Yonghong Song
a2baf4e8bb bpf: Fix potentially incorrect results with bpf_get_local_storage()
Commit b910eaaaa4 ("bpf: Fix NULL pointer dereference in bpf_get_local_storage()
helper") fixed a bug for bpf_get_local_storage() helper so different tasks
won't mess up with each other's percpu local storage.

The percpu data contains 8 slots so it can hold up to 8 contexts (same or
different tasks), for 8 different program runs, at the same time. This in
general is sufficient. But our internal testing showed the following warning
multiple times:

  [...]
  warning: WARNING: CPU: 13 PID: 41661 at include/linux/bpf-cgroup.h:193
     __cgroup_bpf_run_filter_sock_ops+0x13e/0x180
  RIP: 0010:__cgroup_bpf_run_filter_sock_ops+0x13e/0x180
  <IRQ>
   tcp_call_bpf.constprop.99+0x93/0xc0
   tcp_conn_request+0x41e/0xa50
   ? tcp_rcv_state_process+0x203/0xe00
   tcp_rcv_state_process+0x203/0xe00
   ? sk_filter_trim_cap+0xbc/0x210
   ? tcp_v6_inbound_md5_hash.constprop.41+0x44/0x160
   tcp_v6_do_rcv+0x181/0x3e0
   tcp_v6_rcv+0xc65/0xcb0
   ip6_protocol_deliver_rcu+0xbd/0x450
   ip6_input_finish+0x11/0x20
   ip6_input+0xb5/0xc0
   ip6_sublist_rcv_finish+0x37/0x50
   ip6_sublist_rcv+0x1dc/0x270
   ipv6_list_rcv+0x113/0x140
   __netif_receive_skb_list_core+0x1a0/0x210
   netif_receive_skb_list_internal+0x186/0x2a0
   gro_normal_list.part.170+0x19/0x40
   napi_complete_done+0x65/0x150
   mlx5e_napi_poll+0x1ae/0x680
   __napi_poll+0x25/0x120
   net_rx_action+0x11e/0x280
   __do_softirq+0xbb/0x271
   irq_exit_rcu+0x97/0xa0
   common_interrupt+0x7f/0xa0
   </IRQ>
   asm_common_interrupt+0x1e/0x40
  RIP: 0010:bpf_prog_1835a9241238291a_tw_egress+0x5/0xbac
   ? __cgroup_bpf_run_filter_skb+0x378/0x4e0
   ? do_softirq+0x34/0x70
   ? ip6_finish_output2+0x266/0x590
   ? ip6_finish_output+0x66/0xa0
   ? ip6_output+0x6c/0x130
   ? ip6_xmit+0x279/0x550
   ? ip6_dst_check+0x61/0xd0
  [...]

Using drgn [0] to dump the percpu buffer contents showed that on this CPU
slot 0 is still available, but slots 1-7 are occupied and those tasks in
slots 1-7 mostly don't exist any more. So we might have issues in
bpf_cgroup_storage_unset().

Further debugging confirmed that there is a bug in bpf_cgroup_storage_unset().
Currently, it tries to unset "current" slot with searching from the start.
So the following sequence is possible:

  1. A task is running and claims slot 0
  2. Running BPF program is done, and it checked slot 0 has the "task"
     and ready to reset it to NULL (not yet).
  3. An interrupt happens, another BPF program runs and it claims slot 1
     with the *same* task.
  4. The unset() in interrupt context releases slot 0 since it matches "task".
  5. Interrupt is done, the task in process context reset slot 0.

At the end, slot 1 is not reset and the same process can continue to occupy
slots 2-7 and finally, when the above step 1-5 is repeated again, step 3 BPF
program won't be able to claim an empty slot and a warning will be issued.

To fix the issue, for unset() function, we should traverse from the last slot
to the first. This way, the above issue can be avoided.

The same reverse traversal should also be done in bpf_get_local_storage() helper
itself. Otherwise, incorrect local storage may be returned to BPF program.

  [0] https://github.com/osandov/drgn

Fixes: b910eaaaa4 ("bpf: Fix NULL pointer dereference in bpf_get_local_storage() helper")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210810010413.1976277-1-yhs@fb.com
2021-08-10 10:27:16 +02:00
Thomas Gleixner
55203550f9 Merge tag 'efi-urgent-for-v5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/urgent
Pull EFI fixes from Ard Biesheuvel:

 A batch of fixes for the arm64 stub image loader:

 - fix a logic bug that can make the random page allocator fail
   spuriously
 - force reallocation of the Image when it overlaps with firmware
   reserved memory regions
 - fix an oversight that defeated on optimization introduced earlier
   where images loaded at a suitable offset are never moved if booting
   without randomization
 - complain about images that were not loaded at the right offset by the
   firmware image loader.

Link: https://lore.kernel.org/r/20210803091215.2566-1-ardb@kernel.org
2021-08-10 10:24:49 +02:00
Miklos Szeredi
427215d85e ovl: prevent private clone if bind mount is not allowed
Add the following checks from __do_loopback() to clone_private_mount() as
well:

 - verify that the mount is in the current namespace

 - verify that there are no locked children

Reported-by: Alois Wohlschlager <alois1@gmx-topmail.de>
Fixes: c771d683a6 ("vfs: introduce clone_private_mount()")
Cc: <stable@vger.kernel.org> # v3.18
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-10 10:21:31 +02:00
Miklos Szeredi
580c610429 ovl: fix uninitialized pointer read in ovl_lookup_real_one()
One error path can result in release_dentry_name_snapshot() being called
before "name" was initialized by take_dentry_name_snapshot().

Fix by moving the release_dentry_name_snapshot() to immediately after the
only use.

Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-10 10:21:30 +02:00
Miklos Szeredi
9b91b6b019 ovl: fix deadlock in splice write
There's possibility of an ABBA deadlock in case of a splice write to an
overlayfs file and a concurrent splice write to a corresponding real file.

The call chain for splice to an overlay file:

 -> do_splice                     [takes sb_writers on overlay file]
   -> do_splice_from
     -> iter_file_splice_write    [takes pipe->mutex]
       -> vfs_iter_write
         ...
         -> ovl_write_iter        [takes sb_writers on real file]

And the call chain for splice to a real file:

 -> do_splice                     [takes sb_writers on real file]
   -> do_splice_from
     -> iter_file_splice_write    [takes pipe->mutex]

Syzbot successfully bisected this to commit 82a763e61e ("ovl: simplify
file splice").

Fix by reverting the write part of the above commit and by adding missing
bits from ovl_write_iter() into ovl_splice_write().

Fixes: 82a763e61e ("ovl: simplify file splice")
Reported-and-tested-by: syzbot+579885d1a9a833336209@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-10 10:21:30 +02:00
Amir Goldstein
9011c2791e ovl: skip stale entries in merge dir cache iteration
On the first getdents call, ovl_iterate() populates the readdir cache
with a list of entries, but for upper entries with origin lower inode,
p->ino remains zero.

Following getdents calls traverse the readdir cache list and call
ovl_cache_update_ino() for entries with zero p->ino to lookup the entry
in the overlay and return d_ino that is consistent with st_ino.

If the upper file was unlinked between the first getdents call and the
getdents call that lists the file entry, ovl_cache_update_ino() will not
find the entry and fall back to setting d_ino to the upper real st_ino,
which is inconsistent with how this object was presented to users.

Instead of listing a stale entry with inconsistent d_ino, simply skip
the stale entry, which is better for users.

xfstest overlay/077 is failing without this patch.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/fstests/CAOQ4uxgR_cLnC_vdU5=seP3fwqVkuZM_-WfD6maFTMbMYq=a9w@mail.gmail.com/
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-10 10:21:30 +02:00
Yonghong Song
87b7b5335e bpf: Add missing bpf_read_[un]lock_trace() for syscall program
Commit 79a7f8bdb1 ("bpf: Introduce bpf_sys_bpf() helper and program type.")
added support for syscall program, which is a sleepable program.

But the program run missed bpf_read_lock_trace()/bpf_read_unlock_trace(),
which is needed to ensure proper rcu callback invocations. This patch adds
bpf_read_[un]lock_trace() properly.

Fixes: 79a7f8bdb1 ("bpf: Introduce bpf_sys_bpf() helper and program type.")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210809235151.1663680-1-yhs@fb.com
2021-08-10 10:10:49 +02:00
Daniel Borkmann
51e1bb9eea bpf: Add lockdown check for probe_write_user helper
Back then, commit 96ae522795 ("bpf: Add bpf_probe_write_user BPF helper
to be called in tracers") added the bpf_probe_write_user() helper in order
to allow to override user space memory. Its original goal was to have a
facility to "debug, divert, and manipulate execution of semi-cooperative
processes" under CAP_SYS_ADMIN. Write to kernel was explicitly disallowed
since it would otherwise tamper with its integrity.

One use case was shown in cf9b1199de ("samples/bpf: Add test/example of
using bpf_probe_write_user bpf helper") where the program DNATs traffic
at the time of connect(2) syscall, meaning, it rewrites the arguments to
a syscall while they're still in userspace, and before the syscall has a
chance to copy the argument into kernel space. These days we have better
mechanisms in BPF for achieving the same (e.g. for load-balancers), but
without having to write to userspace memory.

Of course the bpf_probe_write_user() helper can also be used to abuse
many other things for both good or bad purpose. Outside of BPF, there is
a similar mechanism for ptrace(2) such as PTRACE_PEEK{TEXT,DATA} and
PTRACE_POKE{TEXT,DATA}, but would likely require some more effort.
Commit 96ae522795 explicitly dedicated the helper for experimentation
purpose only. Thus, move the helper's availability behind a newly added
LOCKDOWN_BPF_WRITE_USER lockdown knob so that the helper is disabled under
the "integrity" mode. More fine-grained control can be implemented also
from LSM side with this change.

Fixes: 96ae522795 ("bpf: Add bpf_probe_write_user BPF helper to be called in tracers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-10 10:10:10 +02:00
Christian Hewitt
bf33677a3c drm/meson: fix colour distortion from HDR set during vendor u-boot
Add support for the OSD1 HDR registers so meson DRM can handle the HDR
properties set by Amlogic u-boot on G12A and newer devices which result
in blue/green/pink colour distortion to display output.

This takes the original patch submissions from Mathias [0] and [1] with
corrections for formatting and the missing description and attribution
needed for merge.

[0] https://lore.kernel.org/linux-amlogic/59dfd7e6-fc91-3d61-04c4-94e078a3188c@baylibre.com/T/
[1] https://lore.kernel.org/linux-amlogic/CAOKfEHBx_fboUqkENEMd-OC-NSrf46nto+vDLgvgttzPe99kXg@mail.gmail.com/T/#u

Fixes: 728883948b ("drm/meson: Add G12A Support for VIU setup")
Suggested-by: Mathias Steiger <mathias.steiger@googlemail.com>
Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Philip Milev <milev.philip@gmail.com>
[narmsrong: adding missing space on second tested-by tag]
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210806094005.7136-1-christianshewitt@gmail.com
2021-08-10 10:00:02 +02:00
Greg Kroah-Hartman
664cc971fb Revert "usb: dwc3: gadget: Use list_replace_init() before traversing lists"
This reverts commit d25d85061b as it is
reported to cause problems on many different types of boards.

Reported-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Cc: Ray Chi <raychi@google.com>
Link: https://lore.kernel.org/r/CANcMJZCEVxVLyFgLwK98hqBEdc0_n4P0x_K6Gih8zNH3ouzbJQ@mail.gmail.com
Fixes: d25d85061b ("usb: dwc3: gadget: Use list_replace_init() before traversing lists")
Cc: stable <stable@vger.kernel.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Wesley Cheng <wcheng@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-10 09:12:32 +02:00
Greg Kroah-Hartman
a5056c0bc2 Merge tag 'iio-fixes-5.14a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:

First set of fixes for IIO in the 5.14 cycle

adi,adis:
 - Ensure GPIO pin direction set explicitly in driver.
fxls8952af:
 - Fix use of ret when not initialized.
 - Fix issue with use of module symbol from built in.
hdc100x:
 - Add a margin to conversion time as some parts run to slowly.
palmas-adc:
 - Fix a wrong exit condition that leads to adc period always being set
   to maximum value.
st,sensors:
 - Drop a wrong restriction on number of interrupts in dt binding.
ti-ads7950:
 - Ensure CS deasserted after channel read.

* tag 'iio-fixes-5.14a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
  iio: adc: Fix incorrect exit of for-loop
  iio: humidity: hdc100x: Add margin to the conversion time
  dt-bindings: iio: st: Remove wrong items length check
  iio: accel: fxls8962af: fix i2c dependency
  iio: adis: set GPIO reset pin direction
  iio: adc: ti-ads7950: Ensure CS is deasserted after reading channels
  iio: accel: fxls8962af: fix potential use of uninitialized symbol
2021-08-10 08:54:36 +02:00
Zhen Lei
07d25971b2 locking/rtmutex: Use the correct rtmutex debugging config option
It's CONFIG_DEBUG_RT_MUTEXES not CONFIG_DEBUG_RT_MUTEX.

Fixes: f7efc4799f ("locking/rtmutex: Inline chainwalk depth check")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210731123011.4555-1-thunder.leizhen@huawei.com
2021-08-10 08:21:52 +02:00
Hussein Alasadi
aae32b784e can: m_can: m_can_set_bittiming(): fix setting M_CAN_DBTP register
This patch fixes the setting of the M_CAN_DBTP register contents:
- use DBTP_ (the data bitrate macros) instead of NBTP_ which area used
  for the nominal bitrate
- do not overwrite possibly-existing DBTP_TDC flag by ORing reg_btp
  instead of overwriting

Link: https://lore.kernel.org/r/FRYP281MB06140984ABD9994C0AAF7433D1F69@FRYP281MB0614.DEUP281.PROD.OUTLOOK.COM
Fixes: 20779943a0 ("can: m_can: use bits.h macros for all regmasks")
Cc: Torin Cooper-Bennun <torin@maxiluxsystems.com>
Cc: Chandrasekar Ramakrishnan <rcsekar@samsung.com>
Signed-off-by: Hussein Alasadi <alasadi@arecs.eu>
[mkl: update patch description, update indention]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-10 08:10:27 +02:00
Baruch Siach
7b637cd52f MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo
This patch fixes the abbreviated name of the Microchip CAN BUS
Analyzer Tool.

Fixes: 8a7b46fa79 ("MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver")
Link: https://lore.kernel.org/r/cc4831cb1c8759c15fb32c21fd326e831183733d.1627876781.git.baruch@tkos.co.il
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-10 08:10:27 +02:00
Aya Levin
bd37c2888c net/mlx5: Fix return value from tracer initialization
Check return value of mlx5_fw_tracer_start(), set error path and fix
return value of mlx5_fw_tracer_init() accordingly.

Fixes: c71ad41ccb ("net/mlx5: FW tracer, events handling")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:57:03 -07:00
Shay Drory
563476ae0c net/mlx5: Synchronize correct IRQ when destroying CQ
The CQ destroy is performed based on the IRQ number that is stored in
cq->irqn. That number wasn't set explicitly during CQ creation and as
expected some of the API users of mlx5_core_create_cq() forgot to update
it.

This caused to wrong synchronization call of the wrong IRQ with a number
0 instead of the real one.

As a fix, set the IRQ number directly in the mlx5_core_create_cq() and
update all users accordingly.

Fixes: 1a86b377aa ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Fixes: ef1659ade3 ("IB/mlx5: Add DEVX support for CQ events")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:57:00 -07:00
Chris Mi
88bbd7b236 net/mlx5e: TC, Fix error handling memory leak
Free the offload sample action on error.

Fixes: f94d6389f6 ("net/mlx5e: TC, Add support to offload sample action")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:58 -07:00
Shay Drory
ba317e832d net/mlx5: Destroy pool->mutex
Destroy pool->mutex when we destroy the pool.

Fixes: c36326d38d ("net/mlx5: Round-Robin EQs over IRQs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:55 -07:00
Shay Drory
5957cc557d net/mlx5: Set all field of mlx5_irq before inserting it to the xarray
Currently irq->pool is set after the irq is insert to the xarray.
Set irq->pool before the irq is inserted to the xarray.

Fixes: 71e084e264 ("net/mlx5: Allocating a pool of MSI-X vectors for SFs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:52 -07:00
Shay Drory
3c8946e0e2 net/mlx5: Fix order of functions in mlx5_irq_detach_nb()
Change order of functions in mlx5_irq_detach_nb() so it will be
a mirror of mlx5_irq_attach_nb.

Fixes: 71e084e264 ("net/mlx5: Allocating a pool of MSI-X vectors for SFs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:49 -07:00
Aya Levin
c85a6b8feb net/mlx5: Block switchdev mode while devlink traps are active
Since switchdev mode can't support  devlink traps, verify there are
no active devlink traps before moving eswitch to switchdev mode. If
there are active traps, prevent the switchdev mode configuration.

Fixes: eb3862a052 ("net/mlx5e: Enable traps according to link state")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:46 -07:00
Maxim Mikityanskiy
8ba3e4c858 net/mlx5e: Destroy page pool after XDP SQ to fix use-after-free
mlx5e_close_xdpsq does the cleanup: it calls mlx5e_free_xdpsq_descs to
free the outstanding descriptors, which relies on
mlx5e_page_release_dynamic and page_pool_release_page. However,
page_pool_destroy is already called by this point, because
mlx5e_close_rq runs before mlx5e_close_xdpsq.

This commit fixes the use-after-free by swapping mlx5e_close_xdpsq and
mlx5e_close_rq.

The commit cited below started calling page_pool_destroy directly from
the driver. Previously, the page pool was destroyed under a call_rcu
from xdp_rxq_info_unreg_mem_model, which would defer the deallocation
until after the XDPSQ is cleaned up.

Fixes: 1da4bbeffe ("net: core: page_pool: add user refcnt and reintroduce page_pool_destroy")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:43 -07:00
Vlad Buslov
6d8680da2e net/mlx5: Bridge, fix ageing time
Ageing time is not converted from clock_t to jiffies which results
incorrect ageing timeout calculation in workqueue update task. Fix it by
applying clock_t_to_jiffies() to provided value.

Fixes: c636a0f0f3 ("net/mlx5: Bridge, dynamic entry ageing")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:40 -07:00
Roi Dayan
c623c95afa net/mlx5e: Avoid creating tunnel headers for local route
It could be local and remote are on the same machine and the route
result will be a local route which will result in creating encap id
with src/dst mac address of 0.

Fixes: a54e20b4fc ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:38 -07:00
Alex Vesker
d3875924da net/mlx5: DR, Add fail on error check on decap
While processing encapsulated packet on RX, one of the fields that is
checked is the inner packet length. If the length as specified in the header
doesn't match the actual inner packet length, the packet is invalid
and should be dropped. However, such packet caused the NIC to hang.

This patch turns on a 'fail_on_error' HW bit which allows HW to drop
such an invalid packet while processing RX packet and trying to decap it.

Fixes: ad17dc8cf9 ("net/mlx5: DR, Move STEv0 action apply logic")
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:34 -07:00
Leon Romanovsky
c633e79964 net/mlx5: Don't skip subfunction cleanup in case of error in module init
Clean SF resources if mlx5 eth failed to initialize.

Fixes: 1958fc2f07 ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-09 20:56:31 -07:00
Colin Ian King
40d3272793 scsi: mpt3sas: Fix incorrectly assigned error return and check
Currently the call to _base_static_config_pages() is assigning the error
return to variable 'rc' but checking the error return in error 'r'. Fix
this by assigning the error return to variable 'r' instead of 'rc'.

Link: https://lore.kernel.org/r/20210804134940.114011-1-colin.king@canonical.com
Fixes: 19a622c39a ("scsi: mpt3sas: Handle firmware faults during first half of IOC init")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Addresses-Coverity: ("Unused value")
2021-08-09 23:46:19 -04:00
Michael Kelley
dbe7633c39 scsi: storvsc: Log TEST_UNIT_READY errors as warnings
Commit 08f76547f0 ("scsi: storvsc: Update error logging") added more
robust logging of errors, particularly those reported as Hyper-V
errors. But this change produces extra logging noise in that
TEST_UNIT_READY may report errors during the normal course of detecting
device adds and removes.

Fix this by logging TEST_UNIT_READY errors as warnings, so that log lines
are produced only if the storvsc log level is changed to WARN level on the
kernel boot line.

Link: https://lore.kernel.org/r/1628269970-87876-1-git-send-email-mikelley@microsoft.com
Fixes: 08f76547f0 ("scsi: storvsc: Update error logging")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-09 23:17:10 -04:00
Ewan D. Milne
9977d880f7 scsi: lpfc: Move initialization of phba->poll_list earlier to avoid crash
The phba->poll_list is traversed in case of an error in
lpfc_sli4_hba_setup(), so it must be initialized earlier in case the error
path is taken.

[  490.030738] lpfc 0000:65:00.0: 0:1413 Failed to init iocb list.
[  490.036661] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[  490.044485] PGD 0 P4D 0
[  490.047027] Oops: 0000 [#1] SMP PTI
[  490.050518] CPU: 0 PID: 7 Comm: kworker/0:1 Kdump: loaded Tainted: G          I      --------- -  - 4.18.
[  490.060511] Hardware name: Dell Inc. PowerEdge R440/0WKGTH, BIOS 1.4.8 05/22/2018
[  490.067994] Workqueue: events work_for_cpu_fn
[  490.072371] RIP: 0010:lpfc_sli4_cleanup_poll_list+0x20/0xb0 [lpfc]
[  490.078546] Code: cf e9 04 f7 fe ff 0f 1f 40 00 0f 1f 44 00 00 41 57 49 89 ff 41 56 41 55 41 54 4d 8d a79
[  490.097291] RSP: 0018:ffffbd1a463dbcc8 EFLAGS: 00010246
[  490.102518] RAX: 0000000000008200 RBX: ffff945cdb8c0000 RCX: 0000000000000000
[  490.109649] RDX: 0000000000018200 RSI: ffff9468d0e16818 RDI: 0000000000000000
[  490.116783] RBP: ffff945cdb8c1740 R08: 00000000000015c5 R09: 0000000000000042
[  490.123915] R10: 0000000000000000 R11: ffffbd1a463dbab0 R12: ffff945cdb8c25c0
[  490.131049] R13: 00000000fffffff4 R14: 0000000000001800 R15: ffff945cdb8c0000
[  490.138182] FS:  0000000000000000(0000) GS:ffff9468d0e00000(0000) knlGS:0000000000000000
[  490.146267] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  490.152013] CR2: 0000000000000000 CR3: 000000042ca10002 CR4: 00000000007706f0
[  490.159146] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  490.166277] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  490.173409] PKRU: 55555554
[  490.176123] Call Trace:
[  490.178598]  lpfc_sli4_queue_destroy+0x7f/0x3c0 [lpfc]
[  490.183745]  lpfc_sli4_hba_setup+0x1bc7/0x23e0 [lpfc]
[  490.188797]  ? kernfs_activate+0x63/0x80
[  490.192721]  ? kernfs_add_one+0xe7/0x130
[  490.196647]  ? __kernfs_create_file+0x80/0xb0
[  490.201020]  ? lpfc_pci_probe_one_s4.isra.48+0x46f/0x9e0 [lpfc]
[  490.206944]  lpfc_pci_probe_one_s4.isra.48+0x46f/0x9e0 [lpfc]
[  490.212697]  lpfc_pci_probe_one+0x179/0xb70 [lpfc]
[  490.217492]  local_pci_probe+0x41/0x90
[  490.221246]  work_for_cpu_fn+0x16/0x20
[  490.224994]  process_one_work+0x1a7/0x360
[  490.229009]  ? create_worker+0x1a0/0x1a0
[  490.232933]  worker_thread+0x1cf/0x390
[  490.236687]  ? create_worker+0x1a0/0x1a0
[  490.240612]  kthread+0x116/0x130
[  490.243846]  ? kthread_flush_work_fn+0x10/0x10
[  490.248293]  ret_from_fork+0x35/0x40
[  490.251869] Modules linked in: lpfc(+) xt_CHECKSUM ipt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4i
[  490.332609] CR2: 0000000000000000

Link: https://lore.kernel.org/r/20210809150947.18104-1-emilne@redhat.com
Fixes: 93a4d6f401 ("scsi: lpfc: Add registration for CPU Offline/Online events")
Cc: stable@vger.kernel.org
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-09 22:45:51 -04:00
Ming Lei
11431e26c9 blk-iocost: fix lockdep warning on blkcg->lock
blkcg->lock depends on q->queue_lock which may depend on another driver
lock required in irq context, one example is dm-thin:

	Chain exists of:
	  &pool->lock#3 --> &q->queue_lock --> &blkcg->lock

	 Possible interrupt unsafe locking scenario:

	       CPU0                    CPU1
	       ----                    ----
	  lock(&blkcg->lock);
	                               local_irq_disable();
	                               lock(&pool->lock#3);
	                               lock(&q->queue_lock);
	  <Interrupt>
	    lock(&pool->lock#3);

Fix the issue by using spin_lock_irq(&blkcg->lock) in ioc_weight_write().

Cc: Tejun Heo <tj@kernel.org>
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Link: https://lore.kernel.org/linux-block/CA+QYu4rzz6079ighEanS3Qq_Dmnczcf45ZoJoHKVLVATTo1e4Q@mail.gmail.com/T/#u
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20210803070608.1766400-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-09 20:00:26 -06:00
Pavel Begunkov
43597aac1f io_uring: fix ctx-exit io_rsrc_put_work() deadlock
__io_rsrc_put_work() might need ->uring_lock, so nobody should wait for
rsrc nodes holding the mutex. However, that's exactly what
io_ring_ctx_free() does with io_wait_rsrc_data().

Split it into rsrc wait + dealloc, and move the first one out of the
lock.

Cc: stable@vger.kernel.org
Fixes: b60c8dce33 ("io_uring: preparation for rsrc tagging")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0130c5c2693468173ec1afab714e0885d2c9c363.1628559783.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-09 19:59:28 -06:00
Jens Axboe
c018db4a57 io_uring: drop ctx->uring_lock before flushing work item
Ammar reports that he's seeing a lockdep splat on running test/rsrc_tags
from the regression suite:

======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc3-bluetea-test-00249-gc7d102232649 #5 Tainted: G           OE
------------------------------------------------------
kworker/2:4/2684 is trying to acquire lock:
ffff88814bb1c0a8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_rsrc_put_work+0x13d/0x1a0

but task is already holding lock:
ffffc90001c6be70 ((work_completion)(&(&ctx->rsrc_put_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1bc/0x530

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 ((work_completion)(&(&ctx->rsrc_put_work)->work)){+.+.}-{0:0}:
       __flush_work+0x31b/0x490
       io_rsrc_ref_quiesce.part.0.constprop.0+0x35/0xb0
       __do_sys_io_uring_register+0x45b/0x1060
       do_syscall_64+0x35/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (&ctx->uring_lock){+.+.}-{3:3}:
       __lock_acquire+0x119a/0x1e10
       lock_acquire+0xc8/0x2f0
       __mutex_lock+0x86/0x740
       io_rsrc_put_work+0x13d/0x1a0
       process_one_work+0x236/0x530
       worker_thread+0x52/0x3b0
       kthread+0x135/0x160
       ret_from_fork+0x1f/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&(&ctx->rsrc_put_work)->work));
                               lock(&ctx->uring_lock);
                               lock((work_completion)(&(&ctx->rsrc_put_work)->work));
  lock(&ctx->uring_lock);

 *** DEADLOCK ***

2 locks held by kworker/2:4/2684:
 #0: ffff88810004d938 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1bc/0x530
 #1: ffffc90001c6be70 ((work_completion)(&(&ctx->rsrc_put_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1bc/0x530

stack backtrace:
CPU: 2 PID: 2684 Comm: kworker/2:4 Tainted: G           OE     5.14.0-rc3-bluetea-test-00249-gc7d102232649 #5
Hardware name: Acer Aspire ES1-421/OLVIA_BE, BIOS V1.05 07/02/2015
Workqueue: events io_rsrc_put_work
Call Trace:
 dump_stack_lvl+0x6a/0x9a
 check_noncircular+0xfe/0x110
 __lock_acquire+0x119a/0x1e10
 lock_acquire+0xc8/0x2f0
 ? io_rsrc_put_work+0x13d/0x1a0
 __mutex_lock+0x86/0x740
 ? io_rsrc_put_work+0x13d/0x1a0
 ? io_rsrc_put_work+0x13d/0x1a0
 ? io_rsrc_put_work+0x13d/0x1a0
 ? process_one_work+0x1ce/0x530
 io_rsrc_put_work+0x13d/0x1a0
 process_one_work+0x236/0x530
 worker_thread+0x52/0x3b0
 ? process_one_work+0x530/0x530
 kthread+0x135/0x160
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x1f/0x30

which is due to holding the ctx->uring_lock when flushing existing
pending work, while the pending work flushing may need to grab the uring
lock if we're using IOPOLL.

Fix this by dropping the uring_lock a bit earlier as part of the flush.

Cc: stable@vger.kernel.org
Link: https://github.com/axboe/liburing/issues/404
Tested-by: Ammar Faizi <ammarfaizi2@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-09 19:59:06 -06:00
Hao Xu
47cae0c71f io-wq: fix IO_WORKER_F_FIXED issue in create_io_worker()
There may be cases like:
        A                                 B
spin_lock(wqe->lock)
nr_workers is 0
nr_workers++
spin_unlock(wqe->lock)
                                     spin_lock(wqe->lock)
                                     nr_wokers is 1
                                     nr_workers++
                                     spin_unlock(wqe->lock)
create_io_worker()
  acct->worker is 1
                                     create_io_worker()
                                       acct->worker is 1

There should be one worker marked IO_WORKER_F_FIXED, but no one is.
Fix this by introduce a new agrument for create_io_worker() to indicate
if it is the first worker.

Fixes: 3d4e4face9 ("io-wq: fix no lock protection of acct->nr_worker")
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20210808135434.68667-3-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-09 19:59:06 -06:00
Hao Xu
49e7f0c789 io-wq: fix bug of creating io-wokers unconditionally
The former patch to add check between nr_workers and max_workers has a
bug, which will cause unconditionally creating io-workers. That's
because the result of the check doesn't affect the call of
create_io_worker(), fix it by bringing in a boolean value for it.

Fixes: 21698274da ("io-wq: fix lack of acct->nr_workers < acct->max_workers judgement")
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20210808135434.68667-2-haoxu@linux.alibaba.com
[axboe: drop hunk that isn't strictly needed]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-09 19:59:06 -06:00
Jens Axboe
4956b9eaad io_uring: rsrc ref lock needs to be IRQ safe
Nadav reports running into the below splat on re-enabling softirqs:

WARNING: CPU: 2 PID: 1777 at kernel/softirq.c:364 __local_bh_enable_ip+0xaa/0xe0
Modules linked in:
CPU: 2 PID: 1777 Comm: umem Not tainted 5.13.1+ #161
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
RIP: 0010:__local_bh_enable_ip+0xaa/0xe0
Code: a9 00 ff ff 00 74 38 65 ff 0d a2 21 8c 7a e8 ed 1a 20 00 fb 66 0f 1f 44 00 00 5b 41 5c 5d c3 65 8b 05 e6 2d 8c 7a 85 c0 75 9a <0f> 0b eb 96 e8 2d 1f 20 00 eb a5 4c 89 e7 e8 73 4f 0c 00 eb ae 65
RSP: 0018:ffff88812e58fcc8 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000201 RCX: dffffc0000000000
RDX: 0000000000000007 RSI: 0000000000000201 RDI: ffffffff8898c5ac
RBP: ffff88812e58fcd8 R08: ffffffff8575dbbf R09: ffffed1028ef14f9
R10: ffff88814778a7c3 R11: ffffed1028ef14f8 R12: ffffffff85c9e9ae
R13: ffff88814778a000 R14: ffff88814778a7b0 R15: ffff8881086db890
FS:  00007fbcfee17700(0000) GS:ffff8881e0300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c0402a5008 CR3: 000000011c1ac003 CR4: 00000000003706e0
Call Trace:
 _raw_spin_unlock_bh+0x31/0x40
 io_rsrc_node_ref_zero+0x13e/0x190
 io_dismantle_req+0x215/0x220
 io_req_complete_post+0x1b8/0x720
 __io_complete_rw.isra.0+0x16b/0x1f0
 io_complete_rw+0x10/0x20

where it's clear we end up calling the percpu count release directly
from the completion path, as it's in atomic mode and we drop the last
ref. For file/block IO, this can be from IRQ context already, and the
softirq locking for rsrc isn't enough.

Just make the lock fully IRQ safe, and ensure we correctly safe state
from the release path as we don't know the full context there.

Reported-by: Nadav Amit <nadav.amit@gmail.com>
Tested-by: Nadav Amit <nadav.amit@gmail.com>
Link: https://lore.kernel.org/io-uring/C187C836-E78B-4A31-B24C-D16919ACA093@gmail.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-09 19:58:59 -06:00
Linus Torvalds
9a73fa375d Merge branch 'for-5.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
 "One commit to fix a possible A-A deadlock around u64_stats_sync on
  32bit machines caused by updating it without disabling IRQ when it may
  be read from IRQ context"

* 'for-5.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: rstat: fix A-A deadlock on 32bit around u64_stats_sync
2021-08-09 16:47:36 -07:00
Guillaume Nault
143a8526ab bareudp: Fix invalid read beyond skb's linear data
Data beyond the UDP header might not be part of the skb's linear data.
Use skb_copy_bits() instead of direct access to skb->data+X, so that
we read the correct bytes even on a fragmented skb.

Fixes: 4b5f67232d ("net: Special handling for IP & MPLS.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/7741c46545c6ef02e70c80a9b32814b22d9616b3.1628264975.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-09 15:37:43 -07:00
Randy Dunlap
d6e712aa7e net: openvswitch: fix kernel-doc warnings in flow.c
Repair kernel-doc notation in a few places to make it conform to
the expected format.

Fixes the following kernel-doc warnings:

flow.c:296: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Parse vlan tag from vlan header.
flow.c:296: warning: missing initial short description on line:
 * Parse vlan tag from vlan header.
flow.c:537: warning: No description found for return value of 'key_extract_l3l4'
flow.c:769: warning: No description found for return value of 'key_extract'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: dev@openvswitch.org
Link: https://lore.kernel.org/r/20210808190834.23362-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-09 15:37:35 -07:00
Roi Dayan
beb7f2de57 psample: Add a fwd declaration for skbuff
Without this there is a warning if source files include psample.h
before skbuff.h or doesn't include it at all.

Fixes: 6ae0a62861 ("net: Introduce psample, a new genetlink channel for packet sampling")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Link: https://lore.kernel.org/r/20210808065242.1522535-1-roid@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-09 15:34:21 -07:00
Vineet Gupta
669d94219d MAINTAINERS: update Vineet's email address
I'll be leaving Synopsys shortly, but will continue to handle maintenance
for the transition period.

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-09 15:17:14 -07:00
Sven Schnelle
f153c22467 ucounts: add missing data type changes
commit f9c82a4ea8 ("Increase size of ucounts to atomic_long_t")
changed the data type of ucounts/ucounts_max to long, but missed to
adjust a few other places. This is noticeable on big endian platforms
from user space because the /proc/sys/user/max_*_names files all
contain 0.

v4 - Made the min and max constants long so the sysctl values
     are actually settable on little endian machines.
     -- EWB

Fixes: f9c82a4ea8 ("Increase size of ucounts to atomic_long_t")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Acked-by: Alexey Gladkov <legion@kernel.org>
v1: https://lkml.kernel.org/r/20210721115800.910778-1-svens@linux.ibm.com
v2: https://lkml.kernel.org/r/20210721125233.1041429-1-svens@linux.ibm.com
v3: https://lkml.kernel.org/r/20210730062854.3601635-1-svens@linux.ibm.com
Link: https://lkml.kernel.org/r/8735rijqlv.fsf_-_@disp2133
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-08-09 15:45:02 -05:00
Daniel Borkmann
71330842ff bpf: Add _kernel suffix to internal lockdown_bpf_read
Rename LOCKDOWN_BPF_READ into LOCKDOWN_BPF_READ_KERNEL so we have naming
more consistent with a LOCKDOWN_BPF_WRITE_USER option that we are adding.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-09 21:50:41 +02:00
Md Fahad Iqbal Polash
a7550f8b1c iavf: Set RSS LUT and key in reset handle path
iavf driver should set RSS LUT and key unconditionally in reset
path. Currently, the driver does not do that. This patch fixes
this issue.

Fixes: 2c86ac3c70 ("i40evf: create a generic config RSS function")
Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-09 09:59:23 -07:00
Brett Creeley
3ba7f53f8b ice: don't remove netdev->dev_addr from uc sync list
In some circumstances, such as with bridging, it's possible that the
stack will add the device's own MAC address to its unicast address list.

If, later, the stack deletes this address, the driver will receive a
request to remove this address.

The driver stores its current MAC address as part of the VSI MAC filter
list instead of separately. So, this causes a problem when the device's
MAC address is deleted unexpectedly, which results in traffic failure in
some cases.

The following configuration steps will reproduce the previously
mentioned problem:

> ip link set eth0 up
> ip link add dev br0 type bridge
> ip link set br0 up
> ip addr flush dev eth0
> ip link set eth0 master br0
> echo 1 > /sys/class/net/br0/bridge/vlan_filtering
> modprobe -r veth
> modprobe -r bridge
> ip addr add 192.168.1.100/24 dev eth0

The following ping command fails due to the netdev->dev_addr being
deleted when removing the bridge module.
> ping <link partner>

Fix this by making sure to not delete the netdev->dev_addr during MAC
address sync. After fixing this issue it was noticed that the
netdev_warn() in .set_mac was overly verbose, so make it at
netdev_dbg().

Also, there is a possibility of a race condition between .set_mac and
.set_rx_mode. Fix this by calling netif_addr_lock_bh() and
netif_addr_unlock_bh() on the device's netdev when the netdev->dev_addr
is going to be updated in .set_mac.

Fixes: e94d447866 ("ice: Implement filter sync, NDO operations and bump version")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Liang Li <liali@redhat.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-09 09:59:23 -07:00
Anirudh Venkataramanan
c503e63200 ice: Stop processing VF messages during teardown
When VFs are setup and torn down in quick succession, it is possible
that a VF is torn down by the PF while the VF's virtchnl requests are
still in the PF's mailbox ring. Processing the VF's virtchnl request
when the VF itself doesn't exist results in undefined behavior. Fix
this by adding a check to stop processing virtchnl requests when VF
teardown is in progress.

Fixes: ddf30f7ff8 ("ice: Add handler to configure SR-IOV")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-09 09:59:23 -07:00
Anirudh Venkataramanan
50ac747984 ice: Prevent probing virtual functions
The userspace utility "driverctl" can be used to change/override the
system's default driver choices. This is useful in some situations
(buggy driver, old driver missing a device ID, trying a workaround,
etc.) where the user needs to load a different driver.

However, this is also prone to user error, where a driver is mapped
to a device it's not designed to drive. For example, if the ice driver
is mapped to driver iavf devices, the ice driver crashes.

Add a check to return an error if the ice driver is being used to
probe a virtual function.

Fixes: 837f08fdec ("ice: Add basic driver framework for Intel(R) E800 Series")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-09 09:59:23 -07:00
Bart Van Assche
769f526767 configfs: restore the kernel v5.13 text attribute write behavior
Instead of appending new text attribute data at the offset specified by the
write() system call, only pass the newly written data to the .store()
callback.

Reported-by: Bodo Stroesser <bostroesser@gmail.com>
Tested-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-08-09 16:56:00 +02:00
Hangbin Liu
d09c548dbf net: sched: act_mirred: Reset ct info when mirror/redirect skb
When mirror/redirect a skb to a different port, the ct info should be reset
for reclassification. Or the pkts will match unexpected rules. For example,
with following topology and commands:

    -----------
              |
       veth0 -+-------
              |
       veth1 -+-------
              |
   ------------

 tc qdisc add dev veth0 clsact
 # The same with "action mirred egress mirror dev veth1" or "action mirred ingress redirect dev veth1"
 tc filter add dev veth0 egress chain 1 protocol ip flower ct_state +trk action mirred ingress mirror dev veth1
 tc filter add dev veth0 egress chain 0 protocol ip flower ct_state -inv action ct commit action goto chain 1
 tc qdisc add dev veth1 clsact
 tc filter add dev veth1 ingress chain 0 protocol ip flower ct_state +trk action drop

 ping <remove ip via veth0> &
 tc -s filter show dev veth1 ingress

With command 'tc -s filter show', we can find the pkts were dropped on
veth1.

Fixes: b57dc7c13e ("net/sched: Introduce action ct")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:58:47 +01:00
David S. Miller
605bb4434d Merge branch 'smc-fixes'
Guvenc Gulce says:

====================
net/smc: fixes 2021-08-09

please apply the following patch series for smc to netdev's net tree.
One patch fixes invalid connection counting for links and the other
one fixes an access to an already cleared link.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:47:00 +01:00
Guvenc Gulce
64513d269e net/smc: Correct smc link connection counter in case of smc client
SMC clients may be assigned to a different link after the initial
connection between two peers was established. In such a case,
the connection counter was not correctly set.

Update the connection counter correctly when a smc client connection
is assigned to a different smc link.

Fixes: 07d51580ff ("net/smc: Add connection counters for links")
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Tested-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:46:59 +01:00
Karsten Graul
8f3d65c166 net/smc: fix wait on already cleared link
There can be a race between the waiters for a tx work request buffer
and the link down processing that finally clears the link. Although
all waiters are woken up before the link is cleared there might be
waiters which did not yet get back control and are still waiting.
This results in an access to a cleared wait queue head.

Fix this by introducing atomic reference counting around the wait calls,
and wait with the link clear processing until all waiters have finished.
Move the work request layer related calls into smc_wr.c and set the
link state to INACTIVE before calling smcr_link_clear() in
smc_llc_srv_add_link().

Fixes: 15e1b99aad ("net/smc: no WR buffer wait for terminating link group")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:46:59 +01:00
Grygorii Strashko
acc68b8d2a net: ethernet: ti: cpsw: fix min eth packet size for non-switch use-cases
The CPSW switchdev driver inherited fix from commit 9421c90150 ("net:
ethernet: ti: cpsw: fix min eth packet size") which changes min TX packet
size to 64bytes (VLAN_ETH_ZLEN, excluding ETH_FCS). It was done to fix HW
packed drop issue when packets are sent from Host to the port with PVID and
un-tagging enabled. Unfortunately this breaks some other non-switch
specific use-cases, like:
- [1] CPSW port as DSA CPU port with DSA-tag applied at the end of the
packet
- [2] Some industrial protocols, which expects min TX packet size 60Bytes
(excluding FCS).

Fix it by configuring min TX packet size depending on driver mode
 - 60Bytes (ETH_ZLEN) for multi mac (dual-mac) mode
 - 64Bytes (VLAN_ETH_ZLEN) for switch mode
and update it during driver mode change and annotate with
READ_ONCE()/WRITE_ONCE() as it can be read by napi while writing.

[1] https://lore.kernel.org/netdev/20210531124051.GA15218@cephalopod/
[2] https://e2e.ti.com/support/arm/sitara_arm/f/791/t/701669

Cc: stable@vger.kernel.org
Fixes: ed3525eda4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac")
Reported-by: Ben Hutchings <ben.hutchings@essensium.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:16:46 +01:00
Yunsheng Lin
0fa32ca438 page_pool: mask the page->signature before the checking
As mentioned in commit c07aea3ef4 ("mm: add a signature in
struct page"):
"The page->signature field is aliased to page->lru.next and
page->compound_head."

And as the comment in page_is_pfmemalloc():
"lru.next has bit 1 set if the page is allocated from the
pfmemalloc reserves. Callers may simply overwrite it if they
do not need to preserve that information."

The page->signature is OR’ed with PP_SIGNATURE when a page is
allocated in page pool, see __page_pool_alloc_pages_slow(),
and page->signature is checked directly with PP_SIGNATURE in
page_pool_return_skb_page(), which might cause resoure leaking
problem for a page from page pool if bit 1 of lru.next is set
for a pfmemalloc page. What happens here is that the original
pp->signature is OR'ed with PP_SIGNATURE after the allocation
in order to preserve any existing bits(such as the bit 1, used
to indicate a pfmemalloc page), so when those bits are present,
those page is not considered to be from page pool and the DMA
mapping of those pages will be left stale.

As bit 0 is for page->compound_head, So mask both bit 0/1 before
the checking in page_pool_return_skb_page(). And we will return
those pfmemalloc pages back to the page allocator after cleaning
up the DMA mapping.

Fixes: 6a5bcd84e8 ("page_pool: Allow drivers to hint on SKB recycling")
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:03:02 +01:00
Randy Dunlap
86aab09a48 dccp: add do-while-0 stubs for dccp_pr_debug macros
GCC complains about empty macros in an 'if' statement, so convert
them to 'do {} while (0)' macros.

Fixes these build warnings:

net/dccp/output.c: In function 'dccp_xmit_packet':
../net/dccp/output.c:283:71: warning: suggest braces around empty body in an 'if' statement [-Wempty-body]
  283 |                 dccp_pr_debug("transmit_skb() returned err=%d\n", err);
net/dccp/ackvec.c: In function 'dccp_ackvec_update_old':
../net/dccp/ackvec.c:163:80: warning: suggest braces around empty body in an 'else' statement [-Wempty-body]
  163 |                                               (unsigned long long)seqno, state);

Fixes: dc841e30ea ("dccp: Extend CCID packet dequeueing interface")
Fixes: 3802408644 ("dccp ccid-2: Update code for the Ack Vector input/registration routine")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: dccp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:00:02 +01:00
Zhenyu Wang
699aa57b35 drm/i915/gvt: Fix cached atomics setting for Windows VM
We've seen recent regression with host and windows VM running
simultaneously that cause gpu hang or even crash. Finally bisect to
commit 58586680ff ("drm/i915: Disable atomics in L3 for gen9"),
which seems cached atomics behavior difference caused regression
issue.

This tries to add new scratch register handler and add those in mmio
save/restore list for context switch. No gpu hang produced with this one.

Cc: stable@vger.kernel.org # 5.12+
Cc: "Xu, Terrence" <terrence.xu@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Cc: "Ekstrand, Jason" <jason.ekstrand@intel.com>
Reviewed-by: Colin Xu <colin.xu@intel.com>
Fixes: 58586680ff ("drm/i915: Disable atomics in L3 for gen9")
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20210806044056.648016-1-zhenyuw@linux.intel.com
2021-08-09 14:42:09 +08:00
Pu Lehui
43e8f76006 powerpc/kprobes: Fix kprobe Oops happens in booke
When using kprobe on powerpc booke series processor, Oops happens
as show bellow:

/ # echo "p:myprobe do_nanosleep" > /sys/kernel/debug/tracing/kprobe_events
/ # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
/ # sleep 1
[   50.076730] Oops: Exception in kernel mode, sig: 5 [#1]
[   50.077017] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500
[   50.077221] Modules linked in:
[   50.077462] CPU: 0 PID: 77 Comm: sleep Not tainted 5.14.0-rc4-00022-g251a1524293d #21
[   50.077887] NIP:  c0b9c4e0 LR: c00ebecc CTR: 00000000
[   50.078067] REGS: c3883de0 TRAP: 0700   Not tainted (5.14.0-rc4-00022-g251a1524293d)
[   50.078349] MSR:  00029000 <CE,EE,ME>  CR: 24000228  XER: 20000000
[   50.078675]
[   50.078675] GPR00: c00ebdf0 c3883e90 c313e300 c3883ea0 00000001 00000000 c3883ecc 00000001
[   50.078675] GPR08: c100598c c00ea250 00000004 00000000 24000222 102490c2 bff4180c 101e60d4
[   50.078675] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f8 10240000 00500000
[   50.078675] GPR24: 00000002 00000000 c3883ea0 00000001 00000000 0000c350 3b9b8d50 00000000
[   50.080151] NIP [c0b9c4e0] do_nanosleep+0x0/0x190
[   50.080352] LR [c00ebecc] hrtimer_nanosleep+0x14c/0x1e0
[   50.080638] Call Trace:
[   50.080801] [c3883e90] [c00ebdf0] hrtimer_nanosleep+0x70/0x1e0 (unreliable)
[   50.081110] [c3883f00] [c00ec004] sys_nanosleep_time32+0xa4/0x110
[   50.081336] [c3883f40] [c001509c] ret_from_syscall+0x0/0x28
[   50.081541] --- interrupt: c00 at 0x100a4d08
[   50.081749] NIP:  100a4d08 LR: 101b5234 CTR: 00000003
[   50.081931] REGS: c3883f50 TRAP: 0c00   Not tainted (5.14.0-rc4-00022-g251a1524293d)
[   50.082183] MSR:  0002f902 <CE,EE,PR,FP,ME>  CR: 24000222  XER: 00000000
[   50.082457]
[   50.082457] GPR00: 000000a2 bf980040 1024b4d0 bf980084 bf980084 64000000 00555345 fefefeff
[   50.082457] GPR08: 7f7f7f7f 101e0000 00000069 00000003 28000422 102490c2 bff4180c 101e60d4
[   50.082457] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f8 10240000 00500000
[   50.082457] GPR24: 00000002 bf9803f4 10240000 00000000 00000000 100039e0 00000000 102444e8
[   50.083789] NIP [100a4d08] 0x100a4d08
[   50.083917] LR [101b5234] 0x101b5234
[   50.084042] --- interrupt: c00
[   50.084238] Instruction dump:
[   50.084483] 4bfffc40 60000000 60000000 60000000 9421fff0 39400402 914200c0 38210010
[   50.084841] 4bfffc20 00000000 00000000 00000000 <7fe00008> 7c0802a6 7c892378 93c10048
[   50.085487] ---[ end trace f6fffe98e2fa8f3e ]---
[   50.085678]
Trace/breakpoint trap

There is no real mode for booke arch and the MMU translation is
always on. The corresponding MSR_IS/MSR_DS bit in booke is used
to switch the address space, but not for real mode judgment.

Fixes: 21f8b2fa3c ("powerpc/kprobes: Ignore traps that happened in real mode")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210809023658.218915-1-pulehui@huawei.com
2021-08-09 16:31:54 +10:00
Takashi Iwai
dc0dc8a73e ALSA: pcm: Fix mmap breakage without explicit buffer setup
The recent fix c4824ae7db ("ALSA: pcm: Fix mmap capability check")
restricts the mmap capability only to the drivers that properly set up
the buffers, but it caused a regression for a few drivers that manage
the buffer on its own way.

For those with UNKNOWN buffer type (i.e. the uninitialized / unused
substream->dma_buffer), just assume that the driver handles the mmap
properly and blindly trust the hardware info bit.

Fixes: c4824ae7db ("ALSA: pcm: Fix mmap capability check")
Reported-and-tested-by: Jeff Woods <jwoods@fnordco.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/s5him0gpghv.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-09 07:52:31 +02:00
Marek Behún
484f2b7c61 cpufreq: armada-37xx: forbid cpufreq for 1.2 GHz variant
The 1.2 GHz variant of the Armada 3720 SOC is unstable with DVFS: when
the SOC boots, the WTMI firmware sets clocks and AVS values that work
correctly with 1.2 GHz CPU frequency, but random crashes occur once
cpufreq driver starts scaling.

We do not know currently what is the reason:
- it may be that the voltage value for L0 for 1.2 GHz variant provided
  by the vendor in the OTP is simply incorrect when scaling is used,
- it may be that some delay is needed somewhere,
- it may be something else.

The most sane solution now seems to be to simply forbid the cpufreq
driver on 1.2 GHz variant.

Signed-off-by: Marek Behún <kabel@kernel.org>
Fixes: 92ce45fb87 ("cpufreq: Add DVFS support for Armada 37xx")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-08-09 09:31:22 +05:30
Nadav Amit
20c0b380f9 io_uring: Use WRITE_ONCE() when writing to sq_flags
The compiler should be forbidden from any strange optimization for async
writes to user visible data-structures. Without proper protection, the
compiler can cause write-tearing or invent writes that would confuse the
userspace.

However, there are writes to sq_flags which are not protected by
WRITE_ONCE(). Use WRITE_ONCE() for these writes.

This is purely a theoretical issue. Presumably, any compiler is very
unlikely to do such optimizations.

Fixes: 75b28affdd ("io_uring: allocate the two rings together")
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Link: https://lore.kernel.org/r/20210808001342.964634-3-namit@vmware.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-08 21:21:11 -06:00
Nadav Amit
ef98eb0409 io_uring: clear TIF_NOTIFY_SIGNAL when running task work
When using SQPOLL, the submission queue polling thread calls
task_work_run() to run queued work. However, when work is added with
TWA_SIGNAL - as done by io_uring itself - the TIF_NOTIFY_SIGNAL remains
set afterwards and is never cleared.

Consequently, when the submission queue polling thread checks whether
signal_pending(), it may always find a pending signal, if
task_work_add() was ever called before.

The impact of this bug might be different on different kernel versions.
It appears that on 5.14 it would only cause unnecessary calculation and
prevent the polling thread from sleeping. On 5.13, where the bug was
found, it stops the polling thread from finding newly submitted work.

Instead of task_work_run(), use tracehook_notify_signal() that clears
TIF_NOTIFY_SIGNAL. Test for TIF_NOTIFY_SIGNAL in addition to
current->task_works to avoid a race in which task_works is cleared but
the TIF_NOTIFY_SIGNAL is set.

Fixes: 685fe7feed ("io-wq: eliminate the need for a manager thread")
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Link: https://lore.kernel.org/r/20210808001342.964634-2-namit@vmware.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-08 21:21:11 -06:00
Pali Rohár
3125f26c51 ppp: Fix generating ppp unit id when ifname is not specified
When registering new ppp interface via PPPIOCNEWUNIT ioctl then kernel has
to choose interface name as this ioctl API does not support specifying it.

Kernel in this case register new interface with name "ppp<id>" where <id>
is the ppp unit id, which can be obtained via PPPIOCGUNIT ioctl. This
applies also in the case when registering new ppp interface via rtnl
without supplying IFLA_IFNAME.

PPPIOCNEWUNIT ioctl allows to specify own ppp unit id which will kernel
assign to ppp interface, in case this ppp id is not already used by other
ppp interface.

In case user does not specify ppp unit id then kernel choose the first free
ppp unit id. This applies also for case when creating ppp interface via
rtnl method as it does not provide a way for specifying own ppp unit id.

If some network interface (does not have to be ppp) has name "ppp<id>"
with this first free ppp id then PPPIOCNEWUNIT ioctl or rtnl call fails.

And registering new ppp interface is not possible anymore, until interface
which holds conflicting name is renamed. Or when using rtnl method with
custom interface name in IFLA_IFNAME.

As list of allocated / used ppp unit ids is not possible to retrieve from
kernel to userspace, userspace has no idea what happens nor which interface
is doing this conflict.

So change the algorithm how ppp unit id is generated. And choose the first
number which is not neither used as ppp unit id nor in some network
interface with pattern "ppp<id>".

This issue can be simply reproduced by following pppd call when there is no
ppp interface registered and also no interface with name pattern "ppp<id>":

    pppd ifname ppp1 +ipv6 noip noauth nolock local nodetach pty "pppd +ipv6 noip noauth nolock local nodetach notty"

Or by creating the one ppp interface (which gets assigned ppp unit id 0),
renaming it to "ppp1" and then trying to create a new ppp interface (which
will always fails as next free ppp unit id is 1, but network interface with
name "ppp1" exists).

This patch fixes above described issue by generating new and new ppp unit
id until some non-conflicting id with network interfaces is generated.

Signed-off-by: Pali Rohár <pali@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08 13:08:46 +01:00
Pali Rohár
2459dcb96b ppp: Fix generating ifname when empty IFLA_IFNAME is specified
IFLA_IFNAME is nul-term string which means that IFLA_IFNAME buffer can be
larger than length of string which contains.

Function __rtnl_newlink() generates new own ifname if either IFLA_IFNAME
was not specified at all or userspace passed empty nul-term string.

It is expected that if userspace does not specify ifname for new ppp netdev
then kernel generates one in format "ppp<id>" where id matches to the ppp
unit id which can be later obtained by PPPIOCGUNIT ioctl.

And it works in this way if IFLA_IFNAME is not specified at all. But it
does not work when IFLA_IFNAME is specified with empty string.

So fix this logic also for empty IFLA_IFNAME in ppp_nl_newlink() function
and correctly generates ifname based on ppp unit identifier if userspace
did not provided preferred ifname.

Without this patch when IFLA_IFNAME was specified with empty string then
kernel created a new ppp interface in format "ppp<id>" but id did not
match ppp unit id returned by PPPIOCGUNIT ioctl. In this case id was some
number generated by __rtnl_newlink() function.

Signed-off-by: Pali Rohár <pali@kernel.org>
Fixes: bb8082f691 ("ppp: build ifname using unit identifier for rtnl based devices")
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08 13:07:52 +01:00
David S. Miller
2f5501a8f1 Merge branch 'bnxt_en-ptp-fixes'
Michael Chan says:

====================
bnxt_en: PTP fixes

This series includes 2 fixes for the PTP feature.  Update to the new
firmware interface so that the driver can pass the PTP sequence number
header offset of TX packets to the firmware.  This is needed for all
PTP packet types (v1, v2, with or without VLAN) to work.  The 2nd
fix is to use a different register window to read the PHC to avoid
conflict with an older Broadcom tool.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08 13:05:51 +01:00
Michael Chan
92529df76d bnxt_en: Use register window 6 instead of 5 to read the PHC
Some older Broadcom debug tools use window 5 and may conflict, so switch
to use window 6 instead.

Fixes: 118612d519 ("bnxt_en: Add PTP clock APIs, ioctls, and ethtool methods")
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08 13:05:51 +01:00
Michael Chan
9e26680733 bnxt_en: Update firmware call to retrieve TX PTP timestamp
New firmware interface requires the PTP sequence ID header offset to
be passed to the firmware to properly find the matching timestamp
for all protocols.

Fixes: 83bb623c96 ("bnxt_en: Transmit and retrieve packet timestamps")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08 13:05:51 +01:00
Michael Chan
fbfee25796 bnxt_en: Update firmware interface to 1.10.2.52
The key change is the firmware call to retrieve the PTP TX timestamp.
The header offset for the PTP sequence number field is now added.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08 13:05:51 +01:00
Kefeng Wang
1027b96ec9 once: Fix panic when module unload
DO_ONCE
DEFINE_STATIC_KEY_TRUE(___once_key);
__do_once_done
  once_disable_jump(once_key);
    INIT_WORK(&w->work, once_deferred);
    struct once_work *w;
    w->key = key;
    schedule_work(&w->work);                     module unload
                                                   //*the key is
destroy*
process_one_work
  once_deferred
    BUG_ON(!static_key_enabled(work->key));
       static_key_count((struct static_key *)x)    //*access key, crash*

When module uses DO_ONCE mechanism, it could crash due to the above
concurrency problem, we could reproduce it with link[1].

Fix it by add/put module refcount in the once work process.

[1] https://lore.kernel.org/netdev/eaa6c371-465e-57eb-6be9-f4b16b9d7cbf@huawei.com/

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Reported-by: Minmin chen <chenmingmin@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08 13:00:20 +01:00
Vinicius Costa Gomes
d329e41a08 ptp: Fix possible memory leak caused by invalid cast
Fixes possible leak of PTP virtual clocks.

The number of PTP virtual clocks to be unregistered is passed as
'u32', but the function that unregister the devices handles that as
'u8'.

Fixes: 73f37068d5 ("ptp: support ptp physical/virtual clocks conversion")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08 12:56:41 +01:00
Ben Hutchings
2383cb9497 net: phy: micrel: Fix link detection on ksz87xx switch"
Commit a5e63c7d38 "net: phy: micrel: Fix detection of ksz87xx
switch" broke link detection on the external ports of the KSZ8795.

The previously unused phy_driver structure for these devices specifies
config_aneg and read_status functions that appear to be designed for a
fixed link and do not work with the embedded PHYs in the KSZ8795.

Delete the use of these functions in favour of the generic PHY
implementations which were used previously.

Fixes: a5e63c7d38 ("net: phy: micrel: Fix detection of ksz87xx switch")
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08 12:03:24 +01:00
Loic Poulain
34737e1320 net: wwan: mhi_wwan_ctrl: Fix possible deadlock
Lockdep detected possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&mhiwwan->rx_lock);
                               local_irq_disable();
                               lock(&mhi_cntrl->pm_lock);
                               lock(&mhiwwan->rx_lock);
   <Interrupt>
     lock(&mhi_cntrl->pm_lock);

  *** DEADLOCK ***

To prevent this we need to disable the soft-interrupts when taking
the rx_lock.

Cc: stable@vger.kernel.org
Fixes: fa588eba63 ("net: Add Qcom WWAN control driver")
Reported-by: Thomas Perrot <thomas.perrot@bootlin.com>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-07 09:35:48 +01:00
Oleksij Rempel
47fac45600 net: dsa: qca: ar9331: make proper initial port defaults
Make sure that all external port are actually isolated from each other,
so no packets are leaked.

Fixes: ec6698c272 ("net: dsa: add support for Atheros AR9331 built-in switch")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-07 09:34:35 +01:00
David S. Miller
d992e99b87 Merge branch 'r8169-RTL8106e'
Hayes Wang says:

====================
r8169: adjust the setting for RTL8106e

These patches are uesed to avoid the delay of link-up interrupt, when
enabling ASPM for RTL8106e. The patch #1 is used to enable ASPM if
it is possible. And the patch #2 is used to modify the entrance latencies
of L0 and L1.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-07 09:33:22 +01:00
Hayes Wang
9c40186488 r8169: change the L0/L1 entrance latencies for RTL8106e
The original L0 and L1 entrance latencies of RTL8106e are 4us. And
they cause the delay of link-up interrupt when enabling ASPM. Change
the L0 entrance latency to 7us and L1 entrance latency to 32us. Then,
they could avoid the issue.

Tested-by: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-07 09:33:22 +01:00
Hayes Wang
2115d3d482 Revert "r8169: avoid link-up interrupt issue on RTL8106e if user enables ASPM"
This reverts commit 1ee8856de8.

This is used to re-enable ASPM on RTL8106e, if it is possible.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-07 09:33:22 +01:00
David S. Miller
84103209ba Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-08-07

The following pull-request contains BPF updates for your *net* tree.

We've added 4 non-merge commits during the last 9 day(s) which contain
a total of 4 files changed, 8 insertions(+), 7 deletions(-).

The main changes are:

1) Fix integer overflow in htab's lookup + delete batch op, from Tatsuhiko Yasumatsu.

2) Fix invalid fd 0 close in libbpf if BTF parsing failed, from Daniel Xu.

3) Fix libbpf feature probe for BPF_PROG_TYPE_CGROUP_SOCKOPT, from Robin Gögge.

4) Fix minor libbpf doc warning regarding code-block language, from Randy Dunlap.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-07 09:26:54 +01:00
Luke D Jones
739d0959fb ALSA: hda: Add quirk for ASUS Flow x13
The ASUS GV301QH sound appears to work well with the quirk for
ALC294_FIXUP_ASUS_DUAL_SPK.

Signed-off-by: Luke D Jones <luke@ljones.dev>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210807025805.27321-1-luke@ljones.dev
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-07 08:42:34 +02:00
Maxim Kochetkov
be7ecbd240 soc: fsl: qe: convert QE interrupt controller to platform_device
Since 5.13 QE's ucc nodes can't get interrupts from devicetree:

	ucc@2000 {
		cell-index = <1>;
		reg = <0x2000 0x200>;
		interrupts = <32>;
		interrupt-parent = <&qeic>;
	};

Now fw_devlink expects driver to create and probe a struct device
for interrupt controller.

So lets convert this driver to simple platform_device with probe().
Also use platform_get_ and devm_ family function to get/allocate
resources and drop unused .compatible = "qeic".

[1] - https://lore.kernel.org/lkml/CAGETcx9PiX==mLxB9PO8Myyk6u2vhPVwTMsA5NkD-ywH5xhusw@mail.gmail.com
Fixes: e590474768 ("driver core: Set fw_devlink=on by default")
Fixes: ea718c6990 ("Revert "Revert "driver core: Set fw_devlink=on by default""")
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
2021-08-06 18:41:30 -05:00
Tatsuhiko Yasumatsu
c4eb1f4032 bpf: Fix integer overflow involving bucket_size
In __htab_map_lookup_and_delete_batch(), hash buckets are iterated
over to count the number of elements in each bucket (bucket_size).
If bucket_size is large enough, the multiplication to calculate
kvmalloc() size could overflow, resulting in out-of-bounds write
as reported by KASAN:

  [...]
  [  104.986052] BUG: KASAN: vmalloc-out-of-bounds in __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.986489] Write of size 4194224 at addr ffffc9010503be70 by task crash/112
  [  104.986889]
  [  104.987193] CPU: 0 PID: 112 Comm: crash Not tainted 5.14.0-rc4 #13
  [  104.987552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  [  104.988104] Call Trace:
  [  104.988410]  dump_stack_lvl+0x34/0x44
  [  104.988706]  print_address_description.constprop.0+0x21/0x140
  [  104.988991]  ? __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.989327]  ? __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.989622]  kasan_report.cold+0x7f/0x11b
  [  104.989881]  ? __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.990239]  kasan_check_range+0x17c/0x1e0
  [  104.990467]  memcpy+0x39/0x60
  [  104.990670]  __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.990982]  ? __wake_up_common+0x4d/0x230
  [  104.991256]  ? htab_of_map_free+0x130/0x130
  [  104.991541]  bpf_map_do_batch+0x1fb/0x220
  [...]

In hashtable, if the elements' keys have the same jhash() value, the
elements will be put into the same bucket. By putting a lot of elements
into a single bucket, the value of bucket_size can be increased to
trigger the integer overflow.

Triggering the overflow is possible for both callers with CAP_SYS_ADMIN
and callers without CAP_SYS_ADMIN.

It will be trivial for a caller with CAP_SYS_ADMIN to intentionally
reach this overflow by enabling BPF_F_ZERO_SEED. As this flag will set
the random seed passed to jhash() to 0, it will be easy for the caller
to prepare keys which will be hashed into the same value, and thus put
all the elements into the same bucket.

If the caller does not have CAP_SYS_ADMIN, BPF_F_ZERO_SEED cannot be
used. However, it will be still technically possible to trigger the
overflow, by guessing the random seed value passed to jhash() (32bit)
and repeating the attempt to trigger the overflow. In this case,
the probability to trigger the overflow will be low and will take
a very long time.

Fix the integer overflow by calling kvmalloc_array() instead of
kvmalloc() to allocate memory.

Fixes: 057996380a ("bpf: Add batch ops to all htab bpf map")
Signed-off-by: Tatsuhiko Yasumatsu <th.yasumatsu@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210806150419.109658-1-th.yasumatsu@gmail.com
2021-08-07 01:39:22 +02:00
Randy Dunlap
7c4a22339e libbpf, doc: Eliminate warnings in libbpf_naming_convention
Use "code-block: none" instead of "c" for non-C-language code blocks.
Removes these warnings:

  lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:111: WARNING: Could not lex literal_block as "c". Highlighting skipped.
  lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:124: WARNING: Could not lex literal_block as "c". Highlighting skipped.

Fixes: f42cfb469f ("bpf: Add documentation for libbpf including API autogen")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210802015037.787-1-rdunlap@infradead.org
2021-08-07 01:39:19 +02:00
Daniel Xu
c34c338a40 libbpf: Do not close un-owned FD 0 on errors
Before this patch, btf_new() was liable to close an arbitrary FD 0 if
BTF parsing failed. This was because:

* btf->fd was initialized to 0 through the calloc()
* btf__free() (in the `done` label) closed any FDs >= 0
* btf->fd is left at 0 if parsing fails

This issue was discovered on a system using libbpf v0.3 (without
BTF_KIND_FLOAT support) but with a kernel that had BTF_KIND_FLOAT types
in BTF. Thus, parsing fails.

While this patch technically doesn't fix any issues b/c upstream libbpf
has BTF_KIND_FLOAT support, it'll help prevent issues in the future if
more BTF types are added. It also allow the fix to be backported to
older libbpf's.

Fixes: 3289959b97 ("libbpf: Support BTF loading and raw data output in both endianness")
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/5969bb991adedb03c6ae93e051fd2a00d293cf25.1627513670.git.dxu@dxuuu.xyz
2021-08-07 01:39:15 +02:00
Robin Gögge
78d14bda86 libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT
This patch fixes the probe for BPF_PROG_TYPE_CGROUP_SOCKOPT,
so the probe reports accurate results when used by e.g.
bpftool.

Fixes: 4cdbfb59c4 ("libbpf: support sockopt hooks")
Signed-off-by: Robin Gögge <r.goegge@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20210728225825.2357586-1-r.goegge@gmail.com
2021-08-07 01:38:52 +02:00
Laurent Dufour
c18956e6e0 powerpc/pseries: Fix update of LPAR security flavor after LPM
After LPM, when migrating from a system with security mitigation enabled
to a system with mitigation disabled, the security flavor exposed in
/proc is not correctly set back to 0.

Do not assume the value of the security flavor is set to 0 when entering
init_cpu_char_feature_flags(), so when called after a LPM, the value is
set correctly even if the mitigation are not turned off.

Fixes: 6ce56e1ac3 ("powerpc/pseries: export LPAR security flavor in lparcfg")
Cc: stable@vger.kernel.org # v5.13+
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210805152308.33988-1-ldufour@linux.ibm.com
2021-08-07 08:53:59 +10:00
Christophe Leroy
8241461536 powerpc/smp: Fix OOPS in topology_init()
Running an SMP kernel on an UP platform not prepared for it,
I encountered the following OOPS:

	BUG: Kernel NULL pointer dereference on read at 0x00000034
	Faulting instruction address: 0xc0a04110
	Oops: Kernel access of bad area, sig: 11 [#1]
	BE PAGE_SIZE=4K SMP NR_CPUS=2 CMPCPRO
	Modules linked in:
	CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-pmac-00001-g230fedfaad21 #5234
	NIP:  c0a04110 LR: c0a040d8 CTR: c0a04084
	REGS: e100dda0 TRAP: 0300   Not tainted  (5.13.0-pmac-00001-g230fedfaad21)
	MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 84000284  XER: 00000000
	DAR: 00000034 DSISR: 20000000
	GPR00: c0006bd4 e100de60 c1033320 00000000 00000000 c0942274 00000000 00000000
	GPR08: 00000000 00000000 00000001 00000063 00000007 00000000 c0006f30 00000000
	GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000005
	GPR24: c0c67d74 c0c67f1c c0c60000 c0c67d70 c0c0c558 1efdf000 c0c00020 00000000
	NIP [c0a04110] topology_init+0x8c/0x138
	LR [c0a040d8] topology_init+0x54/0x138
	Call Trace:
	[e100de60] [80808080] 0x80808080 (unreliable)
	[e100de90] [c0006bd4] do_one_initcall+0x48/0x1bc
	[e100def0] [c0a0150c] kernel_init_freeable+0x1c8/0x278
	[e100df20] [c0006f44] kernel_init+0x14/0x10c
	[e100df30] [c00190fc] ret_from_kernel_thread+0x14/0x1c
	Instruction dump:
	7c692e70 7d290194 7c035040 7c7f1b78 5529103a 546706fe 5468103a 39400001
	7c641b78 40800054 80c690b4 7fb9402e <81060034> 7fbeea14 2c080000 7fa3eb78
	---[ end trace b246ffbc6bbbb6fb ]---

Fix it by checking smp_ops before using it, as already done in
several other places in the arch/powerpc/kernel/smp.c

Fixes: 39f8756145 ("powerpc/smp: Move ppc_md.cpu_die() to smp_ops.cpu_offline_self()")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/75287841cbb8740edd44880fe60be66d489160d9.1628097995.git.christophe.leroy@csgroup.eu
2021-08-07 08:53:59 +10:00
Christophe Leroy
b5cfc9cd7b powerpc/32: Fix critical and debug interrupts on BOOKE
32 bits BOOKE have special interrupts for debug and other
critical events.

When handling those interrupts, dedicated registers are saved
in the stack frame in addition to the standard registers, leading
to a shift of the pt_regs struct.

Since commit db297c3b07 ("powerpc/32: Don't save thread.regs on
interrupt entry"), the pt_regs struct is expected to be at the
same place all the time.

Instead of handling a special struct in addition to pt_regs, just
add those special registers to struct pt_regs.

Fixes: db297c3b07 ("powerpc/32: Don't save thread.regs on interrupt entry")
Cc: stable@vger.kernel.org
Reported-by: Radu Rendec <radu.rendec@gmail.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/028d5483b4851b01ea4334d0751e7f260419092b.1625637264.git.christophe.leroy@csgroup.eu
2021-08-07 08:53:59 +10:00
Christophe Leroy
6237636504 powerpc/32s: Fix napping restore in data storage interrupt (DSI)
When a DSI (Data Storage Interrupt) is taken while in NAP mode,
r11 doesn't survive the call to power_save_ppc32_restore().

So use r1 instead of r11 as they both contain the virtual stack
pointer at that point.

Fixes: 4c0104a83f ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")
Cc: stable@vger.kernel.org # v5.13+
Reported-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/731694e0885271f6ee9ffc179eb4bcee78313682.1628003562.git.christophe.leroy@csgroup.eu
2021-08-07 08:53:59 +10:00
Solomon Chiu
46dd2965bd drm/amdgpu: Add preferred mode in modeset when freesync video mode's enabled.
[Why]
With kernel module parameter "freesync_video" is enabled, if the mode
is changed to preferred mode(the mode with highest rate), then Freesync
fails because the preferred mode is treated as one of freesync video
mode, and then be configurated as freesync video mode(fixed refresh
rate).

[How]
Skip freesync fixed rate configurating when modeset to preferred mode.

Signed-off-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-06 17:00:50 -04:00
Manivannan Sadhasivam
b48027083a mtd: rawnand: Fix probe failure due to of_get_nand_secure_regions()
Due to 14f97f0b8e, the rawnand platforms without "secure-regions"
property defined in DT fails to probe. The issue is,
of_get_nand_secure_regions() errors out if
of_property_count_elems_of_size() returns a negative error code.

If the "secure-regions" property is not present in DT, then also we'll
get -EINVAL from of_property_count_elems_of_size() but it should not
be treated as an error for platforms not declaring "secure-regions"
in DT.

So fix this behaviour by checking for the existence of that property in
DT and return 0 if it is not present.

Fixes: 14f97f0b8e ("mtd: rawnand: Add a check in of_get_nand_secure_regions()")
Reported-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Martin Kaiser <martin@kaiser.cx>
Tested-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210727062813.32619-1-manivannan.sadhasivam@linaro.org
2021-08-06 21:44:16 +02:00
Desmond Cheong Zhi Xi
b7abb05168 mtd: fix lock hierarchy in deregister_mtd_blktrans
There is a lock hierarchy of major_names_lock --> mtd_table_mutex. One
existing chain is as follows:

1. major_names_lock --> loop_ctl_mutex (when blk_request_module calls
loop_probe)

2. loop_ctl_mutex --> bdev->bd_mutex (when loop_control_ioctl calls
loop_remove, which then calls del_gendisk)

3. bdev->bd_mutex --> mtd_table_mutex (when blkdev_get_by_dev calls
__blkdev_get, which then calls blktrans_open)

Since unregister_blkdev grabs the major_names_lock, we need to call it
outside the critical section for mtd_table_mutex, otherwise we invert
the lock hierarchy.

Reported-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210717100719.728829-1-desmondcheongzx@gmail.com
2021-08-06 21:44:16 +02:00
Colin Ian King
99dc4ad992 mtd: devices: mchp48l640: Fix memory leak on cmd
The allocation for cmd is not being kfree'd on the return leading to
a memory leak. Fix this by kfree'ing it.

Addresses-Coverity: ("Resource leak")
Fixes: 88d1250267 ("mtd: devices: add support for microchip 48l640 EERAM")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Heiko Schocher <hs@denx.de>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210712145214.101377-1-colin.king@canonical.com
2021-08-06 21:44:09 +02:00
Jakub Kicinski
cc4e5eecd4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Restrict range element expansion in ipset to avoid soft lockup,
   from Jozsef Kadlecsik.

2) Memleak in error path for nf_conntrack_bridge for IPv4 packets,
   from Yajun Deng.

3) Simplify conntrack garbage collection strategy to avoid frequent
   wake-ups, from Florian Westphal.

4) Fix NFNLA_HOOK_FUNCTION_NAME string, do not include module name.

5) Missing chain family netlink attribute in chain description
   in nfnetlink_hook.

6) Incorrect sequence number on nfnetlink_hook dumps.

7) Use netlink request family in reply message for consistency.

8) Remove offload_pickup sysctl, use conntrack for established state
   instead, from Florian Westphal.

9) Translate NFPROTO_INET/ingress to NFPROTO_NETDEV/ingress, since
   NFPROTO_INET is not exposed through nfnetlink_hook.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
  netfilter: nfnetlink_hook: translate inet ingress to netdev
  netfilter: conntrack: remove offload_pickup sysctl again
  netfilter: nfnetlink_hook: Use same family as request message
  netfilter: nfnetlink_hook: use the sequence number of the request message
  netfilter: nfnetlink_hook: missing chain family
  netfilter: nfnetlink_hook: strip off module name from hookfn
  netfilter: conntrack: collect all entries in one cycle
  netfilter: nf_conntrack_bridge: Fix memory leak when error
  netfilter: ipset: Limit the maximal range of consecutive elements to add/delete
====================

Link: https://lore.kernel.org/r/20210806151149.6356-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-06 08:44:50 -07:00
Christophe JAILLET
5126da7d99 drm/amd/pm: Fix a memory leak in an error handling path in 'vangogh_tables_init()'
'watermarks_table' must be freed instead 'clocks_table', because
'clocks_table' is known to be NULL at this point and 'watermarks_table' is
never freed if the last kzalloc fails.

Fixes: c98ee89736 ("drm/amd/pm: add the fine grain tuning function for vangogh")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-06 11:36:55 -04:00
Alex Deucher
202ead5a3c drm/amdgpu: don't enable baco on boco platforms in runpm
If the platform uses BOCO, don't use BACO in runtime suspend.
We could end up executing the BACO path if the platform supports
both.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1669
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-06 11:35:58 -04:00
John Clements
39932ef758 drm/amdgpu: set RAS EEPROM address from VBIOS
update to latest atombios fw table

[Backport to 5.14 - Alex]

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1670
Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-06 11:33:50 -04:00
Xiaomeng Hou
ad89c9aa24 drm/amd/pm: update smu v13.0.1 firmware header
Update smu v13.0.1 firmware header for yellow carp.

Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-06 11:24:48 -04:00
Pablo Neira Ayuso
269fc69533 netfilter: nfnetlink_hook: translate inet ingress to netdev
The NFPROTO_INET pseudofamily is not exposed through this new netlink
interface. The netlink dump either shows NFPROTO_IPV4 or NFPROTO_IPV6
for NFPROTO_INET prerouting/input/forward/output/postrouting hooks.
The NFNLA_CHAIN_FAMILY attribute provides the family chain, which
specifies if this hook applies to inet traffic only (either IPv4 or
IPv6).

Translate the inet/ingress hook to netdev/ingress to fully hide the
NFPROTO_INET implementation details.

Fixes: e2cf17d377 ("netfilter: add new hook nfnl subsystem")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-06 17:07:41 +02:00
Florian Westphal
4592ee7f52 netfilter: conntrack: remove offload_pickup sysctl again
These two sysctls were added because the hardcoded defaults (2 minutes,
tcp, 30 seconds, udp) turned out to be too low for some setups.

They appeared in 5.14-rc1 so it should be fine to remove it again.

Marcelo convinced me that there should be no difference between a flow
that was offloaded vs. a flow that was not wrt. timeout handling.
Thus the default is changed to those for TCP established and UDP stream,
5 days and 120 seconds, respectively.

Marcelo also suggested to account for the timeout value used for the
offloading, this avoids increase beyond the value in the conntrack-sysctl
and will also instantly expire the conntrack entry with altered sysctls.

Example:
   nf_conntrack_udp_timeout_stream=60
   nf_flowtable_udp_timeout=60

This will remove offloaded udp flows after one minute, rather than two.

An earlier version of this patch also cleared the ASSURED bit to
allow nf_conntrack to evict the entry via early_drop (i.e., table full).
However, it looks like we can safely assume that connection timed out
via HW is still in established state, so this isn't needed.

Quoting Oz:
 [..] the hardware sends all packets with a set FIN flags to sw.
 [..] Connections that are aged in hardware are expected to be in the
 established state.

In case it turns out that back-to-sw-path transition can occur for
'dodgy' connections too (e.g., one side disappeared while software-path
would have been in RETRANS timeout), we can adjust this later.

Cc: Oz Shlomo <ozsh@nvidia.com>
Cc: Paul Blakey <paulb@nvidia.com>
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-06 17:07:41 +02:00
Pablo Neira Ayuso
69311e7c99 netfilter: nfnetlink_hook: Use same family as request message
Use the same family as the request message, for consistency. The
netlink payload provides sufficient information to describe the hook
object, including the family.

This makes it easier to userspace to correlate the hooks are that
visited by the packets for a certain family.

Fixes: e2cf17d377 ("netfilter: add new hook nfnl subsystem")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-06 17:07:41 +02:00
Pablo Neira Ayuso
3d9bbaf6c5 netfilter: nfnetlink_hook: use the sequence number of the request message
The sequence number allows to correlate the netlink reply message (as
part of the dump) with the original request message.

The cb->seq field is internally used to detect an interference (update)
of the hook list during the netlink dump, do not use it as sequence
number in the netlink dump header.

Fixes: e2cf17d377 ("netfilter: add new hook nfnl subsystem")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-06 17:07:40 +02:00
Pablo Neira Ayuso
a6e57c4af1 netfilter: nfnetlink_hook: missing chain family
The family is relevant for pseudo-families like NFPROTO_INET
otherwise the user needs to rely on the hook function name to
differentiate it from NFPROTO_IPV4 and NFPROTO_IPV6 names.

Add nfnl_hook_chain_desc_attributes instead of using the existing
NFTA_CHAIN_* attributes, since these do not provide a family number.

Fixes: e2cf17d377 ("netfilter: add new hook nfnl subsystem")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-06 17:07:40 +02:00
Pablo Neira Ayuso
61e0c2bc55 netfilter: nfnetlink_hook: strip off module name from hookfn
NFNLA_HOOK_FUNCTION_NAME should include the hook function name only,
the module name is already provided by NFNLA_HOOK_MODULE_NAME.

Fixes: e2cf17d377 ("netfilter: add new hook nfnl subsystem")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-06 17:07:40 +02:00
Florian Westphal
4608fdfc07 netfilter: conntrack: collect all entries in one cycle
Michal Kubecek reports that conntrack gc is responsible for frequent
wakeups (every 125ms) on idle systems.

On busy systems, timed out entries are evicted during lookup.
The gc worker is only needed to remove entries after system becomes idle
after a busy period.

To resolve this, always scan the entire table.
If the scan is taking too long, reschedule so other work_structs can run
and resume from next bucket.

After a completed scan, wait for 2 minutes before the next cycle.
Heuristics for faster re-schedule are removed.

GC_SCAN_INTERVAL could be exposed as a sysctl in the future to allow
tuning this as-needed or even turn the gc worker off.

Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-06 17:07:35 +02:00
Takashi Iwai
56e7a93160 Merge tag 'asoc-fix-v5.14-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.14

Quite a lot of fixes here, the biggest set being for the cs42l42 driver
which is reasonably old but has seen a sudden uptick in activity.
There's also some fixes for correctly referencing PCM buffer addresses
and the removal of some driver-local bodges that had been done for the
lack of prefix handling in DAPM which were broken by the core handling
that as expected.
2021-08-06 17:00:51 +02:00
Hans de Goede
9d7b132e62 platform/x86: pcengines-apuv2: Add missing terminating entries to gpio-lookup tables
The gpiod_lookup_table.table passed to gpiod_add_lookup_table() must
be terminated with an empty entry, add this.

Note we have likely been getting away with this not being present because
the GPIO lookup code first matches on the dev_id, causing most lookups to
skip checking the table and the lookups which do check the table will
find a matching entry before reaching the end. With that said, terminating
these tables properly still is obviously the correct thing to do.

Fixes: f8eb0235f6 ("x86: pcengines apuv2 gpio/leds/keys platform driver")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210806115515.12184-1-hdegoede@redhat.com
2021-08-06 14:04:43 +02:00
Hans de Goede
085fc31f81 platform/x86: Make dual_accel_detect() KIOX010A + KIOX020A detect more robust
360 degree hinges devices with dual KIOX010A + KIOX020A accelerometers
always have both a KIOX010A and a KIOX020A ACPI device (one for each
accel).

Theoretical some vendor may re-use some DSDT for a non-convertible
stripping out just the KIOX020A ACPI device from the DSDT. Check that
both ACPI devices are present to make the check more robust.

Fixes: 153cca9caa ("platform/x86: Add and use a dual_accel_detect() helper")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210802141000.978035-1-hdegoede@redhat.com
2021-08-06 14:04:43 +02:00
John Hubbard
704e624f7b net: mvvp2: fix short frame size on s390
On s390, the following build warning occurs:

drivers/net/ethernet/marvell/mvpp2/mvpp2.h:844:2: warning: overflow in
conversion from 'long unsigned int' to 'int' changes value from
'18446744073709551584' to '-32' [-Woverflow]
844 |  ((total_size) - MVPP2_SKB_HEADROOM - MVPP2_SKB_SHINFO_SIZE)

This happens because MVPP2_SKB_SHINFO_SIZE, which is 320 bytes (which is
already 64-byte aligned) on some architectures, actually gets ALIGN'd up
to 512 bytes in the s390 case.

So then, when this is invoked:

    MVPP2_RX_MAX_PKT_SIZE(MVPP2_BM_SHORT_FRAME_SIZE)

...that turns into:

     704 - 224 - 512 == -32

...which is not a good frame size to end up with! The warning above is a
bit lucky: it notices a signed/unsigned bad behavior here, which leads
to the real problem of a frame that is too short for its contents.

Increase MVPP2_BM_SHORT_FRAME_SIZE by 32 (from 704 to 736), which is
just exactly big enough. (The other values can't readily be changed
without causing a lot of other problems.)

Fixes: 07dd0a7aae ("mvpp2: add basic XDP support")
Cc: Sven Auhagen <sven.auhagen@voleatech.de>
Cc: Matteo Croce <mcroce@microsoft.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-06 12:09:42 +01:00
DENG Qingfang
aff51c5da3 net: dsa: mt7530: add the missing RxUnicast MIB counter
Add the missing RxUnicast counter.

Fixes: b8f126a8d5 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-06 12:08:30 +01:00
Arnd Bergmann
abf3d98dee mt76: fix enum type mismatch
There is no 'NONE' version of 'enum mcu_cipher_type', and returning
'MT_CIPHER_NONE' causes a warning:

drivers/net/wireless/mediatek/mt76/mt7921/mcu.c: In function 'mt7921_mcu_get_cipher':
drivers/net/wireless/mediatek/mt76/mt7921/mcu.c:114:24: error: implicit conversion from 'enum mt76_cipher_type' to 'enum mcu_cipher_type' [-Werror=enum-conversion]
  114 |                 return MT_CIPHER_NONE;
      |                        ^~~~~~~~~~~~~~

Add the missing MCU_CIPHER_NONE defintion that fits in here with
the same value.

Fixes: c368362c36 ("mt76: fix iv and CCMP header insertion")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210721150745.1914829-1-arnd@kernel.org
2021-08-06 10:56:53 +03:00
Bjorn Andersson
9711759a87 clk: qcom: gdsc: Ensure regulator init state matches GDSC state
As GDSCs are registered and found to be already enabled gdsc_init()
ensures that 1) the kernel state matches the hardware state, and 2)
votable GDSCs are properly enabled from this master as well.

But as the (optional) supply regulator is enabled deep into
gdsc_toggle_logic(), which is only executed for votable GDSCs, the
kernel's state of the regulator might not match the hardware. The
regulator might be automatically turned off if no other users are
present or the next call to gdsc_disable() would cause an unbalanced
regulator_disable().

Given that the votable case deals with an already enabled GDSC, most of
gdsc_enable() and gdsc_toggle_logic() can be skipped. Reduce it to just
clearing the SW_COLLAPSE_MASK and enabling hardware control to simply
call regulator_enable() in both cases.

The enablement of hardware control seems to be an independent property
from the GDSC being enabled, so this is moved outside that conditional
segment.

Lastly, as the propagation of ALWAYS_ON to GENPD_FLAG_ALWAYS_ON needs to
happen regardless of the initial state this is grouped together with the
other sc->pd updates at the end of the function.

Cc: stable@vger.kernel.org
Fixes: 37416e5549 ("clk: qcom: gdsc: Handle GDSC regulator supplies")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210721224056.3035016-1-bjorn.andersson@linaro.org
[sboyd@kernel.org: Rephrase commit text]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-08-05 18:19:04 -07:00
Dong Aisheng
283f1b9a04 clk: imx6q: fix uart earlycon unwork
The earlycon depends on the bootloader setup UART clocks being retained.
There're actually two uart clocks (ipg, per) on MX6QDL,
but the 'Fixes' commit change to register only one which means
another clock may be disabled during booting phase
and result in the earlycon unwork.

Cc: stable@vger.kernel.org # v5.10+
Fixes: 379c9a24cc ("clk: imx: Fix reparenting of UARTs not associated with stdout")
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Link: https://lore.kernel.org/r/20210702085438.1988087-1-aisheng.dong@nxp.com
Reviewed-by: Abel Vesa <abel.vesa@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-08-05 18:12:23 -07:00
Richard Fitzgerald
e5ada3f678 ASoC: cs42l42: Fix mono playback
I2S always has two LRCLK phases and both CH1 and CH2 of the RX
must be enabled (corresponding to the low and high phases of LRCLK.)
The selection of the valid data channels is done by setting the DAC
CHA_SEL and CHB_SEL. CHA_SEL is always the first (left) channel,
CHB_SEL depends on the number of active channels.

Previously for mono ASP CH2 was not enabled, the result was playing
mono data would not produce any audio output.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 621d65f3b8 ("ASoC: cs42l42: Provide finer control on playback path")
Link: https://lore.kernel.org/r/20210805161111.10410-4-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-05 20:17:16 +01:00
Richard Fitzgerald
3a5d89a9c6 ASoC: cs42l42: Constrain sample rate to prevent illegal SCLK
The lowest valid SCLK corresponds to 44.1 kHz at 16-bit. Sample
rates less than this would produce SCLK below the minimum when using
a normal I2S frame. A constraint must be applied to prevent this.

The constraint is not applied if the machine driver sets SCLK, to
allow setups where the host generates additional bits per LRCLK
phase to increase the SCLK frequency. In these cases the machine
driver would always have to inform this driver of the actual SCLK,
and it must select a legal SCLK.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20210805161111.10410-3-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-05 20:17:15 +01:00
Richard Fitzgerald
0c2f2ad4f1 ASoC: cs42l42: Fix LRCLK frame start edge
An I2S frame starts on the falling edge of LRCLK so ASP_STP must
be 0.

At the same time, move other format settings in the same register
from cs42l42_pll_config() to cs42l42_set_dai_fmt() where you'd
expect to find them, and merge into a single write.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 2c394ca796 ("ASoC: Add support for CS42L42 codec")
Link: https://lore.kernel.org/r/20210805161111.10410-2-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-05 20:17:14 +01:00
Richard Fitzgerald
f1040e86f8 ASoC: cs42l42: PLL must be running when changing MCLK_SRC_SEL
Both SCLK and PLL clocks must be running to drive the glitch-free mux
behind MCLK_SRC_SEL and complete the switchover.

This patch moves the writing of MCLK_SRC_SEL to when the PLL is started
and stopped, so that it only transitions while the PLL is running.
The unconditional write MCLK_SRC_SEL=0 in cs42l42_mute_stream() is safe
because if the PLL is not running MCLK_SRC_SEL is already 0.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 43fc357199 ("ASoC: cs42l42: Set clock source for both ways of stream")
Link: https://lore.kernel.org/r/20210805161111.10410-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-05 20:17:13 +01:00
Shyam Prasad N
7d3fc01796 cifs: create sd context must be a multiple of 8
We used to follow the rule earlier that the create SD context
always be a multiple of 8. However, with the change:
cifs: refactor create_sd_buf() and and avoid corrupting the buffer
...we recompute the length, and we failed that rule.
Fixing that with this change.

Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-05 12:48:42 -05:00
Caleb Connolly
d77c95bf9a arm64: dts: qcom: sdm845-oneplus: fix reserved-mem
Fix the upper guard and the "removed_region", this fixes the random
crashes which used to occur in memory intensive loads. I'm not sure WHY
the upper guard being 0x2000 instead of 0x1000 doesn't fix this, but it
HAS to be 0x1000.

Fixes: e60fd5ac1f ("arm64: dts: qcom: sdm845-oneplus-common: guard rmtfs-mem")
Signed-off-by: Caleb Connolly <caleb@connolly.tech>
Link: https://lore.kernel.org/r/20210720153125.43389-2-caleb@connolly.tech
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2021-08-05 10:36:04 -05:00
Petr Vorel
0e5ded926f arm64: dts: qcom: msm8994-angler: Disable cont_splash_mem
As the default definition breaks booting angler:
[    1.862561] printk: console [ttyMSM0] enabled
[    1.872260] msm_serial: driver initialized
D -     15524 - pm_driver_init, Delta

cont_splash_mem was introduced in 74d6d0a145, but the problem
manifested after commit '86588296acbf ("fdt: Properly handle "no-map"
field in the memory region")'.

Disabling it because Angler's firmware does not report where the memory
is allocated (dmesg from downstream kernel):
[    0.000000] cma: Found cont_splash_mem@0, memory base 0x0000000000000000, size 16 MiB, limit 0x0000000000000000
[    0.000000] cma: CMA: reserved 16 MiB at 0x0000000000000000 for cont_splash_mem

Similar issue might be on Google Nexus 5X (lg-bullhead). Other MSM8992/4
are known to report correct address.

Fixes: 74d6d0a145 ("arm64: dts: qcom: msm8994/8994-kitakami: Fix up the memory map")
Suggested-by: Konrad Dybcio <konradybcio@gmail.com>
Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Link: https://lore.kernel.org/r/20210622191019.23771-1-petr.vorel@gmail.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2021-08-05 10:35:52 -05:00
Thara Gopinath
5d79e5ce54 cpufreq: blocklist Qualcomm sm8150 in cpufreq-dt-platdev
The Qualcomm sm8150 platform uses the qcom-cpufreq-hw driver, so
add it to the cpufreq-dt-platdev driver's blocklist.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-08-05 09:43:04 +05:30
Jeff Layton
8434ffe71c ceph: take snap_empty_lock atomically with snaprealm refcount change
There is a race in ceph_put_snap_realm. The change to the nref and the
spinlock acquisition are not done atomically, so you could decrement
nref, and before you take the spinlock, the nref is incremented again.
At that point, you end up putting it on the empty list when it
shouldn't be there. Eventually __cleanup_empty_realms runs and frees
it when it's still in-use.

Fix this by protecting the 1->0 transition with atomic_dec_and_lock,
and just drop the spinlock if we can get the rwsem.

Because these objects can also undergo a 0->1 refcount transition, we
must protect that change as well with the spinlock. Increment locklessly
unless the value is at 0, in which case we take the spinlock, increment
and then take it off the empty list if it did the 0->1 transition.

With these changes, I'm removing the dout() messages from these
functions, as well as in __put_snap_realm. They've always been racy, and
it's better to not print values that may be misleading.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/46419
Reported-by: Mark Nelson <mnelson@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-08-04 19:20:29 +02:00
Luis Henriques
bf2ba43221 ceph: reduce contention in ceph_check_delayed_caps()
Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
workqueue and it can be kept looping for quite some time if caps keep
being added back to the mdsc->cap_delay_list.  This may result in the
watchdog tainting the kernel with the softlockup flag.

This patch breaks this loop if the caps have been recently (i.e. during
the loop execution).  Any new caps added to the list will be handled in
the next run.

Also, allow schedule_delayed() callers to explicitly set the delay value
instead of defaulting to 5s, so we can ensure that it runs soon
afterward if it looks like there is more work.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/46284
Signed-off-by: Luis Henriques <lhenriques@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-08-04 19:20:05 +02:00
Andy Shevchenko
2f658f7a39 pinctrl: tigerlake: Fix GPIO mapping for newer version of software
The software mapping for GPIO, which initially comes from Microsoft,
is subject to change by respective Windows and firmware developers.
Due to the above the driver had been written and published way ahead
of the schedule, and thus the numbering schema used in it is outdated.

Fix the numbering schema in accordance with the real products on market.

Fixes: 653d96455e ("pinctrl: tigerlake: Add support for Tiger Lake-H")
Reported-and-tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reported-by: Riccardo Mori <patacca@autistici.org>
Reported-and-tested-by: Lovesh <lovesh.bond@gmail.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213463
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213579
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213857
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2021-08-04 18:47:50 +03:00
Shaik Sajida Bhanu
67b13f3e22 mmc: sdhci-msm: Update the software timeout value for sdhc
Whenever SDHC run at clock rate 50MHZ or below, the hardware data
timeout value will be 21.47secs, which is approx. 22secs and we have
a current software timeout value as 10secs. We have to set software
timeout value more than the hardware data timeout value to avioid seeing
the below register dumps.

[  332.953670] mmc2: Timeout waiting for hardware interrupt.
[  332.959608] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
[  332.966450] mmc2: sdhci: Sys addr:  0x00000000 | Version:  0x00007202
[  332.973256] mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000001
[  332.980054] mmc2: sdhci: Argument:  0x00000000 | Trn mode: 0x00000027
[  332.986864] mmc2: sdhci: Present:   0x01f801f6 | Host ctl: 0x0000001f
[  332.993671] mmc2: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
[  333.000583] mmc2: sdhci: Wake-up:   0x00000000 | Clock:    0x00000007
[  333.007386] mmc2: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
[  333.014182] mmc2: sdhci: Int enab:  0x03ff100b | Sig enab: 0x03ff100b
[  333.020976] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
[  333.027771] mmc2: sdhci: Caps:      0x322dc8b2 | Caps_1:   0x0000808f
[  333.034561] mmc2: sdhci: Cmd:       0x0000183a | Max curr: 0x00000000
[  333.041359] mmc2: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x00000000
[  333.048157] mmc2: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
[  333.054945] mmc2: sdhci: Host ctl2: 0x00000000
[  333.059657] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr:
0x0000000ffffff218
[  333.067178] mmc2: sdhci_msm: ----------- VENDOR REGISTER DUMP
-----------
[  333.074343] mmc2: sdhci_msm: DLL sts: 0x00000000 | DLL cfg:
0x6000642c | DLL cfg2: 0x0020a000
[  333.083417] mmc2: sdhci_msm: DLL cfg3: 0x00000000 | DLL usr ctl:
0x00000000 | DDR cfg: 0x80040873
[  333.092850] mmc2: sdhci_msm: Vndr func: 0x00008a9c | Vndr func2 :
0xf88218a8 Vndr func3: 0x02626040
[  333.102371] mmc2: sdhci: ============================================

So, set software timeout value more than hardware timeout value.

Signed-off-by: Shaik Sajida Bhanu <sbhanu@codeaurora.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1626435974-14462-1-git-send-email-sbhanu@codeaurora.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-04 13:17:21 +02:00
Christophe Kerello
d8e193f13b mmc: mmci: stm32: Check when the voltage switch procedure should be done
If the card has not been power cycled, it may still be using 1.8V
signaling. This situation is detected in mmc_sd_init_card function and
should be handled in mmci stm32 variant.  The host->pwr_reg variable is
also correctly protected with spin locks.

Fixes: 94b94a93e3 ("mmc: mmci_sdmmc: Implement signal voltage callbacks")
Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>
Signed-off-by: Yann Gautier <yann.gautier@foss.st.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210701143353.13188-1-yann.gautier@foss.st.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-04 12:46:02 +02:00
Vincent Whitchurch
25f8203b4b mmc: dw_mmc: Fix hang on data CRC error
When a Data CRC interrupt is received, the driver disables the DMA, then
sends the stop/abort command and then waits for Data Transfer Over.

However, sometimes, when a data CRC error is received in the middle of a
multi-block write transfer, the Data Transfer Over interrupt is never
received, and the driver hangs and never completes the request.

The driver sets the BMOD.SWR bit (SDMMC_IDMAC_SWRESET) when stopping the
DMA, but according to the manual CMD.STOP_ABORT_CMD should be programmed
"before assertion of SWR".  Do these operations in the recommended
order.  With this change the Data Transfer Over is always received
correctly in my tests.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Reviewed-by: Jaehoon Chung <jh80.chung@samsung.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210630102232.16011-1-vincent.whitchurch@axis.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-08-04 12:41:20 +02:00
Yajun Deng
38ea9def5b netfilter: nf_conntrack_bridge: Fix memory leak when error
It should be added kfree_skb_list() when err is not equal to zero
in nf_br_ip_fragment().

v2: keep this aligned with IPv6.
v3: modify iter.frag_list to iter.frag.

Fixes: 3c171f496e ("netfilter: bridge: add connection tracking system")
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-04 10:41:29 +02:00
Jozsef Kadlecsik
5f7b51bf09 netfilter: ipset: Limit the maximal range of consecutive elements to add/delete
The range size of consecutive elements were not limited. Thus one could
define a huge range which may result soft lockup errors due to the long
execution time. Now the range size is limited to 2^20 entries.

Reported-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-04 10:41:03 +02:00
Lukasz Luba
f7d635883f cpufreq: arm_scmi: Fix error path when allocation failed
Stop the initialization when cpumask allocation failed and return an
error.

Fixes: 80a064dbd5 ("scmi-cpufreq: Get opp_shared_cpus from opp-v2 for EM")
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-08-04 09:31:57 +05:30
Michał Mirosław
335ffab3ef opp: remove WARN when no valid OPPs remain
This WARN can be triggered per-core and the stack trace is not useful.
Replace it with plain dev_err(). Fix a comment while at it.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-08-04 09:31:25 +05:30
Vineet Gupta
3a715e8040 ARC: fp: set FPU_STATUS.FWE to enable FPU_STATUS update on context switch
FPU_STATUS register contains FP exception flags bits which are updated
by core as side-effect of FP instructions but can also be manually
wiggled such as by glibc C99 functions fe{raise,clear,test}except() etc.
To effect the update, the programming model requires OR'ing FWE
bit (31). This bit is write-only and RAZ, meaning it is effectively
auto-cleared after write and thus needs to be set everytime: which
is how glibc implements this.

However there's another usecase of FPU_STATUS update, at the time of
Linux task switch when incoming task value needs to be programmed into
the register. This was added as part of f45ba2bd6d ("ARCv2:
fpu: preserve userspace fpu state") which missed OR'ing FWE bit,
meaning the new value is effectively not being written at all.
This patch remedies that.

Interestingly, this snafu was not caught in interm glibc testing as the
race window which relies on a specific exception bit to be set/clear is
really small specially when it nvolves context switch.
Fortunately this was caught by glibc's math/test-fenv-tls test which
repeatedly set/clear exception flags in a big loop, concurrently in main
program and also in a thread.

Fixes: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/54
Fixes: f45ba2bd6d ("ARCv2: fpu: preserve userspace fpu state")
Cc: stable@vger.kernel.org	#5.6+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2021-08-03 18:58:33 -07:00
Guenter Roeck
bf79167fd8 ARC: Fix CONFIG_STACKDEPOT
Enabling CONFIG_STACKDEPOT results in the following build error.

arc-elf-ld: lib/stackdepot.o: in function `filter_irq_stacks':
stackdepot.c:(.text+0x456): undefined reference to `__irqentry_text_start'
arc-elf-ld: stackdepot.c:(.text+0x456): undefined reference to `__irqentry_text_start'
arc-elf-ld: stackdepot.c:(.text+0x476): undefined reference to `__irqentry_text_end'
arc-elf-ld: stackdepot.c:(.text+0x476): undefined reference to `__irqentry_text_end'
arc-elf-ld: stackdepot.c:(.text+0x484): undefined reference to `__softirqentry_text_start'
arc-elf-ld: stackdepot.c:(.text+0x484): undefined reference to `__softirqentry_text_start'
arc-elf-ld: stackdepot.c:(.text+0x48c): undefined reference to `__softirqentry_text_end'
arc-elf-ld: stackdepot.c:(.text+0x48c): undefined reference to `__softirqentry_text_end'

Other architectures address this problem by adding IRQENTRY_TEXT and
SOFTIRQENTRY_TEXT to the text segment, so do the same here.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2021-08-03 18:58:33 -07:00
Colin Ian King
81e82fa580 arc: Fix spelling mistake and grammar in Kconfig
There is a spelling mistake and incorrect grammar in the Kconfig
text. Fix them.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2021-08-03 18:58:33 -07:00
Jinchao Wang
d406739551 arc: Prefer unsigned int to bare use of unsigned
Fix checkpatch warnings:
    WARNING: Prefer 'unsigned int' to bare use of 'unsigned'

Signed-off-by: Jinchao Wang <wjc@cdjrlc.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2021-08-03 18:58:33 -07:00
Richard Fitzgerald
8b353bbeae ASoC: cs42l42: Remove duplicate control for WNF filter frequency
The driver was defining two ALSA controls that both change the same
register field for the wind noise filter corner frequency. The filter
response has two corners, at different frequencies, and the duplicate
controls most likely were an attempt to be able to set the value using
either of the frequencies.

However, having two controls changing the same field can be problematic
and it is unnecessary. Both frequencies are related to each other so
setting one implies exactly what the other would be.

Removing a control affects user-side code, but there is currently no
known use of the removed control so it would be best to remove it now
before it becomes a problem.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 2c394ca796 ("ASoC: Add support for CS42L42 codec")
Link: https://lore.kernel.org/r/20210803160834.9005-2-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-03 18:23:45 +01:00
Richard Fitzgerald
30615bd21b ASoC: cs42l42: Fix inversion of ADC Notch Switch control
The underlying register field has inverted sense (0 = enabled) so
the control definition must be marked as inverted.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 2c394ca796 ("ASoC: Add support for CS42L42 codec")
Link: https://lore.kernel.org/r/20210803160834.9005-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-03 18:23:43 +01:00
Ard Biesheuvel
c32ac11da3 efi/libstub: arm64: Double check image alignment at entry
On arm64, the stub only moves the kernel image around in memory if
needed, which is typically only for KASLR, given that relocatable
kernels (which is the default) can run from any 64k aligned address,
which is also the minimum alignment communicated to EFI via the PE/COFF
header.

Unfortunately, some loaders appear to ignore this header, and load the
kernel at some arbitrary offset in memory. We can deal with this, but
let's check for this condition anyway, so non-compliant code can be
spotted and fixed.

Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2021-08-03 07:43:13 +02:00
Ard Biesheuvel
ff80ef5bf5 efi/libstub: arm64: Warn when efi_random_alloc() fails
Randomization of the physical load address of the kernel image relies on
efi_random_alloc() returning successfully, and currently, we ignore any
failures and just carry on, using the ordinary, non-randomized page
allocator routine. This means we never find out if a failure occurs,
which could harm security, so let's at least warn about this condition.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2021-08-03 07:43:07 +02:00
Ard Biesheuvel
3a26242375 efi/libstub: arm64: Relax 2M alignment again for relocatable kernels
Commit 82046702e2 ("efi/libstub/arm64: Replace 'preferred' offset with
alignment check") simplified the way the stub moves the kernel image
around in memory before booting it, given that a relocatable image does
not need to be copied to a 2M aligned offset if it was loaded on a 64k
boundary by EFI.

Commit d32de9130f ("efi/arm64: libstub: Deal gracefully with
EFI_RNG_PROTOCOL failure") inadvertently defeated this logic by
overriding the value of efi_nokaslr if EFI_RNG_PROTOCOL is not
available, which was mistaken by the loader logic as an explicit request
on the part of the user to disable KASLR and any associated relocation
of an Image not loaded on a 2M boundary.

So let's reinstate this functionality, by capturing the value of
efi_nokaslr at function entry to choose the minimum alignment.

Fixes: d32de9130f ("efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2021-08-03 07:43:02 +02:00
Ard Biesheuvel
5b94046efb efi/libstub: arm64: Force Image reallocation if BSS was not reserved
Distro versions of GRUB replace the usual LoadImage/StartImage calls
used to load the kernel image with some local code that fails to honor
the allocation requirements described in the PE/COFF header, as it
does not account for the image's BSS section at all: it fails to
allocate space for it, and fails to zero initialize it.

Since the EFI stub itself is allocated in the .init segment, which is
in the middle of the image, its BSS section is not impacted by this,
and the main consequence of this omission is that the BSS section may
overlap with memory regions that are already used by the firmware.

So let's warn about this condition, and force image reallocation to
occur in this case, which works around the problem.

Fixes: 82046702e2 ("efi/libstub/arm64: Replace 'preferred' offset with alignment check")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2021-08-03 07:41:53 +02:00
Guennadi Liakhovetski
973b393fdf ASoC: SOF: Intel: hda-ipc: fix reply size checking
Checking that two values don't have common bits makes no sense,
strict equality is meant.

Fixes: f3b433e469  ("ASoC: SOF: Implement Probe IPC API")
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210802151749.15417-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-03 01:44:59 +01:00
Pierre-Louis Bossart
6b994c554e ASoC: SOF: Intel: Kconfig: fix SoundWire dependencies
The previous Kconfig cleanup added simplifications but also introduced
a new one by moving a boolean to a tristate. This leads to randconfig
problems.

This patch moves the select operations in the SOUNDWIRE_LINK_BASELINE
option. The INTEL_SOUNDWIRE config remains a tristate for backwards
compatibility with older configurations but is essentially an on/off
switch.

Fixes: cf5807f5f8 ('ASoC: SOF: Intel: SoundWire: simplify Kconfig')
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Rander Wang <rander.wang@intel.com>
Reviewed-by: Bard Liao <bard.liao@intel.com>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210802151628.15291-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-03 01:44:58 +01:00
Frank Wunderlich
5aa95d8834 iommu: Check if group is NULL before remove device
If probe_device is failing, iommu_group is not initialized because
iommu_group_add_device is not reached, so freeing it will result
in NULL pointer access.

iommu_bus_init
  ->bus_iommu_probe
      ->probe_iommu_group in for each:/* return -22 in fail case */
          ->iommu_probe_device
              ->__iommu_probe_device       /* return -22 here.*/
                  -> ops->probe_device          /* return -22 here.*/
                  -> iommu_group_get_for_dev
                        -> ops->device_group
                        -> iommu_group_add_device //good case
  ->remove_iommu_group  //in fail case, it will remove group
     ->iommu_release_device
         ->iommu_group_remove_device // here we don't have group

In my case ops->probe_device (mtk_iommu_probe_device from
mtk_iommu_v1.c) is due to failing fwspec->ops mismatch.

Fixes: d72e31c937 ("iommu: IOMMU Groups")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Link: https://lore.kernel.org/r/20210731074737.4573-1-linux@fw-web.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-02 16:22:00 +02:00
Takashi Iwai
8b5d95313b ASoC: amd: Fix reference to PCM buffer address
PCM buffers might be allocated dynamically when the buffer
preallocation failed or a larger buffer is requested, and it's not
guaranteed that substream->dma_buffer points to the actually used
buffer.  The driver needs to refer to substream->runtime->dma_addr
instead for the buffer address.

Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20210731084331.32225-1-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-02 12:14:00 +01:00
Colin Ian King
5afc1540f1 iio: adc: Fix incorrect exit of for-loop
Currently the for-loop that scans for the optimial adc_period iterates
through all the possible adc_period levels because the exit logic in
the loop is inverted. I believe the comparison should be swapped and
the continue replaced with a break to exit the loop at the correct
point.

Addresses-Coverity: ("Continue has no effect")
Fixes: e08e19c331 ("iio:adc: add iio driver for Palmas (twl6035/7) gpadc")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210730071651.17394-1-colin.king@canonical.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2021-07-31 14:46:05 +01:00
Tianjia Zhang
567c39047d selftests/sgx: Fix Q1 and Q2 calculation in sigstruct.c
Q1 and Q2 are numbers with *maximum* length of 384 bytes. If the
calculated length of Q1 and Q2 is less than 384 bytes, things will
go wrong.

E.g. if Q2 is 383 bytes, then

1. The bytes of q2 are copied to sigstruct->q2 in calc_q1q2().
2. The entire sigstruct->q2 is reversed, which results it being
   256 * Q2, given that the last byte of sigstruct->q2 is added
   to before the bytes given by calc_q1q2().

Either change in key or measurement can trigger the bug. E.g. an
unmeasured heap could cause a devastating change in Q1 or Q2.

Reverse exactly the bytes of Q1 and Q2 in calc_q1q2() before returning
to the caller.

Fixes: 2adcba79e6 ("selftests/x86: Add a selftest for SGX")
Link: https://lore.kernel.org/linux-sgx/20210301051836.30738-1-tianjia.zhang@linux.alibaba.com/
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-07-30 17:20:01 -06:00
Mark Brown
1d25684e22 ASoC: nau8824: Fix open coded prefix handling
As with the component layer code the nau8824 driver had been doing some
open coded pin manipulation which will have been broken now the core is
fixed to handle this properly, remove the open coding to avoid the issue.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210728234729.10135-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-30 18:34:58 +01:00
Takashi Iwai
bb6a40fc5a ASoC: kirkwood: Fix reference to PCM buffer address
The transition to the managed PCM buffers allowed the dynamically
buffer allocation, while the driver code still assumes the fixed
preallocation buffer and sets up the DMA stuff at the open call.
This needs to be moved to hw_params after the buffer allocation and
setup.  Also, the reference to the buffer address has to be corrected
to runtime->dma_addr.

Fixes: b3c0ae75f5 ("ASoC: kirkwood: Use managed DMA buffer allocation")
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20210728112353.6675-6-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-30 17:20:51 +01:00
Takashi Iwai
827f3164aa ASoC: uniphier: Fix reference to PCM buffer address
Along with the transition to the managed PCM buffers, the driver now
accepts the dynamically allocated buffer, while it still kept the
reference to the old preallocated buffer address.  This patch corrects
to the right reference via runtime->dma_addr.

(Although this might have been already buggy before the cleanup with
the managed buffer, let's put Fixes tag to point that; it's a corner
case, after all.)

Fixes: d55894bc27 ("ASoC: uniphier: Use managed buffer allocation")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20210728112353.6675-5-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-30 17:20:50 +01:00
Takashi Iwai
42bc62c9f1 ASoC: xilinx: Fix reference to PCM buffer address
PCM buffers might be allocated dynamically when the buffer
preallocation failed or a larger buffer is requested, and it's not
guaranteed that substream->dma_buffer points to the actually used
buffer.  The driver needs to refer to substream->runtime->dma_addr
instead for the buffer address.

Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20210728112353.6675-4-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-30 17:20:50 +01:00
Takashi Iwai
2e6b836312 ASoC: intel: atom: Fix reference to PCM buffer address
PCM buffers might be allocated dynamically when the buffer
preallocation failed or a larger buffer is requested, and it's not
guaranteed that substream->dma_buffer points to the actually used
buffer.  The address should be retrieved from runtime->dma_addr,
instead of substream->dma_buffer (and shouldn't use virt_to_phys).

Also, remove the line overriding runtime->dma_area superfluously,
which was already set up at the PCM buffer allocation.

Cc: Cezary Rojewski <cezary.rojewski@intel.com>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20210728112353.6675-3-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-30 17:20:49 +01:00
Richard Fitzgerald
926ef1a4c2 ASoC: cs42l42: Fix bclk calculation for mono
An I2S frame always has a left and right channel slot even if mono
data is being sent. So if channels==1 the actual bitclock frequency
is 2 * snd_soc_params_to_bclk(params).

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 2cdba9b045 ("ASoC: cs42l42: Use bclk from hw_params if set_sysclk was not called")
Link: https://lore.kernel.org/r/20210729170929.6589-3-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-29 18:51:13 +01:00
Richard Fitzgerald
64324bac75 ASoC: cs42l42: Don't allow SND_SOC_DAIFMT_LEFT_J
The driver has no support for left-justified protocol so it should
not have been allowing this to be passed to cs42l42_set_dai_fmt().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 2c394ca796 ("ASoC: Add support for CS42L42 codec")
Link: https://lore.kernel.org/r/20210729170929.6589-2-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-29 18:51:12 +01:00
Richard Fitzgerald
ee86f680ff ASoC: cs42l42: Correct definition of ADC Volume control
The ADC volume is a signed 8-bit number with range -97 to +12,
with -97 being mute. Use a SOC_SINGLE_S8_TLV() to define this
and fix the DECLARE_TLV_DB_SCALE() to have the correct start and
mute flag.

Fixes: 2c394ca796 ("ASoC: Add support for CS42L42 codec")
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20210729170929.6589-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-29 18:51:11 +01:00
Steven Price
c4d7c51845 KVM: arm64: Fix race when enabling KVM_ARM_CAP_MTE
When enabling KVM_CAP_ARM_MTE the ioctl checks that there are no VCPUs
created to ensure that the capability is enabled before the VM is
running. However no locks are held at that point so it is
(theoretically) possible for another thread in the VMM to create VCPUs
between the check and actually setting mte_enabled. Close the race by
taking kvm->lock.

Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Fixes: 673638f434 ("KVM: arm64: Expose KVM_ARM_CAP_MTE")
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729160036.20433-1-steven.price@arm.com
2021-07-29 17:34:01 +01:00
David Brazdil
facee1be76 KVM: arm64: Fix off-by-one in range_is_memory
Hyp checks whether an address range only covers RAM by checking the
start/endpoints against a list of memblock_region structs. However,
the endpoint here is exclusive but internally is treated as inclusive.
Fix the off-by-one error that caused valid address ranges to be
rejected.

Cc: Quentin Perret <qperret@google.com>
Fixes: 90134ac9ca ("KVM: arm64: Protect the .hyp sections from the host")
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210728153232.1018911-2-dbrazdil@google.com
2021-07-29 17:33:04 +01:00
Hans de Goede
153cca9caa platform/x86: Add and use a dual_accel_detect() helper
Various 360 degree hinges (yoga) style 2-in-1 devices use 2 accelerometers
to allow the OS to determine the angle between the display and the base of
the device.

On Windows these are read by a special HingeAngleService process which
calls undocumented ACPI methods, to let the firmware know if the 2-in-1 is
in tablet- or laptop-mode. The firmware may use this to disable the kbd and
touchpad to avoid spurious input in tablet-mode as well as to report
SW_TABLET_MODE info to the OS.

Since Linux does not call these undocumented methods, the SW_TABLET_MODE
info reported by various pdx86 drivers is incorrect on these devices.

Before this commit the intel-hid and thinkpad_acpi code already had 2
hardcoded checks for ACPI hardware-ids of dual-accel sensors to avoid
reporting broken info.

And now we also have a bug-report about the same problem in the intel-vbtn
code. Since there are at least 3 different ACPI hardware-ids in play, add
a new dual_accel_detect() helper which checks for all 3, rather then
adding different hardware-ids to the drivers as bug-reports trickle in.
Having shared code which checks all known hardware-ids is esp. important
for the intel-hid and intel-vbtn drivers as these are generic drivers
which are used on a lot of devices.

The BOSC0200 hardware-id requires special handling, because often it is
used for a single-accelerometer setup. Only in a few cases it refers to
a dual-accel setup, in which case there will be 2 I2cSerialBus resources
in the device's resource-list, so the helper checks for this.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209011
Reported-and-tested-by: Julius Lehmann <julius@devpi.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20210729082134.6683-1-hdegoede@redhat.com
2021-07-29 13:14:07 +02:00
Richard Fitzgerald
830b69f6c0 MAINTAINERS: Add sound devicetree bindings for Wolfson Micro devices
Include all wm* sound bindings in the section for Wolfson Micro
drivers. This section already includes the actual driver source
files.

Also update the existing entry to match all wlf,* sound bindings.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20210727164948.4308-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-28 16:39:16 +01:00
Lucas Tanure
acbf58e530 ASoC: wm_adsp: Let soc_cleanup_component_debugfs remove debugfs
soc_cleanup_component_debugfs will debugfs_remove_recursive
the component->debugfs_root, so adsp doesn't need to also
remove the same entry.
By doing that adsp also creates a race with core component,
which causes a NULL pointer dereference

Signed-off-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20210728104416.636591-1-tanureal@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-28 16:39:15 +01:00
Mark Brown
31428c7874 ASoC: component: Remove misplaced prefix handling in pin control functions
When the component level pin control functions were added they for some
no longer obvious reason handled adding prefixing of widget names. This
meant that when the lack of prefix handling in the DAPM level pin
operations was fixed by ae4fc53224 (ASoC: dapm: use component
prefix when checking widget names) the one device using the component
level API ended up with the prefix being applied twice, causing all
lookups to fail.

Fix this by removing the redundant prefixing from the component code,
which has the nice side effect of also making that code much simpler.

Reported-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Tested-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20210726194123.54585-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-28 16:39:14 +01:00
Yaara Baruch
891332f697 iwlwifi: add new so-jf devices
Add new so-jf devices to the driver.

Signed-off-by: Yaara Baruch <yaara.baruch@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210719144523.1c9a59fd2760.If5aef1942007828210f0f2c4a17985f63050bb45@changeid
2021-07-28 18:01:38 +03:00
Yaara Baruch
a5bf1d4434 iwlwifi: add new SoF with JF devices
Add new SoF JF devices to the driver.

Signed-off-by: Yaara Baruch <yaara.baruch@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210719144523.0545d8964ff2.I3498879d8c184e42b1578a64aa7b7c99a18b75fb@changeid
2021-07-28 18:01:37 +03:00
Johannes Berg
0f673c16c8 iwlwifi: pnvm: accept multiple HW-type TLVs
Some products (So) may have two different types of products
with different mac-type that are otherwise equivalent, and
have the same PNVM data, so the PNVM file will contain two
(or perhaps later more) HW-type TLVs. Accept the file and
use the data section that contains any matching entry.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210719140154.a6a86e903035.Ic0b1b75c45d386698859f251518e8a5144431938@changeid
2021-07-28 18:00:59 +03:00
Tejun Heo
c3df5fb57f cgroup: rstat: fix A-A deadlock on 32bit around u64_stats_sync
0fa294fb19 ("cgroup: Replace cgroup_rstat_mutex with a spinlock") added
cgroup_rstat_flush_irqsafe() allowing flushing to happen from the irq
context. However, rstat paths use u64_stats_sync to synchronize access to
64bit stat counters on 32bit machines. u64_stats_sync is implemented using
seq_lock and trying to read from an irq context can lead to A-A deadlock if
the irq happens to interrupt the stat update.

Fix it by using the irqsafe variants - u64_stats_update_begin_irqsave() and
u64_stats_update_end_irqrestore() - in the update paths. Note that none of
this matters on 64bit machines. All these are just for 32bit SMP setups.

Note that the interface was introduced way back, its first and currently
only use was recently added by 2d146aa3aa ("mm: memcontrol: switch to
rstat"). Stable tagging targets this commit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Rik van Riel <riel@surriel.com>
Fixes: 2d146aa3aa ("mm: memcontrol: switch to rstat")
Cc: stable@vger.kernel.org # v5.13+
2021-07-27 13:12:20 -10:00
Pierre-Louis Bossart
61bef9e68d ASoC: SOF: Intel: hda: enforce exclusion between HDaudio and SoundWire
On some platforms with an external HDaudio codec, the DSDT reports the
presence of SoundWire devices. Pin-mux restrictions and board reworks
usually prevent coexistence between the two types of links, let's
prevent unnecessary operations from starting.

In the case of a single iDISP codec being detected, we still start the
links even if no SoundWire machine configuration was detected, so that
we can double-check what the hardware is and add the missing
configuration if applicable.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Bard Liao <bard.liao@intel.com>
Link: https://lore.kernel.org/r/20210726182855.179943-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-27 13:13:05 +01:00
Peter Ujfalusi
2635c22603 ASoC: topology: Select SND_DYNAMIC_MINORS
The indexes of the devices are described within the topology file, it is a
possibility that the topology encodes invalid indexes when DYNAMIC_MINORS
is not enabled in kernel:

 #define SNDRV_MINOR_COMPRESS		2	/* 2 - 3 */
 #define SNDRV_MINOR_HWDEP		4	/* 4 - 7 */
 #define SNDRV_MINOR_RAWMIDI		8	/* 8 - 15 */
 #define SNDRV_MINOR_PCM_PLAYBACK	16	/* 16 - 23 */
 #define SNDRV_MINOR_PCM_CAPTURE	24	/* 24 - 31 */

If the topology assigns an index greater than 7 for PLAYBACK/CAPTURE PCM
then there will be minor number collision.

As an example:
card0 creates a capture PCM with index 10 -> minor = 34
card1 creates compress device with index 0 -> minor = 34

Card1 will fail to instantiate because the minor for the compress stream is
already taken.

To avoid seemingly mysterious issues with card creation, select the
DYNAMIC_MINORS when the topology is enabled.

The other option would be to try to do out of bound index checks in case of
DYNAMIC_MINOR is not enabled and do not even attempt to create the device
with failing the topology load.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210726182142.179604-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-27 13:13:04 +01:00
Brent Lu
0f32d9eb38 ASoC: Intel: sof_da7219_mx98360a: fail to initialize soundcard
The default codec for speaker amp's DAI Link is max98373 and will be
overwritten in probe function if the board id is sof_da7219_mx98360a.
However, the probe function does not do it because the board id is
changed in earlier commit.

Fixes: 1cc04d195d ("ASoC: Intel: sof_da7219_max98373: shrink platform_id below 20 characters")
Signed-off-by: Brent Lu <brent.lu@intel.com>
Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210726094525.5748-1-brent.lu@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-26 18:57:12 +01:00
Ezequiel Garcia
0fbea68054 iommu/dma: Fix leak in non-contiguous API
Currently, iommu_dma_alloc_noncontiguous() allocates a
struct dma_sgt_handle object to hold some state needed for
iommu_dma_free_noncontiguous().

However, the handle is neither freed nor returned explicitly by
the ->alloc_noncontiguous method, and therefore seems leaked.
This was found by code inspection, so please review carefully and test.

As a side note, it appears the struct dma_sgt_handle type is exposed
to users of the DMA-API by linux/dma-map-ops.h, but is has no users
or functions returning the type explicitly.

This may indicate it's a good idea to move the struct dma_sgt_handle type
to drivers/iommu/dma-iommu.c. The decision is left to maintainers :-)

Cc: stable@vger.kernel.org
Fixes: e817ee5f2f ("dma-iommu: implement ->alloc_noncontiguous")
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210723010552.50969-1-ezequiel@collabora.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26 14:27:08 +02:00
Mark Brown
2c39ca6885 ASoC: tlv320aic31xx: Fix jack detection after suspend
The tlv320aic31xx driver relies on regcache_sync() to restore the register
contents after going to _BIAS_OFF, for example during system suspend. This
does not work for the jack detection configuration since that is configured
via the same register that status is read back from so the register is
volatile and not cached. This can also cause issues during init if the jack
detection ends up getting set up before the CODEC is initially brought out
of _BIAS_OFF, we will reset the CODEC and resync the cache as part of that
process.

Fix this by explicitly reapplying the jack detection configuration after
resyncing the register cache during power on.

This issue was found by an engineer working off-list on a product
kernel, I just wrote up the upstream fix.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210723180200.25105-1-broonie@kernel.org
Cc: stable@vger.kernel.org
2021-07-26 12:42:19 +01:00
Bjorn Andersson
d66cd5dea5 cpufreq: blacklist Qualcomm sc8180x in cpufreq-dt-platdev
The Qualcomm SC8180x platform uses the qcom-cpufreq-hw driver, so
it in the cpufreq-dt-platdev driver's blocklist.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-07-26 09:53:35 +05:30
Chris Lesiak
84edec86f4 iio: humidity: hdc100x: Add margin to the conversion time
The datasheets have the following note for the conversion time
specification: "This parameter is specified by design and/or
characterization and it is not tested in production."

Parts have been seen that require more time to do 14-bit conversions for
the relative humidity channel.  The result is ENXIO due to the address
phase of a transfer not getting an ACK.

Delay an additional 1 ms per conversion to allow for additional margin.

Fixes: 4839367d99 ("iio: humidity: add HDC100x support")
Signed-off-by: Chris Lesiak <chris.lesiak@licor.com>
Acked-by: Matt Ranostay <matt.ranostay@konsulko.com>
Link: https://lore.kernel.org/r/20210614141820.2034827-1-chris.lesiak@licor.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2021-07-24 18:13:02 +01:00
Maxime Ripard
14a30238ec dt-bindings: iio: st: Remove wrong items length check
The original bindings was listing the length of the interrupts as either
1 or 2, depending on the setup. This is also what is enforced by the top
level schema.

However, that is further constrained with an if clause that require
exactly two interrupts, even though it might not make sense on those
devices or in some setups.

Let's remove the clause entirely.

Cc: Denis Ciocca <denis.ciocca@st.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Fixes: 0cd7114580 ("iio: st-sensors: Update ST Sensor bindings")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20210721140424.725744-16-maxime@cerno.tech
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2021-07-24 16:19:00 +01:00
Arnd Bergmann
9f9decdb64 iio: accel: fxls8962af: fix i2c dependency
With CONFIG_SPI=y and CONFIG_I2C=m, building fxls8962af into vmlinux
causes a link error against the I2C module:

aarch64-linux-ld: drivers/iio/accel/fxls8962af-core.o: in function `fxls8962af_fifo_flush':
fxls8962af-core.c:(.text+0x3a0): undefined reference to `i2c_verify_client'

Work around it by adding a Kconfig dependency that forces the SPI driver
to be a loadable module whenever I2C is a module.

Fixes: af959b7b96 ("iio: accel: fxls8962af: fix errata bug E3 - I2C burst reads")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210721151330.2176653-1-arnd@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2021-07-24 16:13:11 +01:00
Rahul Tanwar
e2f55370b4 MAINTAINERS: Add Rahul Tanwar as Intel LGM Gateway PCIe maintainer
Add Rahul Tanwar as maintainer for PCIe RC controller driver for the Intel
Lightning Mountain (LGM) Gateway SoC.

Link: https://lore.kernel.org/r/b3249e08155e04ac08d820be3b8da29a913c472a.1625559158.git.rtanwar@maxlinear.com
Signed-off-by: Rahul Tanwar <rtanwar@maxlinear.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-07-23 16:05:46 -05:00
Hsin-Yi Wang
798a315fc3 pinctrl: mediatek: Fix fallback behavior for bias_set_combo
Some pin doesn't support PUPD register, if it fails and fallbacks with
bias_set_combo case, it will call mtk_pinconf_bias_set_pupd_r1_r0() to
modify the PUPD pin again.

Since the general bias set are either PU/PD or PULLSEL/PULLEN, try
bias_set or bias_set_rev1 for the other fallback case. If the pin
doesn't support neither PU/PD nor PULLSEL/PULLEN, it will return
-ENOTSUPP.

Fixes: 81bd1579b4 ("pinctrl: mediatek: Fix fallback call path")
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Zhiyong Tao <zhiyong.tao@mediatek.com>
Link: https://lore.kernel.org/r/20210701080955.2660294-1-hsinyi@chromium.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-07-23 17:41:56 +02:00
Arnd Bergmann
32ec396017 pinctrl: qcom: fix GPIOLIB dependencies
Enabling the PINCTRL_SM8350 symbol without GPIOLIB or SCM causes a build
failure:

WARNING: unmet direct dependencies detected for PINCTRL_MSM
  Depends on [m]: PINCTRL [=y] && (ARCH_QCOM [=y] || COMPILE_TEST [=y]) && GPIOLIB [=y] && (QCOM_SCM [=m] || !QCOM_SCM [=m])
  Selected by [y]:
  - PINCTRL_SM8350 [=y] && PINCTRL [=y] && (ARCH_QCOM [=y] || COMPILE_TEST [=y]) && GPIOLIB [=y] && OF [=y]
aarch64-linux-ld: drivers/pinctrl/qcom/pinctrl-msm.o: in function `msm_gpio_irq_set_type':
pinctrl-msm.c:(.text.msm_gpio_irq_set_type+0x1c8): undefined reference to `qcom_scm_io_readl'

The main problem here is the 'select PINCTRL_MSM', which needs to be a
'depends on' as it is for all the other front-ends. As the GPIOLIB
dependency is now implied by that, symbol, remove the duplicate
dependencies in the process.

Fixes: d5d348a327 ("pinctrl: qcom: Add SM8350 pinctrl driver")
Fixes: 376f9e34c1 ("drivers: pinctrl: qcom: fix Kconfig dependency on GPIOLIB")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210723091400.1669716-1-arnd@kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-07-23 11:51:19 +02:00
Vijendar Mukunda
5434d0dc56 ASoC: amd: enable stop_dma_first flag for cz_dai_7219_98357 dai link
DMA driver stop sequence should be invoked first before invoking I2S
controller driver stop sequence for Stoneyridge platform.

Enable stop_dma_first flag for cz_dai_7219_98357 dai link structure.

Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Link: https://lore.kernel.org/r/20210722130328.23796-1-Vijendar.Mukunda@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-22 16:02:20 +01:00
jason-jh.lin
1a64a7aff8 drm/mediatek: Fix cursor plane no update
The cursor plane should use the current plane state in atomic_async_update
because it would not be the new plane state in the global atomic state
since _swap_state happened when those hook are run.

Fix cursor plane issue by below modification:
1. Remove plane_helper_funcs->atomic_update(plane, state) in
   mtk_drm_crtc_async_update.
2. Add mtk_drm_update_new_state in to mtk_plane_atomic_async_update to
   update the cursor plane by current plane state hook and update
   others plane by the new_state.

Fixes: 37418bf14c ("drm: Use state helper instead of the plane state pointer")
Signed-off-by: jason-jh.lin <jason-jh.lin@mediatek.com>
Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2021-07-22 22:57:52 +08:00
Hsin-Yi Wang
6b57ba3243 drm/mediatek: mtk-dpi: Set out_fmt from config if not the last bridge
atomic_get_output_bus_fmts() is only called when the bridge is the last
element in the bridge chain.

If mtk-dpi is not the last bridge, the format of output_bus_cfg is
MEDIA_BUS_FMT_FIXED, and mtk_dpi_dual_edge() will fail to write correct
value to regs.

Fixes: ec8747c524 ("drm/mediatek: dpi: Add bus format negotiation")
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2021-07-22 22:44:42 +08:00
Marek Vasut
090c57da5f ASoC: tlv320aic32x4: Fix TAS2505/TAS2521 processing block selection
The TAS2505/TAS2521 does support only three processing block options, unlike
TLV320AIC32x4 which supports 25. This is documented in TI slau472 2.5.1.2
Processing Blocks and Page 0 / Register 60: DAC Instruction Set - 0x00 / 0x3C.

Limit the Processing Blocks maximum value to 3 on TAS2505/TAS2521 and select
processing block PRB_P1 always, because for the configuration of teh codec
implemented in this driver, this is the best quality option.

Fixes: b4525b6196 ("ASoC: tlv320aic32x4: add support for TAS2505")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Claudius Heine <ch@denx.de>
Cc: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210720200348.182139-1-marex@denx.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-22 12:40:03 +01:00
Mario Limonciello
d00f541a49 ASoC: amd: renoir: Run hibernation callbacks
The registers need to be re-initialized after hibernation or
microphone may be non-functional.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213793
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20210721183603.747-2-mario.limonciello@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-22 12:40:01 +01:00
Derek Fang
6d20bf7c02 ASoC: rt5682: Adjust headset volume button threshold
Adjust the threshold of headset button volume+ to fix
the wrong button detection issue with some brand headsets.

Signed-off-by: Derek Fang <derek.fang@realtek.com>
Link: https://lore.kernel.org/r/20210721133121.12333-1-derek.fang@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-22 12:40:00 +01:00
Arnd Bergmann
b9a4b57f42 ASoC: codecs: wcd938x: fix wcd module dependency
With SND_SOC_ALL_CODECS=y and SND_SOC_WCD938X_SDW=m, there is a link
error from a reverse dependency, since the built-in codec driver calls
into the modular soundwire back-end:

x86_64-linux-ld: sound/soc/codecs/wcd938x.o: in function `wcd938x_codec_free':
wcd938x.c:(.text+0x2c0): undefined reference to `wcd938x_sdw_free'
x86_64-linux-ld: sound/soc/codecs/wcd938x.o: in function `wcd938x_codec_hw_params':
wcd938x.c:(.text+0x2f6): undefined reference to `wcd938x_sdw_hw_params'
x86_64-linux-ld: sound/soc/codecs/wcd938x.o: in function `wcd938x_codec_set_sdw_stream':
wcd938x.c:(.text+0x332): undefined reference to `wcd938x_sdw_set_sdw_stream'
x86_64-linux-ld: sound/soc/codecs/wcd938x.o: in function `wcd938x_tx_swr_ctrl':
wcd938x.c:(.text+0x23de): undefined reference to `wcd938x_swr_get_current_bank'
x86_64-linux-ld: sound/soc/codecs/wcd938x.o: in function `wcd938x_bind':
wcd938x.c:(.text+0x2579): undefined reference to `wcd938x_sdw_device_get'
x86_64-linux-ld: wcd938x.c:(.text+0x25a1): undefined reference to `wcd938x_sdw_device_get'
x86_64-linux-ld: wcd938x.c:(.text+0x262a): undefined reference to `__devm_regmap_init_sdw'

Work around this using two small hacks: An added Kconfig dependency
prevents the main driver from being built-in when soundwire support
itself is a loadable module to allow calling devm_regmap_init_sdw(),
and a Makefile trick links the wcd938x-sdw backend as built-in
if needed to solve the dependency between the two modules.

Fixes: 0454422288 ("ASoC: codecs: wcd938x: add audio routing and Kconfig")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210721150510.1837221-1-arnd@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-22 12:39:59 +01:00
Frank Wunderlich
e062233c0e drm/mediatek: dpi: Fix NULL dereference in mtk_dpi_bridge_atomic_check
bridge->driver_private is not set (NULL) so use bridge_to_dpi(bridge)
like it's done in bridge_atomic_get_output_bus_fmts

Fixes: ec8747c524 ("drm/mediatek: dpi: Add bus format negotiation")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2021-07-22 08:28:34 +08:00
Colin Ian King
83f877a095 xen/events: remove redundant initialization of variable irq
The variable irq is being initialized with a value that is never
read, it is being updated later on. The assignment is redundant and
can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210721114010.108648-1-colin.king@canonical.com
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-07-21 11:40:45 -05:00
Geert Uytterhoeven
1435f82689 reset: RESET_MCHP_SPARX5 should depend on ARCH_SPARX5
The Microchip Sparx5 switch reset block is only present on Microchip
Sparx5 SoCs.  Hence add a dependency on ARCH_SPARX5, to prevent asking
the user about this driver when configuring a kernel without Sparx5
support.

Fixes: 453ed4283b ("reset: mchp: sparx5: add switch reset driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/6e08f6f46123d0712397e901716b48f13fa5dc48.1624627657.git.geert@linux-m68k.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2021-07-21 12:19:03 +02:00
Sibi Sankar
4cbb02fa76 arm64: dts: qcom: sc7280: Fixup cpufreq domain info for cpu7
The SC7280 SoC supports a 4-Silver/3-Gold/1-Gold+ configuration and hence
the cpu7 node should point to cpufreq domain 2 instead.

Fixes: 7dbd121a2c ("arm64: dts: qcom: sc7280: Add cpufreq hw node")
Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
Link: https://lore.kernel.org/r/1626800953-613-1-git-send-email-sibis@codeaurora.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2021-07-20 12:21:41 -05:00
Benjamin Herrenschmidt
4152433c39 arm64: efi: kaslr: Fix occasional random alloc (and boot) failure
The EFI stub random allocator used for kaslr on arm64 has a subtle
bug. In function get_entry_num_slots() which counts the number of
possible allocation "slots" for the image in a given chunk of free
EFI memory, "last_slot" can become negative if the chunk is smaller
than the requested allocation size.

The test "if (first_slot > last_slot)" doesn't catch it because
both first_slot and last_slot are unsigned.

I chose not to make them signed to avoid problems if this is ever
used on architectures where there are meaningful addresses with the
top bit set. Instead, fix it with an additional test against the
allocation size.

This can cause a boot failure in addition to a loss of randomisation
due to another bug in the arm64 stub fixed separately.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fixes: 2ddbfc81ea ("efi: stub: add implementation of efi_random_alloc()")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2021-07-20 16:49:48 +02:00
Petr Vorel
3cb6a271f4 arm64: dts: qcom: msm8992-bullhead: Fix cont_splash_mem mapping
cont_splash_mem has different memory mapping than generic from msm8994.dtsi:

[    0.000000] cma: Found cont_splash_mem@0, memory base 0x0000000003400000, size 12 MiB, limit 0xffffffffffffffff
[    0.000000] cma: CMA: reserved 12 MiB at 0x0000000003400000 for cont_splash_mem

This fixes boot.

Fixes: 976d321f32 ("arm64: dts: qcom: msm8992: Make the DT an overlay on top of 8994")
Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Link: https://lore.kernel.org/r/20210713185734.380-3-petr.vorel@gmail.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2021-07-19 15:33:29 -05:00
Petr Vorel
9d1fc2e4f5 arm64: dts: qcom: msm8992-bullhead: Remove PSCI
Bullhead firmware obviously doesn't support PSCI as it fails to boot
with this definition.

Fixes: 329e16d5f8 ("arm64: dts: qcom: msm8992: Add PSCI support.")
Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Link: https://lore.kernel.org/r/20210713185734.380-2-petr.vorel@gmail.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2021-07-19 15:33:27 -05:00
Srinivas Kandagatla
9a253bb42f arm64: dts: qcom: c630: fix correct powerdown pin for WSA881x
WSA881x powerdown pin is connected to GPIO1, GPIO2 not GPIO2 and GPIO3,
so correct this. This was working so far due to a shift bug in gpio driver,
however once that is fixed this will stop working, so fix this!

For some reason we forgot to add this dts change in last merge cycle so
currently audio is broken in 5.13 as the gpio driver fix already landed
in 5.13.

Reported-by: Shawn Guo <shawnguo@kernel.org>
Fixes: 45021d35fc ("arm64: dts: qcom: c630: Enable audio support")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Shawn Guo <shawnguo@kernel.org>
Link: https://lore.kernel.org/r/20210706083523.10601-1-srinivas.kandagatla@linaro.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2021-07-19 11:43:44 -05:00
Antti Keränen
7e77ef8b8d iio: adis: set GPIO reset pin direction
Set reset pin direction to output as the reset pin needs to be an active
low output pin.

Co-developed-by: Hannu Hartikainen <hannu@hrtk.in>
Signed-off-by: Hannu Hartikainen <hannu@hrtk.in>
Signed-off-by: Antti Keränen <detegr@rbx.email>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Fixes: ecb010d441 ("iio: imu: adis: Refactor adis_initial_startup")
Link: https://lore.kernel.org/r/20210708095425.13295-1-detegr@rbx.email
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2021-07-17 18:41:04 +01:00
Uwe Kleine-König
9898cb24e4 iio: adc: ti-ads7950: Ensure CS is deasserted after reading channels
The ADS7950 requires that CS is deasserted after each SPI word. Before
commit e2540da86e ("iio: adc: ti-ads7950: use SPI_CS_WORD to reduce
CPU usage") the driver used a message with one spi transfer per channel
where each but the last one had .cs_change set to enforce a CS toggle.
This was wrongly translated into a message with a single transfer and
.cs_change set which results in a CS toggle after each word but the
last which corrupts the first adc conversion of all readouts after the
first readout.

Fixes: e2540da86e ("iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: David Lechner <david@lechnology.com>
Tested-by: David Lechner <david@lechnology.com>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210709101110.1814294-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2021-07-17 18:36:53 +01:00
Stephan Mueller
5261cdf457 crypto: drbg - select SHA512
With the swtich to use HMAC(SHA-512) as the default DRBG type, the
configuration must now also select SHA-512.

Fixes: 9b7b94683a "crypto: DRBG - switch to HMAC SHA512 DRBG as default
DRBG"
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Stephan Mueller <smueller@chronox.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-07-16 15:49:31 +08:00
Andreas Persson
2394e62873 mtd: cfi_cmdset_0002: fix crash when erasing/writing AMD cards
Erasing an AMD linear flash card (AM29F016D) crashes after the first
sector has been erased. Likewise, writing to it crashes after two bytes
have been written. The reason is a missing check for a null pointer -
the cmdset_priv field is not set for this type of card.

Fixes: 4844ef8030 ("mtd: cfi_cmdset_0002: Add support for polling status register")
Signed-off-by: Andreas Persson <andreasp56@outlook.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/DB6P189MB05830B3530B8087476C5CFE4C1159@DB6P189MB0583.EURP189.PROD.OUTLOOK.COM
2021-07-16 00:49:22 +02:00
Michael Walle
45bb1faa29 mtd: core: handle flashes without OTP gracefully
There are flash drivers which registers the OTP callbacks although the
flash doesn't support OTP regions and return -ENODATA for these
callbacks if there is no OTP. If this happens, the probe of the whole
flash will fail. Fix it by handling the ENODATA return code and skip
the OTP region nvmem setup.

Fixes: 4b361cfa86 ("mtd: core: add OTP nvmem provider support")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210707135359.32398-1-michael@walle.cc
2021-07-16 00:49:20 +02:00
Dan Carpenter
e83862ee1b mtd: mchp48l640: silence some uninitialized variable warnings
Smatch complains that zero length read/writes will lead to an
uninitalized return value.  I don't know if that's possible, but
it's nicer to return a zero literal anyway so let's do that.

Fixes: 88d1250267 ("mtd: devices: add support for microchip 48l640 EERAM")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Heiko Schocher <hs@denx.de>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/YMyir961W28TX5dT@mwanda
2021-07-16 00:49:19 +02:00
Desmond Cheong Zhi Xi
962bf783ef mtd: break circular locks in register_mtd_blktrans
Syzbot reported a circular locking dependency:
https://syzkaller.appspot.com/bug?id=7bd106c28e846d1023d4ca915718b1a0905444cb

This happens because of the following lock dependencies:

1. loop_ctl_mutex -> bdev->bd_mutex (when loop_control_ioctl calls
loop_remove, which then calls del_gendisk; this also happens in
loop_exit which eventually calls loop_remove)

2. bdev->bd_mutex -> mtd_table_mutex (when blkdev_get_by_dev calls
__blkdev_get, which then calls blktrans_open)

3. mtd_table_mutex -> major_names_lock (when register_mtd_blktrans
calls __register_blkdev)

4. major_names_lock -> loop_ctl_mutex (when blk_request_module calls
loop_probe)

Hence there's an overall dependency of:

loop_ctl_mutex   ----------> bdev->bd_mutex
      ^                            |
      |                            |
      |                            v
major_names_lock <---------  mtd_table_mutex

We can break this circular dependency by holding mtd_table_mutex only
for the required critical section in register_mtd_blktrans. This
avoids the mtd_table_mutex -> major_names_lock dependency.

Reported-and-tested-by: syzbot+6a8a0d93c91e8fbf2e80@syzkaller.appspotmail.com
Co-developed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210617160904.570111-1-desmondcheongzx@gmail.com
2021-07-16 00:49:17 +02:00
Dan Carpenter
14f97f0b8e mtd: rawnand: Add a check in of_get_nand_secure_regions()
Check for whether of_property_count_elems_of_size() returns a negative
error code.

Fixes: 13b8976827 ("mtd: rawnand: Add support for secure regions in NAND memory")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/YMtQFXE0F1w7mUh+@mwanda
2021-07-16 00:49:15 +02:00
Zhihao Cheng
2b6d2833cd mtd: mtd_blkdevs: Initialize rq.limits.discard_granularity
Since commit b35fd7422c2f8("block: check queue's limits.discard_granularity
in __blkdev_issue_discard()") checks rq.limits.discard_granularity in
__blkdev_issue_discard(), we may get following warnings on formatted ftl:

  WARNING: CPU: 2 PID: 7313 at block/blk-lib.c:51
  __blkdev_issue_discard+0x2a7/0x390

Reproducer:
  1. ftl_format /dev/mtd0
  2. modprobe ftl
  3. mkfs.vfat /dev/ftla
  4. mount -odiscard /dev/ftla temp
  5. dd if=/dev/zero of=temp/tst bs=1M count=10 oflag=direct
  6. dd if=/dev/zero of=temp/tst bs=1M count=10 oflag=direct

Fix it by initializing rq.limits.discard_granularity if device supports
discard operation.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210615093905.3473709-1-chengzhihao1@huawei.com
2021-07-16 00:49:13 +02:00
Sean Nyekjaer
4377d9ab1f iio: accel: fxls8962af: fix potential use of uninitialized symbol
Fix this warning from kernel test robot:
smatch warnings:
drivers/iio/accel/fxls8962af-core.c:640
fxls8962af_i2c_raw_read_errata3() error: uninitialized symbol 'ret'.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Fixes: af959b7b96 ("iio: accel: fxls8962af: fix errata bug E3 - I2C burst reads")
Link: https://lore.kernel.org/r/20210709071727.2453536-1-sean@geanix.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2021-07-13 18:47:22 +01:00
Dongliang Mu
889d0e7dc6 ieee802154: hwsim: fix GPF in hwsim_new_edge_nl
Both MAC802154_HWSIM_ATTR_RADIO_ID and MAC802154_HWSIM_ATTR_RADIO_EDGE
must be present to fix GPF.

Fixes: f25da51fdc ("ieee802154: hwsim: add replacement for fakelb")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210707155633.1486603-1-mudongliangabcd@gmail.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-07-08 09:37:03 +02:00
Ira Weiny
b05d4c576b dax: Ensure errno is returned from dax_direct_access
If the caller specifies a negative nr_pages that is an invalid
parameter.

Return -EINVAL to ensure callers get an errno if they want to check it.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20210525172428.3634316-4-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-07-07 22:10:04 -07:00
Ira Weiny
44788591c3 fs/dax: Clarify nr_pages to dax_direct_access()
dax_direct_access() takes a number of pages.  PHYS_PFN(PAGE_SIZE) is a
very round about way to specify '1'.

Change the nr_pages parameter to the explicit value of '1'.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20210525172428.3634316-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-07-07 22:10:03 -07:00
Ira Weiny
2e29be2e49 fs/fuse: Remove unneeded kaddr parameter
fuse_dax_mem_range_init() does not need the address or the pfn of the
memory requested in dax_direct_access().  It is only calling direct
access to get the number of pages.

Remove the unused variables and stop requesting the kaddr and pfn from
dax_direct_access().

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20210525172428.3634316-2-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-07-07 22:10:03 -07:00
Dongliang Mu
e9faf53c5a ieee802154: hwsim: fix GPF in hwsim_set_edge_lqi
Both MAC802154_HWSIM_ATTR_RADIO_ID and MAC802154_HWSIM_ATTR_RADIO_EDGE,
MAC802154_HWSIM_EDGE_ATTR_ENDPOINT_ID and MAC802154_HWSIM_EDGE_ATTR_LQI
must be present to fix GPF.

Fixes: f25da51fdc ("ieee802154: hwsim: add replacement for fakelb")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210705131321.217111-1-mudongliangabcd@gmail.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-07-07 16:42:59 +02:00
Rodrigo Campos
19d6769474 Documentation: seccomp: Fix typo in user notification
The close on exec flag is O_CLOEXEC, not O_EXEC. This patch just fixes
the typo.

Suggested-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Fixes: 0ae71c7720 ("seccomp: Support atomic "addfd + send reply"")
Link: https://lore.kernel.org/r/20210702151927.263402-1-rodrigo@kinvolk.io
2021-07-02 10:39:45 -07:00
601 changed files with 4604 additions and 3325 deletions

View File

@@ -108,7 +108,7 @@ This bump in ABI version is at most once per kernel development cycle.
For example, if current state of ``libbpf.map`` is:
.. code-block:: c
.. code-block:: none
LIBBPF_0.0.1 {
global:
@@ -121,7 +121,7 @@ For example, if current state of ``libbpf.map`` is:
, and a new symbol ``bpf_func_c`` is being introduced, then
``libbpf.map`` should be changed like this:
.. code-block:: c
.. code-block:: none
LIBBPF_0.0.1 {
global:

View File

@@ -152,47 +152,6 @@ allOf:
maxItems: 1
st,drdy-int-pin: false
- if:
properties:
compatible:
enum:
# Two intertial interrupts i.e. accelerometer/gyro interrupts
- st,h3lis331dl-accel
- st,l3g4200d-gyro
- st,l3g4is-gyro
- st,l3gd20-gyro
- st,l3gd20h-gyro
- st,lis2de12
- st,lis2dw12
- st,lis2hh12
- st,lis2dh12-accel
- st,lis331dl-accel
- st,lis331dlh-accel
- st,lis3de
- st,lis3dh-accel
- st,lis3dhh
- st,lis3mdl-magn
- st,lng2dm-accel
- st,lps331ap-press
- st,lsm303agr-accel
- st,lsm303dlh-accel
- st,lsm303dlhc-accel
- st,lsm303dlm-accel
- st,lsm330-accel
- st,lsm330-gyro
- st,lsm330d-accel
- st,lsm330d-gyro
- st,lsm330dl-accel
- st,lsm330dl-gyro
- st,lsm330dlc-accel
- st,lsm330dlc-gyro
- st,lsm9ds0-gyro
- st,lsm9ds1-magn
then:
properties:
interrupts:
maxItems: 2
required:
- compatible
- reg

View File

@@ -24,10 +24,10 @@ allOf:
select:
properties:
compatible:
items:
- enum:
- sifive,fu540-c000-ccache
- sifive,fu740-c000-ccache
contains:
enum:
- sifive,fu540-c000-ccache
- sifive,fu740-c000-ccache
required:
- compatible

View File

@@ -18,114 +18,5 @@ real, with all the uAPI bits is:
* Route shmem backend over to TTM SYSTEM for discrete
* TTM purgeable object support
* Move i915 buddy allocator over to TTM
* MMAP ioctl mode(see `I915 MMAP`_)
* SET/GET ioctl caching(see `I915 SET/GET CACHING`_)
* Send RFC(with mesa-dev on cc) for final sign off on the uAPI
* Add pciid for DG1 and turn on uAPI for real
New object placement and region query uAPI
==========================================
Starting from DG1 we need to give userspace the ability to allocate buffers from
device local-memory. Currently the driver supports gem_create, which can place
buffers in system memory via shmem, and the usual assortment of other
interfaces, like dumb buffers and userptr.
To support this new capability, while also providing a uAPI which will work
beyond just DG1, we propose to offer three new bits of uAPI:
DRM_I915_QUERY_MEMORY_REGIONS
-----------------------------
New query ID which allows userspace to discover the list of supported memory
regions(like system-memory and local-memory) for a given device. We identify
each region with a class and instance pair, which should be unique. The class
here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
like DG1.
Side note: The class/instance design is borrowed from our existing engine uAPI,
where we describe every physical engine in terms of its class, and the
particular instance, since we can have more than one per class.
In the future we also want to expose more information which can further
describe the capabilities of a region.
.. kernel-doc:: include/uapi/drm/i915_drm.h
:functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
GEM_CREATE_EXT
--------------
New ioctl which is basically just gem_create but now allows userspace to provide
a chain of possible extensions. Note that if we don't provide any extensions and
set flags=0 then we get the exact same behaviour as gem_create.
Side note: We also need to support PXP[1] in the near future, which is also
applicable to integrated platforms, and adds its own gem_create_ext extension,
which basically lets userspace mark a buffer as "protected".
.. kernel-doc:: include/uapi/drm/i915_drm.h
:functions: drm_i915_gem_create_ext
I915_GEM_CREATE_EXT_MEMORY_REGIONS
----------------------------------
Implemented as an extension for gem_create_ext, we would now allow userspace to
optionally provide an immutable list of preferred placements at creation time,
in priority order, for a given buffer object. For the placements we expect
them each to use the class/instance encoding, as per the output of the regions
query. Having the list in priority order will be useful in the future when
placing an object, say during eviction.
.. kernel-doc:: include/uapi/drm/i915_drm.h
:functions: drm_i915_gem_create_ext_memory_regions
One fair criticism here is that this seems a little over-engineered[2]. If we
just consider DG1 then yes, a simple gem_create.flags or something is totally
all that's needed to tell the kernel to allocate the buffer in local-memory or
whatever. However looking to the future we need uAPI which can also support
upcoming Xe HP multi-tile architecture in a sane way, where there can be
multiple local-memory instances for a given device, and so using both class and
instance in our uAPI to describe regions is desirable, although specifically
for DG1 it's uninteresting, since we only have a single local-memory instance.
Existing uAPI issues
====================
Some potential issues we still need to resolve.
I915 MMAP
---------
In i915 there are multiple ways to MMAP GEM object, including mapping the same
object using different mapping types(WC vs WB), i.e multiple active mmaps per
object. TTM expects one MMAP at most for the lifetime of the object. If it
turns out that we have to backpedal here, there might be some potential
userspace fallout.
I915 SET/GET CACHING
--------------------
In i915 we have set/get_caching ioctl. TTM doesn't let us to change this, but
DG1 doesn't support non-snooped pcie transactions, so we can just always
allocate as WB for smem-only buffers. If/when our hw gains support for
non-snooped pcie transactions then we must fix this mode at allocation time as
a new GEM extension.
This is related to the mmap problem, because in general (meaning, when we're
not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
allocation mode.
Possible idea is to let the kernel picks the mmap mode for userspace from the
following table:
smem-only: WB. Userspace does not need to call clflush.
smem+lmem: We only ever allow a single mode, so simply allocate this as uncached
memory, and always give userspace a WC mapping. GPU still does snooped access
here(assuming we can't turn it off like on DG1), which is a bit inefficient.
lmem only: always WC
This means on discrete you only get a single mmap mode, all others must be
rejected. That's probably going to be a new default mode or something like
that.
Links
=====
[1] https://patchwork.freedesktop.org/series/86798/
[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791

View File

@@ -17,6 +17,7 @@ Introduction
busses/index
i2c-topology
muxes/i2c-mux-gpio
i2c-sysfs
Writing device drivers
======================

View File

@@ -191,19 +191,9 @@ nf_flowtable_tcp_timeout - INTEGER (seconds)
TCP connections may be offloaded from nf conntrack to nf flow table.
Once aged, the connection is returned to nf conntrack with tcp pickup timeout.
nf_flowtable_tcp_pickup - INTEGER (seconds)
default 120
TCP connection timeout after being aged from nf flow table offload.
nf_flowtable_udp_timeout - INTEGER (seconds)
default 30
Control offload timeout for udp connections.
UDP connections may be offloaded from nf conntrack to nf flow table.
Once aged, the connection is returned to nf conntrack with udp pickup timeout.
nf_flowtable_udp_pickup - INTEGER (seconds)
default 30
UDP connection timeout after being aged from nf flow table offload.

View File

@@ -263,7 +263,7 @@ Userspace can also add file descriptors to the notifying process via
``ioctl(SECCOMP_IOCTL_NOTIF_ADDFD)``. The ``id`` member of
``struct seccomp_notif_addfd`` should be the same ``id`` as in
``struct seccomp_notif``. The ``newfd_flags`` flag may be used to set flags
like O_EXEC on the file descriptor in the notifying process. If the supervisor
like O_CLOEXEC on the file descriptor in the notifying process. If the supervisor
wants to inject the file descriptor with a specific number, the
``SECCOMP_ADDFD_FLAG_SETFD`` flag can be used, and set the ``newfd`` member to
the specific number to use. If that file descriptor is already open in the

View File

@@ -25,10 +25,10 @@ On x86:
- vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
- kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock is
taken inside kvm->arch.mmu_lock, and cannot be taken without already
holding kvm->arch.mmu_lock (typically with ``read_lock``, otherwise
there's no need to take kvm->arch.tdp_mmu_pages_lock at all).
- kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock and
kvm->arch.mmu_unsync_pages_lock are taken inside kvm->arch.mmu_lock, and
cannot be taken without already holding kvm->arch.mmu_lock (typically with
``read_lock`` for the TDP MMU, thus the need for additional spinlocks).
Everything else is a leaf: no other lock is taken inside the critical
sections.

View File

@@ -3866,6 +3866,16 @@ L: bcm-kernel-feedback-list@broadcom.com
S: Maintained
F: drivers/mtd/nand/raw/brcmnand/
BROADCOM STB PCIE DRIVER
M: Jim Quinlan <jim2101024@gmail.com>
M: Nicolas Saenz Julienne <nsaenz@kernel.org>
M: Florian Fainelli <f.fainelli@gmail.com>
M: bcm-kernel-feedback-list@broadcom.com
L: linux-pci@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
F: drivers/pci/controller/pcie-brcmstb.c
BROADCOM SYSTEMPORT ETHERNET DRIVER
M: Florian Fainelli <f.fainelli@gmail.com>
L: bcm-kernel-feedback-list@broadcom.com
@@ -4498,7 +4508,7 @@ L: clang-built-linux@googlegroups.com
S: Supported
W: https://clangbuiltlinux.github.io/
B: https://github.com/ClangBuiltLinux/linux/issues
C: irc://chat.freenode.net/clangbuiltlinux
C: irc://irc.libera.chat/clangbuiltlinux
F: Documentation/kbuild/llvm.rst
F: include/linux/compiler-clang.h
F: scripts/clang-tools/
@@ -6945,7 +6955,7 @@ F: include/uapi/linux/mdio.h
F: include/uapi/linux/mii.h
EXFAT FILE SYSTEM
M: Namjae Jeon <namjae.jeon@samsung.com>
M: Namjae Jeon <linkinjeon@kernel.org>
M: Sungjong Seo <sj1557.seo@samsung.com>
L: linux-fsdevel@vger.kernel.org
S: Maintained
@@ -11327,7 +11337,7 @@ W: https://linuxtv.org
T: git git://linuxtv.org/media_tree.git
F: drivers/media/radio/radio-maxiradio*
MCAB MICROCHIP CAN BUS ANALYZER TOOL DRIVER
MCBA MICROCHIP CAN BUS ANALYZER TOOL DRIVER
R: Yasushi SHOJI <yashi@spacecubics.com>
L: linux-can@vger.kernel.org
S: Maintained
@@ -14430,6 +14440,13 @@ S: Maintained
F: Documentation/devicetree/bindings/pci/hisilicon-histb-pcie.txt
F: drivers/pci/controller/dwc/pcie-histb.c
PCIE DRIVER FOR INTEL LGM GW SOC
M: Rahul Tanwar <rtanwar@maxlinear.com>
L: linux-pci@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/pci/intel-gw-pcie.yaml
F: drivers/pci/controller/dwc/pcie-intel-gw.c
PCIE DRIVER FOR MEDIATEK
M: Ryder Lee <ryder.lee@mediatek.com>
M: Jianjun Wang <jianjun.wang@mediatek.com>
@@ -15803,7 +15820,7 @@ F: Documentation/devicetree/bindings/i2c/renesas,iic-emev2.yaml
F: drivers/i2c/busses/i2c-emev2.c
RENESAS ETHERNET DRIVERS
R: Sergei Shtylyov <sergei.shtylyov@gmail.com>
R: Sergey Shtylyov <s.shtylyov@omp.ru>
L: netdev@vger.kernel.org
L: linux-renesas-soc@vger.kernel.org
F: Documentation/devicetree/bindings/net/renesas,*.yaml
@@ -17815,7 +17832,7 @@ F: include/linux/sync_file.h
F: include/uapi/linux/sync_file.h
SYNOPSYS ARC ARCHITECTURE
M: Vineet Gupta <vgupta@synopsys.com>
M: Vineet Gupta <vgupta@kernel.org>
L: linux-snps-arc@lists.infradead.org
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc.git
@@ -20017,7 +20034,8 @@ F: Documentation/devicetree/bindings/extcon/wlf,arizona.yaml
F: Documentation/devicetree/bindings/mfd/wlf,arizona.yaml
F: Documentation/devicetree/bindings/mfd/wm831x.txt
F: Documentation/devicetree/bindings/regulator/wlf,arizona.yaml
F: Documentation/devicetree/bindings/sound/wlf,arizona.yaml
F: Documentation/devicetree/bindings/sound/wlf,*.yaml
F: Documentation/devicetree/bindings/sound/wm*
F: Documentation/hwmon/wm83??.rst
F: arch/arm/mach-s3c/mach-crag6410*
F: drivers/clk/clk-wm83*.c

View File

@@ -2,7 +2,7 @@
VERSION = 5
PATCHLEVEL = 14
SUBLEVEL = 0
EXTRAVERSION = -rc5
EXTRAVERSION =
NAME = Opossums on Parade
# *DOCUMENTATION*

View File

@@ -409,7 +409,7 @@ choice
help
Depending on the configuration, CPU can contain DSP registers
(ACC0_GLO, ACC0_GHI, DSP_BFLY0, DSP_CTRL, DSP_FFT_CTRL).
Bellow is options describing how to handle these registers in
Below are options describing how to handle these registers in
interrupt entry / exit and in context switch.
config ARC_DSP_NONE

View File

@@ -24,7 +24,7 @@
*/
static inline __sum16 csum_fold(__wsum s)
{
unsigned r = s << 16 | s >> 16; /* ror */
unsigned int r = s << 16 | s >> 16; /* ror */
s = ~s;
s -= r;
return s >> 16;

View File

@@ -123,7 +123,7 @@ static const char * const arc_pmu_ev_hw_map[] = {
#define C(_x) PERF_COUNT_HW_CACHE_##_x
#define CACHE_OP_UNSUPPORTED 0xffff
static const unsigned arc_pmu_cache_map[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
static const unsigned int arc_pmu_cache_map[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
[C(L1D)] = {
[C(OP_READ)] = {
[C(RESULT_ACCESS)] = PERF_COUNT_ARC_LDC,

View File

@@ -57,23 +57,26 @@ void fpu_save_restore(struct task_struct *prev, struct task_struct *next)
void fpu_init_task(struct pt_regs *regs)
{
const unsigned int fwe = 0x80000000;
/* default rounding mode */
write_aux_reg(ARC_REG_FPU_CTRL, 0x100);
/* set "Write enable" to allow explicit write to exception flags */
write_aux_reg(ARC_REG_FPU_STATUS, 0x80000000);
/* Initialize to zero: setting requires FWE be set */
write_aux_reg(ARC_REG_FPU_STATUS, fwe);
}
void fpu_save_restore(struct task_struct *prev, struct task_struct *next)
{
struct arc_fpu *save = &prev->thread.fpu;
struct arc_fpu *restore = &next->thread.fpu;
const unsigned int fwe = 0x80000000;
save->ctrl = read_aux_reg(ARC_REG_FPU_CTRL);
save->status = read_aux_reg(ARC_REG_FPU_STATUS);
write_aux_reg(ARC_REG_FPU_CTRL, restore->ctrl);
write_aux_reg(ARC_REG_FPU_STATUS, restore->status);
write_aux_reg(ARC_REG_FPU_STATUS, (fwe | restore->status));
}
#endif

View File

@@ -260,7 +260,7 @@ static void init_unwind_hdr(struct unwind_table *table,
{
const u8 *ptr;
unsigned long tableSize = table->size, hdrSize;
unsigned n;
unsigned int n;
const u32 *fde;
struct {
u8 version;
@@ -462,7 +462,7 @@ static uleb128_t get_uleb128(const u8 **pcur, const u8 *end)
{
const u8 *cur = *pcur;
uleb128_t value;
unsigned shift;
unsigned int shift;
for (shift = 0, value = 0; cur < end; shift += 7) {
if (shift + 7 > 8 * sizeof(value)
@@ -483,7 +483,7 @@ static sleb128_t get_sleb128(const u8 **pcur, const u8 *end)
{
const u8 *cur = *pcur;
sleb128_t value;
unsigned shift;
unsigned int shift;
for (shift = 0, value = 0; cur < end; shift += 7) {
if (shift + 7 > 8 * sizeof(value)
@@ -609,7 +609,7 @@ static unsigned long read_pointer(const u8 **pLoc, const void *end,
static signed fde_pointer_type(const u32 *cie)
{
const u8 *ptr = (const u8 *)(cie + 2);
unsigned version = *ptr;
unsigned int version = *ptr;
if (*++ptr) {
const char *aug;
@@ -904,7 +904,7 @@ int arc_unwind(struct unwind_frame_info *frame)
const u8 *ptr = NULL, *end = NULL;
unsigned long pc = UNW_PC(frame) - frame->call_frame;
unsigned long startLoc = 0, endLoc = 0, cfa;
unsigned i;
unsigned int i;
signed ptrType = -1;
uleb128_t retAddrReg = 0;
const struct unwind_table *table;

View File

@@ -88,6 +88,8 @@ SECTIONS
CPUIDLE_TEXT
LOCK_TEXT
KPROBES_TEXT
IRQENTRY_TEXT
SOFTIRQENTRY_TEXT
*(.fixup)
*(.gnu.warning)
}

View File

@@ -15,8 +15,6 @@ CONFIG_SLAB=y
CONFIG_ARCH_NOMADIK=y
CONFIG_MACH_NOMADIK_8815NHK=y
CONFIG_AEABI=y
CONFIG_ZBOOT_ROM_TEXT=0x0
CONFIG_ZBOOT_ROM_BSS=0x0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_BLK_DEV_BSG is not set
@@ -52,9 +50,9 @@ CONFIG_MTD_BLOCK=y
CONFIG_MTD_ONENAND=y
CONFIG_MTD_ONENAND_VERIFY_WRITE=y
CONFIG_MTD_ONENAND_GENERIC=y
CONFIG_MTD_NAND_ECC_SW_HAMMING_SMC=y
CONFIG_MTD_RAW_NAND=y
CONFIG_MTD_NAND_FSMC=y
CONFIG_MTD_NAND_ECC_SW_HAMMING_SMC=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_CRYPTOLOOP=y
CONFIG_BLK_DEV_RAM=y
@@ -97,6 +95,7 @@ CONFIG_REGULATOR=y
CONFIG_DRM=y
CONFIG_DRM_PANEL_TPO_TPG110=y
CONFIG_DRM_PL111=y
CONFIG_FB=y
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_BACKLIGHT_PWM=y
CONFIG_FRAMEBUFFER_CONSOLE=y
@@ -136,9 +135,8 @@ CONFIG_NLS_ISO8859_15=y
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_DES=y
# CONFIG_DEBUG_BUGVERBOSE is not set
CONFIG_DEBUG_INFO=y
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_DEBUG_FS=y
# CONFIG_SCHED_DEBUG is not set
# CONFIG_DEBUG_PREEMPT is not set
# CONFIG_DEBUG_BUGVERBOSE is not set

View File

@@ -160,10 +160,11 @@ extern unsigned long vectors_base;
/*
* Physical start and end address of the kernel sections. These addresses are
* 2MB-aligned to match the section mappings placed over the kernel.
* 2MB-aligned to match the section mappings placed over the kernel. We use
* u64 so that LPAE mappings beyond the 32bit limit will work out as well.
*/
extern u32 kernel_sec_start;
extern u32 kernel_sec_end;
extern u64 kernel_sec_start;
extern u64 kernel_sec_end;
/*
* Physical vs virtual RAM address space conversion. These are

View File

@@ -49,7 +49,8 @@
/*
* This needs to be assigned at runtime when the linker symbols are
* resolved.
* resolved. These are unsigned 64bit really, but in this assembly code
* We store them as 32bit.
*/
.pushsection .data
.align 2
@@ -57,7 +58,9 @@
.globl kernel_sec_end
kernel_sec_start:
.long 0
.long 0
kernel_sec_end:
.long 0
.long 0
.popsection
@@ -250,7 +253,11 @@ __create_page_tables:
add r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_ORDER)
ldr r6, =(_end - 1)
adr_l r5, kernel_sec_start @ _pa(kernel_sec_start)
str r8, [r5] @ Save physical start of kernel
#ifdef CONFIG_CPU_ENDIAN_BE8
str r8, [r5, #4] @ Save physical start of kernel (BE)
#else
str r8, [r5] @ Save physical start of kernel (LE)
#endif
orr r3, r8, r7 @ Add the MMU flags
add r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
1: str r3, [r0], #1 << PMD_ORDER
@@ -259,7 +266,11 @@ __create_page_tables:
bls 1b
eor r3, r3, r7 @ Remove the MMU flags
adr_l r5, kernel_sec_end @ _pa(kernel_sec_end)
str r3, [r5] @ Save physical end of kernel
#ifdef CONFIG_CPU_ENDIAN_BE8
str r3, [r5, #4] @ Save physical end of kernel (BE)
#else
str r3, [r5] @ Save physical end of kernel (LE)
#endif
#ifdef CONFIG_XIP_KERNEL
/*

View File

@@ -218,30 +218,30 @@
/*
* PCI Control/Status Registers
*/
#define IXP4XX_PCI_CSR(x) ((volatile u32 *)(IXP4XX_PCI_CFG_BASE_VIRT+(x)))
#define _IXP4XX_PCI_CSR(x) ((volatile u32 *)(IXP4XX_PCI_CFG_BASE_VIRT+(x)))
#define PCI_NP_AD IXP4XX_PCI_CSR(PCI_NP_AD_OFFSET)
#define PCI_NP_CBE IXP4XX_PCI_CSR(PCI_NP_CBE_OFFSET)
#define PCI_NP_WDATA IXP4XX_PCI_CSR(PCI_NP_WDATA_OFFSET)
#define PCI_NP_RDATA IXP4XX_PCI_CSR(PCI_NP_RDATA_OFFSET)
#define PCI_CRP_AD_CBE IXP4XX_PCI_CSR(PCI_CRP_AD_CBE_OFFSET)
#define PCI_CRP_WDATA IXP4XX_PCI_CSR(PCI_CRP_WDATA_OFFSET)
#define PCI_CRP_RDATA IXP4XX_PCI_CSR(PCI_CRP_RDATA_OFFSET)
#define PCI_CSR IXP4XX_PCI_CSR(PCI_CSR_OFFSET)
#define PCI_ISR IXP4XX_PCI_CSR(PCI_ISR_OFFSET)
#define PCI_INTEN IXP4XX_PCI_CSR(PCI_INTEN_OFFSET)
#define PCI_DMACTRL IXP4XX_PCI_CSR(PCI_DMACTRL_OFFSET)
#define PCI_AHBMEMBASE IXP4XX_PCI_CSR(PCI_AHBMEMBASE_OFFSET)
#define PCI_AHBIOBASE IXP4XX_PCI_CSR(PCI_AHBIOBASE_OFFSET)
#define PCI_PCIMEMBASE IXP4XX_PCI_CSR(PCI_PCIMEMBASE_OFFSET)
#define PCI_AHBDOORBELL IXP4XX_PCI_CSR(PCI_AHBDOORBELL_OFFSET)
#define PCI_PCIDOORBELL IXP4XX_PCI_CSR(PCI_PCIDOORBELL_OFFSET)
#define PCI_ATPDMA0_AHBADDR IXP4XX_PCI_CSR(PCI_ATPDMA0_AHBADDR_OFFSET)
#define PCI_ATPDMA0_PCIADDR IXP4XX_PCI_CSR(PCI_ATPDMA0_PCIADDR_OFFSET)
#define PCI_ATPDMA0_LENADDR IXP4XX_PCI_CSR(PCI_ATPDMA0_LENADDR_OFFSET)
#define PCI_ATPDMA1_AHBADDR IXP4XX_PCI_CSR(PCI_ATPDMA1_AHBADDR_OFFSET)
#define PCI_ATPDMA1_PCIADDR IXP4XX_PCI_CSR(PCI_ATPDMA1_PCIADDR_OFFSET)
#define PCI_ATPDMA1_LENADDR IXP4XX_PCI_CSR(PCI_ATPDMA1_LENADDR_OFFSET)
#define PCI_NP_AD _IXP4XX_PCI_CSR(PCI_NP_AD_OFFSET)
#define PCI_NP_CBE _IXP4XX_PCI_CSR(PCI_NP_CBE_OFFSET)
#define PCI_NP_WDATA _IXP4XX_PCI_CSR(PCI_NP_WDATA_OFFSET)
#define PCI_NP_RDATA _IXP4XX_PCI_CSR(PCI_NP_RDATA_OFFSET)
#define PCI_CRP_AD_CBE _IXP4XX_PCI_CSR(PCI_CRP_AD_CBE_OFFSET)
#define PCI_CRP_WDATA _IXP4XX_PCI_CSR(PCI_CRP_WDATA_OFFSET)
#define PCI_CRP_RDATA _IXP4XX_PCI_CSR(PCI_CRP_RDATA_OFFSET)
#define PCI_CSR _IXP4XX_PCI_CSR(PCI_CSR_OFFSET)
#define PCI_ISR _IXP4XX_PCI_CSR(PCI_ISR_OFFSET)
#define PCI_INTEN _IXP4XX_PCI_CSR(PCI_INTEN_OFFSET)
#define PCI_DMACTRL _IXP4XX_PCI_CSR(PCI_DMACTRL_OFFSET)
#define PCI_AHBMEMBASE _IXP4XX_PCI_CSR(PCI_AHBMEMBASE_OFFSET)
#define PCI_AHBIOBASE _IXP4XX_PCI_CSR(PCI_AHBIOBASE_OFFSET)
#define PCI_PCIMEMBASE _IXP4XX_PCI_CSR(PCI_PCIMEMBASE_OFFSET)
#define PCI_AHBDOORBELL _IXP4XX_PCI_CSR(PCI_AHBDOORBELL_OFFSET)
#define PCI_PCIDOORBELL _IXP4XX_PCI_CSR(PCI_PCIDOORBELL_OFFSET)
#define PCI_ATPDMA0_AHBADDR _IXP4XX_PCI_CSR(PCI_ATPDMA0_AHBADDR_OFFSET)
#define PCI_ATPDMA0_PCIADDR _IXP4XX_PCI_CSR(PCI_ATPDMA0_PCIADDR_OFFSET)
#define PCI_ATPDMA0_LENADDR _IXP4XX_PCI_CSR(PCI_ATPDMA0_LENADDR_OFFSET)
#define PCI_ATPDMA1_AHBADDR _IXP4XX_PCI_CSR(PCI_ATPDMA1_AHBADDR_OFFSET)
#define PCI_ATPDMA1_PCIADDR _IXP4XX_PCI_CSR(PCI_ATPDMA1_PCIADDR_OFFSET)
#define PCI_ATPDMA1_LENADDR _IXP4XX_PCI_CSR(PCI_ATPDMA1_LENADDR_OFFSET)
/*
* PCI register values and bit definitions

View File

@@ -1608,6 +1608,13 @@ static void __init early_paging_init(const struct machine_desc *mdesc)
if (offset == 0)
return;
/*
* Offset the kernel section physical offsets so that the kernel
* mapping will work out later on.
*/
kernel_sec_start += offset;
kernel_sec_end += offset;
/*
* Get the address of the remap function in the 1:1 identity
* mapping setup by the early page table assembly code. We
@@ -1716,7 +1723,7 @@ void __init paging_init(const struct machine_desc *mdesc)
{
void *zero_page;
pr_debug("physical kernel sections: 0x%08x-0x%08x\n",
pr_debug("physical kernel sections: 0x%08llx-0x%08llx\n",
kernel_sec_start, kernel_sec_end);
prepare_page_table();

View File

@@ -29,7 +29,7 @@ ENTRY(lpae_pgtables_remap_asm)
ldr r6, =(_end - 1)
add r7, r2, #0x1000
add r6, r7, r6, lsr #SECTION_SHIFT - L2_ORDER
add r7, r7, #PAGE_OFFSET >> (SECTION_SHIFT - L2_ORDER)
add r7, r7, #KERNEL_OFFSET >> (SECTION_SHIFT - L2_ORDER)
1: ldrd r4, r5, [r7]
adds r4, r4, r0
adc r5, r5, r1

View File

@@ -156,6 +156,7 @@ config ARM64
select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
select HAVE_ARCH_PFN_VALID
select HAVE_ARCH_PREL32_RELOCATIONS
select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
select HAVE_ARCH_SECCOMP_FILTER

View File

@@ -183,6 +183,8 @@ endif
# We use MRPROPER_FILES and CLEAN_FILES now
archclean:
$(Q)$(MAKE) $(clean)=$(boot)
$(Q)$(MAKE) $(clean)=arch/arm64/kernel/vdso
$(Q)$(MAKE) $(clean)=arch/arm64/kernel/vdso32
ifeq ($(KBUILD_EXTMOD),)
# We need to generate vdso-offsets.h before compiling certain files in kernel/.

View File

@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2015, LGE Inc. All rights reserved.
* Copyright (c) 2016, The Linux Foundation. All rights reserved.
* Copyright (c) 2021, Petr Vorel <petr.vorel@gmail.com>
*/
/dts-v1/;
@@ -9,6 +10,9 @@
#include "pm8994.dtsi"
#include "pmi8994.dtsi"
/* cont_splash_mem has different memory mapping */
/delete-node/ &cont_splash_mem;
/ {
model = "LG Nexus 5X";
compatible = "lg,bullhead", "qcom,msm8992";
@@ -17,6 +21,9 @@
qcom,board-id = <0xb64 0>;
qcom,pmic-id = <0x10009 0x1000A 0x0 0x0>;
/* Bullhead firmware doesn't support PSCI */
/delete-node/ psci;
aliases {
serial0 = &blsp1_uart2;
};
@@ -38,6 +45,11 @@
ftrace-size = <0x10000>;
pmsg-size = <0x20000>;
};
cont_splash_mem: memory@3400000 {
reg = <0 0x03400000 0 0x1200000>;
no-map;
};
};
};

View File

@@ -1,12 +1,16 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2015, Huawei Inc. All rights reserved.
* Copyright (c) 2016, The Linux Foundation. All rights reserved.
* Copyright (c) 2021, Petr Vorel <petr.vorel@gmail.com>
*/
/dts-v1/;
#include "msm8994.dtsi"
/* Angler's firmware does not report where the memory is allocated */
/delete-node/ &cont_splash_mem;
/ {
model = "Huawei Nexus 6P";
compatible = "huawei,angler", "qcom,msm8994";

View File

@@ -200,7 +200,7 @@
&BIG_CPU_SLEEP_1
&CLUSTER_SLEEP_0>;
next-level-cache = <&L2_700>;
qcom,freq-domain = <&cpufreq_hw 1>;
qcom,freq-domain = <&cpufreq_hw 2>;
#cooling-cells = <2>;
L2_700: l2-cache {
compatible = "cache";

View File

@@ -69,7 +69,7 @@
};
rmtfs_upper_guard: memory@f5d01000 {
no-map;
reg = <0 0xf5d01000 0 0x2000>;
reg = <0 0xf5d01000 0 0x1000>;
};
/*
@@ -78,7 +78,7 @@
*/
removed_region: memory@88f00000 {
no-map;
reg = <0 0x88f00000 0 0x200000>;
reg = <0 0x88f00000 0 0x1c00000>;
};
ramoops: ramoops@ac300000 {

View File

@@ -700,7 +700,7 @@
left_spkr: wsa8810-left{
compatible = "sdw10217211000";
reg = <0 3>;
powerdown-gpios = <&wcdgpio 2 GPIO_ACTIVE_HIGH>;
powerdown-gpios = <&wcdgpio 1 GPIO_ACTIVE_HIGH>;
#thermal-sensor-cells = <0>;
sound-name-prefix = "SpkrLeft";
#sound-dai-cells = <0>;
@@ -708,7 +708,7 @@
right_spkr: wsa8810-right{
compatible = "sdw10217211000";
powerdown-gpios = <&wcdgpio 3 GPIO_ACTIVE_HIGH>;
powerdown-gpios = <&wcdgpio 2 GPIO_ACTIVE_HIGH>;
reg = <0 4>;
#thermal-sensor-cells = <0>;
sound-name-prefix = "SpkrRight";

View File

@@ -33,8 +33,7 @@
* EL2.
*/
.macro __init_el2_timers
mrs x0, cnthctl_el2
orr x0, x0, #3 // Enable EL1 physical timers
mov x0, #3 // Enable EL1 physical timers
msr cnthctl_el2, x0
msr cntvoff_el2, xzr // Clear virtual offset
.endm

View File

@@ -41,6 +41,7 @@ void tag_clear_highpage(struct page *to);
typedef struct page *pgtable_t;
int pfn_valid(unsigned long pfn);
int pfn_is_map_memory(unsigned long pfn);
#include <asm/memory.h>

View File

@@ -94,10 +94,14 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
kvm->arch.return_nisv_io_abort_to_user = true;
break;
case KVM_CAP_ARM_MTE:
if (!system_supports_mte() || kvm->created_vcpus)
return -EINVAL;
r = 0;
kvm->arch.mte_enabled = true;
mutex_lock(&kvm->lock);
if (!system_supports_mte() || kvm->created_vcpus) {
r = -EINVAL;
} else {
r = 0;
kvm->arch.mte_enabled = true;
}
mutex_unlock(&kvm->lock);
break;
default:
r = -EINVAL;

View File

@@ -193,7 +193,7 @@ static bool range_is_memory(u64 start, u64 end)
{
struct kvm_mem_range r1, r2;
if (!find_mem_range(start, &r1) || !find_mem_range(end, &r2))
if (!find_mem_range(start, &r1) || !find_mem_range(end - 1, &r2))
return false;
if (r1.start != r2.start)
return false;

View File

@@ -219,6 +219,43 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
free_area_init(max_zone_pfns);
}
int pfn_valid(unsigned long pfn)
{
phys_addr_t addr = PFN_PHYS(pfn);
struct mem_section *ms;
/*
* Ensure the upper PAGE_SHIFT bits are clear in the
* pfn. Else it might lead to false positives when
* some of the upper bits are set, but the lower bits
* match a valid pfn.
*/
if (PHYS_PFN(addr) != pfn)
return 0;
if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
return 0;
ms = __pfn_to_section(pfn);
if (!valid_section(ms))
return 0;
/*
* ZONE_DEVICE memory does not have the memblock entries.
* memblock_is_map_memory() check for ZONE_DEVICE based
* addresses will always fail. Even the normal hotplugged
* memory will never have MEMBLOCK_NOMAP flag set in their
* memblock entries. Skip memblock search for all non early
* memory sections covering all of hotplug memory including
* both normal and ZONE_DEVICE based.
*/
if (!early_section(ms))
return pfn_section_valid(ms, pfn);
return memblock_is_memory(addr);
}
EXPORT_SYMBOL(pfn_valid);
int pfn_is_map_memory(unsigned long pfn)
{
phys_addr_t addr = PFN_PHYS(pfn);

View File

@@ -8,19 +8,4 @@ extern void * memset(void *, int, size_t);
#define __HAVE_ARCH_MEMCPY
void * memcpy(void * dest,const void *src,size_t count);
#define __HAVE_ARCH_STRLEN
extern size_t strlen(const char *s);
#define __HAVE_ARCH_STRCPY
extern char *strcpy(char *dest, const char *src);
#define __HAVE_ARCH_STRNCPY
extern char *strncpy(char *dest, const char *src, size_t count);
#define __HAVE_ARCH_STRCAT
extern char *strcat(char *dest, const char *src);
#define __HAVE_ARCH_MEMSET
extern void *memset(void *, int, size_t);
#endif

View File

@@ -17,10 +17,6 @@
#include <linux/string.h>
EXPORT_SYMBOL(memset);
EXPORT_SYMBOL(strlen);
EXPORT_SYMBOL(strcpy);
EXPORT_SYMBOL(strncpy);
EXPORT_SYMBOL(strcat);
#include <linux/atomic.h>
EXPORT_SYMBOL(__xchg8);

View File

@@ -3,7 +3,7 @@
# Makefile for parisc-specific library files
#
lib-y := lusercopy.o bitops.o checksum.o io.o memcpy.o \
ucmpdi2.o delay.o string.o
lib-y := lusercopy.o bitops.o checksum.o io.o memset.o memcpy.o \
ucmpdi2.o delay.o
obj-y := iomap.o

72
arch/parisc/lib/memset.c Normal file
View File

@@ -0,0 +1,72 @@
/* SPDX-License-Identifier: GPL-2.0-or-later */
#include <linux/types.h>
#include <asm/string.h>
#define OPSIZ (BITS_PER_LONG/8)
typedef unsigned long op_t;
void *
memset (void *dstpp, int sc, size_t len)
{
unsigned int c = sc;
long int dstp = (long int) dstpp;
if (len >= 8)
{
size_t xlen;
op_t cccc;
cccc = (unsigned char) c;
cccc |= cccc << 8;
cccc |= cccc << 16;
if (OPSIZ > 4)
/* Do the shift in two steps to avoid warning if long has 32 bits. */
cccc |= (cccc << 16) << 16;
/* There are at least some bytes to set.
No need to test for LEN == 0 in this alignment loop. */
while (dstp % OPSIZ != 0)
{
((unsigned char *) dstp)[0] = c;
dstp += 1;
len -= 1;
}
/* Write 8 `op_t' per iteration until less than 8 `op_t' remain. */
xlen = len / (OPSIZ * 8);
while (xlen > 0)
{
((op_t *) dstp)[0] = cccc;
((op_t *) dstp)[1] = cccc;
((op_t *) dstp)[2] = cccc;
((op_t *) dstp)[3] = cccc;
((op_t *) dstp)[4] = cccc;
((op_t *) dstp)[5] = cccc;
((op_t *) dstp)[6] = cccc;
((op_t *) dstp)[7] = cccc;
dstp += 8 * OPSIZ;
xlen -= 1;
}
len %= OPSIZ * 8;
/* Write 1 `op_t' per iteration until less than OPSIZ bytes remain. */
xlen = len / OPSIZ;
while (xlen > 0)
{
((op_t *) dstp)[0] = cccc;
dstp += OPSIZ;
xlen -= 1;
}
len %= OPSIZ;
}
/* Write the last few bytes. */
while (len > 0)
{
((unsigned char *) dstp)[0] = c;
dstp += 1;
len -= 1;
}
return dstpp;
}

View File

@@ -1,136 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
/*
* PA-RISC assembly string functions
*
* Copyright (C) 2019 Helge Deller <deller@gmx.de>
*/
#include <asm/assembly.h>
#include <linux/linkage.h>
.section .text.hot
.level PA_ASM_LEVEL
t0 = r20
t1 = r21
t2 = r22
ENTRY_CFI(strlen, frame=0,no_calls)
or,COND(<>) arg0,r0,ret0
b,l,n .Lstrlen_null_ptr,r0
depwi 0,31,2,ret0
cmpb,COND(<>) arg0,ret0,.Lstrlen_not_aligned
ldw,ma 4(ret0),t0
cmpib,tr 0,r0,.Lstrlen_loop
uxor,nbz r0,t0,r0
.Lstrlen_not_aligned:
uaddcm arg0,ret0,t1
shladd t1,3,r0,t1
mtsar t1
depwi -1,%sar,32,t0
uxor,nbz r0,t0,r0
.Lstrlen_loop:
b,l,n .Lstrlen_end_loop,r0
ldw,ma 4(ret0),t0
cmpib,tr 0,r0,.Lstrlen_loop
uxor,nbz r0,t0,r0
.Lstrlen_end_loop:
extrw,u,<> t0,7,8,r0
addib,tr,n -3,ret0,.Lstrlen_out
extrw,u,<> t0,15,8,r0
addib,tr,n -2,ret0,.Lstrlen_out
extrw,u,<> t0,23,8,r0
addi -1,ret0,ret0
.Lstrlen_out:
bv r0(rp)
uaddcm ret0,arg0,ret0
.Lstrlen_null_ptr:
bv,n r0(rp)
ENDPROC_CFI(strlen)
ENTRY_CFI(strcpy, frame=0,no_calls)
ldb 0(arg1),t0
stb t0,0(arg0)
ldo 0(arg0),ret0
ldo 1(arg1),t1
cmpb,= r0,t0,2f
ldo 1(arg0),t2
1: ldb 0(t1),arg1
stb arg1,0(t2)
ldo 1(t1),t1
cmpb,<> r0,arg1,1b
ldo 1(t2),t2
2: bv,n r0(rp)
ENDPROC_CFI(strcpy)
ENTRY_CFI(strncpy, frame=0,no_calls)
ldb 0(arg1),t0
stb t0,0(arg0)
ldo 1(arg1),t1
ldo 0(arg0),ret0
cmpb,= r0,t0,2f
ldo 1(arg0),arg1
1: ldo -1(arg2),arg2
cmpb,COND(=),n r0,arg2,2f
ldb 0(t1),arg0
stb arg0,0(arg1)
ldo 1(t1),t1
cmpb,<> r0,arg0,1b
ldo 1(arg1),arg1
2: bv,n r0(rp)
ENDPROC_CFI(strncpy)
ENTRY_CFI(strcat, frame=0,no_calls)
ldb 0(arg0),t0
cmpb,= t0,r0,2f
ldo 0(arg0),ret0
ldo 1(arg0),arg0
1: ldb 0(arg0),t1
cmpb,<>,n r0,t1,1b
ldo 1(arg0),arg0
2: ldb 0(arg1),t2
stb t2,0(arg0)
ldo 1(arg0),arg0
ldb 0(arg1),t0
cmpb,<> r0,t0,2b
ldo 1(arg1),arg1
bv,n r0(rp)
ENDPROC_CFI(strcat)
ENTRY_CFI(memset, frame=0,no_calls)
copy arg0,ret0
cmpb,COND(=) r0,arg0,4f
copy arg0,t2
cmpb,COND(=) r0,arg2,4f
ldo -1(arg2),arg3
subi -1,arg3,t0
subi 0,t0,t1
cmpiclr,COND(>=) 0,t1,arg2
ldo -1(t1),arg2
extru arg2,31,2,arg0
2: stb arg1,0(t2)
ldo 1(t2),t2
addib,>= -1,arg0,2b
ldo -1(arg3),arg3
cmpiclr,COND(<=) 4,arg2,r0
b,l,n 4f,r0
#ifdef CONFIG_64BIT
depd,* r0,63,2,arg2
#else
depw r0,31,2,arg2
#endif
ldo 1(t2),t2
3: stb arg1,-1(t2)
stb arg1,0(t2)
stb arg1,1(t2)
stb arg1,2(t2)
addib,COND(>) -4,arg2,3b
ldo 4(t2),t2
4: bv,n r0(rp)
ENDPROC_CFI(memset)
.end

View File

@@ -4,6 +4,8 @@
#include <asm/bug.h>
#include <asm/book3s/32/mmu-hash.h>
#include <asm/mmu.h>
#include <asm/synch.h>
#ifndef __ASSEMBLY__
@@ -28,6 +30,15 @@ static inline void kuep_lock(void)
return;
update_user_segments(mfsr(0) | SR_NX);
/*
* This isync() shouldn't be necessary as the kernel is not excepted to
* run any instruction in userspace soon after the update of segments,
* but hash based cores (at least G3) seem to exhibit a random
* behaviour when the 'isync' is not there. 603 cores don't have this
* behaviour so don't do the 'isync' as it saves several CPU cycles.
*/
if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
isync(); /* Context sync required after mtsr() */
}
static inline void kuep_unlock(void)
@@ -36,6 +47,15 @@ static inline void kuep_unlock(void)
return;
update_user_segments(mfsr(0) & ~SR_NX);
/*
* This isync() shouldn't be necessary as a 'rfi' will soon be executed
* to return to userspace, but hash based cores (at least G3) seem to
* exhibit a random behaviour when the 'isync' is not there. 603 cores
* don't have this behaviour so don't do the 'isync' as it saves several
* CPU cycles.
*/
if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
isync(); /* Context sync required after mtsr() */
}
#ifdef CONFIG_PPC_KUAP

View File

@@ -583,6 +583,9 @@ DECLARE_INTERRUPT_HANDLER_NMI(hmi_exception_realmode);
DECLARE_INTERRUPT_HANDLER_ASYNC(TAUException);
/* irq.c */
DECLARE_INTERRUPT_HANDLER_ASYNC(do_IRQ);
void __noreturn unrecoverable_exception(struct pt_regs *regs);
void replay_system_reset(void);

View File

@@ -52,7 +52,7 @@ extern void *mcheckirq_ctx[NR_CPUS];
extern void *hardirq_ctx[NR_CPUS];
extern void *softirq_ctx[NR_CPUS];
extern void do_IRQ(struct pt_regs *regs);
void __do_IRQ(struct pt_regs *regs);
extern void __init init_IRQ(void);
extern void __do_irq(struct pt_regs *regs);

View File

@@ -70,6 +70,22 @@ struct pt_regs
unsigned long __pad[4]; /* Maintain 16 byte interrupt stack alignment */
};
#endif
#if defined(CONFIG_PPC32) && defined(CONFIG_BOOKE)
struct { /* Must be a multiple of 16 bytes */
unsigned long mas0;
unsigned long mas1;
unsigned long mas2;
unsigned long mas3;
unsigned long mas6;
unsigned long mas7;
unsigned long srr0;
unsigned long srr1;
unsigned long csrr0;
unsigned long csrr1;
unsigned long dsrr0;
unsigned long dsrr1;
};
#endif
};
#endif

View File

@@ -309,24 +309,21 @@ int main(void)
STACK_PT_REGS_OFFSET(STACK_REGS_IAMR, iamr);
#endif
#if defined(CONFIG_PPC32)
#if defined(CONFIG_BOOKE) || defined(CONFIG_40x)
DEFINE(EXC_LVL_SIZE, STACK_EXC_LVL_FRAME_SIZE);
DEFINE(MAS0, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, mas0));
#if defined(CONFIG_PPC32) && defined(CONFIG_BOOKE)
STACK_PT_REGS_OFFSET(MAS0, mas0);
/* we overload MMUCR for 44x on MAS0 since they are mutually exclusive */
DEFINE(MMUCR, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, mas0));
DEFINE(MAS1, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, mas1));
DEFINE(MAS2, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, mas2));
DEFINE(MAS3, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, mas3));
DEFINE(MAS6, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, mas6));
DEFINE(MAS7, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, mas7));
DEFINE(_SRR0, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, srr0));
DEFINE(_SRR1, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, srr1));
DEFINE(_CSRR0, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, csrr0));
DEFINE(_CSRR1, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, csrr1));
DEFINE(_DSRR0, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, dsrr0));
DEFINE(_DSRR1, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, dsrr1));
#endif
STACK_PT_REGS_OFFSET(MMUCR, mas0);
STACK_PT_REGS_OFFSET(MAS1, mas1);
STACK_PT_REGS_OFFSET(MAS2, mas2);
STACK_PT_REGS_OFFSET(MAS3, mas3);
STACK_PT_REGS_OFFSET(MAS6, mas6);
STACK_PT_REGS_OFFSET(MAS7, mas7);
STACK_PT_REGS_OFFSET(_SRR0, srr0);
STACK_PT_REGS_OFFSET(_SRR1, srr1);
STACK_PT_REGS_OFFSET(_CSRR0, csrr0);
STACK_PT_REGS_OFFSET(_CSRR1, csrr1);
STACK_PT_REGS_OFFSET(_DSRR0, dsrr0);
STACK_PT_REGS_OFFSET(_DSRR1, dsrr1);
#endif
/* About the CPU features table */

View File

@@ -812,7 +812,6 @@ __start_interrupts:
* syscall register convention is in Documentation/powerpc/syscall64-abi.rst
*/
EXC_VIRT_BEGIN(system_call_vectored, 0x3000, 0x1000)
1:
/* SCV 0 */
mr r9,r13
GET_PACA(r13)
@@ -842,10 +841,12 @@ EXC_VIRT_BEGIN(system_call_vectored, 0x3000, 0x1000)
b system_call_vectored_sigill
#endif
.endr
2:
EXC_VIRT_END(system_call_vectored, 0x3000, 0x1000)
SOFT_MASK_TABLE(1b, 2b) // Treat scv vectors as soft-masked, see comment above.
// Treat scv vectors as soft-masked, see comment above.
// Use absolute values rather than labels here, so they don't get relocated,
// because this code runs unrelocated.
SOFT_MASK_TABLE(0xc000000000003000, 0xc000000000004000)
#ifdef CONFIG_RELOCATABLE
TRAMP_VIRT_BEGIN(system_call_vectored_tramp)

View File

@@ -300,7 +300,7 @@ ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
EXCEPTION_PROLOG_1
EXCEPTION_PROLOG_2 INTERRUPT_DATA_STORAGE DataAccess handle_dar_dsisr=1
prepare_transfer_to_handler
lwz r5, _DSISR(r11)
lwz r5, _DSISR(r1)
andis. r0, r5, DSISR_DABRMATCH@h
bne- 1f
bl do_page_fault

View File

@@ -168,20 +168,18 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
/* only on e500mc */
#define DBG_STACK_BASE dbgirq_ctx
#define EXC_LVL_FRAME_OVERHEAD (THREAD_SIZE - INT_FRAME_SIZE - EXC_LVL_SIZE)
#ifdef CONFIG_SMP
#define BOOKE_LOAD_EXC_LEVEL_STACK(level) \
mfspr r8,SPRN_PIR; \
slwi r8,r8,2; \
addis r8,r8,level##_STACK_BASE@ha; \
lwz r8,level##_STACK_BASE@l(r8); \
addi r8,r8,EXC_LVL_FRAME_OVERHEAD;
addi r8,r8,THREAD_SIZE - INT_FRAME_SIZE;
#else
#define BOOKE_LOAD_EXC_LEVEL_STACK(level) \
lis r8,level##_STACK_BASE@ha; \
lwz r8,level##_STACK_BASE@l(r8); \
addi r8,r8,EXC_LVL_FRAME_OVERHEAD;
addi r8,r8,THREAD_SIZE - INT_FRAME_SIZE;
#endif
/*
@@ -208,7 +206,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
mtmsr r11; \
mfspr r11,SPRN_SPRG_THREAD; /* if from user, start at top of */\
lwz r11, TASK_STACK - THREAD(r11); /* this thread's kernel stack */\
addi r11,r11,EXC_LVL_FRAME_OVERHEAD; /* allocate stack frame */\
addi r11,r11,THREAD_SIZE - INT_FRAME_SIZE; /* allocate stack frame */\
beq 1f; \
/* COMING FROM USER MODE */ \
stw r9,_CCR(r11); /* save CR */\
@@ -516,24 +514,5 @@ label:
bl kernel_fp_unavailable_exception; \
b interrupt_return
#else /* __ASSEMBLY__ */
struct exception_regs {
unsigned long mas0;
unsigned long mas1;
unsigned long mas2;
unsigned long mas3;
unsigned long mas6;
unsigned long mas7;
unsigned long srr0;
unsigned long srr1;
unsigned long csrr0;
unsigned long csrr1;
unsigned long dsrr0;
unsigned long dsrr1;
};
/* ensure this structure is always sized to a multiple of the stack alignment */
#define STACK_EXC_LVL_FRAME_SIZE ALIGN(sizeof (struct exception_regs), 16)
#endif /* __ASSEMBLY__ */
#endif /* __HEAD_BOOKE_H__ */

View File

@@ -750,7 +750,7 @@ void __do_irq(struct pt_regs *regs)
trace_irq_exit(regs);
}
DEFINE_INTERRUPT_HANDLER_ASYNC(do_IRQ)
void __do_IRQ(struct pt_regs *regs)
{
struct pt_regs *old_regs = set_irq_regs(regs);
void *cursp, *irqsp, *sirqsp;
@@ -774,6 +774,11 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(do_IRQ)
set_irq_regs(old_regs);
}
DEFINE_INTERRUPT_HANDLER_ASYNC(do_IRQ)
{
__do_IRQ(regs);
}
static void *__init alloc_vm_stack(void)
{
return __vmalloc_node(THREAD_SIZE, THREAD_ALIGN, THREADINFO_GFP,

View File

@@ -292,7 +292,8 @@ int kprobe_handler(struct pt_regs *regs)
if (user_mode(regs))
return 0;
if (!(regs->msr & MSR_IR) || !(regs->msr & MSR_DR))
if (!IS_ENABLED(CONFIG_BOOKE) &&
(!(regs->msr & MSR_IR) || !(regs->msr & MSR_DR)))
return 0;
/*

View File

@@ -1167,7 +1167,7 @@ static int __init topology_init(void)
* CPU. For instance, the boot cpu might never be valid
* for hotplugging.
*/
if (smp_ops->cpu_offline_self)
if (smp_ops && smp_ops->cpu_offline_self)
c->hotpluggable = 1;
#endif

View File

@@ -586,7 +586,7 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
#if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
if (atomic_read(&ppc_n_lost_interrupts) != 0)
do_IRQ(regs);
__do_IRQ(regs);
#endif
old_regs = set_irq_regs(regs);

View File

@@ -1104,7 +1104,7 @@ DEFINE_INTERRUPT_HANDLER(RunModeException)
_exception(SIGTRAP, regs, TRAP_UNK, 0);
}
DEFINE_INTERRUPT_HANDLER(single_step_exception)
static void __single_step_exception(struct pt_regs *regs)
{
clear_single_step(regs);
clear_br_trace(regs);
@@ -1121,6 +1121,11 @@ DEFINE_INTERRUPT_HANDLER(single_step_exception)
_exception(SIGTRAP, regs, TRAP_TRACE, regs->nip);
}
DEFINE_INTERRUPT_HANDLER(single_step_exception)
{
__single_step_exception(regs);
}
/*
* After we have successfully emulated an instruction, we have to
* check if the instruction was being single-stepped, and if so,
@@ -1130,7 +1135,7 @@ DEFINE_INTERRUPT_HANDLER(single_step_exception)
static void emulate_single_step(struct pt_regs *regs)
{
if (single_stepping(regs))
single_step_exception(regs);
__single_step_exception(regs);
}
static inline int __parse_fpscr(unsigned long fpscr)

View File

@@ -18,16 +18,12 @@
/*
* Updates the attributes of a page in three steps:
*
* 1. invalidate the page table entry
* 2. flush the TLB
* 3. install the new entry with the updated attributes
*
* Invalidating the pte means there are situations where this will not work
* when in theory it should.
* For example:
* - removing write from page whilst it is being executed
* - setting a page read-only whilst it is being read by another CPU
* 1. take the page_table_lock
* 2. install the new entry with the updated attributes
* 3. flush the TLB
*
* This sequence is safe against concurrent updates, and also allows updating the
* attributes of a page currently being executed or accessed.
*/
static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
{
@@ -36,9 +32,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
spin_lock(&init_mm.page_table_lock);
/* invalidate the PTE so it's safe to modify */
pte = ptep_get_and_clear(&init_mm, addr, ptep);
flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
pte = ptep_get(ptep);
/* modify the PTE bits as desired, then apply */
switch (action) {
@@ -59,11 +53,14 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
break;
}
set_pte_at(&init_mm, addr, ptep, pte);
pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);
/* See ptesync comment in radix__set_pte_at() */
if (radix_enabled())
asm volatile("ptesync": : :"memory");
flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
spin_unlock(&init_mm.page_table_lock);
return 0;

View File

@@ -98,7 +98,7 @@ config PPC_BOOK3S_64
select PPC_HAVE_PMU_SUPPORT
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select ARCH_ENABLE_HUGEPAGE_MIGRATION if HUGETLB_PAGE && MIGRATION
select ARCH_ENABLE_PMD_SPLIT_PTLOCK
select ARCH_ENABLE_SPLIT_PMD_PTLOCK
select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
select ARCH_SUPPORTS_HUGETLBFS
select ARCH_SUPPORTS_NUMA_BALANCING

View File

@@ -539,9 +539,10 @@ static void init_cpu_char_feature_flags(struct h_cpu_char_result *result)
* H_CPU_BEHAV_FAVOUR_SECURITY_H could be set only if
* H_CPU_BEHAV_FAVOUR_SECURITY is.
*/
if (!(result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY))
if (!(result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY)) {
security_ftr_clear(SEC_FTR_FAVOUR_SECURITY);
else if (result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY_H)
pseries_security_flavor = 0;
} else if (result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY_H)
pseries_security_flavor = 1;
else
pseries_security_flavor = 2;

View File

@@ -67,6 +67,7 @@ static struct irq_domain *xive_irq_domain;
static struct xive_ipi_desc {
unsigned int irq;
char name[16];
atomic_t started;
} *xive_ipis;
/*
@@ -1120,7 +1121,7 @@ static const struct irq_domain_ops xive_ipi_irq_domain_ops = {
.alloc = xive_ipi_irq_domain_alloc,
};
static int __init xive_request_ipi(void)
static int __init xive_init_ipis(void)
{
struct fwnode_handle *fwnode;
struct irq_domain *ipi_domain;
@@ -1144,10 +1145,6 @@ static int __init xive_request_ipi(void)
struct xive_ipi_desc *xid = &xive_ipis[node];
struct xive_ipi_alloc_info info = { node };
/* Skip nodes without CPUs */
if (cpumask_empty(cpumask_of_node(node)))
continue;
/*
* Map one IPI interrupt per node for all cpus of that node.
* Since the HW interrupt number doesn't have any meaning,
@@ -1159,11 +1156,6 @@ static int __init xive_request_ipi(void)
xid->irq = ret;
snprintf(xid->name, sizeof(xid->name), "IPI-%d", node);
ret = request_irq(xid->irq, xive_muxed_ipi_action,
IRQF_PERCPU | IRQF_NO_THREAD, xid->name, NULL);
WARN(ret < 0, "Failed to request IPI %d: %d\n", xid->irq, ret);
}
return ret;
@@ -1178,6 +1170,22 @@ out:
return ret;
}
static int xive_request_ipi(unsigned int cpu)
{
struct xive_ipi_desc *xid = &xive_ipis[early_cpu_to_node(cpu)];
int ret;
if (atomic_inc_return(&xid->started) > 1)
return 0;
ret = request_irq(xid->irq, xive_muxed_ipi_action,
IRQF_PERCPU | IRQF_NO_THREAD,
xid->name, NULL);
WARN(ret < 0, "Failed to request IPI %d: %d\n", xid->irq, ret);
return ret;
}
static int xive_setup_cpu_ipi(unsigned int cpu)
{
unsigned int xive_ipi_irq = xive_ipi_cpu_to_irq(cpu);
@@ -1192,6 +1200,9 @@ static int xive_setup_cpu_ipi(unsigned int cpu)
if (xc->hw_ipi != XIVE_BAD_IRQ)
return 0;
/* Register the IPI */
xive_request_ipi(cpu);
/* Grab an IPI from the backend, this will populate xc->hw_ipi */
if (xive_ops->get_ipi(cpu, xc))
return -EIO;
@@ -1231,6 +1242,8 @@ static void xive_cleanup_cpu_ipi(unsigned int cpu, struct xive_cpu *xc)
if (xc->hw_ipi == XIVE_BAD_IRQ)
return;
/* TODO: clear IPI mapping */
/* Mask the IPI */
xive_do_source_set_mask(&xc->ipi_data, true);
@@ -1253,7 +1266,7 @@ void __init xive_smp_probe(void)
smp_ops->cause_ipi = xive_cause_ipi;
/* Register the IPI */
xive_request_ipi();
xive_init_ipis();
/* Allocate and setup IPI for the boot CPU */
xive_setup_cpu_ipi(smp_processor_id());

View File

@@ -14,6 +14,10 @@
model = "Microchip PolarFire-SoC Icicle Kit";
compatible = "microchip,mpfs-icicle-kit";
aliases {
ethernet0 = &emac1;
};
chosen {
stdout-path = &serial0;
};

View File

@@ -317,7 +317,7 @@
reg = <0x0 0x20112000 0x0 0x2000>;
interrupt-parent = <&plic>;
interrupts = <70 71 72 73>;
mac-address = [00 00 00 00 00 00];
local-mac-address = [00 00 00 00 00 00];
clocks = <&clkcfg 5>, <&clkcfg 2>;
status = "disabled";
clock-names = "pclk", "hclk";

View File

@@ -11,7 +11,7 @@ endif
CFLAGS_syscall_table.o += $(call cc-option,-Wno-override-init,)
ifdef CONFIG_KEXEC
AFLAGS_kexec_relocate.o := -mcmodel=medany -mno-relax
AFLAGS_kexec_relocate.o := -mcmodel=medany $(call cc-option,-mno-relax)
endif
extra-y += head.o

View File

@@ -10,6 +10,7 @@
#include <asm/ptrace.h>
#include <asm/syscall.h>
#include <asm/thread_info.h>
#include <asm/switch_to.h>
#include <linux/audit.h>
#include <linux/ptrace.h>
#include <linux/elf.h>
@@ -56,6 +57,9 @@ static int riscv_fpr_get(struct task_struct *target,
{
struct __riscv_d_ext_state *fstate = &target->thread.fstate;
if (target == current)
fstate_save(current, task_pt_regs(current));
membuf_write(&to, fstate, offsetof(struct __riscv_d_ext_state, fcsr));
membuf_store(&to, fstate->fcsr);
return membuf_zero(&to, 4); // explicitly pad

View File

@@ -229,8 +229,8 @@ static void __init init_resources(void)
}
/* Clean-up any unused pre-allocated resources */
mem_res_sz = (num_resources - res_idx + 1) * sizeof(*mem_res);
memblock_free(__pa(mem_res), mem_res_sz);
if (res_idx >= 0)
memblock_free(__pa(mem_res), (res_idx + 1) * sizeof(*mem_res));
return;
error:

View File

@@ -197,7 +197,7 @@ static void __init setup_bootmem(void)
* if end of dram is equal to maximum addressable memory. For 64-bit
* kernel, this problem can't happen here as the end of the virtual
* address space is occupied by the kernel mapping then this check must
* be done in create_kernel_page_table.
* be done as soon as the kernel mapping base address is determined.
*/
max_mapped_addr = __pa(~(ulong)0);
if (max_mapped_addr == (phys_ram_end - 1))

View File

@@ -560,9 +560,12 @@ static void zpci_cleanup_bus_resources(struct zpci_dev *zdev)
int pcibios_add_device(struct pci_dev *pdev)
{
struct zpci_dev *zdev = to_zpci(pdev);
struct resource *res;
int i;
/* The pdev has a reference to the zdev via its bus */
zpci_zdev_get(zdev);
if (pdev->is_physfn)
pdev->no_vf_scan = 1;
@@ -582,7 +585,10 @@ int pcibios_add_device(struct pci_dev *pdev)
void pcibios_release_device(struct pci_dev *pdev)
{
struct zpci_dev *zdev = to_zpci(pdev);
zpci_unmap_resources(pdev);
zpci_zdev_put(zdev);
}
int pcibios_enable_device(struct pci_dev *pdev, int mask)

View File

@@ -22,6 +22,11 @@ static inline void zpci_zdev_put(struct zpci_dev *zdev)
kref_put(&zdev->kref, zpci_release_device);
}
static inline void zpci_zdev_get(struct zpci_dev *zdev)
{
kref_get(&zdev->kref);
}
int zpci_alloc_domain(int domain);
void zpci_free_domain(int domain);
int zpci_setup_bus_resources(struct zpci_dev *zdev,

View File

@@ -5,9 +5,8 @@
* Early support for invoking 32-bit EFI services from a 64-bit kernel.
*
* Because this thunking occurs before ExitBootServices() we have to
* restore the firmware's 32-bit GDT before we make EFI service calls,
* since the firmware's 32-bit IDT is still currently installed and it
* needs to be able to service interrupts.
* restore the firmware's 32-bit GDT and IDT before we make EFI service
* calls.
*
* On the plus side, we don't have to worry about mangling 64-bit
* addresses into 32-bits because we're executing with an identity
@@ -39,7 +38,7 @@ SYM_FUNC_START(__efi64_thunk)
/*
* Convert x86-64 ABI params to i386 ABI
*/
subq $32, %rsp
subq $64, %rsp
movl %esi, 0x0(%rsp)
movl %edx, 0x4(%rsp)
movl %ecx, 0x8(%rsp)
@@ -49,14 +48,19 @@ SYM_FUNC_START(__efi64_thunk)
leaq 0x14(%rsp), %rbx
sgdt (%rbx)
addq $16, %rbx
sidt (%rbx)
/*
* Switch to gdt with 32-bit segments. This is the firmware GDT
* that was installed when the kernel started executing. This
* pointer was saved at the EFI stub entry point in head_64.S.
* Switch to IDT and GDT with 32-bit segments. This is the firmware GDT
* and IDT that was installed when the kernel started executing. The
* pointers were saved at the EFI stub entry point in head_64.S.
*
* Pass the saved DS selector to the 32-bit code, and use far return to
* restore the saved CS selector.
*/
leaq efi32_boot_idt(%rip), %rax
lidt (%rax)
leaq efi32_boot_gdt(%rip), %rax
lgdt (%rax)
@@ -67,7 +71,7 @@ SYM_FUNC_START(__efi64_thunk)
pushq %rax
lretq
1: addq $32, %rsp
1: addq $64, %rsp
movq %rdi, %rax
pop %rbx
@@ -128,10 +132,13 @@ SYM_FUNC_START_LOCAL(efi_enter32)
/*
* Some firmware will return with interrupts enabled. Be sure to
* disable them before we switch GDTs.
* disable them before we switch GDTs and IDTs.
*/
cli
lidtl (%ebx)
subl $16, %ebx
lgdtl (%ebx)
movl %cr4, %eax
@@ -166,6 +173,11 @@ SYM_DATA_START(efi32_boot_gdt)
.quad 0
SYM_DATA_END(efi32_boot_gdt)
SYM_DATA_START(efi32_boot_idt)
.word 0
.quad 0
SYM_DATA_END(efi32_boot_idt)
SYM_DATA_START(efi32_boot_cs)
.word 0
SYM_DATA_END(efi32_boot_cs)

View File

@@ -319,6 +319,9 @@ SYM_INNER_LABEL(efi32_pe_stub_entry, SYM_L_LOCAL)
movw %cs, rva(efi32_boot_cs)(%ebp)
movw %ds, rva(efi32_boot_ds)(%ebp)
/* Store firmware IDT descriptor */
sidtl rva(efi32_boot_idt)(%ebp)
/* Disable paging */
movl %cr0, %eax
btrl $X86_CR0_PG_BIT, %eax

View File

@@ -90,6 +90,7 @@ struct perf_ibs {
unsigned long offset_mask[1];
int offset_max;
unsigned int fetch_count_reset_broken : 1;
unsigned int fetch_ignore_if_zero_rip : 1;
struct cpu_perf_ibs __percpu *pcpu;
struct attribute **format_attrs;
@@ -570,6 +571,7 @@ static struct perf_ibs perf_ibs_op = {
.start = perf_ibs_start,
.stop = perf_ibs_stop,
.read = perf_ibs_read,
.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
},
.msr = MSR_AMD64_IBSOPCTL,
.config_mask = IBS_OP_CONFIG_MASK,
@@ -672,6 +674,10 @@ fail:
if (check_rip && (ibs_data.regs[2] & IBS_RIP_INVALID)) {
regs.flags &= ~PERF_EFLAGS_EXACT;
} else {
/* Workaround for erratum #1197 */
if (perf_ibs->fetch_ignore_if_zero_rip && !(ibs_data.regs[1]))
goto out;
set_linear_ip(&regs, ibs_data.regs[1]);
regs.flags |= PERF_EFLAGS_EXACT;
}
@@ -769,6 +775,9 @@ static __init void perf_event_ibs_init(void)
if (boot_cpu_data.x86 >= 0x16 && boot_cpu_data.x86 <= 0x18)
perf_ibs_fetch.fetch_count_reset_broken = 1;
if (boot_cpu_data.x86 == 0x19 && boot_cpu_data.x86_model < 0x10)
perf_ibs_fetch.fetch_ignore_if_zero_rip = 1;
perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch");
if (ibs_caps & IBS_CAPS_OPCNT) {

View File

@@ -213,6 +213,7 @@ static struct pmu pmu_class = {
.stop = pmu_event_stop,
.read = pmu_event_read,
.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
.module = THIS_MODULE,
};
static int power_cpu_exit(unsigned int cpu)

View File

@@ -62,7 +62,7 @@ static struct pt_cap_desc {
PT_CAP(single_range_output, 0, CPUID_ECX, BIT(2)),
PT_CAP(output_subsys, 0, CPUID_ECX, BIT(3)),
PT_CAP(payloads_lip, 0, CPUID_ECX, BIT(31)),
PT_CAP(num_address_ranges, 1, CPUID_EAX, 0x3),
PT_CAP(num_address_ranges, 1, CPUID_EAX, 0x7),
PT_CAP(mtc_periods, 1, CPUID_EAX, 0xffff0000),
PT_CAP(cycle_thresholds, 1, CPUID_EBX, 0xffff),
PT_CAP(psb_periods, 1, CPUID_EBX, 0xffff0000),

View File

@@ -4811,7 +4811,7 @@ static void __snr_uncore_mmio_init_box(struct intel_uncore_box *box,
return;
pci_read_config_dword(pdev, SNR_IMC_MMIO_BASE_OFFSET, &pci_dword);
addr = (pci_dword & SNR_IMC_MMIO_BASE_MASK) << 23;
addr = ((resource_size_t)pci_dword & SNR_IMC_MMIO_BASE_MASK) << 23;
pci_read_config_dword(pdev, mem_offset, &pci_dword);
addr |= (pci_dword & SNR_IMC_MMIO_MEM0_MASK) << 12;

View File

@@ -1038,6 +1038,13 @@ struct kvm_arch {
struct list_head lpage_disallowed_mmu_pages;
struct kvm_page_track_notifier_node mmu_sp_tracker;
struct kvm_page_track_notifier_head track_notifier_head;
/*
* Protects marking pages unsync during page faults, as TDP MMU page
* faults only take mmu_lock for read. For simplicity, the unsync
* pages lock is always taken when marking pages unsync regardless of
* whether mmu_lock is held for read or write.
*/
spinlock_t mmu_unsync_pages_lock;
struct list_head assigned_dev_head;
struct iommu_domain *iommu_domain;

View File

@@ -184,6 +184,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define V_IGN_TPR_SHIFT 20
#define V_IGN_TPR_MASK (1 << V_IGN_TPR_SHIFT)
#define V_IRQ_INJECTION_BITS_MASK (V_IRQ_MASK | V_INTR_PRIO_MASK | V_IGN_TPR_MASK)
#define V_INTR_MASKING_SHIFT 24
#define V_INTR_MASKING_MASK (1 << V_INTR_MASKING_SHIFT)

View File

@@ -1986,7 +1986,8 @@ static struct irq_chip ioapic_chip __read_mostly = {
.irq_set_affinity = ioapic_set_affinity,
.irq_retrigger = irq_chip_retrigger_hierarchy,
.irq_get_irqchip_state = ioapic_irq_get_chip_state,
.flags = IRQCHIP_SKIP_SET_WAKE,
.flags = IRQCHIP_SKIP_SET_WAKE |
IRQCHIP_AFFINITY_PRE_STARTUP,
};
static struct irq_chip ioapic_ir_chip __read_mostly = {
@@ -1999,7 +2000,8 @@ static struct irq_chip ioapic_ir_chip __read_mostly = {
.irq_set_affinity = ioapic_set_affinity,
.irq_retrigger = irq_chip_retrigger_hierarchy,
.irq_get_irqchip_state = ioapic_irq_get_chip_state,
.flags = IRQCHIP_SKIP_SET_WAKE,
.flags = IRQCHIP_SKIP_SET_WAKE |
IRQCHIP_AFFINITY_PRE_STARTUP,
};
static inline void init_IO_APIC_traps(void)

View File

@@ -58,11 +58,13 @@ msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force)
* The quirk bit is not set in this case.
* - The new vector is the same as the old vector
* - The old vector is MANAGED_IRQ_SHUTDOWN_VECTOR (interrupt starts up)
* - The interrupt is not yet started up
* - The new destination CPU is the same as the old destination CPU
*/
if (!irqd_msi_nomask_quirk(irqd) ||
cfg->vector == old_cfg.vector ||
old_cfg.vector == MANAGED_IRQ_SHUTDOWN_VECTOR ||
!irqd_is_started(irqd) ||
cfg->dest_apicid == old_cfg.dest_apicid) {
irq_msi_update_msg(irqd, cfg);
return ret;
@@ -150,7 +152,8 @@ static struct irq_chip pci_msi_controller = {
.irq_ack = irq_chip_ack_parent,
.irq_retrigger = irq_chip_retrigger_hierarchy,
.irq_set_affinity = msi_set_affinity,
.flags = IRQCHIP_SKIP_SET_WAKE,
.flags = IRQCHIP_SKIP_SET_WAKE |
IRQCHIP_AFFINITY_PRE_STARTUP,
};
int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
@@ -219,7 +222,8 @@ static struct irq_chip pci_msi_ir_controller = {
.irq_mask = pci_msi_mask_irq,
.irq_ack = irq_chip_ack_parent,
.irq_retrigger = irq_chip_retrigger_hierarchy,
.flags = IRQCHIP_SKIP_SET_WAKE,
.flags = IRQCHIP_SKIP_SET_WAKE |
IRQCHIP_AFFINITY_PRE_STARTUP,
};
static struct msi_domain_info pci_msi_ir_domain_info = {
@@ -273,7 +277,8 @@ static struct irq_chip dmar_msi_controller = {
.irq_retrigger = irq_chip_retrigger_hierarchy,
.irq_compose_msi_msg = dmar_msi_compose_msg,
.irq_write_msi_msg = dmar_msi_write_msg,
.flags = IRQCHIP_SKIP_SET_WAKE,
.flags = IRQCHIP_SKIP_SET_WAKE |
IRQCHIP_AFFINITY_PRE_STARTUP,
};
static int dmar_msi_init(struct irq_domain *domain,

View File

@@ -285,15 +285,14 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
return chunks >>= shift;
}
static int __mon_event_count(u32 rmid, struct rmid_read *rr)
static u64 __mon_event_count(u32 rmid, struct rmid_read *rr)
{
struct mbm_state *m;
u64 chunks, tval;
tval = __rmid_read(rmid, rr->evtid);
if (tval & (RMID_VAL_ERROR | RMID_VAL_UNAVAIL)) {
rr->val = tval;
return -EINVAL;
return tval;
}
switch (rr->evtid) {
case QOS_L3_OCCUP_EVENT_ID:
@@ -307,10 +306,10 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
break;
default:
/*
* Code would never reach here because
* an invalid event id would fail the __rmid_read.
* Code would never reach here because an invalid
* event id would fail the __rmid_read.
*/
return -EINVAL;
return RMID_VAL_ERROR;
}
if (rr->first) {
@@ -361,23 +360,29 @@ void mon_event_count(void *info)
struct rdtgroup *rdtgrp, *entry;
struct rmid_read *rr = info;
struct list_head *head;
u64 ret_val;
rdtgrp = rr->rgrp;
if (__mon_event_count(rdtgrp->mon.rmid, rr))
return;
ret_val = __mon_event_count(rdtgrp->mon.rmid, rr);
/*
* For Ctrl groups read data from child monitor groups.
* For Ctrl groups read data from child monitor groups and
* add them together. Count events which are read successfully.
* Discard the rmid_read's reporting errors.
*/
head = &rdtgrp->mon.crdtgrp_list;
if (rdtgrp->type == RDTCTRL_GROUP) {
list_for_each_entry(entry, head, mon.crdtgrp_list) {
if (__mon_event_count(entry->mon.rmid, rr))
return;
if (__mon_event_count(entry->mon.rmid, rr) == 0)
ret_val = 0;
}
}
/* Report error if none of rmid_reads are successful */
if (ret_val)
rr->val = ret_val;
}
/*

View File

@@ -508,7 +508,7 @@ static struct irq_chip hpet_msi_controller __ro_after_init = {
.irq_set_affinity = msi_domain_set_affinity,
.irq_retrigger = irq_chip_retrigger_hierarchy,
.irq_write_msi_msg = hpet_msi_write_msg,
.flags = IRQCHIP_SKIP_SET_WAKE,
.flags = IRQCHIP_SKIP_SET_WAKE | IRQCHIP_AFFINITY_PRE_STARTUP,
};
static int hpet_msi_init(struct irq_domain *domain,

View File

@@ -208,30 +208,6 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
kvm_mmu_after_set_cpuid(vcpu);
}
static int is_efer_nx(void)
{
return host_efer & EFER_NX;
}
static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
{
int i;
struct kvm_cpuid_entry2 *e, *entry;
entry = NULL;
for (i = 0; i < vcpu->arch.cpuid_nent; ++i) {
e = &vcpu->arch.cpuid_entries[i];
if (e->function == 0x80000001) {
entry = e;
break;
}
}
if (entry && cpuid_entry_has(entry, X86_FEATURE_NX) && !is_efer_nx()) {
cpuid_entry_clear(entry, X86_FEATURE_NX);
printk(KERN_INFO "kvm: guest NX capability removed\n");
}
}
int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu)
{
struct kvm_cpuid_entry2 *best;
@@ -302,7 +278,6 @@ int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
vcpu->arch.cpuid_entries = e2;
vcpu->arch.cpuid_nent = cpuid->nent;
cpuid_fix_nx_cap(vcpu);
kvm_update_cpuid_runtime(vcpu);
kvm_vcpu_after_set_cpuid(vcpu);
@@ -401,7 +376,6 @@ static __always_inline void kvm_cpu_cap_mask(enum cpuid_leafs leaf, u32 mask)
void kvm_set_cpu_caps(void)
{
unsigned int f_nx = is_efer_nx() ? F(NX) : 0;
#ifdef CONFIG_X86_64
unsigned int f_gbpages = F(GBPAGES);
unsigned int f_lm = F(LM);
@@ -515,7 +489,7 @@ void kvm_set_cpu_caps(void)
F(CX8) | F(APIC) | 0 /* Reserved */ | F(SYSCALL) |
F(MTRR) | F(PGE) | F(MCA) | F(CMOV) |
F(PAT) | F(PSE36) | 0 /* Reserved */ |
f_nx | 0 /* Reserved */ | F(MMXEXT) | F(MMX) |
F(NX) | 0 /* Reserved */ | F(MMXEXT) | F(MMX) |
F(FXSR) | F(FXSR_OPT) | f_gbpages | F(RDTSCP) |
0 /* Reserved */ | f_lm | F(3DNOWEXT) | F(3DNOW)
);

View File

@@ -1933,7 +1933,7 @@ ret_success:
void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
{
struct kvm_cpuid_entry2 *entry;
struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
struct kvm_vcpu_hv *hv_vcpu;
entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_INTERFACE, 0);
if (entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX) {

View File

@@ -2535,6 +2535,7 @@ static void kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
{
struct kvm_mmu_page *sp;
bool locked = false;
/*
* Force write-protection if the page is being tracked. Note, the page
@@ -2557,9 +2558,34 @@ int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
if (sp->unsync)
continue;
/*
* TDP MMU page faults require an additional spinlock as they
* run with mmu_lock held for read, not write, and the unsync
* logic is not thread safe. Take the spinklock regardless of
* the MMU type to avoid extra conditionals/parameters, there's
* no meaningful penalty if mmu_lock is held for write.
*/
if (!locked) {
locked = true;
spin_lock(&vcpu->kvm->arch.mmu_unsync_pages_lock);
/*
* Recheck after taking the spinlock, a different vCPU
* may have since marked the page unsync. A false
* positive on the unprotected check above is not
* possible as clearing sp->unsync _must_ hold mmu_lock
* for write, i.e. unsync cannot transition from 0->1
* while this CPU holds mmu_lock for read (or write).
*/
if (READ_ONCE(sp->unsync))
continue;
}
WARN_ON(sp->role.level != PG_LEVEL_4K);
kvm_unsync_page(vcpu, sp);
}
if (locked)
spin_unlock(&vcpu->kvm->arch.mmu_unsync_pages_lock);
/*
* We need to ensure that the marking of unsync pages is visible
@@ -5537,6 +5563,8 @@ void kvm_mmu_init_vm(struct kvm *kvm)
{
struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
spin_lock_init(&kvm->arch.mmu_unsync_pages_lock);
if (!kvm_mmu_init_tdp_mmu(kvm))
/*
* No smp_load/store wrappers needed here as we are in

View File

@@ -43,6 +43,7 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
if (!kvm->arch.tdp_mmu_enabled)
return;
WARN_ON(!list_empty(&kvm->arch.tdp_mmu_pages));
WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
/*
@@ -81,8 +82,6 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
bool shared)
{
gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT);
kvm_lockdep_assert_mmu_lock_held(kvm, shared);
if (!refcount_dec_and_test(&root->tdp_mmu_root_count))
@@ -94,7 +93,7 @@ void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
list_del_rcu(&root->link);
spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
zap_gfn_range(kvm, root, 0, max_gfn, false, false, shared);
zap_gfn_range(kvm, root, 0, -1ull, false, false, shared);
call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
}
@@ -724,13 +723,29 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
gfn_t start, gfn_t end, bool can_yield, bool flush,
bool shared)
{
gfn_t max_gfn_host = 1ULL << (shadow_phys_bits - PAGE_SHIFT);
bool zap_all = (start == 0 && end >= max_gfn_host);
struct tdp_iter iter;
/*
* No need to try to step down in the iterator when zapping all SPTEs,
* zapping the top-level non-leaf SPTEs will recurse on their children.
*/
int min_level = zap_all ? root->role.level : PG_LEVEL_4K;
/*
* Bound the walk at host.MAXPHYADDR, guest accesses beyond that will
* hit a #PF(RSVD) and never get to an EPT Violation/Misconfig / #NPF,
* and so KVM will never install a SPTE for such addresses.
*/
end = min(end, max_gfn_host);
kvm_lockdep_assert_mmu_lock_held(kvm, shared);
rcu_read_lock();
tdp_root_for_each_pte(iter, root, start, end) {
for_each_tdp_pte_min_level(iter, root->spt, root->role.level,
min_level, start, end) {
retry:
if (can_yield &&
tdp_mmu_iter_cond_resched(kvm, &iter, flush, shared)) {
@@ -744,9 +759,10 @@ retry:
/*
* If this is a non-last-level SPTE that covers a larger range
* than should be zapped, continue, and zap the mappings at a
* lower level.
* lower level, except when zapping all SPTEs.
*/
if ((iter.gfn < start ||
if (!zap_all &&
(iter.gfn < start ||
iter.gfn + KVM_PAGES_PER_HPAGE(iter.level) > end) &&
!is_last_spte(iter.old_spte, iter.level))
continue;
@@ -794,12 +810,11 @@ bool __kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, int as_id, gfn_t start,
void kvm_tdp_mmu_zap_all(struct kvm *kvm)
{
gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT);
bool flush = false;
int i;
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
flush = kvm_tdp_mmu_zap_gfn_range(kvm, i, 0, max_gfn,
flush = kvm_tdp_mmu_zap_gfn_range(kvm, i, 0, -1ull,
flush, false);
if (flush)
@@ -838,7 +853,6 @@ static struct kvm_mmu_page *next_invalidated_root(struct kvm *kvm,
*/
void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
{
gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT);
struct kvm_mmu_page *next_root;
struct kvm_mmu_page *root;
bool flush = false;
@@ -854,8 +868,7 @@ void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
rcu_read_unlock();
flush = zap_gfn_range(kvm, root, 0, max_gfn, true, flush,
true);
flush = zap_gfn_range(kvm, root, 0, -1ull, true, flush, true);
/*
* Put the reference acquired in

View File

@@ -158,6 +158,9 @@ void recalc_intercepts(struct vcpu_svm *svm)
/* If SMI is not intercepted, ignore guest SMI intercept as well */
if (!intercept_smi)
vmcb_clr_intercept(c, INTERCEPT_SMI);
vmcb_set_intercept(c, INTERCEPT_VMLOAD);
vmcb_set_intercept(c, INTERCEPT_VMSAVE);
}
static void copy_vmcb_control_area(struct vmcb_control_area *dst,
@@ -503,7 +506,11 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
{
const u32 mask = V_INTR_MASKING_MASK | V_GIF_ENABLE_MASK | V_GIF_MASK;
const u32 int_ctl_vmcb01_bits =
V_INTR_MASKING_MASK | V_GIF_MASK | V_GIF_ENABLE_MASK;
const u32 int_ctl_vmcb12_bits = V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK;
struct kvm_vcpu *vcpu = &svm->vcpu;
/*
@@ -535,8 +542,8 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
vcpu->arch.l1_tsc_offset + svm->nested.ctl.tsc_offset;
svm->vmcb->control.int_ctl =
(svm->nested.ctl.int_ctl & ~mask) |
(svm->vmcb01.ptr->control.int_ctl & mask);
(svm->nested.ctl.int_ctl & int_ctl_vmcb12_bits) |
(svm->vmcb01.ptr->control.int_ctl & int_ctl_vmcb01_bits);
svm->vmcb->control.virt_ext = svm->nested.ctl.virt_ext;
svm->vmcb->control.int_vector = svm->nested.ctl.int_vector;

View File

@@ -1589,17 +1589,18 @@ static void svm_set_vintr(struct vcpu_svm *svm)
static void svm_clear_vintr(struct vcpu_svm *svm)
{
const u32 mask = V_TPR_MASK | V_GIF_ENABLE_MASK | V_GIF_MASK | V_INTR_MASKING_MASK;
svm_clr_intercept(svm, INTERCEPT_VINTR);
/* Drop int_ctl fields related to VINTR injection. */
svm->vmcb->control.int_ctl &= mask;
svm->vmcb->control.int_ctl &= ~V_IRQ_INJECTION_BITS_MASK;
if (is_guest_mode(&svm->vcpu)) {
svm->vmcb01.ptr->control.int_ctl &= mask;
svm->vmcb01.ptr->control.int_ctl &= ~V_IRQ_INJECTION_BITS_MASK;
WARN_ON((svm->vmcb->control.int_ctl & V_TPR_MASK) !=
(svm->nested.ctl.int_ctl & V_TPR_MASK));
svm->vmcb->control.int_ctl |= svm->nested.ctl.int_ctl & ~mask;
svm->vmcb->control.int_ctl |= svm->nested.ctl.int_ctl &
V_IRQ_INJECTION_BITS_MASK;
}
vmcb_mark_dirty(svm->vmcb, VMCB_INTR);

View File

@@ -330,6 +330,31 @@ void nested_vmx_free_vcpu(struct kvm_vcpu *vcpu)
vcpu_put(vcpu);
}
#define EPTP_PA_MASK GENMASK_ULL(51, 12)
static bool nested_ept_root_matches(hpa_t root_hpa, u64 root_eptp, u64 eptp)
{
return VALID_PAGE(root_hpa) &&
((root_eptp & EPTP_PA_MASK) == (eptp & EPTP_PA_MASK));
}
static void nested_ept_invalidate_addr(struct kvm_vcpu *vcpu, gpa_t eptp,
gpa_t addr)
{
uint i;
struct kvm_mmu_root_info *cached_root;
WARN_ON_ONCE(!mmu_is_nested(vcpu));
for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
cached_root = &vcpu->arch.mmu->prev_roots[i];
if (nested_ept_root_matches(cached_root->hpa, cached_root->pgd,
eptp))
vcpu->arch.mmu->invlpg(vcpu, addr, cached_root->hpa);
}
}
static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
struct x86_exception *fault)
{
@@ -342,10 +367,22 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
vm_exit_reason = EXIT_REASON_PML_FULL;
vmx->nested.pml_full = false;
exit_qualification &= INTR_INFO_UNBLOCK_NMI;
} else if (fault->error_code & PFERR_RSVD_MASK)
vm_exit_reason = EXIT_REASON_EPT_MISCONFIG;
else
vm_exit_reason = EXIT_REASON_EPT_VIOLATION;
} else {
if (fault->error_code & PFERR_RSVD_MASK)
vm_exit_reason = EXIT_REASON_EPT_MISCONFIG;
else
vm_exit_reason = EXIT_REASON_EPT_VIOLATION;
/*
* Although the caller (kvm_inject_emulated_page_fault) would
* have already synced the faulting address in the shadow EPT
* tables for the current EPTP12, we also need to sync it for
* any other cached EPTP02s based on the same EP4TA, since the
* TLB associates mappings to the EP4TA rather than the full EPTP.
*/
nested_ept_invalidate_addr(vcpu, vmcs12->ept_pointer,
fault->address);
}
nested_vmx_vmexit(vcpu, vm_exit_reason, 0, exit_qualification);
vmcs12->guest_physical_address = fault->address;
@@ -5325,14 +5362,6 @@ static int handle_vmptrst(struct kvm_vcpu *vcpu)
return nested_vmx_succeed(vcpu);
}
#define EPTP_PA_MASK GENMASK_ULL(51, 12)
static bool nested_ept_root_matches(hpa_t root_hpa, u64 root_eptp, u64 eptp)
{
return VALID_PAGE(root_hpa) &&
((root_eptp & EPTP_PA_MASK) == (eptp & EPTP_PA_MASK));
}
/* Emulate the INVEPT instruction */
static int handle_invept(struct kvm_vcpu *vcpu)
{
@@ -5826,7 +5855,8 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu,
if (is_nmi(intr_info))
return true;
else if (is_page_fault(intr_info))
return vcpu->arch.apf.host_apf_flags || !enable_ept;
return vcpu->arch.apf.host_apf_flags ||
vmx_need_pf_intercept(vcpu);
else if (is_debug(intr_info) &&
vcpu->guest_debug &
(KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))

View File

@@ -522,7 +522,7 @@ static inline struct vmcs *alloc_vmcs(bool shadow)
static inline bool vmx_has_waitpkg(struct vcpu_vmx *vmx)
{
return vmx->secondary_exec_control &
return secondary_exec_controls_get(vmx) &
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
}

View File

@@ -10,6 +10,7 @@ BEGIN {
/^GNU objdump/ {
verstr = ""
gsub(/\(.*\)/, "");
for (i = 3; i <= NF; i++)
if (match($(i), "^[0-9]")) {
verstr = $(i);

View File

@@ -9,12 +9,6 @@ config MQ_IOSCHED_DEADLINE
help
MQ version of the deadline IO scheduler.
config MQ_IOSCHED_DEADLINE_CGROUP
tristate
default y
depends on MQ_IOSCHED_DEADLINE
depends on BLK_CGROUP
config MQ_IOSCHED_KYBER
tristate "Kyber I/O scheduler"
default y

View File

@@ -22,8 +22,6 @@ obj-$(CONFIG_BLK_CGROUP_IOPRIO) += blk-ioprio.o
obj-$(CONFIG_BLK_CGROUP_IOLATENCY) += blk-iolatency.o
obj-$(CONFIG_BLK_CGROUP_IOCOST) += blk-iocost.o
obj-$(CONFIG_MQ_IOSCHED_DEADLINE) += mq-deadline.o
mq-deadline-y += mq-deadline-main.o
mq-deadline-$(CONFIG_MQ_IOSCHED_DEADLINE_CGROUP)+= mq-deadline-cgroup.o
obj-$(CONFIG_MQ_IOSCHED_KYBER) += kyber-iosched.o
bfq-y := bfq-iosched.o bfq-wf2q.o bfq-cgroup.o
obj-$(CONFIG_IOSCHED_BFQ) += bfq.o

View File

@@ -790,6 +790,7 @@ static void blkcg_rstat_flush(struct cgroup_subsys_state *css, int cpu)
struct blkcg_gq *parent = blkg->parent;
struct blkg_iostat_set *bisc = per_cpu_ptr(blkg->iostat_cpu, cpu);
struct blkg_iostat cur, delta;
unsigned long flags;
unsigned int seq;
/* fetch the current per-cpu values */
@@ -799,21 +800,21 @@ static void blkcg_rstat_flush(struct cgroup_subsys_state *css, int cpu)
} while (u64_stats_fetch_retry(&bisc->sync, seq));
/* propagate percpu delta to global */
u64_stats_update_begin(&blkg->iostat.sync);
flags = u64_stats_update_begin_irqsave(&blkg->iostat.sync);
blkg_iostat_set(&delta, &cur);
blkg_iostat_sub(&delta, &bisc->last);
blkg_iostat_add(&blkg->iostat.cur, &delta);
blkg_iostat_add(&bisc->last, &delta);
u64_stats_update_end(&blkg->iostat.sync);
u64_stats_update_end_irqrestore(&blkg->iostat.sync, flags);
/* propagate global delta to parent (unless that's root) */
if (parent && parent->parent) {
u64_stats_update_begin(&parent->iostat.sync);
flags = u64_stats_update_begin_irqsave(&parent->iostat.sync);
blkg_iostat_set(&delta, &blkg->iostat.cur);
blkg_iostat_sub(&delta, &blkg->iostat.last);
blkg_iostat_add(&parent->iostat.cur, &delta);
blkg_iostat_add(&blkg->iostat.last, &delta);
u64_stats_update_end(&parent->iostat.sync);
u64_stats_update_end_irqrestore(&parent->iostat.sync, flags);
}
}
@@ -848,6 +849,7 @@ static void blkcg_fill_root_iostats(void)
memset(&tmp, 0, sizeof(tmp));
for_each_possible_cpu(cpu) {
struct disk_stats *cpu_dkstats;
unsigned long flags;
cpu_dkstats = per_cpu_ptr(bdev->bd_stats, cpu);
tmp.ios[BLKG_IOSTAT_READ] +=
@@ -864,9 +866,9 @@ static void blkcg_fill_root_iostats(void)
tmp.bytes[BLKG_IOSTAT_DISCARD] +=
cpu_dkstats->sectors[STAT_DISCARD] << 9;
u64_stats_update_begin(&blkg->iostat.sync);
flags = u64_stats_update_begin_irqsave(&blkg->iostat.sync);
blkg_iostat_set(&blkg->iostat.cur, &tmp);
u64_stats_update_end(&blkg->iostat.sync);
u64_stats_update_end_irqrestore(&blkg->iostat.sync, flags);
}
}
}

View File

@@ -122,7 +122,6 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
rq->internal_tag = BLK_MQ_NO_TAG;
rq->start_time_ns = ktime_get_ns();
rq->part = NULL;
refcount_set(&rq->ref, 1);
blk_crypto_rq_set_defaults(rq);
}
EXPORT_SYMBOL(blk_rq_init);

View File

@@ -262,6 +262,11 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
}
bool is_flush_rq(struct request *rq)
{
return rq->end_io == flush_end_io;
}
/**
* blk_kick_flush - consider issuing flush request
* @q: request_queue being kicked
@@ -329,6 +334,14 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
flush_rq->rq_flags |= RQF_FLUSH_SEQ;
flush_rq->rq_disk = first_rq->rq_disk;
flush_rq->end_io = flush_end_io;
/*
* Order WRITE ->end_io and WRITE rq->ref, and its pair is the one
* implied in refcount_inc_not_zero() called from
* blk_mq_find_and_get_req(), which orders WRITE/READ flush_rq->ref
* and READ flush_rq->end_io
*/
smp_wmb();
refcount_set(&flush_rq->ref, 1);
blk_flush_queue_rq(flush_rq, false);
}

View File

@@ -3061,19 +3061,19 @@ static ssize_t ioc_weight_write(struct kernfs_open_file *of, char *buf,
if (v < CGROUP_WEIGHT_MIN || v > CGROUP_WEIGHT_MAX)
return -EINVAL;
spin_lock(&blkcg->lock);
spin_lock_irq(&blkcg->lock);
iocc->dfl_weight = v * WEIGHT_ONE;
hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
struct ioc_gq *iocg = blkg_to_iocg(blkg);
if (iocg) {
spin_lock_irq(&iocg->ioc->lock);
spin_lock(&iocg->ioc->lock);
ioc_now(iocg->ioc, &now);
weight_updated(iocg, &now);
spin_unlock_irq(&iocg->ioc->lock);
spin_unlock(&iocg->ioc->lock);
}
}
spin_unlock(&blkcg->lock);
spin_unlock_irq(&blkcg->lock);
return nbytes;
}

View File

@@ -911,7 +911,7 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
void blk_mq_put_rq_ref(struct request *rq)
{
if (is_flush_rq(rq, rq->mq_hctx))
if (is_flush_rq(rq))
rq->end_io(rq, 0);
else if (refcount_dec_and_test(&rq->ref))
__blk_mq_free_request(rq);
@@ -923,34 +923,14 @@ static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
unsigned long *next = priv;
/*
* Just do a quick check if it is expired before locking the request in
* so we're not unnecessarilly synchronizing across CPUs.
*/
if (!blk_mq_req_expired(rq, next))
return true;
/*
* We have reason to believe the request may be expired. Take a
* reference on the request to lock this request lifetime into its
* currently allocated context to prevent it from being reallocated in
* the event the completion by-passes this timeout handler.
*
* If the reference was already released, then the driver beat the
* timeout handler to posting a natural completion.
*/
if (!refcount_inc_not_zero(&rq->ref))
return true;
/*
* The request is now locked and cannot be reallocated underneath the
* timeout handler's processing. Re-verify this exact request is truly
* expired; if it is not expired, then the request was completed and
* reallocated as a new request.
* blk_mq_queue_tag_busy_iter() has locked the request, so it cannot
* be reallocated underneath the timeout handler's processing, then
* the expire check is reliable. If the request is not expired, then
* it was completed and reallocated as a new request after returning
* from blk_mq_check_expired().
*/
if (blk_mq_req_expired(rq, next))
blk_mq_rq_timed_out(rq, reserved);
blk_mq_put_rq_ref(rq);
return true;
}
@@ -2994,10 +2974,12 @@ static void queue_set_hctx_shared(struct request_queue *q, bool shared)
int i;
queue_for_each_hw_ctx(q, hctx, i) {
if (shared)
if (shared) {
hctx->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
else
} else {
blk_mq_tag_idle(hctx);
hctx->flags &= ~BLK_MQ_F_TAG_QUEUE_SHARED;
}
}
}

View File

@@ -44,11 +44,7 @@ static inline void __blk_get_queue(struct request_queue *q)
kobject_get(&q->kobj);
}
static inline bool
is_flush_rq(struct request *req, struct blk_mq_hw_ctx *hctx)
{
return hctx->fq->flush_rq == req;
}
bool is_flush_rq(struct request *req);
struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
gfp_t flags);

View File

@@ -1,126 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/blk-cgroup.h>
#include <linux/ioprio.h>
#include "mq-deadline-cgroup.h"
static struct blkcg_policy dd_blkcg_policy;
static struct blkcg_policy_data *dd_cpd_alloc(gfp_t gfp)
{
struct dd_blkcg *pd;
pd = kzalloc(sizeof(*pd), gfp);
if (!pd)
return NULL;
pd->stats = alloc_percpu_gfp(typeof(*pd->stats),
GFP_KERNEL | __GFP_ZERO);
if (!pd->stats) {
kfree(pd);
return NULL;
}
return &pd->cpd;
}
static void dd_cpd_free(struct blkcg_policy_data *cpd)
{
struct dd_blkcg *dd_blkcg = container_of(cpd, typeof(*dd_blkcg), cpd);
free_percpu(dd_blkcg->stats);
kfree(dd_blkcg);
}
static struct dd_blkcg *dd_blkcg_from_pd(struct blkg_policy_data *pd)
{
return container_of(blkcg_to_cpd(pd->blkg->blkcg, &dd_blkcg_policy),
struct dd_blkcg, cpd);
}
/*
* Convert an association between a block cgroup and a request queue into a
* pointer to the mq-deadline information associated with a (blkcg, queue) pair.
*/
struct dd_blkcg *dd_blkcg_from_bio(struct bio *bio)
{
struct blkg_policy_data *pd;
pd = blkg_to_pd(bio->bi_blkg, &dd_blkcg_policy);
if (!pd)
return NULL;
return dd_blkcg_from_pd(pd);
}
static size_t dd_pd_stat(struct blkg_policy_data *pd, char *buf, size_t size)
{
static const char *const prio_class_name[] = {
[IOPRIO_CLASS_NONE] = "NONE",
[IOPRIO_CLASS_RT] = "RT",
[IOPRIO_CLASS_BE] = "BE",
[IOPRIO_CLASS_IDLE] = "IDLE",
};
struct dd_blkcg *blkcg = dd_blkcg_from_pd(pd);
int res = 0;
u8 prio;
for (prio = 0; prio < ARRAY_SIZE(blkcg->stats->stats); prio++)
res += scnprintf(buf + res, size - res,
" [%s] dispatched=%u inserted=%u merged=%u",
prio_class_name[prio],
ddcg_sum(blkcg, dispatched, prio) +
ddcg_sum(blkcg, merged, prio) -
ddcg_sum(blkcg, completed, prio),
ddcg_sum(blkcg, inserted, prio) -
ddcg_sum(blkcg, completed, prio),
ddcg_sum(blkcg, merged, prio));
return res;
}
static struct blkg_policy_data *dd_pd_alloc(gfp_t gfp, struct request_queue *q,
struct blkcg *blkcg)
{
struct dd_blkg *pd;
pd = kzalloc(sizeof(*pd), gfp);
if (!pd)
return NULL;
return &pd->pd;
}
static void dd_pd_free(struct blkg_policy_data *pd)
{
struct dd_blkg *dd_blkg = container_of(pd, typeof(*dd_blkg), pd);
kfree(dd_blkg);
}
static struct blkcg_policy dd_blkcg_policy = {
.cpd_alloc_fn = dd_cpd_alloc,
.cpd_free_fn = dd_cpd_free,
.pd_alloc_fn = dd_pd_alloc,
.pd_free_fn = dd_pd_free,
.pd_stat_fn = dd_pd_stat,
};
int dd_activate_policy(struct request_queue *q)
{
return blkcg_activate_policy(q, &dd_blkcg_policy);
}
void dd_deactivate_policy(struct request_queue *q)
{
blkcg_deactivate_policy(q, &dd_blkcg_policy);
}
int __init dd_blkcg_init(void)
{
return blkcg_policy_register(&dd_blkcg_policy);
}
void __exit dd_blkcg_exit(void)
{
blkcg_policy_unregister(&dd_blkcg_policy);
}

View File

@@ -1,114 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
#if !defined(_MQ_DEADLINE_CGROUP_H_)
#define _MQ_DEADLINE_CGROUP_H_
#include <linux/blk-cgroup.h>
struct request_queue;
/**
* struct io_stats_per_prio - I/O statistics per I/O priority class.
* @inserted: Number of inserted requests.
* @merged: Number of merged requests.
* @dispatched: Number of dispatched requests.
* @completed: Number of I/O completions.
*/
struct io_stats_per_prio {
local_t inserted;
local_t merged;
local_t dispatched;
local_t completed;
};
/* I/O statistics per I/O cgroup per I/O priority class (IOPRIO_CLASS_*). */
struct blkcg_io_stats {
struct io_stats_per_prio stats[4];
};
/**
* struct dd_blkcg - Per cgroup data.
* @cpd: blkcg_policy_data structure.
* @stats: I/O statistics.
*/
struct dd_blkcg {
struct blkcg_policy_data cpd; /* must be the first member */
struct blkcg_io_stats __percpu *stats;
};
/*
* Count one event of type 'event_type' and with I/O priority class
* 'prio_class'.
*/
#define ddcg_count(ddcg, event_type, prio_class) do { \
if (ddcg) { \
struct blkcg_io_stats *io_stats = get_cpu_ptr((ddcg)->stats); \
\
BUILD_BUG_ON(!__same_type((ddcg), struct dd_blkcg *)); \
BUILD_BUG_ON(!__same_type((prio_class), u8)); \
local_inc(&io_stats->stats[(prio_class)].event_type); \
put_cpu_ptr(io_stats); \
} \
} while (0)
/*
* Returns the total number of ddcg_count(ddcg, event_type, prio_class) calls
* across all CPUs. No locking or barriers since it is fine if the returned
* sum is slightly outdated.
*/
#define ddcg_sum(ddcg, event_type, prio) ({ \
unsigned int cpu; \
u32 sum = 0; \
\
BUILD_BUG_ON(!__same_type((ddcg), struct dd_blkcg *)); \
BUILD_BUG_ON(!__same_type((prio), u8)); \
for_each_present_cpu(cpu) \
sum += local_read(&per_cpu_ptr((ddcg)->stats, cpu)-> \
stats[(prio)].event_type); \
sum; \
})
#ifdef CONFIG_BLK_CGROUP
/**
* struct dd_blkg - Per (cgroup, request queue) data.
* @pd: blkg_policy_data structure.
*/
struct dd_blkg {
struct blkg_policy_data pd; /* must be the first member */
};
struct dd_blkcg *dd_blkcg_from_bio(struct bio *bio);
int dd_activate_policy(struct request_queue *q);
void dd_deactivate_policy(struct request_queue *q);
int __init dd_blkcg_init(void);
void __exit dd_blkcg_exit(void);
#else /* CONFIG_BLK_CGROUP */
static inline struct dd_blkcg *dd_blkcg_from_bio(struct bio *bio)
{
return NULL;
}
static inline int dd_activate_policy(struct request_queue *q)
{
return 0;
}
static inline void dd_deactivate_policy(struct request_queue *q)
{
}
static inline int dd_blkcg_init(void)
{
return 0;
}
static inline void dd_blkcg_exit(void)
{
}
#endif /* CONFIG_BLK_CGROUP */
#endif /* _MQ_DEADLINE_CGROUP_H_ */

View File

@@ -25,18 +25,12 @@
#include "blk-mq-debugfs.h"
#include "blk-mq-tag.h"
#include "blk-mq-sched.h"
#include "mq-deadline-cgroup.h"
/*
* See Documentation/block/deadline-iosched.rst
*/
static const int read_expire = HZ / 2; /* max time before a read is submitted. */
static const int write_expire = 5 * HZ; /* ditto for writes, these limits are SOFT! */
/*
* Time after which to dispatch lower priority requests even if higher
* priority requests are pending.
*/
static const int aging_expire = 10 * HZ;
static const int writes_starved = 2; /* max times reads can starve a write */
static const int fifo_batch = 16; /* # of sequential requests treated as one
by the above parameters. For throughput. */
@@ -57,6 +51,14 @@ enum dd_prio {
enum { DD_PRIO_COUNT = 3 };
/* I/O statistics per I/O priority. */
struct io_stats_per_prio {
local_t inserted;
local_t merged;
local_t dispatched;
local_t completed;
};
/* I/O statistics for all I/O priorities (enum dd_prio). */
struct io_stats {
struct io_stats_per_prio stats[DD_PRIO_COUNT];
@@ -79,9 +81,6 @@ struct deadline_data {
* run time data
*/
/* Request queue that owns this data structure. */
struct request_queue *queue;
struct dd_per_prio per_prio[DD_PRIO_COUNT];
/* Data direction of latest dispatched request. */
@@ -99,7 +98,6 @@ struct deadline_data {
int writes_starved;
int front_merges;
u32 async_depth;
int aging_expire;
spinlock_t lock;
spinlock_t zone_lock;
@@ -234,10 +232,8 @@ static void dd_merged_requests(struct request_queue *q, struct request *req,
struct deadline_data *dd = q->elevator->elevator_data;
const u8 ioprio_class = dd_rq_ioclass(next);
const enum dd_prio prio = ioprio_class_to_prio[ioprio_class];
struct dd_blkcg *blkcg = next->elv.priv[0];
dd_count(dd, merged, prio);
ddcg_count(blkcg, merged, ioprio_class);
/*
* if next expires before rq, assign its expire time to rq
@@ -367,15 +363,13 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
/*
* deadline_dispatch_requests selects the best request according to
* read/write expire, fifo_batch, etc and with a start time <= @latest.
* read/write expire, fifo_batch, etc
*/
static struct request *__dd_dispatch_request(struct deadline_data *dd,
struct dd_per_prio *per_prio,
u64 latest_start_ns)
struct dd_per_prio *per_prio)
{
struct request *rq, *next_rq;
enum dd_data_dir data_dir;
struct dd_blkcg *blkcg;
enum dd_prio prio;
u8 ioprio_class;
@@ -384,8 +378,6 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
if (!list_empty(&per_prio->dispatch)) {
rq = list_first_entry(&per_prio->dispatch, struct request,
queuelist);
if (rq->start_time_ns > latest_start_ns)
return NULL;
list_del_init(&rq->queuelist);
goto done;
}
@@ -463,8 +455,6 @@ dispatch_find_request:
dd->batching = 0;
dispatch_request:
if (rq->start_time_ns > latest_start_ns)
return NULL;
/*
* rq is the selected appropriate request.
*/
@@ -474,8 +464,6 @@ done:
ioprio_class = dd_rq_ioclass(rq);
prio = ioprio_class_to_prio[ioprio_class];
dd_count(dd, dispatched, prio);
blkcg = rq->elv.priv[0];
ddcg_count(blkcg, dispatched, ioprio_class);
/*
* If the request needs its target zone locked, do it.
*/
@@ -495,32 +483,15 @@ done:
static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
{
struct deadline_data *dd = hctx->queue->elevator->elevator_data;
const u64 now_ns = ktime_get_ns();
struct request *rq = NULL;
struct request *rq;
enum dd_prio prio;
spin_lock(&dd->lock);
/*
* Start with dispatching requests whose deadline expired more than
* aging_expire jiffies ago.
*/
for (prio = DD_BE_PRIO; prio <= DD_PRIO_MAX; prio++) {
rq = __dd_dispatch_request(dd, &dd->per_prio[prio], now_ns -
jiffies_to_nsecs(dd->aging_expire));
if (rq)
goto unlock;
}
/*
* Next, dispatch requests in priority order. Ignore lower priority
* requests if any higher priority requests are pending.
*/
for (prio = 0; prio <= DD_PRIO_MAX; prio++) {
rq = __dd_dispatch_request(dd, &dd->per_prio[prio], now_ns);
if (rq || dd_queued(dd, prio))
rq = __dd_dispatch_request(dd, &dd->per_prio[prio]);
if (rq)
break;
}
unlock:
spin_unlock(&dd->lock);
return rq;
@@ -569,8 +540,6 @@ static void dd_exit_sched(struct elevator_queue *e)
struct deadline_data *dd = e->elevator_data;
enum dd_prio prio;
dd_deactivate_policy(dd->queue);
for (prio = 0; prio <= DD_PRIO_MAX; prio++) {
struct dd_per_prio *per_prio = &dd->per_prio[prio];
@@ -584,7 +553,7 @@ static void dd_exit_sched(struct elevator_queue *e)
}
/*
* Initialize elevator private data (deadline_data) and associate with blkcg.
* initialize elevator private data (deadline_data).
*/
static int dd_init_sched(struct request_queue *q, struct elevator_type *e)
{
@@ -593,12 +562,6 @@ static int dd_init_sched(struct request_queue *q, struct elevator_type *e)
enum dd_prio prio;
int ret = -ENOMEM;
/*
* Initialization would be very tricky if the queue is not frozen,
* hence the warning statement below.
*/
WARN_ON_ONCE(!percpu_ref_is_zero(&q->q_usage_counter));
eq = elevator_alloc(q, e);
if (!eq)
return ret;
@@ -614,8 +577,6 @@ static int dd_init_sched(struct request_queue *q, struct elevator_type *e)
if (!dd->stats)
goto free_dd;
dd->queue = q;
for (prio = 0; prio <= DD_PRIO_MAX; prio++) {
struct dd_per_prio *per_prio = &dd->per_prio[prio];
@@ -631,21 +592,12 @@ static int dd_init_sched(struct request_queue *q, struct elevator_type *e)
dd->front_merges = 1;
dd->last_dir = DD_WRITE;
dd->fifo_batch = fifo_batch;
dd->aging_expire = aging_expire;
spin_lock_init(&dd->lock);
spin_lock_init(&dd->zone_lock);
ret = dd_activate_policy(q);
if (ret)
goto free_stats;
ret = 0;
q->elevator = eq;
return 0;
free_stats:
free_percpu(dd->stats);
free_dd:
kfree(dd);
@@ -718,7 +670,6 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
u8 ioprio_class = IOPRIO_PRIO_CLASS(ioprio);
struct dd_per_prio *per_prio;
enum dd_prio prio;
struct dd_blkcg *blkcg;
LIST_HEAD(free);
lockdep_assert_held(&dd->lock);
@@ -729,18 +680,9 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
*/
blk_req_zone_write_unlock(rq);
/*
* If a block cgroup has been associated with the submitter and if an
* I/O priority has been set in the associated block cgroup, use the
* lowest of the cgroup priority and the request priority for the
* request. If no priority has been set in the request, use the cgroup
* priority.
*/
prio = ioprio_class_to_prio[ioprio_class];
dd_count(dd, inserted, prio);
blkcg = dd_blkcg_from_bio(rq->bio);
ddcg_count(blkcg, inserted, ioprio_class);
rq->elv.priv[0] = blkcg;
rq->elv.priv[0] = (void *)(uintptr_t)1;
if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
blk_mq_free_requests(&free);
@@ -815,13 +757,18 @@ static void dd_finish_request(struct request *rq)
{
struct request_queue *q = rq->q;
struct deadline_data *dd = q->elevator->elevator_data;
struct dd_blkcg *blkcg = rq->elv.priv[0];
const u8 ioprio_class = dd_rq_ioclass(rq);
const enum dd_prio prio = ioprio_class_to_prio[ioprio_class];
struct dd_per_prio *per_prio = &dd->per_prio[prio];
dd_count(dd, completed, prio);
ddcg_count(blkcg, completed, ioprio_class);
/*
* The block layer core may call dd_finish_request() without having
* called dd_insert_requests(). Hence only update statistics for
* requests for which dd_insert_requests() has been called. See also
* blk_mq_request_bypass_insert().
*/
if (rq->elv.priv[0])
dd_count(dd, completed, prio);
if (blk_queue_is_zoned(q)) {
unsigned long flags;
@@ -866,7 +813,6 @@ static ssize_t __FUNC(struct elevator_queue *e, char *page) \
#define SHOW_JIFFIES(__FUNC, __VAR) SHOW_INT(__FUNC, jiffies_to_msecs(__VAR))
SHOW_JIFFIES(deadline_read_expire_show, dd->fifo_expire[DD_READ]);
SHOW_JIFFIES(deadline_write_expire_show, dd->fifo_expire[DD_WRITE]);
SHOW_JIFFIES(deadline_aging_expire_show, dd->aging_expire);
SHOW_INT(deadline_writes_starved_show, dd->writes_starved);
SHOW_INT(deadline_front_merges_show, dd->front_merges);
SHOW_INT(deadline_async_depth_show, dd->front_merges);
@@ -896,7 +842,6 @@ static ssize_t __FUNC(struct elevator_queue *e, const char *page, size_t count)
STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, msecs_to_jiffies)
STORE_JIFFIES(deadline_read_expire_store, &dd->fifo_expire[DD_READ], 0, INT_MAX);
STORE_JIFFIES(deadline_write_expire_store, &dd->fifo_expire[DD_WRITE], 0, INT_MAX);
STORE_JIFFIES(deadline_aging_expire_store, &dd->aging_expire, 0, INT_MAX);
STORE_INT(deadline_writes_starved_store, &dd->writes_starved, INT_MIN, INT_MAX);
STORE_INT(deadline_front_merges_store, &dd->front_merges, 0, 1);
STORE_INT(deadline_async_depth_store, &dd->front_merges, 1, INT_MAX);
@@ -915,7 +860,6 @@ static struct elv_fs_entry deadline_attrs[] = {
DD_ATTR(front_merges),
DD_ATTR(async_depth),
DD_ATTR(fifo_batch),
DD_ATTR(aging_expire),
__ATTR_NULL
};
@@ -1144,26 +1088,11 @@ MODULE_ALIAS("mq-deadline-iosched");
static int __init deadline_init(void)
{
int ret;
ret = elv_register(&mq_deadline);
if (ret)
goto out;
ret = dd_blkcg_init();
if (ret)
goto unreg;
out:
return ret;
unreg:
elv_unregister(&mq_deadline);
goto out;
return elv_register(&mq_deadline);
}
static void __exit deadline_exit(void)
{
dd_blkcg_exit();
elv_unregister(&mq_deadline);
}

View File

@@ -1768,7 +1768,7 @@ config CRYPTO_DRBG_HMAC
bool
default y
select CRYPTO_HMAC
select CRYPTO_SHA256
select CRYPTO_SHA512
config CRYPTO_DRBG_HASH
bool "Enable Hash DRBG"

View File

@@ -3021,6 +3021,9 @@ static int acpi_nfit_register_region(struct acpi_nfit_desc *acpi_desc,
struct acpi_nfit_memory_map *memdev = nfit_memdev->memdev;
struct nd_mapping_desc *mapping;
/* range index 0 == unmapped in SPA or invalid-SPA */
if (memdev->range_index == 0 || spa->range_index == 0)
continue;
if (memdev->range_index != spa->range_index)
continue;
if (count >= ND_MAX_MAPPINGS) {

View File

@@ -292,6 +292,12 @@ void __init init_prmt(void)
int mc = acpi_table_parse_entries(ACPI_SIG_PRMT, sizeof(struct acpi_table_prmt) +
sizeof (struct acpi_table_prmt_header),
0, acpi_parse_prmt, 0);
/*
* Return immediately if PRMT table is not present or no PRM module found.
*/
if (mc <= 0)
return;
pr_info("PRM: found %u modules\n", mc);
status = acpi_install_address_space_handler(ACPI_ROOT_OBJECT,

View File

@@ -452,7 +452,7 @@ int acpi_s2idle_prepare_late(void)
if (lps0_dsm_func_mask_microsoft > 0) {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF,
lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_EXIT,
acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY,
lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY,
lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft);
@@ -479,7 +479,7 @@ void acpi_s2idle_restore_early(void)
if (lps0_dsm_func_mask_microsoft > 0) {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT,
lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY,
acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_EXIT,
lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON,
lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft);

Some files were not shown because too many files have changed in this diff Show More