Compare commits

...

492 Commits

Author SHA1 Message Date
Linus Torvalds
8bb7eca972 Linux 5.15 2021-10-31 13:53:10 -07:00
Linus Torvalds
75fcbd3860 Merge tag 'perf-tools-fixes-for-v5.15-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix compilation of callchain related code on powerpc with gcc11+

 - Fix PERF_SAMPLE_WEIGHT_STRUCT support in 'perf script'

 - Check session->header.env.arch before using it, fixing a segmentation
   fault

 - Suppress 'rm dlfilter' build messages

* tag 'perf-tools-fixes-for-v5.15-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf script: Fix PERF_SAMPLE_WEIGHT_STRUCT support
  perf callchain: Fix compilation on powerpc with gcc11+
  perf script: Check session->header.env.arch before using it
  perf build: Suppress 'rm dlfilter' build message
2021-10-31 11:24:06 -07:00
Linus Torvalds
ca5e83eddc Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:

 - Fixes for s390 interrupt delivery

 - Fixes for Xen emulator bugs showing up as debug kernel WARNs

 - Fix another issue with SEV/ES string I/O VMGEXITs

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Take srcu lock in post_kvm_run_save()
  KVM: SEV-ES: fix another issue with string I/O VMGEXITs
  KVM: x86/xen: Fix kvm_xen_has_interrupt() sleeping in kvm_vcpu_block()
  KVM: x86: switch pvclock_gtod_sync_lock to a raw spinlock
  KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu
  KVM: s390: clear kicked_mask before sleeping again
2021-10-31 11:19:02 -07:00
Kan Liang
27730c8cd6 perf script: Fix PERF_SAMPLE_WEIGHT_STRUCT support
-F weight in perf script is broken.

  # ./perf mem record
  # ./perf script -F weight
  Samples for 'dummy:HG' event do not have WEIGHT attribute set. Cannot
print 'weight' field.

The sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. They share the same space, weight. The
lower 32 bits are exactly the same for both sample type. The higher 32
bits may be different for different architecture. For a new kernel on
x86, the PERF_SAMPLE_WEIGHT_STRUCT is used. For an old kernel or other
ARCHs, the PERF_SAMPLE_WEIGHT is used.

With -F weight, current perf script will only check the input string
"weight" with the PERF_SAMPLE_WEIGHT sample type. Because the commit
ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT") didn't
update the PERF_SAMPLE_WEIGHT_STRUCT sample type for perf script. For a
new kernel on x86, the check fails.

Use PERF_SAMPLE_WEIGHT_TYPE, which supports both sample types, to
replace PERF_SAMPLE_WEIGHT

Fixes: ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT")
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Joe Mario <jmario@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1632929894-102778-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Jiri Olsa
89ac61ff05 perf callchain: Fix compilation on powerpc with gcc11+
Got following build fail on powerpc:

    CC      arch/powerpc/util/skip-callchain-idx.o
  In function ‘check_return_reg’,
      inlined from ‘check_return_addr’ at arch/powerpc/util/skip-callchain-idx.c:213:7,
      inlined from ‘arch_skip_callchain_idx’ at arch/powerpc/util/skip-callchain-idx.c:265:7:
  arch/powerpc/util/skip-callchain-idx.c:54:18: error: ‘dwarf_frame_register’ accessing 96 bytes \
  in a region of size 64 [-Werror=stringop-overflow=]
     54 |         result = dwarf_frame_register(frame, ra_regno, ops_mem, &ops, &nops);
        |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  arch/powerpc/util/skip-callchain-idx.c: In function ‘arch_skip_callchain_idx’:
  arch/powerpc/util/skip-callchain-idx.c:54:18: note: referencing argument 3 of type ‘Dwarf_Op *’
  In file included from /usr/include/elfutils/libdwfl.h:32,
                   from arch/powerpc/util/skip-callchain-idx.c:10:
  /usr/include/elfutils/libdw.h:1069:12: note: in a call to function ‘dwarf_frame_register’
   1069 | extern int dwarf_frame_register (Dwarf_Frame *frame, int regno,
        |            ^~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

The dwarf_frame_register args changed with [1],
Updating ops_mem accordingly.

[1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=5621fe5443da23112170235dd5cac161e5c75e65

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Mark Wieelard <mjw@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20210928195253.1267023-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Song Liu
29c77550ee perf script: Check session->header.env.arch before using it
When perf.data is not written cleanly, we would like to process existing
data as much as possible (please see f_header.data.size == 0 condition
in perf_session__read_header). However, perf.data with partial data may
crash perf. Specifically, we see crash in 'perf script' for NULL
session->header.env.arch.

Fix this by checking session->header.env.arch before using it to determine
native_arch. Also split the if condition so it is easier to read.

Committer notes:

If it is a pipe, we already assume is a native arch, so no need to check
session->header.env.arch.

Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211004053238.514936-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Adrian Hunter
095729484e perf build: Suppress 'rm dlfilter' build message
The following build message:

	rm dlfilters/dlfilter-test-api-v0.o

is unwanted.

The object file is being treated as an intermediate file and being
automatically removed. Mark the object file as .SECONDARY to prevent
removal and hence the message.

Requested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210930062849.110416-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Linus Torvalds
180eca540a Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Three small fixes, all in drivers, and one sizeable update to the UFS
  driver to remove the HPB 2.0 feature that has been objected to by Jens
  and Christoph.

  Although the UFS patch is large and last minute, it's essentially the
  least intrusive way of resolving the objections in time for the 5.15
  release"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: ufshpb: Remove HPB2.0 flows
  scsi: mpt3sas: Fix reference tag handling for WRITE_INSERT
  scsi: ufs: ufs-exynos: Correct timeout value setting registers
  scsi: ibmvfc: Fix up duplicate response detection
2021-10-30 15:56:38 -07:00
Linus Torvalds
3a4347d82e Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
 "One fix for the composite clk that broke when we changed this clk type
  to use the determine_rate instead of round_rate clk op by default.
  This caused lots of problems on Rockchip SoCs because they heavily use
  the composite clk code to model the clk tree"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: composite: Also consider .determine_rate for rate + mux composites
2021-10-30 09:55:46 -07:00
Linus Torvalds
bf85ba018f Merge tag 'riscv-for-linus-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
 "These are pretty late, but they do fix concrete issues.

   - ensure the trap vector's address is aligned.

   - avoid re-populating the KASAN shadow memory.

   - allow kasan to build without warnings, which have recently become
     errors"

* tag 'riscv-for-linus-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix asan-stack clang build
  riscv: Do not re-populate shadow memory with kasan_populate_early_shadow
  riscv: fix misalgned trap vector base address
2021-10-30 09:28:24 -07:00
Avri Altman
09d9e4d041 scsi: ufs: ufshpb: Remove HPB2.0 flows
The Host Performance Buffer feature allows UFS read commands to carry the
physical media addresses along with the LBAs, thus allowing less internal
L2P-table switches in the device.  HPB1.0 allowed a single LBA, while
HPB2.0 increases this capacity up to 255 blocks.

Carrying more than a single record, the read operation is no longer purely
of type "read" but a "hybrid" command: Writing the physical address to the
device in one operation and reading back the required payload in another.

The JEDEC HPB spec defines two commands for this operation:
HPB-WRITE-BUFFER (0x2) to write the physical addresses to device, and
HPB-READ to read the payload.

With the current HPB design the UFS driver has no alternative but to divide
the READ request into 2 separate commands: HPB-WRITE-BUFFER and HPB-READ.
This causes a great deal of aggravation to the block layer guys who
demanded that we completely revert the entire HPB driver regardless of the
huge amount of corporate effort already invested in it.

As a compromise, remove only the pieces that implement the 2.0
specification. This is done as a matter of urgency for the final 5.15
release.

Link: https://lore.kernel.org/r/20211030062301.248-1-avri.altman@wdc.com
Tested-by: Avri Altman <avri.altman@wdc.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Co-developed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-30 10:01:01 -04:00
Linus Torvalds
119c85055d Merge tag 'powerpc-5.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "Three commits fixing some issues introduced with the recent IOMMU
  changes we merged.

  Thanks to Alexey Kardashevskiy"

* tag 'powerpc-5.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries/iommu: Create huge DMA window if no MMIO32 is present
  powerpc/pseries/iommu: Check if the default window in use before removing it
  powerpc/pseries/iommu: Use correct vfree for it_map
2021-10-29 17:35:56 -07:00
Linus Torvalds
db2398a56a Merge tag 'gpio-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix the return value check when parsing the ngpios property in
   gpio-xgs-iproc

 - check the return value of bgpio_init() in gpio-mlxbf2

* tag 'gpio-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: mlxbf2.c: Add check for bgpio_init failure
  gpio: xgs-iproc: fix parsing of ngpios property
2021-10-29 17:04:38 -07:00
Linus Torvalds
a379fbbcb8 Merge tag 'block-5.15-2021-10-29' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - NVMe pull request:
      - fix nvmet-tcp header digest verification (Amit Engel)
      - fix a memory leak in nvmet-tcp when releasing a queue (Maurizio
        Lombardi)
      - fix nvme-tcp H2CData PDU send accounting again (Sagi Grimberg)
      - fix digest pointer calculation in nvme-tcp and nvmet-tcp (Varun
        Prakash)
      - fix possible nvme-tcp req->offset corruption (Varun Prakash)

 - Queue drain ordering fix (Ming)

 - Partition check regression for zoned devices (Shin'ichiro)

 - Zone queue restart fix (Naohiro)

* tag 'block-5.15-2021-10-29' of git://git.kernel.dk/linux-block:
  block: Fix partition check for host-aware zoned block devices
  nvmet-tcp: fix header digest verification
  nvmet-tcp: fix data digest pointer calculation
  nvme-tcp: fix data digest pointer calculation
  nvme-tcp: fix possible req->offset corruption
  block: schedule queue restart after BLK_STS_ZONE_RESOURCE
  block: drain queue after disk is removed from sysfs
  nvme-tcp: fix H2CData PDU send accounting (again)
  nvmet-tcp: fix a memory leak when releasing a queue
2021-10-29 11:10:29 -07:00
Martin K. Petersen
61a9f252c1 scsi: mpt3sas: Fix reference tag handling for WRITE_INSERT
Testing revealed a problem with how the reference tag was handled for
a WRITE_INSERT operation. The SCSI_PROT_REF_CHECK flag is not set when
the controller is asked to generate the protection information
(i.e. not DIX). And as a result the initial reference tag would not be
set in the WRITE_INSERT case.

Separate handling of the REF_CHECK and REF_INCREMENT flags to align
with both the DIX spec and the MPI implementation.

Link: https://lore.kernel.org/r/20211028034202.24225-1-martin.petersen@oracle.com
Fixes: b3e2c72af1 ("scsi: mpt3sas: Use the proper SCSI midlayer interfaces for PI")
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-29 14:03:58 -04:00
Linus Torvalds
17d50f8941 Merge tag 'mmc-v5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:

 - tmio: Re-enable card irqs after a reset

 - mtk-sd: Fixup probing of cqhci for crypto

 - cqhci: Fix support for suspend/resume

 - vub300: Fix control-message timeouts

 - dw_mmc-exynos: Fix support for tuning

 - winbond: Silences build errors on M68K

 - sdhci-esdhc-imx: Fix support for tuning

 - sdhci-pci: Read card detect from ACPI for Intel Merrifield

 - sdhci: Fix eMMC support for Thundercomm TurboX CM2290

* tag 'mmc-v5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: tmio: reenable card irqs after the reset callback
  mmc: mediatek: Move cqhci init behind ungate clock
  mmc: cqhci: clear HALT state after CQE enable
  mmc: vub300: fix control-message timeouts
  mmc: dw_mmc: exynos: fix the finding clock sample value
  mmc: winbond: don't build on M68K
  mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit
  mmc: sdhci-pci: Read card detect from ACPI for Intel Merrifield
  mmc: sdhci: Map more voltage level to SDHCI_POWER_330
2021-10-29 10:54:44 -07:00
Linus Torvalds
fd919bbd33 Merge tag 'for-5.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "Last minute fixes for crash on 32bit architectures when compression is
  in use. It's a regression introduced in 5.15-rc and I'd really like
  not let this into the final release, fixes via stable trees would add
  unnecessary delay.

  The problem is on 32bit architectures with highmem enabled, the pages
  for compression may need to be kmapped, while the patches removed that
  as we don't use GFP_HIGHMEM allocations anymore. The pages that don't
  come from local allocation still may be from highmem. Despite being on
  32bit there's enough such ARM machines in use so it's not a marginal
  issue.

  I did full reverts of the patches one by one instead of a huge one.
  There's one exception for the "lzo" revert as there was an
  intermediate patch touching the same code to make it compatible with
  subpage. I can't revert that one too, so the revert in lzo.c is
  manual. Qu Wenruo has worked on that with me and verified the changes"

* tag 'for-5.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Revert "btrfs: compression: drop kmap/kunmap from lzo"
  Revert "btrfs: compression: drop kmap/kunmap from zlib"
  Revert "btrfs: compression: drop kmap/kunmap from zstd"
  Revert "btrfs: compression: drop kmap/kunmap from generic helpers"
2021-10-29 10:46:59 -07:00
Linus Torvalds
6f11521267 Merge tag 'trace-v5.15-rc6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing comment fixes from Steven Rostedt:

 - Some bots have informed me that some of the ftrace functions
   kernel-doc has formatting issues.

 - Also, fix my snake instinct.

* tag 'trace-v5.15-rc6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix misspelling of "missing"
  ftrace: Fix kernel-doc formatting issues
2021-10-29 10:41:07 -07:00
Linus Torvalds
75c7a6c1ca Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "Fix a build-time warning in x86/sm4"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/sm4 - Fix invalid section entry size
2021-10-29 10:17:08 -07:00
Linus Torvalds
2c04d67ec1 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "11 patches.

  Subsystems affected by this patch series: mm (memcg, memory-failure,
  oom-kill, secretmem, vmalloc, hugetlb, damon, and tools), and ocfs2"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer
  mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()'
  mm: khugepaged: skip huge page collapse for special files
  mm, thp: bail out early in collapse_file for writeback page
  mm/vmalloc: fix numa spreading for large hash tables
  mm/secretmem: avoid letting secretmem_users drop to zero
  ocfs2: fix race between searching chunks and release journal_head from buffer_head
  mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap
  mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
  mm: hwpoison: remove the unnecessary THP check
  memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT
2021-10-29 10:03:07 -07:00
Alexandre Ghiti
54c5639d8f riscv: Fix asan-stack clang build
Nathan reported that because KASAN_SHADOW_OFFSET was not defined in
Kconfig, it prevents asan-stack from getting disabled with clang even
when CONFIG_KASAN_STACK is disabled: fix this by defining the
corresponding config.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29 08:54:50 -07:00
Alexandre Ghiti
cf11d01135 riscv: Do not re-populate shadow memory with kasan_populate_early_shadow
When calling this function, all the shadow memory is already populated
with kasan_early_shadow_pte which has PAGE_KERNEL protection.
kasan_populate_early_shadow write-protects the mapping of the range
of addresses passed in argument in zero_pte_populate, which actually
write-protects all the shadow memory mapping since kasan_early_shadow_pte
is used for all the shadow memory at this point. And then when using
memblock API to populate the shadow memory, the first write access to the
kernel stack triggers a trap. This becomes visible with the next commit
that contains a fix for asan-stack.

We already manually populate all the shadow memory in kasan_early_init
and we write-protect kasan_early_shadow_pte at the end of kasan_init
which makes the calls to kasan_populate_early_shadow superfluous so
we can remove them.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: e178d670f2 ("riscv/kasan: add KASAN_VMALLOC support")
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29 08:53:42 -07:00
Steven Rostedt (VMware)
ddcf906fe5 tracing: Fix misspelling of "missing"
My snake instinct was on and I wrote "misssing" instead of "missing".

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-29 09:54:14 -04:00
Steven Rostedt (VMware)
6130722f11 ftrace: Fix kernel-doc formatting issues
Some functions had kernel-doc that used a comma instead of a hash to
separate the function name from the one line description.

Also, the "ftrace_is_dead()" had an incomplete description.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-29 09:52:23 -04:00
David Sterba
ccaa66c8dd Revert "btrfs: compression: drop kmap/kunmap from lzo"
This reverts commit 8c945d32e6.

The kmaps in compression code are still needed and cause crashes on
32bit machines (ARM, x86). Reproducible eg. by running fstest btrfs/004
with enabled LZO or ZSTD compression.

The revert does not apply cleanly due to changes in a6e66e6f8c
("btrfs: rework lzo_decompress_bio() to make it subpage compatible")
that reworked the page iteration so the revert is done to be equivalent
to the original code.

Link: https://lore.kernel.org/all/CAJCQCtT+OuemovPO7GZk8Y8=qtOObr0XTDp8jh4OHD6y84AFxw@mail.gmail.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214839
Tested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-29 13:25:43 +02:00
David Sterba
55276e14df Revert "btrfs: compression: drop kmap/kunmap from zlib"
This reverts commit 696ab562e6.

The kmaps in compression code are still needed and cause crashes on
32bit machines (ARM, x86). Reproducible eg. by running fstest btrfs/004
with enabled LZO or ZSTD compression.

Link: https://lore.kernel.org/all/CAJCQCtT+OuemovPO7GZk8Y8=qtOObr0XTDp8jh4OHD6y84AFxw@mail.gmail.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214839
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-29 13:03:05 +02:00
David Sterba
56ee254d23 Revert "btrfs: compression: drop kmap/kunmap from zstd"
This reverts commit bbaf9715f3.

The kmaps in compression code are still needed and cause crashes on
32bit machines (ARM, x86). Reproducible eg. by running fstest btrfs/004
with enabled LZO or ZSTD compression.

Example stacktrace with ZSTD on a 32bit ARM machine:

  Unable to handle kernel NULL pointer dereference at virtual address 00000000
  pgd = c4159ed3
  [00000000] *pgd=00000000
  Internal error: Oops: 5 [#1] PREEMPT SMP ARM
  Modules linked in:
  CPU: 0 PID: 210 Comm: kworker/u2:3 Not tainted 5.14.0-rc79+ #12
  Hardware name: Allwinner sun4i/sun5i Families
  Workqueue: btrfs-delalloc btrfs_work_helper
  PC is at mmiocpy+0x48/0x330
  LR is at ZSTD_compressStream_generic+0x15c/0x28c

  (mmiocpy) from [<c0629648>] (ZSTD_compressStream_generic+0x15c/0x28c)
  (ZSTD_compressStream_generic) from [<c06297dc>] (ZSTD_compressStream+0x64/0xa0)
  (ZSTD_compressStream) from [<c049444c>] (zstd_compress_pages+0x170/0x488)
  (zstd_compress_pages) from [<c0496798>] (btrfs_compress_pages+0x124/0x12c)
  (btrfs_compress_pages) from [<c043c068>] (compress_file_range+0x3c0/0x834)
  (compress_file_range) from [<c043c4ec>] (async_cow_start+0x10/0x28)
  (async_cow_start) from [<c0475c3c>] (btrfs_work_helper+0x100/0x230)
  (btrfs_work_helper) from [<c014ef68>] (process_one_work+0x1b4/0x418)
  (process_one_work) from [<c014f210>] (worker_thread+0x44/0x524)
  (worker_thread) from [<c0156aa4>] (kthread+0x180/0x1b0)
  (kthread) from [<c0100150>]

Link: https://lore.kernel.org/all/CAJCQCtT+OuemovPO7GZk8Y8=qtOObr0XTDp8jh4OHD6y84AFxw@mail.gmail.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214839
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-29 13:02:50 +02:00
David Yang
9c7516d669 tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer
The coccinelle check report:

  ./tools/testing/selftests/vm/split_huge_page_test.c:344:36-42:
  ERROR: application of sizeof to pointer

Use "strlen" to fix it.

Link: https://lkml.kernel.org/r/20211012030116.184027-1-davidcomponentone@gmail.com
Signed-off-by: David Yang <davidcomponentone@gmail.com>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
SeongJae Park
2e014660b3 mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()'
Kunit test cases for 'damon_split_regions_of()' expects the number of
regions after calling the function will be same to their request
('nr_sub').  However, the requested number is just an upper-limit,
because the function randomly decides the size of each sub-region.

This fixes the wrong expectation.

Link: https://lkml.kernel.org/r/20211028090628.14948-1-sj@kernel.org
Fixes: 17ccae8bb5 ("mm/damon: add kunit tests")
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Yang Shi
a4aeaa06d4 mm: khugepaged: skip huge page collapse for special files
The read-only THP for filesystems will collapse THP for files opened
readonly and mapped with VM_EXEC.  The intended usecase is to avoid TLB
misses for large text segments.  But it doesn't restrict the file types
so a THP could be collapsed for a non-regular file, for example, block
device, if it is opened readonly and mapped with EXEC permission.  This
may cause bugs, like [1] and [2].

This is definitely not the intended usecase, so just collapse THP for
regular files in order to close the attack surface.

[shy828301@gmail.com: fix vm_file check [3]]

Link: https://lore.kernel.org/lkml/CACkBjsYwLYLRmX8GpsDpMthagWOjWWrNxqY6ZLNQVr6yx+f5vA@mail.gmail.com/ [1]
Link: https://lore.kernel.org/linux-mm/000000000000c6a82505ce284e4c@google.com/ [2]
Link: https://lkml.kernel.org/r/CAHbLzkqTW9U3VvTu1Ki5v_cLRC9gHW+znBukg_ycergE0JWj-A@mail.gmail.com [3]
Link: https://lkml.kernel.org/r/20211027195221.3825-1-shy828301@gmail.com
Fixes: 99cb0dbd47 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Reported-by: syzbot+aae069be1de40fb11825@syzkaller.appspotmail.com
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Andrea Righi <andrea.righi@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Rongwei Wang
74c42e1baa mm, thp: bail out early in collapse_file for writeback page
Currently collapse_file does not explicitly check PG_writeback, instead,
page_has_private and try_to_release_page are used to filter writeback
pages.  This does not work for xfs with blocksize equal to or larger
than pagesize, because in such case xfs has no page->private.

This makes collapse_file bail out early for writeback page.  Otherwise,
xfs end_page_writeback will panic as follows.

  page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:ffff0003f88c86a8 index:0x0 pfn:0x84ef32
  aops:xfs_address_space_operations [xfs] ino:30000b7 dentry name:"libtest.so"
  flags: 0x57fffe0000008027(locked|referenced|uptodate|active|writeback)
  raw: 57fffe0000008027 ffff80001b48bc28 ffff80001b48bc28 ffff0003f88c86a8
  raw: 0000000000000000 0000000000000000 00000000ffffffff ffff0000c3e9a000
  page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u))
  page->mem_cgroup:ffff0000c3e9a000
  ------------[ cut here ]------------
  kernel BUG at include/linux/mm.h:1212!
  Internal error: Oops - BUG: 0 [#1] SMP
  Modules linked in:
  BUG: Bad page state in process khugepaged  pfn:84ef32
   xfs(E)
  page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:0 index:0x0 pfn:0x84ef32
   libcrc32c(E) rfkill(E) aes_ce_blk(E) crypto_simd(E) ...
  CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Tainted: ...
  pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
  Call trace:
    end_page_writeback+0x1c0/0x214
    iomap_finish_page_writeback+0x13c/0x204
    iomap_finish_ioend+0xe8/0x19c
    iomap_writepage_end_bio+0x38/0x50
    bio_endio+0x168/0x1ec
    blk_update_request+0x278/0x3f0
    blk_mq_end_request+0x34/0x15c
    virtblk_request_done+0x38/0x74 [virtio_blk]
    blk_done_softirq+0xc4/0x110
    __do_softirq+0x128/0x38c
    __irq_exit_rcu+0x118/0x150
    irq_exit+0x1c/0x30
    __handle_domain_irq+0x8c/0xf0
    gic_handle_irq+0x84/0x108
    el1_irq+0xcc/0x180
    arch_cpu_idle+0x18/0x40
    default_idle_call+0x4c/0x1a0
    cpuidle_idle_call+0x168/0x1e0
    do_idle+0xb4/0x104
    cpu_startup_entry+0x30/0x9c
    secondary_start_kernel+0x104/0x180
  Code: d4210000 b0006161 910c8021 94013f4d (d4210000)
  ---[ end trace 4a88c6a074082f8c ]---
  Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt

Link: https://lkml.kernel.org/r/20211022023052.33114-1-rongwei.wang@linux.alibaba.com
Fixes: 99cb0dbd47 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Suggested-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Song Liu <song@kernel.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Chen Wandun
ffb29b1c25 mm/vmalloc: fix numa spreading for large hash tables
Eric Dumazet reported a strange numa spreading info in [1], and found
commit 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings") introduced
this issue [2].

Dig into the difference before and after this patch, page allocation has
some difference:

before:
  alloc_large_system_hash
    __vmalloc
      __vmalloc_node(..., NUMA_NO_NODE, ...)
        __vmalloc_node_range
          __vmalloc_area_node
            alloc_page /* because NUMA_NO_NODE, so choose alloc_page branch */
              alloc_pages_current
                alloc_page_interleave /* can be proved by print policy mode */

after:
  alloc_large_system_hash
    __vmalloc
      __vmalloc_node(..., NUMA_NO_NODE, ...)
        __vmalloc_node_range
          __vmalloc_area_node
            alloc_pages_node /* choose nid by nuam_mem_id() */
              __alloc_pages_node(nid, ....)

So after commit 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings"),
it will allocate memory in current node instead of interleaving allocate
memory.

Link: https://lore.kernel.org/linux-mm/CANn89iL6AAyWhfxdHO+jaT075iOa3XcYn9k6JJc7JR2XYn6k_Q@mail.gmail.com/ [1]
Link: https://lore.kernel.org/linux-mm/CANn89iLofTR=AK-QOZY87RdUZENCZUT4O6a0hvhu3_EwRMerOg@mail.gmail.com/ [2]
Link: https://lkml.kernel.org/r/20211021080744.874701-2-chenwandun@huawei.com
Fixes: 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Kees Cook
855d44434f mm/secretmem: avoid letting secretmem_users drop to zero
Quoting Dmitry:
 "refcount_inc() needs to be done before fd_install(). After
  fd_install() finishes, the fd can be used by userspace and
  we can have secret data in memory before the refcount_inc().

  A straightforward misuse where a user will predict the returned
  fd in another thread before the syscall returns and will use it
  to store secret data is somewhat dubious because such a user just
  shoots themself in the foot.

  But a more interesting misuse would be to close the predicted fd
  and decrement the refcount before the corresponding refcount_inc,
  this way one can briefly drop the refcount to zero while there are
  other users of secretmem."

Move fd_install() after refcount_inc().

Link: https://lkml.kernel.org/r/20211021154046.880251-1-keescook@chromium.org
Link: https://lore.kernel.org/lkml/CACT4Y+b1sW6-Hkn8HQYw_SsT7X3tp-CJNh2ci0wG3ZnQz9jjig@mail.gmail.com
Fixes: 9a436f8ff6 ("PM: hibernate: disable when there are active secretmem users")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jordy Zomer <jordy@pwning.systems>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Gautham Ananthakrishna
6f1b228529 ocfs2: fix race between searching chunks and release journal_head from buffer_head
Encountered a race between ocfs2_test_bg_bit_allocatable() and
jbd2_journal_put_journal_head() resulting in the below vmcore.

  PID: 106879  TASK: ffff880244ba9c00  CPU: 2   COMMAND: "loop3"
  Call trace:
    panic
    oops_end
    no_context
    __bad_area_nosemaphore
    bad_area_nosemaphore
    __do_page_fault
    do_page_fault
    page_fault
      [exception RIP: ocfs2_block_group_find_clear_bits+316]
    ocfs2_block_group_find_clear_bits [ocfs2]
    ocfs2_cluster_group_search [ocfs2]
    ocfs2_search_chain [ocfs2]
    ocfs2_claim_suballoc_bits [ocfs2]
    __ocfs2_claim_clusters [ocfs2]
    ocfs2_claim_clusters [ocfs2]
    ocfs2_local_alloc_slide_window [ocfs2]
    ocfs2_reserve_local_alloc_bits [ocfs2]
    ocfs2_reserve_clusters_with_limit [ocfs2]
    ocfs2_reserve_clusters [ocfs2]
    ocfs2_lock_refcount_allocators [ocfs2]
    ocfs2_make_clusters_writable [ocfs2]
    ocfs2_replace_cow [ocfs2]
    ocfs2_refcount_cow [ocfs2]
    ocfs2_file_write_iter [ocfs2]
    lo_rw_aio
    loop_queue_work
    kthread_worker_fn
    kthread
    ret_from_fork

When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), the
bg_bh->b_private NULL as jbd2_journal_put_journal_head() raced and
released the jounal head from the buffer head.  Needed to take bit lock
for the bit 'BH_JournalHead' to fix this race.

Link: https://lkml.kernel.org/r/1634820718-6043-1-git-send-email-gautham.ananthakrishna@oracle.com
Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: <rajesh.sivaramasubramaniom@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Suren Baghdasaryan
337546e83f mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap
Race between process_mrelease and exit_mmap, where free_pgtables is
called while __oom_reap_task_mm is in progress, leads to kernel crash
during pte_offset_map_lock call.  oom-reaper avoids this race by setting
MMF_OOM_VICTIM flag and causing exit_mmap to take and release
mmap_write_lock, blocking it until oom-reaper releases mmap_read_lock.

Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to
fix this race, however that would be considered a hack.  Fix this race
by elevating mm->mm_users and preventing exit_mmap from executing until
process_mrelease is finished.  Patch slightly refactors the code to
adapt for a possible mmget_not_zero failure.

This fix has considerable negative impact on process_mrelease
performance and will likely need later optimization.

Link: https://lkml.kernel.org/r/20211022014658.263508-1-surenb@google.com
Fixes: 884a7e5964 ("mm: introduce process_mrelease system call")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christian Brauner <christian@brauner.io>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Yang Shi
eac96c3efd mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
When handling shmem page fault the THP with corrupted subpage could be
PMD mapped if certain conditions are satisfied.  But kernel is supposed
to send SIGBUS when trying to map hwpoisoned page.

There are two paths which may do PMD map: fault around and regular
fault.

Before commit f9ce0be71d ("mm: Cleanup faultaround and finish_fault()
codepaths") the thing was even worse in fault around path.  The THP
could be PMD mapped as long as the VMA fits regardless what subpage is
accessed and corrupted.  After this commit as long as head page is not
corrupted the THP could be PMD mapped.

In the regular fault path the THP could be PMD mapped as long as the
corrupted page is not accessed and the VMA fits.

This loophole could be fixed by iterating every subpage to check if any
of them is hwpoisoned or not, but it is somewhat costly in page fault
path.

So introduce a new page flag called HasHWPoisoned on the first tail
page.  It indicates the THP has hwpoisoned subpage(s).  It is set if any
subpage of THP is found hwpoisoned by memory failure and after the
refcount is bumped successfully, then cleared when the THP is freed or
split.

The soft offline path doesn't need this since soft offline handler just
marks a subpage hwpoisoned when the subpage is migrated successfully.
But shmem THP didn't get split then migrated at all.

Link: https://lkml.kernel.org/r/20211020210755.23964-3-shy828301@gmail.com
Fixes: 800d8c63b2 ("shmem: add huge pages support")
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Yang Shi
c7cb42e944 mm: hwpoison: remove the unnecessary THP check
When handling THP hwpoison checked if the THP is in allocation or free
stage since hwpoison may mistreat it as hugetlb page.  After commit
415c64c145 ("mm/memory-failure: split thp earlier in memory error
handling") the problem has been fixed, so this check is no longer
needed.  Remove it.  The side effect of the removal is hwpoison may
report unsplit THP instead of unknown error for shmem THP.  It seems not
like a big deal.

The following patch "mm: filemap: check if THP has hwpoisoned subpage
for PMD page fault" depends on this, which fixes shmem THP with
hwpoisoned subpage(s) are mapped PMD wrongly.  So this patch needs to be
backported to -stable as well.

Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Shakeel Butt
8dcb3060d8 memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT
Commit 5c1f4e690e ("mm/vmalloc: switch to bulk allocator in
__vmalloc_area_node()") switched to bulk page allocator for order 0
allocation backing vmalloc.  However bulk page allocator does not
support __GFP_ACCOUNT allocations and there are several users of
kvmalloc(__GFP_ACCOUNT).

For now make __GFP_ACCOUNT allocations bypass bulk page allocator.  In
future if there is workload that can be significantly improved with the
bulk page allocator with __GFP_ACCCOUNT support, we can revisit the
decision.

Link: https://lkml.kernel.org/r/20211014151607.2171970-1-shakeelb@google.com
Fixes: 5c1f4e690e ("mm/vmalloc: switch to bulk allocator in __vmalloc_area_node()")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:54 -07:00
Linus Torvalds
f25a5481af Merge tag 'libnvdimm-fixes-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fix from Dan Williams:

 - Fix a regression introduced in v5.15-rc6 that caused nvdimm namespace
   shutdown to hang due to reworks in the block layer q_usage_count.

* tag 'libnvdimm-fixes-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm/pmem: stop using q_usage_count as external pgmap refcount
2021-10-28 16:50:25 -07:00
Wolfram Sang
90935eb303 mmc: tmio: reenable card irqs after the reset callback
The reset callback may clear the internal card detect interrupts, so
make sure to reenable them if needed.

Fixes: b4d86f37ea ("mmc: renesas_sdhi: do hard reset if possible")
Reported-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211028195149.8003-1-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-28 23:19:32 +02:00
Linus Torvalds
f31531e554 Merge tag 'drm-fixes-2021-10-29' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Quiet but not too quiet, I blame Halloween.

  The first set of amdgpu fixes missed last week, hence why this has a
  few more of them, it's mostly display fixes for new GPUs and some
  debugfs OOB stuff.

  The i915 patches have one to remove a tracepoint possible issue before
  it's a real problem, the others around cflush and display are cc'ed to
  stable as well.

  Otherwise it's just a few misc fixes.

  Summary:

  MAINTAINERS:
   - Fix the path pattern

  ttm:
   - Fix fence leak in ttm_transfered_destroy.

  core:
   - Add GPD Win3 rotation quirk

  i915:
   - Remove unconditional clflushes
   - Fix oops on boot due to sync state on disabled DP encoders
   - Revert backend specific data added to tracepoints
   - Remove useless and incorrect memory frequence calculation

  panel:
   - Add quirk for Aya Neo 2021

  seltest:
   - Reset property count for each drm damage selftest so full run will
     work correctly.

  amdgpu:
   - Fix two potential out of bounds writes in debugfs
   - Fix revision handling for Yellow Carp
   - Display fixes for Yellow Carp
   - Display fixes for DCN 3.1"

* tag 'drm-fixes-2021-10-29' of git://anongit.freedesktop.org/drm/drm: (21 commits)
  MAINTAINERS: dri-devel is for all of drivers/gpu
  drm/i915: Revert 'guc_id' from i915_request tracepoint
  drm/amd/display: Fix deadlock when falling back to v2 from v3
  drm/amd/display: Fallback to clocks which meet requested voltage on DCN31
  drm/amdgpu: Fix even more out of bound writes from debugfs
  drm: panel-orientation-quirks: Add quirk for GPD Win3
  drm/i915/dp: Skip the HW readout of DPCD on disabled encoders
  drm/i915: Catch yet another unconditioal clflush
  drm/i915: Convert unconditional clflush to drm_clflush_virt_range()
  drm/i915/selftests: Properly reset mock object propers for each test
  drm: panel-orientation-quirks: Add quirk for Aya Neo 2021
  drm/ttm: fix memleak in ttm_transfered_destroy
  drm/amdgpu: support B0&B1 external revision id for yellow carp
  drm/amd/display: Moved dccg init to after bios golden init
  drm/amd/display: Increase watermark latencies for DCN3.1
  drm/amd/display: increase Z9 latency to workaround underflow in Z9
  drm/amd/display: Require immediate flip support for DCN3.1 planes
  drm/amd/display: Fix prefetch bandwidth calculation for DCN3.1
  drm/amd/display: Limit display scaling to up to true 4k for DCN 3.1
  drm/amdgpu: fix out of bounds write
  ...
2021-10-28 12:17:01 -07:00
Daniel Vetter
b112166a89 MAINTAINERS: dri-devel is for all of drivers/gpu
Somehow we only have a list of subdirectories, which apparently made
it harder for folks to find the gpu maintainers. Fix that.

References: https://lore.kernel.org/dri-devel/YXrAAZlxxStNFG%2FK@phenom.ffwll.local/
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211028170857.4029606-1-daniel.vetter@ffwll.ch
2021-10-29 04:49:11 +10:00
Dave Airlie
946ca97e2e Merge tag 'drm-intel-fixes-2021-10-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.15 final:
- Remove unconditional clflushes
- Fix oops on boot due to sync state on disabled DP encoders
- Revert backend specific data added to tracepoints
- Remove useless and incorrect memory frequence calculation

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8735olh27y.fsf@intel.com
2021-10-29 04:46:14 +10:00
Linus Torvalds
411a44c24a Merge tag 'net-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from WiFi (mac80211), and BPF.

  Current release - regressions:

   - skb_expand_head: adjust skb->truesize to fix socket memory
     accounting

   - mptcp: fix corrupt receiver key in MPC + data + checksum

  Previous releases - regressions:

   - multicast: calculate csum of looped-back and forwarded packets

   - cgroup: fix memory leak caused by missing cgroup_bpf_offline

   - cfg80211: fix management registrations locking, prevent list
     corruption

   - cfg80211: correct false positive in bridge/4addr mode check

   - tcp_bpf: fix race in the tcp_bpf_send_verdict resulting in reusing
     previous verdict

  Previous releases - always broken:

   - sctp: enhancements for the verification tag, prevent attackers from
     killing SCTP sessions

   - tipc: fix size validations for the MSG_CRYPTO type

   - mac80211: mesh: fix HE operation element length check, prevent out
     of bound access

   - tls: fix sign of socket errors, prevent positive error codes being
     reported from read()/write()

   - cfg80211: scan: extend RCU protection in
     cfg80211_add_nontrans_list()

   - implement ->sock_is_readable() for UDP and AF_UNIX, fix poll() for
     sockets in a BPF sockmap

   - bpf: fix potential race in tail call compatibility check resulting
     in two operations which would make the map incompatible succeeding

   - bpf: prevent increasing bpf_jit_limit above max

   - bpf: fix error usage of map_fd and fdget() in generic batch update

   - phy: ethtool: lock the phy for consistency of results

   - prevent infinite while loop in skb_tx_hash() when Tx races with
     driver reconfiguring the queue <> traffic class mapping

   - usbnet: fixes for bad HW conjured by syzbot

   - xen: stop tx queues during live migration, prevent UAF

   - net-sysfs: initialize uid and gid before calling
     net_ns_get_ownership

   - mlxsw: prevent Rx stalls under memory pressure"

* tag 'net-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  Revert "net: hns3: fix pause config problem after autoneg disabled"
  mptcp: fix corrupt receiver key in MPC + data + checksum
  riscv, bpf: Fix potential NULL dereference
  octeontx2-af: Fix possible null pointer dereference.
  octeontx2-af: Display all enabled PF VF rsrc_alloc entries.
  octeontx2-af: Check whether ipolicers exists
  net: ethernet: microchip: lan743x: Fix skb allocation failure
  net/tls: Fix flipped sign in async_wait.err assignment
  net/tls: Fix flipped sign in tls_err_abort() calls
  net/smc: Correct spelling mistake to TCPF_SYN_RECV
  net/smc: Fix smc_link->llc_testlink_time overflow
  nfp: bpf: relax prog rejection for mtu check through max_pkt_offset
  vmxnet3: do not stop tx queues after netif_device_detach()
  r8169: Add device 10ec:8162 to driver r8169
  ptp: Document the PTP_CLK_MAGIC ioctl number
  usbnet: fix error return code in usbnet_probe()
  net: hns3: adjust string spaces of some parameters of tx bd info in debugfs
  net: hns3: expand buffer len for some debugfs command
  net: hns3: add more string spaces for dumping packets number of queue info in debugfs
  net: hns3: fix data endian problem of some functions of debugfs
  ...
2021-10-28 10:17:31 -07:00
Linus Torvalds
4fb7d85b2e Merge tag 'spi-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A couple of final driver specific fixes for v5.15, one fixing
  potential ID collisions between two instances of the Altera driver and
  one making Microwire full duplex mode actually work on pl022"

* tag 'spi-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spl022: fix Microwire full duplex mode
  spi: altera: Change to dynamic allocation of spi id
2021-10-28 10:04:39 -07:00
Linus Torvalds
8685de2ed8 Merge tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
 "This fixes a potential double free when handling an out of memory
  error inserting a node into an rbtree regcache"

* tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: Fix possible double-free in regcache_rbtree_exit()
2021-10-28 10:00:58 -07:00
Linus Torvalds
eecd231a80 Merge tag 'linux-watchdog-5.15-rc7' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fixes from Wim Van Sebroeck:
 "I overlooked Guenters request to sent this upstream earlier, so it's a
  bit late in the release cycle.

  This contains:

   - Revert "watchdog: iTCO_wdt: Account for rebooting on second
     timeout"

   - sbsa: only use 32-bit accessors

   - sbsa: drop unneeded MODULE_ALIAS

   - ixp4xx_wdt: Fix address space warning

   - Fix OMAP watchdog early handling"

* tag 'linux-watchdog-5.15-rc7' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: Fix OMAP watchdog early handling
  watchdog: ixp4xx_wdt: Fix address space warning
  watchdog: sbsa: drop unneeded MODULE_ALIAS
  watchdog: sbsa: only use 32-bit accessors
  Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout"
2021-10-28 09:55:25 -07:00
Linus Torvalds
fc18cc89b9 Merge tag 'trace-v5.15-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
 "Do not WARN when attaching event probe to non-existent event

  If the user tries to attach an event probe (eprobe) to an event that
  does not exist, it will trigger a warning. There's an error check that
  only expects memory issues otherwise it is considered a bug. But
  changes in the code to move around the locking made it that it can
  error out if the user attempts to attach to an event that does not
  exist, returning an -ENODEV. As this path can be caused by user space
  putting in a bad value, do not trigger a WARN"

* tag 'trace-v5.15-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Do not warn when connecting eprobe to non existing event
2021-10-28 09:50:56 -07:00
Guangbin Huang
35392da51b Revert "net: hns3: fix pause config problem after autoneg disabled"
This reverts commit 3bda2e5df4.

According to discussion with Andrew as follow:
https://lore.kernel.org/netdev/09eda9fe-196b-006b-6f01-f54e75715961@huawei.com/

HNS3 driver needs to separate pause autoneg from general autoneg, so revert
this incorrect patch.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Link: https://lore.kernel.org/r/20211028140624.53149-1-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 08:23:03 -07:00
Davide Caratti
f7cc8890f3 mptcp: fix corrupt receiver key in MPC + data + checksum
using packetdrill it's possible to observe that the receiver key contains
random values when clients transmit MP_CAPABLE with data and checksum (as
specified in RFC8684 §3.1). Fix the layout of mptcp_out_options, to avoid
using the skb extension copy when writing the MP_CAPABLE sub-option.

Fixes: d7b2690837 ("mptcp: shrink mptcp_out_options struct")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/233
Reported-by: Poorva Sonparote <psonparo@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/20211027203855.264600-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 08:19:06 -07:00
Björn Töpel
27de809a3d riscv, bpf: Fix potential NULL dereference
The bpf_jit_binary_free() function requires a non-NULL argument. When
the RISC-V BPF JIT fails to converge in NR_JIT_ITERATIONS steps,
jit_data->header will be NULL, which triggers a NULL
dereference. Avoid this by checking the argument, prior calling the
function.

Fixes: ca6cb5447c ("riscv, bpf: Factor common RISC-V JIT code")
Signed-off-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20211028125115.514587-1-bjorn@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 08:14:30 -07:00
David Woodhouse
f3d1436d4b KVM: x86: Take srcu lock in post_kvm_run_save()
The Xen interrupt injection for event channels relies on accessing the
guest's vcpu_info structure in __kvm_xen_has_interrupt(), through a
gfn_to_hva_cache.

This requires the srcu lock to be held, which is mostly the case except
for this code path:

[   11.822877] WARNING: suspicious RCU usage
[   11.822965] -----------------------------
[   11.823013] include/linux/kvm_host.h:664 suspicious rcu_dereference_check() usage!
[   11.823131]
[   11.823131] other info that might help us debug this:
[   11.823131]
[   11.823196]
[   11.823196] rcu_scheduler_active = 2, debug_locks = 1
[   11.823253] 1 lock held by dom:0/90:
[   11.823292]  #0: ffff998956ec8118 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680
[   11.823379]
[   11.823379] stack backtrace:
[   11.823428] CPU: 2 PID: 90 Comm: dom:0 Kdump: loaded Not tainted 5.4.34+ #5
[   11.823496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[   11.823612] Call Trace:
[   11.823645]  dump_stack+0x7a/0xa5
[   11.823681]  lockdep_rcu_suspicious+0xc5/0x100
[   11.823726]  __kvm_xen_has_interrupt+0x179/0x190
[   11.823773]  kvm_cpu_has_extint+0x6d/0x90
[   11.823813]  kvm_cpu_accept_dm_intr+0xd/0x40
[   11.823853]  kvm_vcpu_ready_for_interrupt_injection+0x20/0x30
              < post_kvm_run_save() inlined here >
[   11.823906]  kvm_arch_vcpu_ioctl_run+0x135/0x6a0
[   11.823947]  kvm_vcpu_ioctl+0x263/0x680

Fixes: 40da8ccd72 ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: stable@vger.kernel.org
Message-Id: <606aaaf29fca3850a63aa4499826104e77a72346.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-28 10:45:38 -04:00
Jens Axboe
f4aaf1fa8b Merge tag 'nvme-5.15-2021-10-28' of git://git.infradead.org/nvme into block-5.15
Pull NVMe fixes from Christoph:

"nvme fixe for Linux 5.15

 - fix nvmet-tcp header digest verification (Amit Engel)
 - fix a memory leak in nvmet-tcp when releasing a queue
   (Maurizio Lombardi)
 - fix nvme-tcp H2CData PDU send accounting again (Sagi Grimberg)
 - fix digest pointer calculation in nvme-tcp and nvmet-tcp
   (Varun Prakash)
 - fix possible nvme-tcp req->offset corruption (Varun Prakash)"

* tag 'nvme-5.15-2021-10-28' of git://git.infradead.org/nvme:
  nvmet-tcp: fix header digest verification
  nvmet-tcp: fix data digest pointer calculation
  nvme-tcp: fix data digest pointer calculation
  nvme-tcp: fix possible req->offset corruption
  nvme-tcp: fix H2CData PDU send accounting (again)
  nvmet-tcp: fix a memory leak when releasing a queue
2021-10-28 08:34:01 -06:00
David S. Miller
20af8864a3 Merge branch 'octeontx2-debugfs-fixes'
Rakesh Babu Saladi says:

====================
RVU Debugfs fix updates.

The following patch series consists of the patch fixes done over
rvu_debugfs.c and rvu_nix.c files.

Patch 1: Check and return if ipolicers do not exists.
Patch 2: Fix rsrc_alloc to print all enabled PF/VF entries with list of LFs
allocated for each functional block.
Patch 3: Fix possible null pointer dereference.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 14:47:37 +01:00
Rakesh Babu Saladi
c2d4c543f7 octeontx2-af: Fix possible null pointer dereference.
This patch fixes possible null pointer dereference in files
"rvu_debugfs.c" and "rvu_nix.c"

Fixes: 8756828a81 ("octeontx2-af: Add NPA aura and pool contexts to debugfs")
Fixes: 9a946def26 ("octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries")
Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 14:47:37 +01:00
Rakesh Babu
e77bcdd1f6 octeontx2-af: Display all enabled PF VF rsrc_alloc entries.
Currently, we are using a fixed buffer size of length 2048 to display
rsrc_alloc output. As a result a maximum of 2048 characters of
rsrc_alloc output is displayed, which may lead sometimes to display only
partial output. This patch fixes this dependency on max limit of buffer
size and displays all PF VF entries.

Each column of the debugfs entry "rsrc_alloc" uses a fixed width of 12
characters to print the list of LFs of each block for a PF/VF. If the
length of list of LFs of a block exceeds this fixed width then the list
gets truncated and displays only a part of the list. This patch fixes
this by using the maximum possible length of list of LFs among all
blocks of all PFs and VFs entries as the width size.

Fixes: f788409714 ("octeontx2-af: Formatting debugfs entry rsrc_alloc.")
Fixes: 23205e6d06 ("octeontx2-af: Dump current resource provisioning status")
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <Sunil.Goutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 14:47:37 +01:00
Subbaraya Sundeep
cc45b96e2d octeontx2-af: Check whether ipolicers exists
While displaying ingress policers information in
debugfs check whether ingress policers exist in
the hardware or not because some platforms(CN9XXX)
do not have this feature.

Fixes: e7d8971763 ("octeontx2-af: cn10k: Debugfs support for bandwidth")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 14:47:36 +01:00
Yuiko Oshino
e8684db191 net: ethernet: microchip: lan743x: Fix skb allocation failure
The driver allocates skb during ndo_open with GFP_ATOMIC which has high chance of failure when there are multiple instances.
GFP_KERNEL is enough while open and use GFP_ATOMIC only from interrupt context.

Fixes: 23f0703c12 ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 14:45:07 +01:00
Daniel Jordan
1d9d6fd21a net/tls: Fix flipped sign in async_wait.err assignment
sk->sk_err contains a positive number, yet async_wait.err wants the
opposite.  Fix the missed sign flip, which Jakub caught by inspection.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 14:41:20 +01:00
Daniel Jordan
da353fac65 net/tls: Fix flipped sign in tls_err_abort() calls
sk->sk_err appears to expect a positive value, a convention that ktls
doesn't always follow and that leads to memory corruption in other code.
For instance,

    [kworker]
    tls_encrypt_done(..., err=<negative error from crypto request>)
      tls_err_abort(.., err)
        sk->sk_err = err;

    [task]
    splice_from_pipe_feed
      ...
        tls_sw_do_sendpage
          if (sk->sk_err) {
            ret = -sk->sk_err;  // ret is positive

    splice_from_pipe_feed (continued)
      ret = actor(...)  // ret is still positive and interpreted as bytes
                        // written, resulting in underflow of buf->len and
                        // sd->len, leading to huge buf->offset and bogus
                        // addresses computed in later calls to actor()

Fix all tls_err_abort() callers to pass a negative error code
consistently and centralize the error-prone sign flip there, throwing in
a warning to catch future misuse and uninlining the function so it
really does only warn once.

Cc: stable@vger.kernel.org
Fixes: c46234ebb4 ("tls: RX path for ktls")
Reported-by: syzbot+b187b77c8474f9648fae@syzkaller.appspotmail.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 14:41:20 +01:00
David S. Miller
a32f07d211 Merge branch 'SMC-fixes'
Tony Lu says:

====================
Fixes for SMC

There are some fixes for SMC.

v1->v2:
- fix wrong email address.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 13:04:29 +01:00
Wen Gu
f3a3a0fe0b net/smc: Correct spelling mistake to TCPF_SYN_RECV
There should use TCPF_SYN_RECV instead of TCP_SYN_RECV.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 13:04:28 +01:00
Tony Lu
c4a146c7cf net/smc: Fix smc_link->llc_testlink_time overflow
The value of llc_testlink_time is set to the value stored in
net->ipv4.sysctl_tcp_keepalive_time when linkgroup init. The value of
sysctl_tcp_keepalive_time is already jiffies, so we don't need to
multiply by HZ, which would cause smc_link->llc_testlink_time overflow,
and test_link send flood.

Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 13:04:28 +01:00
Yu Xiao
90a881fc35 nfp: bpf: relax prog rejection for mtu check through max_pkt_offset
MTU change is refused whenever the value of new MTU is bigger than
the max packet bytes that fits in NFP Cluster Target Memory (CTM).
However, an eBPF program doesn't always need to access the whole
packet data.

The maximum direct packet access (DPA) offset has always been
caculated by verifier and stored in the max_pkt_offset field of prog
aux data.

Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Reviewed-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Niklas Soderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 12:59:32 +01:00
Dongli Zhang
9159f10240 vmxnet3: do not stop tx queues after netif_device_detach()
The netif_device_detach() conditionally stops all tx queues if the queues
are running. There is no need to call netif_tx_stop_all_queues() again.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 12:51:17 +01:00
Wenbin Mei
e8a1ff6592 mmc: mediatek: Move cqhci init behind ungate clock
We must enable clock before cqhci init, because crypto needs read
information from CQHCI registers, otherwise, it will hang in MediaTek mmc
host controller.

Signed-off-by: Wenbin Mei <wenbin.mei@mediatek.com>
Fixes: 88bd652b3c ("mmc: mediatek: command queue support")
Cc: stable@vger.kernel.org
Acked-by: Chaotian Jing <chaotian.jing@mediatek.com>
Link: https://lore.kernel.org/r/20211028022049.22129-1-wenbin.mei@mediatek.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-28 11:55:20 +02:00
Joonas Lahtinen
9a4aa3a2f1 drm/i915: Revert 'guc_id' from i915_request tracepoint
Avoid adding backend specific data to the tracepoints outside of
the LOW_LEVEL_TRACEPOINTS kernel config protection. These bits of
information are bound to change depending on the selected submission
method per platform and are not necessarily possible to maintain in
the future.

Fixes: dbf9da8d55 ("drm/i915/guc: Add trace point for GuC submit")
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211027093255.66489-1-joonas.lahtinen@linux.intel.com
(cherry picked from commit 64512a66b6)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2021-10-28 11:45:11 +03:00
Dave Airlie
79516af349 Merge tag 'drm-misc-fixes-2021-10-28' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
One patch to fix the default screen orientation on the GPD Win3

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211028072300.b4gqexq6zfhby24g@gilmour
2021-10-28 17:27:14 +10:00
Dave Airlie
19928833e8 Merge tag 'drm-misc-fixes-2021-10-26' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.15-rc8:
- Fix fence leak in ttm_transfered_destroy.
- Add quirk for Aya Neo 2021
- Reset property count for each drm damage selftest so full run will work correctly.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4a133970-ff4b-aa62-d346-b269b1b9236e@linux.intel.com
2021-10-28 15:22:20 +10:00
Dave Airlie
03424d380b Merge tag 'amd-drm-fixes-5.15-2021-10-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.15-2021-10-27:

amdgpu:
- Display fixes for DCN 3.1
- Fix potential out of bounds write in debugfs

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211028023130.4528-1-alexander.deucher@amd.com
2021-10-28 15:21:50 +10:00
Nicholas Kazlauskas
ad76744b04 drm/amd/display: Fix deadlock when falling back to v2 from v3
[Why]
A deadlock in the kernel occurs when we fallback from the V3 to V2
add_topology_to_display or remove_topology_to_display because they
both try to acquire the dtm_mutex but recursive locking isn't
supported on mutex_lock().

[How]
Make the mutex_lock/unlock more fine grained and move them up such that
they're only required for the psp invocation itself.

Fixes: bf62221e9d ("drm/amd/display: Add DCN3.1 HDCP support")

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-27 22:04:50 -04:00
Michael Strauss
54149d13f3 drm/amd/display: Fallback to clocks which meet requested voltage on DCN31
[WHY]
On certain configs, SMU clock table voltages don't match which cause parser
to behave incorrectly by leaving dcfclk and socclk table entries unpopulated.

[HOW]
Currently the function that finds the corresponding clock for a given voltage
only checks for exact voltage level matches. In the case that no match gets
found, parser now falls back to searching for the max clock which meets the
requested voltage (i.e. its corresponding voltage is below requested).

Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-27 22:04:32 -04:00
Patrik Jakobsson
3f4e54bd31 drm/amdgpu: Fix even more out of bound writes from debugfs
CVE-2021-42327 was fixed by:

commit f23750b5b3
Author: Thelford Williams <tdwilliamsiv@gmail.com>
Date:   Wed Oct 13 16:04:13 2021 -0400

    drm/amdgpu: fix out of bounds write

but amdgpu_dm_debugfs.c contains more of the same issue so fix the
remaining ones.

v2:
	* Add missing fix in dp_max_bpc_write (Harry Wentland)

Fixes: 918698d5c2 ("drm/amd/display: Return the number of bytes parsed than allocated")
Signed-off-by: Patrik Jakobsson <pjakobsson@suse.de>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-27 22:02:28 -04:00
Steven Rostedt (VMware)
7fa598f970 tracing: Do not warn when connecting eprobe to non existing event
When the syscall trace points are not configured in, the kselftests for
ftrace will try to attach an event probe (eprobe) to one of the system
call trace points. This triggered a WARNING, because the failure only
expects to see memory issues. But this is not the only failure. The user
may attempt to attach to a non existent event, and the kernel must not
warn about it.

Link: https://lkml.kernel.org/r/20211027120854.0680aa0f@gandalf.local.home

Fixes: 7491e2c442 ("tracing: Add a probe that attaches to trace events")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-27 21:47:55 -04:00
Janghyub Seo
72f898ca0a r8169: Add device 10ec:8162 to driver r8169
This patch makes the driver r8169 pick up device Realtek Semiconductor Co.
, Ltd. Device [10ec:8162].

Signed-off-by: Janghyub Seo <jhyub06@gmail.com>
Suggested-by: Rushab Shah <rushabshah32@gmail.com>
Link: https://lore.kernel.org/r/1635231849296.1489250046.441294000@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-27 17:07:28 -07:00
Randy Dunlap
f821615167 ptp: Document the PTP_CLK_MAGIC ioctl number
Add PTP_CLK_MAGIC to the userspace-api/ioctl/ioctl-number.rst
documentation file.

Fixes: d94ba80ebb ("ptp: Added a brand new class driver for ptp clocks.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20211024163831.10200-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-27 17:02:51 -07:00
Linus Torvalds
9c5456773d Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
 "A couple of fixes that seem important enough to pick at the last
  moment"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-ring: fix DMA metadata flags
  vduse: Fix race condition between resetting and irq injecting
  vduse: Disallow injecting interrupt before DRIVER_OK is set
2021-10-27 13:15:05 -07:00
Chen Lu
64a19591a2 riscv: fix misalgned trap vector base address
The trap vector marked by label .Lsecondary_park must align on a
4-byte boundary, as the {m,s}tvec is defined to require 4-byte
alignment.

Signed-off-by: Chen Lu <181250012@smail.nju.edu.cn>
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Fixes: e011995e82 ("RISC-V: Move relocate and few other functions out of __init")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-27 13:08:01 -07:00
Vincent Whitchurch
890d335613 virtio-ring: fix DMA metadata flags
The flags are currently overwritten, leading to the wrong direction
being passed to the DMA unmap functions.

Fixes: 72b5e89587 ("virtio-ring: store DMA metadata in desc_extra for split virtqueue")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/20211026133100.17541-1-vincent.whitchurch@axis.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-10-27 15:54:34 -04:00
Wang Hai
6f7c886911 usbnet: fix error return code in usbnet_probe()
Return error code if usb_maxpacket() returns 0 in usbnet_probe()

Fixes: 397430b50a ("usbnet: sanity check for maxpacket")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211026124015.3025136-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-27 12:06:15 -07:00
Linus Torvalds
1fc596a56b Merge tag 'trace-v5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull nds32 tracing fix from Steven Rostedt:
 "Fix nds32le build when DYNAMIC_FTRACE is disabled

  A randconfig found that nds32le architecture fails to build due to a
  prototype mismatch between a ftrace function pointer and the function
  it was to be assigned to. That function pointer prototype missed being
  updated when all the ftrace callbacks were updated"

* tag 'trace-v5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace/nds32: Update the proto for ftrace_trace_function to match ftrace_stub
2021-10-27 10:41:59 -07:00
Linus Torvalds
646b0de5fe Merge tag 'nios2_fixes_for_v5.15_part3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux
Pull nios2 fix from Dinh Nguyen:
 "Fix a build error for allmodconfig"

* tag 'nios2_fixes_for_v5.15_part3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST
2021-10-27 10:19:43 -07:00
Linus Torvalds
ab2aa486f4 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Nothing very exciting here, it has been a quiet cycle overall. Usual
  collection of small bug fixes:

   - irdma issues with CQ entries, VLAN completions and a mutex deadlock

   - Incorrect DCT packets in mlx5

   - Userspace triggered overflows in qib

   - Locking error in hfi

   - Typo in errno value in qib/hfi1

   - Double free in qedr

   - Leak of random kernel memory to userspace with a netlink callback"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
  RDMA/irdma: Do not hold qos mutex twice on QP resume
  RDMA/irdma: Set VLAN in UD work completion correctly
  RDMA/mlx5: Initialize the ODP xarray when creating an ODP MR
  rdma/qedr: Fix crash due to redundant release of device's qp memory
  RDMA/rdmavt: Fix error code in rvt_create_qp()
  IB/hfi1: Fix abba locking issue with sc_disable()
  IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
  RDMA/mlx5: Set user priority for DCT
  RDMA/irdma: Process extended CQ entries correctly
2021-10-27 10:01:17 -07:00
Steven Rostedt (VMware)
4e84dc47bb ftrace/nds32: Update the proto for ftrace_trace_function to match ftrace_stub
The ftrace callback prototype was changed to pass a special ftrace_regs
instead of pt_regs as the last parameter, but the static ftrace for nds32
missed updating ftrace_trace_function and this caused a warning when
compared to ftrace_stub:

../arch/nds32/kernel/ftrace.c: In function '_mcount':
../arch/nds32/kernel/ftrace.c:24:35: error: comparison of distinct pointer types lacks a cast [-Werror]
   24 |         if (ftrace_trace_function != ftrace_stub)
      |                                   ^~

Link: https://lore.kernel.org/all/20211027055554.19372-1-rdunlap@infradead.org/
Link: https://lkml.kernel.org/r/20211027125101.33449969@gandalf.local.home

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Fixes: d19ad0775d ("ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-27 13:00:17 -04:00
Jakub Kicinski
afe8ca110c Merge tag 'mac80211-for-net-2021-10-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
Two fixes:
 * bridge vs. 4-addr mode check was wrong
 * management frame registrations locking was
   wrong, causing list corruption/crashes
====================

Link: https://lore.kernel.org/r/20211027143756.91711-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-27 08:13:15 -07:00
Paolo Bonzini
9b0971ca7f KVM: SEV-ES: fix another issue with string I/O VMGEXITs
If the guest requests string I/O from the hypervisor via VMGEXIT,
SW_EXITINFO2 will contain the REP count.  However, sev_es_string_io
was incorrectly treating it as the size of the GHCB buffer in
bytes.

This fixes the "outsw" test in the experimental SEV tests of
kvm-unit-tests.

Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reported-by: Marc Orr <marcorr@google.com>
Tested-by: Marc Orr <marcorr@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-27 10:58:26 -04:00
Guenter Roeck
4a089e95b4 nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST
nios2:allmodconfig builds fail with

make[1]: *** No rule to make target 'arch/nios2/boot/dts/""',
	needed by 'arch/nios2/boot/dts/built-in.a'.  Stop.
make: [Makefile:1868: arch/nios2/boot/dts] Error 2 (ignored)

This is seen with compile tests since those enable NIOS2_DTB_SOURCE_BOOL,
which in turn enables NIOS2_DTB_SOURCE. This causes the build error
because the default value for NIOS2_DTB_SOURCE is an empty string.
Disable NIOS2_DTB_SOURCE_BOOL for compile tests to avoid the error.

Fixes: 2fc8483fdc ("nios2: Build infrastructure")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2021-10-27 09:29:07 -05:00
David S. Miller
424a4f52c5 Merge branch 'hns3-fixes'
Guangbin Huang says:

====================
net: hns3: add some fixes for -net

This series adds some fixes for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-27 14:47:34 +01:00
Guangbin Huang
630a6738da net: hns3: adjust string spaces of some parameters of tx bd info in debugfs
This patch adjusts the string spaces of some parameters of tx bd info in
debugfs according to their maximum needs.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-27 14:47:33 +01:00
Guangbin Huang
c7a6e3978e net: hns3: expand buffer len for some debugfs command
The specified buffer length for three debugfs files fd_tcam, uc and tqp
is not enough for their maximum needs, so this patch fixes them.

Fixes: b5a0b70d77 ("net: hns3: refactor dump fd tcam of debugfs")
Fixes: 1556ea9120 ("net: hns3: refactor dump mac list of debugfs")
Fixes: d96b0e5946 ("net: hns3: refactor dump reg of debugfs")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-27 14:47:33 +01:00
Jie Wang
6754614a78 net: hns3: add more string spaces for dumping packets number of queue info in debugfs
As the width of packets number registers is 32 bits, they needs at most
10 characters for decimal data printing, but now the string spaces is not
enough, so this patch fixes it.

Fixes: e44c495d95 ("net: hns3: refactor queue info of debugfs")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-27 14:47:33 +01:00
Jie Wang
2a21dab594 net: hns3: fix data endian problem of some functions of debugfs
The member data in struct hclge_desc is type of __le32, it needs endian
conversion before using it, and some functions of debugfs didn't do that,
so this patch fixes it.

Fixes: c0ebebb9cc ("net: hns3: Add "dcb register" status information query function")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-27 14:47:33 +01:00
Guangbin Huang
0251d196b0 net: hns3: ignore reset event before initialization process is done
Currently, if there is a reset event triggered by RAS during device in
initialization process, driver may run reset process concurrently with
initialization process. In this case, it may cause problem. For example,
the RSS indirection table may has not been alloc memory in initialization
process yet, but it is used in reset process, it will cause a call trace
like this:

[61228.744836] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
[61228.897677] Workqueue: hclgevf hclgevf_service_task [hclgevf]
[61228.911390] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
[61228.918670] pc : hclgevf_set_rss_indir_table+0xb4/0x190 [hclgevf]
[61228.927812] lr : hclgevf_set_rss_indir_table+0x90/0x190 [hclgevf]
[61228.937248] sp : ffff8000162ebb50
[61228.941087] x29: ffff8000162ebb50 x28: ffffb77add72dbc0 x27: ffff0820c7dc8080
[61228.949516] x26: 0000000000000000 x25: ffff0820ad4fc880 x24: ffff0820c7dc8080
[61228.958220] x23: ffff0820c7dc8090 x22: 00000000ffffffff x21: 0000000000000040
[61228.966360] x20: ffffb77add72b9c0 x19: 0000000000000000 x18: 0000000000000030
[61228.974646] x17: 0000000000000000 x16: ffffb77ae713feb0 x15: ffff0820ad4fcce8
[61228.982808] x14: ffffffffffffffff x13: ffff8000962eb7f7 x12: 00003834ec70c960
[61228.991990] x11: 00e0fafa8c206982 x10: 9670facc78a8f9a8 x9 : ffffb77add717530
[61229.001123] x8 : ffff0820ad4fd6b8 x7 : 0000000000000000 x6 : 0000000000000011
[61229.010249] x5 : 00000000000cb1b0 x4 : 0000000000002adb x3 : 0000000000000049
[61229.018662] x2 : ffff8000162ebbb8 x1 : 0000000000000000 x0 : 0000000000000480
[61229.027002] Call trace:
[61229.030177]  hclgevf_set_rss_indir_table+0xb4/0x190 [hclgevf]
[61229.039009]  hclgevf_rss_init_hw+0x128/0x1b4 [hclgevf]
[61229.046809]  hclgevf_reset_rebuild+0x17c/0x69c [hclgevf]
[61229.053862]  hclgevf_reset_service_task+0x4cc/0xa80 [hclgevf]
[61229.061306]  hclgevf_service_task+0x6c/0x630 [hclgevf]
[61229.068491]  process_one_work+0x1dc/0x48c
[61229.074121]  worker_thread+0x15c/0x464
[61229.078562]  kthread+0x168/0x16c
[61229.082873]  ret_from_fork+0x10/0x18
[61229.088221] Code: 7900e7f6 f904a683 d503201f 9101a3e2 (38616b43)
[61229.095357] ---[ end trace 153661a538f6768c ]---

To fix this problem, don't schedule reset task before initialization
process is done.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-27 14:47:33 +01:00
Yufeng Mo
f29da4088f net: hns3: change hclge/hclgevf workqueue to WQ_UNBOUND mode
Currently, the workqueue of hclge/hclgevf is executed on
the CPU that initiates scheduling requests by default. In
stress scenarios, the CPU may be busy and workqueue scheduling
is completed after a long period of time. To avoid this
situation and implement proper scheduling, use the WQ_UNBOUND
mode instead. In this way, the workqueue can be performed on
a relatively idle CPU.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-27 14:47:33 +01:00
Guangbin Huang
3bda2e5df4 net: hns3: fix pause config problem after autoneg disabled
If a TP port is configured by follow steps:
1.ethtool -s ethx autoneg off speed 100 duplex full
2.ethtool -A ethx rx on tx on
3.ethtool -s ethx autoneg on(rx&tx negotiated pause results are off)
4.ethtool -s ethx autoneg off speed 100 duplex full

In step 3, driver will set rx&tx pause parameters of hardware to off as
pause parameters negotiated with link partner are off.

After step 4, the "ethtool -a ethx" command shows both rx and tx pause
parameters are on. However, pause parameters of hardware are still off
and port has no flow control function actually.

To fix this problem, if autoneg is disabled, driver uses its saved
parameters to restore pause of hardware. If the speed is not changed in
this case, there is no link state changed for phy, it will cause the pause
parameter is not taken effect, so we need to force phy to go down and up.

Fixes: aacbe27e82 ("net: hns3: modify how pause options is displayed")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-27 14:47:33 +01:00
Shin'ichiro Kawasaki
e0c60d0102 block: Fix partition check for host-aware zoned block devices
Commit a33df75c63 ("block: use an xarray for disk->part_tbl") modified
the method to check partition existence in host-aware zoned block
devices from disk_has_partitions() helper function call to empty check
of xarray disk->part_tbl. However, disk->part_tbl always has single
entry for disk->part0 and never becomes empty. This resulted in the
host-aware zoned devices always judged to have partitions, and it made
the sysfs queue/zoned attribute to be "none" instead of "host-aware"
regardless of partition existence in the devices.

This also caused DEBUG_LOCKS_WARN_ON(lock->magic != lock) for
sdkp->rev_mutex in scsi layer when the kernel detects host-aware zoned
device. Since block layer handled the host-aware zoned devices as non-
zoned devices, scsi layer did not have chance to initialize the mutex
for zone revalidation. Therefore, the warning was triggered.

To fix the issues, call the helper function disk_has_partitions() in
place of disk->part_tbl empty check. Since the function was removed with
the commit a33df75c63, reimplement it to walk through entries in the
xarray disk->part_tbl.

Fixes: a33df75c63 ("block: use an xarray for disk->part_tbl")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org # v5.14+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211026060115.753746-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-27 06:58:01 -06:00
David Sterba
3a60f6537c Revert "btrfs: compression: drop kmap/kunmap from generic helpers"
This reverts commit 4c2bf276b5.

The kmaps in compression code are still needed and cause crashes on
32bit machines (ARM, x86). Reproducible eg. by running fstest btrfs/004
with enabled LZO or ZSTD compression.

Link: https://lore.kernel.org/all/CAJCQCtT+OuemovPO7GZk8Y8=qtOObr0XTDp8jh4OHD6y84AFxw@mail.gmail.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214839
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-27 10:39:03 +02:00
Amit Engel
86aeda32b8 nvmet-tcp: fix header digest verification
Pass the correct length to nvmet_tcp_verify_hdgst, which is the pdu
header length.  This fixes a wrong behaviour where header digest
verification passes although the digest is wrong.

Signed-off-by: Amit Engel <amit.engel@dell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-27 09:20:50 +02:00
Varun Prakash
e790de54e9 nvmet-tcp: fix data digest pointer calculation
exp_ddgst is of type __le32, &cmd->exp_ddgst + cmd->offset increases
&cmd->exp_ddgst by 4 * cmd->offset, fix this by type casting
&cmd->exp_ddgst to u8 *.

Fixes: 872d26a391 ("nvmet-tcp: add NVMe over TCP target driver")
Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-27 07:58:26 +02:00
Varun Prakash
d89b9f3bbb nvme-tcp: fix data digest pointer calculation
ddgst is of type __le32, &req->ddgst + req->offset
increases &req->ddgst by 4 * req->offset, fix this by
type casting &req->ddgst to u8 *.

Fixes: 3f2304f8c6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-27 07:58:26 +02:00
Varun Prakash
ce7723e9cd nvme-tcp: fix possible req->offset corruption
With commit db5ad6b7f8 ("nvme-tcp: try to send request in queue_rq
context") r2t and response PDU can get processed while send function
is executing.

Current data digest send code uses req->offset after kernel_sendmsg(),
this creates a race condition where req->offset gets reset before it
is used in send function.

This can happen in two cases -
1. Target sends r2t PDU which resets req->offset.
2. Target send response PDU which completes the req and then req is
   used for a new command, nvme_tcp_setup_cmd_pdu() resets req->offset.

Fix this by storing req->offset in a local variable and using
this local variable after kernel_sendmsg().

Fixes: db5ad6b7f8 ("nvme-tcp: try to send request in queue_rq context")
Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-27 07:58:26 +02:00
Dave Airlie
defbbcd99f Merge tag 'amd-drm-fixes-5.15-2021-10-21' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.15-2021-10-21:

amdgpu:
- Fix a potential out of bounds write in debugfs
- Fix revision handling for Yellow Carp
- Display fixes for Yellow Carp

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211021203430.4578-1-alexander.deucher@amd.com
2021-10-27 10:01:21 +10:00
Linus Torvalds
d25f27432f Merge tag 'arm-soc-fixes-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "One last set of small fixes for the soc tree:

   - Incorrect ethernet phy settings found on i.mx and allwinner
     platforms

   - a revert for a Qualcomm DT change that caused a boot regression

   - four patches for incorrect settings in i.MX DT files

   - new MAINTAINER file entries for dhcom boards

   - a Kconfig fix for a reset driver that became unselectable

   - three more code changes for bugs in reset drivers"

* tag 'arm-soc-fixes-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  MAINTAINERS: Add maintainers for DHCOM i.MX6 and DHCOM/DHCOR STM32MP1
  Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target"
  arm64: dts: imx8mm-kontron: Fix connection type for VSC8531 RGMII PHY
  arm64: dts: imx8mm-kontron: Fix CAN SPI clock frequency
  arm64: dts: imx8mm-kontron: Fix polarity of reg_rst_eth2
  arm64: dts: imx8mm-kontron: Set lower limit of VDD_SNVS to 800 mV
  arm64: dts: imx8mm-kontron: Make sure SOC and DRAM supply voltages are correct
  reset: socfpga: add empty driver allowing consumers to probe
  reset: tegra-bpmp: Handle errors in BPMP response
  reset: pistachio: Re-enable driver selection
  reset: brcmstb-rescal: fix incorrect polarity of status bit
  ARM: dts: sun7i: A20-olinuxino-lime2: Fix ethernet phy-mode
  arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node
2021-10-26 15:24:33 -07:00
Naohiro Aota
9586e67b91 block: schedule queue restart after BLK_STS_ZONE_RESOURCE
When dispatching a zone append write request to a SCSI zoned block device,
if the target zone of the request is already locked, the device driver will
return BLK_STS_ZONE_RESOURCE and the request will be pushed back to the
hctx dipatch queue. The queue will be marked as RESTART in
dd_finish_request() and restarted in __blk_mq_free_request(). However, this
restart applies to the hctx of the completed request. If the requeued
request is on a different hctx, dispatch will no be retried until another
request is submitted or the next periodic queue run triggers, leading to up
to 30 seconds latency for the requeued request.

Fix this problem by scheduling a queue restart similarly to the
BLK_STS_RESOURCE case or when we cannot get the budget.

Also, consolidate the checks into the "need_resource" variable to simplify
the condition.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Niklas Cassel <Niklas.Cassel@wdc.com>
Link: https://lore.kernel.org/r/20211026165127.4151055-1-naohiro.aota@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-26 16:00:36 -06:00
Jakub Kicinski
440ffcdd9d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-10-26

We've added 12 non-merge commits during the last 7 day(s) which contain
a total of 23 files changed, 118 insertions(+), 98 deletions(-).

The main changes are:

1) Fix potential race window in BPF tail call compatibility check, from Toke Høiland-Jørgensen.

2) Fix memory leak in cgroup fs due to missing cgroup_bpf_offline(), from Quanyang Wang.

3) Fix file descriptor reference counting in generic_map_update_batch(), from Xu Kuohai.

4) Fix bpf_jit_limit knob to the max supported limit by the arch's JIT, from Lorenz Bauer.

5) Fix BPF sockmap ->poll callbacks for UDP and AF_UNIX sockets, from Cong Wang and Yucong Sun.

6) Fix BPF sockmap concurrency issue in TCP on non-blocking sendmsg calls, from Liu Jian.

7) Fix build failure of INODE_STORAGE and TASK_STORAGE maps on !CONFIG_NET, from Tejun Heo.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix potential race in tail call compatibility check
  bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NET
  selftests/bpf: Use recv_timeout() instead of retries
  net: Implement ->sock_is_readable() for UDP and AF_UNIX
  skmsg: Extract and reuse sk_msg_is_readable()
  net: Rename ->stream_memory_read to ->sock_is_readable
  tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
  cgroup: Fix memory leak caused by missing cgroup_bpf_offline
  bpf: Fix error usage of map_fd and fdget() in generic_map_update_batch()
  bpf: Prevent increasing bpf_jit_limit above max
  bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT
  bpf: Define bpf_jit_alloc_exec_limit for riscv JIT
====================

Link: https://lore.kernel.org/r/20211026201920.11296-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-26 14:38:55 -07:00
Toke Høiland-Jørgensen
54713c85f5 bpf: Fix potential race in tail call compatibility check
Lorenzo noticed that the code testing for program type compatibility of
tail call maps is potentially racy in that two threads could encounter a
map with an unset type simultaneously and both return true even though they
are inserting incompatible programs.

The race window is quite small, but artificially enlarging it by adding a
usleep_range() inside the check in bpf_prog_array_compatible() makes it
trivial to trigger from userspace with a program that does, essentially:

        map_fd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, 4, 4, 2, 0);
        pid = fork();
        if (pid) {
                key = 0;
                value = xdp_fd;
        } else {
                key = 1;
                value = tc_fd;
        }
        err = bpf_map_update_elem(map_fd, &key, &value, 0);

While the race window is small, it has potentially serious ramifications in
that triggering it would allow a BPF program to tail call to a program of a
different type. So let's get rid of it by protecting the update with a
spinlock. The commit in the Fixes tag is the last commit that touches the
code in question.

v2:
- Use a spinlock instead of an atomic variable and cmpxchg() (Alexei)
v3:
- Put lock and the members it protects into an embedded 'owner' struct (Daniel)

Fixes: 3324b584b6 ("ebpf: misc core cleanup")
Reported-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211026110019.363464-1-toke@redhat.com
2021-10-26 12:37:28 -07:00
Tejun Heo
99d0a3831e bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NET
bpf_types.h has BPF_MAP_TYPE_INODE_STORAGE and BPF_MAP_TYPE_TASK_STORAGE
declared inside #ifdef CONFIG_NET although they are built regardless of
CONFIG_NET. So, when CONFIG_BPF_SYSCALL && !CONFIG_NET, they are built
without the declarations leading to spurious build failures and not
registered to bpf_map_types making them unavailable.

Fix it by moving the BPF_MAP_TYPE for the two map types outside of
CONFIG_NET.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: a10787e6d5 ("bpf: Enable task local storage for tracing programs")
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/YXG1cuuSJDqHQfRY@slm.duckdns.org
2021-10-26 12:35:16 -07:00
Alexei Starovoitov
a94b5aae2a Merge branch 'sock_map: fix ->poll() and update selftests'
Cong Wang says:

====================
This patchset fixes ->poll() for sockets in sockmap and updates
selftests accordingly with select(). Please check each patch
for more details.

Fixes: c50524ec4e ("Merge branch 'sockmap: add sockmap support for unix datagram socket'")
Fixes: 89d69c5d0f ("Merge branch 'sockmap: introduce BPF_SK_SKB_VERDICT and support UDP'")
Acked-by: John Fastabend <john.fastabend@gmail.com>

---
v4: add a comment in udp_poll()

v3: drop sk_psock_get_checked()
    reuse tcp_bpf_sock_is_readable()

v2: rename and reuse ->stream_memory_read()
    fix a compile error in sk_psock_get_checked()

Cong Wang (3):
  net: rename ->stream_memory_read to ->sock_is_readable
  skmsg: extract and reuse sk_msg_is_readable()
  net: implement ->sock_is_readable() for UDP and AF_UNIX

====================

Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-10-26 12:30:12 -07:00
Yucong Sun
67b821502d selftests/bpf: Use recv_timeout() instead of retries
We use non-blocking sockets in those tests, retrying for
EAGAIN is ugly because there is no upper bound for the packet
arrival time, at least in theory. After we fix poll() on
sockmap sockets, now we can switch to select()+recv().

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-5-xiyou.wangcong@gmail.com
2021-10-26 12:29:33 -07:00
Cong Wang
af49338895 net: Implement ->sock_is_readable() for UDP and AF_UNIX
Yucong noticed we can't poll() sockets in sockmap even
when they are the destination sockets of redirections.
This is because we never poll any psock queues in ->poll(),
except for TCP. With ->sock_is_readable() now we can
overwrite >sock_is_readable(), invoke and implement it for
both UDP and AF_UNIX sockets.

Reported-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-4-xiyou.wangcong@gmail.com
2021-10-26 12:29:33 -07:00
Cong Wang
fb4e0a5e73 skmsg: Extract and reuse sk_msg_is_readable()
tcp_bpf_sock_is_readable() is pretty much generic,
we can extract it and reuse it for non-TCP sockets.

Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-3-xiyou.wangcong@gmail.com
2021-10-26 12:29:33 -07:00
Cong Wang
7b50ecfcc6 net: Rename ->stream_memory_read to ->sock_is_readable
The proto ops ->stream_memory_read() is currently only used
by TCP to check whether psock queue is empty or not. We need
to rename it before reusing it for non-TCP protocols, and
adjust the exsiting users accordingly.

Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-2-xiyou.wangcong@gmail.com
2021-10-26 12:29:33 -07:00
Liu Jian
cd9733f5d7 tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
With two Msgs, msgA and msgB and a user doing nonblocking sendmsg calls (or
multiple cores) on a single socket 'sk' we could get the following flow.

 msgA, sk                               msgB, sk
 -----------                            ---------------
 tcp_bpf_sendmsg()
 lock(sk)
 psock = sk->psock
                                        tcp_bpf_sendmsg()
                                        lock(sk) ... blocking
tcp_bpf_send_verdict
if (psock->eval == NONE)
   psock->eval = sk_psock_msg_verdict
 ..
 < handle SK_REDIRECT case >
   release_sock(sk)                     < lock dropped so grab here >
   ret = tcp_bpf_sendmsg_redir
                                        psock = sk->psock
                                        tcp_bpf_send_verdict
 lock_sock(sk) ... blocking on B
                                        if (psock->eval == NONE) <- boom.
                                         psock->eval will have msgA state

The problem here is we dropped the lock on msgA and grabbed it with msgB.
Now we have old state in psock and importantly psock->eval has not been
cleared. So msgB will run whatever action was done on A and the verdict
program may never see it.

Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211012052019.184398-1-liujian56@huawei.com
2021-10-26 12:25:55 -07:00
Mario
61b1d445f3 drm: panel-orientation-quirks: Add quirk for GPD Win3
Fixes screen orientation for GPD Win 3 handheld gaming console.

Signed-off-by: Mario Risoldi <awxkrnl@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211026112737.9181-1-awxkrnl@gmail.com
2021-10-26 20:57:10 +02:00
Walter Stoll
cd004d8299 watchdog: Fix OMAP watchdog early handling
TI's implementation does not service the watchdog even if the kernel
command line parameter omap_wdt.early_enable is set to 1. This patch
fixes the issue.

Signed-off-by: Walter Stoll <walter.stoll@duagon.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/88a8fe5229cd68fa0f1fd22f5d66666c1b7057a0.camel@duagon.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26 20:22:51 +02:00
Guenter Roeck
abd1c6adc1 watchdog: ixp4xx_wdt: Fix address space warning
sparse reports the following address space warning.

drivers/watchdog/ixp4xx_wdt.c:122:20: sparse:
	incorrect type in assignment (different address spaces)
drivers/watchdog/ixp4xx_wdt.c:122:20: sparse:
	expected void [noderef] __iomem *base
drivers/watchdog/ixp4xx_wdt.c:122:20: sparse:
	got void *platform_data

Add a typecast to solve the problem.

Fixes: 21a0a29d16 ("watchdog: ixp4xx: Rewrite driver to use core")
Cc: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210911042925.556889-1-linux@roeck-us.net
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26 20:22:51 +02:00
Krzysztof Kozlowski
bcc3e704f1 watchdog: sbsa: drop unneeded MODULE_ALIAS
The MODULE_DEVICE_TABLE already creates proper alias for platform
driver.  Having another MODULE_ALIAS causes the alias to be duplicated.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210917092024.19323-1-krzysztof.kozlowski@canonical.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26 20:22:51 +02:00
Jamie Iles
f31afb502c watchdog: sbsa: only use 32-bit accessors
SBSA says of the generic watchdog:

  All registers are 32 bits in size and should be accessed using 32-bit
  reads and writes. If an access size other than 32 bits is used then
  the results are IMPLEMENTATION DEFINED.

and for qemu, the implementation will only allow 32-bit accesses
resulting in a synchronous external abort when configuring the watchdog.
Use lo_hi_* accessors rather than a readq/writeq.

Fixes: abd3ac7902 ("watchdog: sbsa: Support architecture version 1")
Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20210903112101.493552-1-quic_jiles@quicinc.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26 20:22:50 +02:00
Guenter Roeck
6e7733ef0b Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout"
This reverts commit cb011044e3 ("watchdog: iTCO_wdt: Account for
rebooting on second timeout") and commit aec42642d9 ("watchdog: iTCO_wdt:
Fix detection of SMI-off case") since those patches cause a regression
on certain boards (https://bugzilla.kernel.org/show_bug.cgi?id=213809).

While this revert may result in some boards to only reset after twice
the configured timeout value, that is still better than a watchdog reset
after half the configured value.

Fixes: cb011044e3 ("watchdog: iTCO_wdt: Account for rebooting on second timeout")
Fixes: aec42642d9 ("watchdog: iTCO_wdt: Fix detection of SMI-off case")
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Mantas Mikulėnas <grawity@gmail.com>
Reported-by: Javier S. Pedro <debbugs@javispedro.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20211008003302.1461733-1-linux@roeck-us.net
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26 20:22:50 +02:00
Wenbin Mei
92b18252b9 mmc: cqhci: clear HALT state after CQE enable
While mmc0 enter suspend state, we need halt CQE to send legacy cmd(flush
cache) and disable cqe, for resume back, we enable CQE and not clear HALT
state.
In this case MediaTek mmc host controller will keep the value for HALT
state after CQE disable/enable flow, so the next CQE transfer after resume
will be timeout due to CQE is in HALT state, the log as below:
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: timeout for tag 2
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Caps:      0x100020b6 | Version:  0x00000510
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Config:    0x00001103 | Control:  0x00000001
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: TDL base:  0xfd05f000 | TDL up32: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Doorbell:  0x8000203c | TCN:      0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Task clr:  0x00000000 | SSC1:     0x00001000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Resp idx:  0x00000000 | Resp arg: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: CRNQP:     0x00000000 | CRNQDUN:  0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: CRNQIS:    0x00000000 | CRNQIE:   0x00000000

This change check HALT state after CQE enable, if CQE is in HALT state, we
will clear it.

Signed-off-by: Wenbin Mei <wenbin.mei@mediatek.com>
Cc: stable@vger.kernel.org
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: a4080225f5 ("mmc: cqhci: support for command queue enabled host")
Link: https://lore.kernel.org/r/20211026070812.9359-1-wenbin.mei@mediatek.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-26 17:34:57 +02:00
Johan Hovold
8c81719291 mmc: vub300: fix control-message timeouts
USB control-message timeouts are specified in milliseconds and should
specifically not vary with CONFIG_HZ.

Fixes: 88095e7b47 ("mmc: Add new VUB300 USB-to-SD/SDIO/MMC driver")
Cc: stable@vger.kernel.org      # 3.0
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211025115608.5287-1-johan@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-26 17:31:58 +02:00
Jaehoon Chung
697542bcea mmc: dw_mmc: exynos: fix the finding clock sample value
Even though there are candiates value if can't find best value, it's
returned -EIO. It's not proper behavior.
If there is not best value, use a first candiate value to work eMMC.

Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Christian Hewitt <christianshewitt@gmail.com>
Cc: stable@vger.kernel.org
Fixes: c537a1c5ff ("mmc: dw_mmc: exynos: add variable delay tuning sequence")
Link: https://lore.kernel.org/r/20211022082106.1557-1-jh80.chung@samsung.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-26 17:28:41 +02:00
Christoph Niedermaier
05d5da3cb1 MAINTAINERS: Add maintainers for DHCOM i.MX6 and DHCOM/DHCOR STM32MP1
Add maintainers for DH electronics DHCOM i.MX6
and DHCOM/DHCOR STM32MP1 boards.

Signed-off-by: Christoph Niedermaier <cniedermaier@dh-electronics.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kernel@dh-electronics.com
Cc: arnd@arndb.de
Link: https://lore.kernel.org/r/20211025073706.2794-1-cniedermaier@dh-electronics.com'
To: soc@kernel.org
To: linux-kernel@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-26 17:14:37 +02:00
Ming Lei
d308ae0d29 block: drain queue after disk is removed from sysfs
Before removing disk from sysfs, userspace still may change queue via
sysfs, such as switching elevator or setting wbt latency, both may
reinitialize wbt, then the warning in blk_free_queue_stats() will be
triggered since rq_qos_exit() is moved to del_gendisk().

Fixes the issue by moving draining queue & tearing down after disk is
removed from sysfs, at that time no one can come into queue's
store()/show().

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Fixes: 8e141f9eb8 ("block: drain file system I/O on del_gendisk")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20211026101204.2897166-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-26 08:44:38 -06:00
Arnd Bergmann
f44e8f91b8 Merge tag 'qcom-arm64-fixes-for-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm ARM64 DTS one more fix for 5.15

This reverts a clock change in the Qualcomm RB5 devicetree which in some
combinations of firmware and configuration causes the device to crash
during boot.

Data on an adjacent platform indicates that this is probably not be the
root cause of the problem, but this resolves the regression seen on RB5
and will allow the SM8250 platform to boot v5.15.

* tag 'qcom-arm64-fixes-for-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target"

Link: https://lore.kernel.org/r/20211025201213.1145348-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-26 16:20:50 +02:00
Vadym Kochan
19fa0887c5 MAINTAINERS: please remove myself from the Prestera driver
Signed-off-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 15:19:41 +01:00
Johan Hovold
db6c3c064f net: lan78xx: fix division by zero in send path
Add the missing endpoint max-packet sanity check to probe() to avoid
division by zero in lan78xx_tx_bh() in case a malicious device has
broken descriptors (or when doing descriptor fuzz testing).

Note that USB core will reject URBs submitted for endpoints with zero
wMaxPacketSize but that drivers doing packet-size calculations still
need to handle this (cf. commit 2548288b4f ("USB: Fix: Don't skip
endpoint descriptors with maxpacket=0")).

Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Cc: stable@vger.kernel.org      # 4.3
Cc: Woojung.Huh@microchip.com <Woojung.Huh@microchip.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 15:18:39 +01:00
Pavel Skripkin
6f68cd6348 net: batman-adv: fix error handling
Syzbot reported ODEBUG warning in batadv_nc_mesh_free(). The problem was
in wrong error handling in batadv_mesh_init().

Before this patch batadv_mesh_init() was calling batadv_mesh_free() in case
of any batadv_*_init() calls failure. This approach may work well, when
there is some kind of indicator, which can tell which parts of batadv are
initialized; but there isn't any.

All written above lead to cleaning up uninitialized fields. Even if we hide
ODEBUG warning by initializing bat_priv->nc.work, syzbot was able to hit
GPF in batadv_nc_purge_paths(), because hash pointer in still NULL. [1]

To fix these bugs we can unwind batadv_*_init() calls one by one.
It is good approach for 2 reasons: 1) It fixes bugs on error handling
path 2) It improves the performance, since we won't call unneeded
batadv_*_free() functions.

So, this patch makes all batadv_*_init() clean up all allocated memory
before returning with an error to no call correspoing batadv_*_free()
and open-codes batadv_mesh_free() with proper order to avoid touching
uninitialized fields.

Link: https://lore.kernel.org/netdev/000000000000c87fbd05cef6bcb0@google.com/ [1]
Reported-and-tested-by: syzbot+28b0702ada0bf7381f58@syzkaller.appspotmail.com
Fixes: c6c8fea297 ("net: Add batman-adv meshing protocol")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 14:47:12 +01:00
Max VA
fa40d9734a tipc: fix size validations for the MSG_CRYPTO type
The function tipc_crypto_key_rcv is used to parse MSG_CRYPTO messages
to receive keys from other nodes in the cluster in order to decrypt any
further messages from them.
This patch verifies that any supplied sizes in the message body are
valid for the received message.

Fixes: 1ef6f7c939 ("tipc: add automatic session key exchange")
Signed-off-by: Max VA <maxv@sentinelone.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:43:07 +01:00
Krzysztof Kozlowski
2195f2062e nfc: port100: fix using -ERRNO as command type mask
During probing, the driver tries to get a list (mask) of supported
command types in port100_get_command_type_mask() function.  The value
is u64 and 0 is treated as invalid mask (no commands supported).  The
function however returns also -ERRNO as u64 which will be interpret as
valid command mask.

Return 0 on every error case of port100_get_command_type_mask(), so the
probing will stop.

Cc: <stable@vger.kernel.org>
Fixes: 0347a6ab30 ("NFC: port100: Commands mechanism implementation")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:42:00 +01:00
David S. Miller
eacd68b7ce Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-10-25

This series contains updates to ice driver only.

Dave adds event handler for LAG NETDEV_UNREGISTER to unlink device from
link aggregate.

Yongxin Liu adds a check for PTP support during release which would
cause a call trace on non-PTP supported devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:26:09 +01:00
Cyril Strejc
9122a70a63 net: multicast: calculate csum of looped-back and forwarded packets
During a testing of an user-space application which transmits UDP
multicast datagrams and utilizes multicast routing to send the UDP
datagrams out of defined network interfaces, I've found a multicast
router does not fill-in UDP checksum into locally produced, looped-back
and forwarded UDP datagrams, if an original output NIC the datagrams
are sent to has UDP TX checksum offload enabled.

The datagrams are sent malformed out of the NIC the datagrams have been
forwarded to.

It is because:

1. If TX checksum offload is enabled on the output NIC, UDP checksum
   is not calculated by kernel and is not filled into skb data.

2. dev_loopback_xmit(), which is called solely by
   ip_mc_finish_output(), sets skb->ip_summed = CHECKSUM_UNNECESSARY
   unconditionally.

3. Since 35fc92a9 ("[NET]: Allow forwarding of ip_summed except
   CHECKSUM_COMPLETE"), the ip_summed value is preserved during
   forwarding.

4. If ip_summed != CHECKSUM_PARTIAL, checksum is not calculated during
   a packet egress.

The minimum fix in dev_loopback_xmit():

1. Preserves skb->ip_summed CHECKSUM_PARTIAL. This is the
   case when the original output NIC has TX checksum offload enabled.
   The effects are:

     a) If the forwarding destination interface supports TX checksum
        offloading, the NIC driver is responsible to fill-in the
        checksum.

     b) If the forwarding destination interface does NOT support TX
        checksum offloading, checksums are filled-in by kernel before
        skb is submitted to the NIC driver.

     c) For local delivery, checksum validation is skipped as in the
        case of CHECKSUM_UNNECESSARY, thanks to skb_csum_unnecessary().

2. Translates ip_summed CHECKSUM_NONE to CHECKSUM_UNNECESSARY. It
   means, for CHECKSUM_NONE, the behavior is unmodified and is there
   to skip a looped-back packet local delivery checksum validation.

Signed-off-by: Cyril Strejc <cyril.strejc@skoda.cz>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:09:22 +01:00
Thomas Perrot
d81d0e41ed spi: spl022: fix Microwire full duplex mode
There are missing braces in the function that verify controller parameters,
then an error is always returned when the parameter to select Microwire
frames operation is used on devices allowing it.

Signed-off-by: Thomas Perrot <thomas.perrot@bootlin.com>
Link: https://lore.kernel.org/r/20211022142104.1386379-1-thomas.perrot@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-26 11:53:57 +01:00
Sagi Grimberg
25e1f67eda nvme-tcp: fix H2CData PDU send accounting (again)
We should not access request members after the last send, even to
determine if indeed it was the last data payload send. The reason is
that a completion could have arrived and trigger a new execution of the
request which overridden these members. This was fixed by commit
825619b09a ("nvme-tcp: fix possible use-after-completion").

Commit e371af033c broke that assumption again to address cases where
multiple r2t pdus are sent per request. To fix it, we need to record the
request data_sent and data_len and after the payload network send we
reference these counters to determine weather we should advance the
request iterator.

Fixes: e371af033c ("nvme-tcp: fix incorrect h2cdata pdu offset accounting")
Reported-by: Keith Busch <kbusch@kernel.org>
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-26 10:41:29 +02:00
Maurizio Lombardi
926245c7d2 nvmet-tcp: fix a memory leak when releasing a queue
page_frag_free() won't completely release the memory
allocated for the commands, the cache page must be explicitly
freed by calling __page_frag_cache_drain().

This bug can be easily reproduced by repeatedly
executing the following command on the initiator:

$echo 1 > /sys/devices/virtual/nvme-fabrics/ctl/nvme0/reset_controller

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-26 10:41:29 +02:00
Imre Deak
6e6f966308 drm/i915/dp: Skip the HW readout of DPCD on disabled encoders
Reading out the DP encoders' DPCD during booting or resume is only
required for enabled encoders: such encoders may be modesetted during
the initial commit and the link training this involves depends on an
initialized DPCD. For DDI encoders reading out the DPCD is skipped, do
the same on pre-DDI platforms.

Atm, the first DPCD readout without a sink connected - which is a likely
scneario if the encoder is disabled - leaves intel_dp->num_common_rates
at 0, which resulted in

intel_dp_sync_state()->intel_dp_max_common_rate()

in a

intel_dp->common_rates[-1]

access. This by definition results in an undefined behaviour, though to
my best knowledge in all HW/compiler configurations it actually results
in accessing the array item type value preceding the array. In this
case the preceding value happens to be intel_dp->num_common_rates,
which is 0, so this issue - by luck - didn't cause a user visible
problem.

Nevertheless it's still an undefined behaviour and in CONFIG_UBSAN
builds leads to a kernel BUG() (which revealed this problem for us),
hence CC:stable.

A related problem in case the encoder is enabled but the sink is not
connected or the DPCD readout fails is fixed by the next patch.

v2: Amend the commit message describing the root cause of the
    CONFIG_UBSAN BUG().

Fixes: a532cde31d ("drm/i915/tc: Fix TypeC port init/resume time sanitization")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/4297
Reported-and-tested-by: Mat Jonczyk <mat.jonczyk@o2.pl>
Cc: Mat Jonczyk <mat.jonczyk@o2.pl>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018094154.1407705-2-imre.deak@intel.com
(cherry picked from commit 4ec5ffc341)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-10-26 10:40:10 +03:00
Ville Syrjälä
9761ffb8f1 drm/i915: Catch yet another unconditioal clflush
Replace the unconditional clflush() with drm_clflush_virt_range()
which does the wbinvd() fallback when clflush is not available.

This time no justification is given for the clflush in the
offending commit.

Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 2c8ab3339e ("drm/i915: Pin timeline map after first timeline pin, v4.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014090941.12159-4-ville.syrjala@linux.intel.com
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9ced12182d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-10-26 10:40:09 +03:00
Ville Syrjälä
fcf918ffd3 drm/i915: Convert unconditional clflush to drm_clflush_virt_range()
This one is apparently a "clflush for good measure", so bit more
justification (if you can call it that) than some of the others.
Convert to drm_clflush_virt_range() again so that machines without
clflush will survive the ordeal.

Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@intel.com> #v1
Fixes: 12ca695d2c ("drm/i915: Do not share hwsp across contexts any more, v8.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014090941.12159-3-ville.syrjala@linux.intel.com
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit af7b6d234e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-10-26 10:40:09 +03:00
Ido Schimmel
759635760a mlxsw: pci: Recycle received packet upon allocation failure
When the driver fails to allocate a new Rx buffer, it passes an empty Rx
descriptor (contains zero address and size) to the device and marks it
as invalid by setting the skb pointer in the descriptor's metadata to
NULL.

After processing enough Rx descriptors, the driver will try to process
the invalid descriptor, but will return immediately seeing that the skb
pointer is NULL. Since the driver no longer passes new Rx descriptors to
the device, the Rx queue will eventually become full and the device will
start to drop packets.

Fix this by recycling the received packet if allocation of the new
packet failed. This means that allocation is no longer performed at the
end of the Rx routine, but at the start, before tearing down the DMA
mapping of the received packet.

Remove the comment about the descriptor being zeroed as it is no longer
correct. This is OK because we either use the descriptor as-is (when
recycling) or overwrite its address and size fields with that of the
newly allocated Rx buffer.

The issue was discovered when a process ("perf") consumed too much
memory and put the system under memory pressure. It can be reproduced by
injecting slab allocation failures [1]. After the fix, the Rx queue no
longer comes to a halt.

[1]
 # echo 10 > /sys/kernel/debug/failslab/times
 # echo 1000 > /sys/kernel/debug/failslab/interval
 # echo 100 > /sys/kernel/debug/failslab/probability

 FAULT_INJECTION: forcing a failure.
 name failslab, interval 1000, probability 100, space 0, times 8
 [...]
 Call Trace:
  <IRQ>
  dump_stack_lvl+0x34/0x44
  should_fail.cold+0x32/0x37
  should_failslab+0x5/0x10
  kmem_cache_alloc_node+0x23/0x190
  __alloc_skb+0x1f9/0x280
  __netdev_alloc_skb+0x3a/0x150
  mlxsw_pci_rdq_skb_alloc+0x24/0x90
  mlxsw_pci_cq_tasklet+0x3dc/0x1200
  tasklet_action_common.constprop.0+0x9f/0x100
  __do_softirq+0xb5/0x252
  irq_exit_rcu+0x7a/0xa0
  common_interrupt+0x83/0xa0
  </IRQ>
  asm_common_interrupt+0x1e/0x40
 RIP: 0010:cpuidle_enter_state+0xc8/0x340
 [...]
 mlxsw_spectrum2 0000:06:00.0: Failed to alloc skb for RDQ

Fixes: eda6500a98 ("mlxsw: Add PCI bus implementation")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/20211024064014.1060919-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25 18:21:23 -07:00
Christoph Hellwig
3dd60fb9d9 nvdimm/pmem: stop using q_usage_count as external pgmap refcount
Originally all DAX access when through block_device operations and thus
needed a queue reference.  But since commit cccbce6715
("filesystem-dax: convert to dax_direct_access()") all this happens at
the DAX device level which uses its own refcounting.  Having the external
refcount thus wasn't needed but has otherwise been harmless for long
time.

But now that "block: drain file system I/O on del_gendisk" waits for
q_usage_count to reach 0 in del_gendisk this whole scheme can't work
anymore (and pmem is the only driver abusing q_usage_count like that).
So switch to the internal reference and remove the unbalanced
blk_freeze_queue_start that is taken care of by del_gendisk.

Fixes: 8e141f9eb8 ("block: drain file system I/O on del_gendisk")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211019073641.2323410-2-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-25 16:12:32 -07:00
Yongxin Liu
fd1b5beb17 ice: check whether PTP is initialized in ice_ptp_release()
PTP is currently only supported on E810 devices, it is checked
in ice_ptp_init(). However, there is no check in ice_ptp_release().
For other E800 series devices, ice_ptp_release() will be wrongly executed.

Fix the following calltrace.

  INFO: trying to register non-static key.
  The code is fine but needs lockdep annotation, or maybe
  you didn't initialize this object before use?
  turning off the locking correctness validator.
  Workqueue: ice ice_service_task [ice]
  Call Trace:
   dump_stack_lvl+0x5b/0x82
   dump_stack+0x10/0x12
   register_lock_class+0x495/0x4a0
   ? find_held_lock+0x3c/0xb0
   __lock_acquire+0x71/0x1830
   lock_acquire+0x1e6/0x330
   ? ice_ptp_release+0x3c/0x1e0 [ice]
   ? _raw_spin_lock+0x19/0x70
   ? ice_ptp_release+0x3c/0x1e0 [ice]
   _raw_spin_lock+0x38/0x70
   ? ice_ptp_release+0x3c/0x1e0 [ice]
   ice_ptp_release+0x3c/0x1e0 [ice]
   ice_prepare_for_reset+0xcb/0xe0 [ice]
   ice_do_reset+0x38/0x110 [ice]
   ice_service_task+0x138/0xf10 [ice]
   ? __this_cpu_preempt_check+0x13/0x20
   process_one_work+0x26a/0x650
   worker_thread+0x3f/0x3b0
   ? __kthread_parkme+0x51/0xb0
   ? process_one_work+0x650/0x650
   kthread+0x161/0x190
   ? set_kthread_struct+0x40/0x40
   ret_from_fork+0x1f/0x30

Fixes: 4dd0d5c33c ("ice: add lock around Tx timestamp tracker flush")
Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-25 13:44:55 -07:00
Dave Ertman
6a8b357278 ice: Respond to a NETDEV_UNREGISTER event for LAG
When the PF is a member of a link aggregate, and the driver
is removed, the process will hang unless we respond to the
NETDEV_UNREGISTER event that is sent to the event_handler
for LAG.

Add a case statement for the ice_lag_event_handler to unlink
the PF from the link aggregate.

Also remove code that was incorrectly applying a dev_hold to
peer_netdevs that were associated with the ice driver.

Fixes: df006dd4b1 ("ice: Add initial support framework for LAG")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-25 13:44:37 -07:00
Amit Pundir
e091b836a3 Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target"
This reverts commit 001ce9785c.

This upstream commit broke AOSP (post Android 12 merge) build
on RB5. The device either silently crashes into USB crash mode
after android boot animation or we see a blank blue screen
with following dpu errors in dmesg:

[  T444] hw recovery is not complete for ctl:3
[  T444] [drm:dpu_encoder_phys_vid_prepare_for_kickoff:539] [dpu error]enc31 intf1 ctl 3 reset failure: -22
[  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout
[  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110
[    C7] [drm:dpu_encoder_frame_done_timeout:2127] [dpu error]enc31 frame done timeout
[  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout
[  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110

Fixes: 001ce9785c ("arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target")
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211014135410.4136412-1-dmitry.baryshkov@linaro.org
2021-10-25 14:26:00 -05:00
Linus Torvalds
3906fe9bb7 Linux 5.15-rc7 2021-10-25 11:30:31 -07:00
Matthew Wilcox (Oracle)
cb68543239 secretmem: Prevent secretmem_users from wrapping to zero
Commit 110860541f ("mm/secretmem: use refcount_t instead of atomic_t")
attempted to fix the problem of secretmem_users wrapping to zero and
allowing suspend once again.

But it was reverted in commit 87066fdd2e ("Revert 'mm/secretmem: use
refcount_t instead of atomic_t'") because of the problems it caused - a
refcount_t was not semantically the right type to use.

Instead prevent secretmem_users from wrapping to zero by forbidding new
users if the number of users has wrapped from positive to negative.
This stops a long way short of reaching the necessary 4 billion users
where it wraps to zero again, so there's no need to be clever with
special anti-wrap types or checking the return value from atomic_inc().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jordy Zomer <jordy@pwning.systems>
Cc: Kees Cook <keescook@chromium.org>,
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-25 11:27:31 -07:00
Linus Torvalds
ac8a6eba2a spi: Fix tegra20 build with CONFIG_PM=n once again
Commit efafec27c5 ("spi: Fix tegra20 build with CONFIG_PM=n") already
fixed the build without PM support once.  There was an alternative fix
by Guenter in commit 2bab94090b ("spi: tegra20-slink: Declare runtime
suspend and resume functions conditionally"), and Mark then merged the
two correctly in ffb1e76f4f ("Merge tag 'v5.15-rc2' into spi-5.15").

But for some inexplicable reason, Mark then merged things _again_ in
commit 59c4e190b1 ("Merge tag 'v5.15-rc3' into spi-5.15"), and screwed
things up at that point, and the __maybe_unused attribute on
tegra_slink_runtime_resume() went missing.

Reinstate it, so that alpha (and other architectures without PM support)
builds cleanly again.

Btw, this is another prime example of how random back-merges are not
good.  Just don't do them.  Subsystem developers should not merge my
tree in any normal circumstances.  Both of those merge commits pointed
to above are bad: even the one that got the merge result right doesn't
even mention _why_ it was done, and the one that got it wrong is
obviously broken.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-25 10:46:41 -07:00
Linus Torvalds
c2b43854aa Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:

 - Fix clang-related relocation warning in futex code

 - Fix incorrect use of get_kernel_nofault()

 - Fix bad code generation in __get_user_check() when kasan is enabled

 - Ensure TLB function table is correctly aligned

 - Remove duplicated string function definitions in decompressor

 - Fix link-time orphan section warnings

 - Fix old-style function prototype for arch_init_kprobes()

 - Only warn about XIP address when not compile testing

 - Handle BE32 big endian for keystone2 remapping

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.S
  ARM: 9141/1: only warn about XIP address when not compile testing
  ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
  ARM: 9138/1: fix link warning with XIP + frame-pointer
  ARM: 9134/1: remove duplicate memcpy() definition
  ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
  ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images
  ARM: 9125/1: fix incorrect use of get_kernel_nofault()
  ARM: 9122/1: select HAVE_FUTEX_CMPXCHG
2021-10-25 10:28:52 -07:00
Linus Torvalds
4862649f16 Merge tag 'libata-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull libata fix from Damien Le Moal:
 "A single fix in this pull request addressing an invalid error code
  return in the sata_mv driver (from Zheyu)"

* tag 'libata-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: sata_mv: Fix the error handling of mv_chip_id()
2021-10-25 09:57:28 -07:00
Linus Torvalds
a51aec4109 Merge tag 'pinctrl-v5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
 "Some late pin control fixes, the most generally annoying will probably
  be the AMD IRQ storm fix affecting the Microsoft surface.

  Summary:

   - Three fixes pertaining to Broadcom DT bindings. Some stuff didn't
     work out as inteded, we need to back out

   - A resume bug fix in the STM32 driver

   - Disable and mask the interrupts on probe in the AMD pinctrl driver,
     affecting Microsoft surface"

* tag 'pinctrl-v5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: amd: disable and mask interrupts on probe
  pinctrl: stm32: use valid pin identifier in stm32_pinctrl_resume()
  Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
  dt-bindings: pinctrl: brcm,ns-pinmux: drop unneeded CRU from example
  Revert "dt-bindings: pinctrl: bcm4708-pinmux: rework binding to use syscon"
2021-10-25 09:47:18 -07:00
Xin Long
f7a1e76d0f net-sysfs: initialize uid and gid before calling net_ns_get_ownership
Currently in net_ns_get_ownership() it may not be able to set uid or gid
if make_kuid or make_kgid returns an invalid value, and an uninit-value
issue can be triggered by this.

This patch is to fix it by initializing the uid and gid before calling
net_ns_get_ownership(), as it does in kobject_get_ownership()

Fixes: e6dee9f389 ("net-sysfs: add netdev_change_owner()")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 16:17:32 +01:00
Dongli Zhang
042b2046d0 xen/netfront: stop tx queues during live migration
The tx queues are not stopped during the live migration. As a result, the
ndo_start_xmit() may access netfront_info->queues which is freed by
talk_to_netback()->xennet_destroy_queues().

This patch is to netif_device_detach() at the beginning of xen-netfront
resuming, and netif_device_attach() at the end of resuming.

     CPU A                                CPU B

 talk_to_netback()
 -> if (info->queues)
        xennet_destroy_queues(info);
    to free netfront_info->queues

                                        xennet_start_xmit()
                                        to access netfront_info->queues

  -> err = xennet_create_queues(info, &num_queues);

The idea is borrowed from virtio-net.

Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 16:13:12 +01:00
Michael Chan
0c57eeecc5 net: Prevent infinite while loop in skb_tx_hash()
Drivers call netdev_set_num_tc() and then netdev_set_tc_queue()
to set the queue count and offset for each TC.  So the queue count
and offset for the TCs may be zero for a short period after dev->num_tc
has been set.  If a TX packet is being transmitted at this time in the
code path netdev_pick_tx() -> skb_tx_hash(), skb_tx_hash() may see
nonzero dev->num_tc but zero qcount for the TC.  The while loop that
keeps looping while hash >= qcount will not end.

Fix it by checking the TC's qcount to be nonzero before using it.

Fixes: eadec877ce ("net: Add support for subordinate traffic classes to netdev_pick_tx")
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 15:58:01 +01:00
Mark Zhang
64733956eb RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
When copying the device name, the length of the data memcpy copied exceeds
the length of the source buffer, which cause the KASAN issue below.  Use
strscpy_pad() instead.

 BUG: KASAN: slab-out-of-bounds in ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
 Read of size 64 at addr ffff88811a10f5e0 by task rping/140263
 CPU: 3 PID: 140263 Comm: rping Not tainted 5.15.0-rc1+ #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
  dump_stack_lvl+0x57/0x7d
  print_address_description.constprop.0+0x1d/0xa0
  kasan_report+0xcb/0x110
  kasan_check_range+0x13d/0x180
  memcpy+0x20/0x60
  ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
  ib_nl_make_request+0x1c6/0x380 [ib_core]
  send_mad+0x20a/0x220 [ib_core]
  ib_sa_path_rec_get+0x3e3/0x800 [ib_core]
  cma_query_ib_route+0x29b/0x390 [rdma_cm]
  rdma_resolve_route+0x308/0x3e0 [rdma_cm]
  ucma_resolve_route+0xe1/0x150 [rdma_ucm]
  ucma_write+0x17b/0x1f0 [rdma_ucm]
  vfs_write+0x142/0x4d0
  ksys_write+0x133/0x160
  do_syscall_64+0x43/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f26499aa90f
 Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 5c fd ff ff 48
 RSP: 002b:00007f26495f2dc0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 00000000000007d0 RCX: 00007f26499aa90f
 RDX: 0000000000000010 RSI: 00007f26495f2e00 RDI: 0000000000000003
 RBP: 00005632a8315440 R08: 0000000000000000 R09: 0000000000000001
 R10: 0000000000000000 R11: 0000000000000293 R12: 00007f26495f2e00
 R13: 00005632a83154e0 R14: 00005632a8315440 R15: 00005632a830a810

 Allocated by task 131419:
  kasan_save_stack+0x1b/0x40
  __kasan_kmalloc+0x7c/0x90
  proc_self_get_link+0x8b/0x100
  pick_link+0x4f1/0x5c0
  step_into+0x2eb/0x3d0
  walk_component+0xc8/0x2c0
  link_path_walk+0x3b8/0x580
  path_openat+0x101/0x230
  do_filp_open+0x12e/0x240
  do_sys_openat2+0x115/0x280
  __x64_sys_openat+0xce/0x140
  do_syscall_64+0x43/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 2ca546b92a ("IB/sa: Route SA pathrecord query through netlink")
Link: https://lore.kernel.org/r/72ede0f6dab61f7f23df9ac7a70666e07ef314b0.1635055496.git.leonro@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 11:51:51 -03:00
Trevor Woerner
ace19b9924 net: nxp: lpc_eth.c: avoid hang when bringing interface down
A hard hang is observed whenever the ethernet interface is brought
down. If the PHY is stopped before the LPC core block is reset,
the SoC will hang. Comparing lpc_eth_close() and lpc_eth_open() I
re-arranged the ordering of the functions calls in lpc_eth_close() to
reset the hardware before stopping the PHY.
Fixes: b7370112f5 ("lpc32xx: Added ethernet driver")
Signed-off-by: Trevor Woerner <twoerner@gmail.com>
Acked-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 15:40:27 +01:00
Janusz Dziedzic
689a0a9f50 cfg80211: correct bridge/4addr mode check
Without the patch we fail:

$ sudo brctl addbr br0
$ sudo brctl addif br0 wlp1s0
$ sudo iw wlp1s0 set 4addr on
command failed: Device or resource busy (-16)

Last command failed but iface was already in 4addr mode.

Fixes: ad4bb6f888 ("cfg80211: disallow bridging managed/adhoc interfaces")
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@gmail.com>
Link: https://lore.kernel.org/r/20211024201546.614379-1-janusz.dziedzic@gmail.com
[add fixes tag, fix indentation, edit commit log]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-25 15:23:20 +02:00
Johannes Berg
09b1d5dc6c cfg80211: fix management registrations locking
The management registrations locking was broken, the list was
locked for each wdev, but cfg80211_mgmt_registrations_update()
iterated it without holding all the correct spinlocks, causing
list corruption.

Rather than trying to fix it with fine-grained locking, just
move the lock to the wiphy/rdev (still need the list on each
wdev), we already need to hold the wdev lock to change it, so
there's no contention on the lock in any case. This trivially
fixes the bug since we hold one wdev's lock already, and now
will hold the lock that protects all lists.

Cc: stable@vger.kernel.org
Reported-by: Jouni Malinen <j@w1.fi>
Fixes: 6cd536fe62 ("cfg80211: change internal management frame registration API")
Link: https://lore.kernel.org/r/20211025133111.5cf733eab0f4.I7b0abb0494ab712f74e2efcd24bb31ac33f7eee9@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-25 15:20:22 +02:00
David Woodhouse
0985dba842 KVM: x86/xen: Fix kvm_xen_has_interrupt() sleeping in kvm_vcpu_block()
In kvm_vcpu_block, the current task is set to TASK_INTERRUPTIBLE before
making a final check whether the vCPU should be woken from HLT by any
incoming interrupt.

This is a problem for the get_user() in __kvm_xen_has_interrupt(), which
really shouldn't be sleeping when the task state has already been set.
I think it's actually harmless as it would just manifest itself as a
spurious wakeup, but it's causing a debug warning:

[  230.963649] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b6bcdbc9>] prepare_to_swait_exclusive+0x30/0x80

Fix the warning by turning it into an *explicit* spurious wakeup. When
invoked with !task_is_running(current) (and we might as well add
in_atomic() there while we're at it), just return 1 to indicate that
an IRQ is pending, which will cause a wakeup and then something will
call it again in a context that *can* sleep so it can fault the page
back in.

Cc: stable@vger.kernel.org
Fixes: 40da8ccd72 ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>

Message-Id: <168bf8c689561da904e48e2ff5ae4713eaef9e2d.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-25 09:10:18 -04:00
Paolo Bonzini
4b2caef043 Merge tag 'kvm-s390-master-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: Fixes for interrupt delivery

Two bugs that might result in CPUs not woken up when interrupts are
pending.
2021-10-25 09:08:56 -04:00
David S. Miller
b4ab21f903 Merge branch 'ksettings-locking-fixes'
Andrew Lunn says:

====================
ksettings_{get|set} lock fixes

Walter Stoll <Walter.Stoll@duagon.com> reported a race condition
between "ethtool -s eth0 speed 100 duplex full autoneg off" and phylib
reading the current status from the PHY. Both ksetting_get and
ksetting_set fail the take the phydev mutex, and as a result, there is
a small window of time where the phydev members are not self
consistent.

Patch 1 fixes phy_ethtool_ksettings_get by adding the needed lock.
Patches 2 and 3 move code around and perform to refactoring, to allow
patch 4 to fix phy_ethtool_ksettings_set by added the lock.

Thanks go to Walter for the detailed origional report, suggested fix,
and testing of the proposed patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:06:43 +01:00
Andrew Lunn
af1a02aa23 phy: phy_ethtool_ksettings_set: Lock the PHY while changing settings
There is a race condition where the PHY state machine can change
members of the phydev structure at the same time userspace requests a
change via ethtool. To prevent this, have phy_ethtool_ksettings_set
take the PHY lock.

Fixes: 2d55173e71 ("phy: add generic function to support ksetting support")
Reported-by: Walter Stoll <Walter.Stoll@duagon.com>
Suggested-by: Walter Stoll <Walter.Stoll@duagon.com>
Tested-by: Walter Stoll <Walter.Stoll@duagon.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:06:43 +01:00
Andrew Lunn
707293a56f phy: phy_start_aneg: Add an unlocked version
Split phy_start_aneg into a wrapper which takes the PHY lock, and a
helper doing the real work. This will be needed when
phy_ethtook_ksettings_set takes the lock.

Fixes: 2d55173e71 ("phy: add generic function to support ksetting support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:06:43 +01:00
Andrew Lunn
64cd92d5e8 phy: phy_ethtool_ksettings_set: Move after phy_start_aneg
This allows it to make use of a helper which assume the PHY is already
locked.

Fixes: 2d55173e71 ("phy: add generic function to support ksetting support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:06:43 +01:00
Andrew Lunn
c10a485c3d phy: phy_ethtool_ksettings_get: Lock the phy for consistency
The PHY structure should be locked while copying information out if
it, otherwise there is no guarantee of self consistency. Without the
lock the PHY state machine could be updating the structure.

Fixes: 2d55173e71 ("phy: add generic function to support ksetting support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:06:43 +01:00
David Woodhouse
8228c77d8b KVM: x86: switch pvclock_gtod_sync_lock to a raw spinlock
On the preemption path when updating a Xen guest's runstate times, this
lock is taken inside the scheduler rq->lock, which is a raw spinlock.
This was shown in a lockdep warning:

[   89.138354] =============================
[   89.138356] [ BUG: Invalid wait context ]
[   89.138358] 5.15.0-rc5+ #834 Tainted: G S        I E
[   89.138360] -----------------------------
[   89.138361] xen_shinfo_test/2575 is trying to lock:
[   89.138363] ffffa34a0364efd8 (&kvm->arch.pvclock_gtod_sync_lock){....}-{3:3}, at: get_kvmclock_ns+0x1f/0x130 [kvm]
[   89.138442] other info that might help us debug this:
[   89.138444] context-{5:5}
[   89.138445] 4 locks held by xen_shinfo_test/2575:
[   89.138447]  #0: ffff972bdc3b8108 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0x77/0x6f0 [kvm]
[   89.138483]  #1: ffffa34a03662e90 (&kvm->srcu){....}-{0:0}, at: kvm_arch_vcpu_ioctl_run+0xdc/0x8b0 [kvm]
[   89.138526]  #2: ffff97331fdbac98 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0xff/0xbd0
[   89.138534]  #3: ffffa34a03662e90 (&kvm->srcu){....}-{0:0}, at: kvm_arch_vcpu_put+0x26/0x170 [kvm]
...
[   89.138695]  get_kvmclock_ns+0x1f/0x130 [kvm]
[   89.138734]  kvm_xen_update_runstate+0x14/0x90 [kvm]
[   89.138783]  kvm_xen_update_runstate_guest+0x15/0xd0 [kvm]
[   89.138830]  kvm_arch_vcpu_put+0xe6/0x170 [kvm]
[   89.138870]  kvm_sched_out+0x2f/0x40 [kvm]
[   89.138900]  __schedule+0x5de/0xbd0

Cc: stable@vger.kernel.org
Reported-by: syzbot+b282b65c2c68492df769@syzkaller.appspotmail.com
Fixes: 30b5c851af ("KVM: x86/xen: Add support for vCPU runstate information")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <1b02a06421c17993df337493a68ba923f3bd5c0f.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-25 08:14:38 -04:00
LABBE Corentin
00568b8a63 ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.S
My intel-ixp42x-welltech-epbx100 no longer boot since 4.14.
This is due to commit 463dbba4d1 ("ARM: 9104/2: Fix Keystone 2 kernel
mapping regression")
which forgot to handle CONFIG_CPU_ENDIAN_BE32 as possible BE config.

Suggested-by: Krzysztof Hałasa <khalasa@piap.pl>
Fixes: 463dbba4d1 ("ARM: 9104/2: Fix Keystone 2 kernel mapping regression")
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-25 13:11:34 +01:00
Asmaa Mnebhi
c0eee6fbfa gpio: mlxbf2.c: Add check for bgpio_init failure
Add a check if bgpio_init fails.

Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-10-25 10:15:05 +02:00
Jonas Gorski
85fe6415c1 gpio: xgs-iproc: fix parsing of ngpios property
of_property_read_u32 returns 0 on success, not true, so we need to
invert the check to actually take over the provided ngpio value.

Fixes: 6a41b6c5fc ("gpio: Add xgs-iproc driver")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-10-25 10:10:37 +02:00
Alexey Kardashevskiy
d853adc7ad powerpc/pseries/iommu: Create huge DMA window if no MMIO32 is present
The iommu_init_table() helper takes an address range to reserve in
the IOMMU table being initialized to exclude MMIO addresses, this is
useful if the window stretches far beyond 4GB (although wastes some TCEs).
At the moment the code searches for such MMIO32 range and fails if none
found which is considered a problem while it really is not: it is actually
better as this says there is no MMIO32 to reserve and we can use
usually wasted TCEs. Furthermore PHYP never actually allows creating
windows starting at busaddress=0 so this MMIO32 range is never useful.

This removes error exit and initializes the table with zero range if
no MMIO32 is detected.

Fixes: 381ceda88c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020132315.2287178-5-aik@ozlabs.ru
2021-10-25 11:41:15 +11:00
Alexey Kardashevskiy
92fe01b7c6 powerpc/pseries/iommu: Check if the default window in use before removing it
At the moment this check is performed after we remove the default window
which is late and disallows to revert whatever changes enable_ddw()
has made to DMA windows.

This moves the check and error exit before removing the window.

This raised the message severity from "debug" to "warning" as this
should not happen in practice and cannot be triggered by the userspace.

Fixes: 381ceda88c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020132315.2287178-4-aik@ozlabs.ru
2021-10-25 11:41:14 +11:00
Alexey Kardashevskiy
41ee7232fa powerpc/pseries/iommu: Use correct vfree for it_map
The it_map array is vzalloc'ed so use vfree() for it when creating
a huge DMA window failed for whatever reason.

While at this, write zero to it_map.

Fixes: 381ceda88c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020132315.2287178-3-aik@ozlabs.ru
2021-10-25 11:41:14 +11:00
Zheyu Ma
a0023bb9dd ata: sata_mv: Fix the error handling of mv_chip_id()
mv_init_host() propagates the value returned by mv_chip_id() which in turn
gets propagated by mv_pci_init_one() and hits local_pci_probe().

During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.

Since this is a bug rather than a recoverable runtime error we should
use dev_alert() instead of dev_err().

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2021-10-25 08:53:04 +09:00
Linus Torvalds
87066fdd2e Revert "mm/secretmem: use refcount_t instead of atomic_t"
This reverts commit 110860541f.

Converting the "secretmem_users" counter to a refcount is incorrect,
because a refcount is special in zero and can't just be incremented (but
a count of users is not, and "no users" is actually perfectly valid and
not a sign of a free'd resource).

Reported-by: syzbot+75639e6a0331cd61d3e2@syzkaller.appspotmail.com
Cc: Jordy Zomer <jordy@pwning.systems>
Cc: Kees Cook <keescook@chromium.org>,
Cc: Jordy Zomer <jordy@jordyzomer.github.io>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-24 09:48:33 -10:00
Linus Torvalds
b20078fd69 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull autofs fix from Al Viro:
 "Fix for a braino of mine (in getting rid of open-coded
  dentry_path_raw() in autofs a couple of cycles ago).

  Mea culpa...  Obvious -stable fodder"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  autofs: fix wait name hash calculation in autofs_wait()
2021-10-24 09:36:06 -10:00
Linus Torvalds
6c62666d88 Merge tag 'sched_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Borislav Petkov:
 "Reset clang's Shadow Call Stack on hotplug to prevent it from
  overflowing"

* tag 'sched_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/scs: Reset the shadow stack when idle_task_exit
2021-10-24 07:04:21 -10:00
Linus Torvalds
16bc177666 Merge tag 'x86_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
 "A single change adding Dave Hansen to our maintainers team"

* tag 'x86_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Add Dave Hansen to the x86 maintainer team
2021-10-24 07:00:15 -10:00
Linus Torvalds
c460e7896e Merge tag '5.15-rc6-ksmbd-fixes' of git://git.samba.org/ksmbd
Pull ksmbd fixes from Steve French:
 "Ten fixes for the ksmbd kernel server, for improved security and
  additional buffer overflow checks:

   - a security improvement to session establishment to reduce the
     possibility of dictionary attacks

   - fix to ensure that maximum i/o size negotiated in the protocol is
     not less than 64K and not more than 8MB to better match expected
     behavior

   - fix for crediting (flow control) important to properly verify that
     sufficient credits are available for the requested operation

   - seven additional buffer overflow, buffer validation checks"

* tag '5.15-rc6-ksmbd-fixes' of git://git.samba.org/ksmbd:
  ksmbd: add buffer validation in session setup
  ksmbd: throttle session setup failures to avoid dictionary attacks
  ksmbd: validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requests
  ksmbd: validate credit charge after validating SMB2 PDU body size
  ksmbd: add buffer validation for smb direct
  ksmbd: limit read/write/trans buffer size not to exceed 8MB
  ksmbd: validate compound response buffer
  ksmbd: fix potencial 32bit overflow from data area check in smb2_write
  ksmbd: improve credits management
  ksmbd: add validation in smb2_ioctl
2021-10-24 06:43:59 -10:00
Linus Torvalds
0f386a604c Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Ten fixes, seven of which are in drivers.

  The core fixes are one to fix a potential crash on resume, one to sort
  out our reference count releases to avoid releasing in-use modules and
  one to adjust the cmd per lun calculation to avoid an overflow in
  hyper-v"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: ufs-pci: Force a full restore after suspend-to-disk
  scsi: qla2xxx: Fix unmap of already freed sgl
  scsi: qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()
  scsi: qla2xxx: Return -ENOMEM if kzalloc() fails
  scsi: sd: Fix crashes in sd_resume_runtime()
  scsi: mpi3mr: Fix duplicate device entries when scanning through sysfs
  scsi: core: Put LLD module refcnt after SCSI device is released
  scsi: storvsc: Fix validation for unsolicited incoming packets
  scsi: iscsi: Fix set_param() handling
  scsi: core: Fix shost->cmd_per_lun calculation in scsi_add_host_with_dma()
2021-10-24 06:23:48 -10:00
Yuiko Oshino
95a359c955 net: ethernet: microchip: lan743x: Fix dma allocation failure by using dma_set_mask_and_coherent
The dma failure was reported in the raspberry pi github (issue #4117).
https://github.com/raspberrypi/linux/issues/4117
The use of dma_set_mask_and_coherent fixes the issue.
Tested on 32/64-bit raspberry pi CM4 and 64-bit ubuntu x86 PC with EVB-LAN7430.

Fixes: 23f0703c12 ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-24 13:38:56 +01:00
Yuiko Oshino
d6423d2ec3 net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume fails
The driver needs to clean up and return when the initialization fails on resume.

Fixes: 23f0703c12 ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-24 13:37:48 +01:00
Linus Torvalds
9c0c4d24ac Merge tag 'block-5.15-2021-10-22' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Fix for the cgroup code not ussing irq safe stats updates, and one fix
  for an error handling condition in add_partition()"

* tag 'block-5.15-2021-10-22' of git://git.kernel.dk/linux-block:
  block: fix incorrect references to disk objects
  blk-cgroup: blk_cgroup_bio_start() should use irq-safe operations on blkg->iostat_cpu
2021-10-22 17:42:13 -10:00
Linus Torvalds
da4d34b669 Merge tag 'io_uring-5.15-2021-10-22' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "Two fixes for the max workers limit API that was introduced this
  series: one fix for an issue with that code, and one fixing a linked
  timeout regression in this series"

* tag 'io_uring-5.15-2021-10-22' of git://git.kernel.dk/linux-block:
  io_uring: apply worker limits to previous users
  io_uring: fix ltimeout unprep
  io_uring: apply max_workers limit to all future users
  io-wq: max_worker fixes
2021-10-22 17:34:31 -10:00
Quanyang Wang
04f8ef5643 cgroup: Fix memory leak caused by missing cgroup_bpf_offline
When enabling CONFIG_CGROUP_BPF, kmemleak can be observed by running
the command as below:

    $mount -t cgroup -o none,name=foo cgroup cgroup/
    $umount cgroup/

unreferenced object 0xc3585c40 (size 64):
  comm "mount", pid 425, jiffies 4294959825 (age 31.990s)
  hex dump (first 32 bytes):
    01 00 00 80 84 8c 28 c0 00 00 00 00 00 00 00 00  ......(.........
    00 00 00 00 00 00 00 00 6c 43 a0 c3 00 00 00 00  ........lC......
  backtrace:
    [<e95a2f9e>] cgroup_bpf_inherit+0x44/0x24c
    [<1f03679c>] cgroup_setup_root+0x174/0x37c
    [<ed4b0ac5>] cgroup1_get_tree+0x2c0/0x4a0
    [<f85b12fd>] vfs_get_tree+0x24/0x108
    [<f55aec5c>] path_mount+0x384/0x988
    [<e2d5e9cd>] do_mount+0x64/0x9c
    [<208c9cfe>] sys_mount+0xfc/0x1f4
    [<06dd06e0>] ret_fast_syscall+0x0/0x48
    [<a8308cb3>] 0xbeb4daa8

This is because that since the commit 2b0d3d3e4f ("percpu_ref: reduce
memory footprint of percpu_ref in fast path") root_cgrp->bpf.refcnt.data
is allocated by the function percpu_ref_init in cgroup_bpf_inherit which
is called by cgroup_setup_root when mounting, but not freed along with
root_cgrp when umounting. Adding cgroup_bpf_offline which calls
percpu_ref_kill to cgroup_kill_sb can free root_cgrp->bpf.refcnt.data in
umount path.

This patch also fixes the commit 4bfc0bb2c6 ("bpf: decouple the lifetime
of cgroup_bpf from cgroup itself"). A cgroup_bpf_offline is needed to do a
cleanup that frees the resources which are allocated by cgroup_bpf_inherit
in cgroup_setup_root.

And inside cgroup_bpf_offline, cgroup_get() is at the beginning and
cgroup_put is at the end of cgroup_bpf_release which is called by
cgroup_bpf_offline. So cgroup_bpf_offline can keep the balance of
cgroup's refcount.

Fixes: 2b0d3d3e4f ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
Fixes: 4bfc0bb2c6 ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211018075623.26884-1-quanyang.wang@windriver.com
2021-10-22 17:23:54 -07:00
Xu Kuohai
fda7a38714 bpf: Fix error usage of map_fd and fdget() in generic_map_update_batch()
1. The ufd in generic_map_update_batch() should be read from batch.map_fd;
2. A call to fdget() should be followed by a symmetric call to fdput().

Fixes: aa2e93b8e5 ("bpf: Add generic support for update and delete batch ops")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211019032934.1210517-1-xukuohai@huawei.com
2021-10-22 17:23:54 -07:00
Alexei Starovoitov
22a127908e Merge branch 'Fix up bpf_jit_limit some more'
Lorenz Bauer says:

====================
Fix some inconsistencies of bpf_jit_limit on non-x86 platforms.
I've dropped exposing bpf_jit_current since we couldn't agree on
file modes, correct names, etc.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-10-22 17:23:53 -07:00
Lorenz Bauer
fadb7ff1a6 bpf: Prevent increasing bpf_jit_limit above max
Restrict bpf_jit_limit to the maximum supported by the arch's JIT.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211014142554.53120-4-lmb@cloudflare.com
2021-10-22 17:23:53 -07:00
Lorenz Bauer
5d63ae9082 bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT
Expose the maximum amount of useable memory from the arm64 JIT.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211014142554.53120-3-lmb@cloudflare.com
2021-10-22 17:23:53 -07:00
Lorenz Bauer
8f04db78e4 bpf: Define bpf_jit_alloc_exec_limit for riscv JIT
Expose the maximum amount of useable memory from the riscv JIT.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Luke Nelson <luke.r.nels@gmail.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20211014142554.53120-2-lmb@cloudflare.com
2021-10-22 17:23:53 -07:00
Florian Westphal
1f83b835a3 fcnal-test: kill hanging ping/nettest binaries on cleanup
On my box I see a bunch of ping/nettest processes hanging
around after fcntal-test.sh is done.

Clean those up before netns deletion.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211021140247.29691-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 14:03:18 -07:00
Linus Torvalds
5ab2ed0a8d Merge tag 'fuse-fixes-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
 "Syzbot discovered a race in case of reusing the fuse sb (introduced in
  this cycle).

  Fix it by doing the s_fs_info initialization at the proper place"

* tag 'fuse-fixes-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: clean up error exits in fuse_fill_super()
  fuse: always initialize sb->s_fs_info
  fuse: clean up fuse_mount destruction
  fuse: get rid of fuse_put_super()
  fuse: check s_root when destroying sb
2021-10-22 10:39:47 -10:00
Linus Torvalds
477b4e80c5 Merge tag 'hyperv-fixes-signed-20211022' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyper-v fix from Wei Liu:

 - Fix vmbus ARM64 build (Arnd Bergmann)

* tag 'hyperv-fixes-signed-20211022' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  hyperv/vmbus: include linux/bitops.h
2021-10-22 10:31:32 -10:00
Jakub Kicinski
32f8807a48 Merge branch 'sctp-enhancements-for-the-verification-tag'
Xin Long says:

====================
sctp: enhancements for the verification tag

This patchset is to address CVE-2021-3772:

  A flaw was found in the Linux SCTP stack. A blind attacker may be able to
  kill an existing SCTP association through invalid chunks if the attacker
  knows the IP-addresses and port numbers being used and the attacker can
  send packets with spoofed IP addresses.

This is caused by the missing VTAG verification for the received chunks
and the incorrect vtag for the ABORT used to reply to these invalid
chunks.

This patchset is to go over all processing functions for the received
chunks and do:

1. Make sure sctp_vtag_verify() is called firstly to verify the vtag from
   the received chunk and discard this chunk if it fails. With some
   exceptions:

   a. sctp_sf_do_5_1B_init()/5_2_2_dupinit()/9_2_reshutack(), processing
      INIT chunk, as sctphdr vtag is always 0 in INIT chunk.

   b. sctp_sf_do_5_2_4_dupcook(), processing dupicate COOKIE_ECHO chunk,
      as the vtag verification will be done by sctp_tietags_compare() and
      then it takes right actions according to the return.

   c. sctp_sf_shut_8_4_5(), processing SHUTDOWN_ACK chunk for cookie_wait
      and cookie_echoed state, as RFC demand sending a SHUTDOWN_COMPLETE
      even if the vtag verification failed.

   d. sctp_sf_ootb(), called in many types of chunks for closed state or
      no asoc, as the same reason to c.

2. Always use the vtag from the received INIT chunk to make the response
   ABORT in sctp_ootb_pkt_new().

3. Fix the order for some checks and add some missing checks for the
   received chunk.

This patch series has been tested with SCTP TAHI testing to make sure no
regression caused on protocol conformance.
====================

Link: https://lore.kernel.org/r/cover.1634730082.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:47 -07:00
Xin Long
9d02831e51 sctp: add vtag check in sctp_sf_ootb
sctp_sf_ootb() is called when processing DATA chunk in closed state,
and many other places are also using it.

The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.

When fails to verify the vtag from the chunk, this patch sets asoc
to NULL, so that the abort will be made with the vtag from the
received chunk later.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:45 -07:00
Xin Long
ef16b1734f sctp: add vtag check in sctp_sf_do_8_5_1_E_sa
sctp_sf_do_8_5_1_E_sa() is called when processing SHUTDOWN_ACK chunk
in cookie_wait and cookie_echoed state.

The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.

Note that when fails to verify the vtag from SHUTDOWN-ACK chunk,
SHUTDOWN COMPLETE message will still be sent back to peer, but
with the vtag from SHUTDOWN-ACK chunk, as said in 5) of
rfc4960#section-8.4.

While at it, also remove the unnecessary chunk length check from
sctp_sf_shut_8_4_5(), as it's already done in both places where
it calls sctp_sf_shut_8_4_5().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:44 -07:00
Xin Long
aa0f697e45 sctp: add vtag check in sctp_sf_violation
sctp_sf_violation() is called when processing HEARTBEAT_ACK chunk
in cookie_wait state, and some other places are also using it.

The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:44 -07:00
Xin Long
a64b341b86 sctp: fix the processing for COOKIE_ECHO chunk
1. In closed state: in sctp_sf_do_5_1D_ce():

  When asoc is NULL, making packet for abort will use chunk's vtag
  in sctp_ootb_pkt_new(). But when asoc exists, vtag from the chunk
  should be verified before using peer.i.init_tag to make packet
  for abort in sctp_ootb_pkt_new(), and just discard it if vtag is
  not correct.

2. In the other states: in sctp_sf_do_5_2_4_dupcook():

  asoc always exists, but duplicate cookie_echo's vtag will be
  handled by sctp_tietags_compare() and then take actions, so before
  that we only verify the vtag for the abort sent for invalid chunk
  length.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:44 -07:00
Xin Long
438b95a7c9 sctp: fix the processing for INIT_ACK chunk
Currently INIT_ACK chunk in non-cookie_echoed state is processed in
sctp_sf_discard_chunk() to send an abort with the existent asoc's
vtag if the chunk length is not valid. But the vtag in the chunk's
sctphdr is not verified, which may be exploited by one to cook a
malicious chunk to terminal a SCTP asoc.

sctp_sf_discard_chunk() also is called in many other places to send
an abort, and most of those have this problem. This patch is to fix
it by sending abort with the existent asoc's vtag only if the vtag
from the chunk's sctphdr is verified in sctp_sf_discard_chunk().

Note on sctp_sf_do_9_1_abort() and sctp_sf_shutdown_pending_abort(),
the chunk length has been verified before sctp_sf_discard_chunk(),
so replace it with sctp_sf_discard(). On sctp_sf_do_asconf_ack() and
sctp_sf_do_asconf(), move the sctp_chunk_length_valid check ahead of
sctp_sf_discard_chunk(), then replace it with sctp_sf_discard().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:44 -07:00
Xin Long
eae5783908 sctp: fix the processing for INIT chunk
This patch fixes the problems below:

1. In non-shutdown_ack_sent states: in sctp_sf_do_5_1B_init() and
   sctp_sf_do_5_2_2_dupinit():

  chunk length check should be done before any checks that may cause
  to send abort, as making packet for abort will access the init_tag
  from init_hdr in sctp_ootb_pkt_new().

2. In shutdown_ack_sent state: in sctp_sf_do_9_2_reshutack():

  The same checks as does in sctp_sf_do_5_2_2_dupinit() is needed
  for sctp_sf_do_9_2_reshutack().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:43 -07:00
Xin Long
4f7019c7eb sctp: use init_tag from inithdr for ABORT chunk
Currently Linux SCTP uses the verification tag of the existing SCTP
asoc when failing to process and sending the packet with the ABORT
chunk. This will result in the peer accepting the ABORT chunk and
removing the SCTP asoc. One could exploit this to terminate a SCTP
asoc.

This patch is to fix it by always using the initiate tag of the
received INIT chunk for the ABORT chunk to be sent.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:43 -07:00
Vasily Averin
7f678def99 skb_expand_head() adjust skb->truesize incorrectly
Christoph Paasch reports [1] about incorrect skb->truesize
after skb_expand_head() call in ip6_xmit.
This may happen because of two reasons:
- skb_set_owner_w() for newly cloned skb is called too early,
before pskb_expand_head() where truesize is adjusted for (!skb-sk) case.
- pskb_expand_head() does not adjust truesize in (skb->sk) case.
In this case sk->sk_wmem_alloc should be adjusted too.

[1] https://lkml.org/lkml/2021/8/20/1082

Fixes: f1260ff15a ("skbuff: introduce skb_expand_head()")
Fixes: 2d85a1b31d ("ipv6: ip6_finish_output2: set sk into newly allocated nskb")
Reported-by: Christoph Paasch <christoph.paasch@gmail.com>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/644330dd-477e-0462-83bf-9f514c41edd1@virtuozzo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:35:51 -07:00
Arnd Bergmann
8017c99680 hyperv/vmbus: include linux/bitops.h
On arm64 randconfig builds, hyperv sometimes fails with this
error:

In file included from drivers/hv/hv_trace.c:3:
In file included from drivers/hv/hyperv_vmbus.h:16:
In file included from arch/arm64/include/asm/sync_bitops.h:5:
arch/arm64/include/asm/bitops.h:11:2: error: only <linux/bitops.h> can be included directly
In file included from include/asm-generic/bitops/hweight.h:5:
include/asm-generic/bitops/arch_hweight.h:9:9: error: implicit declaration of function '__sw_hweight32' [-Werror,-Wimplicit-function-declaration]
include/asm-generic/bitops/atomic.h:17:7: error: implicit declaration of function 'BIT_WORD' [-Werror,-Wimplicit-function-declaration]

Include the correct header first.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211018131929.2260087-1-arnd@kernel.org
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-10-22 19:16:08 +00:00
Linus Torvalds
1d4590f506 Merge tag 'acpi-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "These fix two regressions, one related to ACPI power resources
  management and one that broke ACPI tools compilation.

  Specifics:

   - Stop turning off unused ACPI power resources in an unknown state to
     address a regression introduced during the 5.14 cycle (Rafael
     Wysocki).

   - Fix an ACPI tools build issue introduced recently when the minimal
     stdarg.h was added (Miguel Bernal Marin)"

* tag 'acpi-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: PM: Do not turn off power resources in unknown state
  ACPI: tools: fix compilation error
2021-10-22 09:08:08 -10:00
Linus Torvalds
cd82c4a73b Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more x86 kvm fixes from Paolo Bonzini:

 - Cache coherency fix for SEV live migration

 - Fix for instruction emulation with PKU

 - fixes for rare delaying of interrupt delivery

 - fix for SEV-ES buffer overflow

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SEV-ES: go over the sev_pio_data buffer in multiple passes if needed
  KVM: SEV-ES: keep INS functions together
  KVM: x86: remove unnecessary arguments from complete_emulator_pio_in
  KVM: x86: split the two parts of emulator_pio_in
  KVM: SEV-ES: clean up kvm_sev_es_ins/outs
  KVM: x86: leave vcpu->arch.pio.count alone in emulator_pio_in_out
  KVM: SEV-ES: rename guest_ins_data to sev_pio_data
  KVM: SEV: Flush cache on non-coherent systems before RECEIVE_UPDATE_DATA
  KVM: MMU: Reset mmu->pkru_mask to avoid stale data
  KVM: nVMX: promptly process interrupts delivered while in guest mode
  KVM: x86: check for interrupts before deciding whether to exit the fast path
2021-10-22 09:02:15 -10:00
Rafael J. Wysocki
7a7489005a Merge branch 'acpi-tools'
Merge a fix for a recent ACPI tools bild regresson.

* acpi-tools:
  ACPI: tools: fix compilation error
2021-10-22 20:45:10 +02:00
Jakub Kicinski
7fcb1c950e Merge tag 'mac80211-for-net-2021-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
Two small fixes:
 * RCU misuse in scan processing in cfg80211
 * missing size check for HE data in mac80211 mesh

* tag 'mac80211-for-net-2021-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211:
  cfg80211: scan: fix RCU in cfg80211_add_nontrans_list()
  mac80211: mesh: fix HE operation element length check
====================

Link: https://lore.kernel.org/r/20211021154351.134297-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 11:12:46 -07:00
Paolo Bonzini
95e16b4792 KVM: SEV-ES: go over the sev_pio_data buffer in multiple passes if needed
The PIO scratch buffer is larger than a single page, and therefore
it is not possible to copy it in a single step to vcpu->arch/pio_data.
Bound each call to emulator_pio_in/out to a single page; keep
track of how many I/O operations are left in vcpu->arch.sev_pio_count,
so that the operation can be restarted in the complete_userspace_io
callback.

For OUT, this means that the previous kvm_sev_es_outs implementation
becomes an iterator of the loop, and we can consume the sev_pio_data
buffer before leaving to userspace.

For IN, instead, consuming the buffer and decreasing sev_pio_count
is always done in the complete_userspace_io callback, because that
is when the memcpy is done into sev_pio_data.

Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reported-by: Felix Wilhelm <fwilhelm@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-22 10:09:13 -04:00
Paolo Bonzini
4fa4b38dae KVM: SEV-ES: keep INS functions together
Make the diff a little nicer when we actually get to fixing
the bug.  No functional change intended.

Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-22 10:08:51 -04:00
Paolo Bonzini
6b5efc930b KVM: x86: remove unnecessary arguments from complete_emulator_pio_in
complete_emulator_pio_in can expect that vcpu->arch.pio has been filled in,
and therefore does not need the size and count arguments.  This makes things
nicer when the function is called directly from a complete_userspace_io
callback.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-22 10:08:38 -04:00
Paolo Bonzini
3b27de2718 KVM: x86: split the two parts of emulator_pio_in
emulator_pio_in handles both the case where the data is pending in
vcpu->arch.pio.count, and the case where I/O has to be done via either
an in-kernel device or a userspace exit.  For SEV-ES we would like
to split these, to identify clearly the moment at which the
sev_pio_data is consumed.  To this end, create two different
functions: __emulator_pio_in fills in vcpu->arch.pio.count, while
complete_emulator_pio_in clears it and releases vcpu->arch.pio.data.

Because this patch has to be backported, things are left a bit messy.
kernel_pio() operates on vcpu->arch.pio, which leads to emulator_pio_in()
having with two calls to complete_emulator_pio_in().  It will be fixed
in the next release.

While at it, remove the unused void* val argument of emulator_pio_in_out.
The function currently hardcodes vcpu->arch.pio_data as the
source/destination buffer, which sucks but will be fixed after the more
severe SEV-ES buffer overflow.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-22 10:08:00 -04:00
Paolo Bonzini
ea724ea420 KVM: SEV-ES: clean up kvm_sev_es_ins/outs
A few very small cleanups to the functions, smushed together because
the patch is already very small like this:

- inline emulator_pio_in_emulated and emulator_pio_out_emulated,
  since we already have the vCPU

- remove the data argument and pull setting vcpu->arch.sev_pio_data into
  the caller

- remove unnecessary clearing of vcpu->arch.pio.count when
  emulation is done by the kernel (and therefore vcpu->arch.pio.count
  is already clear on exit from emulator_pio_in and emulator_pio_out).

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-22 10:02:20 -04:00
Paolo Bonzini
0d33b1baeb KVM: x86: leave vcpu->arch.pio.count alone in emulator_pio_in_out
Currently emulator_pio_in clears vcpu->arch.pio.count twice if
emulator_pio_in_out performs kernel PIO.  Move the clear into
emulator_pio_out where it is actually necessary.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-22 10:02:07 -04:00
Paolo Bonzini
b5998402e3 KVM: SEV-ES: rename guest_ins_data to sev_pio_data
We will be using this field for OUTS emulation as well, in case the
data that is pushed via OUTS spans more than one page.  In that case,
there will be a need to save the data pointer across exits to userspace.

So, change the name to something that refers to any kind of PIO.
Also spell out what it is used for, namely SEV-ES.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-22 10:01:26 -04:00
Tianjia Zhang
f8690a4b5a crypto: x86/sm4 - Fix invalid section entry size
This fixes the following warning:

  vmlinux.o: warning: objtool: elf_update: invalid section entry size

The size of the rodata section is 164 bytes, directly using the
entry_size of 164 bytes will cause errors in some versions of the
gcc compiler, while using 16 bytes directly will cause errors in
the clang compiler. This patch correct it by filling the size of
rodata to a 16-byte boundary.

Fixes: a7ee22ee14 ("crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation")
Fixes: 5b2efa2bb8 ("crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Tested-by: Heyuan Shi <heyuan@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-10-22 20:23:01 +08:00
Xie Yongji
0943aacf5a vduse: Fix race condition between resetting and irq injecting
The interrupt might be triggered after a reset since there is
no synchronization between resetting and irq injecting. And it
might break something if the interrupt is delayed until a new
round of device initialization.

Fixes: c8a6153b6c ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210929083050.88-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-22 06:49:14 -04:00
Xie Yongji
1394103fd7 vduse: Disallow injecting interrupt before DRIVER_OK is set
The interrupt callback should not be triggered before DRIVER_OK
is set. Otherwise, it might break the virtio device driver.
So let's add a check to avoid the unexpected behavior.

Fixes: c8a6153b6c ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210923075722.98-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-22 06:49:14 -04:00
Daniel Vetter
ee71fb6c4d drm/i915/selftests: Properly reset mock object propers for each test
I forgot to do this properly in

commit 6f11f37459
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Jul 23 10:34:55 2021 +0200

    drm/plane: remove drm_helper_get_plane_damage_clips

intel-gfx CI didn't spot this because we run each selftest in each own
invocations, which means reloading i915.ko. But if you just run all
the selftests in one go at boot-up, then it falls apart and eventually
we cross over the hardcoded limited of how many properties can be
attached to a single object.

Fix this by resetting the property count. Nothing else to clean up
since it's all static storage anyway.

Reported-and-tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: 6f11f37459 ("drm/plane: remove drm_helper_get_plane_damage_clips")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211021202048.2638668-1-daniel.vetter@ffwll.ch
2021-10-22 11:09:45 +02:00
Linus Torvalds
6422251513 Merge tag 'drm-fixes-2021-10-22' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Nothing too crazy at the end of the cycle, the kmb modesetting fixes
  are probably a bit large but it's not a major driver, and its fixing
  monitor doesn't turn on type problems.

  Otherwise it's just a few minor patches, one ast regression revert, an
  msm power stability fix.

  ast:
   - fix regression with connector detect

  msm:
   - fix power stability issue

  msxfb:
   - fix crash on unload

  panel:
   - sync fix

  kmb:
   - modesetting fixes"

* tag 'drm-fixes-2021-10-22' of git://anongit.freedesktop.org/drm/drm:
  Revert "drm/ast: Add detect function support"
  drm/kmb: Enable ADV bridge after modeset
  drm/kmb: Corrected typo in handle_lcd_irq
  drm/kmb: Disable change of plane parameters
  drm/kmb: Remove clearing DPHY regs
  drm/kmb: Limit supported mode to 1080p
  drm/kmb: Work around for higher system clock
  drm/panel: ilitek-ili9881c: Fix sync for Feixin K101-IM2BYL02 panel
  drm: mxsfb: Fix NULL pointer dereference crash on unload
  drm/msm/devfreq: Restrict idle clamping to a618 for now
2021-10-21 19:06:08 -10:00
Mike Rapoport
658aafc813 memblock: exclude MEMBLOCK_NOMAP regions from kmemleak
Vladimir Zapolskiy reports:

Commit a7259df767 ("memblock: make memblock_find_in_range method
private") invokes a kernel panic while running kmemleak on OF platforms
with nomaped regions:

  Unable to handle kernel paging request at virtual address fff000021e00000
  [...]
    scan_block+0x64/0x170
    scan_gray_list+0xe8/0x17c
    kmemleak_scan+0x270/0x514
    kmemleak_write+0x34c/0x4ac

The memory allocated from memblock is registered with kmemleak, but if
it is marked MEMBLOCK_NOMAP it won't have linear map entries so an
attempt to scan such areas will fault.

Ideally, memblock_mark_nomap() would inform kmemleak to ignore
MEMBLOCK_NOMAP memory, but it can be called before kmemleak interfaces
operating on physical addresses can use __va() conversion.

Make sure that functions that mark allocated memory as MEMBLOCK_NOMAP
take care of informing kmemleak to ignore such memory.

Link: https://lore.kernel.org/all/8ade5174-b143-d621-8c8e-dc6a1898c6fb@linaro.org
Link: https://lore.kernel.org/all/c30ff0a2-d196-c50d-22f0-bd50696b1205@quicinc.com
Fixes: a7259df767 ("memblock: make memblock_find_in_range method private")
Reported-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Tested-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-21 18:30:49 -10:00
Mike Rapoport
6c9a545519 Revert "memblock: exclude NOMAP regions from kmemleak"
Commit 6e44bd6d34 ("memblock: exclude NOMAP regions from kmemleak")
breaks boot on EFI systems with kmemleak and VM_DEBUG enabled:

  efi: Processing EFI memory map:
  efi:   0x000090000000-0x000091ffffff [Conventional|   |  |  |  |  |  |  |  |  |   |WB|WT|WC|UC]
  efi:   0x000092000000-0x0000928fffff [Runtime Data|RUN|  |  |  |  |  |  |  |  |   |WB|WT|WC|UC]
  ------------[ cut here ]------------
  kernel BUG at mm/kmemleak.c:1140!
  Internal error: Oops - BUG: 0 [#1] SMP
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-rc6-next-20211019+ #104
  pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : kmemleak_free_part_phys+0x64/0x8c
  lr : kmemleak_free_part_phys+0x38/0x8c
  sp : ffff800011eafbc0
  x29: ffff800011eafbc0 x28: 1fffff7fffb41c0d x27: fffffbfffda0e068
  x26: 0000000092000000 x25: 1ffff000023d5f94 x24: ffff800011ed84d0
  x23: ffff800011ed84c0 x22: ffff800011ed83d8 x21: 0000000000900000
  x20: ffff800011782000 x19: 0000000092000000 x18: ffff800011ee0730
  x17: 0000000000000000 x16: 0000000000000000 x15: 1ffff0000233252c
  x14: ffff800019a905a0 x13: 0000000000000001 x12: ffff7000023d5ed7
  x11: 1ffff000023d5ed6 x10: ffff7000023d5ed6 x9 : dfff800000000000
  x8 : ffff800011eaf6b7 x7 : 0000000000000001 x6 : ffff800011eaf6b0
  x5 : 00008ffffdc2a12a x4 : ffff7000023d5ed7 x3 : 1ffff000023dbf99
  x2 : 1ffff000022f0463 x1 : 0000000000000000 x0 : ffffffffffffffff
  Call trace:
   kmemleak_free_part_phys+0x64/0x8c
   memblock_mark_nomap+0x5c/0x78
   reserve_regions+0x294/0x33c
   efi_init+0x2d0/0x490
   setup_arch+0x80/0x138
   start_kernel+0xa0/0x3ec
   __primary_switched+0xc0/0xc8
  Code: 34000041 97d526e7 f9418e80 36000040 (d4210000)
  random: get_random_bytes called from print_oops_end_marker+0x34/0x80 with crng_init=0
  ---[ end trace 0000000000000000 ]---

The crash happens because kmemleak_free_part_phys() tries to use __va()
before memstart_addr is initialized and this triggers a VM_BUG_ON() in
arch/arm64/include/asm/memory.h:

Revert 6e44bd6d34 ("memblock: exclude NOMAP regions from kmemleak"),
the issue it is fixing will be fixed differently.

Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-21 18:30:49 -10:00
Linus Torvalds
9d235ac01f Merge branch 'ucount-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ucounts fixes from Eric Biederman:
 "There has been one very hard to track down bug in the ucount code that
  we have been tracking since roughly v5.14 was released. Alex managed
  to find a reliable reproducer a few days ago and then I was able to
  instrument the code and figure out what the issue was.

  It turns out the sigqueue_alloc single atomic operation optimization
  did not play nicely with ucounts multiple level rlimits. It turned out
  that either sigqueue_alloc or sigqueue_free could be operating on
  multiple levels and trigger the conditions for the optimization on
  more than one level at the same time.

  To deal with that situation I have introduced inc_rlimit_get_ucounts
  and dec_rlimit_put_ucounts that just focuses on the optimization and
  the rlimit and ucount changes.

  While looking into the big bug I found I couple of other little issues
  so I am including those fixes here as well.

  When I have time I would very much like to dig into process ownership
  of the shared signal queue and see if we could pick a single owner for
  the entire queue so that all of the rlimits can count to that owner.
  That should entirely remove the need to call get_ucounts and
  put_ucounts in sigqueue_alloc and sigqueue_free. It is difficult
  because Linux unlike POSIX supports setuid that works on a single
  thread"

* 'ucount-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: Move get_ucounts from cred_alloc_blank to key_change_session_keyring
  ucounts: Proper error handling in set_cred_ucounts
  ucounts: Pair inc_rlimit_ucounts with dec_rlimit_ucoutns in commit_creds
  ucounts: Fix signal ucount refcounting
2021-10-21 17:27:17 -10:00
Linus Torvalds
6c2c712767 Merge tag 'net-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, and can.

  We'll have one more fix for a socket accounting regression, it's still
  getting polished. Otherwise things look fine.

  Current release - regressions:

   - revert "vrf: reset skb conntrack connection on VRF rcv", there are
     valid uses for previous behavior

   - can: m_can: fix iomap_read_fifo() and iomap_write_fifo()

  Current release - new code bugs:

   - mlx5: e-switch, return correct error code on group creation failure

  Previous releases - regressions:

   - sctp: fix transport encap_port update in sctp_vtag_verify

   - stmmac: fix E2E delay mechanism (in PTP timestamping)

  Previous releases - always broken:

   - netfilter: ip6t_rt: fix out-of-bounds read of ipv6_rt_hdr

   - netfilter: xt_IDLETIMER: fix out-of-bound read caused by lack of
     init

   - netfilter: ipvs: make global sysctl read-only in non-init netns

   - tcp: md5: fix selection between vrf and non-vrf keys

   - ipv6: count rx stats on the orig netdev when forwarding

   - bridge: mcast: use multicast_membership_interval for IGMPv3

   - can:
      - j1939: fix UAF for rx_kref of j1939_priv abort sessions on
        receiving bad messages

      - isotp: fix TX buffer concurrent access in isotp_sendmsg() fix
        return error on FC timeout on TX path

   - ice: fix re-init of RDMA Tx queues and crash if RDMA was not inited

   - hns3: schedule the polling again when allocation fails, prevent
     stalls

   - drivers: add missing of_node_put() when aborting
     for_each_available_child_of_node()

   - ptp: fix possible memory leak and UAF in ptp_clock_register()

   - e1000e: fix packet loss in burst mode on Tiger Lake and later

   - mlx5e: ipsec: fix more checksum offload issues"

* tag 'net-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits)
  usbnet: sanity check for maxpacket
  net: enetc: make sure all traffic classes can send large frames
  net: enetc: fix ethtool counter name for PM0_TERR
  ptp: free 'vclock_index' in ptp_clock_release()
  sfc: Don't use netif_info before net_device setup
  sfc: Export fibre-specific supported link modes
  net/mlx5e: IPsec: Fix work queue entry ethernet segment checksum flags
  net/mlx5e: IPsec: Fix a misuse of the software parser's fields
  net/mlx5e: Fix vlan data lost during suspend flow
  net/mlx5: E-switch, Return correct error code on group creation failure
  net/mlx5: Lag, change multipath and bonding to be mutually exclusive
  ice: Add missing E810 device ids
  igc: Update I226_K device ID
  e1000e: Fix packet loss on Tiger Lake and later
  e1000e: Separate TGP board type from SPT
  ptp: Fix possible memory leak in ptp_clock_register()
  net: stmmac: Fix E2E delay mechanism
  nfc: st95hf: Make spi remove() callback return zero
  net: hns3: disable sriov before unload hclge layer
  net: hns3: fix vf reset workqueue cannot exit
  ...
2021-10-21 15:36:50 -10:00
Linus Torvalds
0a3221b658 Merge tag 'powerpc-5.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - Fix a bug exposed by a previous fix, where running guests with
   certain SMT topologies could crash the host on Power8.

 - Fix atomic sleep warnings when re-onlining CPUs, when PREEMPT is
   enabled.

Thanks to Nathan Lynch, Srikar Dronamraju, and Valentin Schneider.

* tag 'powerpc-5.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/smp: do not decrement idle task preempt count in CPU offline
  powerpc/idle: Don't corrupt back chain when going idle
2021-10-21 15:30:09 -10:00
Kim Phillips
595cb5e0b8 Revert "drm/ast: Add detect function support"
This reverts commit aae74ff9ca,
since it prevents my AMD Milan system from booting, with:

[   27.189558] BUG: kernel NULL pointer dereference, address: 0000000000000000
[   27.197506] #PF: supervisor write access in kernel mode
[   27.203333] #PF: error_code(0x0002) - not-present page
[   27.209064] PGD 0 P4D 0
[   27.211885] Oops: 0002 [#1] PREEMPT SMP NOPTI
[   27.216744] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.0-rc6+ #15
[   27.223928] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS RXM1006B 08/20/2021
[   27.232955] RIP: 0010:run_timer_softirq+0x38b/0x4a0
[   27.238397] Code: 4c 89 f7 e8 37 27 ac 00 49 c7 46 08 00 00 00 00 49 8b 04 24 48 85 c0 74 71 4d 8b 3c 24 4d 89 7e 08 66 90 49 8b 07 49 8b 57 08 <48> 89 02 48 85 c0 74 04 48 89 50 08 49 8b 77 18 41 f6 47 22 20 4c
[   27.259350] RSP: 0018:ffffc42d00003ee8 EFLAGS: 00010086
[   27.265176] RAX: dead000000000122 RBX: 0000000000000000 RCX: 0000000000000101
[   27.273134] RDX: 0000000000000000 RSI: 0000000000000087 RDI: 0000000000000001
[   27.281084] RBP: ffffc42d00003f70 R08: 0000000000000000 R09: 00000000000003eb
[   27.289043] R10: ffffa0860cb300d0 R11: ffffa0c44de290b0 R12: ffffc42d00003ef8
[   27.297002] R13: 00000000fffef200 R14: ffffa0c44de18dc0 R15: ffffa0867a882350
[   27.304961] FS:  0000000000000000(0000) GS:ffffa0c44de00000(0000) knlGS:0000000000000000
[   27.313988] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   27.320396] CR2: 0000000000000000 CR3: 000000014569c001 CR4: 0000000000770ef0
[   27.328346] PKRU: 55555554
[   27.331359] Call Trace:
[   27.334073]  <IRQ>
[   27.336314]  ? __queue_work+0x420/0x420
[   27.340589]  ? lapic_next_event+0x21/0x30
[   27.345060]  ? clockevents_program_event+0x8f/0xe0
[   27.350402]  __do_softirq+0xfb/0x2db
[   27.354388]  irq_exit_rcu+0x98/0xd0
[   27.358275]  sysvec_apic_timer_interrupt+0xac/0xd0
[   27.363620]  </IRQ>
[   27.365955]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[   27.371685] RIP: 0010:cpuidle_enter_state+0xcc/0x390
[   27.377292] Code: 3d 01 79 0a 50 e8 44 ed 77 ff 49 89 c6 0f 1f 44 00 00 31 ff e8 f5 f8 77 ff 80 7d d7 00 0f 85 e6 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 ff 0f 88 17 01 00 00 49 63 c7 4c 2b 75 c8 48 8d 14 40 48 8d
[   27.398243] RSP: 0018:ffffffffb0e03dc8 EFLAGS: 00000246
[   27.404069] RAX: ffffa0c44de00000 RBX: 0000000000000001 RCX: 000000000000001f
[   27.412028] RDX: 0000000000000000 RSI: ffffffffb0bafc1f RDI: ffffffffb0bbdb81
[   27.419986] RBP: ffffffffb0e03e00 R08: 00000006549f8f3f R09: ffffffffb1065200
[   27.427935] R10: ffffa0c44de27ae4 R11: ffffa0c44de27ac4 R12: ffffa0c5634cb000
[   27.435894] R13: ffffffffb1065200 R14: 00000006549f8f3f R15: 0000000000000001
[   27.443854]  ? cpuidle_enter_state+0xbb/0x390
[   27.448712]  cpuidle_enter+0x2e/0x40
[   27.452695]  call_cpuidle+0x23/0x40
[   27.456584]  do_idle+0x1f0/0x270
[   27.460181]  cpu_startup_entry+0x20/0x30
[   27.464553]  rest_init+0xd4/0xe0
[   27.468149]  arch_call_rest_init+0xe/0x1b
[   27.472619]  start_kernel+0x6bc/0x6e2
[   27.476764]  x86_64_start_reservations+0x24/0x26
[   27.481912]  x86_64_start_kernel+0x75/0x79
[   27.486477]  secondary_startup_64_no_verify+0xb0/0xbb
[   27.492111] Modules linked in: kvm_amd(+) kvm ipmi_si(+) ipmi_devintf rapl wmi_bmof ipmi_msghandler input_leds ccp k10temp mac_hid sch_fq_codel msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul ghash_clmulni_intel syscopyarea aesni_intel sysfillrect crypto_simd sysimgblt fb_sys_fops cryptd hid_generic cec nvme ahci usbhid drm e1000e nvme_core hid libahci i2c_piix4 wmi
[   27.551789] CR2: 0000000000000000
[   27.555482] ---[ end trace 897987dfe93dccc6 ]---
[   27.560630] RIP: 0010:run_timer_softirq+0x38b/0x4a0
[   27.566069] Code: 4c 89 f7 e8 37 27 ac 00 49 c7 46 08 00 00 00 00 49 8b 04 24 48 85 c0 74 71 4d 8b 3c 24 4d 89 7e 08 66 90 49 8b 07 49 8b 57 08 <48> 89 02 48 85 c0 74 04 48 89 50 08 49 8b 77 18 41 f6 47 22 20 4c
[   27.587021] RSP: 0018:ffffc42d00003ee8 EFLAGS: 00010086
[   27.592848] RAX: dead000000000122 RBX: 0000000000000000 RCX: 0000000000000101
[   27.600808] RDX: 0000000000000000 RSI: 0000000000000087 RDI: 0000000000000001
[   27.608765] RBP: ffffc42d00003f70 R08: 0000000000000000 R09: 00000000000003eb
[   27.616716] R10: ffffa0860cb300d0 R11: ffffa0c44de290b0 R12: ffffc42d00003ef8
[   27.624673] R13: 00000000fffef200 R14: ffffa0c44de18dc0 R15: ffffa0867a882350
[   27.632624] FS:  0000000000000000(0000) GS:ffffa0c44de00000(0000) knlGS:0000000000000000
[   27.641650] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   27.648159] CR2: 0000000000000000 CR3: 000000014569c001 CR4: 0000000000770ef0
[   27.656119] PKRU: 55555554
[   27.659133] Kernel panic - not syncing: Fatal exception in interrupt
[   29.030411] Shutting down cpus with NMI
[   29.034699] Kernel Offset: 0x2e600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   29.046790] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Since unreliable, found by bisecting for KASAN's use-after-free in
enqueue_timer+0x4f/0x1e0, where the timer callback is called.

Reported-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Fixes: aae74ff9ca ("drm/ast: Add detect function support")
Link: https://lore.kernel.org/lkml/0f7871be-9ca6-5ae4-3a40-5db9a8fb2365@amd.com/
Cc: Ainux <ainux.wang@gmail.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: sterlingteng@gmail.com
Cc: chenhuacai@kernel.org
Cc: Chuck Lever III <chuck.lever@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: dri-devel <dri-devel@lists.freedesktop.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211021153006.92983-1-kim.phillips@amd.com
2021-10-22 05:52:12 +10:00
Dave Airlie
7e1c5440f4 Merge tag 'drm-misc-fixes-2021-10-21-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.15-rc7:
- Rebased, to remove vc4 patches.
- Fix mxsfb crash on unload.
- Use correct sync parameters for Feixin K101-IM2BYL02.
- Assorted kmb modeset/atomic fixes.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e66eaf89-b9b9-41f5-d0d2-dad7e59fabb5@linux.intel.com
2021-10-22 05:35:28 +10:00
Dave Airlie
730b64d827 Merge tag 'drm-msm-fixes-2021-10-18' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
One more fix for v5.15, to work around a power stability issue on a630
(and possibly others)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs1WPLthmd=ToDcEHm=u-7O38RAVJ2XwRoS8xPmC520vg@mail.gmail.com
2021-10-22 05:22:15 +10:00
Bryant Mairs
def0c36972 drm: panel-orientation-quirks: Add quirk for Aya Neo 2021
Fixes screen orientation for the Aya Neo 2021 handheld gaming console.

Signed-off-by: Bryant Mairs <bryant@mai.rs>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211019142433.4295-1-bryant@mai.rs
2021-10-21 19:33:23 +02:00
Pavel Begunkov
b22fa62a35 io_uring: apply worker limits to previous users
Another change to the API io-wq worker limitation API added in 5.15,
apply the limit to all prior users that already registered a tctx. It
may be confusing as it's now, in particular the change covers the
following 2 cases:

TASK1                   | TASK2
_________________________________________________
ring = create()         |
                        | limit_iowq_workers()
*not limited*           |

TASK1                   | TASK2
_________________________________________________
ring = create()         |
                        | issue_requests()
limit_iowq_workers()    |
                        | *not limited*

A note on locking, it's safe to traverse ->tctx_list as we hold
->uring_lock, but do that after dropping sqd->lock to avoid possible
problems. It's also safe to access tctx->io_wq there because tasks
kill it only after removing themselves from tctx_list, see
io_uring_cancel_generic() -> io_uring_clean_tctx()

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d6e09ecc3545e4dc56e43c906ee3d71b7ae21bed.1634818641.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <haoxu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-21 11:19:38 -06:00
Masahiro Kozuka
c8c340a9b4 KVM: SEV: Flush cache on non-coherent systems before RECEIVE_UPDATE_DATA
Flush the destination page before invoking RECEIVE_UPDATE_DATA, as the
PSP encrypts the data with the guest's key when writing to guest memory.
If the target memory was not previously encrypted, the cache may contain
dirty, unecrypted data that will persist on non-coherent systems.

Fixes: 15fb7de1a7 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command")
Cc: stable@vger.kernel.org
Cc: Peter Gonda <pgonda@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Masahiro Kozuka <masa.koz@kozuka.jp>
[sean: converted bug report to changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210914210951.2994260-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-21 13:01:25 -04:00
Chenyi Qiang
a3ca5281bb KVM: MMU: Reset mmu->pkru_mask to avoid stale data
When updating mmu->pkru_mask, the value can only be added but it isn't
reset in advance. This will make mmu->pkru_mask keep the stale data.
Fix this issue.

Fixes: 2d344105f5 ("KVM, pkeys: introduce pkru_mask to cache conditions")
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20211021071022.1140-1-chenyi.qiang@intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-21 11:09:29 -04:00
Oliver Neukum
397430b50a usbnet: sanity check for maxpacket
maxpacket of 0 makes no sense and oopses as we need to divide
by it. Give up.

V2: fixed typo in log and stylistic issues

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reported-by: syzbot+76bb1d34ffa0adc03baa@syzkaller.appspotmail.com
Reviewed-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211021122944.21816-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-21 06:44:53 -07:00
Vladimir Oltean
e378f4967c net: enetc: make sure all traffic classes can send large frames
The enetc driver does not implement .ndo_change_mtu, instead it
configures the MAC register field PTC{Traffic Class}MSDUR[MAXSDU]
statically to a large value during probe time.

The driver used to configure only the max SDU for traffic class 0, and
that was fine while the driver could only use traffic class 0. But with
the introduction of mqprio, sending a large frame into any other TC than
0 is broken.

This patch fixes that by replicating per traffic class the static
configuration done in enetc_configure_port_mac().

Fixes: cbe9e83594 ("enetc: Enable TC offloading with mqprio")
Reported-by: Richie Pearn <richard.pearn@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: <Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20211020173340.1089992-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-21 06:44:33 -07:00
Vladimir Oltean
fb8dc5fc8c net: enetc: fix ethtool counter name for PM0_TERR
There are two counters named "MAC tx frames", one of them is actually
incorrect. The correct name for that counter should be "MAC tx error
frames", which is symmetric to the existing "MAC rx error frames".

Fixes: 16eb4c85c9 ("enetc: Add ethtool statistics")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: <Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20211020165206.1069889-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-21 06:44:23 -07:00
Christian König
0db55f9a1b drm/ttm: fix memleak in ttm_transfered_destroy
We need to cleanup the fences for ghost objects as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-by: Erhard F. <erhard_f@mailbox.org>
Tested-by: Erhard F. <erhard_f@mailbox.org>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214029
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214447
CC: <stable@vger.kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211020173211.2247-1-christian.koenig@amd.com
2021-10-21 15:27:21 +02:00
Thomas Gleixner
0a30896fc5 MAINTAINERS: Add Dave Hansen to the x86 maintainer team
Dave is already listed as x86/mm maintainer, has a profund knowledge
of the x86 architecture in general and a good taste in terms of kernel
programming in general.

Add him as a full x86 maintainer with all rights and duties.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/87zgr3flq7.ffs@tglx
2021-10-21 13:55:42 +02:00
Yang Yingliang
b6b19a71c8 ptp: free 'vclock_index' in ptp_clock_release()
'vclock_index' is accessed from sysfs, it shouled be freed
in release function, so move it from ptp_clock_unregister()
to ptp_clock_release().

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21 12:50:38 +01:00
Erik Ekman
bf6abf345d sfc: Don't use netif_info before net_device setup
Use pci_info instead to avoid unnamed/uninitialized noise:

[197088.688729] sfc 0000:01:00.0: Solarflare NIC detected
[197088.690333] sfc 0000:01:00.0: Part Number : SFN5122F
[197088.729061] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no SR-IOV VFs probed
[197088.729071] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no PTP support

Inspired by fa44821a4d ("sfc: don't use netif_info et al before
net_device is registered") from Heiner Kallweit.

Signed-off-by: Erik Ekman <erik@kryo.se>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21 12:39:13 +01:00
Erik Ekman
c62041c5ba sfc: Export fibre-specific supported link modes
The 1/10GbaseT modes were set up for cards with SFP+ cages in
3497ed8c85 ("sfc: report supported link speeds on SFP connections").
10GbaseT was likely used since no 10G fibre mode existed.

The missing fibre modes for 1/10G were added to ethtool.h in 5711a98221
("net: ethtool: add support for 1000BaseX and missing 10G link modes")
shortly thereafter.

The user guide available at https://support-nic.xilinx.com/wp/drivers
lists support for the following cable and transceiver types in section 2.9:
- QSFP28 100G Direct Attach Cables
- QSFP28 100G SR Optical Transceivers (with SR4 modules listed)
- SFP28 25G Direct Attach Cables
- SFP28 25G SR Optical Transceivers
- QSFP+ 40G Direct Attach Cables
- QSFP+ 40G Active Optical Cables
- QSFP+ 40G SR4 Optical Transceivers
- QSFP+ to SFP+ Breakout Direct Attach Cables
- QSFP+ to SFP+ Breakout Active Optical Cables
- SFP+ 10G Direct Attach Cables
- SFP+ 10G SR Optical Transceivers
- SFP+ 10G LR Optical Transceivers
- SFP 1000BASE‐T Transceivers
- 1G Optical Transceivers
(From user guide issue 28. Issue 16 which also includes older cards like
SFN5xxx/SFN6xxx has matching lists for 1/10/40G transceiver types.)

Regarding SFP+ 10GBASE‐T transceivers the latest guide says:
"Solarflare adapters do not support 10GBASE‐T transceiver modules."

Tested using SFN5122F-R7 (with 2 SFP+ ports). Supported link modes do not change
depending on module used (tested with 1000BASE-T, 1000BASE-BX10, 10GBASE-LR).
Before:

$ ethtool ext
Settings for ext:
	Supported ports: [ FIBRE ]
	Supported link modes:   1000baseT/Full
	                        10000baseT/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  Not reported
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Link partner advertised link modes:  Not reported
	Link partner advertised pause frame use: No
	Link partner advertised auto-negotiation: No
	Link partner advertised FEC modes: Not reported
	Speed: 1000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: FIBRE
	PHYAD: 255
	Transceiver: internal
        Current message level: 0x000020f7 (8439)
                               drv probe link ifdown ifup rx_err tx_err hw
	Link detected: yes

After:

$ ethtool ext
Settings for ext:
	Supported ports: [ FIBRE ]
	Supported link modes:   1000baseT/Full
	                        1000baseX/Full
	                        10000baseCR/Full
	                        10000baseSR/Full
	                        10000baseLR/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  Not reported
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Link partner advertised link modes:  Not reported
	Link partner advertised pause frame use: No
	Link partner advertised auto-negotiation: No
	Link partner advertised FEC modes: Not reported
	Speed: 1000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: FIBRE
	PHYAD: 255
	Transceiver: internal
	Supports Wake-on: g
	Wake-on: d
        Current message level: 0x000020f7 (8439)
                               drv probe link ifdown ifup rx_err tx_err hw
	Link detected: yes

Signed-off-by: Erik Ekman <erik@kryo.se>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21 12:38:34 +01:00
David S. Miller
1439caa1d9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter fixes for net:

1) Crash due to missing initialization of timer data in
   xt_IDLETIMER, from Juhee Kang.

2) NF_CONNTRACK_SECMARK should be bool in Kconfig, from Vegard Nossum.

3) Skip netdev events on netns removal, from Florian Westphal.

4) Add testcase to show port shadowing via UDP, also from Florian.

5) Remove pr_debug() code in ip6t_rt, this fixes a crash due to
   unsafe access to non-linear skbuff, from Xin Long.

6) Make net/ipv4/vs/debug_level read-only from non-init netns,
   from Antoine Tenart.

7) Remove bogus invocation to bash in selftests/netfilter/nft_flowtable.sh
   also from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21 12:32:41 +01:00
David S. Miller
e0bfcf9c77 Merge tag 'mlx5-fixes-2021-10-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5-fixes-2021-10-20
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21 12:11:26 +01:00
David S. Miller
a689702a6c Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-10-20

This series contains updates to e1000e, igc, and ice drivers.

Sasha fixes an issue with dropped packets on Tiger Lake platforms for
e1000e and corrects a device ID for igc.

Tony adds missing E810 device IDs for ice.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21 12:10:29 +01:00
Anitha Chrisanthus
74056092ff drm/kmb: Enable ADV bridge after modeset
On KMB, ADV bridge must be programmed and powered on prior to
MIPI DSI HW initialization.

v2: changed to atomic_bridge_chain_enable (Sam)

Fixes: 98521f4d4b ("drm/kmb: Mipi DSI part of the display driver")
Co-developed-by: Edmund Dea <edmund.j.dea@intel.com>
Signed-off-by: Edmund Dea <edmund.j.dea@intel.com>
Signed-off-by: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211019230719.789958-1-anitha.chrisanthus@intel.com
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-10-21 11:08:09 +02:00
Anitha Chrisanthus
004d271980 drm/kmb: Corrected typo in handle_lcd_irq
Check for Overflow bits for layer3 in the irq handler.

Fixes: 7f7b96a8a0 ("drm/kmb: Add support for KeemBay Display")
Signed-off-by: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211013233632.471892-5-anitha.chrisanthus@intel.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-10-21 11:08:09 +02:00
Edmund Dea
982f8ad666 drm/kmb: Disable change of plane parameters
Due to HW limitations, KMB cannot change height, width, or
pixel format after initial plane configuration.

v2: removed memset disp_cfg as it is already zero.

Fixes: 7f7b96a8a0 ("drm/kmb: Add support for KeemBay Display")
Signed-off-by: Edmund Dea <edmund.j.dea@intel.com>
Signed-off-by: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211013233632.471892-4-anitha.chrisanthus@intel.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-10-21 11:08:08 +02:00
Edmund Dea
13047a092c drm/kmb: Remove clearing DPHY regs
Don't clear the shared DPHY registers common to MIPI Rx and MIPI Tx during
DSI initialization since this was causing MIPI Rx reset. Rest of the
writes are bitwise, so will not affect Mipi Rx side.

Fixes: 98521f4d4b ("drm/kmb: Mipi DSI part of the display driver")
Signed-off-by: Edmund Dea <edmund.j.dea@intel.com>
Signed-off-by: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211013233632.471892-3-anitha.chrisanthus@intel.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-10-21 11:08:08 +02:00
Anitha Chrisanthus
a79f40cccd drm/kmb: Limit supported mode to 1080p
KMB only supports single resolution(1080p), this commit checks for
1920x1080x60 or 1920x1080x59 in crtc_mode_valid.
Also, modes with vfp < 4 are not supported in KMB display. This change
prunes display modes with vfp < 4.

v2: added vfp check

Fixes: 7f7b96a8a0 ("drm/kmb: Add support for KeemBay Display")
Co-developed-by: Edmund Dea <edmund.j.dea@intel.com>
Signed-off-by: Edmund Dea <edmund.j.dea@intel.com>
Signed-off-by: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link:https://patchwork.freedesktop.org/patch/msgid/20211013233632.471892-2-anitha.chrisanthus@intel.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-10-21 11:08:08 +02:00
Anitha Chrisanthus
3e4c31e8f7 drm/kmb: Work around for higher system clock
Use a different value for system clock offset in the
ppl/llp ratio calculations for clocks higher than 500 Mhz.

Fixes: 98521f4d4b ("drm/kmb: Mipi DSI part of the display driver")
Signed-off-by: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211013233632.471892-1-anitha.chrisanthus@intel.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-10-21 11:08:08 +02:00
Dan Johansen
772970620a drm/panel: ilitek-ili9881c: Fix sync for Feixin K101-IM2BYL02 panel
This adjusts sync values according to the datasheet

Fixes: 1c243751c0 ("drm/panel: ilitek-ili9881c: add support for Feixin K101-IM2BYL02 panel")
Co-developed-by: Marius Gripsgard <marius@ubports.com>
Signed-off-by: Dan Johansen <strit@manjaro.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210818214818.298089-1-strit@manjaro.org
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-10-21 11:08:08 +02:00
Marek Vasut
3cfc183052 drm: mxsfb: Fix NULL pointer dereference crash on unload
The mxsfb->crtc.funcs may already be NULL when unloading the driver,
in which case calling mxsfb_irq_disable() via drm_irq_uninstall() from
mxsfb_unload() leads to NULL pointer dereference.

Since all we care about is masking the IRQ and mxsfb->base is still
valid, just use that to clear and mask the IRQ.

Fixes: ae1ed00932 ("drm: mxsfb: Stop using DRM simple display pipeline helper")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Daniel Abrecht <public@danielabrecht.ch>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stefan Agner <stefan@agner.ch>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211016210446.171616-1-marex@denx.de
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-10-21 11:08:08 +02:00
Miklos Szeredi
964d32e512 fuse: clean up error exits in fuse_fill_super()
Instead of "goto err", return error directly, since there's no error
cleanup to do now.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-21 10:01:39 +02:00
Miklos Szeredi
80019f1138 fuse: always initialize sb->s_fs_info
Syzkaller reports a null pointer dereference in fuse_test_super() that is
caused by sb->s_fs_info being NULL.

This is due to the fact that fuse_fill_super() is initializing s_fs_info,
which is too late, it's already on the fs_supers list.  The initialization
needs to be done in sget_fc() with the sb_lock held.

Move allocation of fuse_mount and fuse_conn from fuse_fill_super() into
fuse_get_tree().

After this ->kill_sb() will always be called with non-NULL ->s_fs_info,
hence fuse_mount_destroy() can drop the test for non-NULL "fm".

Reported-by: syzbot+74a15f02ccb51f398601@syzkaller.appspotmail.com
Fixes: 5d5b74aa9c ("fuse: allow sharing existing sb")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-21 10:01:39 +02:00
Miklos Szeredi
c191cd07ee fuse: clean up fuse_mount destruction
1. call fuse_mount_destroy() for open coded variants

2. before deactivate_locked_super() don't need fuse_mount destruction since
that will now be done (if ->s_fs_info is not cleared)

3. rearrange fuse_mount setup in fuse_get_tree_submount() so that the
regular pattern can be used

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-21 10:01:39 +02:00
Miklos Szeredi
a27c061a49 fuse: get rid of fuse_put_super()
The ->put_super callback is called from generic_shutdown_super() in case of
a fully initialized sb.  This is called from kill_***_super(), which is
called from ->kill_sb instances.

Fuse uses ->put_super to destroy the fs specific fuse_mount and drop the
reference to the fuse_conn, while it does the same on each error case
during sb setup.

This patch moves the destruction from fuse_put_super() to
fuse_mount_destroy(), called at the end of all ->kill_sb instances.  A
follup patch will clean up the error paths.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-21 10:01:38 +02:00
Miklos Szeredi
d534d31d6a fuse: check s_root when destroying sb
Checking "fm" works because currently sb->s_fs_info is cleared on error
paths; however, sb->s_root is what generic_shutdown_super() checks to
determine whether the sb was fully initialized or not.

This change will allow cleanup of sb setup error paths.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-21 10:01:38 +02:00
Paolo Bonzini
3a25dfa67f KVM: nVMX: promptly process interrupts delivered while in guest mode
Since commit c300ab9f08 ("KVM: x86: Replace late check_nested_events() hack with
more precise fix") there is no longer the certainty that check_nested_events()
tries to inject an external interrupt vmexit to L1 on every call to vcpu_enter_guest.
Therefore, even in that case we need to set KVM_REQ_EVENT.  This ensures
that inject_pending_event() is called, and from there kvm_check_nested_events().

Fixes: c300ab9f08 ("KVM: x86: Replace late check_nested_events() hack with more precise fix")
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-21 03:35:42 -04:00
Paolo Bonzini
de7cd3f676 KVM: x86: check for interrupts before deciding whether to exit the fast path
The kvm_x86_sync_pir_to_irr callback can sometimes set KVM_REQ_EVENT.
If that happens exactly at the time that an exit is handled as
EXIT_FASTPATH_REENTER_GUEST, vcpu_enter_guest will go incorrectly
through the loop that calls kvm_x86_run, instead of processing
the request promptly.

Fixes: 379a3c8ee4 ("KVM: VMX: Optimize posted-interrupt delivery for timer fastpath")
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-21 03:35:41 -04:00
Chanho Park
282da7cef0 scsi: ufs: ufs-exynos: Correct timeout value setting registers
PA_PWRMODEUSERDATA0 -> DL_FC0PROTTIMEOUTVAL
PA_PWRMODEUSERDATA1 -> DL_TC0REPLAYTIMEOUTVAL
PA_PWRMODEUSERDATA2 -> DL_AFC0REQTIMEOUTVAL

Link: https://lore.kernel.org/r/20211018062841.18226-1-chanho61.park@samsung.com
Fixes: a967ddb22d ("scsi: ufs: ufs-exynos: Apply vendor-specific values for three timeouts")
Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Kiwoong Kim <kwmad.kim@samsung.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:25:51 -04:00
Brian King
e20f80b9b1 scsi: ibmvfc: Fix up duplicate response detection
Commit a264cf5e81 ("scsi: ibmvfc: Fix command state accounting and stale
response detection") introduced a regression in detecting duplicate
responses. This was observed in test where a command was sent to the VIOS
and completed before ibmvfc_send_event() set the active flag to 1, which
resulted in the atomic_dec_if_positive() call in ibmvfc_handle_crq()
thinking this was a duplicate response, which resulted in scsi_done() not
getting called, so we then hit a SCSI command timeout for this command once
the timeout expires.  This simply ensures the active flag gets set prior to
making the hcall to send the command to the VIOS, in order to close this
window.

Link: https://lore.kernel.org/r/20211019152129.16558-1-brking@linux.vnet.ibm.com
Fixes: a264cf5e81 ("scsi: ibmvfc: Fix command state accounting and stale response detection")
Cc: stable@vger.kernel.org
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 22:59:45 -04:00
Ian Kent
25f54d08f1 autofs: fix wait name hash calculation in autofs_wait()
There's a mistake in commit 2be7828c9f ("get rid of autofs_getpath()")
that affects kernels from v5.13.0, basically missed because of me not
fully testing the change for Al.

The problem is that the hash calculation for the wait name qstr hasn't
been updated to account for the change to use dentry_path_raw(). This
prevents the correct matching an existing wait resulting in multiple
notifications being sent to the daemon for the same mount which must
not occur.

The problem wasn't discovered earlier because it only occurs when
multiple processes trigger a request for the same mount concurrently
so it only shows up in more aggressive testing.

Fixes: 2be7828c9f ("get rid of autofs_getpath()")
Cc: stable@vger.kernel.org
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-10-20 21:09:02 -04:00
Linus Torvalds
2f111a6fd5 Merge tag 'ceph-for-5.15-rc7' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "Two important filesystem fixes, marked for stable.

  The blocklisted superblocks issue was particularly annoying because
  for unexperienced users it essentially exacted a reboot to establish a
  new functional mount in that scenario"

* tag 'ceph-for-5.15-rc7' of git://github.com/ceph/ceph-client:
  ceph: fix handling of "meta" errors
  ceph: skip existing superblocks that are blocklisted or shut down when mounting
2021-10-20 10:23:05 -10:00
Linus Torvalds
515dcc2e02 Merge tag 'dma-mapping-5.15-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:

 - fix more dma-debug fallout (Gerald Schaefer, Hamza Mahfooz)

 - fix a kerneldoc warning (Logan Gunthorpe)

* tag 'dma-mapping-5.15-2' of git://git.infradead.org/users/hch/dma-mapping:
  dma-debug: teach add_dma_entry() about DMA_ATTR_SKIP_CPU_SYNC
  dma-debug: fix sg checks in debug_dma_map_sg()
  dma-mapping: fix the kerneldoc for dma_map_sgtable()
2021-10-20 10:16:51 -10:00
Aaron Liu
53c2ff8bcb drm/amdgpu: support B0&B1 external revision id for yellow carp
B0 internal rev_id is 0x01, B1 internal rev_id is 0x02.
The external rev_id for B0 and B1 is 0x20.
The original expression is not suitable for B1.

v2: squash in fix for display code (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:27:31 -04:00
Jake Wang
2ef8ea2394 drm/amd/display: Moved dccg init to after bios golden init
[Why]
bios_golden_init will override dccg_init during init_hw.

[How]
Move dccg_init to after bios_golden_init.

Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com>
Signed-off-by: Jake Wang <haonan.wang2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:26:58 -04:00
Nikola Cornij
dd8cb18906 drm/amd/display: Increase watermark latencies for DCN3.1
[why]
The original latencies were causing underflow in some modes

[how]
Replace with the up-to-date watermark values based on new measurments

Reviewed-by: Ahmad Othman <ahmad.othman@amd.com>
Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com>
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:26:53 -04:00
Eric Yang
4835ea6c17 drm/amd/display: increase Z9 latency to workaround underflow in Z9
[Why]
Z9 latency is higher than when we originally tuned the watermark
parameters, causing underflow. Increasing the value until the latency
issues is resolved.

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com>
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:26:47 -04:00
Nicholas Kazlauskas
672437486e drm/amd/display: Require immediate flip support for DCN3.1 planes
[Why]
Immediate flip can be enabled dynamically and has higher BW requirements
when validating which voltage mode to use.

If we validate when it's not set then potentially DCFCLK will be too low
and we will underflow.

[How]
DM always requires support so always require it as part of DML input
parameters.

This can't be enabled unconditionally on older ASIC because it blocks
some expected modes so only target DCN3.1 for now.

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:26:42 -04:00
Nicholas Kazlauskas
c938aed88f drm/amd/display: Fix prefetch bandwidth calculation for DCN3.1
[Why]
Prefetch BW calculated is lower than the DML reference because of a
porting error that's excluding cursor and row bandwidth from the
pixel data bandwidth.

[How]
Change the dml_max4 to dml_max3 and include cursor and row bandwidth
in the same calculation as the rest of the pixel data during vactive.

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:17:13 -04:00
Nikola Cornij
c21b105380 drm/amd/display: Limit display scaling to up to true 4k for DCN 3.1
[why]
The requirement is that image width up to 4096 shall be supported

Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com>
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:14:39 -04:00
Thelford Williams
5afa7898ab drm/amdgpu: fix out of bounds write
Size can be any value and is user controlled resulting in overwriting the
40 byte array wr_buf with an arbitrary length of data from buf.

Signed-off-by: Thelford Williams <tdwilliamsiv@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:13:50 -04:00
Emeel Hakim
1d00032394 net/mlx5e: IPsec: Fix work queue entry ethernet segment checksum flags
Current Work Queue Entry (WQE) checksum (csum) flags in the ethernet
segment (eseg) in case of IPsec crypto offload datapath are not aligned
with PRM/HW expectations.

Currently the driver always sets the l3_inner_csum flag in case of IPsec
because of the wrong usage of skb->encapsulation as indicator for inner
IPsec header since skb->encapsulation is always ON for IPsec packets
since IPsec itself is an encapsulation protocol. The above forced a
failing attempts of calculating csum of non-existing segments (like in
the IP|ESP|TCP packet case which does not have an l3_inner) which led
to lots of packet drops hence the low throughput.

Fix by using xo->inner_ipproto as indicator for inner IPsec header
instead of skb->encapsulation in addition to setting the csum flags
as following:
* Tunnel Mode:
* Pkt: MAC  IP     ESP  IP    L4
* CSUM: l3_cs | l3_inner_cs | l4_inner_cs
*
* Transport Mode:
* Pkt: MAC  IP     ESP  L4
* CSUM: l3_cs [ | l4_cs (checksum partial case)]
*
* Tunnel(VXLAN TCP/UDP) over Transport Mode
* Pkt: MAC  IP     ESP  UDP  VXLAN  IP    L4
* CSUM: l3_cs | l3_inner_cs | l4_inner_cs

Fixes: f1267798c9 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20 10:42:51 -07:00
Emeel Hakim
d10457f85d net/mlx5e: IPsec: Fix a misuse of the software parser's fields
IPsec crypto offload current Software Parser (SWP) fields settings in
the ethernet segment (eseg) are not aligned with PRM/HW expectations.
Among others in case of IP|ESP|TCP packet, current driver sets the
offsets for inner_l3 and inner_l4 although there is no inner l3/l4
headers relative to ESP header in such packets.

SWP provides the offsets for HW ,so it can be used to find csum fields
to offload the checksum, however these are not necessarily used by HW
and are used as fallback in case HW fails to parse the packet, e.g
when performing IPSec Transport Aware (IP | ESP | TCP) there is no
need to add SW parse on inner packet. So in some cases packets csum
was calculated correctly , whereas in other cases it failed. The later
faced csum errors (caused by wrong packet length calculations) which
led to lots of packet drops hence the low throughput.

Fix by setting the SWP fields as expected in a IP|ESP|TCP packet.

the following describe the expected SWP offsets:
* Tunnel Mode:
* SWP:      OutL3       InL3  InL4
* Pkt: MAC  IP     ESP  IP    L4
*
* Transport Mode:
* SWP:      OutL3       OutL4
* Pkt: MAC  IP     ESP  L4
*
* Tunnel(VXLAN TCP/UDP) over Transport Mode
* SWP:      OutL3                   InL3  InL4
* Pkt: MAC  IP     ESP  UDP  VXLAN  IP    L4

Fixes: f1267798c9 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20 10:42:50 -07:00
Moshe Shemesh
68e66e1a69 net/mlx5e: Fix vlan data lost during suspend flow
During suspend flow the driver calls mlx5e_destroy_vlan_table() which
does not only delete the vlans steering flow rules, but also frees the
data on currently active vlans, thus it is not restored during resume
flow.

This fix keeps the vlan data on suspend flow and frees it only on driver
remove flow.

Fixes: 6783f0a21a ("net/mlx5e: Dynamic alloc vlan table for netdev when needed")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20 10:42:50 -07:00
Dmytro Linkin
a6f7433354 net/mlx5: E-switch, Return correct error code on group creation failure
Dan Carpenter report:
The patch f47e04eb96: "net/mlx5: E-switch, Allow setting share/max
tx rate limits of rate groups" from May 31, 2021, leads to the
following Smatch static checker warning:

	drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c:483 esw_qos_create_rate_group()
	warn: passing zero to 'ERR_PTR'

If min rate normalization failed then error code may be overwritten to 0
if scheduling element destruction succeed. Ignore this value and always
return initial one.

Fixes: f47e04eb96 ("net/mlx5: E-switch, Allow setting share/max tx rate limits of rate groups")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20 10:42:49 -07:00
Maor Dickman
14fe2471c6 net/mlx5: Lag, change multipath and bonding to be mutually exclusive
Both multipath and bonding events are changing the HW LAG state
independently.
Handling one of the features events while the other is already
enabled can cause unwanted behavior, for example handling
bonding event while multipath enabled will disable the lag and
cause multipath to stop working.

Fix it by ignoring bonding event while in multipath and ignoring FIB
events while in bonding mode.

Fixes: 544fe7c2e6 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20 10:42:49 -07:00
Linus Torvalds
8e37395c3a Merge tag 'sound-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Again it became bigger than wished, unfortunately, as this contains
  quite a few ASoC fixes that came up a bit late. It also includes yet
  more HD- and USB-audio quirks: I decided to merge them now, as those
  are for stable, and we'll need them sooner or later.

  Although the volumes are a bit high, all changes are device-specific
  (and reasonably small) fixes, so it should be safe for the late rc"

* tag 'sound-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Fix microphone sound on Jieli webcam.
  ALSA: hda/realtek: Fixes HP Spectre x360 15-eb1xxx speakers
  ALSA: usb-audio: Provide quirk for Sennheiser GSP670 Headset
  ALSA: hda/realtek: Add quirk for Clevo PC50HS
  ALSA: usb-audio: add Schiit Hel device to quirk table
  ASoC: wm8960: Fix clock configuration on slave mode
  ASoC: cs42l42: Ensure 0dB full scale volume is used for headsets
  ASoC: soc-core: fix null-ptr-deref in snd_soc_del_component_unlocked()
  ASoC: codec: wcd938x: Add irq config support
  ASoC: DAPM: Fix missing kctl change notifications
  ASoC: Intel: bytcht_es8316: Utilize dev_err_probe() to avoid log saturation
  ASoC: Intel: bytcht_es8316: Switch to use gpiod_get_optional()
  ASoC: Intel: bytcht_es8316: Use temporary variable for struct device
  ASoC: Intel: bytcht_es8316: Get platform data via dev_get_platdata()
  ASoC: wcd938x: Fix jack detection issue
  ASoC: nau8824: Fix headphone vs headset, button-press detection no longer working
  ASoC: cs4341: Add SPI device ID table
  ASoC: pcm179x: Add missing entries SPI to device ID table
  ASoC: fsl_xcvr: Fix channel swap issue with ARC
  ASoC: pcm512x: Mend accesses to the I2S_1 and I2S_2 registers
2021-10-20 06:13:22 -10:00
Linus Torvalds
6da52dead8 Merge tag 'audit-pr-20211019' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit fix from Paul Moore:
 "One small audit patch to add a pointer NULL check"

* tag 'audit-pr-20211019' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: fix possible null-pointer dereference in audit_filter_rules
2021-10-20 06:11:17 -10:00
Tony Nguyen
7dcf78b870 ice: Add missing E810 device ids
As part of support for E810 XXV devices, some device ids were
inadvertently left out. Add those missing ids.

Fixes: 195fb97766 ("ice: add additional E810 device id")
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
2021-10-20 09:07:22 -07:00
Sasha Neftin
79cc8322b6 igc: Update I226_K device ID
The device ID for I226_K was incorrectly assigned, update the device
ID to the correct one.

Fixes: bfa5e98c9d ("igc: Add new device ID")
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-20 09:07:21 -07:00
Sasha Neftin
639e298f43 e1000e: Fix packet loss on Tiger Lake and later
Update the HW MAC initialization flow. Do not gate DMA clock from
the modPHY block. Keeping this clock will prevent dropped packets
sent in burst mode on the Kumeran interface.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213651
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213377
Fixes: fb776f5d57 ("e1000e: Add support for Tiger Lake")
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Mark Pearson <markpearson@lenovo.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-20 09:06:54 -07:00
Linus Torvalds
fc9b289344 Merge tag 'trace-v5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
 "Recursion fix for tracing.

  While cleaning up some of the tracing recursion protection logic, I
  discovered a scenario that the current design would miss, and would
  allow an infinite recursion. Removing an optimization trick that
  opened the hole fixes the issue and cleans up the code as well"

* tag 'trace-v5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Have all levels of checks prevent recursion
2021-10-20 06:02:58 -10:00
Linus Torvalds
1e59977463 Merge tag 'nios2_fixes_for_v5.15_part2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux
Pull nios2 fix from Dinh Nguyen:

 - Renamed CTL_STATUS to CTL_FSTATUS to fix a redefined warning

* tag 'nios2_fixes_for_v5.15_part2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  NIOS2: irqflags: rename a redefined register name
2021-10-20 05:56:51 -10:00
Pavel Begunkov
4ea672ab69 io_uring: fix ltimeout unprep
io_unprep_linked_timeout() is broken, first it needs to return back
REQ_F_ARM_LTIMEOUT, so the linked timeout is enqueued and disarmed. But
now we refcounted it, and linked timeouts may get not executed at all,
leaking a request.

Just kill the unprep optimisation.

Fixes: 906c6caaf5 ("io_uring: optimise io_prep_linked_timeout()")
Reported-by: Beld Zhang <beldzhang@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/51b8e2bfc4bea8ee625cf2ba62b2a350cc9be031.1634719585.git.asml.silence@gmail.com
Link: https://github.com/axboe/liburing/issues/460
Reported-by: Beld Zhang <beldzhang@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-20 09:54:16 -06:00
Pavel Begunkov
e139a1ec92 io_uring: apply max_workers limit to all future users
Currently, IORING_REGISTER_IOWQ_MAX_WORKERS applies only to the task
that issued it, it's unexpected for users. If one task creates a ring,
limits workers and then passes it to another task the limit won't be
applied to the other task.

Another pitfall is that a task should either create a ring or submit at
least one request for IORING_REGISTER_IOWQ_MAX_WORKERS to work at all,
furher complicating the picture.

Change the API, save the limits and apply to all future users. Note, it
should be done first before giving away the ring or submitting new
requests otherwise the result is not guaranteed.

Fixes: 2e480058dd ("io-wq: provide a way to limit max number of workers")
Link: https://github.com/axboe/liburing/issues/460
Reported-by: Beld Zhang <beldzhang@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/51d0bae97180e08ab722c0d5c93e7439cfb6f697.1634683237.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-20 09:54:06 -06:00
Linus Torvalds
0afe64bebb Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "Tools:
   - kvm_stat: do not show halt_wait_ns since it is not a cumulative statistic

  x86:
   - clean ups and fixes for bus lock vmexit and lazy allocation of rmaps
   - two fixes for SEV-ES (one more coming as soon as I get reviews)
   - fix for static_key underflow

  ARM:
   - Properly refcount pages used as a concatenated stage-2 PGD
   - Fix missing unlock when detecting the use of MTE+VM_SHARED"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SEV-ES: reduce ghcb_sa_len to 32 bits
  KVM: VMX: Remove redundant handling of bus lock vmexit
  KVM: kvm_stat: do not show halt_wait_ns
  KVM: x86: WARN if APIC HW/SW disable static keys are non-zero on unload
  Revert "KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU RESET"
  KVM: SEV-ES: Set guest_state_protected after VMSA update
  KVM: X86: fix lazy allocation of rmaps
  KVM: SEV-ES: fix length of string I/O
  KVM: arm64: Release mmap_lock when using VM_SHARED with MTE
  KVM: arm64: Report corrupted refcount at EL2
  KVM: arm64: Fix host stage-2 PGD refcount
  KVM: s390: Function documentation fixes
2021-10-20 05:52:10 -10:00
Sasha Neftin
280db5d420 e1000e: Separate TGP board type from SPT
We have the same LAN controller on different PCHs. Separate TGP board
type from SPT which will allow for specific fixes to be applied for
TGP platforms.

Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Mark Pearson <markpearson@lenovo.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-20 08:51:51 -07:00
Eric W. Biederman
5ebcbe342b ucounts: Move get_ucounts from cred_alloc_blank to key_change_session_keyring
Setting cred->ucounts in cred_alloc_blank does not make sense.  The
uid and user_ns are deliberately not set in cred_alloc_blank but
instead the setting is delayed until key_change_session_keyring.

So move dealing with ucounts into key_change_session_keyring as well.

Unfortunately that movement of get_ucounts adds a new failure mode to
key_change_session_keyring.  I do not see anything stopping the parent
process from calling setuid and changing the relevant part of it's
cred while keyctl_session_to_parent is running making it fundamentally
necessary to call get_ucounts in key_change_session_keyring.  Which
means that the new failure mode cannot be avoided.

A failure of key_change_session_keyring results in a single threaded
parent keeping it's existing credentials.  Which results in the parent
process not being able to access the session keyring and whichever
keys are in the new keyring.

Further get_ucounts is only expected to fail if the number of bits in
the refernece count for the structure is too few.

Since the code has no other way to report the failure of get_ucounts
and because such failures are not expected to be common add a WARN_ONCE
to report this problem to userspace.

Between the WARN_ONCE and the parent process not having access to
the keys in the new session keyring I expect any failure of get_ucounts
will be noticed and reported and we can find another way to handle this
condition.  (Possibly by just making ucounts->count an atomic_long_t).

Cc: stable@vger.kernel.org
Fixes: 905ae01c4a ("Add a reference to ucounts for each cred")
Link: https://lkml.kernel.org/r/7k0ias0uf.fsf_-_@disp2133
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-20 10:34:20 -05:00
Arnd Bergmann
36b6dcbc12 Merge tag 'reset-fixes-for-v5.15' of git://git.pengutronix.de/pza/linux into arm/fixes
Reset controller fixes for v5.15

Fix the status bit polarity in the brcmstb-rescal driver, re-enable
pistachio driver selection after MACH_PISTACHIO removal, add transfer
error handling in the tegra-bpmp driver, and fix probing of the
reset-socfpga driver after recent device link changes.

* tag 'reset-fixes-for-v5.15' of git://git.pengutronix.de/pza/linux:
  reset: socfpga: add empty driver allowing consumers to probe
  reset: tegra-bpmp: Handle errors in BPMP response
  reset: pistachio: Re-enable driver selection
  reset: brcmstb-rescal: fix incorrect polarity of status bit

Link: https://lore.kernel.org/r/b378f2aae54538db6a13c98561b4cbcacbef937c.camel@pengutronix.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-20 16:42:21 +02:00
Arnd Bergmann
e23c7487f5 Merge tag 'sunxi-fixes-for-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes
Two patches to fix the GMAC PHY mode on some boards.

* tag 'sunxi-fixes-for-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  ARM: dts: sun7i: A20-olinuxino-lime2: Fix ethernet phy-mode
  arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node

Link: https://lore.kernel.org/r/d4c41c71-f1ff-4464-a26f-1bfd4b52fd78.lettre@localhost
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-20 15:51:38 +02:00
Arnd Bergmann
72cd4e3bde Merge tag 'imx-fixes-5.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 5.15, round 4:

- A series from Frieder Schrempf to fix i.MX8MM Kontron N801x board on
  various aspects, voltage settings, GPIO polarity, SPI clock frequency
  and Ethernet PHY type.

* tag 'imx-fixes-5.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  arm64: dts: imx8mm-kontron: Fix connection type for VSC8531 RGMII PHY
  arm64: dts: imx8mm-kontron: Fix CAN SPI clock frequency
  arm64: dts: imx8mm-kontron: Fix polarity of reg_rst_eth2
  arm64: dts: imx8mm-kontron: Set lower limit of VDD_SNVS to 800 mV
  arm64: dts: imx8mm-kontron: Make sure SOC and DRAM supply voltages are correct

Link: https://lore.kernel.org/r/20211018000958.GC25810@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-20 15:49:38 +02:00
Yang Yingliang
4225fea1cb ptp: Fix possible memory leak in ptp_clock_register()
I got memory leak as follows when doing fault injection test:

unreferenced object 0xffff88800906c618 (size 8):
  comm "i2c-idt82p33931", pid 4421, jiffies 4294948083 (age 13.188s)
  hex dump (first 8 bytes):
    70 74 70 30 00 00 00 00                          ptp0....
  backtrace:
    [<00000000312ed458>] __kmalloc_track_caller+0x19f/0x3a0
    [<0000000079f6e2ff>] kvasprintf+0xb5/0x150
    [<0000000026aae54f>] kvasprintf_const+0x60/0x190
    [<00000000f323a5f7>] kobject_set_name_vargs+0x56/0x150
    [<000000004e35abdd>] dev_set_name+0xc0/0x100
    [<00000000f20cfe25>] ptp_clock_register+0x9f4/0xd30 [ptp]
    [<000000008bb9f0de>] idt82p33_probe.cold+0x8b6/0x1561 [ptp_idt82p33]

When posix_clock_register() returns an error, the name allocated
in dev_set_name() will be leaked, the put_device() should be used
to give up the device reference, then the name will be freed in
kobject_cleanup() and other memory will be freed in ptp_clock_release().

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: a33121e548 ("ptp: fix the race between the release of ptp_clock and cdev")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 14:44:33 +01:00
Kurt Kanzenbach
3cb958027c net: stmmac: Fix E2E delay mechanism
When utilizing End to End delay mechanism, the following error messages show up:

|root@ehl1:~# ptp4l --tx_timestamp_timeout=50 -H -i eno2 -E -m
|ptp4l[950.573]: selected /dev/ptp3 as PTP clock
|ptp4l[950.586]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
|ptp4l[950.586]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
|ptp4l[952.879]: port 1: new foreign master 001395.fffe.4897b4-1
|ptp4l[956.879]: selected best master clock 001395.fffe.4897b4
|ptp4l[956.879]: port 1: assuming the grand master role
|ptp4l[956.879]: port 1: LISTENING to GRAND_MASTER on RS_GRAND_MASTER
|ptp4l[962.017]: port 1: received DELAY_REQ without timestamp
|ptp4l[962.273]: port 1: received DELAY_REQ without timestamp
|ptp4l[963.090]: port 1: received DELAY_REQ without timestamp

Commit f2fb6b6275 ("net: stmmac: enable timestamp snapshot for required PTP
packets in dwmac v5.10a") already addresses this problem for the dwmac
v5.10. However, same holds true for all dwmacs above version v4.10. Correct the
check accordingly. Afterwards everything works as expected.

Tested on Intel Atom(R) x6414RE Processor.

Fixes: 14f347334b ("net: stmmac: Correctly take timestamp for PTPv2")
Fixes: f2fb6b6275 ("net: stmmac: enable timestamp snapshot for required PTP packets in dwmac v5.10a")
Suggested-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 14:43:53 +01:00
Uwe Kleine-König
641e3fd1a0 nfc: st95hf: Make spi remove() callback return zero
If something goes wrong in the remove callback, returning an error code
just results in an error message. The device still disappears.

So don't skip disabling the regulator in st95hf_remove() if resetting
the controller via spi fails. Also don't return an error code which just
results in two error messages.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 14:41:20 +01:00
Halil Pasic
0e9ff65f45 KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu
Changing the deliverable mask in __airqs_kick_single_vcpu() is a bug. If
one idle vcpu can't take the interrupts we want to deliver, we should
look for another vcpu that can, instead of saying that we don't want
to deliver these interrupts by clearing the bits from the
deliverable_mask.

Fixes: 9f30f62163 ("KVM: s390: add gib_alert_irq_handler()")
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019175401.3757927-3-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-10-20 13:03:04 +02:00
Halil Pasic
9b57e9d501 KVM: s390: clear kicked_mask before sleeping again
The idea behind kicked mask is that we should not re-kick a vcpu that
is already in the "kick" process, i.e. that was kicked and is
is about to be dispatched if certain conditions are met.

The problem with the current implementation is, that it assumes the
kicked vcpu is going to enter SIE shortly. But under certain
circumstances, the vcpu we just kicked will be deemed non-runnable and
will remain in wait state. This can happen, if the interrupt(s) this
vcpu got kicked to deal with got already cleared (because the interrupts
got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
would return false, and the vcpu would remain in kvm_vcpu_block(),
but this time with its kicked_mask bit set. So next time around we
wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
that we just kicked it.

Let us make sure the kicked_mask is cleared before we give up on
re-dispatching the vcpu.

Fixes: 9f30f62163 ("KVM: s390: add gib_alert_irq_handler()")
Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019175401.3757927-2-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-10-20 13:03:04 +02:00
David S. Miller
323e9a957d Merge branch 'hns3-fixes'
Guangbin Huang says:

====================
net: hns3: add some fixes for -net

This series adds some fixes for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Peng Li
0dd8a25f35 net: hns3: disable sriov before unload hclge layer
HNS3 driver includes hns3.ko, hnae3.ko and hclge.ko.
hns3.ko includes network stack and pci_driver, hclge.ko includes
HW device action, algo_ops and timer task, hnae3.ko includes some
register function.

When SRIOV is enable and hclge.ko is removed, HW device is unloaded
but VF still exists, PF will not reply VF mbx messages, and cause
errors.

This patch fix it by disable SRIOV before remove hclge.ko.

Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Yufeng Mo
1385cc81ba net: hns3: fix vf reset workqueue cannot exit
The task of VF reset is performed through the workqueue. It checks the
value of hdev->reset_pending to determine whether to exit the loop.
However, the value of hdev->reset_pending may also be assigned by
the interrupt function hclgevf_misc_irq_handle(), which may cause the
loop fail to exit and keep occupying the workqueue. This loop is not
necessary, so remove it and the workqueue will be rescheduled if the
reset needs to be retried or a new reset occurs.

Fixes: 1cc9bc6e58 ("net: hns3: split hclgevf_reset() into preparing and rebuilding part")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Yunsheng Lin
68752b24f5 net: hns3: schedule the polling again when allocation fails
Currently when there is a rx page allocation failure, it is
possible that polling may be stopped if there is no more packet
to be reveiced, which may cause queue stall problem under memory
pressure.

This patch makes sure polling is scheduled again when there is
any rx page allocation failure, and polling will try to allocate
receive buffers until it succeeds.

Now the allocation retry is added, it is unnecessary to do the rx
page allocation at the end of rx cleaning, so remove it. And reset
the unused_count to zero after calling hns3_nic_alloc_rx_buffers()
to avoid calling hns3_nic_alloc_rx_buffers() repeatedly under
memory pressure.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Yunsheng Lin
9f9f0f1999 net: hns3: fix for miscalculation of rx unused desc
rx unused desc is the desc that need attatching new buffer
before refilling to hw to receive new packet, the number of
desc need attatching new buffer is calculated using next_to_use
and next_to_clean. when next_to_use == next_to_clean, currently
hns3 driver assumes that all the desc has the buffer attatched,
but 'next_to_use == next_to_clean' also means all the desc need
attatching new buffer if hw has comsumed all the desc and the
driver has not attatched any buffer to the desc yet.

This patch adds 'refill' in desc_cb to indicate whether a new
buffer has been refilled to a desc.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Yunsheng Lin
adfb7b4966 net: hns3: fix the max tx size according to user manual
Currently the max tx size supported by the hw is calculated by
using the max BD num supported by the hw. According to the hw
user manual, the max tx size is fixed value for both non-TSO and
TSO skb.

This patch updates the max tx size according to the manual.

Fixes: 8ae10cfb5089("net: hns3: support tx-scatter-gather-fraglist feature")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Guangbin Huang
731797fdff net: hns3: add limit ets dwrr bandwidth cannot be 0
If ets dwrr bandwidth of tc is set to 0, the hardware will switch to SP
mode. In this case, this tc may occupy all the tx bandwidth if it has
huge traffic, so it violates the purpose of the user setting.

To fix this problem, limit the ets dwrr bandwidth must greater than 0.

Fixes: cacde272dd ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Guangbin Huang
b63fcaab95 net: hns3: reset DWRR of unused tc to zero
Currently, DWRR of tc will be initialized to a fixed value when this tc
is enabled, but it is not been reset to 0 when this tc is disabled. It
cause a problem that the DWRR of unused tc is not 0 after using tc tool
to add and delete multi-tc parameters.

For examples, after enabling 4 TCs and restoring to 1 TC by follow
tc commands:

$ tc qdisc add dev eth0 root mqprio num_tc 4 map 0 1 2 3 0 1 2 3 queues \
  8@0 8@8 8@16 8@24 hw 1 mode channel
$ tc qdisc del dev eth0 root

Now there is just one TC is enabled for eth0, but the tc info querying by
debugfs is shown as follow:

$ cat /mnt/hns3/0000:7d:00.0/tm/tc_sch_info
enabled tc number: 1
weight_offset: 14
TC    MODE  WEIGHT
0     dwrr    100
1     dwrr    100
2     dwrr    100
3     dwrr    100
4     dwrr      0
5     dwrr      0
6     dwrr      0
7     dwrr      0

This patch fixes it by resetting DWRR of tc to 0 when tc is disabled.

Fixes: 848440544b ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Jiaran Zhang
60484103d5 net: hns3: Add configuration of TM QCN error event
Add configuration of interrupt type and fifo interrupt enable of TM QCN
error event if enabled, otherwise this event will not be reported when
there is error.

Fixes: d914971df0 ("net: hns3: remove redundant query in hclge_config_tm_hw_err_int()")
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Nathan Lynch
787252a10d powerpc/smp: do not decrement idle task preempt count in CPU offline
With PREEMPT_COUNT=y, when a CPU is offlined and then onlined again, we
get:

BUG: scheduling while atomic: swapper/1/0/0x00000000
no locks held by swapper/1/0.
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0-rc2+ #100
Call Trace:
 dump_stack_lvl+0xac/0x108
 __schedule_bug+0xac/0xe0
 __schedule+0xcf8/0x10d0
 schedule_idle+0x3c/0x70
 do_idle+0x2d8/0x4a0
 cpu_startup_entry+0x38/0x40
 start_secondary+0x2ec/0x3a0
 start_secondary_prolog+0x10/0x14

This is because powerpc's arch_cpu_idle_dead() decrements the idle task's
preempt count, for reasons explained in commit a7c2bb8279 ("powerpc:
Re-enable preemption before cpu_die()"), specifically "start_secondary()
expects a preempt_count() of 0."

However, since commit 2c669ef697 ("powerpc/preempt: Don't touch the idle
task's preempt_count during hotplug") and commit f1a0a376ca ("sched/core:
Initialize the idle task with preemption disabled"), that justification no
longer holds.

The idle task isn't supposed to re-enable preemption, so remove the
vestigial preempt_enable() from the CPU offline path.

Tested with pseries and powernv in qemu, and pseries on PowerVM.

Fixes: 2c669ef697 ("powerpc/preempt: Don't touch the idle task's preempt_count during hotplug")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211015173902.2278118-1-nathanl@linux.ibm.com
2021-10-20 21:38:01 +11:00
Michael Ellerman
496c5fe25c powerpc/idle: Don't corrupt back chain when going idle
In isa206_idle_insn_mayloss() we store various registers into the stack
red zone, which is allowed.

However inside the IDLE_STATE_ENTER_SEQ_NORET macro we save r2 again,
to 0(r1), which corrupts the stack back chain.

We used to do the same in isa206_idle_insn_mayloss() itself, but we
fixed that in 73287caa92 ("powerpc64/idle: Fix SP offsets when saving
GPRs"), however we missed that the macro also corrupts the back chain.

Corrupting the back chain is bad for debuggability but doesn't
necessarily cause a bug.

However we recently changed the stack handling in some KVM code, and it
now relies on the stack back chain being valid when it returns. The
corruption causes that code to return with r1 pointing somewhere in
kernel data, at some point LR is restored from the stack and we branch
to NULL or somewhere else invalid.

Only affects Power8 hosts running KVM guests, with dynamic_mt_modes
enabled (which it is by default).

The fixes tag below points to the commit that changed the KVM stack
handling, exposing this bug. The actual corruption of the back chain has
always existed since 948cf67c47 ("powerpc: Add NAP mode support on
Power7 in HV mode").

Fixes: 9b4416c509 ("KVM: PPC: Book3S HV: Fix stack handling in idle_kvm_start_guest()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020094826.3222052-1-mpe@ellerman.id.au
2021-10-20 21:37:58 +11:00
Eugene Crosser
55161e67d4 vrf: Revert "Reset skb conntrack connection..."
This reverts commit 09e856d54b.

When an interface is enslaved in a VRF, prerouting conntrack hook is
called twice: once in the context of the original input interface, and
once in the context of the VRF interface. If no special precausions are
taken, this leads to creation of two conntrack entries instead of one,
and breaks SNAT.

Commit above was intended to avoid creation of extra conntrack entries
when input interface is enslaved in a VRF. It did so by resetting
conntrack related data associated with the skb when it enters VRF context.

However it breaks netfilter operation. Imagine a use case when conntrack
zone must be assigned based on the original input interface, rather than
VRF interface (that would make original interfaces indistinguishable). One
could create netfilter rules similar to these:

        chain rawprerouting {
                type filter hook prerouting priority raw;
                iif realiface1 ct zone set 1 return
                iif realiface2 ct zone set 2 return
        }

This works before the mentioned commit, but not after: zone assignment
is "forgotten", and any subsequent NAT or filtering that is dependent
on the conntrack zone does not work.

Here is a reproducer script that demonstrates the difference in behaviour.

==========
#!/bin/sh

# This script demonstrates unexpected change of nftables behaviour
# caused by commit 09e856d54b ""vrf: Reset skb conntrack
# connection on VRF rcv"
# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=09e856d54bda5f288ef8437a90ab2b9b3eab83d1
#
# Before the commit, it was possible to assign conntrack zone to a
# packet (or mark it for `notracking`) in the prerouting chanin, raw
# priority, based on the `iif` (interface from which the packet
# arrived).
# After the change, # if the interface is enslaved in a VRF, such
# assignment is lost. Instead, assignment based on the `iif` matching
# the VRF master interface is honored. Thus it is impossible to
# distinguish packets based on the original interface.
#
# This script demonstrates this change of behaviour: conntrack zone 1
# or 2 is assigned depending on the match with the original interface
# or the vrf master interface. It can be observed that conntrack entry
# appears in different zone in the kernel versions before and after
# the commit.

IPIN=172.30.30.1
IPOUT=172.30.30.2
PFXL=30

ip li sh vein >/dev/null 2>&1 && ip li del vein
ip li sh tvrf >/dev/null 2>&1 && ip li del tvrf
nft list table testct >/dev/null 2>&1 && nft delete table testct

ip li add vein type veth peer veout
ip li add tvrf type vrf table 9876
ip li set veout master tvrf
ip li set vein up
ip li set veout up
ip li set tvrf up
/sbin/sysctl -w net.ipv4.conf.veout.accept_local=1
/sbin/sysctl -w net.ipv4.conf.veout.rp_filter=0
ip addr add $IPIN/$PFXL dev vein
ip addr add $IPOUT/$PFXL dev veout

nft -f - <<__END__
table testct {
	chain rawpre {
		type filter hook prerouting priority raw;
		iif { veout, tvrf } meta nftrace set 1
		iif veout ct zone set 1 return
		iif tvrf ct zone set 2 return
		notrack
	}
	chain rawout {
		type filter hook output priority raw;
		notrack
	}
}
__END__

uname -rv
conntrack -F
ping -W 1 -c 1 -I vein $IPOUT
conntrack -L

Signed-off-by: Eugene Crosser <crosser@average.org>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:27:19 +01:00
Marios Makassikis
0d994cd482 ksmbd: add buffer validation in session setup
Make sure the security buffer's length/offset are valid with regards to
the packet length.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-20 00:07:10 -05:00
Namjae Jeon
621be84a9d ksmbd: throttle session setup failures to avoid dictionary attacks
To avoid dictionary attacks (repeated session setups rapidly sent) to
connect to server, ksmbd make a delay of a 5 seconds on session setup
failure to make it harder to send enough random connection requests
to break into a server if a user insert the wrong password 10 times
in a row.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-20 00:07:10 -05:00
Hyunchul Lee
34061d6b76 ksmbd: validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requests
Validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requests and
check the free size of response buffer for these requests.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-20 00:07:10 -05:00
Russ Weight
f09f6dfef8 spi: altera: Change to dynamic allocation of spi id
The spi-altera driver has two flavors: platform and dfl. I'm seeing
a case where I have both device types in the same machine, and they
are conflicting on the SPI ID:

... kernel: couldn't get idr
... kernel: WARNING: CPU: 28 PID: 912 at drivers/spi/spi.c:2920 spi_register_controller.cold+0x84/0xc0a

Both the platform and dfl drivers use the parent's driver ID as the SPI
ID. In the error case, the parent devices are dfl_dev.4 and
subdev_spi_altera.4.auto. When the second spi-master is created, the
failure occurs because the SPI ID of 4 has already been allocated.

Change the ID allocation to dynamic (by initializing bus_num to -1) to
avoid duplicate SPI IDs.

Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20211019002401.24041-1-russell.h.weight@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-20 01:53:15 +01:00
Mustafa Ismail
2dace185ca RDMA/irdma: Do not hold qos mutex twice on QP resume
When irdma_ws_add fails, irdma_ws_remove is used to cleanup the leaf node.
This lead to holding the qos mutex twice in the QP resume path. Fix this
by avoiding the call to irdma_ws_remove and unwinding the error in
irdma_ws_add. This skips the call to irdma_tc_in_use function which is not
needed in the error unwind cases.

Fixes: 3ae331c751 ("RDMA/irdma: Add QoS definitions")
Link: https://lore.kernel.org/r/20211019151654.1943-2-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-19 20:22:01 -03:00
Mustafa Ismail
cc07b73ef1 RDMA/irdma: Set VLAN in UD work completion correctly
Currently VLAN is reported in UD work completion when VLAN id is zero,
i.e. no VLAN case.

Report VLAN in UD work completion only when VLAN id is non-zero.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20211019151654.1943-1-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-19 20:22:01 -03:00
Aharon Landau
5508546631 RDMA/mlx5: Initialize the ODP xarray when creating an ODP MR
Normally the zero fill would hide the missing initialization, but an
errant set to desc_size in reg_create() causes a crash:

  BUG: unable to handle page fault for address: 0000000800000000
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 5 PID: 890 Comm: ib_write_bw Not tainted 5.15.0-rc4+ #47
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:mlx5_ib_dereg_mr+0x14/0x3b0 [mlx5_ib]
  Code: 48 63 cd 4c 89 f7 48 89 0c 24 e8 37 30 03 e1 48 8b 0c 24 eb a0 90 0f 1f 44 00 00 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 30 <48> 8b 2f 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 8b 87 c8
  RSP: 0018:ffff88811afa3a60 EFLAGS: 00010286
  RAX: 000000000000001c RBX: 0000000800000000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000800000000
  RBP: 0000000800000000 R08: 0000000000000000 R09: c0000000fffff7ff
  R10: ffff88811afa38f8 R11: ffff88811afa38f0 R12: ffffffffa02c7ac0
  R13: 0000000000000000 R14: ffff88811afa3cd8 R15: ffff88810772fa00
  FS:  00007f47b9080740(0000) GS:ffff88852cd40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000800000000 CR3: 000000010761e003 CR4: 0000000000370ea0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   mlx5_ib_free_odp_mr+0x95/0xc0 [mlx5_ib]
   mlx5_ib_dereg_mr+0x128/0x3b0 [mlx5_ib]
   ib_dereg_mr_user+0x45/0xb0 [ib_core]
   ? xas_load+0x8/0x80
   destroy_hw_idr_uobject+0x1a/0x50 [ib_uverbs]
   uverbs_destroy_uobject+0x2f/0x150 [ib_uverbs]
   uobj_destroy+0x3c/0x70 [ib_uverbs]
   ib_uverbs_cmd_verbs+0x467/0xb00 [ib_uverbs]
   ? uverbs_finalize_object+0x60/0x60 [ib_uverbs]
   ? ttwu_queue_wakelist+0xa9/0xe0
   ? pty_write+0x85/0x90
   ? file_tty_write.isra.33+0x214/0x330
   ? process_echoes+0x60/0x60
   ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs]
   __x64_sys_ioctl+0x10d/0x8e0
   ? vfs_write+0x17f/0x260
   do_syscall_64+0x3c/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Add the missing xarray initialization and remove the desc_size set.

Fixes: a639e66703 ("RDMA/mlx5: Zero out ODP related items in the mlx5_ib_mr")
Link: https://lore.kernel.org/r/a4846a11c9de834663e521770da895007f9f0d30.1634642730.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-19 20:19:59 -03:00
Prabhakar Kushwaha
60fab10766 rdma/qedr: Fix crash due to redundant release of device's qp memory
Device's QP memory should only be allocated and released by IB layer.
This patch removes the redundant release of the device's qp memory and
uses completion APIs to make sure that .destroy_qp() only return, when qp
reference becomes 0.

Fixes: 514aee660d ("RDMA: Globally allocate and release QP memory")
Link: https://lore.kernel.org/r/20211019082212.7052-1-pkushwaha@marvell.com
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-19 20:17:44 -03:00
Pavel Begunkov
bc369921d6 io-wq: max_worker fixes
First, fix nr_workers checks against max_workers, with max_worker
registration, it may pretty easily happen that nr_workers > max_workers.

Also, synchronise writing to acct->max_worker with wqe->lock. It's not
an actual problem, but as we don't care about io_wqe_create_worker(),
it's better than WRITE_ONCE()/READ_ONCE().

Fixes: 2e480058dd ("io-wq: provide a way to limit max number of workers")
Reported-by: Beld Zhang <beldzhang@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/11f90e6b49410b7d1a88f5d04fb8d95bb86b8cf3.1634671835.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 17:09:34 -06:00
Christophe JAILLET
ba69fd9101 net: dsa: Fix an error handling path in 'dsa_switch_parse_ports_of()'
If we return before the end of the 'for_each_child_of_node()' iterator, the
reference taken on 'port' must be released.

Add the missing 'of_node_put()' calls.

Fixes: 83c0afaec7 ("net: dsa: Add new binding implementation")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/15d5310d1d55ad51c1af80775865306d92432e03.1634587046.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-19 15:41:16 -07:00
Rafael J. Wysocki
bc28368596 ACPI: PM: Do not turn off power resources in unknown state
Commit 6381195ad7 ("ACPI: power: Rework turning off unused power
resources") caused power resources in unknown state with reference
counters equal to zero to be turned off too, but that caused issues
to appear in the field, so modify the code to only turn off power
resources that are known to be "on".

Link: https://lore.kernel.org/linux-acpi/6faf4b92-78d5-47a4-63df-cc2bab7769d0@molgen.mpg.de/
Fixes: 6381195ad7 ("ACPI: power: Rework turning off unused power resources")
Reported-by: Andreas K. Huettel <andreas.huettel@ur.de>
Tested-by: Andreas K. Huettel <andreas.huettel@ur.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 5.14+ <stable@vger.kernel.org> # 5.14+
2021-10-19 19:28:56 +02:00
Eric W. Biederman
34dc2fd6e6 ucounts: Proper error handling in set_cred_ucounts
Instead of leaking the ucounts in new if alloc_ucounts fails, store
the result of alloc_ucounts into a temporary variable, which is later
assigned to new->ucounts.

Cc: stable@vger.kernel.org
Fixes: 905ae01c4a ("Add a reference to ucounts for each cred")
Link: https://lkml.kernel.org/r/87pms2s0v8.fsf_-_@disp2133
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-19 11:04:25 -05:00
Eric W. Biederman
629715adc6 ucounts: Pair inc_rlimit_ucounts with dec_rlimit_ucoutns in commit_creds
The purpose of inc_rlimit_ucounts and dec_rlimit_ucounts in commit_creds
is to change which rlimit counter is used to track a process when the
credentials changes.

Use the same test for both to guarantee the tracking is correct.

Cc: stable@vger.kernel.org
Fixes: 21d1c5e386 ("Reimplement RLIMIT_NPROC on top of ucounts")
Link: https://lkml.kernel.org/r/87v91us0w4.fsf_-_@disp2133
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-19 11:01:52 -05:00
Woody Lin
63acd42c0d sched/scs: Reset the shadow stack when idle_task_exit
Commit f1a0a376ca ("sched/core: Initialize the idle task with
preemption disabled") removed the init_idle() call from
idle_thread_get(). This was the sole call-path on hotplug that resets
the Shadow Call Stack (scs) Stack Pointer (sp).

Not resetting the scs-sp leads to scs overflow after enough hotplug
cycles. Therefore add an explicit scs_task_reset() to the hotplug code
to make sure the scs-sp does get reset on hotplug.

Fixes: f1a0a376ca ("sched/core: Initialize the idle task with preemption disabled")
Signed-off-by: Woody Lin <woodylin@google.com>
[peterz: Changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lore.kernel.org/r/20211012083521.973587-1-woodylin@google.com
2021-10-19 17:46:11 +02:00
Linus Torvalds
d9abdee5fd Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "19 patches.

  Subsystems affected by this patch series: mm (userfaultfd, migration,
  memblock, mempolicy, slub, secretmem, and thp), ocfs2, binfmt, vfs,
  and misc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mailmap: add Andrej Shadura
  mm/thp: decrease nr_thps in file's mapping on THP split
  mm/secretmem: fix NULL page->mapping dereference in page_is_secretmem()
  vfs: check fd has read access in kernel_read_file_from_fd()
  elfcore: correct reference to CONFIG_UML
  mm, slub: fix incorrect memcg slab count for bulk free
  mm, slub: fix potential use-after-free in slab_debugfs_fops
  mm, slub: fix potential memoryleak in kmem_cache_open()
  mm, slub: fix mismatch between reconstructed freelist depth and cnt
  mm, slub: fix two bugs in slab_debug_trace_open()
  mm/mempolicy: do not allow illegal MPOL_F_NUMA_BALANCING | MPOL_LOCAL in mbind()
  memblock: check memory total_size
  ocfs2: mount fails with buffer overflow in strlen
  ocfs2: fix data corruption after conversion from inline format
  mm/migrate: fix CPUHP state to update node demotion order
  mm/migrate: add CPU hotplug to demotion #ifdef
  mm/migrate: optimize hotplug-time demotion order updates
  userfaultfd: fix a race between writeprotect and exit_mmap()
  mm/userfaultfd: selftests: fix memory corruption with thp enabled
2021-10-19 05:41:36 -10:00
David S. Miller
04ee2752a5 Merge tag 'linux-can-fixes-for-5.15-20211019' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
pull-request: can 2021-10-19

this is a pull request of a single patch for net/master.

The patch is by me and fixes the error handling in case of a FC
timeout in the TX path of the ISOTOP CAN protocol.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19 13:14:36 +01:00
Zheyu Ma
c69b2f4687 cavium: Fix return values of the probe function
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19 13:09:57 +01:00
Zheyu Ma
e211210098 mISDN: Fix return values of the probe function
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19 13:09:28 +01:00
Randy Dunlap
162079f2dc mmc: winbond: don't build on M68K
The Winbond MMC driver fails to build on ARCH=m68k so prevent
that build config. Silences these build errors:

../drivers/mmc/host/wbsd.c: In function 'wbsd_request_end':
../drivers/mmc/host/wbsd.c:212:28: error: implicit declaration of function 'claim_dma_lock' [-Werror=implicit-function-declaration]
  212 |                 dmaflags = claim_dma_lock();
../drivers/mmc/host/wbsd.c:215:17: error: implicit declaration of function 'release_dma_lock'; did you mean 'release_task'? [-Werror=implicit-function-declaration]
  215 |                 release_dma_lock(dmaflags);

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Pierre Ossman <pierre@ossman.eu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211017175949.23838-1-rdunlap@infradead.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-19 12:57:11 +02:00
Haibo Chen
9af372dc70 mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit
To reset standard tuning circuit completely, after clear ESDHC_MIX_CTRL_EXE_TUNE,
also need to clear bit buffer_read_ready, this operation will finally clear the
USDHC IP internal logic flag execute_tuning_with_clr_buf, make sure the following
normal data transfer will not be impacted by standard tuning logic used before.

Find this issue when do quick SD card insert/remove stress test. During standard
tuning prodedure, if remove SD card, USDHC standard tuning logic can't clear the
internal flag execute_tuning_with_clr_buf. Next time when insert SD card, all
data related commands can't get any data related interrupts, include data transfer
complete interrupt, data timeout interrupt, data CRC interrupt, data end bit interrupt.
Always trigger software timeout issue. Even reset the USDHC through bits in register
SYS_CTRL (0x2C, bit28 reset tuning, bit26 reset data, bit 25 reset command, bit 24
reset all) can't recover this. From the user's point of view, USDHC stuck, SD can't
be recognized any more.

Fixes: d9370424c9 ("mmc: sdhci-esdhc-imx: reset tuning circuit when power on mmc card")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1634263236-6111-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-19 12:46:05 +02:00
Arnd Bergmann
48ccc8edf5 ARM: 9141/1: only warn about XIP address when not compile testing
In randconfig builds, we sometimes come across this warning:

arm-linux-gnueabi-ld: XIP start address may cause MPU programming issues

While this is helpful for actual systems to figure out why it
fails, the warning does not provide any benefit for build testing,
so guard it in a check for CONFIG_COMPILE_TEST, which is usually
set on randconfig builds.

Fixes: 216218308c ("ARM: 8713/1: NOMMU: Support MPU in XIP configuration")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:39:50 +01:00
Arnd Bergmann
1f323127ca ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
With extra warnings enabled, gcc complains about this function
definition:

arch/arm/probes/kprobes/core.c: In function 'arch_init_kprobes':
arch/arm/probes/kprobes/core.c:465:12: warning: old-style function definition [-Wold-style-definition]
  465 | int __init arch_init_kprobes()

Link: https://lore.kernel.org/all/20201027093057.c685a14b386acacb3c449e3d@kernel.org/

Fixes: 24ba613c9d ("ARM kprobes: core code")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:39:50 +01:00
Arnd Bergmann
44cc6412e6 ARM: 9138/1: fix link warning with XIP + frame-pointer
When frame pointers are used instead of the ARM unwinder,
and the kernel is built using clang with an external assembler
and CONFIG_XIP_KERNEL, every file produces two warnings
like:

arm-linux-gnueabi-ld: warning: orphan section `.ARM.extab' from `net/mac802154/util.o' being placed in section `.ARM.extab'
arm-linux-gnueabi-ld: warning: orphan section `.ARM.exidx' from `net/mac802154/util.o' being placed in section `.ARM.exidx'

The same fix was already merged for the normal (non-XIP)

linker script, with a longer description.

Fixes: c39866f268 ("arm/build: Always handle .ARM.exidx and .ARM.extab sections")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:38:22 +01:00
Arnd Bergmann
eaf6cc7165 ARM: 9134/1: remove duplicate memcpy() definition
Both the decompressor code and the kasan logic try to override
the memcpy() and memmove()  definitions, which leading to a clash
in a KASAN-enabled kernel with XZ decompression:

arch/arm/boot/compressed/decompress.c:50:9: error: 'memmove' macro redefined [-Werror,-Wmacro-redefined]
 #define memmove memmove
        ^
arch/arm/include/asm/string.h:59:9: note: previous definition is here
 #define memmove(dst, src, len) __memmove(dst, src, len)
        ^
arch/arm/boot/compressed/decompress.c:51:9: error: 'memcpy' macro redefined [-Werror,-Wmacro-redefined]
 #define memcpy memcpy
        ^
arch/arm/include/asm/string.h:58:9: note: previous definition is here
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
        ^

Here we want the set of functions from the decompressor, so undefine
the other macros before the override.

Link: https://lore.kernel.org/linux-arm-kernel/CACRpkdZYJogU_SN3H9oeVq=zJkRgRT1gDz3xp59gdqWXxw-B=w@mail.gmail.com/
Link: https://lore.kernel.org/lkml/202105091112.F5rmd4By-lkp@intel.com/

Fixes: d6d51a96c7 ("ARM: 9014/2: Replace string mem* functions for KASan")
Fixes: a7f464f3db ("ARM: 7001/2: Wire up support for the XZ decompressor")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:36 +01:00
Nick Desaulniers
e6a0c958bd ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
A kernel built with CONFIG_THUMB2_KERNEL=y and using clang as the
assembler could generate non-naturally-aligned v7wbi_tlb_fns which
results in a boot failure. The original commit adding the macro missed
the .align directive on this data.

Link: https://github.com/ClangBuiltLinux/linux/issues/1447
Link: https://lore.kernel.org/all/0699da7b-354f-aecc-a62f-e25693209af4@linaro.org/
Debugged-by: Ard Biesheuvel <ardb@kernel.org>
Debugged-by: Nathan Chancellor <nathan@kernel.org>
Debugged-by: Richard Henderson <richard.henderson@linaro.org>

Fixes: 66a625a881 ("ARM: mm: proc-macros: Add generic proc/cache/tlb struct definition macros")
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:35 +01:00
Lexi Shao
df909df077 ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images
ARM: kasan: Fix __get_user_check failure with kasan

In macro __get_user_check defined in arch/arm/include/asm/uaccess.h,
error code is store in register int __e(r0). When kasan is
enabled, assigning value to kernel address might trigger kasan check,
which unexpectedly overwrites r0 and causes undefined behavior on arm
kasan images.

One example is failure in do_futex and results in process soft lockup.
Log:
watchdog: BUG: soft lockup - CPU#0 stuck for 62946ms! [rs:main
Q:Reg:1151]
...
(__asan_store4) from (futex_wait_setup+0xf8/0x2b4)
(futex_wait_setup) from (futex_wait+0x138/0x394)
(futex_wait) from (do_futex+0x164/0xe40)
(do_futex) from (sys_futex_time32+0x178/0x230)
(sys_futex_time32) from (ret_fast_syscall+0x0/0x50)

The soft lockup happens in function futex_wait_setup. The reason is
function get_futex_value_locked always return EINVAL, thus pc jump
back to retry label and causes looping.

This line in function get_futex_value_locked
	ret = __get_user(*dest, from);
is expanded to
	*dest = (typeof(*(p))) __r2; ,
in macro __get_user_check. Writing to pointer dest triggers kasan check
and overwrites the return value of __get_user_x function.
The assembly code of get_futex_value_locked in kernel/futex.c:
...
c01f6dc8:       eb0b020e        bl      c04b7608 <__get_user_4>
// "x = (typeof(*(p))) __r2;" triggers kasan check and r0 is overwritten
c01f6dCc:       e1a00007        mov     r0, r7
c01f6dd0:       e1a05002        mov     r5, r2
c01f6dd4:       eb04f1e6        bl      c0333574 <__asan_store4>
c01f6dd8:       e5875000        str     r5, [r7]
// save ret value of __get_user(*dest, from), which is dest address now
c01f6ddc:       e1a05000        mov     r5, r0
...
// checking return value of __get_user failed
c01f6e00:       e3550000        cmp     r5, #0
...
c01f6e0c:       01a00005        moveq   r0, r5
// assign return value to EINVAL
c01f6e10:       13e0000d        mvnne   r0, #13

Return value is the destination address of get_user thus certainly
non-zero, so get_futex_value_locked always return EINVAL.

Fix it by using a tmp vairable to store the error code before the
assignment. This fix has no effects to non-kasan images thanks to compiler
optimization. It only affects cases that overwrite r0 due to kasan check.

This should fix bug discussed in Link:
[1] https://lore.kernel.org/linux-arm-kernel/0ef7c2a5-5d8b-c5e0-63fa-31693fd4495c@gmail.com/

Fixes: 421015713b ("ARM: 9017/2: Enable KASan for ARM")
Signed-off-by: Lexi Shao <shaolexi@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:35 +01:00
Ard Biesheuvel
00d43d13da ARM: 9125/1: fix incorrect use of get_kernel_nofault()
Commit 344179fc7e ("ARM: 9106/1: traps: use get_kernel_nofault instead
of set_fs()") replaced an occurrence of __get_user() with
get_kernel_nofault(), but inverted the sense of the conditional in the
process, resulting in no values to be printed at all.

I.e., every exception stack now looks like this:

Exception stack(0xc18d1fb0 to 0xc18d1ff8)
1fa0:                                     ???????? ???????? ???????? ????????
1fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
1fe0: ???????? ???????? ???????? ???????? ???????? ????????

which is rather unhelpful.

Fixes: 344179fc7e ("ARM: 9106/1: traps: use get_kernel_nofault instead of set_fs()")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:34 +01:00
Nick Desaulniers
9d417cbe36 ARM: 9122/1: select HAVE_FUTEX_CMPXCHG
tglx notes:
  This function [futex_detect_cmpxchg] is only needed when an
  architecture has to runtime discover whether the CPU supports it or
  not.  ARM has unconditional support for this, so the obvious thing to
  do is the below.

Fixes linkage failure from Clang randconfigs:
kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text'
and boot failures for CONFIG_THUMB2_KERNEL.

Link: https://github.com/ClangBuiltLinux/linux/issues/325

Comments from Nick Desaulniers:

 See-also: 03b8c7b623 ("futex: Allow architectures to skip
 futex_atomic_cmpxchg_inatomic() test")

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable@vger.kernel.org # v3.14+
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:34 +01:00
José Roberto de Souza
59be177a90 drm/i915: Remove memory frequency calculation
This memory frequency calculated is only used to check if it is zero,
what is not useful as it will never actually be zero.

Also the calculation is wrong, we should be checking other bit to
select the appropriate frequency multiplier while this code is stuck
with a fixed multiplier.

So here dropping it as whole.

v2:
- Also remove memory frequency calculation for gen9 LP platforms

Cc: Yakui Zhao <yakui.zhao@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: 5d0c938ec9 ("drm/i915/gen11+: Only load DRAM information from pcode")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211013010046.91858-1-jose.souza@intel.com
(cherry picked from commit 83f52364b1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-10-19 10:45:52 +03:00
Jeff Layton
1bd85aa65d ceph: fix handling of "meta" errors
Currently, we check the wb_err too early for directories, before all of
the unsafe child requests have been waited on. In order to fix that we
need to check the mapping->wb_err later nearer to the end of ceph_fsync.

We also have an overly-complex method for tracking errors after
blocklisting. The errors recorded in cleanup_session_requests go to a
completely separate field in the inode, but we end up reporting them the
same way we would for any other error (in fsync).

There's no real benefit to tracking these errors in two different
places, since the only reporting mechanism for them is in fsync, and
we'd need to advance them both every time.

Given that, we can just remove i_meta_err, and convert the places that
used it to instead just use mapping->wb_err instead. That also fixes
the original problem by ensuring that we do a check_and_advance of the
wb_err at the end of the fsync op.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/52864
Reported-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-10-19 09:36:06 +02:00
Jeff Layton
98d0a6fb73 ceph: skip existing superblocks that are blocklisted or shut down when mounting
Currently when mounting, we may end up finding an existing superblock
that corresponds to a blocklisted MDS client. This means that the new
mount ends up being unusable.

If we've found an existing superblock with a client that is already
blocklisted, and the client is not configured to recover on its own,
fail the match. Ditto if the superblock has been forcibly unmounted.

While we're in here, also rename "other" to the more conventional "fsc".

Cc: stable@vger.kernel.org
URL: https://bugzilla.redhat.com/show_bug.cgi?id=1901499
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-10-19 09:36:06 +02:00
Marc Kleine-Budde
d674a8f123 can: isotp: isotp_sendmsg(): fix return error on FC timeout on TX path
When the a large chunk of data send and the receiver does not send a
Flow Control frame back in time, the sendmsg() does not return a error
code, but the number of bytes sent corresponding to the size of the
packet.

If a timeout occurs the isotp_tx_timer_handler() is fired, sets
sk->sk_err and calls the sk->sk_error_report() function. It was
wrongly expected that the error would be propagated to user space in
every case. For isotp_sendmsg() blocking on wait_event_interruptible()
this is not the case.

This patch fixes the problem by checking if sk->sk_err is set and
returning the error to user space.

Fixes: e057dd3fc2 ("can: add ISO 15765-2:2016 transport protocol")
Link: https://github.com/hartkopp/can-isotp/issues/42
Link: https://github.com/hartkopp/can-isotp/pull/43
Link: https://lore.kernel.org/all/20210507091839.1366379-1-mkl@pengutronix.de
Cc: stable@vger.kernel.org
Reported-by: Sottas Guillaume (LMB) <Guillaume.Sottas@liebherr.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-19 09:10:30 +02:00
Andrej Shadura
362d5dfc48 mailmap: add Andrej Shadura
Add a mapping for my old work email for BelDisplayTech to the personal
email, and make sure the Collabora email has the correct spelling of the
first name.

Link: https://lkml.kernel.org/r/20210917091016.30232-1-andrew.shadura@collabora.co.uk
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Marek Szyprowski
1ca7554d05 mm/thp: decrease nr_thps in file's mapping on THP split
Decrease nr_thps counter in file's mapping to ensure that the page cache
won't be dropped excessively on file write access if page has been
already split.

I've tried a test scenario running a big binary, kernel remaps it with
THPs, then force a THP split with /sys/kernel/debug/split_huge_pages.
During any further open of that binary with O_RDWR or O_WRITEONLY kernel
drops page cache for it, because of non-zero thps counter.

Link: https://lkml.kernel.org/r/20211012120237.2600-1-m.szyprowski@samsung.com
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 09d91cda0e ("mm,thp: avoid writes to file with THP in pagecache")
Fixes: 06d3eff62d ("mm/thp: fix node page state in split_huge_page_to_list()")
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: <sfoon.kim@samsung.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Sean Christopherson
79f9bc5843 mm/secretmem: fix NULL page->mapping dereference in page_is_secretmem()
Check for a NULL page->mapping before dereferencing the mapping in
page_is_secretmem(), as the page's mapping can be nullified while gup()
is running, e.g.  by reclaim or truncation.

  BUG: kernel NULL pointer dereference, address: 0000000000000068
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 6 PID: 4173897 Comm: CPU 3/KVM Tainted: G        W
  RIP: 0010:internal_get_user_pages_fast+0x621/0x9d0
  Code: <48> 81 7a 68 80 08 04 bc 0f 85 21 ff ff 8 89 c7 be
  RSP: 0018:ffffaa90087679b0 EFLAGS: 00010046
  RAX: ffffe3f37905b900 RBX: 00007f2dd561e000 RCX: ffffe3f37905b934
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffe3f37905b900
  ...
  CR2: 0000000000000068 CR3: 00000004c5898003 CR4: 00000000001726e0
  Call Trace:
   get_user_pages_fast_only+0x13/0x20
   hva_to_pfn+0xa9/0x3e0
   try_async_pf+0xa1/0x270
   direct_page_fault+0x113/0xad0
   kvm_mmu_page_fault+0x69/0x680
   vmx_handle_exit+0xe1/0x5d0
   kvm_arch_vcpu_ioctl_run+0xd81/0x1c70
   kvm_vcpu_ioctl+0x267/0x670
   __x64_sys_ioctl+0x83/0xa0
   do_syscall_64+0x56/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Link: https://lkml.kernel.org/r/20211007231502.3552715-1-seanjc@google.com
Fixes: 1507f51255 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reported-by: Darrick J. Wong <djwong@kernel.org>
Reported-by: Stephen <stephenackerman16@gmail.com>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Matthew Wilcox (Oracle)
032146cda8 vfs: check fd has read access in kernel_read_file_from_fd()
If we open a file without read access and then pass the fd to a syscall
whose implementation calls kernel_read_file_from_fd(), we get a warning
from __kernel_read():

        if (WARN_ON_ONCE(!(file->f_mode & FMODE_READ)))

This currently affects both finit_module() and kexec_file_load(), but it
could affect other syscalls in the future.

Link: https://lkml.kernel.org/r/20211007220110.600005-1-willy@infradead.org
Fixes: b844f0ecbc ("vfs: define kernel_copy_file_from_fd()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Lukas Bulwahn
b0e901280d elfcore: correct reference to CONFIG_UML
Commit 6e7b64b9dd ("elfcore: fix building with clang") introduces
special handling for two architectures, ia64 and User Mode Linux.
However, the wrong name, i.e., CONFIG_UM, for the intended Kconfig
symbol for User-Mode Linux was used.

Although the directory for User Mode Linux is ./arch/um; the Kconfig
symbol for this architecture is called CONFIG_UML.

Luckily, ./scripts/checkkconfigsymbols.py warns on non-existing configs:

  UM
  Referencing files: include/linux/elfcore.h
  Similar symbols: UML, NUMA

Correct the name of the config to the intended one.

[akpm@linux-foundation.org: fix um/x86_64, per Catalin]
  Link: https://lkml.kernel.org/r/20211006181119.2851441-1-catalin.marinas@arm.com
  Link: https://lkml.kernel.org/r/YV6pejGzLy5ppEpt@arm.com

Link: https://lkml.kernel.org/r/20211006082209.417-1-lukas.bulwahn@gmail.com
Fixes: 6e7b64b9dd ("elfcore: fix building with clang")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Barret Rhoden <brho@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
3ddd60268c mm, slub: fix incorrect memcg slab count for bulk free
kmem_cache_free_bulk() will call memcg_slab_free_hook() for all objects
when doing bulk free.  So we shouldn't call memcg_slab_free_hook() again
for bulk free to avoid incorrect memcg slab count.

Link: https://lkml.kernel.org/r/20210916123920.48704-6-linmiaohe@huawei.com
Fixes: d1b2cf6cb8 ("mm: memcg/slab: uncharge during kmem_cache_free_bulk()")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
67823a5444 mm, slub: fix potential use-after-free in slab_debugfs_fops
When sysfs_slab_add failed, we shouldn't call debugfs_slab_add() for s
because s will be freed soon.  And slab_debugfs_fops will use s later
leading to a use-after-free.

Link: https://lkml.kernel.org/r/20210916123920.48704-5-linmiaohe@huawei.com
Fixes: 64dd68497b ("mm: slub: move sysfs slab alloc/free interfaces to debugfs")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
9037c57681 mm, slub: fix potential memoryleak in kmem_cache_open()
In error path, the random_seq of slub cache might be leaked.  Fix this
by using __kmem_cache_release() to release all the relevant resources.

Link: https://lkml.kernel.org/r/20210916123920.48704-4-linmiaohe@huawei.com
Fixes: 210e7a43fa ("mm: SLUB freelist randomization")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
899447f669 mm, slub: fix mismatch between reconstructed freelist depth and cnt
If object's reuse is delayed, it will be excluded from the reconstructed
freelist.  But we forgot to adjust the cnt accordingly.  So there will
be a mismatch between reconstructed freelist depth and cnt.  This will
lead to free_debug_processing() complaining about freelist count or a
incorrect slub inuse count.

Link: https://lkml.kernel.org/r/20210916123920.48704-3-linmiaohe@huawei.com
Fixes: c3895391df ("kasan, slub: fix handling of kasan_slab_free hook")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Miaohe Lin
2127d22509 mm, slub: fix two bugs in slab_debug_trace_open()
Patch series "Fixups for slub".

This series contains various bug fixes for slub.  We fix memoryleak,
use-afer-free, NULL pointer dereferencing and so on in slub.  More
details can be found in the respective changelogs.

This patch (of 5):

It's possible that __seq_open_private() will return NULL.  So we should
check it before using lest dereferencing NULL pointer.  And in error
paths, we forgot to release private buffer via seq_release_private().
Memory will leak in these paths.

Link: https://lkml.kernel.org/r/20210916123920.48704-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210916123920.48704-2-linmiaohe@huawei.com
Fixes: 64dd68497b ("mm: slub: move sysfs slab alloc/free interfaces to debugfs")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Eric Dumazet
6d2aec9e12 mm/mempolicy: do not allow illegal MPOL_F_NUMA_BALANCING | MPOL_LOCAL in mbind()
syzbot reported access to unitialized memory in mbind() [1]

Issue came with commit bda420b985 ("numa balancing: migrate on fault
among multiple bound nodes")

This commit added a new bit in MPOL_MODE_FLAGS, but only checked valid
combination (MPOL_F_NUMA_BALANCING can only be used with MPOL_BIND) in
do_set_mempolicy()

This patch moves the check in sanitize_mpol_flags() so that it is also
used by mbind()

  [1]
  BUG: KMSAN: uninit-value in __mpol_equal+0x567/0x590 mm/mempolicy.c:2260
   __mpol_equal+0x567/0x590 mm/mempolicy.c:2260
   mpol_equal include/linux/mempolicy.h:105 [inline]
   vma_merge+0x4a1/0x1e60 mm/mmap.c:1190
   mbind_range+0xcc8/0x1e80 mm/mempolicy.c:811
   do_mbind+0xf42/0x15f0 mm/mempolicy.c:1333
   kernel_mbind mm/mempolicy.c:1483 [inline]
   __do_sys_mbind mm/mempolicy.c:1490 [inline]
   __se_sys_mbind+0x437/0xb80 mm/mempolicy.c:1486
   __x64_sys_mbind+0x19d/0x200 mm/mempolicy.c:1486
   do_syscall_x64 arch/x86/entry/common.c:51 [inline]
   do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
   entry_SYSCALL_64_after_hwframe+0x44/0xae

  Uninit was created at:
   slab_alloc_node mm/slub.c:3221 [inline]
   slab_alloc mm/slub.c:3230 [inline]
   kmem_cache_alloc+0x751/0xff0 mm/slub.c:3235
   mpol_new mm/mempolicy.c:293 [inline]
   do_mbind+0x912/0x15f0 mm/mempolicy.c:1289
   kernel_mbind mm/mempolicy.c:1483 [inline]
   __do_sys_mbind mm/mempolicy.c:1490 [inline]
   __se_sys_mbind+0x437/0xb80 mm/mempolicy.c:1486
   __x64_sys_mbind+0x19d/0x200 mm/mempolicy.c:1486
   do_syscall_x64 arch/x86/entry/common.c:51 [inline]
   do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  =====================================================
  Kernel panic - not syncing: panic_on_kmsan set ...
  CPU: 0 PID: 15049 Comm: syz-executor.0 Tainted: G    B             5.15.0-rc2-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:88 [inline]
   dump_stack_lvl+0x1ff/0x28e lib/dump_stack.c:106
   dump_stack+0x25/0x28 lib/dump_stack.c:113
   panic+0x44f/0xdeb kernel/panic.c:232
   kmsan_report+0x2ee/0x300 mm/kmsan/report.c:186
   __msan_warning+0xd7/0x150 mm/kmsan/instrumentation.c:208
   __mpol_equal+0x567/0x590 mm/mempolicy.c:2260
   mpol_equal include/linux/mempolicy.h:105 [inline]
   vma_merge+0x4a1/0x1e60 mm/mmap.c:1190
   mbind_range+0xcc8/0x1e80 mm/mempolicy.c:811
   do_mbind+0xf42/0x15f0 mm/mempolicy.c:1333
   kernel_mbind mm/mempolicy.c:1483 [inline]
   __do_sys_mbind mm/mempolicy.c:1490 [inline]
   __se_sys_mbind+0x437/0xb80 mm/mempolicy.c:1486
   __x64_sys_mbind+0x19d/0x200 mm/mempolicy.c:1486
   do_syscall_x64 arch/x86/entry/common.c:51 [inline]
   do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Link: https://lkml.kernel.org/r/20211001215630.810592-1-eric.dumazet@gmail.com
Fixes: bda420b985 ("numa balancing: migrate on fault among multiple bound nodes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Peng Fan
5173ed72bc memblock: check memory total_size
mem=[X][G|M] is broken on ARM64 platform, there are cases that even
type.cnt is 1, but total_size is not 0 because regions are merged into
1.  So only check 'cnt' is not enough, total_size should be used,
othersize bootargs 'mem=[X][G|B]' not work anymore.

Link: https://lkml.kernel.org/r/20210930024437.32598-1-peng.fan@oss.nxp.com
Fixes: e888fa7bb8 ("memblock: Check memory add/cap ordering")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Valentin Vidic
b15fa9224e ocfs2: mount fails with buffer overflow in strlen
Starting with kernel 5.11 built with CONFIG_FORTIFY_SOURCE mouting an
ocfs2 filesystem with either o2cb or pcmk cluster stack fails with the
trace below.  Problem seems to be that strings for cluster stack and
cluster name are not guaranteed to be null terminated in the disk
representation, while strlcpy assumes that the source string is always
null terminated.  This causes a read outside of the source string
triggering the buffer overflow detection.

  detected buffer overflow in strlen
  ------------[ cut here ]------------
  kernel BUG at lib/string.c:1149!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 1 PID: 910 Comm: mount.ocfs2 Not tainted 5.14.0-1-amd64 #1
    Debian 5.14.6-2
  RIP: 0010:fortify_panic+0xf/0x11
  ...
  Call Trace:
   ocfs2_initialize_super.isra.0.cold+0xc/0x18 [ocfs2]
   ocfs2_fill_super+0x359/0x19b0 [ocfs2]
   mount_bdev+0x185/0x1b0
   legacy_get_tree+0x27/0x40
   vfs_get_tree+0x25/0xb0
   path_mount+0x454/0xa20
   __x64_sys_mount+0x103/0x140
   do_syscall_64+0x3b/0xc0
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Link: https://lkml.kernel.org/r/20210929180654.32460-1-vvidic@valentin-vidic.from.hr
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Jan Kara
5314454ea3 ocfs2: fix data corruption after conversion from inline format
Commit 6dbf7bb555 ("fs: Don't invalidate page buffers in
block_write_full_page()") uncovered a latent bug in ocfs2 conversion
from inline inode format to a normal inode format.

The code in ocfs2_convert_inline_data_to_extents() attempts to zero out
the whole cluster allocated for file data by grabbing, zeroing, and
dirtying all pages covering this cluster.  However these pages are
beyond i_size, thus writeback code generally ignores these dirty pages
and no blocks were ever actually zeroed on the disk.

This oversight was fixed by commit 693c241a5f ("ocfs2: No need to zero
pages past i_size.") for standard ocfs2 write path, inline conversion
path was apparently forgotten; the commit log also has a reasoning why
the zeroing actually is not needed.

After commit 6dbf7bb555, things became worse as writeback code stopped
invalidating buffers on pages beyond i_size and thus these pages end up
with clean PageDirty bit but with buffers attached to these pages being
still dirty.  So when a file is converted from inline format, then
writeback triggers, and then the file is grown so that these pages
become valid, the invalid dirtiness state is preserved,
mark_buffer_dirty() does nothing on these pages (buffers are already
dirty) but page is never written back because it is clean.  So data
written to these pages is lost once pages are reclaimed.

Simple reproducer for the problem is:

  xfs_io -f -c "pwrite 0 2000" -c "pwrite 2000 2000" -c "fsync" \
    -c "pwrite 4000 2000" ocfs2_file

After unmounting and mounting the fs again, you can observe that end of
'ocfs2_file' has lost its contents.

Fix the problem by not doing the pointless zeroing during conversion
from inline format similarly as in the standard write path.

[akpm@linux-foundation.org: fix whitespace, per Joseph]

Link: https://lkml.kernel.org/r/20210930095405.21433-1-jack@suse.cz
Fixes: 6dbf7bb555 ("fs: Don't invalidate page buffers in block_write_full_page()")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: "Markov, Andrey" <Markov.Andrey@Dell.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Huang Ying
a6a0251c6f mm/migrate: fix CPUHP state to update node demotion order
The node demotion order needs to be updated during CPU hotplug.  Because
whether a NUMA node has CPU may influence the demotion order.  The
update function should be called during CPU online/offline after the
node_states[N_CPU] has been updated.  That is done in
CPUHP_AP_ONLINE_DYN during CPU online and in CPUHP_MM_VMSTAT_DEAD during
CPU offline.  But in commit 884a6e5d1f ("mm/migrate: update node
demotion order on hotplug events"), the function to update node demotion
order is called in CPUHP_AP_ONLINE_DYN during CPU online/offline.  This
doesn't satisfy the order requirement.

For example, there are 4 CPUs (P0, P1, P2, P3) in 2 sockets (P0, P1 in S0
and P2, P3 in S1), the demotion order is

 - S0 -> NUMA_NO_NODE
 - S1 -> NUMA_NO_NODE

After P2 and P3 is offlined, because S1 has no CPU now, the demotion
order should have been changed to

 - S0 -> S1
 - S1 -> NO_NODE

but it isn't changed, because the order updating callback for CPU
hotplug doesn't see the new nodemask.  After that, if P1 is offlined,
the demotion order is changed to the expected order as above.

So in this patch, we added CPUHP_AP_MM_DEMOTION_ONLINE and
CPUHP_MM_DEMOTION_DEAD to be called after CPUHP_AP_ONLINE_DYN and
CPUHP_MM_VMSTAT_DEAD during CPU online and offline, and register the
update function on them.

Link: https://lkml.kernel.org/r/20210929060351.7293-1-ying.huang@intel.com
Fixes: 884a6e5d1f ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Dave Hansen
76af6a054d mm/migrate: add CPU hotplug to demotion #ifdef
Once upon a time, the node demotion updates were driven solely by memory
hotplug events.  But now, there are handlers for both CPU and memory
hotplug.

However, the #ifdef around the code checks only memory hotplug.  A
system that has HOTPLUG_CPU=y but MEMORY_HOTPLUG=n would miss CPU
hotplug events.

Update the #ifdef around the common code.  Add memory and CPU-specific
#ifdefs for their handlers.  These memory/CPU #ifdefs avoid unused
function warnings when their Kconfig option is off.

[arnd@arndb.de: rework hotplug_memory_notifier() stub]
  Link: https://lkml.kernel.org/r/20211013144029.2154629-1-arnd@kernel.org

Link: https://lkml.kernel.org/r/20210924161255.E5FE8F7E@davehans-spike.ostc.intel.com
Fixes: 884a6e5d1f ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:02 -10:00
Dave Hansen
295be91f7e mm/migrate: optimize hotplug-time demotion order updates
Patch series "mm/migrate: 5.15 fixes for automatic demotion", v2.

This contains two fixes for the "automatic demotion" code which was
merged into 5.15:

 * Fix memory hotplug performance regression by watching
   suppressing any real action on irrelevant hotplug events.

 * Ensure CPU hotplug handler is registered when memory hotplug
   is disabled.

This patch (of 2):

== tl;dr ==

Automatic demotion opted for a simple, lazy approach to handling hotplug
events.  This noticeably slows down memory hotplug[1].  Optimize away
updates to the demotion order when memory hotplug events should have no
effect.

This has no effect on CPU hotplug.  There is no known problem on the CPU
side and any work there will be in a separate series.

== Background ==

Automatic demotion is a memory migration strategy to ensure that new
allocations have room in faster memory tiers on tiered memory systems.
The kernel maintains an array (node_demotion[]) to drive these
migrations.

The node_demotion[] path is calculated by starting at nodes with CPUs
and then "walking" to nodes with memory.  Only hotplug events which
online or offline a node with memory (N_ONLINE) or CPUs (N_CPU) will
actually affect the migration order.

== Problem ==

However, the current code is lazy.  It completely regenerates the
migration order on *any* CPU or memory hotplug event.  The logic was
that these events are extremely rare and that the overhead from
indiscriminate order regeneration is minimal.

Part of the update logic involves a synchronize_rcu(), which is a pretty
big hammer.  Its overhead was large enough to be detected by some 0day
tests that watch memory hotplug performance[1].

== Solution ==

Add a new helper (node_demotion_topo_changed()) which can differentiate
between superfluous and impactful hotplug events.  Skip the expensive
update operation for superfluous events.

== Aside: Locking ==

It took me a few moments to declare the locking to be safe enough for
node_demotion_topo_changed() to work.  It all hinges on the memory
hotplug lock:

During memory hotplug events, 'mem_hotplug_lock' is held for write.
This ensures that two memory hotplug events can not be called
simultaneously.

CPU hotplug has a similar lock (cpuhp_state_mutex) which also provides
mutual exclusion between CPU hotplug events.  In addition, the demotion
code acquire and hold the mem_hotplug_lock for read during its CPU
hotplug handlers.  This provides mutual exclusion between the demotion
memory hotplug callbacks and the CPU hotplug callbacks.

This effectively allows treating the migration target generation code to
act as if it is single-threaded.

1. https://lore.kernel.org/all/20210905135932.GE15026@xsang-OptiPlex-9020/

Link: https://lkml.kernel.org/r/20210924161251.093CCD06@davehans-spike.ostc.intel.com
Link: https://lkml.kernel.org/r/20210924161253.D7673E31@davehans-spike.ostc.intel.com
Fixes: 884a6e5d1f ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:02 -10:00
Nadav Amit
cb185d5f1e userfaultfd: fix a race between writeprotect and exit_mmap()
A race is possible when a process exits, its VMAs are removed by
exit_mmap() and at the same time userfaultfd_writeprotect() is called.

The race was detected by KASAN on a development kernel, but it appears
to be possible on vanilla kernels as well.

Use mmget_not_zero() to prevent the race as done in other userfaultfd
operations.

Link: https://lkml.kernel.org/r/20210921200247.25749-1-namit@vmware.com
Fixes: 63b2d4174c ("userfaultfd: wp: add the writeprotect API to userfaultfd ioctl")
Signed-off-by: Nadav Amit <namit@vmware.com>
Tested-by: Li  Wang <liwang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:02 -10:00
Peter Xu
8913970c19 mm/userfaultfd: selftests: fix memory corruption with thp enabled
In RHEL's gating selftests we've encountered memory corruption in the
uffd event test even with upstream kernel:

        # ./userfaultfd anon 128 4
        nr_pages: 32768, nr_pages_per_cpu: 32768
        bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729)
        bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877)
        bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699)
        bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196)
        testing uffd-wp with pagemap (pgsize=4096): done
        testing uffd-wp with pagemap (pgsize=2097152): done
        testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963)
        ERROR: faulting process failed (errno=0, line=1117)

It can be easily reproduced when global thp enabled, which is the
default for RHEL.

It's also known as a side effect of commit 0db282ba2c ("selftest: use
mmap instead of posix_memalign to allocate memory", 2021-07-23), which
is imho right itself on using mmap() to make sure the addresses will be
untagged even on arm.

The problem is, for each test we allocate buffers using two
allocate_area() calls.  We assumed these two buffers won't affect each
other, however they could, because mmap() could have found that the two
buffers are near each other and having the same VMA flags, so they got
merged into one VMA.

It won't be a big problem if thp is not enabled, but when thp is
agressively enabled it means when initializing the src buffer it could
accidentally setup part of the dest buffer too when there's a shared THP
that overlaps the two regions.  Then some of the dest buffer won't be
able to be trapped by userfaultfd missing mode, then it'll cause memory
corruption as described.

To fix it, do release_pages() after initializing the src buffer.

Since the previous two release_pages() calls are after
uffd_test_ctx_clear() which will unmap all the buffers anyway (which is
stronger than release pages; as unmap() also tear town pgtables), drop
them as they shouldn't really be anything useful.

We can mark the Fixes tag upon 0db282ba2c as it's reported to only
happen there, however the real "Fixes" IMHO should be 8ba6e86408, as
before that commit we'll always do explicit release_pages() before
registration of uffd, and 8ba6e86408 changed that logic by adding
extra unmap/map and we didn't release the pages at the right place.
Meanwhile I don't have a solid glue anyway on whether posix_memalign()
could always avoid triggering this bug, hence it's safer to attach this
fix to commit 8ba6e86408.

Link: https://lkml.kernel.org/r/20210923232512.210092-1-peterx@redhat.com
Fixes: 8ba6e86408 ("userfaultfd/selftests: reinitialize test context in each test")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1994931
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Li Wang <liwan@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:02 -10:00
Marco Giunta
2966492372 ALSA: usb-audio: Fix microphone sound on Jieli webcam.
When a Jieli Technology USB Webcam is connected, the video part works
well, but the mic sound is speeded up. On dmesg there are messages
about different rates from the runtime rates, warnings about volume
resolution and lastly, the log is filled, every 5 seconds, with
retire_capture_urb error messages.

The mic works only when ep packet size is set to wMaxPacketSize (normal
sound and no more retire_capture_urb error messages). Skipping reading
sample rate, fixes the messages about different rates and forcing a volume
resolution, fixes warnings about volume range. I have arbitrarily choosed
the value (16): I read in a comment that there should be no more than 255
levels, so 4096 (max volume) / 16 = 0-255.

Signed-off-by: Marco Giunta <giun7a@gmail.com>
Link: https://lore.kernel.org/r/20211018162552.12082-1-giun7a@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-19 08:07:01 +02:00
Adrian Hunter
4e5483b844 scsi: ufs: ufs-pci: Force a full restore after suspend-to-disk
Implement the ->restore() PM operation and set the link to off, which will
force a full reset and restore.  This ensures that Host Performance Booster
is reset after suspend-to-disk.

The Host Performance Booster feature caches logical-to-physical mapping
information in the host memory.  After suspend-to-disk, such information is
not valid, so a full reset and restore is needed.

A full reset and restore is done if the SPM level is 5 or 6, but not for
other SPM levels, so this change fixes those cases.

A full reset and restore also restores base address registers, so that code
is removed.

Link: https://lore.kernel.org/r/20211018151004.284200-2-adrian.hunter@intel.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 23:19:44 -04:00
Dmitry Bogdanov
4a8f71014b scsi: qla2xxx: Fix unmap of already freed sgl
The sgl is freed in the target stack in target_release_cmd_kref() before
calling qlt_free_cmd() but there is an unmap of sgl in qlt_free_cmd() that
causes a panic if sgl is not yet DMA unmapped:

NIP dma_direct_unmap_sg+0xdc/0x180
LR  dma_direct_unmap_sg+0xc8/0x180
Call Trace:
 ql_dbg_prefix+0x68/0xc0 [qla2xxx] (unreliable)
 dma_unmap_sg_attrs+0x54/0xf0
 qlt_unmap_sg.part.19+0x54/0x1c0 [qla2xxx]
 qlt_free_cmd+0x124/0x1d0 [qla2xxx]
 tcm_qla2xxx_release_cmd+0x4c/0xa0 [tcm_qla2xxx]
 target_put_sess_cmd+0x198/0x370 [target_core_mod]
 transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
 tcm_qla2xxx_complete_free+0x6c/0x90 [tcm_qla2xxx]

The sgl may be left unmapped in error cases of response sending.  For
instance, qlt_rdy_to_xfer() maps sgl and exits when session is being
deleted keeping the sgl mapped.

This patch removes use-after-free of the sgl and ensures that the sgl is
unmapped for any command that was not sent to firmware.

Link: https://lore.kernel.org/r/20211018122650.11846-1-d.bogdanov@yadro.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 23:19:44 -04:00
Joy Gu
7fb223d0ad scsi: qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()
Commit 8c0eb596ba ("[SCSI] qla2xxx: Fix a memory leak in an error path of
qla2x00_process_els()"), intended to change:

        bsg_job->request->msgcode == FC_BSG_HST_ELS_NOLOGIN

to:

        bsg_job->request->msgcode != FC_BSG_RPT_ELS

but changed it to:

        bsg_job->request->msgcode == FC_BSG_RPT_ELS

instead.

Change the == to a != to avoid leaking the fcport structure or freeing
unallocated memory.

Link: https://lore.kernel.org/r/20211012191834.90306-2-jgu@purestorage.com
Fixes: 8c0eb596ba ("[SCSI] qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Joy Gu <jgu@purestorage.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 23:04:31 -04:00
Zheyu Ma
06634d5b6e scsi: qla2xxx: Return -ENOMEM if kzalloc() fails
The driver probing function should return < 0 for failure, otherwise
kernel will treat value > 0 as success.

Link: https://lore.kernel.org/r/1634522181-31166-1-git-send-email-zheyuma97@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 23:01:10 -04:00
Gaosheng Cui
6e3ee990c9 audit: fix possible null-pointer dereference in audit_filter_rules
Fix  possible null-pointer dereference in audit_filter_rules.

audit_filter_rules() error: we previously assumed 'ctx' could be null

Cc: stable@vger.kernel.org
Fixes: bf361231c2 ("audit: add saddr_fam filter field")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-18 18:27:47 -04:00
Steven Rostedt (VMware)
ed65df63a3 tracing: Have all levels of checks prevent recursion
While writing an email explaining the "bit = 0" logic for a discussion on
making ftrace_test_recursion_trylock() disable preemption, I discovered a
path that makes the "not do the logic if bit is zero" unsafe.

The recursion logic is done in hot paths like the function tracer. Thus,
any code executed causes noticeable overhead. Thus, tricks are done to try
to limit the amount of code executed. This included the recursion testing
logic.

Having recursion testing is important, as there are many paths that can
end up in an infinite recursion cycle when tracing every function in the
kernel. Thus protection is needed to prevent that from happening.

Because it is OK to recurse due to different running context levels (e.g.
an interrupt preempts a trace, and then a trace occurs in the interrupt
handler), a set of bits are used to know which context one is in (normal,
softirq, irq and NMI). If a recursion occurs in the same level, it is
prevented*.

Then there are infrastructure levels of recursion as well. When more than
one callback is attached to the same function to trace, it calls a loop
function to iterate over all the callbacks. Both the callbacks and the
loop function have recursion protection. The callbacks use the
"ftrace_test_recursion_trylock()" which has a "function" set of context
bits to test, and the loop function calls the internal
trace_test_and_set_recursion() directly, with an "internal" set of bits.

If an architecture does not implement all the features supported by ftrace
then the callbacks are never called directly, and the loop function is
called instead, which will implement the features of ftrace.

Since both the loop function and the callbacks do recursion protection, it
was seemed unnecessary to do it in both locations. Thus, a trick was made
to have the internal set of recursion bits at a more significant bit
location than the function bits. Then, if any of the higher bits were set,
the logic of the function bits could be skipped, as any new recursion
would first have to go through the loop function.

This is true for architectures that do not support all the ftrace
features, because all functions being traced must first go through the
loop function before going to the callbacks. But this is not true for
architectures that support all the ftrace features. That's because the
loop function could be called due to two callbacks attached to the same
function, but then a recursion function inside the callback could be
called that does not share any other callback, and it will be called
directly.

i.e.

 traced_function_1: [ more than one callback tracing it ]
   call loop_func

 loop_func:
   trace_recursion set internal bit
   call callback

 callback:
   trace_recursion [ skipped because internal bit is set, return 0 ]
   call traced_function_2

 traced_function_2: [ only traced by above callback ]
   call callback

 callback:
   trace_recursion [ skipped because internal bit is set, return 0 ]
   call traced_function_2

 [ wash, rinse, repeat, BOOM! out of shampoo! ]

Thus, the "bit == 0 skip" trick is not safe, unless the loop function is
call for all functions.

Since we want to encourage architectures to implement all ftrace features,
having them slow down due to this extra logic may encourage the
maintainers to update to the latest ftrace features. And because this
logic is only safe for them, remove it completely.

 [*] There is on layer of recursion that is allowed, and that is to allow
     for the transition between interrupt context (normal -> softirq ->
     irq -> NMI), because a trace may occur before the context update is
     visible to the trace recursion logic.

Link: https://lore.kernel.org/all/609b565a-ed6e-a1da-f025-166691b5d994@linux.alibaba.com/
Link: https://lkml.kernel.org/r/20211018154412.09fcad3c@gandalf.local.home

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Jisheng Zhang <jszhang@kernel.org>
Cc: =?utf-8?b?546L6LSH?= <yun.wang@linux.alibaba.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: stable@vger.kernel.org
Fixes: edc15cafcb ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-18 18:12:09 -04:00
Nathan Chancellor
8a64ef042e nfp: bpf: silence bitwise vs. logical OR warning
A new warning in clang points out two places in this driver where
boolean expressions are being used with a bitwise OR instead of a
logical one:

drivers/net/ethernet/netronome/nfp/nfp_asm.c:199:20: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical]
        reg->src_lmextn = swreg_lmextn(lreg) | swreg_lmextn(rreg);
                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                             ||
drivers/net/ethernet/netronome/nfp/nfp_asm.c:199:20: note: cast one or both operands to int to silence this warning
drivers/net/ethernet/netronome/nfp/nfp_asm.c:280:20: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical]
        reg->src_lmextn = swreg_lmextn(lreg) | swreg_lmextn(rreg);
                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                             ||
drivers/net/ethernet/netronome/nfp/nfp_asm.c:280:20: note: cast one or both operands to int to silence this warning
2 errors generated.

The motivation for the warning is that logical operations short circuit
while bitwise operations do not. In this case, it does not seem like
short circuiting is harmful so implement the suggested fix of changing
to a logical operation to fix the warning.

Link: https://github.com/ClangBuiltLinux/linux/issues/1479
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20211018193101.2340261-1-nathan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-18 14:50:01 -07:00
Rob Clark
5ca6779d2f drm/msm/devfreq: Restrict idle clamping to a618 for now
Until we better understand the stability issues caused by frequent
frequency changes, lets limit them to a618.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Caleb Connolly <caleb.connolly@linaro.org>
Link: https://lore.kernel.org/r/20211018153627.2787882-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-10-18 14:31:57 -07:00
Eric W. Biederman
15bc01effe ucounts: Fix signal ucount refcounting
In commit fda31c5029 ("signal: avoid double atomic counter
increments for user accounting") Linus made a clever optimization to
how rlimits and the struct user_struct.  Unfortunately that
optimization does not work in the obvious way when moved to nested
rlimits.  The problem is that the last decrement of the per user
namespace per user sigpending counter might also be the last decrement
of the sigpending counter in the parent user namespace as well.  Which
means that simply freeing the leaf ucount in __free_sigqueue is not
enough.

Maintain the optimization and handle the tricky cases by introducing
inc_rlimit_get_ucounts and dec_rlimit_put_ucounts.

By moving the entire optimization into functions that perform all of
the work it becomes possible to ensure that every level is handled
properly.

The new function inc_rlimit_get_ucounts returns 0 on failure to
increment the ucount.  This is different than inc_rlimit_ucounts which
increments the ucounts and returns LONG_MAX if the ucount counter has
exceeded it's maximum or it wrapped (to indicate the counter needs to
decremented).

I wish we had a single user to account all pending signals to across
all of the threads of a process so this complexity was not necessary

Cc: stable@vger.kernel.org
Fixes: d646969055 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
v1: https://lkml.kernel.org/r/87mtnavszx.fsf_-_@disp2133
Link: https://lkml.kernel.org/r/87fssytizw.fsf_-_@disp2133
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Tested-by: Rune Kleveland <rune.kleveland@infomedia.dk>
Tested-by: Yu Zhao <yuzhao@google.com>
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-18 16:02:30 -05:00
Martin Blumenstingl
675c496d0f clk: composite: Also consider .determine_rate for rate + mux composites
Commit 69a00fb3d6 ("clk: divider: Implement and wire up
.determine_rate by default") switches clk_divider_ops to implement
.determine_rate by default. This breaks composite clocks with multiple
parents because clk-composite.c does not use the special handling for
mux + divider combinations anymore (that was restricted to rate clocks
which only implement .round_rate, but not .determine_rate).

Alex reports:
  This breaks lot of clocks for Rockchip which intensively uses
  composites,  i.e. those clocks will always stay at the initial parent,
  which in some cases  is the XTAL clock and I strongly guess it is the
  same for other platforms,  which use composite clocks having more than
  one parent (e.g. mediatek, ti ...)

  Example (RK3399)
  clk_sdio is set (initialized) with XTAL (24 MHz) as parent in u-boot.
  It will always stay at this parent, even if the mmc driver sets a rate
  of  200 MHz (fails, as the nature of things), which should switch it
  to   any of its possible parent PLLs defined in
  mux_pll_src_cpll_gpll_npll_ppll_upll_24m_p (see clk-rk3399.c)  - which
  never happens.

Restore the original behavior by changing the priority of the conditions
inside clk-composite.c. Now the special rate + mux case (with rate_ops
having a .round_rate - which is still the case for the default
clk_divider_ops) is preferred over rate_ops which have .determine_rate
defined (and not further considering the mux).

Fixes: 69a00fb3d6 ("clk: divider: Implement and wire up .determine_rate by default")
Reported-by: Alex Bee <knaerzche@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://lore.kernel.org/r/20211016105022.303413-2-martin.blumenstingl@googlemail.com
Tested-by: Alex Bee <knaerzche@gmail.com>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-18 12:59:42 -07:00
Paolo Bonzini
9f1ee7b169 KVM: SEV-ES: reduce ghcb_sa_len to 32 bits
The size of the GHCB scratch area is limited to 16 KiB (GHCB_SCRATCH_AREA_LIMIT),
so there is no need for it to be a u64.  This fixes a build error on 32-bit
systems:

i686-linux-gnu-ld: arch/x86/kvm/svm/sev.o: in function `sev_es_string_io:
sev.c:(.text+0x110f): undefined reference to `__udivdi3'

Cc: stable@vger.kernel.org
Fixes: 019057bd73 ("KVM: SEV-ES: fix length of string I/O")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:07:19 -04:00
Hao Xiang
d61863c66f KVM: VMX: Remove redundant handling of bus lock vmexit
Hardware may or may not set exit_reason.bus_lock_detected on BUS_LOCK
VM-Exits. Dealing with KVM_RUN_X86_BUS_LOCK in handle_bus_lock_vmexit
could be redundant when exit_reason.basic is EXIT_REASON_BUS_LOCK.

We can remove redundant handling of bus lock vmexit. Unconditionally Set
exit_reason.bus_lock_detected in handle_bus_lock_vmexit(), and deal with
KVM_RUN_X86_BUS_LOCK only in vmx_handle_exit().

Signed-off-by: Hao Xiang <hao.xiang@linux.alibaba.com>
Message-Id: <1634299161-30101-1-git-send-email-hao.xiang@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:07:19 -04:00
Christian Borntraeger
01c7d2672a KVM: kvm_stat: do not show halt_wait_ns
Similar to commit 111d0bda8e ("tools/kvm_stat: Exempt time-based
counters"), we should not show timer values in kvm_stat. Remove the new
halt_wait_ns.

Fixes: 87bcc5fa09 ("KVM: stats: Add halt_wait_ns stats for all architectures")
Cc: Jing Zhang <jingzhangos@google.com>
Cc: Stefan Raspl <raspl@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Stefan Raspl <raspl@linux.ibm.com>
Message-Id: <20211006121724.4154-1-borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:07:18 -04:00
Sean Christopherson
9139a7a645 KVM: x86: WARN if APIC HW/SW disable static keys are non-zero on unload
WARN if the static keys used to track if any vCPU has disabled its APIC
are left elevated at module exit.  Unlike the underflow case, nothing in
the static key infrastructure will complain if a key is left elevated,
and because an elevated key only affects performance, nothing in KVM will
fail if either key is improperly incremented.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211013003554.47705-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:07:18 -04:00
Sean Christopherson
f7d8a19f9a Revert "KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU RESET"
Revert a change to open code bits of kvm_lapic_set_base() when emulating
APIC RESET to fix an apic_hw_disabled underflow bug due to arch.apic_base
and apic_hw_disabled being unsyncrhonized when the APIC is created.  If
kvm_arch_vcpu_create() fails after creating the APIC, kvm_free_lapic()
will see the initialized-to-zero vcpu->arch.apic_base and decrement
apic_hw_disabled without KVM ever having incremented apic_hw_disabled.

Using kvm_lapic_set_base() in kvm_lapic_reset() is also desirable for a
potential future where KVM supports RESET outside of vCPU creation, in
which case all the side effects of kvm_lapic_set_base() are needed, e.g.
to handle the transition from x2APIC => xAPIC.

Alternatively, KVM could temporarily increment apic_hw_disabled (and call
kvm_lapic_set_base() at RESET), but that's a waste of cycles and would
impact the performance of other vCPUs and VMs.  The other subtle side
effect is that updating the xAPIC ID needs to be done at RESET regardless
of whether the APIC was previously enabled, i.e. kvm_lapic_reset() needs
an explicit call to kvm_apic_set_xapic_id() regardless of whether or not
kvm_lapic_set_base() also performs the update.  That makes stuffing the
enable bit at vCPU creation slightly more palatable, as doing so affects
only the apic_hw_disabled key.

Opportunistically tweak the comment to explicitly call out the connection
between vcpu->arch.apic_base and apic_hw_disabled, and add a comment to
call out the need to always do kvm_apic_set_xapic_id() at RESET.

Underflow scenario:

  kvm_vm_ioctl() {
    kvm_vm_ioctl_create_vcpu() {
      kvm_arch_vcpu_create() {
        if (something_went_wrong)
          goto fail_free_lapic;
        /* vcpu->arch.apic_base is initialized when something_went_wrong is false. */
        kvm_vcpu_reset() {
          kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event) {
            vcpu->arch.apic_base = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE;
          }
        }
        return 0;
      fail_free_lapic:
        kvm_free_lapic() {
          /* vcpu->arch.apic_base is not yet initialized when something_went_wrong is true. */
          if (!(vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE))
            static_branch_slow_dec_deferred(&apic_hw_disabled); // <= underflow bug.
        }
        return r;
      }
    }
  }

This (mostly) reverts commit 421221234a.

Fixes: 421221234a ("KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU RESET")
Reported-by: syzbot+9fc046ab2b0cf295a063@syzkaller.appspotmail.com
Debugged-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211013003554.47705-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:07:18 -04:00
Paolo Bonzini
fa13843d15 KVM: X86: fix lazy allocation of rmaps
If allocation of rmaps fails, but some of the pointers have already been written,
those pointers can be cleaned up when the memslot is freed, or even reused later
for another attempt at allocating the rmaps.  Therefore there is no need to
WARN, as done for example in memslot_rmap_alloc, but the allocation *must* be
skipped lest KVM will overwrite the previous pointer and will indeed leak memory.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:07:17 -04:00
Peter Gonda
baa1e5ca17 KVM: SEV-ES: Set guest_state_protected after VMSA update
The refactoring in commit bb18a67774 ("KVM: SEV: Acquire
vcpu mutex when updating VMSA") left behind the assignment to
svm->vcpu.arch.guest_state_protected; add it back.

Signed-off-by: Peter Gonda <pgonda@google.com>
[Delta between v2 and v3 of Peter's patch, which had already been
 committed; the commit message is my own. - Paolo]
Fixes: bb18a67774 ("KVM: SEV: Acquire vcpu mutex when updating VMSA")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:07:17 -04:00
Zqiang
9fbfabfda2 block: fix incorrect references to disk objects
When adding partitions to the disk, the reference count of the disk
object is increased. then alloc partition device and called
device_add(), if the device_add() return error, the reference
count of the disk object will be reduced twice, at put_device(pdev)
and put_disk(disk). this leads to the end of the object's life cycle
prematurely, and trigger following calltrace.

  __init_work+0x2d/0x50 kernel/workqueue.c:519
  synchronize_rcu_expedited+0x3af/0x650 kernel/rcu/tree_exp.h:847
  bdi_remove_from_list mm/backing-dev.c:938 [inline]
  bdi_unregister+0x17f/0x5c0 mm/backing-dev.c:946
  release_bdi+0xa1/0xc0 mm/backing-dev.c:968
  kref_put include/linux/kref.h:65 [inline]
  bdi_put+0x72/0xa0 mm/backing-dev.c:976
  bdev_free_inode+0x11e/0x220 block/bdev.c:408
  i_callback+0x3f/0x70 fs/inode.c:226
  rcu_do_batch kernel/rcu/tree.c:2508 [inline]
  rcu_core+0x76d/0x16c0 kernel/rcu/tree.c:2743
  __do_softirq+0x1d7/0x93b kernel/softirq.c:558
  invoke_softirq kernel/softirq.c:432 [inline]
  __irq_exit_rcu kernel/softirq.c:636 [inline]
  irq_exit_rcu+0xf2/0x130 kernel/softirq.c:648
  sysvec_apic_timer_interrupt+0x93/0xc0

making disk is NULL when calling put_disk().

Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211018103422.2043-1-qiang.zhang1211@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 11:20:38 -06:00
Randy Dunlap
4cce60f15c NIOS2: irqflags: rename a redefined register name
Both arch/nios2/ and drivers/mmc/host/tmio_mmc.c define a macro
with the name "CTL_STATUS". Change the one in arch/nios2/ to be
"CTL_FSTATUS" (flags status) to eliminate the build warning.

In file included from ../drivers/mmc/host/tmio_mmc.c:22:
drivers/mmc/host/tmio_mmc.h:31: warning: "CTL_STATUS" redefined
   31 | #define CTL_STATUS 0x1c
arch/nios2/include/asm/registers.h:14: note: this is the location of the previous definition
   14 | #define CTL_STATUS      0

Fixes: b31ebd8055 ("nios2: Nios2 registers")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2021-10-18 11:22:59 -05:00
Tianjia Zhang
d49fe5e815 selftests/tls: add SM4 algorithm dependency for tls selftests
Kernel TLS test has added SM4 GCM/CCM algorithm support, but SM4
algorithm is not compiled by default, this patch add SM4 config
dependency.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:52:11 +01:00
Jeremy Kerr
5a20dd46b8 mctp: Be explicit about struct sockaddr_mctp padding
We currently have some implicit padding in struct sockaddr_mctp. This
patch makes this padding explicit, and ensures we have consistent
layout on platforms with <32bit alignmnent.

Fixes: 60fc639816 ("mctp: Add sockaddr_mctp to uapi")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:47:09 +01:00
Jeremy Kerr
b416beb25d mctp: unify sockaddr_mctp types
Use the more precise __kernel_sa_family_t for smctp_family, to match
struct sockaddr.

Also, use an unsigned int for the network member; negative networks
don't make much sense. We're already using unsigned for mctp_dev and
mctp_skb_cb, but need to change mctp_sock to suit.

Fixes: 60fc639816 ("mctp: Add sockaddr_mctp to uapi")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Acked-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:47:09 +01:00
Zheyu Ma
b2cddb44bd cavium: Return negative value when pci_alloc_irq_vectors() fails
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:46:02 +01:00
Wan Jiabing
d1a7b9e469 net: mscc: ocelot: Add of_node_put() before goto
Fix following coccicheck warning:
./drivers/net/ethernet/mscc/ocelot_vsc7514.c:946:1-33: WARNING: Function
for_each_available_child_of_node should have of_node_put() before goto.

Early exits from for_each_available_child_of_node should decrement the
node reference counter.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:44:48 +01:00
Wan Jiabing
d9fd7e9fcc net: sparx5: Add of_node_put() before goto
Fix following coccicheck warning:
./drivers/net/ethernet/microchip/sparx5/s4parx5_main.c:723:1-33: WARNING: Function
for_each_available_child_of_node should have of_node_put() before goto

Early exits from for_each_available_child_of_node should decrement the
node reference counter.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:44:48 +01:00
Paul Blakey
2dc4e9e88c net/sched: act_ct: Fix byte count on fragmented packets
First fragmented packets (frag offset = 0) byte len is zeroed
when stolen by ip_defrag(). And since act_ct update the stats
only afterwards (at end of execute), bytes aren't correctly
accounted for such packets.

To fix this, move stats update to start of action execute.

Fixes: b57dc7c13e ("net/sched: Introduce action ct")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:31:58 +01:00
DENG Qingfang
342afce10d net: dsa: mt7530: correct ds->num_ports
Setting ds->num_ports to DSA_MAX_PORTS made DSA core allocate unnecessary
dsa_port's and call mt7530_port_disable for non-existent ports.

Set it to MT7530_NUM_PORTS to fix that, and dsa_is_user_port check in
port_enable/disable is no longer required.

Cc: stable@vger.kernel.org
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:22:21 +01:00
Aleksander Jan Bajkowski
66d262804a net: dsa: lantiq_gswip: fix register definition
I compared the register definitions with the D-Link DWR-966
GPL sources and found that the PUAFD field definition was
incorrect. This definition is unused and causes no issues.

Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:21:49 +01:00
David S. Miller
bca69044af Merge tag 'linux-can-fixes-for-5.15-20211017' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
pull-request: can 2021-10-17

this is a pull request of 11 patches for net/master.

The first 4 patches are by Ziyang Xuan and Zhang Changzhong and fix 1
use after free and 3 standard conformance problems in the j1939 CAN
stack.

The next 2 patches are by Ziyang Xuan and fix 2 concurrency problems
in the ISOTP CAN stack.

Yoshihiro Shimoda's patch for the rcar_can fix suspend/resume on not
running CAN interfaces.

Aswath Govindraju's patch for the m_can driver fixes access for MMIO
devices.

Zheyu Ma contributes a patch for the peak_pci driver to fix a use
after free.

Stephane Grosjean's 2 patches fix CAN error state handling in the
peak_usb driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:04:42 +01:00
Randy Dunlap
0a9bb11a5e hamradio: baycom_epp: fix build for UML
On i386, the baycom_epp driver wants to inspect X86 CPU features (TSC)
and then act on that data, but that info is not available when running
on UML, so prevent that test and do the default action.

Prevents this build error on UML + i386:

../drivers/net/hamradio/baycom_epp.c: In function ‘epp_bh’:
../drivers/net/hamradio/baycom_epp.c:630:6: error: implicit declaration of function ‘boot_cpu_has’; did you mean ‘get_cpu_mask’? [-Werror=implicit-function-declaration]
  if (boot_cpu_has(X86_FEATURE_TSC))   \
      ^
../drivers/net/hamradio/baycom_epp.c:658:2: note: in expansion of macro ‘GETTICK’
  GETTICK(time1);

Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-um@lists.infradead.org
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Thomas Sailer <t.sailer@alumni.ethz.ch>
Cc: linux-hams@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:57:42 +01:00
Hamza Mahfooz
c2bbf9d1e9 dma-debug: teach add_dma_entry() about DMA_ATTR_SKIP_CPU_SYNC
Mapping something twice should be possible as long as,
DMA_ATTR_SKIP_CPU_SYNC is passed to the strictly speaking second relevant
mapping operation (that attempts to map the same thing). So, don't issue a
warning if the specified condition is met in add_dma_entry().

Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-18 12:46:45 +02:00
Davidlohr Bueso
d9aaaf2232 netfilter: ebtables: allocate chainstack on CPU local nodes
Keep the per-CPU memory allocated for chainstacks local.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-18 00:23:57 +02:00
Stephane Grosjean
553715feaa can: peak_usb: pcan_usb_fd_decode_status(): remove unnecessary test on the nullity of a pointer
Since alloc_can_err_skb() puts NULL in cf in the case when skb cannot
be allocated and can_change_state() handles the case when cf is NULL,
the test on the nullity of skb is now unnecessary.

Link: https://lore.kernel.org/all/20210929142111.55757-2-s.grosjean@peak-system.com
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 22:51:51 +02:00
Stephane Grosjean
3d031abc7e can: peak_usb: pcan_usb_fd_decode_status(): fix back to ERROR_ACTIVE state notification
This corrects the lack of notification of a return to ERROR_ACTIVE
state for USB - CANFD devices from PEAK-System.

Fixes: 0a25e1f4f1 ("can: peak_usb: add support for PEAK new CANFD USB adapters")
Link: https://lore.kernel.org/all/20210929142111.55757-1-s.grosjean@peak-system.com
Cc: stable@vger.kernel.org
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 22:51:51 +02:00
Zheyu Ma
949fe9b355 can: peak_pci: peak_pci_remove(): fix UAF
When remove the module peek_pci, referencing 'chan' again after
releasing 'dev' will cause UAF.

Fix this by releasing 'dev' later.

The following log reveals it:

[   35.961814 ] BUG: KASAN: use-after-free in peak_pci_remove+0x16f/0x270 [peak_pci]
[   35.963414 ] Read of size 8 at addr ffff888136998ee8 by task modprobe/5537
[   35.965513 ] Call Trace:
[   35.965718 ]  dump_stack_lvl+0xa8/0xd1
[   35.966028 ]  print_address_description+0x87/0x3b0
[   35.966420 ]  kasan_report+0x172/0x1c0
[   35.966725 ]  ? peak_pci_remove+0x16f/0x270 [peak_pci]
[   35.967137 ]  ? trace_irq_enable_rcuidle+0x10/0x170
[   35.967529 ]  ? peak_pci_remove+0x16f/0x270 [peak_pci]
[   35.967945 ]  __asan_report_load8_noabort+0x14/0x20
[   35.968346 ]  peak_pci_remove+0x16f/0x270 [peak_pci]
[   35.968752 ]  pci_device_remove+0xa9/0x250

Fixes: e6d9c80b7c ("can: peak_pci: add support of some new PEAK-System PCI cards")
Link: https://lore.kernel.org/all/1634192913-15639-1-git-send-email-zheyuma97@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 22:51:50 +02:00
Aswath Govindraju
99d173fbe8 can: m_can: fix iomap_read_fifo() and iomap_write_fifo()
The read and writes from the fifo are from a buffer, with various
fields and data at predefined offsets. So, they should not be done to
the same address(or port) in case of val_count greater than 1.
Therefore, fix this by using iowrite32()/ioread32() instead of
ioread32_rep()/iowrite32_rep().

Also, the write into FIFO must be performed with an offset from the
message ram base address. Therefore, fix the base address to
mram_base.

Fixes: e39381770e ("can: m_can: Disable IRQs on FIFO bus errors")
Link: https://lore.kernel.org/all/20210920123344.2320-1-a-govindraju@ti.com
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 22:51:43 +02:00
Yoshihiro Shimoda
f7c05c3987 can: rcar_can: fix suspend/resume
If the driver was not opened, rcar_can_suspend() should not call
clk_disable() because the clock was not enabled.

Fixes: fd1159318e ("can: add Renesas R-Car CAN driver")
Link: https://lore.kernel.org/all/20210924075556.223685-1-yoshihiro.shimoda.uh@renesas.com
Cc: stable@vger.kernel.org
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Ayumi Nakamichi <ayumi.nakamichi.kf@renesas.com>
Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu>
Tested-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 22:51:43 +02:00
Tejun Heo
5370b0f490 blk-cgroup: blk_cgroup_bio_start() should use irq-safe operations on blkg->iostat_cpu
c3df5fb57f ("cgroup: rstat: fix A-A deadlock on 32bit around
u64_stats_sync") made u64_stats updates irq-safe to avoid A-A deadlocks.
Unfortunately, the conversion missed one in blk_cgroup_bio_start(). Fix it.

Fixes: 2d146aa3aa ("mm: memcontrol: switch to rstat")
Cc: stable@vger.kernel.org # v5.13+
Reported-by: syzbot+9738c8815b375ce482a1@syzkaller.appspotmail.com
Signed-off-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/YWi7NrQdVlxD6J9W@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-17 10:40:10 -06:00
Ziyang Xuan
43a08c3bda can: isotp: isotp_sendmsg(): fix TX buffer concurrent access in isotp_sendmsg()
When isotp_sendmsg() concurrent, tx.state of all TX processes can be
ISOTP_IDLE. The conditions so->tx.state != ISOTP_IDLE and
wq_has_sleeper(&so->wait) can not protect TX buffer from being
accessed by multiple TX processes.

We can use cmpxchg() to try to modify tx.state to ISOTP_SENDING firstly.
If the modification of the previous process succeed, the later process
must wait tx.state to ISOTP_IDLE firstly. Thus, we can ensure TX buffer
is accessed by only one process at the same time. And we should also
restore the original tx.state at the subsequent error processes.

Fixes: e057dd3fc2 ("can: add ISO 15765-2:2016 transport protocol")
Link: https://lore.kernel.org/all/c2517874fbdf4188585cf9ddf67a8fa74d5dbde5.1633764159.git.william.xuanziyang@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 14:18:21 +02:00
Ziyang Xuan
9acf636215 can: isotp: isotp_sendmsg(): add result check for wait_event_interruptible()
Using wait_event_interruptible() to wait for complete transmission,
but do not check the result of wait_event_interruptible() which can be
interrupted. It will result in TX buffer has multiple accessors and
the later process interferes with the previous process.

Following is one of the problems reported by syzbot.

=============================================================
WARNING: CPU: 0 PID: 0 at net/can/isotp.c:840 isotp_tx_timer_handler+0x2e0/0x4c0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-rc7+ #68
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
RIP: 0010:isotp_tx_timer_handler+0x2e0/0x4c0
Call Trace:
 <IRQ>
 ? isotp_setsockopt+0x390/0x390
 __hrtimer_run_queues+0xb8/0x610
 hrtimer_run_softirq+0x91/0xd0
 ? rcu_read_lock_sched_held+0x4d/0x80
 __do_softirq+0xe8/0x553
 irq_exit_rcu+0xf8/0x100
 sysvec_apic_timer_interrupt+0x9e/0xc0
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20

Add result check for wait_event_interruptible() in isotp_sendmsg()
to avoid multiple accessers for tx buffer.

Fixes: e057dd3fc2 ("can: add ISO 15765-2:2016 transport protocol")
Link: https://lore.kernel.org/all/10ca695732c9dd267c76a3c30f37aefe1ff7e32f.1633764159.git.william.xuanziyang@huawei.com
Cc: stable@vger.kernel.org
Reported-by: syzbot+78bab6958a614b0c80b9@syzkaller.appspotmail.com
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 14:17:55 +02:00
Zhang Changzhong
a4fbe70c5c can: j1939: j1939_xtp_rx_rts_session_new(): abort TP less than 9 bytes
The receiver should abort TP if 'total message size' in TP.CM_RTS and
TP.CM_BAM is less than 9 or greater than 1785 [1], but currently the
j1939 stack only checks the upper bound and the receiver will accept
the following broadcast message:

  vcan1  18ECFF00   [8]  20 08 00 02 FF 00 23 01
  vcan1  18EBFF00   [8]  01 00 00 00 00 00 00 00
  vcan1  18EBFF00   [8]  02 00 FF FF FF FF FF FF

This patch adds check for the lower bound and abort illegal TP.

[1] SAE-J1939-82 A.3.4 Row 2 and A.3.6 Row 6.

Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Link: https://lore.kernel.org/all/1634203601-3460-1-git-send-email-zhangchangzhong@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 14:12:57 +02:00
Zhang Changzhong
379743985a can: j1939: j1939_xtp_rx_dat_one(): cancel session if receive TP.DT with error length
According to SAE-J1939-21, the data length of TP.DT must be 8 bytes, so
cancel session when receive unexpected TP.DT message.

Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Link: https://lore.kernel.org/all/1632972800-45091-1-git-send-email-zhangchangzhong@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 14:12:56 +02:00
Ziyang Xuan
d9d52a3ebd can: j1939: j1939_netdev_start(): fix UAF for rx_kref of j1939_priv
It will trigger UAF for rx_kref of j1939_priv as following.

        cpu0                                    cpu1
j1939_sk_bind(socket0, ndev0, ...)
j1939_netdev_start
                                        j1939_sk_bind(socket1, ndev0, ...)
                                        j1939_netdev_start
j1939_priv_set
                                        j1939_priv_get_by_ndev_locked
j1939_jsk_add
.....
j1939_netdev_stop
kref_put_lock(&priv->rx_kref, ...)
                                        kref_get(&priv->rx_kref, ...)
                                        REFCOUNT_WARN("addition on 0;...")

====================================================
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 1 PID: 20874 at lib/refcount.c:25 refcount_warn_saturate+0x169/0x1e0
RIP: 0010:refcount_warn_saturate+0x169/0x1e0
Call Trace:
 j1939_netdev_start+0x68b/0x920
 j1939_sk_bind+0x426/0xeb0
 ? security_socket_bind+0x83/0xb0

The rx_kref's kref_get() and kref_put() should use j1939_netdev_lock to
protect.

Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Link: https://lore.kernel.org/all/20210926104757.2021540-1-william.xuanziyang@huawei.com
Cc: stable@vger.kernel.org
Reported-by: syzbot+85d9878b19c94f9019ad@syzkaller.appspotmail.com
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 14:12:56 +02:00
Ziyang Xuan
b504a884f6 can: j1939: j1939_tp_rxtimer(): fix errant alert in j1939_tp_rxtimer
When the session state is J1939_SESSION_DONE, j1939_tp_rxtimer() will
give an alert "rx timeout, send abort", but do nothing actually. Move
the alert into session active judgment condition, it is more
reasonable.

One of the scenarios is that j1939_tp_rxtimer() execute followed by
j1939_xtp_rx_abort_one(). After j1939_xtp_rx_abort_one(), the session
state is J1939_SESSION_DONE, then j1939_tp_rxtimer() give an alert.

Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Link: https://lore.kernel.org/all/20210906094219.95924-1-william.xuanziyang@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-17 12:39:03 +02:00
Miles Chen
85374b6392 scsi: sd: Fix crashes in sd_resume_runtime()
After commit ed4246d37f ("scsi: sd: REQUEST SENSE for
BLIST_IGN_MEDIA_CHANGE devices in runtime_resume()"), the following crash
was observed.

static int sd_resume_runtime(struct device *dev)
{
        struct scsi_disk *sdkp = dev_get_drvdata(dev);
        struct scsi_device *sdp = sdkp->device; // sdkp == NULL and crash

        if (sdp->ignore_media_change) {
	...
}

It is possible for sdkp to be NULL in sd_resume_runtime(). To fix this
crash, follow sd_resume() to test if sdkp is NULL before dereferencing it.

Crash:
[    4.695171][  T151] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[    4.696591][  T151] Mem abort info:
[    4.697919][  T151]   ESR = 0x96000005
[    4.699692][  T151]   EC = 0x25: DABT (current EL), IL = 32 bits
[    4.701990][  T151]   SET = 0, FnV = 0
[    4.702513][  T151]   EA = 0, S1PTW = 0
[    4.704431][  T151]   FSC = 0x05: level 1 translation fault
[    4.705254][  T151] Data abort info:
[    4.705806][  T151]   ISV = 0, ISS = 0x00000005
[    4.706484][  T151]   CM = 0, WnR = 0
[    4.707048][  T151] [0000000000000008] user address but active_mm is swapper
[    4.710577][  T151] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[    4.832361][  T151] Kernel Offset: 0x12acc80000 from 0xffffffc010000000
[    4.833254][  T151] PHYS_OFFSET: 0x40000000
[    4.833814][  T151] pstate: 80400005 (Nzcv daif +PAN -UAO)
[    4.834546][  T151] pc : sd_resume_runtime+0x20/0x14c
[    4.835227][  T151] lr : scsi_runtime_resume+0x84/0xe4
[    4.835916][  T151] sp : ffffffc0110db8d0
[    4.836450][  T151] x29: ffffffc0110db8d0 x28: 0000000000000001
[    4.837258][  T151] x27: ffffff80c0bd1ac0 x26: ffffff80c0bd1ad0
[    4.838063][  T151] x25: ffffff80cea7e448 x24: ffffffd2bf961000
[    4.838867][  T151] x23: ffffffd2be69f838 x22: ffffffd2bd9dfb4c
[    4.839670][  T151] x21: 0000000000000000 x20: ffffff80cea7e000
[    4.840474][  T151] x19: ffffff80cea7e260 x18: ffffffc0110dd078
[    4.841277][  T151] x17: 00000000658783d9 x16: 0000000051469dac
[    4.842081][  T151] x15: 00000000b87f6327 x14: 0000000068fd680d
[    4.842885][  T151] x13: ffffff80c0bd2470 x12: ffffffd2bfa7f5f0
[    4.843688][  T151] x11: 0000000000000078 x10: 0000000000000001
[    4.844492][  T151] x9 : 00000000000000b1 x8 : ffffffd2be69f88c
[    4.845295][  T151] x7 : ffffffd2bd9e0e5c x6 : 0000000000000000
[    4.846099][  T151] x5 : 0000000000000080 x4 : 0000000000000001
[    4.846902][  T151] x3 : 68fd680dfe4ebe5e x2 : 0000000000000003
[    4.847706][  T151] x1 : ffffffd2bf7f9380 x0 : ffffff80cea7e260
[    4.856708][  T151]  die+0x16c/0x59c
[    4.857191][  T151]  __do_kernel_fault+0x1e8/0x210
[    4.857833][  T151]  do_page_fault+0xa4/0x654
[    4.858418][  T151]  do_translation_fault+0x6c/0x1b0
[    4.859083][  T151]  do_mem_abort+0x68/0x10c
[    4.859655][  T151]  el1_abort+0x40/0x64
[    4.860182][  T151]  el1h_64_sync_handler+0x54/0x88
[    4.860834][  T151]  el1h_64_sync+0x7c/0x80
[    4.861395][  T151]  sd_resume_runtime+0x20/0x14c
[    4.862025][  T151]  scsi_runtime_resume+0x84/0xe4
[    4.862667][  T151]  __rpm_callback+0x1f4/0x8cc
[    4.863275][  T151]  rpm_resume+0x7e8/0xaa4
[    4.863836][  T151]  __pm_runtime_resume+0xa0/0x110
[    4.864489][  T151]  sd_probe+0x30/0x428
[    4.865016][  T151]  really_probe+0x14c/0x500
[    4.865602][  T151]  __driver_probe_device+0xb4/0x18c
[    4.866278][  T151]  driver_probe_device+0x60/0x2c4
[    4.866931][  T151]  __device_attach_driver+0x228/0x2bc
[    4.867630][  T151]  __device_attach_async_helper+0x154/0x21c
[    4.868398][  T151]  async_run_entry_fn+0x5c/0x1c4
[    4.869038][  T151]  process_one_work+0x3ac/0x590
[    4.869670][  T151]  worker_thread+0x320/0x758
[    4.870265][  T151]  kthread+0x2e8/0x35c
[    4.870792][  T151]  ret_from_fork+0x10/0x20

Link: https://lore.kernel.org/r/20211015074654.19615-1-miles.chen@mediatek.com
Fixes: ed4246d37f ("scsi: sd: REQUEST SENSE for BLIST_IGN_MEDIA_CHANGE devices in runtime_resume()")
Cc: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-16 22:25:23 -04:00
Sreekanth Reddy
97e6ea6d78 scsi: mpi3mr: Fix duplicate device entries when scanning through sysfs
When scanning devices through the 'scan' attribute in sysfs, the user will
observe duplicate device entries in lsscsi command output.

Set the shost's max_channel to zero to avoid this.

Link: https://lore.kernel.org/r/20211014055425.30719-1-sreekanth.reddy@broadcom.com
Fixes: 824a156633 ("scsi: mpi3mr: Base driver code")
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-16 22:16:37 -04:00
Sachi King
4e5a04be88 pinctrl: amd: disable and mask interrupts on probe
Some systems such as the Microsoft Surface Laptop 4 leave interrupts
enabled and configured for use in sleep states on boot, which cause
unexpected behaviour such as spurious wakes and failed resumes in
s2idle states.

As interrupts should not be enabled until they are claimed and
explicitly enabled, disabling any interrupts mistakenly left enabled by
firmware should be safe.

Signed-off-by: Sachi King <nakato@nakato.io>
Link: https://lore.kernel.org/r/20211009033240.21543-1-nakato@nakato.io
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-10-16 23:56:59 +02:00
Nikolay Aleksandrov
fac3cb82a5 net: bridge: mcast: use multicast_membership_interval for IGMPv3
When I added IGMPv3 support I decided to follow the RFC for computing
the GMI dynamically:
" 8.4. Group Membership Interval

   The Group Membership Interval is the amount of time that must pass
   before a multicast router decides there are no more members of a
   group or a particular source on a network.

   This value MUST be ((the Robustness Variable) times (the Query
   Interval)) plus (one Query Response Interval)."

But that actually is inconsistent with how the bridge used to compute it
for IGMPv2, where it was user-configurable that has a correct default value
but it is up to user-space to maintain it. This would make it consistent
with the other timer values which are also maintained correct by the user
instead of being dynamically computed. It also changes back to the previous
user-expected GMI behaviour for IGMPv3 queries which were supported before
IGMPv3 was added. Note that to properly compute it dynamically we would
need to add support for "Robustness Variable" which is currently missing.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: 0436862e41 ("net: bridge: mcast: support for IGMPv3/MLDv2 ALLOW_NEW_SOURCES report")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 15:05:58 +01:00
Frieder Schrempf
0b28c41e3c arm64: dts: imx8mm-kontron: Fix connection type for VSC8531 RGMII PHY
Previously we falsely relied on the PHY driver to unconditionally
enable the internal RX delay. Since the following fix for the PHY
driver this is not the case anymore:

commit 7b005a1742 ("net: phy: mscc: configure both RX and TX internal
delays for RGMII")

In order to enable the delay we need to set the connection type to
"rgmii-rxid". Without the RX delay the ethernet is not functional at
all.

Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2021-10-16 15:30:00 +08:00
Frieder Schrempf
ca6f9d85d5 arm64: dts: imx8mm-kontron: Fix CAN SPI clock frequency
The MCP2515 can be used with an SPI clock of up to 10 MHz. Set the
limit accordingly to prevent any performance issues caused by the
really low clock speed of 100 kHz.

This removes the arbitrarily low limit on the SPI frequency, that was
caused by a typo in the original dts.

Without this change, receiving CAN messages on the board beyond a
certain bitrate will cause overrun errors (see 'ip -det -stat link show
can0').

With this fix, receiving messages on the bus works without any overrun
errors for bitrates up to 1 MBit.

Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2021-10-16 15:29:57 +08:00
Frieder Schrempf
6562d6e350 arm64: dts: imx8mm-kontron: Fix polarity of reg_rst_eth2
The regulator reg_rst_eth2 should keep the reset signal of the USB ethernet
adapter deasserted anytime. Fix the polarity and mark it as always-on.

Anyway, using the regulator is only a workaround for the missing support of
specifying a reset GPIO for USB devices in a generic way. As we don't
have a solution for this at the moment, at least fix the current
workaround.

Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2021-10-16 15:29:54 +08:00
Frieder Schrempf
256a24eba7 arm64: dts: imx8mm-kontron: Set lower limit of VDD_SNVS to 800 mV
According to the datasheet the typical value for VDD_SNVS should be
800 mV, so let's make sure that this is within the range of the
regulator.

Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2021-10-16 15:29:51 +08:00
Frieder Schrempf
82a4f329b1 arm64: dts: imx8mm-kontron: Make sure SOC and DRAM supply voltages are correct
It looks like the voltages for the SOC and DRAM supply weren't properly
validated before. The datasheet and uboot-imx code tells us that VDD_SOC
should be 800 mV in suspend and 850 mV in run mode. VDD_DRAM should be
950 mV for DDR clock frequencies of up to 1.5 GHz.

Let's fix these values to make sure the voltages are within the required
range.

Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2021-10-16 15:29:34 +08:00
Stefano Garzarella
ba95a6225b vsock_diag_test: remove free_sock_stat() call in test_no_sockets
In `test_no_sockets` we don't expect any sockets, indeed
check_no_sockets() prints an error and exits if `sockets` list is
not empty, so free_sock_stat() call is unnecessary since it would
only be called when the `sockets` list is empty.

This was discovered by a strange warning printed by gcc v11.2.1:
  In file included from ../../include/linux/list.h:7,
                   from vsock_diag_test.c:18:
  vsock_diag_test.c: In function ‘test_no_sockets’:
  ../../include/linux/kernel.h:35:45: error: array subscript ‘struct vsock_stat[0]’ is partly outside array bound
  s of ‘struct list_head[1]’ [-Werror=array-bounds]
     35 |         const typeof(((type *)0)->member) * __mptr = (ptr);     \
        |                                             ^~~~~~
  ../../include/linux/list.h:352:9: note: in expansion of macro ‘container_of’
    352 |         container_of(ptr, type, member)
        |         ^~~~~~~~~~~~
  ../../include/linux/list.h:393:9: note: in expansion of macro ‘list_entry’
    393 |         list_entry((pos)->member.next, typeof(*(pos)), member)
        |         ^~~~~~~~~~
  ../../include/linux/list.h:522:21: note: in expansion of macro ‘list_next_entry’
    522 |                 n = list_next_entry(pos, member);                       \
        |                     ^~~~~~~~~~~~~~~
  vsock_diag_test.c:325:9: note: in expansion of macro ‘list_for_each_entry_safe’
    325 |         list_for_each_entry_safe(st, next, sockets, list) {
        |         ^~~~~~~~~~~~~~~~~~~~~~~~
  In file included from vsock_diag_test.c:18:
  vsock_diag_test.c:333:19: note: while referencing ‘sockets’
    333 |         LIST_HEAD(sockets);
        |                   ^~~~~~~
  ../../include/linux/list.h:23:26: note: in definition of macro ‘LIST_HEAD’
     23 |         struct list_head name = LIST_HEAD_INIT(name)

It seems related to some compiler optimization and assumption
about the empty `sockets` list, since this warning is printed
only with -02 or -O3. Also removing `exit(1)` from
check_no_sockets() makes the warning disappear since in that
case free_sock_stat() can be reached also when the list is
not empty.

Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211014152045.173872-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-15 17:21:34 -07:00
Jakub Kicinski
2151135a1f Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-10-14

Brett ensures RDMA nodes are removed during release and rebuild. He also
corrects fw.mgmt.api to include the patch number for proper
identification.

Dave stops ida_free() being called when an IDA has not been allocated.

Michal corrects the order of parameters being provided and the number of
entries skipped for UDP tunnels.
====================

Link: https://lore.kernel.org/r/20211014181953.3538330-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-15 17:20:16 -07:00
Stephen Suryaputra
0857d6f8c7 ipv6: When forwarding count rx stats on the orig netdev
Commit bdb7cc643f ("ipv6: Count interface receive statistics on the
ingress netdev") does not work when ip6_forward() executes on the skbs
with vrf-enslaved netdev. Use IP6CB(skb)->iif to get to the right one.

Add a selftest script to verify.

Fixes: bdb7cc643f ("ipv6: Count interface receive statistics on the ingress netdev")
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211014130845.410602-1-ssuryaextr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-15 15:32:04 -07:00
Takashi Iwai
eadeb06e76 Merge tag 'asoc-fix-v5.15-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.15

A colletion of smallish mostly driver specific fixes, the biggest thing
here is fixing some of the core code to generate change notifications
properly when writing to controls which will fix issues with UIs not
showing the correct values.

There's one build fix here with a slightly misleading changelog saying
it's adding IRQ config support, it's adding a missing select of the
regmap-irq code rather than adding a feature.
2021-10-15 17:43:46 +02:00
Ralph Boehme
7a33488705 ksmbd: validate credit charge after validating SMB2 PDU body size
smb2_validate_credit_charge() accesses fields in the SMB2 PDU body,
but until smb2_calc_size() is called the PDU has not yet been verified
to be large enough to access the PDU dynamic part length field.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-15 09:18:29 -05:00
Hyunchul Lee
2ea086e35c ksmbd: add buffer validation for smb direct
Add buffer validation for smb direct.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-15 09:18:29 -05:00
Namjae Jeon
4bc59477c3 ksmbd: limit read/write/trans buffer size not to exceed 8MB
ksmbd limit read/write/trans buffer size not to exceed maximum 8MB.
And set the minimum value of max response buffer size to 64KB.
Windows client doesn't send session setup request if ksmbd set max
trans/read/write size lower than 64KB in smb2 negotiate.
It means windows allow at least 64 KB or more about this value.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-15 09:18:29 -05:00
David S. Miller
4884ddba7f Merge branch 'tcp-md5-vrf-fix'
Leonard Crestez says:

====================
tcp: md5: Fix overlap between vrf and non-vrf keys

With net.ipv4.tcp_l3mdev_accept=1 it is possible for a listen socket to
accept connection from the same client address in different VRFs. It is
also possible to set different MD5 keys for these clients which differ only
in the tcpm_l3index field.

This appears to work when distinguishing between different VRFs but not
between non-VRF and VRF connections. In particular:

* tcp_md5_do_lookup_exact will match a non-vrf key against a vrf key. This
means that adding a key with l3index != 0 after a key with l3index == 0
will cause the earlier key to be deleted. Both keys can be present if the
non-vrf key is added later.
* _tcp_md5_do_lookup can match a non-vrf key before a vrf key. This casues
failures if the passwords differ.

This can be fixed by making tcp_md5_do_lookup_exact perform an actual exact
comparison on l3index and by making  __tcp_md5_do_lookup perfer vrf-bound
keys above other considerations like prefixlen.

The fact that keys with l3index==0 affect VRF connections is usually not
desirable, VRFs are meant to be completely independent. This behavior needs
to preserved for backwards compatibility. Also, applications can just bind
listen sockets to VRF and never specify TCP_MD5SIG_FLAG_IFINDEX at all.

So far the combination of TCP_MD5SIG_FLAG_IFINDEX with tcpm_ifindex == 0
was an error, accept this to mean "key only applies to default VRF". This
is what applications using VRFs for traffic separation want.

This also contains tests for the second part. It does not contain tests for
overlapping keys, that would require more changes in nettest to add
multiple keys. These scenarios are also covered by my tests for TCP-AO,
especially around this area:
https://github.com/cdleonard/tcp-authopt-test/blob/main/tcp_authopt_test/test_vrf_bind.py

Changes since V2:
* Rename --do-bind-key-ifindex to --force-bind-key-ifindex
* Fix referencing TCP_MD5SIG_FLAG_IFINDEX as TCP_MD5SIG_IFINDEX
Link to v2: https://lore.kernel.org/netdev/cover.1634107317.git.cdleonard@gmail.com/

Changes since V1:
* Accept (TCP_MD5SIG_IFINDEX with tcpm_ifindex == 0)
* Add flags for explicitly including or excluding TCP_MD5SIG_FLAG_IFINDEX
to nettest
* Add few more tests in fcnal-test.sh.
Link to v1: https://lore.kernel.org/netdev/3d8387d499f053dba5cd9184c0f7b8445c4470c6.1633542093.git.cdleonard@gmail.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 14:36:57 +01:00
Leonard Crestez
64e4017778 selftests: net/fcnal: Test --{force,no}-bind-key-ifindex
Test that applications binding listening sockets to VRFs without
specifying TCP_MD5SIG_FLAG_IFINDEX will work as expected. This would
be broken if __tcp_md5_do_lookup always made a strict comparison on
l3index. See this email:

https://lore.kernel.org/netdev/209548b5-27d2-2059-f2e9-2148f5a0291b@gmail.com/

Applications using tcp_l3mdev_accept=1 and a single global socket (not
bound to any interface) also should have a way to specify keys that are
only for the default VRF, this is done by --force-bind-key-ifindex
without otherwise binding to a device.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 14:36:57 +01:00
Leonard Crestez
78a9cf6143 selftests: nettest: Add --{force,no}-bind-key-ifindex
These options allow explicit control over the TCP_MD5SIG_FLAG_IFINDEX
flag instead of always setting it based on binding to an interface.

Do this by converting to getopt_long because nettest has too many
single-character flags already and getopt_long is widely used in
selftests.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 14:36:57 +01:00
Leonard Crestez
a76c2315be tcp: md5: Allow MD5SIG_FLAG_IFINDEX with ifindex=0
Multiple VRFs are generally meant to be "separate" but right now md5
keys for the default VRF also affect connections inside VRFs if the IP
addresses happen to overlap.

So far the combination of TCP_MD5SIG_FLAG_IFINDEX with tcpm_ifindex == 0
was an error, accept this to mean "key only applies to default VRF".
This is what applications using VRFs for traffic separation want.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 14:36:57 +01:00
Leonard Crestez
86f1e3a848 tcp: md5: Fix overlap between vrf and non-vrf keys
With net.ipv4.tcp_l3mdev_accept=1 it is possible for a listen socket to
accept connection from the same client address in different VRFs. It is
also possible to set different MD5 keys for these clients which differ
only in the tcpm_l3index field.

This appears to work when distinguishing between different VRFs but not
between non-VRF and VRF connections. In particular:

 * tcp_md5_do_lookup_exact will match a non-vrf key against a vrf key.
This means that adding a key with l3index != 0 after a key with l3index
== 0 will cause the earlier key to be deleted. Both keys can be present
if the non-vrf key is added later.
 * _tcp_md5_do_lookup can match a non-vrf key before a vrf key. This
casues failures if the passwords differ.

Fix this by making tcp_md5_do_lookup_exact perform an actual exact
comparison on l3index and by making  __tcp_md5_do_lookup perfer
vrf-bound keys above other considerations like prefixlen.

Fixes: dea53bb80e ("tcp: Add l3index to tcp_md5sig_key and md5 functions")
Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 14:36:57 +01:00
Vegard Nossum
46393d61a3 lan78xx: select CRC32
Fix the following build/link error by adding a dependency on the CRC32
routines:

  ld: drivers/net/usb/lan78xx.o: in function `lan78xx_set_multicast':
  lan78xx.c:(.text+0x48cf): undefined reference to `crc32_le'

The actual use of crc32_le() comes indirectly through ether_crc().

Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 14:34:35 +01:00
Xin Long
075718fdaf sctp: fix transport encap_port update in sctp_vtag_verify
transport encap_port update should be updated when sctp_vtag_verify()
succeeds, namely, returns 1, not returns 0. Correct it in this patch.

While at it, also fix the indentation.

Fixes: a1dd2cf2f1 ("sctp: allow changing transport encap_port by peer packets")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 11:21:10 +01:00
Kele Huang
c2402d43d1 ptp: fix error print of ptp_kvm on X86_64 platform
Commit a86ed2cfa1 ("ptp: Don't print an error if ptp_kvm is not supported")
fixes the error message print on ARM platform by only concerning about
the case that the error returned from kvm_arch_ptp_init() is not -EOPNOTSUPP.
Although the ARM platform returns -EOPNOTSUPP if ptp_kvm is not supported
while X86_64 platform returns -KVM_EOPNOTSUPP, both error codes share the
same value 95.

Actually kvm_arch_ptp_init() on X86_64 platform can return three kinds of
errors (-KVM_ENOSYS, -KVM_EOPNOTSUPP and -KVM_EFAULT). The problem is that
-KVM_EOPNOTSUPP is masked out and -KVM_EFAULT is ignored among them.
This patch fixes this by returning them to ptp_kvm_init() respectively.

Signed-off-by: Kele Huang <huangkele@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-15 11:19:25 +01:00
Paolo Bonzini
e2b6d941ec Merge tag 'kvmarm-fixes-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.15, take #2

- Properly refcount pages used as a concatenated stage-2 PGD
- Fix missing unlock when detecting the use of MTE+VM_SHARED
2021-10-15 04:47:55 -04:00
Paolo Bonzini
019057bd73 KVM: SEV-ES: fix length of string I/O
The size of the data in the scratch buffer is not divided by the size of
each port I/O operation, so vcpu->arch.pio.count ends up being larger
than it should be by a factor of size.

Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-15 04:47:36 -04:00
Davide Baldo
d94befbb5a ALSA: hda/realtek: Fixes HP Spectre x360 15-eb1xxx speakers
In laptop 'HP Spectre x360 Convertible 15-eb1xxx/8811' both front and
rear speakers are silent, this patch fixes that by overriding the pin
layout and by initializing the amplifier which needs a GPIO pin to be
set to 1 then 0, similar to the existing HP Spectre x360 14 model.

In order to have volume control, both front and rear speakers were
forced to use the DAC1.

This patch also correctly map the mute LED but since there is no
microphone on/off switch exposed by the alsa subsystem it never turns
on by itself.

There are still known audio issues in this laptop: headset microphone
doesn't work, the button to mute/unmute microphone is not yet mapped,
the LED of the mute/unmute speakers doesn't seems to be exposed via
GPIO and never turns on.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213953
Signed-off-by: Davide Baldo <davide@baldo.me>
Link: https://lore.kernel.org/r/20211015072121.5287-1-davide@baldo.me
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-15 09:41:49 +02:00
Brendan Grieve
3c414eb65c ALSA: usb-audio: Provide quirk for Sennheiser GSP670 Headset
As per discussion at: https://github.com/szszoke/sennheiser-gsp670-pulseaudio-profile/issues/13

The GSP670 has 2 playback and 1 recording device that by default are
detected in an incompatible order for alsa. This may have been done to make
it compatible for the console by the manufacturer and only affects the
latest firmware which uses its own ID.

This quirk will resolve this by reordering the channels.

Signed-off-by: Brendan Grieve <brendan@grieve.com.au>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211015025335.196592-1-brendan@grieve.com.au
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-15 09:27:41 +02:00
Florian Westphal
3e6ed7703d selftests: netfilter: remove stray bash debug line
This should not be there.

Fixes: 2de03b4523 ("selftests: netfilter: add flowtable test script")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-14 23:08:35 +02:00
Antoine Tenart
174c376278 netfilter: ipvs: make global sysctl readonly in non-init netns
Because the data pointer of net/ipv4/vs/debug_level is not updated per
netns, it must be marked as read-only in non-init netns.

Fixes: c6d2d445d8 ("IPVS: netns, final patch enabling network name space.")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-14 23:08:35 +02:00
Xin Long
a482c5e00a netfilter: ip6t_rt: fix rt0_hdr parsing in rt_mt6
In rt_mt6(), when it's a nonlinear skb, the 1st skb_header_pointer()
only copies sizeof(struct ipv6_rt_hdr) to _route that rh points to.
The access by ((const struct rt0_hdr *)rh)->reserved will overflow
the buffer. So this access should be moved below the 2nd call to
skb_header_pointer().

Besides, after the 2nd skb_header_pointer(), its return value should
also be checked, othersize, *rp may cause null-pointer-ref.

v1->v2:
  - clean up some old debugging log.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-14 23:08:35 +02:00
Brett Creeley
b726ddf984 ice: Print the api_patch as part of the fw.mgmt.api
Currently when a user uses "devlink dev info", the fw.mgmt.api will be
the major.minor numbers as shown below:

devlink dev info pci/0000:3b:00.0
pci/0000:3b:00.0:
  driver ice
  serial_number 00-01-00-ff-ff-00-00-00
  versions:
      fixed:
        board.id K91258-000
      running:
        fw.mgmt 6.1.2
        fw.mgmt.api 1.7 <--- No patch number included
        fw.mgmt.build 0xd75e7d06
        fw.mgmt.srev 5
        fw.undi 1.2992.0
        fw.undi.srev 5
        fw.psid.api 3.10
        fw.bundle_id 0x800085cc
        fw.app.name ICE OS Default Package
        fw.app 1.3.27.0
        fw.app.bundle_id 0xc0000001
        fw.netlist 3.10.2000-3.1e.0
        fw.netlist.build 0x2a76e110
      stored:
        fw.mgmt.srev 5
        fw.undi 1.2992.0
        fw.undi.srev 5
        fw.psid.api 3.10
        fw.bundle_id 0x800085cc
        fw.netlist 3.10.2000-3.1e.0
        fw.netlist.build 0x2a76e110

There are many features in the driver that depend on the major, minor,
and patch version of the FW. Without the patch number in the output for
fw.mgmt.api debugging issues related to the FW API version is difficult.
Also, using major.minor.patch aligns with the existing firmware version
which uses a 3 digit value.

Fix this by making the fw.mgmt.api print the major.minor.patch
versions. Shown below is the result:

devlink dev info pci/0000:3b:00.0
pci/0000:3b:00.0:
  driver ice
  serial_number 00-01-00-ff-ff-00-00-00
  versions:
      fixed:
        board.id K91258-000
      running:
        fw.mgmt 6.1.2
        fw.mgmt.api 1.7.9 <--- patch number included
        fw.mgmt.build 0xd75e7d06
        fw.mgmt.srev 5
        fw.undi 1.2992.0
        fw.undi.srev 5
        fw.psid.api 3.10
        fw.bundle_id 0x800085cc
        fw.app.name ICE OS Default Package
        fw.app 1.3.27.0
        fw.app.bundle_id 0xc0000001
        fw.netlist 3.10.2000-3.1e.0
        fw.netlist.build 0x2a76e110
      stored:
        fw.mgmt.srev 5
        fw.undi 1.2992.0
        fw.undi.srev 5
        fw.psid.api 3.10
        fw.bundle_id 0x800085cc
        fw.netlist 3.10.2000-3.1e.0
        fw.netlist.build 0x2a76e110

Fixes: ff2e5c700e ("ice: add basic handler for devlink .info_get")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-14 10:14:46 -07:00
Michal Swiatkowski
e4c2efa139 ice: fix getting UDP tunnel entry
Correct parameters order in call to ice_tunnel_idx_to_entry function.

Entry in sparse port table is correct when the idx is 0. For idx 1 one
correct entry should be skipped, for idx 2 two of them should be skipped
etc. Change if condition to be true when idx is 0, which means that
previous valid entry of this tunnel type were skipped.

Fixes: b20e6c17c4 ("ice: convert to new udp_tunnel infrastructure")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-14 10:14:45 -07:00
Dave Ertman
73e30a62b1 ice: Avoid crash from unnecessary IDA free
In the remove path, there is an attempt to free the aux_idx IDA whether
it was allocated or not.  This can potentially cause a crash when
unloading the driver on systems that do not initialize support for RDMA.
But, this free cannot be gated by the status bit for RDMA, since it is
allocated if the driver detects support for RDMA at probe time, but the
driver can enter into a state where RDMA is not supported after the IDA
has been allocated at probe time and this would lead to a memory leak.

Initialize aux_idx to an invalid value and check for a valid value when
unloading to determine if an IDA free is necessary.

Fixes: d25a0fc41c ("ice: Initialize RDMA support")
Reported-by: Jun Miao <jun.miao@windriver.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-14 10:14:45 -07:00
Brett Creeley
ff7e932194 ice: Fix failure to re-add LAN/RDMA Tx queues
Currently if the VSI is rebuilt/removed and the RDMA PF driver is active
the RDMA Tx queue scheduler node configuration will not be cleaned up.
This will cause the rebuild/re-add of the VSI to fail due to the
software structures not being correctly cleaned up for the VSI index.
Fix this by always calling ice_rm_vsi_rdma_cfg() for all VSI. If there
are no RDMA scheduler nodes created, then there is no harm in calling
ice_rm_vsi_rdma_cfg(). This change applies to all VSI types, so if
RDMA support is added for other VSI types they will also get this
change.

Fixes: 348048e724 ("ice: Implement iidc operations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Jerzy Wiktor Jurkowski <jerzy.wiktor.jurkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-14 10:14:00 -07:00
Steven Clarkson
aef454b402 ALSA: hda/realtek: Add quirk for Clevo PC50HS
Apply existing PCI quirk to the Clevo PC50HS and related models to fix
audio output on the built in speakers.

Signed-off-by: Steven Clarkson <sc@lambdal.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211014133554.1326741-1-sc@lambdal.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-14 15:41:42 +02:00
Greg Kroah-Hartman
22390ce786 ALSA: usb-audio: add Schiit Hel device to quirk table
The Shciit Hel device responds to the ctl message for the mic capture
switch with a timeout of -EPIPE:

	usb 7-2.2: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x1100, type = 1
	usb 7-2.2: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x1100, type = 1
	usb 7-2.2: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x1100, type = 1
	usb 7-2.2: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x1100, type = 1

This seems safe to ignore as the device works properly with the control
message quirk, so add it to the quirk table so all is good.

Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: alsa-devel@alsa-project.org
Cc: linux-usb@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/YWgR3nOI1osvr5Yo@kroah.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-14 14:39:06 +02:00
Andy Shevchenko
6ab4e2eb5e mmc: sdhci-pci: Read card detect from ACPI for Intel Merrifield
Intel Merrifield platform had been converted to use ACPI enumeration.
However, the driver missed an update to retrieve card detect GPIO.
Fix it here.

Unfortunately we can't rely on CD GPIO state because there are two
different PCB designs in the wild that are using the opposite card
detection sense and there is no way to distinguish those platforms,
that's why ignore CD GPIO completely and use it only as an event.

Fixes: 4590d98f5a ("sfi: Remove framework for deprecated firmware")
BugLink: https://github.com/edison-fw/meta-intel-edison/issues/135
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211013201723.52212-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-14 13:20:17 +02:00
Namjae Jeon
dbad63001e ksmbd: validate compound response buffer
Add the check to validate compound response buffer.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-13 23:37:19 -05:00
Namjae Jeon
9a63b999ae ksmbd: fix potencial 32bit overflow from data area check in smb2_write
DataOffset and Length validation can be potencial 32bit overflow.
This patch fix it.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-13 23:37:19 -05:00
Hyunchul Lee
bf8acc9e10 ksmbd: improve credits management
* Requests except READ, WRITE, IOCTL, INFO, QUERY
DIRECOTRY, CANCEL must consume one credit.
* If client's granted credits are insufficient,
refuse to handle requests.
* Windows server 2016 or later grant up to 8192
credits to clients at once.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-13 23:37:19 -05:00
Namjae Jeon
f7db8fd03a ksmbd: add validation in smb2_ioctl
Add validation for request/response buffer size check in smb2_ioctl and
fsctl_copychunk() take copychunk_ioctl_req pointer and the other arguments
instead of smb2_ioctl_req structure and remove an unused smb2_ioctl_req
argument of fsctl_validate_negotiate_info.

Cc: Tom Talpey <tom@talpey.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-13 23:37:18 -05:00
Fabien Dessenne
c370bb4740 pinctrl: stm32: use valid pin identifier in stm32_pinctrl_resume()
When resuming from low power, the driver attempts to restore the
configuration of some pins. This is done by a call to:
  stm32_pinctrl_restore_gpio_regs(struct stm32_pinctrl *pctl, u32 pin)
where 'pin' must be a valid pin value (i.e. matching some 'groups->pin').
Fix the current implementation which uses some wrong 'pin' value.

Fixes: e2f3cf18c3 ("pinctrl: stm32: add suspend/resume management")
Signed-off-by: Fabien Dessenne <fabien.dessenne@foss.st.com>
Link: https://lore.kernel.org/r/20211008122517.617633-1-fabien.dessenne@foss.st.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-10-14 01:16:12 +02:00
Rafał Miłecki
6dba4bdfd7 Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
This reverts commit a49d784d5a.

The updated binding was wrong / invalid and has been reverted. There
isn't any upstream kernel DTS using it and Broadcom isn't known to use
it neither. There is close to zero chance this will cause regression for
anyone.

Actually in-kernel bcm5301x.dtsi still uses the old good binding and so
it's broken since the driver update. This revert fixes it.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Link: https://lore.kernel.org/r/20211008205938.29925-3-zajec5@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-10-14 01:09:07 +02:00
Rafał Miłecki
1d0a779892 dt-bindings: pinctrl: brcm,ns-pinmux: drop unneeded CRU from example
There is no need to include CRU in example of this binding. It wasn't
complete / correct anyway. The proper binding can be find in the
mfd/brcm,cru.yaml .

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211008205938.29925-2-zajec5@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-10-14 01:09:07 +02:00
Rafał Miłecki
0398adaec3 Revert "dt-bindings: pinctrl: bcm4708-pinmux: rework binding to use syscon"
This reverts commit 2ae80900f2.

My rework was unneeded & wrong. It replaced a clear & correct "reg"
property usage with a custom "offset" one.

Back then I didn't understand how to properly handle CRU block binding.
I heard / read about syscon and tried to use it in a totally invalid
way. That change also missed Rob's review (obviously).

Northstar's pin controller is a simple consistent hardware block that
can be cleanly mapped using a 0x24 long reg space.

Since the rework commit there wasn't any follow up modifying in-kernel
DTS files to use the new binding. Broadcom also isn't known to use that
bugged binding. There is close to zero chance this revert may actually
cause problems / regressions.

This commit is a simple revert. Example binding may (should) be updated
/ cleaned up but that can be handled separately.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211008205938.29925-1-zajec5@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-10-14 01:09:06 +02:00
Dan Carpenter
663991f328 RDMA/rdmavt: Fix error code in rvt_create_qp()
Return negative -ENOMEM instead of positive ENOMEM.  Returning a postive
value will cause an Oops because it becomes an ERR_PTR() in the
create_qp() function.

Fixes: 514aee660d ("RDMA: Globally allocate and release QP memory")
Link: https://lore.kernel.org/r/20211013080645.GD6010@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-13 13:59:47 -03:00
Mike Marciniszyn
13bac86195 IB/hfi1: Fix abba locking issue with sc_disable()
sc_disable() after having disabled the send context wakes up any waiters
by calling hfi1_qp_wakeup() while holding the waitlock for the sc.

This is contrary to the model for all other calls to hfi1_qp_wakeup()
where the waitlock is dropped and a local is used to drive calls to
hfi1_qp_wakeup().

Fix by moving the sc->piowait into a local list and driving the wakeup
calls from the list.

Fixes: 099a884ba4 ("IB/hfi1: Handle wakeup of orphaned QPs for pio")
Link: https://lore.kernel.org/r/20211013141852.128104.2682.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-13 13:33:22 -03:00
Mike Marciniszyn
d39bf40e55 IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
Overflowing either addrlimit or bytes_togo can allow userspace to trigger
a buffer overflow of kernel memory. Check for overflows in all the places
doing math on user controlled buffers.

Fixes: f931551baf ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters")
Link: https://lore.kernel.org/r/20211012175519.7298.77738.stgit@awfm-01.cornelisnetworks.com
Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-13 13:26:04 -03:00
Shengjiu Wang
6b9b546dc0 ASoC: wm8960: Fix clock configuration on slave mode
There is a noise issue for 8kHz sample rate on slave mode.
Compared with master mode, the difference is the DACDIV
setting, after correcting the DACDIV, the noise is gone.

There is no noise issue for 48kHz sample rate, because
the default value of DACDIV is correct for 48kHz.

So wm8960_configure_clocking() should be functional for
ADC and DAC function even if it is slave mode.

In order to be compatible for old use case, just add
condition for checking that sysclk is zero with
slave mode.

Fixes: 0e50b51aa2 ("ASoC: wm8960: Let wm8960 driver configure its bit clock and frame clock")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/1634102224-3922-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-13 16:25:33 +01:00
Ming Lei
f2b85040ac scsi: core: Put LLD module refcnt after SCSI device is released
SCSI host release is triggered when SCSI device is freed. We have to make
sure that the low-level device driver module won't be unloaded before SCSI
host instance is released because shost->hostt is required in the release
handler.

Make sure to put LLD module refcnt after SCSI device is released.

Fixes a kernel panic of 'BUG: unable to handle page fault for address'
reported by Changhui and Yi.

Link: https://lore.kernel.org/r/20211008050118.1440686-1-ming.lei@redhat.com
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: Changhui Zhong <czhong@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12 22:08:23 -04:00
Andrea Parri (Microsoft)
6fd13d699d scsi: storvsc: Fix validation for unsolicited incoming packets
The validation on the length of incoming packets performed in
storvsc_on_channel_callback() does not apply to unsolicited packets with ID
of 0 sent by Hyper-V.  Adjust the validation for such unsolicited packets.

Link: https://lore.kernel.org/r/20211007122828.469289-1-parri.andrea@gmail.com
Fixes: 91b1b640b8 ("scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback()")
Reported-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12 21:59:50 -04:00
Mike Christie
187a580c9e scsi: iscsi: Fix set_param() handling
In commit 9e67600ed6 ("scsi: iscsi: Fix race condition between login and
sync thread") we meant to add a check where before we call ->set_param() we
make sure the iscsi_cls_connection is bound. The problem is that between
versions 4 and 5 of the patch the deletion of the unchecked set_param()
call was dropped so we ended up with 2 calls. As a result we can still hit
a crash where we access the unbound connection on the first call.

This patch removes that first call.

Fixes: 9e67600ed6 ("scsi: iscsi: Fix race condition between login and sync thread")
Link: https://lore.kernel.org/r/20211010161904.60471-1-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Li Feng <fengli@smartx.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12 12:51:25 -04:00
Dexuan Cui
50b6cb3516 scsi: core: Fix shost->cmd_per_lun calculation in scsi_add_host_with_dma()
After commit ea2f0f7753 ("scsi: core: Cap scsi_host cmd_per_lun at
can_queue"), a 416-CPU VM running on Hyper-V hangs during boot because the
hv_storvsc driver sets scsi_driver.can_queue to an integer value that
exceeds SHRT_MAX, and hence scsi_add_host_with_dma() sets
shost->cmd_per_lun to a negative "short" value.

Use min_t(int, ...) to work around the issue.

Link: https://lore.kernel.org/r/20211008043546.6006-1-decui@microsoft.com
Fixes: ea2f0f7753 ("scsi: core: Cap scsi_host cmd_per_lun at can_queue")
Cc: stable@vger.kernel.org
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12 12:37:15 -04:00
Yang Yingliang
55e6d80378 regmap: Fix possible double-free in regcache_rbtree_exit()
In regcache_rbtree_insert_to_block(), when 'present' realloc failed,
the 'blk' which is supposed to assign to 'rbnode->block' will be freed,
so 'rbnode->block' points a freed memory, in the error handling path of
regcache_rbtree_init(), 'rbnode->block' will be freed again in
regcache_rbtree_exit(), KASAN will report double-free as follows:

BUG: KASAN: double-free or invalid-free in kfree+0xce/0x390
Call Trace:
 slab_free_freelist_hook+0x10d/0x240
 kfree+0xce/0x390
 regcache_rbtree_exit+0x15d/0x1a0
 regcache_rbtree_init+0x224/0x2c0
 regcache_init+0x88d/0x1310
 __regmap_init+0x3151/0x4a80
 __devm_regmap_init+0x7d/0x100
 madera_spi_probe+0x10f/0x333 [madera_spi]
 spi_probe+0x183/0x210
 really_probe+0x285/0xc30

To fix this, moving up the assignment of rbnode->block to immediately after
the reallocation has succeeded so that the data structure stays valid even
if the second reallocation fails.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 3f4ff561bc ("regmap: rbtree: Make cache_present bitmap per node")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211012023735.1632786-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-12 11:48:43 +01:00
Stefan Binding
aa18457c4a ASoC: cs42l42: Ensure 0dB full scale volume is used for headsets
Ensure the default 0dB playback path is always used.

The code that set FULL_SCALE_VOL based on LOAD_DET_RCSTAT was
spurious, and resulted in a -6dB attenuation being accidentally
inserted into the playback path.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20211011144903.28915-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-12 11:48:39 +01:00
Shawn Guo
4217d07b9f mmc: sdhci: Map more voltage level to SDHCI_POWER_330
On Thundercomm TurboX CM2290, the eMMC OCR reports vdd = 23 (3.5 ~ 3.6 V),
which is being treated as an invalid value by sdhci_set_power_noreg().
And thus eMMC is totally broken on the platform.

[    1.436599] ------------[ cut here ]------------
[    1.436606] mmc0: Invalid vdd 0x17
[    1.436640] WARNING: CPU: 2 PID: 69 at drivers/mmc/host/sdhci.c:2048 sdhci_set_power_noreg+0x168/0x2b4
[    1.436655] Modules linked in:
[    1.436662] CPU: 2 PID: 69 Comm: kworker/u8:1 Tainted: G        W         5.15.0-rc1+ #137
[    1.436669] Hardware name: Thundercomm TurboX CM2290 (DT)
[    1.436674] Workqueue: events_unbound async_run_entry_fn
[    1.436685] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.436692] pc : sdhci_set_power_noreg+0x168/0x2b4
[    1.436698] lr : sdhci_set_power_noreg+0x168/0x2b4
[    1.436703] sp : ffff800010803a60
[    1.436705] x29: ffff800010803a60 x28: ffff6a9102465f00 x27: ffff6a9101720a70
[    1.436715] x26: ffff6a91014de1c0 x25: ffff6a91014de010 x24: ffff6a91016af280
[    1.436724] x23: ffffaf7b1b276640 x22: 0000000000000000 x21: ffff6a9101720000
[    1.436733] x20: ffff6a9101720370 x19: ffff6a9101720580 x18: 0000000000000020
[    1.436743] x17: 0000000000000000 x16: 0000000000000004 x15: ffffffffffffffff
[    1.436751] x14: 0000000000000000 x13: 00000000fffffffd x12: ffffaf7b1b84b0bc
[    1.436760] x11: ffffaf7b1b720d10 x10: 000000000000000a x9 : ffff800010803a60
[    1.436769] x8 : 000000000000000a x7 : 000000000000000f x6 : 00000000fffff159
[    1.436778] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[    1.436787] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6a9101718d80
[    1.436797] Call trace:
[    1.436800]  sdhci_set_power_noreg+0x168/0x2b4
[    1.436805]  sdhci_set_ios+0xa0/0x7fc
[    1.436811]  mmc_power_up.part.0+0xc4/0x164
[    1.436818]  mmc_start_host+0xa0/0xb0
[    1.436824]  mmc_add_host+0x60/0x90
[    1.436830]  __sdhci_add_host+0x174/0x330
[    1.436836]  sdhci_msm_probe+0x7c0/0x920
[    1.436842]  platform_probe+0x68/0xe0
[    1.436850]  really_probe.part.0+0x9c/0x31c
[    1.436857]  __driver_probe_device+0x98/0x144
[    1.436863]  driver_probe_device+0xc8/0x15c
[    1.436869]  __device_attach_driver+0xb4/0x120
[    1.436875]  bus_for_each_drv+0x78/0xd0
[    1.436881]  __device_attach_async_helper+0xac/0xd0
[    1.436888]  async_run_entry_fn+0x34/0x110
[    1.436895]  process_one_work+0x1d0/0x354
[    1.436903]  worker_thread+0x13c/0x470
[    1.436910]  kthread+0x150/0x160
[    1.436915]  ret_from_fork+0x10/0x20
[    1.436923] ---[ end trace fcfac44cb045c3a8 ]---

Fix the issue by mapping MMC_VDD_35_36 (and MMC_VDD_34_35) to
SDHCI_POWER_330 as well.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211004024935.15326-1-shawn.guo@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-12 10:20:58 +02:00
Florian Westphal
465f15a6d1 selftests: nft_nat: add udp hole punch test case
Add a test case that demonstrates port shadowing via UDP.

ns2 sends packet to ns1, from source port used by a udp service on the
router, ns0.  Then, ns1 sends packet to ns0:service, but that ends up getting
forwarded to ns2.

Also add three test cases that demonstrate mitigations:
1. disable use of $port as source from 'unstrusted' origin
2. make the service untracked.  This prevents masquerade entries
   from having any effects.
3. add forced PAT via 'random' mode to translate the "wrong" sport
   into an acceptable range.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-12 01:42:39 +02:00
Yang Yingliang
c448b7aa3e ASoC: soc-core: fix null-ptr-deref in snd_soc_del_component_unlocked()
'component' is allocated in snd_soc_register_component(), but component->list
is not initalized, this may cause snd_soc_del_component_unlocked() deref null
ptr in the error handing case.

KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:__list_del_entry_valid+0x81/0xf0
Call Trace:
 snd_soc_del_component_unlocked+0x69/0x1b0 [snd_soc_core]
 snd_soc_add_component.cold+0x54/0x6c [snd_soc_core]
 snd_soc_register_component+0x70/0x90 [snd_soc_core]
 devm_snd_soc_register_component+0x5e/0xd0 [snd_soc_core]
 tas2552_probe+0x265/0x320 [snd_soc_tas2552]
 ? tas2552_component_probe+0x1e0/0x1e0 [snd_soc_tas2552]
 i2c_device_probe+0xa31/0xbe0

Fix by adding INIT_LIST_HEAD() to snd_soc_component_initialize().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211009065840.3196239-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-11 13:19:14 +01:00
Gerald Schaefer
293d92cbbd dma-debug: fix sg checks in debug_dma_map_sg()
The following warning occurred sporadically on s390:
DMA-API: nvme 0006:00:00.0: device driver maps memory from kernel text or rodata [addr=0000000048cc5e2f] [len=131072]
WARNING: CPU: 4 PID: 825 at kernel/dma/debug.c:1083 check_for_illegal_area+0xa8/0x138

It is a false-positive warning, due to broken logic in debug_dma_map_sg().
check_for_illegal_area() checks for overlay of sg elements with kernel text
or rodata. It is called with sg_dma_len(s) instead of s->length as
parameter. After the call to ->map_sg(), sg_dma_len() will contain the
length of possibly combined sg elements in the DMA address space, and not
the individual sg element length, which would be s->length.

The check will then use the physical start address of an sg element, and
add the DMA length for the overlap check, which could result in the false
warning, because the DMA length can be larger than the actual single sg
element length.

In addition, the call to check_for_illegal_area() happens in the iteration
over mapped_ents, which will not include all individual sg elements if
any of them were combined in ->map_sg().

Fix this by using s->length instead of sg_dma_len(s). Also put the call to
check_for_illegal_area() in a separate loop, iterating over all the
individual sg elements ("nents" instead of "mapped_ents").

While at it, as suggested by Robin Murphy, also move check_for_stack()
inside the new loop, as it is similarly concerned with validating the
individual sg elements.

Link: https://lore.kernel.org/lkml/20210705185252.4074653-1-gerald.schaefer@linux.ibm.com
Fixes: 884d05970b ("dma-debug: use sg_dma_len accessor")
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-11 13:43:31 +02:00
Logan Gunthorpe
011a9ce807 dma-mapping: fix the kerneldoc for dma_map_sgtable()
htmldocs began producing the following warnings:

  kernel/dma/mapping.c:256: WARNING: Definition list ends without a
             blank line; unexpected unindent.
  kernel/dma/mapping.c:257: WARNING: Bullet list ends without a blank
             line; unexpected unindent.

Reformatting the list without hyphens fixes the warnings and produces
both a readable text and HTML output.

Fixes: fffe3cc8c2 ("dma-mapping: allow map_sg() ops to return negative error code")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-11 13:43:04 +02:00
Florian Westphal
68a3765c65 netfilter: nf_tables: skip netdev events generated on netns removal
syzbot reported following (harmless) WARN:

 WARNING: CPU: 1 PID: 2648 at net/netfilter/core.c:468
  nft_netdev_unregister_hooks net/netfilter/nf_tables_api.c:230 [inline]
  nf_tables_unregister_hook include/net/netfilter/nf_tables.h:1090 [inline]
  __nft_release_basechain+0x138/0x640 net/netfilter/nf_tables_api.c:9524
  nft_netdev_event net/netfilter/nft_chain_filter.c:351 [inline]
  nf_tables_netdev_event+0x521/0x8a0 net/netfilter/nft_chain_filter.c:382

reproducer:
unshare -n bash -c 'ip link add br0 type bridge; nft add table netdev t ; \
 nft add chain netdev t ingress \{ type filter hook ingress device "br0" \
 priority 0\; policy drop\; \}'

Problem is that when netns device exit hooks create the UNREGISTER
event, the .pre_exit hook for nf_tables core has already removed the
base hook.  Notifier attempts to do this again.

The need to do base hook unregister unconditionally was needed in the past,
because notifier was last stage where reg->dev dereference was safe.

Now that nf_tables does the hook removal in .pre_exit, this isn't
needed anymore.

Reported-and-tested-by: syzbot+154bd5be532a63aa778b@syzkaller.appspotmail.com
Fixes: 767d1216bf ("netfilter: nftables: fix possible UAF over chains from packet path in netns")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-07 19:37:38 +02:00
Vegard Nossum
77076934af netfilter: Kconfig: use 'default y' instead of 'm' for bool config option
This option, NF_CONNTRACK_SECMARK, is a bool, so it can never be 'm'.

Fixes: 33b8e77605 ("[NETFILTER]: Add CONFIG_NETFILTER_ADVANCED option")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-07 19:37:25 +02:00
Juhee Kang
902c0b1887 netfilter: xt_IDLETIMER: fix panic that occurs when timer_type has garbage value
Currently, when the rule related to IDLETIMER is added, idletimer_tg timer
structure is initialized by kmalloc on executing idletimer_tg_create
function. However, in this process timer->timer_type is not defined to
a specific value. Thus, timer->timer_type has garbage value and it occurs
kernel panic. So, this commit fixes the panic by initializing
timer->timer_type using kzalloc instead of kmalloc.

Test commands:
    # iptables -A OUTPUT -j IDLETIMER --timeout 1 --label test
    $ cat /sys/class/xt_idletimer/timers/test
      Killed

Splat looks like:
    BUG: KASAN: user-memory-access in alarm_expires_remaining+0x49/0x70
    Read of size 8 at addr 0000002e8c7bc4c8 by task cat/917
    CPU: 12 PID: 917 Comm: cat Not tainted 5.14.0+ #3 79940a339f71eb14fc81aee1757a20d5bf13eb0e
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
    Call Trace:
     dump_stack_lvl+0x6e/0x9c
     kasan_report.cold+0x112/0x117
     ? alarm_expires_remaining+0x49/0x70
     __asan_load8+0x86/0xb0
     alarm_expires_remaining+0x49/0x70
     idletimer_tg_show+0xe5/0x19b [xt_IDLETIMER 11219304af9316a21bee5ba9d58f76a6b9bccc6d]
     dev_attr_show+0x3c/0x60
     sysfs_kf_seq_show+0x11d/0x1f0
     ? device_remove_bin_file+0x20/0x20
     kernfs_seq_show+0xa4/0xb0
     seq_read_iter+0x29c/0x750
     kernfs_fop_read_iter+0x25a/0x2c0
     ? __fsnotify_parent+0x3d1/0x570
     ? iov_iter_init+0x70/0x90
     new_sync_read+0x2a7/0x3d0
     ? __x64_sys_llseek+0x230/0x230
     ? rw_verify_area+0x81/0x150
     vfs_read+0x17b/0x240
     ksys_read+0xd9/0x180
     ? vfs_write+0x460/0x460
     ? do_syscall_64+0x16/0xc0
     ? lockdep_hardirqs_on+0x79/0x120
     __x64_sys_read+0x43/0x50
     do_syscall_64+0x3b/0xc0
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x7f0cdc819142
    Code: c0 e9 c2 fe ff ff 50 48 8d 3d 3a ca 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
    RSP: 002b:00007fff28eee5b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
    RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f0cdc819142
    RDX: 0000000000020000 RSI: 00007f0cdc032000 RDI: 0000000000000003
    RBP: 00007f0cdc032000 R08: 00007f0cdc031010 R09: 0000000000000000
    R10: 0000000000000022 R11: 0000000000000246 R12: 00005607e9ee31f0
    R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000

Fixes: 68983a354a ("netfilter: xtables: Add snapshot of hardidletimer target")
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-07 19:35:57 +02:00
Miguel Bernal Marin
136f282028 ACPI: tools: fix compilation error
When ACPI tools are compiled, the following error is showed:

   $ cd tools/power/acpi
   $ make
     DESCEND tools/acpidbg
     MKDIR    include
     CP       include
     CC       tools/acpidbg/acpidbg.o
   In file included from /home/linux/tools/power/acpi/include/acpi/platform/acenv.h:152,
                    from /home/linux/tools/power/acpi/include/acpi/acpi.h:22,
                    from acpidbg.c:9:
   /home/linux/tools/power/acpi/include/acpi/platform/acgcc.h:25:10: fatal error: linux/stdarg.h: No such file or directory
      29 | #include <linux/stdarg.h>
         |          ^~~~~~~~~~~~~~~~
   compilation terminated.

Use the ACPICA logic: just identify when it is used inside the kernel
or by an ACPI tool.

Fixes: c0891ac15f ("isystem: ship and use stdarg.h")
Signed-off-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-10-07 19:18:19 +02:00
Srinivasa Rao Mandadapu
214174d9f5 ASoC: codec: wcd938x: Add irq config support
This patch fixes compilation error in wcd98x codec driver.

Fixes: 0454422288 ("ASoC: codecs: wcd938x: add audio routing and Kconfig")

Signed-off-by: Venkata Prasad Potturu <potturu@codeaurora.org>
Signed-off-by: Srinivasa Rao Mandadapu <srivasam@codeaurora.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/1633614675-27122-1-git-send-email-srivasam@codeaurora.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-07 15:45:14 +01:00
Takashi Iwai
5af82c81b2 ASoC: DAPM: Fix missing kctl change notifications
The put callback of a kcontrol is supposed to return 1 when the value
is changed, and this will be notified to user-space.  However, some
DAPM kcontrols always return 0 (except for errors), hence the
user-space misses the update of a control value.

This patch corrects the behavior by properly returning 1 when the
value gets updated.

Reported-and-tested-by: Hans de Goede <hdegoede@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20211006141712.2439-1-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-07 15:45:12 +01:00
Andy Shevchenko
c25d4546ca ASoC: Intel: bytcht_es8316: Utilize dev_err_probe() to avoid log saturation
dev_err_probe() avoids printing into log when the deferred probe is invoked.
This is possible when clock provider is pending to appear.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211006150428.16434-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-07 15:45:11 +01:00
Andy Shevchenko
10f4a96543 ASoC: Intel: bytcht_es8316: Switch to use gpiod_get_optional()
First of all, replace indexed API by plain one since we have index 0.
Second, switch to optional variant and drop duplicated code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211006150428.16434-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-07 15:45:10 +01:00
Andy Shevchenko
6f32c52106 ASoC: Intel: bytcht_es8316: Use temporary variable for struct device
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211006150428.16434-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-07 15:45:09 +01:00
Andy Shevchenko
2577b868a4 ASoC: Intel: bytcht_es8316: Get platform data via dev_get_platdata()
Access to platform data via dev_get_platdata() getter to make code cleaner.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.com>
Link: https://lore.kernel.org/r/20211006150428.16434-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-07 15:45:08 +01:00
Srinivasa Rao Mandadapu
db0767b8a6 ASoC: wcd938x: Fix jack detection issue
This patch is to fix audio 3.5mm jack detection failure
on wcd938x codec based target.

Fixes: bcee7ed09b (ASoC: codecs: wcd938x: add Multi Button Headset Control support)

Signed-off-by: Venkata Prasad Potturu <potturu@codeaurora.org>
Signed-off-by: Srinivasa Rao Mandadapu <srivasam@codeaurora.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/1633614619-27026-1-git-send-email-srivasam@codeaurora.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-07 15:45:07 +01:00
Patrisious Haddad
1ab52ac1e9 RDMA/mlx5: Set user priority for DCT
Currently, the driver doesn't set the PCP-based priority for DCT, hence
DCT response packets are transmitted without user priority.

Fix it by setting user provided priority in the eth_prio field in the DCT
context, which in turn sets the value in the transmitted packet.

Fixes: 776a3906b6 ("IB/mlx5: Add support for DC target QP")
Link: https://lore.kernel.org/r/5fd2d94a13f5742d8803c218927322257d53205c.1633512672.git.leonro@nvidia.com
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-06 16:39:52 -03:00
Shiraz Saleem
e93c7d8e8c RDMA/irdma: Process extended CQ entries correctly
The valid bit for extended CQE's written by HW is retrieved from the
incorrect quad-word. This leads to missed completions for any UD traffic
particularly after a wrap-around.

Get the valid bit for extended CQE's from the correct quad-word in the
descriptor.

Fixes: 551c46edc7 ("RDMA/irdma: Add user/kernel shared libraries")
Link: https://lore.kernel.org/r/20211005182302.374-1-shiraz.saleem@intel.com
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-06 16:38:07 -03:00
Quentin Perret
6e6a8ef088 KVM: arm64: Release mmap_lock when using VM_SHARED with MTE
VM_SHARED mappings are currently forbidden in a memslot with MTE to
prevent two VMs racing to sanitise the same page. However, this check
is performed while holding current->mm's mmap_lock, but fails to release
it. Fix this by releasing the lock when needed.

Fixes: ea7fc1bb1c ("KVM: arm64: Introduce MTE VM feature")
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005122031.809857-1-qperret@google.com
2021-10-05 13:22:45 +01:00
Quentin Perret
7615c2a514 KVM: arm64: Report corrupted refcount at EL2
Some of the refcount manipulation helpers used at EL2 are instrumented
to catch a corrupted state, but not all of them are treated equally. Let's
make things more consistent by instrumenting hyp_page_ref_dec_and_test()
as well.

Acked-by: Will Deacon <will@kernel.org>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005090155.734578-6-qperret@google.com
2021-10-05 13:02:54 +01:00
Quentin Perret
1d58a17ef5 KVM: arm64: Fix host stage-2 PGD refcount
The KVM page-table library refcounts the pages of concatenated stage-2
PGDs individually. However, when running KVM in protected mode, the
host's stage-2 PGD is currently managed by EL2 as a single high-order
compound page, which can cause the refcount of the tail pages to reach 0
when they shouldn't, hence corrupting the page-table.

Fix this by introducing a new hyp_split_page() helper in the EL2 page
allocator (matching the kernel's split_page() function), and make use of
it from host_s2_zalloc_pages_exact().

Fixes: 1025c8c0c6 ("KVM: arm64: Wrap the host with a stage 2")
Acked-by: Will Deacon <will@kernel.org>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005090155.734578-5-qperret@google.com
2021-10-05 13:02:54 +01:00
Paweł Anikiel
3ad60b4b35 reset: socfpga: add empty driver allowing consumers to probe
The early reset driver doesn't ever probe, which causes consuming
devices to be unable to probe. Add an empty driver to set this device
as available, allowing consumers to probe.

Signed-off-by: Paweł Anikiel <pan@semihalf.com>
Link: https://lore.kernel.org/r/20210920124141.1166544-4-pan@semihalf.com
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2021-10-05 12:23:16 +02:00
Mikko Perttunen
c045ceb5a1 reset: tegra-bpmp: Handle errors in BPMP response
The return value from tegra_bpmp_transfer indicates the success or
failure of the IPC transaction with BPMP. If the transaction
succeeded, we also need to check the actual command's result code.
Add code to do this.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Link: https://lore.kernel.org/r/20210915085517.1669675-2-mperttunen@nvidia.com
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2021-10-05 10:55:18 +02:00
Geert Uytterhoeven
4af160707d reset: pistachio: Re-enable driver selection
After the retirement of MACH_PISTACHIO, the Pistachio Reset Driver is no
longer auto-enabled when building a kernel for Pistachio systems.
Worse, the driver cannot be enabled by the user at all (unless
compile-testing), as the config symbol is invisible.

Fix this partially by making the symbol visible again when compiling for
MIPS, and dropping the useless default.  The user still has to enable
the driver manually when building a kernel for Pistachio systems,
though.

Fixes: 104f942b28 ("MIPS: Retire MACH_PISTACHIO")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Rahul Bedarkar <rahulbedarkar89@gmail.com>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Link: https://lore.kernel.org/r/2c399e52540536df9c4006e46ef93fbccdde88db.1631610825.git.geert+renesas@glider.be
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2021-10-05 10:49:40 +02:00
Jim Quinlan
f33eb7f29c reset: brcmstb-rescal: fix incorrect polarity of status bit
The readl_poll_timeout() should complete when the status bit
is a 1, not 0.

Fixes: 4cf176e523 ("reset: Add Broadcom STB RESCAL reset controller")
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210914221122.62315-1-f.fainelli@gmail.com
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2021-10-05 10:48:56 +02:00
Paolo Bonzini
2353e593a1 Merge tag 'kvm-s390-master-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master
KVM: s390: allow to compile without warning with W=1
2021-10-04 03:58:25 -04:00
Johannes Berg
a2083eeb11 cfg80211: scan: fix RCU in cfg80211_add_nontrans_list()
The SSID pointer is pointing to RCU protected data, so we
need to have it under rcu_read_lock() for the entire use.
Fix this.

Cc: stable@vger.kernel.org
Fixes: 0b8fb8235b ("cfg80211: Parsing of Multiple BSSID information in scanning")
Link: https://lore.kernel.org/r/20210930131120.6ddfc603aa1d.I2137344c4e2426525b1a8e4ce5fca82f8ecbfe7e@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-01 11:02:27 +02:00
Johannes Berg
636707e593 mac80211: mesh: fix HE operation element length check
The length check here was bad, if the length doesn't at
least include the length of the fixed part, we cannot
call ieee80211_he_oper_size() to determine the total.
Fix this, and convert to cfg80211_find_ext_elem() while
at it.

Cc: stable@vger.kernel.org
Fixes: 70debba3ab ("mac80211: save HE oper info in BSS config for mesh")
Link: https://lore.kernel.org/r/20210930131120.b0f940976c56.I954e1be55e9f87cc303165bff5c906afe1e54648@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-01 11:02:24 +02:00
Hans de Goede
42871e95a3 ASoC: nau8824: Fix headphone vs headset, button-press detection no longer working
Commit 1d25684e22 ("ASoC: nau8824: Fix open coded prefix handling")
replaced the nau8824_dapm_enable_pin() helper with direct calls to
snd_soc_dapm_enable_pin(), but the helper was using
snd_soc_dapm_force_enable_pin() and not forcing the MICBIAS + SAR
supplies on breaks headphone vs headset and button-press detection.

Replace the snd_soc_dapm_enable_pin() calls with
snd_soc_dapm_force_enable_pin() to fix this.

Cc: stable@vger.kernel.org
Fixes: 1d25684e22 ("ASoC: nau8824: Fix open coded prefix handling")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210929201512.460360-1-hdegoede@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-09-30 13:25:55 +01:00
Janosch Frank
25b5476a29 KVM: s390: Function documentation fixes
The latest compile changes pointed us to a few instances where we use
the kernel documentation style but don't explain all variables or
don't adhere to it 100%.

It's easy to fix so let's do that.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-09-28 17:56:54 +02:00
Mark Brown
0cc3687ead ASoC: cs4341: Add SPI device ID table
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding SPI IDs for parts that
only have a compatible listed.

Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: patches@opensource.cirrus.com
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20210924194844.45974-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-09-27 12:59:09 +01:00
Mark Brown
ceef3240f9 ASoC: pcm179x: Add missing entries SPI to device ID table
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding SPI IDs for parts that
only have a compatible listed.

Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210924194956.46079-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-09-27 12:59:08 +01:00
Bastien Roucariès
55dd7e0590 ARM: dts: sun7i: A20-olinuxino-lime2: Fix ethernet phy-mode
Commit bbc4d71d63 ("net: phy: realtek: fix rtl8211e rx/tx delay
config") sets the RX/TX delay according to the phy-mode property in the
device tree. For the A20-olinuxino-lime2 board this is "rgmii", which is the
wrong setting.

Following the example of a900cac375 ("ARM: dts: sun7i: a20: bananapro:
Fix ethernet phy-mode") the phy-mode is changed to "rgmii-id" which gets
the Ethernet working again on this board.

Signed-off-by: Bastien Roucariès <rouca@debian.org>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20210916081721.237137-1-rouca@debian.org
2021-09-24 09:29:00 +02:00
Shengjiu Wang
74b7ee0e7b ASoC: fsl_xcvr: Fix channel swap issue with ARC
With pause and resume test for ARC, there is occasionally
channel swap issue. The reason is that currently driver set
the DPATH out of reset first, then start the DMA, the first
data got from FIFO may not be the Left channel.

Moving DPATH out of reset operation after the dma enablement
to fix this issue.

Fixes: 2856448686 ("ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1631265510-27384-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-09-21 13:23:36 +01:00
Peter Rosin
3f4b57ad07 ASoC: pcm512x: Mend accesses to the I2S_1 and I2S_2 registers
Commit 25d27c4f68 ("ASoC: pcm512x: Add support for more data formats")
breaks the TSE-850 device, which is using a pcm5142 in I2S and
CBM_CFS mode (maybe not relevant). Without this fix, the result
is:

pcm512x 0-004c: Failed to set data format: -16

And after that, no sound.

This fix is not 100% correct. The datasheet of at least the pcm5142
states that four bits (0xcc) in the I2S_1 register are "RSV"
("Reserved. Do not access.") and no hint is given as to what the
initial values are supposed to be. So, specifying defaults for
these bits is wrong. But perhaps better than a broken driver?

Fixes: 25d27c4f68 ("ASoC: pcm512x: Add support for more data formats")
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Kirill Marinushkin <kmarinushkin@birdec.com>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: alsa-devel@alsa-project.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Link: https://lore.kernel.org/r/2d221984-7a2e-7006-0f8a-ffb5f64ee885@axentia.se
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-09-21 13:23:34 +01:00
Clément Bœsch
0764e365da arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node
RX and TX delay are provided by ethernet PHY. Reflect that in ethernet
node.

Fixes: 44a94c7ef9 ("arm64: dts: allwinner: H5: Restore EMAC changes")
Signed-off-by: Clément Bœsch <u@pkh.me>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20210905002027.171984-1-u@pkh.me
2021-09-13 09:05:01 +02:00
398 changed files with 3764 additions and 2229 deletions

View File

@@ -33,6 +33,8 @@ Al Viro <viro@zenIV.linux.org.uk>
Andi Kleen <ak@linux.intel.com> <ak@suse.de>
Andi Shyti <andi@etezian.org> <andi.shyti@samsung.com>
Andreas Herrmann <aherrman@de.ibm.com>
Andrej Shadura <andrew.shadura@collabora.co.uk>
Andrej Shadura <andrew@shadura.me> <andrew@beldisplaytech.com>
Andrew Morton <akpm@linux-foundation.org>
Andrew Murray <amurray@thegoodpenguin.co.uk> <amurray@embedded-bits.co.uk>
Andrew Murray <amurray@thegoodpenguin.co.uk> <andrew.murray@arm.com>

View File

@@ -32,13 +32,13 @@ properties:
"#size-cells":
const: 1
pinctrl:
$ref: ../pinctrl/brcm,ns-pinmux.yaml
patternProperties:
'^clock-controller@[a-f0-9]+$':
$ref: ../clock/brcm,iproc-clocks.yaml
'^pin-controller@[a-f0-9]+$':
$ref: ../pinctrl/brcm,ns-pinmux.yaml
'^thermal@[a-f0-9]+$':
$ref: ../thermal/brcm,ns-thermal.yaml
@@ -73,9 +73,10 @@ examples:
"iprocfast", "sata1", "sata2";
};
pinctrl {
pin-controller@1c0 {
compatible = "brcm,bcm4708-pinmux";
offset = <0x1c0>;
reg = <0x1c0 0x24>;
reg-names = "cru_gpio_control";
};
thermal@2c0 {

View File

@@ -17,9 +17,6 @@ description:
A list of pins varies across chipsets so few bindings are available.
Node of the pinmux must be nested in the CRU (Central Resource Unit) "syscon"
node.
properties:
compatible:
enum:
@@ -27,10 +24,11 @@ properties:
- brcm,bcm4709-pinmux
- brcm,bcm53012-pinmux
offset:
description: offset of pin registers in the CRU block
reg:
maxItems: 1
$ref: /schemas/types.yaml#/definitions/uint32-array
reg-names:
const: cru_gpio_control
patternProperties:
'-pins$':
@@ -72,23 +70,20 @@ allOf:
uart1_grp ]
required:
- offset
- reg
- reg-names
additionalProperties: false
examples:
- |
cru@1800c100 {
compatible = "syscon", "simple-mfd";
reg = <0x1800c100 0x1a4>;
pin-controller@1800c1c0 {
compatible = "brcm,bcm4708-pinmux";
reg = <0x1800c1c0 0x24>;
reg-names = "cru_gpio_control";
pinctrl {
compatible = "brcm,bcm4708-pinmux";
offset = <0xc0>;
spi-pins {
function = "spi";
groups = "spi_grp";
};
spi-pins {
function = "spi";
groups = "spi_grp";
};
};

View File

@@ -30,10 +30,11 @@ The ``ice`` driver reports the following versions
PHY, link, etc.
* - ``fw.mgmt.api``
- running
- 1.5
- 2-digit version number of the API exported over the AdminQ by the
management firmware. Used by the driver to identify what commands
are supported.
- 1.5.1
- 3-digit version number (major.minor.patch) of the API exported over
the AdminQ by the management firmware. Used by the driver to
identify what commands are supported. Historical versions of the
kernel only displayed a 2-digit version number (major.minor).
* - ``fw.mgmt.build``
- running
- 0x305d955f

View File

@@ -59,11 +59,11 @@ specified with a ``sockaddr`` type, with a single-byte endpoint address:
};
struct sockaddr_mctp {
unsigned short int smctp_family;
int smctp_network;
struct mctp_addr smctp_addr;
__u8 smctp_type;
__u8 smctp_tag;
__kernel_sa_family_t smctp_family;
unsigned int smctp_network;
struct mctp_addr smctp_addr;
__u8 smctp_type;
__u8 smctp_tag;
};
#define MCTP_NET_ANY 0x0

View File

@@ -104,6 +104,7 @@ Code Seq# Include File Comments
'8' all SNP8023 advanced NIC card
<mailto:mcr@solidum.com>
';' 64-7F linux/vfio.h
'=' 00-3f uapi/linux/ptp_clock.h <mailto:richardcochran@gmail.com>
'@' 00-0F linux/radeonfb.h conflict!
'@' 00-0F drivers/video/aty/aty128fb.c conflict!
'A' 00-1F linux/apm_bios.h conflict!

View File

@@ -5458,6 +5458,19 @@ F: include/net/devlink.h
F: include/uapi/linux/devlink.h
F: net/core/devlink.c
DH ELECTRONICS IMX6 DHCOM BOARD SUPPORT
M: Christoph Niedermaier <cniedermaier@dh-electronics.com>
L: kernel@dh-electronics.com
S: Maintained
F: arch/arm/boot/dts/imx6*-dhcom-*
DH ELECTRONICS STM32MP1 DHCOM/DHCOR BOARD SUPPORT
M: Marek Vasut <marex@denx.de>
L: kernel@dh-electronics.com
S: Maintained
F: arch/arm/boot/dts/stm32mp1*-dhcom-*
F: arch/arm/boot/dts/stm32mp1*-dhcor-*
DIALOG SEMICONDUCTOR DRIVERS
M: Support Opensource <support.opensource@diasemi.com>
S: Supported
@@ -6147,8 +6160,7 @@ T: git git://anongit.freedesktop.org/drm/drm
F: Documentation/devicetree/bindings/display/
F: Documentation/devicetree/bindings/gpu/
F: Documentation/gpu/
F: drivers/gpu/drm/
F: drivers/gpu/vga/
F: drivers/gpu/
F: include/drm/
F: include/linux/vga*
F: include/uapi/drm/
@@ -11278,7 +11290,6 @@ F: Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
F: drivers/net/ethernet/marvell/octeontx2/af/
MARVELL PRESTERA ETHERNET SWITCH DRIVER
M: Vadym Kochan <vkochan@marvell.com>
M: Taras Chornyi <tchornyi@marvell.com>
S: Supported
W: https://github.com/Marvell-switching/switchdev-prestera
@@ -20336,6 +20347,7 @@ X86 ARCHITECTURE (32-BIT AND 64-BIT)
M: Thomas Gleixner <tglx@linutronix.de>
M: Ingo Molnar <mingo@redhat.com>
M: Borislav Petkov <bp@alien8.de>
M: Dave Hansen <dave.hansen@linux.intel.com>
M: x86@kernel.org
R: "H. Peter Anvin" <hpa@zytor.com>
L: linux-kernel@vger.kernel.org

View File

@@ -2,8 +2,8 @@
VERSION = 5
PATCHLEVEL = 15
SUBLEVEL = 0
EXTRAVERSION = -rc6
NAME = Opossums on Parade
EXTRAVERSION =
NAME = Trick or Treat
# *DOCUMENTATION*
# To see a list of typical targets execute "make help"

View File

@@ -92,6 +92,7 @@ config ARM
select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
select HAVE_FUNCTION_GRAPH_TRACER if !THUMB2_KERNEL && !CC_IS_CLANG
select HAVE_FUNCTION_TRACER if !XIP_KERNEL
select HAVE_FUTEX_CMPXCHG if FUTEX
select HAVE_GCC_PLUGINS
select HAVE_HW_BREAKPOINT if PERF_EVENTS && (CPU_V6 || CPU_V6K || CPU_V7)
select HAVE_IRQ_TIME_ACCOUNTING

View File

@@ -47,7 +47,10 @@ extern char * strchrnul(const char *, int);
#endif
#ifdef CONFIG_KERNEL_XZ
/* Prevent KASAN override of string helpers in decompressor */
#undef memmove
#define memmove memmove
#undef memcpy
#define memcpy memcpy
#include "../../../../lib/decompress_unxz.c"
#endif

View File

@@ -112,7 +112,7 @@
pinctrl-names = "default";
pinctrl-0 = <&gmac_rgmii_pins>;
phy-handle = <&phy1>;
phy-mode = "rgmii";
phy-mode = "rgmii-id";
status = "okay";
};

View File

@@ -176,6 +176,7 @@ extern int __get_user_64t_4(void *);
register unsigned long __l asm("r1") = __limit; \
register int __e asm("r0"); \
unsigned int __ua_flags = uaccess_save_and_enable(); \
int __tmp_e; \
switch (sizeof(*(__p))) { \
case 1: \
if (sizeof((x)) >= 8) \
@@ -203,9 +204,10 @@ extern int __get_user_64t_4(void *);
break; \
default: __e = __get_user_bad(); break; \
} \
__tmp_e = __e; \
uaccess_restore(__ua_flags); \
x = (typeof(*(p))) __r2; \
__e; \
__tmp_e; \
})
#define get_user(x, p) \

View File

@@ -253,7 +253,7 @@ __create_page_tables:
add r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_ORDER)
ldr r6, =(_end - 1)
adr_l r5, kernel_sec_start @ _pa(kernel_sec_start)
#ifdef CONFIG_CPU_ENDIAN_BE8
#if defined CONFIG_CPU_ENDIAN_BE8 || defined CONFIG_CPU_ENDIAN_BE32
str r8, [r5, #4] @ Save physical start of kernel (BE)
#else
str r8, [r5] @ Save physical start of kernel (LE)
@@ -266,7 +266,7 @@ __create_page_tables:
bls 1b
eor r3, r3, r7 @ Remove the MMU flags
adr_l r5, kernel_sec_end @ _pa(kernel_sec_end)
#ifdef CONFIG_CPU_ENDIAN_BE8
#if defined CONFIG_CPU_ENDIAN_BE8 || defined CONFIG_CPU_ENDIAN_BE32
str r3, [r5, #4] @ Save physical end of kernel (BE)
#else
str r3, [r5] @ Save physical end of kernel (LE)

View File

@@ -136,7 +136,7 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
for (p = first, i = 0; i < 8 && p < top; i++, p += 4) {
if (p >= bottom && p < top) {
unsigned long val;
if (get_kernel_nofault(val, (unsigned long *)p))
if (!get_kernel_nofault(val, (unsigned long *)p))
sprintf(str + i * 9, " %08lx", val);
else
sprintf(str + i * 9, " ????????");

View File

@@ -40,6 +40,10 @@ SECTIONS
ARM_DISCARD
*(.alt.smp.init)
*(.pv_table)
#ifndef CONFIG_ARM_UNWIND
*(.ARM.exidx) *(.ARM.exidx.*)
*(.ARM.extab) *(.ARM.extab.*)
#endif
}
. = XIP_VIRT_ADDR(CONFIG_XIP_PHYS_ADDR);
@@ -172,7 +176,7 @@ ASSERT((__arch_info_end - __arch_info_begin), "no machine record defined")
ASSERT((_end - __bss_start) >= 12288, ".bss too small for CONFIG_XIP_DEFLATED_DATA")
#endif
#ifdef CONFIG_ARM_MPU
#if defined(CONFIG_ARM_MPU) && !defined(CONFIG_COMPILE_TEST)
/*
* Due to PMSAv7 restriction on base address and size we have to
* enforce minimal alignment restrictions. It was seen that weaker

View File

@@ -340,6 +340,7 @@ ENTRY(\name\()_cache_fns)
.macro define_tlb_functions name:req, flags_up:req, flags_smp
.type \name\()_tlb_fns, #object
.align 2
ENTRY(\name\()_tlb_fns)
.long \name\()_flush_user_tlb_range
.long \name\()_flush_kern_tlb_range

View File

@@ -439,7 +439,7 @@ static struct undef_hook kprobes_arm_break_hook = {
#endif /* !CONFIG_THUMB2_KERNEL */
int __init arch_init_kprobes()
int __init arch_init_kprobes(void)
{
arm_probes_decode_init();
#ifdef CONFIG_THUMB2_KERNEL

View File

@@ -75,7 +75,7 @@
pinctrl-0 = <&emac_rgmii_pins>;
phy-supply = <&reg_gmac_3v3>;
phy-handle = <&ext_rgmii_phy>;
phy-mode = "rgmii";
phy-mode = "rgmii-id";
status = "okay";
};

View File

@@ -70,7 +70,9 @@
regulator-name = "rst-usb-eth2";
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usb_eth2>;
gpio = <&gpio3 2 GPIO_ACTIVE_LOW>;
gpio = <&gpio3 2 GPIO_ACTIVE_HIGH>;
enable-active-high;
regulator-always-on;
};
reg_vdd_5v: regulator-5v {
@@ -95,7 +97,7 @@
clocks = <&osc_can>;
interrupt-parent = <&gpio4>;
interrupts = <28 IRQ_TYPE_EDGE_FALLING>;
spi-max-frequency = <100000>;
spi-max-frequency = <10000000>;
vdd-supply = <&reg_vdd_3v3>;
xceiver-supply = <&reg_vdd_5v>;
};
@@ -111,7 +113,7 @@
&fec1 {
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_enet>;
phy-connection-type = "rgmii";
phy-connection-type = "rgmii-rxid";
phy-handle = <&ethphy>;
status = "okay";

View File

@@ -91,10 +91,12 @@
reg_vdd_soc: BUCK1 {
regulator-name = "buck1";
regulator-min-microvolt = <800000>;
regulator-max-microvolt = <900000>;
regulator-max-microvolt = <850000>;
regulator-boot-on;
regulator-always-on;
regulator-ramp-delay = <3125>;
nxp,dvs-run-voltage = <850000>;
nxp,dvs-standby-voltage = <800000>;
};
reg_vdd_arm: BUCK2 {
@@ -111,7 +113,7 @@
reg_vdd_dram: BUCK3 {
regulator-name = "buck3";
regulator-min-microvolt = <850000>;
regulator-max-microvolt = <900000>;
regulator-max-microvolt = <950000>;
regulator-boot-on;
regulator-always-on;
};
@@ -150,7 +152,7 @@
reg_vdd_snvs: LDO2 {
regulator-name = "ldo2";
regulator-min-microvolt = <850000>;
regulator-min-microvolt = <800000>;
regulator-max-microvolt = <900000>;
regulator-boot-on;
regulator-always-on;

View File

@@ -2590,9 +2590,10 @@
power-domains = <&dispcc MDSS_GDSC>;
clocks = <&dispcc DISP_CC_MDSS_AHB_CLK>,
<&gcc GCC_DISP_HF_AXI_CLK>,
<&gcc GCC_DISP_SF_AXI_CLK>,
<&dispcc DISP_CC_MDSS_MDP_CLK>;
clock-names = "iface", "nrt_bus", "core";
clock-names = "iface", "bus", "nrt_bus", "core";
assigned-clocks = <&dispcc DISP_CC_MDSS_MDP_CLK>;
assigned-clock-rates = <460000000>;

View File

@@ -24,6 +24,7 @@ struct hyp_pool {
/* Allocation */
void *hyp_alloc_pages(struct hyp_pool *pool, unsigned short order);
void hyp_split_page(struct hyp_page *page);
void hyp_get_page(struct hyp_pool *pool, void *addr);
void hyp_put_page(struct hyp_pool *pool, void *addr);

View File

@@ -35,7 +35,18 @@ const u8 pkvm_hyp_id = 1;
static void *host_s2_zalloc_pages_exact(size_t size)
{
return hyp_alloc_pages(&host_s2_pool, get_order(size));
void *addr = hyp_alloc_pages(&host_s2_pool, get_order(size));
hyp_split_page(hyp_virt_to_page(addr));
/*
* The size of concatenated PGDs is always a power of two of PAGE_SIZE,
* so there should be no need to free any of the tail pages to make the
* allocation exact.
*/
WARN_ON(size != (PAGE_SIZE << get_order(size)));
return addr;
}
static void *host_s2_zalloc_page(void *pool)

View File

@@ -152,6 +152,7 @@ static inline void hyp_page_ref_inc(struct hyp_page *p)
static inline int hyp_page_ref_dec_and_test(struct hyp_page *p)
{
BUG_ON(!p->refcount);
p->refcount--;
return (p->refcount == 0);
}
@@ -193,6 +194,20 @@ void hyp_get_page(struct hyp_pool *pool, void *addr)
hyp_spin_unlock(&pool->lock);
}
void hyp_split_page(struct hyp_page *p)
{
unsigned short order = p->order;
unsigned int i;
p->order = 0;
for (i = 1; i < (1 << order); i++) {
struct hyp_page *tail = p + i;
tail->order = 0;
hyp_set_page_refcounted(tail);
}
}
void *hyp_alloc_pages(struct hyp_pool *pool, unsigned short order)
{
unsigned short i = order;

View File

@@ -1529,8 +1529,10 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
* when updating the PG_mte_tagged page flag, see
* sanitise_mte_tags for more details.
*/
if (kvm_has_mte(kvm) && vma->vm_flags & VM_SHARED)
return -EINVAL;
if (kvm_has_mte(kvm) && vma->vm_flags & VM_SHARED) {
ret = -EINVAL;
break;
}
if (vma->vm_flags & VM_PFNMAP) {
/* IO region dirty page logging not allowed */

View File

@@ -1136,6 +1136,11 @@ out:
return prog;
}
u64 bpf_jit_alloc_exec_limit(void)
{
return BPF_JIT_REGION_SIZE;
}
void *bpf_jit_alloc_exec(unsigned long size)
{
return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,

View File

@@ -6,7 +6,7 @@
#ifndef CONFIG_DYNAMIC_FTRACE
extern void (*ftrace_trace_function)(unsigned long, unsigned long,
struct ftrace_ops*, struct pt_regs*);
struct ftrace_ops*, struct ftrace_regs*);
extern void ftrace_graph_caller(void);
noinline void __naked ftrace_stub(unsigned long ip, unsigned long parent_ip,

View File

@@ -9,7 +9,7 @@
static inline unsigned long arch_local_save_flags(void)
{
return RDCTL(CTL_STATUS);
return RDCTL(CTL_FSTATUS);
}
/*
@@ -18,7 +18,7 @@ static inline unsigned long arch_local_save_flags(void)
*/
static inline void arch_local_irq_restore(unsigned long flags)
{
WRCTL(CTL_STATUS, flags);
WRCTL(CTL_FSTATUS, flags);
}
static inline void arch_local_irq_disable(void)

View File

@@ -11,7 +11,7 @@
#endif
/* control register numbers */
#define CTL_STATUS 0
#define CTL_FSTATUS 0
#define CTL_ESTATUS 1
#define CTL_BSTATUS 2
#define CTL_IENABLE 3

View File

@@ -37,6 +37,7 @@ config NIOS2_DTB_PHYS_ADDR
config NIOS2_DTB_SOURCE_BOOL
bool "Compile and link device tree into kernel image"
depends on !COMPILE_TEST
help
This allows you to specify a dts (device tree source) file
which will be compiled and linked into the kernel image.

View File

@@ -126,14 +126,16 @@ _GLOBAL(idle_return_gpr_loss)
/*
* This is the sequence required to execute idle instructions, as
* specified in ISA v2.07 (and earlier). MSR[IR] and MSR[DR] must be 0.
*
* The 0(r1) slot is used to save r2 in isa206, so use that here.
* We have to store a GPR somewhere, ptesync, then reload it, and create
* a false dependency on the result of the load. It doesn't matter which
* GPR we store, or where we store it. We have already stored r2 to the
* stack at -8(r1) in isa206_idle_insn_mayloss, so use that.
*/
#define IDLE_STATE_ENTER_SEQ_NORET(IDLE_INST) \
/* Magic NAP/SLEEP/WINKLE mode enter sequence */ \
std r2,0(r1); \
std r2,-8(r1); \
ptesync; \
ld r2,0(r1); \
ld r2,-8(r1); \
236: cmpd cr0,r2,r2; \
bne 236b; \
IDLE_INST; \

View File

@@ -1730,8 +1730,6 @@ void __cpu_die(unsigned int cpu)
void arch_cpu_idle_dead(void)
{
sched_preempt_enable_no_resched();
/*
* Disable on the down path. This will be re-enabled by
* start_secondary() via start_secondary_resume() below

View File

@@ -1302,6 +1302,12 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
struct property *default_win;
int reset_win_ext;
/* DDW + IOMMU on single window may fail if there is any allocation */
if (iommu_table_in_use(tbl)) {
dev_warn(&dev->dev, "current IOMMU table in use, can't be replaced.\n");
goto out_failed;
}
default_win = of_find_property(pdn, "ibm,dma-window", NULL);
if (!default_win)
goto out_failed;
@@ -1356,12 +1362,6 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
query.largest_available_block,
1ULL << page_shift);
/* DDW + IOMMU on single window may fail if there is any allocation */
if (default_win_removed && iommu_table_in_use(tbl)) {
dev_dbg(&dev->dev, "current IOMMU table in use, can't be replaced.\n");
goto out_failed;
}
len = order_base_2(query.largest_available_block << page_shift);
win_name = DMA64_PROPNAME;
} else {
@@ -1411,18 +1411,19 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
} else {
struct iommu_table *newtbl;
int i;
unsigned long start = 0, end = 0;
for (i = 0; i < ARRAY_SIZE(pci->phb->mem_resources); i++) {
const unsigned long mask = IORESOURCE_MEM_64 | IORESOURCE_MEM;
/* Look for MMIO32 */
if ((pci->phb->mem_resources[i].flags & mask) == IORESOURCE_MEM)
if ((pci->phb->mem_resources[i].flags & mask) == IORESOURCE_MEM) {
start = pci->phb->mem_resources[i].start;
end = pci->phb->mem_resources[i].end;
break;
}
}
if (i == ARRAY_SIZE(pci->phb->mem_resources))
goto out_del_list;
/* New table for using DDW instead of the default DMA window */
newtbl = iommu_pseries_alloc_table(pci->phb->node);
if (!newtbl) {
@@ -1432,15 +1433,15 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
iommu_table_setparms_common(newtbl, pci->phb->bus->number, create.liobn, win_addr,
1UL << len, page_shift, NULL, &iommu_table_lpar_multi_ops);
iommu_init_table(newtbl, pci->phb->node, pci->phb->mem_resources[i].start,
pci->phb->mem_resources[i].end);
iommu_init_table(newtbl, pci->phb->node, start, end);
pci->table_group->tables[1] = newtbl;
/* Keep default DMA window stuct if removed */
if (default_win_removed) {
tbl->it_size = 0;
kfree(tbl->it_map);
vfree(tbl->it_map);
tbl->it_map = NULL;
}
set_iommu_table_base(&dev->dev, newtbl);

View File

@@ -163,6 +163,12 @@ config PAGE_OFFSET
default 0xffffffff80000000 if 64BIT && MAXPHYSMEM_2GB
default 0xffffffe000000000 if 64BIT && MAXPHYSMEM_128GB
config KASAN_SHADOW_OFFSET
hex
depends on KASAN_GENERIC
default 0xdfffffc800000000 if 64BIT
default 0xffffffff if 32BIT
config ARCH_FLATMEM_ENABLE
def_bool !NUMA

View File

@@ -30,8 +30,7 @@
#define KASAN_SHADOW_SIZE (UL(1) << ((CONFIG_VA_BITS - 1) - KASAN_SHADOW_SCALE_SHIFT))
#define KASAN_SHADOW_START KERN_VIRT_START
#define KASAN_SHADOW_END (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
#define KASAN_SHADOW_OFFSET (KASAN_SHADOW_END - (1ULL << \
(64 - KASAN_SHADOW_SCALE_SHIFT)))
#define KASAN_SHADOW_OFFSET _AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
void kasan_init(void);
asmlinkage void kasan_early_init(void);

View File

@@ -193,6 +193,7 @@ setup_trap_vector:
csrw CSR_SCRATCH, zero
ret
.align 2
.Lsecondary_park:
/* We lack SMP support or have too many harts, so park this hart */
wfi

View File

@@ -17,6 +17,9 @@ asmlinkage void __init kasan_early_init(void)
uintptr_t i;
pgd_t *pgd = early_pg_dir + pgd_index(KASAN_SHADOW_START);
BUILD_BUG_ON(KASAN_SHADOW_OFFSET !=
KASAN_SHADOW_END - (1UL << (64 - KASAN_SHADOW_SCALE_SHIFT)));
for (i = 0; i < PTRS_PER_PTE; ++i)
set_pte(kasan_early_shadow_pte + i,
mk_pte(virt_to_page(kasan_early_shadow_page),
@@ -172,21 +175,10 @@ void __init kasan_init(void)
phys_addr_t p_start, p_end;
u64 i;
/*
* Populate all kernel virtual address space with kasan_early_shadow_page
* except for the linear mapping and the modules/kernel/BPF mapping.
*/
kasan_populate_early_shadow((void *)KASAN_SHADOW_START,
(void *)kasan_mem_to_shadow((void *)
VMEMMAP_END));
if (IS_ENABLED(CONFIG_KASAN_VMALLOC))
kasan_shallow_populate(
(void *)kasan_mem_to_shadow((void *)VMALLOC_START),
(void *)kasan_mem_to_shadow((void *)VMALLOC_END));
else
kasan_populate_early_shadow(
(void *)kasan_mem_to_shadow((void *)VMALLOC_START),
(void *)kasan_mem_to_shadow((void *)VMALLOC_END));
/* Populate the linear mapping */
for_each_mem_range(i, &p_start, &p_end) {

View File

@@ -125,7 +125,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
if (i == NR_JIT_ITERATIONS) {
pr_err("bpf-jit: image did not converge in <%d passes!\n", i);
bpf_jit_binary_free(jit_data->header);
if (jit_data->header)
bpf_jit_binary_free(jit_data->header);
prog = orig_prog;
goto out_offset;
}
@@ -166,6 +167,11 @@ out:
return prog;
}
u64 bpf_jit_alloc_exec_limit(void)
{
return BPF_JIT_REGION_SIZE;
}
void *bpf_jit_alloc_exec(unsigned long size)
{
return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,

View File

@@ -894,6 +894,11 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
/**
* guest_translate_address - translate guest logical into guest absolute address
* @vcpu: virtual cpu
* @gva: Guest virtual address
* @ar: Access register
* @gpa: Guest physical address
* @mode: Translation access mode
*
* Parameter semantics are the same as the ones from guest_translate.
* The memory contents at the guest address are not changed.
@@ -934,6 +939,11 @@ int guest_translate_address(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
/**
* check_gva_range - test a range of guest virtual addresses for accessibility
* @vcpu: virtual cpu
* @gva: Guest virtual address
* @ar: Access register
* @length: Length of test range
* @mode: Translation access mode
*/
int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
unsigned long length, enum gacc_mode mode)
@@ -956,6 +966,7 @@ int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
/**
* kvm_s390_check_low_addr_prot_real - check for low-address protection
* @vcpu: virtual cpu
* @gra: Guest real address
*
* Checks whether an address is subject to low-address protection and set
@@ -979,6 +990,7 @@ int kvm_s390_check_low_addr_prot_real(struct kvm_vcpu *vcpu, unsigned long gra)
* @pgt: pointer to the beginning of the page table for the given address if
* successful (return value 0), or to the first invalid DAT entry in
* case of exceptions (return value > 0)
* @dat_protection: referenced memory is write protected
* @fake: pgt references contiguous guest memory block, not a pgtable
*/
static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,

View File

@@ -269,6 +269,7 @@ static int handle_prog(struct kvm_vcpu *vcpu)
/**
* handle_external_interrupt - used for external interruption interceptions
* @vcpu: virtual cpu
*
* This interception only occurs if the CPUSTAT_EXT_INT bit was set, or if
* the new PSW does not have external interrupts disabled. In the first case,
@@ -315,7 +316,8 @@ static int handle_external_interrupt(struct kvm_vcpu *vcpu)
}
/**
* Handle MOVE PAGE partial execution interception.
* handle_mvpg_pei - Handle MOVE PAGE partial execution interception.
* @vcpu: virtual cpu
*
* This interception can only happen for guests with DAT disabled and
* addresses that are currently not mapped in the host. Thus we try to

View File

@@ -3053,13 +3053,14 @@ static void __airqs_kick_single_vcpu(struct kvm *kvm, u8 deliverable_mask)
int vcpu_idx, online_vcpus = atomic_read(&kvm->online_vcpus);
struct kvm_s390_gisa_interrupt *gi = &kvm->arch.gisa_int;
struct kvm_vcpu *vcpu;
u8 vcpu_isc_mask;
for_each_set_bit(vcpu_idx, kvm->arch.idle_mask, online_vcpus) {
vcpu = kvm_get_vcpu(kvm, vcpu_idx);
if (psw_ioint_disabled(vcpu))
continue;
deliverable_mask &= (u8)(vcpu->arch.sie_block->gcr[6] >> 24);
if (deliverable_mask) {
vcpu_isc_mask = (u8)(vcpu->arch.sie_block->gcr[6] >> 24);
if (deliverable_mask & vcpu_isc_mask) {
/* lately kicked but not yet running */
if (test_and_set_bit(vcpu_idx, gi->kicked_mask))
return;

View File

@@ -3363,6 +3363,7 @@ out_free_sie_block:
int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
{
clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
return kvm_s390_vcpu_has_irq(vcpu, 0);
}

View File

@@ -78,7 +78,7 @@
vpxor tmp0, x, x;
.section .rodata.cst164, "aM", @progbits, 164
.section .rodata.cst16, "aM", @progbits, 16
.align 16
/*
@@ -133,6 +133,10 @@
.L0f0f0f0f:
.long 0x0f0f0f0f
/* 12 bytes, only for padding */
.Lpadding_deadbeef:
.long 0xdeadbeef, 0xdeadbeef, 0xdeadbeef
.text
.align 16

View File

@@ -93,7 +93,7 @@
vpxor tmp0, x, x;
.section .rodata.cst164, "aM", @progbits, 164
.section .rodata.cst16, "aM", @progbits, 16
.align 16
/*
@@ -148,6 +148,10 @@
.L0f0f0f0f:
.long 0x0f0f0f0f
/* 12 bytes, only for padding */
.Lpadding_deadbeef:
.long 0xdeadbeef, 0xdeadbeef, 0xdeadbeef
.text
.align 16

View File

@@ -702,7 +702,8 @@ struct kvm_vcpu_arch {
struct kvm_pio_request pio;
void *pio_data;
void *guest_ins_data;
void *sev_pio_data;
unsigned sev_pio_count;
u8 event_exit_inst_len;
@@ -1097,7 +1098,7 @@ struct kvm_arch {
u64 cur_tsc_generation;
int nr_vcpus_matched_tsc;
spinlock_t pvclock_gtod_sync_lock;
raw_spinlock_t pvclock_gtod_sync_lock;
bool use_master_clock;
u64 master_kernel_ns;
u64 master_cycle_now;

View File

@@ -2321,13 +2321,14 @@ EXPORT_SYMBOL_GPL(kvm_apic_update_apicv);
void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
{
struct kvm_lapic *apic = vcpu->arch.apic;
u64 msr_val;
int i;
if (!init_event) {
vcpu->arch.apic_base = APIC_DEFAULT_PHYS_BASE |
MSR_IA32_APICBASE_ENABLE;
msr_val = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE;
if (kvm_vcpu_is_reset_bsp(vcpu))
vcpu->arch.apic_base |= MSR_IA32_APICBASE_BSP;
msr_val |= MSR_IA32_APICBASE_BSP;
kvm_lapic_set_base(vcpu, msr_val);
}
if (!apic)
@@ -2336,11 +2337,9 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
/* Stop the timer in case it's a reset to an active apic */
hrtimer_cancel(&apic->lapic_timer.timer);
if (!init_event) {
apic->base_address = APIC_DEFAULT_PHYS_BASE;
/* The xAPIC ID is set at RESET even if the APIC was already enabled. */
if (!init_event)
kvm_apic_set_xapic_id(apic, vcpu->vcpu_id);
}
kvm_apic_set_version(apic->vcpu);
for (i = 0; i < KVM_APIC_LVT_NUM; i++)
@@ -2481,6 +2480,11 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
lapic_timer_advance_dynamic = false;
}
/*
* Stuff the APIC ENABLE bit in lieu of temporarily incrementing
* apic_hw_disabled; the full RESET value is set by kvm_lapic_reset().
*/
vcpu->arch.apic_base = MSR_IA32_APICBASE_ENABLE;
static_branch_inc(&apic_sw_disabled.key); /* sw disabled at reset */
kvm_iodevice_init(&apic->dev, &apic_mmio_ops);
@@ -2942,5 +2946,7 @@ int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
void kvm_lapic_exit(void)
{
static_key_deferred_flush(&apic_hw_disabled);
WARN_ON(static_branch_unlikely(&apic_hw_disabled.key));
static_key_deferred_flush(&apic_sw_disabled);
WARN_ON(static_branch_unlikely(&apic_sw_disabled.key));
}

View File

@@ -4596,10 +4596,10 @@ static void update_pkru_bitmask(struct kvm_mmu *mmu)
unsigned bit;
bool wp;
if (!is_cr4_pke(mmu)) {
mmu->pkru_mask = 0;
mmu->pkru_mask = 0;
if (!is_cr4_pke(mmu))
return;
}
wp = is_cr0_wp(mmu);

View File

@@ -618,7 +618,12 @@ static int __sev_launch_update_vmsa(struct kvm *kvm, struct kvm_vcpu *vcpu,
vmsa.handle = to_kvm_svm(kvm)->sev_info.handle;
vmsa.address = __sme_pa(svm->vmsa);
vmsa.len = PAGE_SIZE;
return sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_VMSA, &vmsa, error);
ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_VMSA, &vmsa, error);
if (ret)
return ret;
vcpu->arch.guest_state_protected = true;
return 0;
}
static int sev_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
@@ -1479,6 +1484,13 @@ static int sev_receive_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
goto e_free_trans;
}
/*
* Flush (on non-coherent CPUs) before RECEIVE_UPDATE_DATA, the PSP
* encrypts the written data with the guest's key, and the cache may
* contain dirty, unencrypted data.
*/
sev_clflush_pages(guest_page, n);
/* The RECEIVE_UPDATE_DATA command requires C-bit to be always set. */
data.guest_address = (page_to_pfn(guest_page[0]) << PAGE_SHIFT) + offset;
data.guest_address |= sev_me_mask;
@@ -2579,11 +2591,20 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
{
if (!setup_vmgexit_scratch(svm, in, svm->vmcb->control.exit_info_2))
int count;
int bytes;
if (svm->vmcb->control.exit_info_2 > INT_MAX)
return -EINVAL;
return kvm_sev_es_string_io(&svm->vcpu, size, port,
svm->ghcb_sa, svm->ghcb_sa_len, in);
count = svm->vmcb->control.exit_info_2;
if (unlikely(check_mul_overflow(count, size, &bytes)))
return -EINVAL;
if (!setup_vmgexit_scratch(svm, in, bytes))
return -EINVAL;
return kvm_sev_es_string_io(&svm->vcpu, size, port, svm->ghcb_sa, count, in);
}
void sev_es_init_vmcb(struct vcpu_svm *svm)

View File

@@ -191,7 +191,7 @@ struct vcpu_svm {
/* SEV-ES scratch area support */
void *ghcb_sa;
u64 ghcb_sa_len;
u32 ghcb_sa_len;
bool ghcb_sa_sync;
bool ghcb_sa_free;

View File

@@ -5562,9 +5562,13 @@ static int handle_encls(struct kvm_vcpu *vcpu)
static int handle_bus_lock_vmexit(struct kvm_vcpu *vcpu)
{
vcpu->run->exit_reason = KVM_EXIT_X86_BUS_LOCK;
vcpu->run->flags |= KVM_RUN_X86_BUS_LOCK;
return 0;
/*
* Hardware may or may not set the BUS_LOCK_DETECTED flag on BUS_LOCK
* VM-Exits. Unconditionally set the flag here and leave the handling to
* vmx_handle_exit().
*/
to_vmx(vcpu)->exit_reason.bus_lock_detected = true;
return 1;
}
/*
@@ -6051,9 +6055,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
int ret = __vmx_handle_exit(vcpu, exit_fastpath);
/*
* Even when current exit reason is handled by KVM internally, we
* still need to exit to user space when bus lock detected to inform
* that there is a bus lock in guest.
* Exit to user space when bus lock detected to inform that there is
* a bus lock in guest.
*/
if (to_vmx(vcpu)->exit_reason.bus_lock_detected) {
if (ret > 0)
@@ -6302,18 +6305,13 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
/*
* If we are running L2 and L1 has a new pending interrupt
* which can be injected, we should re-evaluate
* what should be done with this new L1 interrupt.
* If L1 intercepts external-interrupts, we should
* exit from L2 to L1. Otherwise, interrupt should be
* delivered directly to L2.
* which can be injected, this may cause a vmexit or it may
* be injected into L2. Either way, this interrupt will be
* processed via KVM_REQ_EVENT, not RVI, because we do not use
* virtual interrupt delivery to inject L1 interrupts into L2.
*/
if (is_guest_mode(vcpu) && max_irr_updated) {
if (nested_exit_on_intr(vcpu))
kvm_vcpu_exiting_guest_mode(vcpu);
else
kvm_make_request(KVM_REQ_EVENT, vcpu);
}
if (is_guest_mode(vcpu) && max_irr_updated)
kvm_make_request(KVM_REQ_EVENT, vcpu);
} else {
max_irr = kvm_lapic_find_highest_irr(vcpu);
}

View File

@@ -2542,7 +2542,7 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
kvm_vcpu_write_tsc_offset(vcpu, offset);
raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
spin_lock_irqsave(&kvm->arch.pvclock_gtod_sync_lock, flags);
raw_spin_lock_irqsave(&kvm->arch.pvclock_gtod_sync_lock, flags);
if (!matched) {
kvm->arch.nr_vcpus_matched_tsc = 0;
} else if (!already_matched) {
@@ -2550,7 +2550,7 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
}
kvm_track_tsc_matching(vcpu);
spin_unlock_irqrestore(&kvm->arch.pvclock_gtod_sync_lock, flags);
raw_spin_unlock_irqrestore(&kvm->arch.pvclock_gtod_sync_lock, flags);
}
static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
@@ -2780,9 +2780,9 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
kvm_make_mclock_inprogress_request(kvm);
/* no guest entries from this point */
spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
raw_spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
pvclock_update_vm_gtod_copy(kvm);
spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
raw_spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
kvm_for_each_vcpu(i, vcpu, kvm)
kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
@@ -2800,15 +2800,15 @@ u64 get_kvmclock_ns(struct kvm *kvm)
unsigned long flags;
u64 ret;
spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
raw_spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
if (!ka->use_master_clock) {
spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
raw_spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
return get_kvmclock_base_ns() + ka->kvmclock_offset;
}
hv_clock.tsc_timestamp = ka->master_cycle_now;
hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset;
spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
raw_spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
/* both __this_cpu_read() and rdtsc() should be on the same cpu */
get_cpu();
@@ -2902,13 +2902,13 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
* If the host uses TSC clock, then passthrough TSC as stable
* to the guest.
*/
spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
raw_spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
use_master_clock = ka->use_master_clock;
if (use_master_clock) {
host_tsc = ka->master_cycle_now;
kernel_ns = ka->master_kernel_ns;
}
spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
raw_spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
/* Keep irq disabled to prevent changes to the clock */
local_irq_save(flags);
@@ -6100,13 +6100,13 @@ set_pit2_out:
* is slightly ahead) here we risk going negative on unsigned
* 'system_time' when 'user_ns.clock' is very small.
*/
spin_lock_irq(&ka->pvclock_gtod_sync_lock);
raw_spin_lock_irq(&ka->pvclock_gtod_sync_lock);
if (kvm->arch.use_master_clock)
now_ns = ka->master_kernel_ns;
else
now_ns = get_kvmclock_base_ns();
ka->kvmclock_offset = user_ns.clock - now_ns;
spin_unlock_irq(&ka->pvclock_gtod_sync_lock);
raw_spin_unlock_irq(&ka->pvclock_gtod_sync_lock);
kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
break;
@@ -6906,7 +6906,7 @@ static int kernel_pio(struct kvm_vcpu *vcpu, void *pd)
}
static int emulator_pio_in_out(struct kvm_vcpu *vcpu, int size,
unsigned short port, void *val,
unsigned short port,
unsigned int count, bool in)
{
vcpu->arch.pio.port = port;
@@ -6914,10 +6914,8 @@ static int emulator_pio_in_out(struct kvm_vcpu *vcpu, int size,
vcpu->arch.pio.count = count;
vcpu->arch.pio.size = size;
if (!kernel_pio(vcpu, vcpu->arch.pio_data)) {
vcpu->arch.pio.count = 0;
if (!kernel_pio(vcpu, vcpu->arch.pio_data))
return 1;
}
vcpu->run->exit_reason = KVM_EXIT_IO;
vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
@@ -6929,26 +6927,39 @@ static int emulator_pio_in_out(struct kvm_vcpu *vcpu, int size,
return 0;
}
static int __emulator_pio_in(struct kvm_vcpu *vcpu, int size,
unsigned short port, unsigned int count)
{
WARN_ON(vcpu->arch.pio.count);
memset(vcpu->arch.pio_data, 0, size * count);
return emulator_pio_in_out(vcpu, size, port, count, true);
}
static void complete_emulator_pio_in(struct kvm_vcpu *vcpu, void *val)
{
int size = vcpu->arch.pio.size;
unsigned count = vcpu->arch.pio.count;
memcpy(val, vcpu->arch.pio_data, size * count);
trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, size, count, vcpu->arch.pio_data);
vcpu->arch.pio.count = 0;
}
static int emulator_pio_in(struct kvm_vcpu *vcpu, int size,
unsigned short port, void *val, unsigned int count)
{
int ret;
if (vcpu->arch.pio.count) {
/* Complete previous iteration. */
} else {
int r = __emulator_pio_in(vcpu, size, port, count);
if (!r)
return r;
if (vcpu->arch.pio.count)
goto data_avail;
memset(vcpu->arch.pio_data, 0, size * count);
ret = emulator_pio_in_out(vcpu, size, port, val, count, true);
if (ret) {
data_avail:
memcpy(val, vcpu->arch.pio_data, size * count);
trace_kvm_pio(KVM_PIO_IN, port, size, count, vcpu->arch.pio_data);
vcpu->arch.pio.count = 0;
return 1;
/* Results already available, fall through. */
}
return 0;
WARN_ON(count != vcpu->arch.pio.count);
complete_emulator_pio_in(vcpu, val);
return 1;
}
static int emulator_pio_in_emulated(struct x86_emulate_ctxt *ctxt,
@@ -6963,9 +6974,15 @@ static int emulator_pio_out(struct kvm_vcpu *vcpu, int size,
unsigned short port, const void *val,
unsigned int count)
{
int ret;
memcpy(vcpu->arch.pio_data, val, size * count);
trace_kvm_pio(KVM_PIO_OUT, port, size, count, vcpu->arch.pio_data);
return emulator_pio_in_out(vcpu, size, port, (void *)val, count, false);
ret = emulator_pio_in_out(vcpu, size, port, count, false);
if (ret)
vcpu->arch.pio.count = 0;
return ret;
}
static int emulator_pio_out_emulated(struct x86_emulate_ctxt *ctxt,
@@ -8139,9 +8156,9 @@ static void kvm_hyperv_tsc_notifier(void)
list_for_each_entry(kvm, &vm_list, vm_list) {
struct kvm_arch *ka = &kvm->arch;
spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
raw_spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
pvclock_update_vm_gtod_copy(kvm);
spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
raw_spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
kvm_for_each_vcpu(cpu, vcpu, kvm)
kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
@@ -8783,9 +8800,17 @@ static void post_kvm_run_save(struct kvm_vcpu *vcpu)
kvm_run->cr8 = kvm_get_cr8(vcpu);
kvm_run->apic_base = kvm_get_apic_base(vcpu);
/*
* The call to kvm_ready_for_interrupt_injection() may end up in
* kvm_xen_has_interrupt() which may require the srcu lock to be
* held, to protect against changes in the vcpu_info address.
*/
vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
kvm_run->ready_for_interrupt_injection =
pic_in_kernel(vcpu->kvm) ||
kvm_vcpu_ready_for_interrupt_injection(vcpu);
srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
if (is_smm(vcpu))
kvm_run->flags |= KVM_RUN_X86_SMM;
@@ -9643,14 +9668,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
break;
if (unlikely(kvm_vcpu_exit_request(vcpu))) {
if (vcpu->arch.apicv_active)
static_call(kvm_x86_sync_pir_to_irr)(vcpu);
if (unlikely(kvm_vcpu_exit_request(vcpu))) {
exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
break;
}
if (vcpu->arch.apicv_active)
static_call(kvm_x86_sync_pir_to_irr)(vcpu);
}
}
/*
* Do this here before restoring debug registers on the host. And
@@ -11182,7 +11207,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
raw_spin_lock_init(&kvm->arch.tsc_write_lock);
mutex_init(&kvm->arch.apic_map_lock);
spin_lock_init(&kvm->arch.pvclock_gtod_sync_lock);
raw_spin_lock_init(&kvm->arch.pvclock_gtod_sync_lock);
kvm->arch.kvmclock_offset = -get_kvmclock_base_ns();
pvclock_update_vm_gtod_copy(kvm);
@@ -11392,7 +11417,8 @@ static int memslot_rmap_alloc(struct kvm_memory_slot *slot,
int level = i + 1;
int lpages = __kvm_mmu_slot_lpages(slot, npages, level);
WARN_ON(slot->arch.rmap[i]);
if (slot->arch.rmap[i])
continue;
slot->arch.rmap[i] = kvcalloc(lpages, sz, GFP_KERNEL_ACCOUNT);
if (!slot->arch.rmap[i]) {
@@ -12367,44 +12393,81 @@ int kvm_sev_es_mmio_read(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned int bytes,
}
EXPORT_SYMBOL_GPL(kvm_sev_es_mmio_read);
static int complete_sev_es_emulated_ins(struct kvm_vcpu *vcpu)
{
memcpy(vcpu->arch.guest_ins_data, vcpu->arch.pio_data,
vcpu->arch.pio.count * vcpu->arch.pio.size);
vcpu->arch.pio.count = 0;
static int kvm_sev_es_outs(struct kvm_vcpu *vcpu, unsigned int size,
unsigned int port);
static int complete_sev_es_emulated_outs(struct kvm_vcpu *vcpu)
{
int size = vcpu->arch.pio.size;
int port = vcpu->arch.pio.port;
vcpu->arch.pio.count = 0;
if (vcpu->arch.sev_pio_count)
return kvm_sev_es_outs(vcpu, size, port);
return 1;
}
static int kvm_sev_es_outs(struct kvm_vcpu *vcpu, unsigned int size,
unsigned int port, void *data, unsigned int count)
unsigned int port)
{
int ret;
for (;;) {
unsigned int count =
min_t(unsigned int, PAGE_SIZE / size, vcpu->arch.sev_pio_count);
int ret = emulator_pio_out(vcpu, size, port, vcpu->arch.sev_pio_data, count);
ret = emulator_pio_out_emulated(vcpu->arch.emulate_ctxt, size, port,
data, count);
if (ret)
return ret;
/* memcpy done already by emulator_pio_out. */
vcpu->arch.sev_pio_count -= count;
vcpu->arch.sev_pio_data += count * vcpu->arch.pio.size;
if (!ret)
break;
vcpu->arch.pio.count = 0;
/* Emulation done by the kernel. */
if (!vcpu->arch.sev_pio_count)
return 1;
}
vcpu->arch.complete_userspace_io = complete_sev_es_emulated_outs;
return 0;
}
static int kvm_sev_es_ins(struct kvm_vcpu *vcpu, unsigned int size,
unsigned int port, void *data, unsigned int count)
{
int ret;
unsigned int port);
ret = emulator_pio_in_emulated(vcpu->arch.emulate_ctxt, size, port,
data, count);
if (ret) {
vcpu->arch.pio.count = 0;
} else {
vcpu->arch.guest_ins_data = data;
vcpu->arch.complete_userspace_io = complete_sev_es_emulated_ins;
static void advance_sev_es_emulated_ins(struct kvm_vcpu *vcpu)
{
unsigned count = vcpu->arch.pio.count;
complete_emulator_pio_in(vcpu, vcpu->arch.sev_pio_data);
vcpu->arch.sev_pio_count -= count;
vcpu->arch.sev_pio_data += count * vcpu->arch.pio.size;
}
static int complete_sev_es_emulated_ins(struct kvm_vcpu *vcpu)
{
int size = vcpu->arch.pio.size;
int port = vcpu->arch.pio.port;
advance_sev_es_emulated_ins(vcpu);
if (vcpu->arch.sev_pio_count)
return kvm_sev_es_ins(vcpu, size, port);
return 1;
}
static int kvm_sev_es_ins(struct kvm_vcpu *vcpu, unsigned int size,
unsigned int port)
{
for (;;) {
unsigned int count =
min_t(unsigned int, PAGE_SIZE / size, vcpu->arch.sev_pio_count);
if (!__emulator_pio_in(vcpu, size, port, count))
break;
/* Emulation done by the kernel. */
advance_sev_es_emulated_ins(vcpu);
if (!vcpu->arch.sev_pio_count)
return 1;
}
vcpu->arch.complete_userspace_io = complete_sev_es_emulated_ins;
return 0;
}
@@ -12412,8 +12475,10 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
unsigned int port, void *data, unsigned int count,
int in)
{
return in ? kvm_sev_es_ins(vcpu, size, port, data, count)
: kvm_sev_es_outs(vcpu, size, port, data, count);
vcpu->arch.sev_pio_data = data;
vcpu->arch.sev_pio_count = count;
return in ? kvm_sev_es_ins(vcpu, size, port)
: kvm_sev_es_outs(vcpu, size, port);
}
EXPORT_SYMBOL_GPL(kvm_sev_es_string_io);

View File

@@ -190,6 +190,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
{
int err;
u8 rc = 0;
/*
@@ -216,13 +217,29 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
if (likely(slots->generation == ghc->generation &&
!kvm_is_error_hva(ghc->hva) && ghc->memslot)) {
/* Fast path */
__get_user(rc, (u8 __user *)ghc->hva + offset);
} else {
/* Slow path */
kvm_read_guest_offset_cached(v->kvm, ghc, &rc, offset,
sizeof(rc));
pagefault_disable();
err = __get_user(rc, (u8 __user *)ghc->hva + offset);
pagefault_enable();
if (!err)
return rc;
}
/* Slow path */
/*
* This function gets called from kvm_vcpu_block() after setting the
* task to TASK_INTERRUPTIBLE, to see if it needs to wake immediately
* from a HLT. So we really mustn't sleep. If the page ended up absent
* at that point, just return 1 in order to trigger an immediate wake,
* and we'll end up getting called again from a context where we *can*
* fault in the page and wait for it.
*/
if (in_atomic() || !task_is_running(current))
return 1;
kvm_read_guest_offset_cached(v->kvm, ghc, &rc, offset,
sizeof(rc));
return rc;
}

View File

@@ -1897,10 +1897,11 @@ void blk_cgroup_bio_start(struct bio *bio)
{
int rwd = blk_cgroup_io_type(bio), cpu;
struct blkg_iostat_set *bis;
unsigned long flags;
cpu = get_cpu();
bis = per_cpu_ptr(bio->bi_blkg->iostat_cpu, cpu);
u64_stats_update_begin(&bis->sync);
flags = u64_stats_update_begin_irqsave(&bis->sync);
/*
* If the bio is flagged with BIO_CGROUP_ACCT it means this is a split
@@ -1912,7 +1913,7 @@ void blk_cgroup_bio_start(struct bio *bio)
}
bis->cur.ios[rwd]++;
u64_stats_update_end(&bis->sync);
u64_stats_update_end_irqrestore(&bis->sync, flags);
if (cgroup_subsys_on_dfl(io_cgrp_subsys))
cgroup_rstat_updated(bio->bi_blkg->blkcg->css.cgroup, cpu);
put_cpu();

View File

@@ -1325,6 +1325,7 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
int errors, queued;
blk_status_t ret = BLK_STS_OK;
LIST_HEAD(zone_list);
bool needs_resource = false;
if (list_empty(list))
return false;
@@ -1370,6 +1371,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
queued++;
break;
case BLK_STS_RESOURCE:
needs_resource = true;
fallthrough;
case BLK_STS_DEV_RESOURCE:
blk_mq_handle_dev_resource(rq, list);
goto out;
@@ -1380,6 +1383,7 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
* accept.
*/
blk_mq_handle_zone_resource(rq, &zone_list);
needs_resource = true;
break;
default:
errors++;
@@ -1406,7 +1410,6 @@ out:
/* For non-shared tags, the RESTART check will suffice */
bool no_tag = prep == PREP_DISPATCH_NO_TAG &&
(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED);
bool no_budget_avail = prep == PREP_DISPATCH_NO_BUDGET;
if (nr_budgets)
blk_mq_release_budgets(q, list);
@@ -1447,14 +1450,16 @@ out:
* If driver returns BLK_STS_RESOURCE and SCHED_RESTART
* bit is set, run queue after a delay to avoid IO stalls
* that could otherwise occur if the queue is idle. We'll do
* similar if we couldn't get budget and SCHED_RESTART is set.
* similar if we couldn't get budget or couldn't lock a zone
* and SCHED_RESTART is set.
*/
needs_restart = blk_mq_sched_needs_restart(hctx);
if (prep == PREP_DISPATCH_NO_BUDGET)
needs_resource = true;
if (!needs_restart ||
(no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
blk_mq_run_hw_queue(hctx, true);
else if (needs_restart && (ret == BLK_STS_RESOURCE ||
no_budget_avail))
else if (needs_restart && needs_resource)
blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY);
blk_mq_update_dispatch_busy(hctx, true);

View File

@@ -842,6 +842,24 @@ bool blk_queue_can_use_dma_map_merging(struct request_queue *q,
}
EXPORT_SYMBOL_GPL(blk_queue_can_use_dma_map_merging);
static bool disk_has_partitions(struct gendisk *disk)
{
unsigned long idx;
struct block_device *part;
bool ret = false;
rcu_read_lock();
xa_for_each(&disk->part_tbl, idx, part) {
if (bdev_is_partition(part)) {
ret = true;
break;
}
}
rcu_read_unlock();
return ret;
}
/**
* blk_queue_set_zoned - configure a disk queue zoned model.
* @disk: the gendisk of the queue to configure
@@ -876,7 +894,7 @@ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
* we do nothing special as far as the block layer is concerned.
*/
if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED) ||
!xa_empty(&disk->part_tbl))
disk_has_partitions(disk))
model = BLK_ZONED_NONE;
break;
case BLK_ZONED_NONE:

View File

@@ -588,16 +588,6 @@ void del_gendisk(struct gendisk *disk)
* Prevent new I/O from crossing bio_queue_enter().
*/
blk_queue_start_drain(q);
blk_mq_freeze_queue_wait(q);
rq_qos_exit(q);
blk_sync_queue(q);
blk_flush_integrity();
/*
* Allow using passthrough request again after the queue is torn down.
*/
blk_queue_flag_clear(QUEUE_FLAG_INIT_DONE, q);
__blk_mq_unfreeze_queue(q, true);
if (!(disk->flags & GENHD_FL_HIDDEN)) {
sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi");
@@ -620,6 +610,18 @@ void del_gendisk(struct gendisk *disk)
sysfs_remove_link(block_depr, dev_name(disk_to_dev(disk)));
pm_runtime_set_memalloc_noio(disk_to_dev(disk), false);
device_del(disk_to_dev(disk));
blk_mq_freeze_queue_wait(q);
rq_qos_exit(q);
blk_sync_queue(q);
blk_flush_integrity();
/*
* Allow using passthrough request again after the queue is torn down.
*/
blk_queue_flag_clear(QUEUE_FLAG_INIT_DONE, q);
__blk_mq_unfreeze_queue(q, true);
}
EXPORT_SYMBOL(del_gendisk);

View File

@@ -423,6 +423,7 @@ out_del:
device_del(pdev);
out_put:
put_device(pdev);
return ERR_PTR(err);
out_put_disk:
put_disk(disk);
return ERR_PTR(err);

View File

@@ -1035,13 +1035,8 @@ void acpi_turn_off_unused_power_resources(void)
list_for_each_entry_reverse(resource, &acpi_power_resource_list, list_node) {
mutex_lock(&resource->resource_lock);
/*
* Turn off power resources in an unknown state too, because the
* platform firmware on some system expects the OS to turn off
* power resources without any users unconditionally.
*/
if (!resource->ref_count &&
resource->state != ACPI_POWER_RESOURCE_STATE_OFF) {
resource->state == ACPI_POWER_RESOURCE_STATE_ON) {
acpi_handle_debug(resource->device.handle, "Turning OFF\n");
__acpi_power_off(resource);
}

View File

@@ -21,6 +21,7 @@
#include <linux/earlycpio.h>
#include <linux/initrd.h>
#include <linux/security.h>
#include <linux/kmemleak.h>
#include "internal.h"
#ifdef CONFIG_ACPI_CUSTOM_DSDT
@@ -601,6 +602,8 @@ void __init acpi_table_upgrade(void)
*/
arch_reserve_mem_area(acpi_tables_addr, all_tables_size);
kmemleak_ignore_phys(acpi_tables_addr);
/*
* early_ioremap only can remap 256k one time. If we map all
* tables one time, we will hit the limit. Need to map chunks

View File

@@ -3896,8 +3896,8 @@ static int mv_chip_id(struct ata_host *host, unsigned int board_idx)
break;
default:
dev_err(host->dev, "BUG: invalid board index %u\n", board_idx);
return 1;
dev_alert(host->dev, "BUG: invalid board index %u\n", board_idx);
return -EINVAL;
}
hpriv->hp_flags = hp_flags;

View File

@@ -281,14 +281,14 @@ static int regcache_rbtree_insert_to_block(struct regmap *map,
if (!blk)
return -ENOMEM;
rbnode->block = blk;
if (BITS_TO_LONGS(blklen) > BITS_TO_LONGS(rbnode->blklen)) {
present = krealloc(rbnode->cache_present,
BITS_TO_LONGS(blklen) * sizeof(*present),
GFP_KERNEL);
if (!present) {
kfree(blk);
if (!present)
return -ENOMEM;
}
memset(present + BITS_TO_LONGS(rbnode->blklen), 0,
(BITS_TO_LONGS(blklen) - BITS_TO_LONGS(rbnode->blklen))
@@ -305,7 +305,6 @@ static int regcache_rbtree_insert_to_block(struct regmap *map,
}
/* update the rbnode block, its size and the base register */
rbnode->block = blk;
rbnode->blklen = blklen;
rbnode->base_reg = base_reg;
rbnode->cache_present = present;

View File

@@ -58,11 +58,8 @@ static int clk_composite_determine_rate(struct clk_hw *hw,
long rate;
int i;
if (rate_hw && rate_ops && rate_ops->determine_rate) {
__clk_hw_set_clk(rate_hw, hw);
return rate_ops->determine_rate(rate_hw, req);
} else if (rate_hw && rate_ops && rate_ops->round_rate &&
mux_hw && mux_ops && mux_ops->set_parent) {
if (rate_hw && rate_ops && rate_ops->round_rate &&
mux_hw && mux_ops && mux_ops->set_parent) {
req->best_parent_hw = NULL;
if (clk_hw_get_flags(hw) & CLK_SET_RATE_NO_REPARENT) {
@@ -107,6 +104,9 @@ static int clk_composite_determine_rate(struct clk_hw *hw,
req->rate = best_rate;
return 0;
} else if (rate_hw && rate_ops && rate_ops->determine_rate) {
__clk_hw_set_clk(rate_hw, hw);
return rate_ops->determine_rate(rate_hw, req);
} else if (mux_hw && mux_ops && mux_ops->determine_rate) {
__clk_hw_set_clk(mux_hw, hw);
return mux_ops->determine_rate(mux_hw, req);

View File

@@ -256,6 +256,11 @@ mlxbf2_gpio_probe(struct platform_device *pdev)
NULL,
0);
if (ret) {
dev_err(dev, "bgpio_init failed\n");
return ret;
}
gc->direction_input = mlxbf2_gpio_direction_input;
gc->direction_output = mlxbf2_gpio_direction_output;
gc->ngpio = npins;

View File

@@ -224,7 +224,7 @@ static int iproc_gpio_probe(struct platform_device *pdev)
}
chip->gc.label = dev_name(dev);
if (of_property_read_u32(dn, "ngpios", &num_gpios))
if (!of_property_read_u32(dn, "ngpios", &num_gpios))
chip->gc.ngpio = num_gpios;
irq = platform_get_irq(pdev, 0);

View File

@@ -1257,7 +1257,7 @@ static int nv_common_early_init(void *handle)
AMD_PG_SUPPORT_VCN_DPG |
AMD_PG_SUPPORT_JPEG;
if (adev->pdev->device == 0x1681)
adev->external_rev_id = adev->rev_id + 0x19;
adev->external_rev_id = 0x20;
else
adev->external_rev_id = adev->rev_id + 0x01;
break;

View File

@@ -263,7 +263,7 @@ static ssize_t dp_link_settings_write(struct file *f, const char __user *buf,
if (!wr_buf)
return -ENOSPC;
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
(long *)param, buf,
max_param_num,
&param_nums)) {
@@ -487,7 +487,7 @@ static ssize_t dp_phy_settings_write(struct file *f, const char __user *buf,
if (!wr_buf)
return -ENOSPC;
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
(long *)param, buf,
max_param_num,
&param_nums)) {
@@ -639,7 +639,7 @@ static ssize_t dp_phy_test_pattern_debugfs_write(struct file *f, const char __us
if (!wr_buf)
return -ENOSPC;
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
(long *)param, buf,
max_param_num,
&param_nums)) {
@@ -914,7 +914,7 @@ static ssize_t dp_dsc_passthrough_set(struct file *f, const char __user *buf,
return -ENOSPC;
}
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
&param, buf,
max_param_num,
&param_nums)) {
@@ -1211,7 +1211,7 @@ static ssize_t trigger_hotplug(struct file *f, const char __user *buf,
return -ENOSPC;
}
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
(long *)param, buf,
max_param_num,
&param_nums)) {
@@ -1396,7 +1396,7 @@ static ssize_t dp_dsc_clock_en_write(struct file *f, const char __user *buf,
return -ENOSPC;
}
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
(long *)param, buf,
max_param_num,
&param_nums)) {
@@ -1581,7 +1581,7 @@ static ssize_t dp_dsc_slice_width_write(struct file *f, const char __user *buf,
return -ENOSPC;
}
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
(long *)param, buf,
max_param_num,
&param_nums)) {
@@ -1766,7 +1766,7 @@ static ssize_t dp_dsc_slice_height_write(struct file *f, const char __user *buf,
return -ENOSPC;
}
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
(long *)param, buf,
max_param_num,
&param_nums)) {
@@ -1944,7 +1944,7 @@ static ssize_t dp_dsc_bits_per_pixel_write(struct file *f, const char __user *bu
return -ENOSPC;
}
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
(long *)param, buf,
max_param_num,
&param_nums)) {
@@ -2382,7 +2382,7 @@ static ssize_t dp_max_bpc_write(struct file *f, const char __user *buf,
return -ENOSPC;
}
if (parse_write_buffer_into_params(wr_buf, size,
if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
(long *)param, buf,
max_param_num,
&param_nums)) {

View File

@@ -366,32 +366,32 @@ static struct wm_table lpddr5_wm_table = {
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
.sr_exit_time_us = 5.32,
.sr_enter_plus_exit_time_us = 6.38,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
},
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
.sr_exit_time_us = 9.82,
.sr_enter_plus_exit_time_us = 11.196,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
},
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
.sr_exit_time_us = 9.89,
.sr_enter_plus_exit_time_us = 11.24,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
},
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
.sr_exit_time_us = 9.748,
.sr_enter_plus_exit_time_us = 11.102,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
},
}
@@ -518,14 +518,21 @@ static unsigned int find_clk_for_voltage(
unsigned int voltage)
{
int i;
int max_voltage = 0;
int clock = 0;
for (i = 0; i < NUM_SOC_VOLTAGE_LEVELS; i++) {
if (clock_table->SocVoltage[i] == voltage)
if (clock_table->SocVoltage[i] == voltage) {
return clocks[i];
} else if (clock_table->SocVoltage[i] >= max_voltage &&
clock_table->SocVoltage[i] < voltage) {
max_voltage = clock_table->SocVoltage[i];
clock = clocks[i];
}
}
ASSERT(0);
return 0;
ASSERT(clock);
return clock;
}
void dcn31_clk_mgr_helper_populate_bw_params(

View File

@@ -76,10 +76,6 @@ void dcn31_init_hw(struct dc *dc)
if (dc->clk_mgr && dc->clk_mgr->funcs->init_clocks)
dc->clk_mgr->funcs->init_clocks(dc->clk_mgr);
// Initialize the dccg
if (res_pool->dccg->funcs->dccg_init)
res_pool->dccg->funcs->dccg_init(res_pool->dccg);
if (IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment)) {
REG_WRITE(REFCLK_CNTL, 0);
@@ -106,6 +102,9 @@ void dcn31_init_hw(struct dc *dc)
hws->funcs.bios_golden_init(dc);
hws->funcs.disable_vga(dc->hwseq);
}
// Initialize the dccg
if (res_pool->dccg->funcs->dccg_init)
res_pool->dccg->funcs->dccg_init(res_pool->dccg);
if (dc->debug.enable_mem_low_power.bits.dmcu) {
// Force ERAM to shutdown if DMCU is not enabled

View File

@@ -217,8 +217,8 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_1_soc = {
.num_states = 5,
.sr_exit_time_us = 9.0,
.sr_enter_plus_exit_time_us = 11.0,
.sr_exit_z8_time_us = 402.0,
.sr_enter_plus_exit_z8_time_us = 520.0,
.sr_exit_z8_time_us = 442.0,
.sr_enter_plus_exit_z8_time_us = 560.0,
.writeback_latency_us = 12.0,
.dram_channel_width_bytes = 4,
.round_trip_ping_latency_dcfclk_cycles = 106,
@@ -928,7 +928,7 @@ static const struct dc_debug_options debug_defaults_drv = {
.disable_dcc = DCC_ENABLE,
.vsr_support = true,
.performance_trace = false,
.max_downscale_src_width = 3840,/*upto 4K*/
.max_downscale_src_width = 4096,/*upto true 4K*/
.disable_pplib_wm_range = false,
.scl_reset_length10 = true,
.sanity_checks = false,
@@ -1590,6 +1590,13 @@ static int dcn31_populate_dml_pipes_from_context(
pipe = &res_ctx->pipe_ctx[i];
timing = &pipe->stream->timing;
/*
* Immediate flip can be set dynamically after enabling the plane.
* We need to require support for immediate flip or underflow can be
* intermittently experienced depending on peak b/w requirements.
*/
pipes[pipe_cnt].pipe.src.immediate_flip = true;
pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
pipes[pipe_cnt].pipe.src.gpuvm = true;
pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;

View File

@@ -5398,9 +5398,9 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
v->MaximumReadBandwidthWithPrefetch =
v->MaximumReadBandwidthWithPrefetch
+ dml_max4(
v->VActivePixelBandwidth[i][j][k],
v->VActiveCursorBandwidth[i][j][k]
+ dml_max3(
v->VActivePixelBandwidth[i][j][k]
+ v->VActiveCursorBandwidth[i][j][k]
+ v->NoOfDPP[i][j][k]
* (v->meta_row_bandwidth[i][j][k]
+ v->dpte_row_bandwidth[i][j][k]),

View File

@@ -227,7 +227,7 @@ enum {
#define FAMILY_YELLOW_CARP 146
#define YELLOW_CARP_A0 0x01
#define YELLOW_CARP_B0 0x1A
#define YELLOW_CARP_B0 0x20
#define YELLOW_CARP_UNKNOWN 0xFF
#ifndef ASICREV_IS_YELLOW_CARP

View File

@@ -105,6 +105,7 @@ static enum mod_hdcp_status remove_display_from_topology_v3(
dtm_cmd->dtm_status = TA_DTM_STATUS__GENERIC_FAILURE;
psp_dtm_invoke(psp, dtm_cmd->cmd_id);
mutex_unlock(&psp->dtm_context.mutex);
if (dtm_cmd->dtm_status != TA_DTM_STATUS__SUCCESS) {
status = remove_display_from_topology_v2(hdcp, index);
@@ -115,8 +116,6 @@ static enum mod_hdcp_status remove_display_from_topology_v3(
HDCP_TOP_REMOVE_DISPLAY_TRACE(hdcp, display->index);
}
mutex_unlock(&psp->dtm_context.mutex);
return status;
}
@@ -205,6 +204,7 @@ static enum mod_hdcp_status add_display_to_topology_v3(
dtm_cmd->dtm_in_message.topology_update_v3.link_hdcp_cap = link->hdcp_supported_informational;
psp_dtm_invoke(psp, dtm_cmd->cmd_id);
mutex_unlock(&psp->dtm_context.mutex);
if (dtm_cmd->dtm_status != TA_DTM_STATUS__SUCCESS) {
status = add_display_to_topology_v2(hdcp, display);
@@ -214,8 +214,6 @@ static enum mod_hdcp_status add_display_to_topology_v3(
HDCP_TOP_ADD_DISPLAY_TRACE(hdcp, display->index);
}
mutex_unlock(&psp->dtm_context.mutex);
return status;
}

View File

@@ -1300,18 +1300,6 @@ static enum drm_mode_status ast_mode_valid(struct drm_connector *connector,
return flags;
}
static enum drm_connector_status ast_connector_detect(struct drm_connector
*connector, bool force)
{
int r;
r = ast_get_modes(connector);
if (r <= 0)
return connector_status_disconnected;
return connector_status_connected;
}
static void ast_connector_destroy(struct drm_connector *connector)
{
struct ast_connector *ast_connector = to_ast_connector(connector);
@@ -1327,7 +1315,6 @@ static const struct drm_connector_helper_funcs ast_connector_helper_funcs = {
static const struct drm_connector_funcs ast_connector_funcs = {
.reset = drm_atomic_helper_connector_reset,
.detect = ast_connector_detect,
.fill_modes = drm_helper_probe_single_connector_modes,
.destroy = ast_connector_destroy,
.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
@@ -1355,8 +1342,7 @@ static int ast_connector_init(struct drm_device *dev)
connector->interlace_allowed = 0;
connector->doublescan_allowed = 0;
connector->polled = DRM_CONNECTOR_POLL_CONNECT |
DRM_CONNECTOR_POLL_DISCONNECT;
connector->polled = DRM_CONNECTOR_POLL_CONNECT;
drm_connector_attach_encoder(connector, encoder);
@@ -1425,8 +1411,6 @@ int ast_mode_config_init(struct ast_private *ast)
drm_mode_config_reset(dev);
drm_kms_helper_poll_init(dev);
return 0;
}

View File

@@ -134,6 +134,12 @@ static const struct dmi_system_id orientation_data[] = {
DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "T103HAF"),
},
.driver_data = (void *)&lcd800x1280_rightside_up,
}, { /* AYA NEO 2021 */
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "AYADEVICE"),
DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "AYA NEO 2021"),
},
.driver_data = (void *)&lcd800x1280_rightside_up,
}, { /* GPD MicroPC (generic strings, also match on bios date) */
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Default string"),
@@ -185,6 +191,12 @@ static const struct dmi_system_id orientation_data[] = {
DMI_EXACT_MATCH(DMI_BOARD_NAME, "Default string"),
},
.driver_data = (void *)&gpd_win2,
}, { /* GPD Win 3 */
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "GPD"),
DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "G1618-03")
},
.driver_data = (void *)&lcd720x1280_rightside_up,
}, { /* I.T.Works TW891 */
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "To be filled by O.E.M."),

View File

@@ -1916,6 +1916,9 @@ void intel_dp_sync_state(struct intel_encoder *encoder,
{
struct intel_dp *intel_dp = enc_to_intel_dp(encoder);
if (!crtc_state)
return;
/*
* Don't clobber DPCD if it's been already read out during output
* setup (eDP) or detect.

View File

@@ -64,7 +64,7 @@ intel_timeline_pin_map(struct intel_timeline *timeline)
timeline->hwsp_map = vaddr;
timeline->hwsp_seqno = memset(vaddr + ofs, 0, TIMELINE_SEQNO_BYTES);
clflush(vaddr + ofs);
drm_clflush_virt_range(vaddr + ofs, TIMELINE_SEQNO_BYTES);
return 0;
}
@@ -225,7 +225,7 @@ void intel_timeline_reset_seqno(const struct intel_timeline *tl)
memset(hwsp_seqno + 1, 0, TIMELINE_SEQNO_BYTES - sizeof(*hwsp_seqno));
WRITE_ONCE(*hwsp_seqno, tl->seqno);
clflush(hwsp_seqno);
drm_clflush_virt_range(hwsp_seqno, TIMELINE_SEQNO_BYTES);
}
void intel_timeline_enter(struct intel_timeline *tl)

View File

@@ -11048,12 +11048,6 @@ enum skl_power_gate {
#define DC_STATE_DEBUG_MASK_CORES (1 << 0)
#define DC_STATE_DEBUG_MASK_MEMORY_UP (1 << 1)
#define BXT_P_CR_MC_BIOS_REQ_0_0_0 _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
#define BXT_REQ_DATA_MASK 0x3F
#define BXT_DRAM_CHANNEL_ACTIVE_SHIFT 12
#define BXT_DRAM_CHANNEL_ACTIVE_MASK (0xF << 12)
#define BXT_MEMORY_FREQ_MULTIPLIER_HZ 133333333
#define BXT_D_CR_DRP0_DUNIT8 0x1000
#define BXT_D_CR_DRP0_DUNIT9 0x1200
#define BXT_D_CR_DRP0_DUNIT_START 8
@@ -11084,9 +11078,7 @@ enum skl_power_gate {
#define BXT_DRAM_TYPE_LPDDR4 (0x2 << 22)
#define BXT_DRAM_TYPE_DDR4 (0x4 << 22)
#define SKL_MEMORY_FREQ_MULTIPLIER_HZ 266666666
#define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5E04)
#define SKL_REQ_DATA_MASK (0xF << 0)
#define DG1_GEAR_TYPE REG_BIT(16)
#define SKL_MAD_INTER_CHANNEL_0_0_0_MCHBAR_MCMAIN _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5000)

View File

@@ -794,7 +794,6 @@ DECLARE_EVENT_CLASS(i915_request,
TP_STRUCT__entry(
__field(u32, dev)
__field(u64, ctx)
__field(u32, guc_id)
__field(u16, class)
__field(u16, instance)
__field(u32, seqno)
@@ -805,16 +804,14 @@ DECLARE_EVENT_CLASS(i915_request,
__entry->dev = rq->engine->i915->drm.primary->index;
__entry->class = rq->engine->uabi_class;
__entry->instance = rq->engine->uabi_instance;
__entry->guc_id = rq->context->guc_id;
__entry->ctx = rq->fence.context;
__entry->seqno = rq->fence.seqno;
__entry->tail = rq->tail;
),
TP_printk("dev=%u, engine=%u:%u, guc_id=%u, ctx=%llu, seqno=%u, tail=%u",
TP_printk("dev=%u, engine=%u:%u, ctx=%llu, seqno=%u, tail=%u",
__entry->dev, __entry->class, __entry->instance,
__entry->guc_id, __entry->ctx, __entry->seqno,
__entry->tail)
__entry->ctx, __entry->seqno, __entry->tail)
);
DEFINE_EVENT(i915_request, i915_request_add,

View File

@@ -244,7 +244,6 @@ static int
skl_get_dram_info(struct drm_i915_private *i915)
{
struct dram_info *dram_info = &i915->dram_info;
u32 mem_freq_khz, val;
int ret;
dram_info->type = skl_get_dram_type(i915);
@@ -255,17 +254,6 @@ skl_get_dram_info(struct drm_i915_private *i915)
if (ret)
return ret;
val = intel_uncore_read(&i915->uncore,
SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
mem_freq_khz = DIV_ROUND_UP((val & SKL_REQ_DATA_MASK) *
SKL_MEMORY_FREQ_MULTIPLIER_HZ, 1000);
if (dram_info->num_channels * mem_freq_khz == 0) {
drm_info(&i915->drm,
"Couldn't get system memory bandwidth\n");
return -EINVAL;
}
return 0;
}
@@ -350,24 +338,10 @@ static void bxt_get_dimm_info(struct dram_dimm_info *dimm, u32 val)
static int bxt_get_dram_info(struct drm_i915_private *i915)
{
struct dram_info *dram_info = &i915->dram_info;
u32 dram_channels;
u32 mem_freq_khz, val;
u8 num_active_channels, valid_ranks = 0;
u32 val;
u8 valid_ranks = 0;
int i;
val = intel_uncore_read(&i915->uncore, BXT_P_CR_MC_BIOS_REQ_0_0_0);
mem_freq_khz = DIV_ROUND_UP((val & BXT_REQ_DATA_MASK) *
BXT_MEMORY_FREQ_MULTIPLIER_HZ, 1000);
dram_channels = val & BXT_DRAM_CHANNEL_ACTIVE_MASK;
num_active_channels = hweight32(dram_channels);
if (mem_freq_khz * num_active_channels == 0) {
drm_info(&i915->drm,
"Couldn't get system memory bandwidth\n");
return -EINVAL;
}
/*
* Now read each DUNIT8/9/10/11 to check the rank of each dimms.
*/

View File

@@ -66,7 +66,8 @@ static const struct drm_crtc_funcs kmb_crtc_funcs = {
.disable_vblank = kmb_crtc_disable_vblank,
};
static void kmb_crtc_set_mode(struct drm_crtc *crtc)
static void kmb_crtc_set_mode(struct drm_crtc *crtc,
struct drm_atomic_state *old_state)
{
struct drm_device *dev = crtc->dev;
struct drm_display_mode *m = &crtc->state->adjusted_mode;
@@ -75,7 +76,7 @@ static void kmb_crtc_set_mode(struct drm_crtc *crtc)
unsigned int val = 0;
/* Initialize mipi */
kmb_dsi_mode_set(kmb->kmb_dsi, m, kmb->sys_clk_mhz);
kmb_dsi_mode_set(kmb->kmb_dsi, m, kmb->sys_clk_mhz, old_state);
drm_info(dev,
"vfp= %d vbp= %d vsync_len=%d hfp=%d hbp=%d hsync_len=%d\n",
m->crtc_vsync_start - m->crtc_vdisplay,
@@ -138,7 +139,7 @@ static void kmb_crtc_atomic_enable(struct drm_crtc *crtc,
struct kmb_drm_private *kmb = crtc_to_kmb_priv(crtc);
clk_prepare_enable(kmb->kmb_clk.clk_lcd);
kmb_crtc_set_mode(crtc);
kmb_crtc_set_mode(crtc, state);
drm_crtc_vblank_on(crtc);
}
@@ -185,11 +186,45 @@ static void kmb_crtc_atomic_flush(struct drm_crtc *crtc,
spin_unlock_irq(&crtc->dev->event_lock);
}
static enum drm_mode_status
kmb_crtc_mode_valid(struct drm_crtc *crtc,
const struct drm_display_mode *mode)
{
int refresh;
struct drm_device *dev = crtc->dev;
int vfp = mode->vsync_start - mode->vdisplay;
if (mode->vdisplay < KMB_CRTC_MAX_HEIGHT) {
drm_dbg(dev, "height = %d less than %d",
mode->vdisplay, KMB_CRTC_MAX_HEIGHT);
return MODE_BAD_VVALUE;
}
if (mode->hdisplay < KMB_CRTC_MAX_WIDTH) {
drm_dbg(dev, "width = %d less than %d",
mode->hdisplay, KMB_CRTC_MAX_WIDTH);
return MODE_BAD_HVALUE;
}
refresh = drm_mode_vrefresh(mode);
if (refresh < KMB_MIN_VREFRESH || refresh > KMB_MAX_VREFRESH) {
drm_dbg(dev, "refresh = %d less than %d or greater than %d",
refresh, KMB_MIN_VREFRESH, KMB_MAX_VREFRESH);
return MODE_BAD;
}
if (vfp < KMB_CRTC_MIN_VFP) {
drm_dbg(dev, "vfp = %d less than %d", vfp, KMB_CRTC_MIN_VFP);
return MODE_BAD;
}
return MODE_OK;
}
static const struct drm_crtc_helper_funcs kmb_crtc_helper_funcs = {
.atomic_begin = kmb_crtc_atomic_begin,
.atomic_enable = kmb_crtc_atomic_enable,
.atomic_disable = kmb_crtc_atomic_disable,
.atomic_flush = kmb_crtc_atomic_flush,
.mode_valid = kmb_crtc_mode_valid,
};
int kmb_setup_crtc(struct drm_device *drm)

View File

@@ -380,7 +380,7 @@ static irqreturn_t handle_lcd_irq(struct drm_device *dev)
if (val & LAYER3_DMA_FIFO_UNDERFLOW)
drm_dbg(&kmb->drm,
"LAYER3:GL1 DMA UNDERFLOW val = 0x%lx", val);
if (val & LAYER3_DMA_FIFO_UNDERFLOW)
if (val & LAYER3_DMA_FIFO_OVERFLOW)
drm_dbg(&kmb->drm,
"LAYER3:GL1 DMA OVERFLOW val = 0x%lx", val);
}

View File

@@ -20,11 +20,18 @@
#define DRIVER_MAJOR 1
#define DRIVER_MINOR 1
/* Platform definitions */
#define KMB_CRTC_MIN_VFP 4
#define KMB_CRTC_MAX_WIDTH 1920 /* max width in pixels */
#define KMB_CRTC_MAX_HEIGHT 1080 /* max height in pixels */
#define KMB_CRTC_MIN_WIDTH 1920
#define KMB_CRTC_MIN_HEIGHT 1080
#define KMB_FB_MAX_WIDTH 1920
#define KMB_FB_MAX_HEIGHT 1080
#define KMB_FB_MIN_WIDTH 1
#define KMB_FB_MIN_HEIGHT 1
#define KMB_MIN_VREFRESH 59 /*vertical refresh in Hz */
#define KMB_MAX_VREFRESH 60 /*vertical refresh in Hz */
#define KMB_LCD_DEFAULT_CLK 200000000
#define KMB_SYS_CLK_MHZ 500
@@ -50,6 +57,7 @@ struct kmb_drm_private {
spinlock_t irq_lock;
int irq_lcd;
int sys_clk_mhz;
struct disp_cfg init_disp_cfg[KMB_MAX_PLANES];
struct layer_status plane_status[KMB_MAX_PLANES];
int kmb_under_flow;
int kmb_flush_done;

View File

@@ -482,6 +482,10 @@ static u32 mipi_tx_fg_section_cfg(struct kmb_dsi *kmb_dsi,
return 0;
}
#define CLK_DIFF_LOW 50
#define CLK_DIFF_HI 60
#define SYSCLK_500 500
static void mipi_tx_fg_cfg_regs(struct kmb_dsi *kmb_dsi, u8 frame_gen,
struct mipi_tx_frame_timing_cfg *fg_cfg)
{
@@ -492,7 +496,12 @@ static void mipi_tx_fg_cfg_regs(struct kmb_dsi *kmb_dsi, u8 frame_gen,
/* 500 Mhz system clock minus 50 to account for the difference in
* MIPI clock speed in RTL tests
*/
sysclk = kmb_dsi->sys_clk_mhz - 50;
if (kmb_dsi->sys_clk_mhz == SYSCLK_500) {
sysclk = kmb_dsi->sys_clk_mhz - CLK_DIFF_LOW;
} else {
/* 700 Mhz clk*/
sysclk = kmb_dsi->sys_clk_mhz - CLK_DIFF_HI;
}
/* PPL-Pixel Packing Layer, LLP-Low Level Protocol
* Frame genartor timing parameters are clocked on the system clock,
@@ -1322,7 +1331,8 @@ static u32 mipi_tx_init_dphy(struct kmb_dsi *kmb_dsi,
return 0;
}
static void connect_lcd_to_mipi(struct kmb_dsi *kmb_dsi)
static void connect_lcd_to_mipi(struct kmb_dsi *kmb_dsi,
struct drm_atomic_state *old_state)
{
struct regmap *msscam;
@@ -1331,7 +1341,7 @@ static void connect_lcd_to_mipi(struct kmb_dsi *kmb_dsi)
dev_dbg(kmb_dsi->dev, "failed to get msscam syscon");
return;
}
drm_atomic_bridge_chain_enable(adv_bridge, old_state);
/* DISABLE MIPI->CIF CONNECTION */
regmap_write(msscam, MSS_MIPI_CIF_CFG, 0);
@@ -1342,7 +1352,7 @@ static void connect_lcd_to_mipi(struct kmb_dsi *kmb_dsi)
}
int kmb_dsi_mode_set(struct kmb_dsi *kmb_dsi, struct drm_display_mode *mode,
int sys_clk_mhz)
int sys_clk_mhz, struct drm_atomic_state *old_state)
{
u64 data_rate;
@@ -1384,18 +1394,13 @@ int kmb_dsi_mode_set(struct kmb_dsi *kmb_dsi, struct drm_display_mode *mode,
mipi_tx_init_cfg.lane_rate_mbps = data_rate;
}
kmb_write_mipi(kmb_dsi, DPHY_ENABLE, 0);
kmb_write_mipi(kmb_dsi, DPHY_INIT_CTRL0, 0);
kmb_write_mipi(kmb_dsi, DPHY_INIT_CTRL1, 0);
kmb_write_mipi(kmb_dsi, DPHY_INIT_CTRL2, 0);
/* Initialize mipi controller */
mipi_tx_init_cntrl(kmb_dsi, &mipi_tx_init_cfg);
/* Dphy initialization */
mipi_tx_init_dphy(kmb_dsi, &mipi_tx_init_cfg);
connect_lcd_to_mipi(kmb_dsi);
connect_lcd_to_mipi(kmb_dsi, old_state);
dev_info(kmb_dsi->dev, "mipi hw initialized");
return 0;

View File

@@ -380,7 +380,7 @@ int kmb_dsi_host_bridge_init(struct device *dev);
struct kmb_dsi *kmb_dsi_init(struct platform_device *pdev);
void kmb_dsi_host_unregister(struct kmb_dsi *kmb_dsi);
int kmb_dsi_mode_set(struct kmb_dsi *kmb_dsi, struct drm_display_mode *mode,
int sys_clk_mhz);
int sys_clk_mhz, struct drm_atomic_state *old_state);
int kmb_dsi_map_mmio(struct kmb_dsi *kmb_dsi);
int kmb_dsi_clk_init(struct kmb_dsi *kmb_dsi);
int kmb_dsi_encoder_init(struct drm_device *dev, struct kmb_dsi *kmb_dsi);

View File

@@ -67,8 +67,21 @@ static const u32 kmb_formats_v[] = {
static unsigned int check_pixel_format(struct drm_plane *plane, u32 format)
{
struct kmb_drm_private *kmb;
struct kmb_plane *kmb_plane = to_kmb_plane(plane);
int i;
int plane_id = kmb_plane->id;
struct disp_cfg init_disp_cfg;
kmb = to_kmb(plane->dev);
init_disp_cfg = kmb->init_disp_cfg[plane_id];
/* Due to HW limitations, changing pixel format after initial
* plane configuration is not supported.
*/
if (init_disp_cfg.format && init_disp_cfg.format != format) {
drm_dbg(&kmb->drm, "Cannot change format after initial plane configuration");
return -EINVAL;
}
for (i = 0; i < plane->format_count; i++) {
if (plane->format_types[i] == format)
return 0;
@@ -81,11 +94,17 @@ static int kmb_plane_atomic_check(struct drm_plane *plane,
{
struct drm_plane_state *new_plane_state = drm_atomic_get_new_plane_state(state,
plane);
struct kmb_drm_private *kmb;
struct kmb_plane *kmb_plane = to_kmb_plane(plane);
int plane_id = kmb_plane->id;
struct disp_cfg init_disp_cfg;
struct drm_framebuffer *fb;
int ret;
struct drm_crtc_state *crtc_state;
bool can_position;
kmb = to_kmb(plane->dev);
init_disp_cfg = kmb->init_disp_cfg[plane_id];
fb = new_plane_state->fb;
if (!fb || !new_plane_state->crtc)
return 0;
@@ -99,6 +118,16 @@ static int kmb_plane_atomic_check(struct drm_plane *plane,
new_plane_state->crtc_w < KMB_FB_MIN_WIDTH ||
new_plane_state->crtc_h < KMB_FB_MIN_HEIGHT)
return -EINVAL;
/* Due to HW limitations, changing plane height or width after
* initial plane configuration is not supported.
*/
if ((init_disp_cfg.width && init_disp_cfg.height) &&
(init_disp_cfg.width != fb->width ||
init_disp_cfg.height != fb->height)) {
drm_dbg(&kmb->drm, "Cannot change plane height or width after initial configuration");
return -EINVAL;
}
can_position = (plane->type == DRM_PLANE_TYPE_OVERLAY);
crtc_state =
drm_atomic_get_existing_crtc_state(state,
@@ -335,6 +364,7 @@ static void kmb_plane_atomic_update(struct drm_plane *plane,
unsigned char plane_id;
int num_planes;
static dma_addr_t addr[MAX_SUB_PLANES];
struct disp_cfg *init_disp_cfg;
if (!plane || !new_plane_state || !old_plane_state)
return;
@@ -357,7 +387,8 @@ static void kmb_plane_atomic_update(struct drm_plane *plane,
}
spin_unlock_irq(&kmb->irq_lock);
src_w = (new_plane_state->src_w >> 16);
init_disp_cfg = &kmb->init_disp_cfg[plane_id];
src_w = new_plane_state->src_w >> 16;
src_h = new_plane_state->src_h >> 16;
crtc_x = new_plane_state->crtc_x;
crtc_y = new_plane_state->crtc_y;
@@ -500,6 +531,16 @@ static void kmb_plane_atomic_update(struct drm_plane *plane,
/* Enable DMA */
kmb_write_lcd(kmb, LCD_LAYERn_DMA_CFG(plane_id), dma_cfg);
/* Save initial display config */
if (!init_disp_cfg->width ||
!init_disp_cfg->height ||
!init_disp_cfg->format) {
init_disp_cfg->width = width;
init_disp_cfg->height = height;
init_disp_cfg->format = fb->format->format;
}
drm_dbg(&kmb->drm, "dma_cfg=0x%x LCD_DMA_CFG=0x%x\n", dma_cfg,
kmb_read_lcd(kmb, LCD_LAYERn_DMA_CFG(plane_id)));

View File

@@ -63,6 +63,12 @@ struct layer_status {
u32 ctrl;
};
struct disp_cfg {
unsigned int width;
unsigned int height;
unsigned int format;
};
struct kmb_plane *kmb_plane_init(struct drm_device *drm);
void kmb_plane_destroy(struct drm_plane *plane);
#endif /* __KMB_PLANE_H__ */

View File

@@ -1838,6 +1838,13 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), info->rev)))
adreno_gpu->base.hw_apriv = true;
/*
* For now only clamp to idle freq for devices where this is known not
* to cause power supply issues:
*/
if (info && (info->revn == 618))
gpu->clamp_to_idle = true;
a6xx_llc_slices_init(pdev, a6xx_gpu);
ret = a6xx_set_supported_hw(&pdev->dev, config->rev);

View File

@@ -203,6 +203,10 @@ struct msm_gpu {
uint32_t suspend_count;
struct msm_gpu_state *crashstate;
/* Enable clamping to idle freq when inactive: */
bool clamp_to_idle;
/* True if the hardware supports expanded apriv (a650 and newer) */
bool hw_apriv;

View File

@@ -200,7 +200,8 @@ void msm_devfreq_idle(struct msm_gpu *gpu)
idle_freq = get_freq(gpu);
msm_devfreq_target(&gpu->pdev->dev, &target_freq, 0);
if (gpu->clamp_to_idle)
msm_devfreq_target(&gpu->pdev->dev, &target_freq, 0);
df->idle_time = ktime_get();
df->idle_freq = idle_freq;

View File

@@ -173,7 +173,11 @@ static void mxsfb_irq_disable(struct drm_device *drm)
struct mxsfb_drm_private *mxsfb = drm->dev_private;
mxsfb_enable_axi_clk(mxsfb);
mxsfb->crtc.funcs->disable_vblank(&mxsfb->crtc);
/* Disable and clear VBLANK IRQ */
writel(CTRL1_CUR_FRAME_DONE_IRQ_EN, mxsfb->base + LCDC_CTRL1 + REG_CLR);
writel(CTRL1_CUR_FRAME_DONE_IRQ, mxsfb->base + LCDC_CTRL1 + REG_CLR);
mxsfb_disable_axi_clk(mxsfb);
}

View File

@@ -590,14 +590,14 @@ static const struct drm_display_mode k101_im2byl02_default_mode = {
.clock = 69700,
.hdisplay = 800,
.hsync_start = 800 + 6,
.hsync_end = 800 + 6 + 15,
.htotal = 800 + 6 + 15 + 16,
.hsync_start = 800 + 52,
.hsync_end = 800 + 52 + 8,
.htotal = 800 + 52 + 8 + 48,
.vdisplay = 1280,
.vsync_start = 1280 + 8,
.vsync_end = 1280 + 8 + 48,
.vtotal = 1280 + 8 + 48 + 52,
.vsync_start = 1280 + 16,
.vsync_end = 1280 + 16 + 6,
.vtotal = 1280 + 16 + 6 + 15,
.width_mm = 135,
.height_mm = 217,

View File

@@ -30,6 +30,7 @@ static void mock_setup(struct drm_plane_state *state)
mock_device.driver = &mock_driver;
mock_device.mode_config.prop_fb_damage_clips = &mock_prop;
mock_plane.dev = &mock_device;
mock_obj_props.count = 0;
mock_plane.base.properties = &mock_obj_props;
mock_prop.base.id = 1; /* 0 is an invalid id */
mock_prop.dev = &mock_device;

View File

@@ -190,6 +190,7 @@ static void ttm_transfered_destroy(struct ttm_buffer_object *bo)
struct ttm_transfer_obj *fbo;
fbo = container_of(bo, struct ttm_transfer_obj, base);
dma_resv_fini(&fbo->base.base._resv);
ttm_bo_put(fbo->bo);
kfree(fbo);
}

View File

@@ -13,6 +13,7 @@
#define _HYPERV_VMBUS_H
#include <linux/list.h>
#include <linux/bitops.h>
#include <asm/sync_bitops.h>
#include <asm/hyperv-tlfs.h>
#include <linux/atomic.h>

View File

@@ -706,8 +706,9 @@ static void ib_nl_set_path_rec_attrs(struct sk_buff *skb,
/* Construct the family header first */
header = skb_put(skb, NLMSG_ALIGN(sizeof(*header)));
memcpy(header->device_name, dev_name(&query->port->agent->device->dev),
LS_DEVICE_NAME_MAX);
strscpy_pad(header->device_name,
dev_name(&query->port->agent->device->dev),
LS_DEVICE_NAME_MAX);
header->port_num = query->port->port_num;
if ((comp_mask & IB_SA_PATH_REC_REVERSIBLE) &&

View File

@@ -878,6 +878,7 @@ void sc_disable(struct send_context *sc)
{
u64 reg;
struct pio_buf *pbuf;
LIST_HEAD(wake_list);
if (!sc)
return;
@@ -912,19 +913,21 @@ void sc_disable(struct send_context *sc)
spin_unlock(&sc->release_lock);
write_seqlock(&sc->waitlock);
while (!list_empty(&sc->piowait)) {
if (!list_empty(&sc->piowait))
list_move(&sc->piowait, &wake_list);
write_sequnlock(&sc->waitlock);
while (!list_empty(&wake_list)) {
struct iowait *wait;
struct rvt_qp *qp;
struct hfi1_qp_priv *priv;
wait = list_first_entry(&sc->piowait, struct iowait, list);
wait = list_first_entry(&wake_list, struct iowait, list);
qp = iowait_to_qp(wait);
priv = qp->priv;
list_del_init(&priv->s_iowait.list);
priv->s_iowait.lock = NULL;
hfi1_qp_wakeup(qp, RVT_S_WAIT_PIO | HFI1_S_WAIT_PIO_DRAIN);
}
write_sequnlock(&sc->waitlock);
spin_unlock_irq(&sc->alloc_lock);
}

View File

@@ -1092,12 +1092,12 @@ irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, struct irdma_cq_poll_info *info)
if (cq->avoid_mem_cflct) {
ext_cqe = (__le64 *)((u8 *)cqe + 32);
get_64bit_val(ext_cqe, 24, &qword7);
polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, qword3);
polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, qword7);
} else {
peek_head = (cq->cq_ring.head + 1) % cq->cq_ring.size;
ext_cqe = cq->cq_base[peek_head].buf;
get_64bit_val(ext_cqe, 24, &qword7);
polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, qword3);
polarity = (u8)FIELD_GET(IRDMA_CQ_VALID, qword7);
if (!peek_head)
polarity ^= 1;
}

View File

@@ -3399,9 +3399,13 @@ static void irdma_process_cqe(struct ib_wc *entry,
}
if (cq_poll_info->ud_vlan_valid) {
entry->vlan_id = cq_poll_info->ud_vlan & VLAN_VID_MASK;
entry->wc_flags |= IB_WC_WITH_VLAN;
u16 vlan = cq_poll_info->ud_vlan & VLAN_VID_MASK;
entry->sl = cq_poll_info->ud_vlan >> VLAN_PRIO_SHIFT;
if (vlan) {
entry->vlan_id = vlan;
entry->wc_flags |= IB_WC_WITH_VLAN;
}
} else {
entry->sl = 0;
}

View File

@@ -330,8 +330,10 @@ enum irdma_status_code irdma_ws_add(struct irdma_sc_vsi *vsi, u8 user_pri)
tc_node->enable = true;
ret = irdma_ws_cqp_cmd(vsi, tc_node, IRDMA_OP_WS_MODIFY_NODE);
if (ret)
if (ret) {
vsi->unregister_qset(vsi, tc_node);
goto reg_err;
}
}
ibdev_dbg(to_ibdev(vsi->dev),
"WS: Using node %d which represents VSI %d TC %d\n",
@@ -350,6 +352,10 @@ enum irdma_status_code irdma_ws_add(struct irdma_sc_vsi *vsi, u8 user_pri)
}
goto exit;
reg_err:
irdma_ws_cqp_cmd(vsi, tc_node, IRDMA_OP_WS_DELETE_NODE);
list_del(&tc_node->siblings);
irdma_free_node(vsi, tc_node);
leaf_add_err:
if (list_empty(&vsi_node->child_list_head)) {
if (irdma_ws_cqp_cmd(vsi, vsi_node, IRDMA_OP_WS_DELETE_NODE))
@@ -369,11 +375,6 @@ vsi_add_err:
exit:
mutex_unlock(&vsi->dev->ws_mutex);
return ret;
reg_err:
mutex_unlock(&vsi->dev->ws_mutex);
irdma_ws_remove(vsi, user_pri);
return ret;
}
/**

View File

@@ -1339,7 +1339,6 @@ static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, struct ib_umem *umem,
goto err_2;
}
mr->mmkey.type = MLX5_MKEY_MR;
mr->desc_size = sizeof(struct mlx5_mtt);
mr->umem = umem;
set_mr_fields(dev, mr, umem->length, access_flags);
kvfree(in);
@@ -1533,6 +1532,7 @@ static struct ib_mr *create_user_odp_mr(struct ib_pd *pd, u64 start, u64 length,
ib_umem_release(&odp->umem);
return ERR_CAST(mr);
}
xa_init(&mr->implicit_children);
odp->private = mr;
err = mlx5r_store_odp_mkey(dev, &mr->mmkey);

Some files were not shown because too many files have changed in this diff Show More